

Redis Streams as a Queue: Where It Fits and Where It Breaks

A practical guide to using Redis Streams as a queuing system: durability, delivery semantics, ordering, consumer rebalancing, and when to choose something else.

April 23, 2026 · #tech #redis #distributed-systems #queues


Redis is often the first thing teams reach for when they need a queue quickly.

That instinct is not wrong.

Redis is simple to operate, fast, widely understood, and already sitting in many production stacks. If the alternative is designing a homegrown job table in Postgres or blocking a request thread until slow work finishes, Redis can be a meaningful step up.

But "Redis can be used as a queue" and "Redis is a good queue for my system" are not the same statement.

That distinction matters most once your workload stops being a toy:

  • consumers crash mid-processing
  • messages need to survive restarts
  • ordering starts to matter
  • one consumer becomes ten
  • one region becomes two
  • "we can replay later" stops being a hand-wavy promise and becomes an operational requirement

Redis Streams is the closest Redis gets to a real messaging log. It gives you append-only records, consumer groups, pending message tracking, and redelivery mechanics. That makes it much more capable than simple Redis lists for queue-like workloads.

It still has very different tradeoffs from systems designed primarily as durable event backbones, such as Kafka, RabbitMQ, Pulsar, or managed cloud queues.

This post is about that boundary.

The short version

Redis Streams works well when:

  • you want a simple queue with low operational overhead
  • your throughput is moderate, not internet-scale
  • your consumers are inside the same operational boundary as Redis
  • replay requirements are limited
  • you can tolerate at-least-once delivery and build idempotent consumers
  • "fast and pragmatic" matters more than "perfectly durable under every failure mode"

Redis Streams is a poor fit when:

  • the queue is mission-critical and must survive infrastructure-level failures with high confidence
  • retention and replay are core product requirements
  • you need very large backlogs or long-lived message history
  • multi-region, cross-datacenter, or independently scaled consumers are first-class needs
  • strict per-key ordering under high parallelism is non-negotiable
  • you need stronger queue semantics than Redis can provide without significant application logic

If you remember nothing else, remember this:

Redis Streams is a pragmatic queue. It is not a magical one.

First: what Redis Streams actually gives you

A Redis Stream is an append-only sequence of records, each with an ID such as:

1713885000123-0
1713885000123-1
1713885000450-0

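An ID is the entry's arrival timestamp in milliseconds, a dash, and a sequence number that disambiguates entries created in the same millisecond. Entries are totally ordered by that pair. A minimal sketch of parsing and ordering IDs:

```python
# A Redis stream ID is "<milliseconds>-<sequence>"; entries are totally
# ordered by the (ms, seq) pair. A minimal parser and sort:

def parse_stream_id(raw: str) -> tuple[int, int]:
    ms, seq = raw.split("-")
    return int(ms), int(seq)

ids = ["1713885000450-0", "1713885000123-1", "1713885000123-0"]
ordered = sorted(ids, key=parse_stream_id)
print(ordered)
# ['1713885000123-0', '1713885000123-1', '1713885000450-0']
```

This is also why producers usually let Redis assign IDs with `XADD mystream *`: the server guarantees IDs are monotonically increasing.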
Each entry contains field-value pairs. Producers append with XADD. Consumers usually read via consumer groups using XREADGROUP.

Consumer groups introduce three important pieces of machinery:

  • a group-level read cursor
  • per-consumer ownership of delivered messages
  • a Pending Entries List (PEL) for messages delivered but not yet acknowledged

That PEL is the key feature that turns Streams from "just a log-like data structure" into "something queue-ish."

The basic flow is:

  1. Producer appends a message with XADD.
  2. A consumer group reads it with XREADGROUP.
  3. Redis marks the message as pending for that consumer.
  4. The consumer processes the message.
  5. The consumer acknowledges it with XACK.

If the consumer dies before XACK, the message stays pending and can be claimed by another consumer later.

That is the happy path. The interesting part is what guarantees this does and does not imply.
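The flow above can be sketched with a toy in-memory model. This is not a Redis client, just an illustration of how the group cursor and the PEL interact; all names here are made up for the sketch:

```python
# Toy in-memory model of a consumer group's pending entries list (PEL).
# A sketch of the semantics, not an implementation of Redis.

class ToyStreamGroup:
    def __init__(self):
        self.entries = []   # appended messages: (id, payload)
        self.cursor = 0     # group-level read cursor
        self.pel = {}       # id -> consumer currently holding it

    def xadd(self, payload):
        entry_id = f"{len(self.entries)}-0"
        self.entries.append((entry_id, payload))
        return entry_id

    def xreadgroup(self, consumer):
        if self.cursor >= len(self.entries):
            return None
        entry_id, payload = self.entries[self.cursor]
        self.cursor += 1
        self.pel[entry_id] = consumer   # delivered: now pending
        return entry_id, payload

    def xack(self, entry_id):
        self.pel.pop(entry_id, None)    # "complete enough to forget"

g = ToyStreamGroup()
mid = g.xadd({"job": "send-email"})
entry_id, _ = g.xreadgroup("worker-1")
# If worker-1 crashes here, before xack, the entry stays pending:
assert entry_id == mid and mid in g.pel
g.xack(mid)
assert mid not in g.pel
```

The key detail is that reading advances the cursor and records ownership in the same step; acknowledging is a separate, later step, and everything between the two is failure territory.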

Delivery semantics: not at-most-once, not exactly-once

Teams often ask, "What semantics do Redis Streams give me? At-most-once? At-least-once? Exactly-once?"

The honest answer is:

  • Default practical model: at-least-once
  • Possible with application choices: at-most-once-like behavior, but with durability tradeoffs
  • Exactly-once: no, not in the distributed systems sense

The reason is simple: in Redis Streams, delivery and successful completion are not the same event.

Redis knows when:

  • a producer appended a message
  • a consumer was given that message
  • a consumer acknowledged that message

Redis does not know whether your business side effect truly completed unless your application tells it so by sending XACK after processing.

That creates a critical gap:

  • message was delivered
  • work may or may not have happened
  • XACK may or may not have happened

If failure happens in that gap, Redis will keep the message pending and allow it to be reclaimed later. That is exactly why the system leans toward at-least-once delivery.

Why Redis Streams is usually at-least-once

Once a message is delivered to a consumer and recorded in the PEL, Redis considers it pending until the consumer acknowledges it.

Here is the normal flow:

[Diagram 1: happy path — XADD → XREADGROUP delivers to a consumer → consumer processes → XACK → entry leaves the PEL]

[Diagram 2: failure path — XADD → XREADGROUP delivers to a consumer → consumer crashes before XACK → entry stays in the PEL → another consumer claims it → the message is delivered again]

Those two flows together show the semantics clearly.

The message was:

  • appended once
  • delivered once initially
  • processed once, maybe successfully
  • acknowledged zero times
  • then delivered again through reclaim

So from the system’s point of view, the safe behavior is to try again. That is at-least-once.

If the consumer:

  • crashes after processing but before XACK
  • times out while handling the message
  • gets partitioned away from Redis
  • is slow enough that another worker claims the message

then the same message may be processed again.

That is classic at-least-once delivery.

Another way to say it:

  • XREADGROUP means "someone may now work on this"
  • XACK means "this unit of work is complete enough to forget"

Everything in between is failure territory.

If duplicate work is expensive or dangerous, your consumer has to be idempotent. In practice, that means using:

  • idempotency keys
  • dedupe tables
  • compare-and-set updates
  • transactional guards in downstream storage

If your business logic cannot safely tolerate duplicate processing, Redis Streams alone is not enough.
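A minimal sketch of the dedupe-table approach, keyed on the stream entry ID. In production the "seen" store would be durable (for example a unique index in the same database transaction as the side effect), not process memory; the handler and its arithmetic here are stand-ins:

```python
# Sketch: an idempotent handler guarded by a dedupe table keyed on the
# message ID. In production, `processed` lives in durable storage (e.g. a
# unique-constrained table), ideally updated in the same transaction as
# the side effect itself.

processed = {}   # message_id -> result (stand-in for a dedupe table)

def handle_once(message_id: str, amount: int) -> int:
    if message_id in processed:
        return processed[message_id]     # duplicate redelivery: no-op
    result = amount * 2                  # stand-in for the real side effect
    processed[message_id] = result
    return result

first = handle_once("1713885000123-0", 21)
second = handle_once("1713885000123-0", 21)   # redelivered duplicate
assert first == second == 42
assert len(processed) == 1               # side effect ran exactly once
```

With this guard in place, redelivery after a crashed consumer is harmless: the duplicate arrives, hits the table, and becomes a no-op.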

Can I do at-most-once?

You can choose to treat delivery as at-most-once by acknowledging messages before processing them, or by never reclaiming pending messages after failures.

That is usually a bad deal.

At-most-once means you are explicitly accepting loss in exchange for simplicity. For some workloads that is fine:

  • best-effort analytics events
  • non-critical notifications
  • cache warming jobs
  • low-value background fan-out

But once the queue carries billing events, order state transitions, fraud actions, or user-visible workflow steps, silent loss becomes much worse than duplicate handling.

So the better framing is:

  • Redis Streams naturally wants to be at-least-once
  • At-most-once is something you opt into by accepting message loss

Exactly-once is still an application problem

No Redis primitive can make "side effect happened exactly once" true across Redis, your consumer, your database, and whatever external API you call next.

Even if Redis delivered a message once, your handler could:

  • commit to Postgres
  • crash before XACK
  • get redelivered
  • commit again unless your write path is idempotent

That is why "exactly-once" claims in messaging systems are usually narrower than people think. The real invariant you want is almost always:

effectively-once business processing through idempotent consumers

That remains true with Redis Streams.

Durability: better than lists, weaker than people assume

Redis is an in-memory system with persistence options, not a disk-first commit log.

That sentence alone should shape how much trust you place in it as a queue.

What durability depends on

Whether a stream entry survives failure depends on:

  • whether AOF is enabled
  • AOF fsync policy
  • whether you rely on RDB snapshots
  • whether replication has caught up
  • whether failover promotes a replica that has the latest writes
  • whether you trim streams aggressively

This is the part many teams underestimate. They hear "Redis has persistence" and mentally round that up to "durable queue."

Those are not equivalent.

Failure model by persistence mode

If you are using RDB snapshots only, you can lose recent messages between snapshots.

If you are using AOF with the everysec fsync policy, you can still lose up to roughly a second of acknowledged writes during a crash, depending on timing.

If you are using AOF with fsync on every write (always), durability improves, but write latency and operational cost go up.

If you also rely on replication and failover, then the failover point matters. A promoted replica may not have every last write that the old primary had accepted.
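These modes map directly to a handful of redis.conf directives. The values below are illustrative, not a recommendation for every workload:

```
# redis.conf knobs that decide how much queued data a crash can cost.

appendonly yes          # enable the AOF; RDB-only means losing everything
                        # written since the last snapshot
appendfsync everysec    # fsync once per second: bounded (~1s) loss window
# appendfsync always    # fsync per write: strongest durability, slowest writes
save 900 1              # keep RDB snapshots as a secondary recovery path
```

Choosing these settings deliberately, and writing the choice down, is most of the work of making Redis durability honest.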

So for Redis Streams, the realistic durability model is:

  • decent for many product workloads
  • often acceptable for internal asynchronous jobs
  • not the same class of durability as a replicated disk-backed event log designed for replay and retention

The key durability question

Ask this:

If I lose the last few seconds of queued messages during a bad failover, is that survivable?

If the answer is "annoying but manageable," Redis may be fine.

If the answer is "that becomes a finance incident," raise the bar and use a system built for stronger durability guarantees.

Ordering: yes, but only within narrow boundaries

Redis Streams gives you a stable append order in the stream itself. That is useful, but it is easy to over-interpret.

What is ordered

Within a single stream, entries have ordered IDs. If a single consumer reads serially, it will observe them in stream order.

That is the easiest case.

What breaks ordering in practice

The moment you add:

  • multiple consumers in one group
  • retries and reclaims
  • slow handlers
  • redelivery after timeout

you no longer have a clean global processing order.

Example:

  1. Message A is delivered to consumer 1.
  2. Message B is delivered to consumer 2.
  3. Consumer 2 finishes and acknowledges B quickly.
  4. Consumer 1 stalls, crashes, or gets partitioned.
  5. A is claimed and processed later.

The stream order was A then B. The effect order became B then A.

That is not a Redis bug. That is what happens in distributed consumers once you allow parallelism and failure recovery.

The right mental model for ordering

Use Redis Streams when one of these is true:

  • global ordering does not matter
  • ordering matters only within a small partition key that you enforce in application logic
  • you are willing to serialize work per entity yourself

Do not use Redis Streams expecting:

  • strict global processing order with many consumers
  • automatic partition-key routing like Kafka partitions
  • effortless "same customer always goes to same worker" behavior

If entity ordering matters, a common pattern is to shard by entity into separate streams or to route messages through a deterministic partitioning layer in the producer. That can work well at moderate scale, but it is something you build. Redis does not hand it to you.
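A sketch of that producer-side routing layer, under the assumption of a fixed shard count and hypothetical `jobs:N` stream names (neither is a Redis convention; both are application choices):

```python
import hashlib

# Sketch: route each entity to one of N streams so all messages for an
# entity land in the same stream and can be consumed serially by one
# worker. Shard count and stream naming are illustrative assumptions.

NUM_SHARDS = 8

def stream_for(entity_id: str) -> str:
    digest = hashlib.sha256(entity_id.encode()).digest()
    shard = int.from_bytes(digest[:4], "big") % NUM_SHARDS
    return f"jobs:{shard}"

# The same entity always maps to the same stream:
assert stream_for("customer-42") == stream_for("customer-42")
# Different entities spread across shards:
shards = {stream_for(f"customer-{i}") for i in range(100)}
assert len(shards) > 1
```

Note the usual caveat with static sharding: changing `NUM_SHARDS` remaps entities, so plan shard count ahead rather than resizing casually.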

Consumer rebalance and failure recovery

This is where Redis Streams becomes useful and also where its limitations become obvious.

Kafka has a formal consumer group rebalance protocol. Redis Streams does not.

There is no built-in notion of:

  • partition assignment
  • cooperative rebalance
  • revocation callbacks
  • generation epochs
  • automatic redistribution of ownership after membership changes

Instead, Redis gives you a lower-level primitive: the pending list.

What happens when a consumer dies

Suppose consumer payments-worker-3 reads messages and then disappears.

Redis will not automatically move those pending messages to another consumer just because the process died.

Those entries remain pending in the PEL until one of the following happens:

  • the original consumer comes back and continues
  • another consumer explicitly claims them
  • your application decides they are abandoned and reassigns them

The usual mechanism is XAUTOCLAIM or XCLAIM, based on idle time.

That means your "rebalance" logic is really:

  • detect old pending work
  • decide the idle timeout threshold
  • claim abandoned messages
  • process them again safely

This is not a true rebalance protocol. It is recovery through redelivery and claiming.
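The decision XAUTOCLAIM automates can be sketched in a few lines. The entries below mimic the shape of what XPENDING reports (ID, owning consumer, idle time, delivery count), but this is an in-memory illustration with made-up names, not a Redis call:

```python
# Sketch of the reclaim policy: scan pending entries and hand anything
# idle past a threshold to a live consumer. Mirrors what XAUTOCLAIM does
# server-side; thresholds are application choices.

IDLE_THRESHOLD_MS = 60_000

pending = [
    {"id": "1-0", "consumer": "worker-3", "idle_ms": 120_000, "deliveries": 2},
    {"id": "2-0", "consumer": "worker-1", "idle_ms": 500,     "deliveries": 1},
]

def claim_abandoned(pending, new_owner):
    claimed = []
    for entry in pending:
        if entry["idle_ms"] >= IDLE_THRESHOLD_MS:
            entry["consumer"] = new_owner
            entry["idle_ms"] = 0
            entry["deliveries"] += 1   # counts toward retry / DLQ limits
            claimed.append(entry["id"])
    return claimed

claimed = claim_abandoned(pending, "worker-7")
assert claimed == ["1-0"]                     # only the stale entry moves
assert pending[0]["consumer"] == "worker-7"
```

The hard part is not the loop; it is choosing `IDLE_THRESHOLD_MS` so that it is comfortably longer than your slowest legitimate handler, or slow-but-alive consumers will have their work stolen and duplicated.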

What a network partition looks like

Now consider a partition between consumers and Redis.

Several cases matter:

  1. Consumer loses connection before receiving the message. No issue: another consumer can read it later.

  2. Consumer receives the message, processes part of it, then loses connection before XACK. Redis still considers it pending; it may later be claimed and processed again.

  3. Consumer finishes the side effect, but XACK never reaches Redis. Same result: duplicate processing risk after reclaim.

  4. A failover happens during the partition. Now you also have to reason about replication lag and whether the new primary has the latest stream and PEL state.

The operational conclusion is straightforward:

Consumer rebalance under failure in Redis Streams is mostly your responsibility.

You need sensible settings for:

  • idle timeout before reclaim
  • maximum processing time expectations
  • dead-letter handling for poison messages
  • dedupe on the consumer side
  • observability on pending depth and oldest idle message age

If you do not build those pieces, the queue will look fine in staging and become confusing in production.

Poison messages and dead-letter queues

Redis Streams does not give you a first-class dead-letter queue policy out of the box.

You can build one, and many teams do:

  • track delivery count with XPENDING
  • move messages exceeding a retry threshold into another stream
  • annotate failure reason in a side store or companion stream
  • XACK the original message after quarantining it

That works, but again, it is application policy, not a complete built-in queue framework.

If your workload has a high probability of malformed or permanently failing jobs, budget for this logic early.
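The quarantine step from that list can be sketched as a routing decision on the delivery count. Stream roles and the retry limit here are application choices, not built-in Redis behavior:

```python
# Sketch of a dead-letter policy layered on top of streams: after too many
# deliveries, quarantine the message to a companion stream and ack the
# original so it stops being redelivered. The retry limit is a policy choice.

MAX_DELIVERIES = 5

main_stream, dead_stream, acked = [], [], []

def route(entry):
    if entry["deliveries"] > MAX_DELIVERIES:
        dead_stream.append({**entry, "reason": "retry limit exceeded"})
        acked.append(entry["id"])     # XACK the original after quarantine
    else:
        main_stream.append(entry)     # still eligible for normal retry

route({"id": "9-0", "deliveries": 6, "payload": "malformed"})
route({"id": "10-0", "deliveries": 1, "payload": "ok"})
assert [e["id"] for e in dead_stream] == ["9-0"]
assert acked == ["9-0"]
assert [e["id"] for e in main_stream] == ["10-0"]
```

The ack-after-quarantine ordering matters: move the copy first, then XACK, so a crash between the two steps leaves a duplicate in the dead-letter stream rather than a lost message.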

Backpressure and backlog growth

Queues rarely fail because enqueue and dequeue exist. They fail because backlog behavior was never treated as a first-class concern.

Redis makes this sharper because it is memory-backed.

If consumers lag:

  • stream length grows
  • memory usage grows
  • eviction policy starts to matter
  • unrelated Redis workloads can get impacted

That is a major reason not to colocate a large, bursty queue workload on the same Redis cluster that also serves:

  • hot caching for read traffic
  • rate limiting
  • session storage
  • distributed locks

Mixing those responsibilities is convenient right until the queue backlog eats your memory budget and turns a background problem into a front-door outage.

If you are serious about Redis as a queue, isolate it operationally.

Good use cases for Redis Streams as a queue

Redis Streams is a strong fit for workloads like:

  • async jobs behind a web application
  • notification fan-out where occasional duplicate work is acceptable
  • email, webhook, or push dispatch pipelines with idempotent downstream handling
  • cache rebuild or materialized view refresh jobs
  • moderate-rate event processing inside one product boundary
  • workflow steps where backlog is short-lived and replay depth is limited
  • internal platform automation where operational simplicity matters more than long-term retention

These workloads tend to share a pattern:

  • they benefit from fast enqueue/dequeue
  • they can tolerate at-least-once delivery
  • they do not need months of retention
  • they do not need independent consumer groups replaying history forever

Redis also shines when the team already runs it well and wants the smallest operational jump from synchronous code to asynchronous processing.

When Redis should not be your queue

Redis Streams is the wrong choice, or at least a risky one, for:

  • financial event pipelines
  • audit-grade event retention
  • event sourcing backbones
  • large multi-tenant asynchronous platforms with unpredictable backlog growth
  • analytics pipelines needing replay over large history
  • cross-region messaging with strict durability expectations
  • workloads where one consumer group is not enough and many independent subscribers need the same historical stream
  • systems where message loss across failover is unacceptable

In these cases, the issue is not that Redis cannot be made to work.

The issue is that you are fighting the system’s nature.

Kafka, Pulsar, RabbitMQ, or a managed cloud queue may have more setup cost, but they align better with the problem.

That alignment matters more than raw throughput benchmarks.

Redis Lists vs Redis Streams

If you are deciding between a simple Redis List queue and Streams, the answer is usually Streams.

Lists are fine for very basic work queues, but they break down quickly once you need:

  • consumer groups
  • pending tracking
  • redelivery
  • replay
  • observability into unacked work

Streams gives you a much saner operational model for real background processing.

So the serious comparison is not "Lists or Streams?" It is usually:

"Streams, or should we use a purpose-built messaging system instead?"

A practical decision framework

Use Redis Streams if all of these are true:

  • the team wants low operational overhead
  • at-least-once delivery is acceptable
  • consumers are idempotent
  • message retention is short
  • backlog size is bounded and monitored
  • queue failure is recoverable without existential damage
  • you are willing to build claim, retry, DLQ, and observability policy in the app

Avoid Redis Streams if any of these are true:

  • silent loss is unacceptable
  • queue durability is a board-level problem
  • replay is a product feature
  • you need formal consumer rebalance behavior
  • strict ordering under parallelism is a hard requirement
  • backlog may become very large or very long-lived

The production checklist

If you do choose Redis Streams, do not stop at XADD and XREADGROUP.

Treat the queue as a real subsystem and build these pieces:

  • AOF-based persistence with durability settings chosen intentionally
  • Redis replication and failover semantics understood by the team
  • idempotent consumers
  • reclaim logic using XAUTOCLAIM or equivalent
  • retry thresholds and dead-letter handling
  • metrics for stream length, pending count, retry count, and oldest idle pending message
  • alerts on backlog growth and stuck consumers
  • memory isolation from unrelated Redis workloads
  • stream trimming policy that matches actual recovery needs

That is the difference between "we use Redis as a queue" and "we have a queueing system."
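Two of the signals on that checklist — backlog growth and a stuck oldest pending message — can be sketched as alert rules over the kind of numbers XLEN and XPENDING expose. Thresholds here are illustrative:

```python
# Sketch: the two backlog signals worth alerting on, computed from stream
# length (XLEN) and per-entry idle times (XPENDING). Thresholds are
# illustrative and should match your workload's real recovery budget.

MAX_BACKLOG = 10_000
MAX_OLDEST_IDLE_MS = 300_000   # 5 minutes

def queue_alerts(stream_len, pending_idle_ms):
    alerts = []
    if stream_len > MAX_BACKLOG:
        alerts.append("backlog-growth")          # consumers are falling behind
    if pending_idle_ms and max(pending_idle_ms) > MAX_OLDEST_IDLE_MS:
        alerts.append("stuck-consumer")          # something holds work too long
    return alerts

assert queue_alerts(50_000, [1_000]) == ["backlog-growth"]
assert queue_alerts(100, [400_000]) == ["stuck-consumer"]
assert queue_alerts(100, [1_000]) == []
```

The second signal is the one teams tend to miss: total pending count can look healthy while one poison message sits at the head of the PEL for hours.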

Final take

Redis Streams is a very good answer to a surprisingly large class of queue problems.

It is fast, practical, and much more capable than many teams expect. For async jobs, notifications, moderate internal pipelines, and product workflows with idempotent consumers, it can be exactly the right engineering tradeoff.

But it is still a tradeoff.

It does not give you exactly-once processing. It does not give you Kafka-style rebalance. It does not preserve clean ordering once parallel consumers and retries enter the picture. Its durability is configurable and useful, but not something you should romanticize.

Use Redis Streams when you want a pragmatic queue and you understand the failure model.

Do not use it when what you really need is a durable event backbone with stronger guarantees than Redis was designed to provide.