RabbitMQ in Production: DLQ, Retry with TTL, and a Generic Consumer Framework

Published on Jan 10, 2026

#rabbitmq #messaging #dlq #csharp #production #architecture

Confidentiality note: I can’t disclose the real project where this solution was implemented. The scenario below is fictional, but all technical decisions, patterns, and trade-offs are real and were applied in production systems.

If you’ve only used RabbitMQ in a “Hello World” scenario, it probably felt like:

  • create a queue
  • publish a message
  • consume it

In production, reality is different:

  • messages fail
  • consumers crash
  • retries need control and delay
  • DLQ is mandatory
  • observability becomes critical
  • boilerplate grows fast with dozens of consumers

This article presents a practical, production-oriented approach to building a resilient RabbitMQ-based solution, using exchanges, DLQ/TTL, and a generic producer/consumer framework designed to reduce operational risk and long-term maintenance cost.


1) Context and problem — why RabbitMQ?

Imagine a fictional product called TaskPulse, composed of three services:

  • Gateway.Api — receives HTTP commands
  • Collector.Worker — collects and normalizes data
  • Notifier.Worker — sends notifications

Initially, everything was synchronous:

  • Gateway.Api directly called Collector and Notifier
  • any external instability affected requests
  • traffic spikes resulted in timeouts
  • long-running jobs degraded API responsiveness

What we needed:

  • service decoupling
  • asynchronous processing
  • controlled retry
  • message durability

When RabbitMQ makes sense

  • work queues and async processing
  • at-least-once delivery
  • fine-grained routing (topic, direct)
  • explicit DLQ and retry control

When it doesn’t

  • high-throughput streaming and long replay → Kafka
  • complex workflow orchestration → Temporal
  • simple managed queues → SQS (or equivalent)

2) Essential concepts (only what matters)

Just what’s required for a correct implementation:

  • Producer — publishes messages
  • Consumer — processes messages
  • Queue — message buffer
  • Exchange — routing entry point
  • Binding — exchange → queue rule
  • Routing Key — logical routing path
  • Ack / Nack / Reject
  • DLQ (Dead Letter Queue) — messages that failed or expired

Exchanges in practice

  • direct — exact match
  • topic — pattern-based (task.*, user.#)
  • fanout — broadcast

For domain-based events, topic is usually the best trade-off.
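To make the binding patterns concrete, here is a minimal declaration sketch, assuming RabbitMQ.Client 6.x and a local broker; the `task.audit` queue is an illustrative name, not part of the article's topology:

```csharp
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost", UserName = "app", Password = "app123" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

// Topic routing keys are dot-separated words; bindings may use
// * (exactly one word) and # (zero or more words).
channel.ExchangeDeclare("taskpulse.events", ExchangeType.Topic, durable: true);

channel.QueueDeclare("task.created", durable: true, exclusive: false, autoDelete: false);
channel.QueueBind("task.created", "taskpulse.events", routingKey: "task.created");

// A hypothetical audit queue that receives every task.* event:
channel.QueueDeclare("task.audit", durable: true, exclusive: false, autoDelete: false);
channel.QueueBind("task.audit", "taskpulse.events", routingKey: "task.*");
```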


3) Architecture overview

Proposed topology:

  • Main exchange: taskpulse.events (topic)
  • Dead-letter exchange: dlx (direct)

Design principles

No retry logic in code
No loops, sleeps, or recursive retries.

Retry is a topology concern
DLQ + TTL provide delay, backpressure, and predictability.

Message flow

Producer
  |
  v
taskpulse.events (topic)
  |
task.created
  |(reject)
  v
task.created.dlq (TTL)
  |(TTL expires)
  v
task.created
  |(after N failures)
  v
task.created.parking

Separating the retry DLQ from a parking lot queue prevents silent message loss and simplifies manual reprocessing. The "after N failures" step relies on the `x-death` header RabbitMQ adds to dead-lettered messages: when its count exceeds N, the consumer publishes the message to the parking lot instead of rejecting it again.


4) RabbitMQ setup

A realistic Docker Compose setup:

services:
  rabbitmq:
    image: rabbitmq:3.13-management-alpine
    ports:
      - "5672:5672"
      - "15672:15672"
    environment:
      RABBITMQ_DEFAULT_USER: app
      RABBITMQ_DEFAULT_PASS: app123
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "check_port_connectivity"]
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 40s

Why this matters:

  • port 15672 exposes the management UI, which is essential for production troubleshooting
  • the healthcheck prevents dependent services from starting before the broker is ready

5) Producer — publishing responsibly

Publishing messages without persistence, metadata, or consistency is a common source of incidents.

Example event:

{
  "taskId": "1f9c...",
  "createdAt": "2026-01-10T12:00:00Z",
  "source": "integration-x"
}

Generic publisher (C#)

using System.Text;
using System.Text.Json;
using RabbitMQ.Client;

public interface IMessagePublisher
{
  Task PublishAsync<T>(string exchange, string routingKey, T message,
    CancellationToken ct = default) where T : class;
}

public sealed class RabbitMqPublisher : IMessagePublisher, IDisposable
{
  private readonly ConnectionFactory _factory;
  private IConnection? _connection;
  private IModel? _channel;

  public RabbitMqPublisher(string host, int port, string user, string pass)
  {
    _factory = new ConnectionFactory
    {
      HostName = host,
      Port = port,
      UserName = user,
      Password = pass,
      DispatchConsumersAsync = true
    };
  }

  public Task PublishAsync<T>(string exchange, string routingKey, T message, 
    CancellationToken ct = default) where T : class
  {
    // Lazy initialization. Note: IModel is not thread-safe, so a production
    // publisher should serialize access or use one channel per publisher.
    _connection ??= _factory.CreateConnection();
    _channel ??= _connection.CreateModel();

    var json = JsonSerializer.Serialize(message);
    var body = Encoding.UTF8.GetBytes(json);

    var props = _channel.CreateBasicProperties();
    props.Persistent = true;
    props.MessageId = Guid.NewGuid().ToString();
    props.ContentType = "application/json";

    _channel.BasicPublish(exchange, routingKey, props, body);
    return Task.CompletedTask;
  }

  public void Dispose()
  {
    _channel?.Dispose();
    _connection?.Dispose();
  }
}

Key decisions

  • Persistent = true
  • MessageId for idempotency
  • ContentType for tooling and debugging
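A short usage sketch of the publisher above; the `TaskCreatedEvent` record is an illustrative name mirroring the example payload, and the connection values match the compose file (in a real app they would come from configuration, with the publisher registered as a singleton in DI):

```csharp
IMessagePublisher publisher = new RabbitMqPublisher("localhost", 5672, "app", "app123");

await publisher.PublishAsync(
  exchange: "taskpulse.events",
  routingKey: "task.created",
  message: new TaskCreatedEvent(Guid.NewGuid().ToString(), DateTime.UtcNow, "integration-x"));

// Illustrative event type matching the example payload above.
public sealed record TaskCreatedEvent(string TaskId, DateTime CreatedAt, string Source);
```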

6) Consumer — where systems usually fail

A production-grade consumer must:

  • use prefetch (QoS)
  • disable auto-ack
  • handle exceptions consistently
  • explicitly reject messages to trigger DLQ

Generic consumer framework

using System.Text;
using System.Text.Json;
using Microsoft.Extensions.Hosting;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

public abstract class RabbitMqConsumerWorker<TMessage> : BackgroundService
  where TMessage : class
{
  protected abstract string QueueName { get; }
  
  private readonly ConnectionFactory _factory;
  private IConnection? _connection;
  private IModel? _channel;

  protected RabbitMqConsumerWorker(ConnectionFactory factory)
  {
    // The factory must have DispatchConsumersAsync = true,
    // otherwise AsyncEventingBasicConsumer handlers are never invoked.
    _factory = factory;
  }

  protected override Task ExecuteAsync(CancellationToken stoppingToken)
  {
    _connection = _factory.CreateConnection();
    _channel = _connection.CreateModel();

    _channel.BasicQos(
      prefetchSize: 0,
      prefetchCount: 10,
      global: false
    );

    var consumer = new AsyncEventingBasicConsumer(_channel);
    consumer.Received += OnMessageAsync;

    _channel.BasicConsume(
      queue: QueueName,
      autoAck: false,
      consumer: consumer
    );

    return Task.CompletedTask;
  }

  private async Task OnMessageAsync(object sender, BasicDeliverEventArgs ea)
  {
    try
    {
      var json = Encoding.UTF8.GetString(ea.Body.ToArray());
      var message = JsonSerializer.Deserialize<TMessage>(json);

      if (message is null)
      {
        _channel.BasicReject(ea.DeliveryTag, requeue: false);
        return;
      }

      await ProcessAsync(message);
      _channel.BasicAck(ea.DeliveryTag, multiple: false);
    }
    catch (Exception ex)
    {
      // Log ex here (logger omitted for brevity), then reject without
      // requeue so the DLQ/TTL topology applies the retry delay.
      _channel.BasicReject(ea.DeliveryTag, requeue: false);
    }
  }

  protected abstract Task ProcessAsync(TMessage message);

  public override void Dispose()
  {
    _channel?.Dispose();
    _connection?.Dispose();
    base.Dispose();
  }
}

If a consumer crashes:

  • unacked messages are re-delivered
  • delivery becomes at-least-once in practice
  • idempotency becomes mandatory
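To show the framework in use, here is a sketch of a concrete worker; `TaskCreatedEvent` and `TaskCreatedConsumer` are illustrative names, and the injected factory must have DispatchConsumersAsync = true:

```csharp
using RabbitMQ.Client;

public sealed record TaskCreatedEvent(string TaskId, DateTime CreatedAt, string Source);

public sealed class TaskCreatedConsumer : RabbitMqConsumerWorker<TaskCreatedEvent>
{
  protected override string QueueName => "task.created";

  public TaskCreatedConsumer(ConnectionFactory factory) : base(factory) { }

  protected override Task ProcessAsync(TaskCreatedEvent message)
  {
    // Domain logic goes here. Throwing an exception sends the
    // message to the DLQ via the reject in the base class.
    Console.WriteLine($"Processing task {message.TaskId} from {message.Source}");
    return Task.CompletedTask;
  }
}

// Registration in a Worker Service host:
// builder.Services.AddHostedService<TaskCreatedConsumer>();
```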

7) Retry and DLQ

Adopted pattern:

  • error → reject(requeue: false)
  • message goes to DLQ
  • TTL applies delay
  • message returns to main queue

Conceptual configuration

  • task.created
    • DLX → dlx
  • task.created.dlq
    • TTL → 120s
    • DLX → taskpulse.events
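This conceptual configuration maps directly to queue arguments at declaration time. A sketch, assuming RabbitMQ.Client 6.x; names and the 120s TTL follow the topology described in this article:

```csharp
using System.Collections.Generic;
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost", UserName = "app", Password = "app123" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

channel.ExchangeDeclare("taskpulse.events", ExchangeType.Topic, durable: true);
channel.ExchangeDeclare("dlx", ExchangeType.Direct, durable: true);

// Main queue: rejected messages are dead-lettered to dlx.
channel.QueueDeclare("task.created", durable: true, exclusive: false, autoDelete: false,
  arguments: new Dictionary<string, object>
  {
    ["x-dead-letter-exchange"] = "dlx"
  });
channel.QueueBind("task.created", "taskpulse.events", "task.created");

// Retry queue: holds messages for 120s, then dead-letters them
// back to the main exchange, which routes them to the main queue.
channel.QueueDeclare("task.created.dlq", durable: true, exclusive: false, autoDelete: false,
  arguments: new Dictionary<string, object>
  {
    ["x-message-ttl"] = 120_000,
    ["x-dead-letter-exchange"] = "taskpulse.events",
    ["x-dead-letter-routing-key"] = "task.created"
  });
channel.QueueBind("task.created.dlq", "dlx", "task.created");
```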

8) Guarantees and trade-offs

RabbitMQ provides at-least-once delivery.

Implications:

  • duplicate messages are possible
  • ordering is not guaranteed with multiple consumers

Mitigation strategies:

  • idempotent consumers
  • deduplication using MessageId
  • idempotent domain operations

Exactly-once delivery does not exist out of the box: you either build the equivalent with idempotency and deduplication, or accept the occasional duplicate.
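Deduplication by MessageId can be sketched with a simple seen-set. This in-memory version is for illustration only; production deployments usually need a shared store such as Redis, with expiry, so restarts and multiple instances stay consistent:

```csharp
// Minimal in-memory deduplication keyed by MessageId.
public sealed class InMemoryDeduplicator
{
  private readonly HashSet<string> _seen = new();
  private readonly object _lock = new();

  // Returns true the first time a MessageId is observed.
  public bool TryMarkProcessed(string messageId)
  {
    lock (_lock)
    {
      return _seen.Add(messageId);
    }
  }
}

// In the consumer, before ProcessAsync:
// if (!deduplicator.TryMarkProcessed(ea.BasicProperties.MessageId))
// {
//   _channel.BasicAck(ea.DeliveryTag, multiple: false); // duplicate: ack and skip
//   return;
// }
```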


9) Observability and operations

Key indicators:

  • Ready vs Unacked
  • DLQ growth
  • publish/consume rate
  • message age (lag)

Minimal alerts:

  • DLQ > 0 for a sustained period
  • growing main queue
  • zero consumers on critical queues

10) Best practices and common pitfalls

Best practices

  • small, versioned messages
  • manual ack
  • DLQ always
  • controlled prefetch

Common mistakes

  • autoAck = true
  • retry logic in code
  • ack after partial failure
  • ungoverned queue proliferation

11) When not to use RabbitMQ

  • streaming → Kafka
  • workflows → Temporal
  • simple managed queues → SQS

Right tool, right problem.


12) Conclusion

The difference between “it works” and “production-ready” lies in:

  • clear DLQ strategy
  • predictable retry
  • idempotency
  • observability
  • reduced boilerplate

Natural next steps:

  • parking lot tooling
  • schema versioning
  • correlation-id and tracing
  • HA and quorum queues (evaluate)