A silently dropped connection during a checkout POST request leaves your client guessing whether the payment cleared, and a blind retry can double-charge the customer. Building a REST API that survives production traffic means shifting focus from textbook CRUD to handling concurrency, network failures, and data drift.

  • Pagination: Cursor-based, eliminates data drift under concurrent load
  • Error handling: RFC 9457, Problem Details for HTTP APIs
  • Idempotency: Mandatory for state-changing requests via Idempotency-Key headers
  • Versioning: URI prefix (/v1) or Accept header; pick one and apply it everywhere
  • Rate limiting: HTTP 429 coupled with precise Retry-After headers

Resource Naming and URL Structure

Plural Nouns Over Verbs

Your endpoints represent resources, not remote procedure calls. Verbs in URLs duplicate what the HTTP method already says and make routes harder to predict. Rely on HTTP methods to define the action.

Instead of creating endpoints like /getCustomers or /createOrder, stick to plural nouns. The endpoint /customers handles both retrieval and creation depending on whether you send a GET or POST request. Pluralizing everything keeps your naming conventions perfectly consistent across the entire system.
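As a minimal sketch of that idea, the snippet below routes one plural-noun path to different handlers based on the HTTP method. The route table, the dispatch function, and the handler names are hypothetical and not tied to any framework.

```python
from typing import Callable, Dict, Tuple

# Hypothetical handlers; a real service would talk to a database.
def list_customers() -> Tuple[int, str]:
    return 200, "list of customers"

def create_customer() -> Tuple[int, str]:
    return 201, "customer created"

# One plural-noun path, several HTTP methods.
ROUTES: Dict[Tuple[str, str], Callable[[], Tuple[int, str]]] = {
    ("GET", "/customers"): list_customers,
    ("POST", "/customers"): create_customer,
}

def dispatch(method: str, path: str) -> Tuple[int, str]:
    """Pick a handler by (method, path); the verb lives in the method, not the URL."""
    handler = ROUTES.get((method.upper(), path))
    if handler is not None:
        return handler()
    # Known path but unsupported method -> 405; unknown path -> 404.
    if any(known_path == path for _, known_path in ROUTES):
        return 405, "method not allowed"
    return 404, "not found"
```

Calling dispatch("GET", "/customers") and dispatch("POST", "/customers") reaches two different handlers through the same URL, which is the whole point of keeping verbs out of the path.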

The One-Level Nesting Rule

Mapping relational databases directly to URL structures is a common trap. Deeply nested URLs like /customers/123/orders/456/items/789 quickly become unreadable and brittle.

Limit your nesting to a single level. If you need to access a specific item, you already have its unique identifier. You can fetch it directly via /items/789 rather than traversing the entire customer hierarchy. Use nesting strictly to show direct containment, such as retrieving all orders for a specific user (/customers/123/orders).

HTTP Methods and the Reality of CRUD

Mapping Actions to HTTP Verbs Effectively

Treat HTTP verbs as the precise vocabulary of your API. POST creates new resources, while GET retrieves them without causing side effects.

The distinction between PUT and PATCH is where many implementations go wrong. PUT replaces the resource with the payload you send, so the client must send the full representation. PATCH applies a partial modification. If a client only wants to update an email address, have it use PATCH: a PATCH /customers/123 carrying only {"email": "new@site.com"} leaves every other field untouched, whereas the same body sent as PUT replaces the whole record and drops or resets the name, phone, and address. Using POST for updates hides the intent from clients and intermediaries, and using GET for state-changing actions breaks cacheability and confuses proxies, which assume GET is safe to repeat.
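The difference can be sketched with two small functions. This is an illustration under simplifying assumptions: the function names and the sample customer are made up, and the PATCH here is a plain shallow merge rather than a full JSON Merge Patch (RFC 7396) implementation.

```python
from typing import Any, Dict

def apply_put(stored: Dict[str, Any], body: Dict[str, Any]) -> Dict[str, Any]:
    """PUT: the body becomes the whole resource; omitted fields are gone."""
    return dict(body)

def apply_patch(stored: Dict[str, Any], body: Dict[str, Any]) -> Dict[str, Any]:
    """PATCH: shallow merge, only the supplied fields change."""
    merged = dict(stored)
    merged.update(body)
    return merged

customer = {"name": "Ada", "phone": "555-0100", "email": "old@site.com"}
change = {"email": "new@site.com"}

patched = apply_patch(customer, change)  # name and phone survive
replaced = apply_put(customer, change)   # only email remains
```

After this runs, patched still carries the name and phone, while replaced contains nothing but the email.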

Choosing the Exact 2xx, 4xx, and 5xx Status Codes

Returning a 200 OK with an error message buried in the JSON payload is an architectural anti-pattern. HTTP status codes exist to let clients and load balancers parse the result without opening the body.

  • 201 Created: Use this when a POST successfully creates a resource, accompanied by a Location header pointing to the new URI.
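  • 202 Accepted: Use for asynchronous processing, where the request has been accepted but the work has not finished.
  • 400 Bad Request: The request itself is malformed, for example unparseable JSON.
  • 422 Unprocessable Entity: The JSON parses, but its content fails validation or violates a business rule. (RFC 9110 renames this status to Unprocessable Content.)
  • 404 Not Found: Use for missing resources. Authentication and authorization failures belong to 401 and 403, unless you deliberately return 404 to hide that a resource exists.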
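A rough sketch of how a create endpoint might choose among these codes follows. It assumes one common way of drawing the line (unparseable bodies get 400, well-formed but unacceptable content gets 422); the function name, the email check, and the next_id parameter are all hypothetical.

```python
import json
from typing import Dict, Tuple

def create_customer(raw_body: str, next_id: int = 124) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a hypothetical POST /customers."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {}  # the body is not parseable JSON
    if not isinstance(payload, dict) or "@" not in str(payload.get("email", "")):
        return 422, {}  # well-formed JSON, but the content is unacceptable
    # Success: point the client at the new resource.
    return 201, {"Location": f"/customers/{next_id}"}
```

The status alone tells the caller whether to fix its serializer (400), fix its data (422), or follow the Location header (201).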

Standardizing Error Responses with RFC 9457

Moving Beyond Generic Error Messages

Dumping an unhandled exception or returning {"error": "Something went wrong"} forces client developers to guess what broke. Predictable error formats drastically reduce debugging time.

Many modern APIs adopt RFC 9457 (Problem Details for HTTP APIs), the successor to RFC 7807. It standardizes how APIs communicate failures, so every error envelope has the same shape across your entire architecture.

Structuring the Problem Details JSON Envelope

Implementing RFC 9457 gives your clients an actionable, machine-readable format. The RFC defines five standard members, and you can add extension members of your own.

  • type: A URI reference that identifies the problem type (e.g., https://example.com/probs/out-of-credit).
  • title: A short, human-readable summary of the problem.
  • status: The HTTP status code generated by the origin server.
  • detail: A human-readable explanation specific to this occurrence.
  • instance: A URI reference that identifies the specific occurrence of the problem.

Serve this body with a Content-Type: application/problem+json header so clients know to parse it as a structured problem rather than a generic payload. A validation failure looks like this:

{
  "type": "https://api.example.com/probs/validation",
  "title": "Your request parameters did not validate.",
  "status": 422,
  "detail": "The field 'email' must be a valid email address.",
  "instance": "/customers/123"
}
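One way to keep every error in this shape is to build it in a single helper. The sketch below is an assumption about how such a helper could look; problem_response and its return shape of (status, headers, body) are illustrative, not part of the RFC or any library.

```python
import json
from typing import Any, Dict, Optional, Tuple

PROBLEM_JSON = "application/problem+json"

def problem_response(status: int, type_uri: str, title: str,
                     detail: Optional[str] = None,
                     instance: Optional[str] = None) -> Tuple[int, Dict[str, str], str]:
    """Build (status, headers, body) for an RFC 9457 style error."""
    body: Dict[str, Any] = {"type": type_uri, "title": title, "status": status}
    # detail and instance describe one occurrence, so include them only when known.
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    return status, {"Content-Type": PROBLEM_JSON}, json.dumps(body)
```

Routing every failure path through one function like this is what keeps the status line and the status member in the body from disagreeing.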

Pagination: Why Cursor Beats Offset in Production

The Data Drift Problem in Offset Pagination

Offset pagination (?limit=20&offset=40) relies on counting rows from the beginning of a sorted result set. In a high-traffic production environment, this is fragile.

Imagine a user fetching the first 20 records, newest first. While they read, five new records are inserted at the top. When the user requests the next page (offset=20), every existing row has shifted down five positions, so the last five records of page 1 reappear at the top of page 2. Deletions cause the opposite problem: rows shift up and some are skipped entirely. This is data drift, and offset pagination cannot avoid it whenever the underlying data changes between requests.
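The scenario can be reproduced with a plain list standing in for the table. This is a toy simulation: the ids and the page helper are made up, and list slicing stands in for LIMIT/OFFSET.

```python
def page(rows, limit, offset):
    """What LIMIT/OFFSET returns over a newest-first list."""
    return rows[offset:offset + limit]

# Newest-first ids 100..1 stand in for a table ordered by creation time.
rows = list(range(100, 0, -1))
page1 = page(rows, limit=20, offset=0)

# Five new records land at the top while the user reads page 1.
rows = list(range(105, 100, -1)) + rows
page2 = page(rows, limit=20, offset=20)

# Rows that the user has now seen twice.
duplicates = sorted(set(page1) & set(page2), reverse=True)
```

Here duplicates holds the five ids that closed page 1 and were pushed down into page 2 by the new inserts.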

Implementing Scalable Cursor-Based Filtering and Sorting

Cursor-based pagination solves data drift by completely ignoring row counts. Instead, you pass a unique identifier representing the exact row where the client left off.

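The client requests ?limit=20. Your API returns the data along with a next_cursor token (often a base64-encoded sort key, such as the last row's timestamp plus its id). The client sends this token in their next request (?limit=20&cursor=XYZ). The database seeks straight to that position through an index and fetches the next batch. Under the hood the query stays flat no matter how deep the client scrolls:

-- :cursor_created_at and :cursor_id come from the decoded cursor.
-- The id tie-breaker keeps rows that share a timestamp from being skipped.
SELECT * FROM orders
WHERE (created_at, id) < (:cursor_created_at, :cursor_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;

Because it seeks by an indexed value instead of counting rows, new inserts cannot shift the page boundary, and with an index on (created_at, id) it runs significantly faster on large datasets than OFFSET, which still has to walk every skipped row. The row-value comparison shown here works in PostgreSQL, MySQL, and SQLite; on engines without it, expand it to created_at < x OR (created_at = x AND id < y).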
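An end-to-end sketch of the cursor flow, using an in-memory SQLite table, might look like this. Several things here are assumptions for illustration: the cursor encodes (created_at, id) as base64 JSON, the id column is added as a tie-breaker so rows sharing a timestamp are not skipped, and the helper names and integer timestamps are invented.

```python
import base64
import json
import sqlite3
from typing import List, Optional, Tuple

def encode_cursor(created_at: int, row_id: int) -> str:
    """Opaque token for the last row the client saw."""
    return base64.urlsafe_b64encode(json.dumps([created_at, row_id]).encode()).decode()

def decode_cursor(token: str) -> Tuple[int, int]:
    created_at, row_id = json.loads(base64.urlsafe_b64decode(token.encode()))
    return created_at, row_id

def fetch_page(conn: sqlite3.Connection, limit: int,
               cursor: Optional[str] = None) -> Tuple[List[Tuple[int, int]], Optional[str]]:
    """Return (rows, next_cursor), newest first, seeking past the cursor row."""
    if cursor is None:
        sql = "SELECT id, created_at FROM orders ORDER BY created_at DESC, id DESC LIMIT ?"
        rows = conn.execute(sql, (limit,)).fetchall()
    else:
        ts, rid = decode_cursor(cursor)
        # Expanded form of (created_at, id) < (ts, rid).
        sql = ("SELECT id, created_at FROM orders "
               "WHERE created_at < ? OR (created_at = ? AND id < ?) "
               "ORDER BY created_at DESC, id DESC LIMIT ?")
        rows = conn.execute(sql, (ts, ts, rid, limit)).fetchall()
    full_page = bool(rows) and len(rows) == limit
    next_cursor = encode_cursor(rows[-1][1], rows[-1][0]) if full_page else None
    return rows, next_cursor

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at INTEGER)")
# Two rows per timestamp, to show why the id tie-breaker matters.
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(i, i // 2) for i in range(1, 11)])
conn.execute("CREATE INDEX orders_seek ON orders (created_at, id)")

first, token = fetch_page(conn, 4)
# A row inserted between requests does not disturb the next page.
conn.execute("INSERT INTO orders VALUES (11, 99)")
second, _ = fetch_page(conn, 4, token)
```

The second page picks up exactly where the first ended, even though a newer row arrived in between, because the query seeks from the cursor position instead of counting from the top.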

Idempotency Keys in POST Requests

Preventing Duplicate Transactions

Mobile networks drop packets constantly. A client sends a POST request to charge a credit card, the server processes the payment, but the network drops the 200 OK response. The client, assuming a timeout, retries the request. You just charged the customer twice.

You need idempotency. An idempotent endpoint guarantees that making the same request multiple times produces the exact same result as making it once.

Handling Retries with Idempotency Keys

Require clients to send an Idempotency-Key header (usually a UUID) with every POST and PATCH request.

Your server caches this key alongside the final response payload. If a duplicate request arrives with the same key, your system skips the business logic entirely. It simply returns the cached HTTP response from the initial transaction. This completely eliminates duplicate processing, even during aggressive gateway timeouts and client-side retries.

Two corner cases decide whether this actually holds up. First, give cached keys a finite TTL (24 hours is a common window) so your store does not grow forever. Second, if a request arrives with a key you have already seen but a different body, reject it with a 409 Conflict instead of silently returning the old response; a reused key with new data almost always signals a client bug. This is not a fringe pattern: payment platforms like Stripe document exactly this header for safe retries.
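The replay, TTL, and conflict rules can be sketched as a small in-memory store. This is a simplified illustration: IdempotencyStore, its handle method, and the charge callback are hypothetical, a real deployment would use a shared store such as a database or cache, and this version does not guard against two identical requests arriving concurrently.

```python
import hashlib
import time
from typing import Callable, Dict, List, Tuple

Response = Tuple[int, str]

class IdempotencyStore:
    """In-memory stand-in for a shared cache of processed keys."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (stored_at, body fingerprint, cached response)
        self._entries: Dict[str, Tuple[float, str, Response]] = {}

    def handle(self, key: str, body: bytes,
               process: Callable[[bytes], Response]) -> Response:
        now = self._clock()
        fingerprint = hashlib.sha256(body).hexdigest()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            _, seen_fingerprint, cached = entry
            if seen_fingerprint != fingerprint:
                return 409, "idempotency key reused with a different body"
            return cached  # replay: the business logic is skipped
        response = process(body)
        self._entries[key] = (now, fingerprint, response)
        return response

charges: List[bytes] = []

def charge(body: bytes) -> Response:
    charges.append(body)  # stands in for the real payment call
    return 201, "charged"

store = IdempotencyStore()
first = store.handle("key-1", b'{"amount": 500}', charge)
retry = store.handle("key-1", b'{"amount": 500}', charge)
clash = store.handle("key-1", b'{"amount": 900}', charge)
```

The retry gets the cached response without charge running again, and the request that reuses the key with a different amount is rejected with 409.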

Authentication, Security, and CORS

Enforcing HTTPS and Bearer Tokens

Never invent custom authentication schemes. Rely on established standards such as OAuth 2.0, typically with JSON Web Tokens (JWT) as the token format.

Clients should pass credentials using the standard Authorization: Bearer <token> HTTP header. Ensure your API gateway strictly enforces TLS 1.2 or higher. Reject any unencrypted HTTP traffic instantly at the edge rather than redirecting it.
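On the server side, reading that header can be as small as the sketch below. The function name is hypothetical, and it only extracts the token; verifying the token's signature, expiry, and scopes is a separate step left out here.

```python
from typing import Mapping, Optional

def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from 'Authorization: Bearer <token>', or None."""
    # Header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get("authorization", "")
    scheme, _, token = value.strip().partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()
```

When this returns None, the usual response is a 401 with a WWW-Authenticate: Bearer header so the client knows which scheme to use.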

Why Credentials Never Belong in Query Strings

Passing API keys or session tokens in the URL (?api_key=secret) is a critical security vulnerability.

URLs leak far more widely than headers do. They get recorded in browser histories, proxy servers, load balancer access logs, and APM tools, and browsers can forward them in Referer headers. If a credential is in the query string, treat it as compromised. Always keep sensitive data in HTTP headers or the request body.
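A gateway or middleware can enforce this by refusing requests that carry credentials in the query string. The check below is a small sketch; the list of parameter names is an assumption and would need to match whatever your own clients might send by mistake.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameter names to treat as credentials.
SENSITIVE_PARAMS = {"api_key", "access_token", "token", "password"}

def has_credentials_in_query(url: str) -> bool:
    """True when the URL's query string carries a credential-like parameter."""
    query = urlsplit(url).query
    return any(name.lower() in SENSITIVE_PARAMS
               for name, _ in parse_qsl(query, keep_blank_values=True))
```

Rejecting such requests outright, rather than quietly accepting them, pushes clients toward the Authorization header before the habit spreads.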

Versioning Strategies and Deprecation Policies

URL Versioning vs. Accept Header

APIs evolve, and breaking changes are inevitable. You have two primary paths for versioning.

URL versioning (/v1/customers) is the most pragmatic and developer-friendly approach. It makes routing trivial and caching predictable. Alternatively, content negotiation via the Accept header (Accept: application/vnd.company.v1+json) keeps your URIs clean and perfectly aligns with REST principles, though it requires clients to actively manage headers. Pick one and apply it universally.
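Either scheme comes down to extracting one integer from the request. The two helpers below sketch both options; the function names are made up, and the media type pattern simply mirrors the vnd.company example above.

```python
import re
from typing import Mapping, Optional

_URL_VERSION = re.compile(r"^/v(\d+)(?:/|$)")
_ACCEPT_VERSION = re.compile(r"application/vnd\.company\.v(\d+)\+json")

def version_from_url(path: str) -> Optional[int]:
    """Read the version from a /v1/... style path prefix."""
    match = _URL_VERSION.match(path)
    return int(match.group(1)) if match else None

def version_from_accept(headers: Mapping[str, str]) -> Optional[int]:
    """Read the version from an Accept header using a vendor media type."""
    match = _ACCEPT_VERSION.search(headers.get("Accept", ""))
    return int(match.group(1)) if match else None
```

Whichever one you pick, decide up front what a missing version means (reject, or default to a pinned version) so unversioned clients do not silently float to the latest release.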

Sunsetting APIs Gracefully

You cannot turn off an old API version overnight. Deprecation requires a clear, automated communication strategy.

Use the Deprecation HTTP header (RFC 9745) to flag aging endpoints. Pair it with a Sunset header (RFC 8594) containing the exact timestamp when the endpoint will be permanently disabled. This allows client applications to detect aging endpoints programmatically and alert their developers long before the breaking change occurs.
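A sketch of emitting both headers is shown below. The helper name is hypothetical, and the value formats are assumptions worth checking against the specs before shipping: Deprecation is written as a Structured Fields date ("@" followed by Unix seconds), and Sunset as an HTTP-date.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict

def deprecation_headers(deprecated_at: datetime, sunset_at: datetime) -> Dict[str, str]:
    """Headers for responses from an endpoint that is being retired.

    Both arguments should be timezone-aware datetimes.
    """
    return {
        # Assumed format: "@" plus Unix seconds.
        "Deprecation": f"@{int(deprecated_at.timestamp())}",
        # Assumed format: HTTP-date in GMT.
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
    }

headers = deprecation_headers(
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 12, 31, 23, 59, 59, tzinfo=timezone.utc),
)
```

Attaching these in middleware keyed on the API version means every response from the old version carries the warning without touching individual handlers.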

Rate Limiting, Caching, and Observability

Protecting the Gateway with 429 Too Many Requests

Without strict rate limits, a single poorly written client loop can take down your entire backend. Rate limiting should happen at the API gateway, before requests ever hit your application servers, whether those run in containers or on a managed platform.

When a client hits their limit, return a 429 Too Many Requests status code immediately. Crucially, include a Retry-After header. This tells the client how long to wait (in seconds, or as an HTTP date) before sending another request, preventing aggressive polling and letting traffic spikes cool off. Set those thresholds from measurements rather than guesses: load test the backend, including the database, to find where it saturates.
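The 429 plus Retry-After behavior can be sketched with a fixed-window counter. This is one simple algorithm among several (token bucket and sliding window are common alternatives); the class name, the per-client dictionary, and the use of 200 to mean "let the request through" are illustrative assumptions.

```python
import math
import time
from typing import Callable, Dict, Tuple

class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._windows: Dict[str, Tuple[float, int]] = {}  # client -> (start, count)

    def check(self, client_id: str) -> Tuple[int, Dict[str, str]]:
        """Return (status, headers): 200 to proceed, 429 plus Retry-After to back off."""
        now = self._clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self._window:
            start, count = now, 0  # window elapsed, start a fresh one
        if count >= self._limit:
            self._windows[client_id] = (start, count)
            # Whole seconds until the current window ends, at least 1.
            wait = max(1, math.ceil(start + self._window - now))
            return 429, {"Retry-After": str(wait)}
        self._windows[client_id] = (start, count + 1)
        return 200, {}
```

Computing Retry-After from the time left in the window, instead of returning a fixed number, is what lets well-behaved clients resume at the first moment they will actually be served.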

Keeping OpenAPI Documentation in Sync

Outdated documentation is actively harmful. If your developers manually update a wiki after changing code, the documentation will inevitably lie.

Adopt a code-first documentation strategy. Use tools that automatically generate OpenAPI (Swagger) specifications directly from your code annotations and routing logic. If your stack is JavaScript, keep your Node.js version current so those generator plugins stay compatible. This keeps your interactive documentation, client SDKs, and actual API behavior synchronized on every deployment.

The rules above are not independent tips; they compound. Cursor pagination protects your reads, idempotency keys protect your writes, and RFC 9457 makes every failure readable. Ship those three first, and most of the production incidents that hit teams shipping their first public API never reach your on-call rotation.