Here are several other items I don't think were addressed in the previous
post but are worth mentioning:

1. Transaction Capacity Limits
Servers are encouraged to enforce a configurable maximum number of
concurrent open transactions (maxConcurrentTransactions). When the limit is
reached, new begin requests are rejected with HTTP 429. Slots are freed
when transactions close via commit, rollback, or timeout. The limit applies
per server instance, not globally across a cluster, consistent with the
design that each server manages only its own transactions.
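
To make the slot accounting concrete, here is a minimal sketch of a per-instance limiter. The TransactionSlots class and its method names are hypothetical, not part of TinkerPop; only the maxConcurrentTransactions setting and the 429 response come from the proposal above.

```python
import threading


class TransactionSlots:
    """Tracks open transactions for a single server instance (hypothetical helper)."""

    def __init__(self, max_concurrent_transactions):
        self._max = max_concurrent_transactions
        self._open = set()
        self._lock = threading.Lock()

    def try_begin(self, transaction_id):
        """Return 200 if a slot was taken, or 429 if the limit is reached."""
        with self._lock:
            if len(self._open) >= self._max:
                return 429
            self._open.add(transaction_id)
            return 200

    def close(self, transaction_id):
        """Free the slot; called on commit, rollback, or timeout."""
        with self._lock:
            self._open.discard(transaction_id)
```

Because the set lives in the server process, nothing here needs to know about other instances in a cluster.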

2. Transaction Timeout and Idle Reclamation
Servers must implement a configurable transaction timeout
(transactionTimeout). The timeout represents how long a transaction can sit
idle with no requests before the server forcibly rolls it back and removes
it. The clock resets on each request received for that transaction, so
active transactions are not affected. After a timeout fires, any subsequent
request with that transaction ID gets a 404 "Transaction not found". There
is no recovery mechanism. This behaves similarly to the old session
lifetime.
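
A rough sketch of the idle clock, assuming the timeout is expressed in seconds and expired transactions are reaped lazily on the next request. The IdleTimeoutRegistry name and its methods are made up for illustration; a real server would more likely reap on a background schedule and perform an actual rollback.

```python
import time


class IdleTimeoutRegistry:
    """Tracks the last request time per transaction (hypothetical helper)."""

    def __init__(self, transaction_timeout_seconds, clock=time.monotonic):
        self._timeout = transaction_timeout_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._last_seen = {}

    def begin(self, transaction_id):
        self._last_seen[transaction_id] = self._clock()

    def reap(self):
        """Remove transactions idle for longer than the timeout."""
        now = self._clock()
        expired = [tx for tx, seen in self._last_seen.items()
                   if now - seen > self._timeout]
        for tx in expired:
            # A real server would roll the transaction back here.
            del self._last_seen[tx]
        return expired

    def touch(self, transaction_id):
        """Handle a request: reset the idle clock, or report 404."""
        self.reap()
        if transaction_id not in self._last_seen:
            return 404  # "Transaction not found"
        self._last_seen[transaction_id] = self._clock()
        return 200
```

Each successful touch() restarts the clock, so only the gap between consecutive requests counts against the timeout, not the total age of the transaction.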

3. Error Handling Within a Transaction
If a traversal within an open transaction fails (bad syntax, runtime error,
etc.), the server returns the error for that request but keeps the
transaction open. The client can retry, submit other traversals, or
explicitly roll back. A failed traversal does not implicitly close or roll
back the transaction. Only an explicit commit, explicit rollback, or
server-side timeout closes a transaction. Again, this is similar to the old
session in that sessions didn't close due to errors.
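
As a sketch of that rule, the handler below reports a failure for the single request and leaves the transaction state untouched. The OpenTransaction class is hypothetical, traversals are stood in for by plain callables, and every failure is mapped to 500 for brevity (malformed Gremlin would be a 400 per the status code list in item 5).

```python
class OpenTransaction:
    """Stand-in for server-side transaction state (hypothetical)."""

    def __init__(self):
        self.is_open = True

    def execute(self, traversal):
        """Run one traversal; a failure does not close the transaction."""
        if not self.is_open:
            return 404, "Transaction not found"
        try:
            return 200, traversal()
        except Exception as error:
            # Report the error for this request only; is_open is unchanged.
            return 500, str(error)

    def rollback(self):
        """Only an explicit close (or a timeout) ends the transaction."""
        self.is_open = False
        return 200, None
```

After a failed execute() the client can still submit further traversals, commit, or roll back on the same transaction.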

4. Graph Alias Handling
The graph alias ("g" binding) in the request is how the server determines
which graph instance a transaction belongs to. The begin request can
include the alias (defaults to "g"), and the server uses it to open the
transaction against the correct graph. The alias is then locked for the
lifetime of the transaction. If a subsequent request carries a different
alias, the server rejects it with HTTP 400. This prevents cross-graph
commits, which aren't a standard feature.
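
The alias lock could look roughly like this. AliasLock is a hypothetical name, and returning 404 for an unknown transaction ID is borrowed from the status code list in item 5 rather than from this paragraph.

```python
class AliasLock:
    """Remembers which graph alias each transaction was opened against."""

    def __init__(self):
        self._alias_by_tx = {}

    def begin(self, transaction_id, alias="g"):
        # The alias defaults to "g" and is fixed for the transaction's lifetime.
        self._alias_by_tx[transaction_id] = alias

    def check(self, transaction_id, alias="g"):
        """Return 200 if the alias matches, 400 on mismatch, 404 if unknown."""
        locked = self._alias_by_tx.get(transaction_id)
        if locked is None:
            return 404
        return 200 if alias == locked else 400
```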

5. HTTP Status Codes
The previous post didn't specify which HTTP status codes the server should
return for transaction-related errors.

The following status codes should be used:

200 OK: Successful response for any valid request: begin (response includes
the server-generated transactionId), traversal execution, commit, and
rollback.

400 Bad Request: The request is malformed or violates a transaction
constraint.
Specific cases:
- Begin on a graph that does not support transactions.
- Graph alias missing or invalid on a begin request.
- Graph alias mismatch on a subsequent request within a transaction (alias
differs from the one used at begin time).
- Malformed gremlin, invalid request body, or other request validation
failures (this applies to both transactional and non-transactional
requests).

404 Not Found: The transactionId in the request does not correspond to any
active transaction on this server.
This covers:
- Transaction was already committed.
- Transaction was already rolled back.
- Transaction was reclaimed by timeout.
- Transaction ID was fabricated or never existed.

This is the single status code for all "transaction not found" scenarios,
regardless of why the transaction no longer exists. Although the server
could distinguish between a recently closed transaction and one that never
existed, I don't think that's necessary since it makes no difference to the
user. In either scenario, the user has to start a new transaction.

429 Too Many Requests: The server has reached its maxConcurrentTransactions
limit and cannot accept a new begin request. The client should retry after
a delay or route to a different server instance.

500 Internal Server Error: An unexpected server-side failure occurred
during transaction processing. This is a catch-all for unhandled exceptions
and should not occur during normal operation.

Non-transactional requests (those without a transactionId) are not affected
by any of the transaction-specific status codes above. They continue to use
the existing HTTP status codes specified by the HTTP API.
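
Putting the transaction-specific codes together, the decision order might look like the sketch below. Several things here are assumptions for illustration only: the request is a plain dict with "gremlin", "g", and "transactionId" keys, a begin is detected by an exact match on the script text, graphs maps each alias to whether that graph supports transactions, and the 400 checks run before the 429 check.

```python
def transaction_status(request, open_transactions, graphs,
                       max_concurrent_transactions):
    """Return the transaction-related status code, or None if not applicable.

    open_transactions maps transactionId -> alias used at begin time.
    graphs maps alias -> True if that graph supports transactions.
    """
    alias = request.get("g", "g")
    tx_id = request.get("transactionId")

    if request.get("gremlin") == "g.tx().begin()":
        if not graphs.get(alias, False):
            return 400  # unknown alias or graph without transaction support
        if len(open_transactions) >= max_concurrent_transactions:
            return 429  # capacity reached on this server instance
        return 200

    if tx_id is None:
        return None  # non-transactional: existing HTTP API rules apply
    if tx_id not in open_transactions:
        return 404  # committed, rolled back, timed out, or never existed
    if open_transactions[tx_id] != alias:
        return 400  # alias differs from the one used at begin time
    return 200
```

A 500 is deliberately absent: it would come from an unhandled exception around this logic rather than from a branch inside it.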

On Wed, Feb 18, 2026 at 6:41 PM Ken Hu <[email protected]> wrote:

> Hi All,
>
> I thought I'd kick-start the conversation of what the remote transaction
> flow could look like and how it would tie in with the TinkerPop 4 HTTP API.
> I added some diagrams to
> https://github.com/apache/tinkerpop/blob/remote-tx/docs/src/dev/future/http-api.md
> which I'll refer to in this post.
>
> Much of this design is borrowed from the current WebSocket implementation
> and so many of the decisions made were done so to minimize changes needed
> when moving from WebSockets to HTTP. The basic idea is that transactions
> will be bound using a Transaction Id rather than a Session over a WebSocket
> connection. The transaction will be controlled using the standard Gremlin
> language syntax for transaction control ("g.tx().begin()",
> "g.tx().commit()", "g.tx().rollback()") and once a begin() is issued, the
> client must send all subsequent requests involved in that transaction to
> the same endpoint. The server is responsible for tracking the state of the
> transaction and executing each transaction against the appropriate Graph.
> On receipt of a begin(), the server will generate the Transaction Id and
> return it in the response to the client, and the client should attach that
> Id to both the header and body for requests involved in that transaction.
> Take a look at the "Protocol Flow" to see how this would work end to end.
>
> Summary of Key Points
>
> The "Request Format Specification" contains the expected fields that will
> be used in the request and responses. Remember that there are other fields
> that can be specified in the HTTP API but they aren't shown in those charts.
>
> 1. Single new field: Only transactionId is added to the request body
> 2. Dual transmission: Transaction ID must appear in both X-Transaction-Id
> header AND transactionId body field
> 3. Header for routing: The header enables load balancer sticky sessions
> without body parsing
> 4. Body for processing: The body field is used by the server to lookup
> transaction state
> 5. Gremlin-based control: Transaction lifecycle uses standard Gremlin
> syntax ("g.tx().begin()", "g.tx().commit()", "g.tx().rollback()")
> 6. Client-based affinity: Client binds to a single endpoint and all
> transactions are issued to the same endpoint.
> 7. Backward compatible: Requests without transactionId continue to work as
> non-transactional requests
>
> The protocol deliberately places host affinity responsibility on the
> client side rather than requiring server-side coordination. The client is
> responsible for routing all requests in a transaction to the same server
> instance (via the X-Transaction-Id header for load balancer sticky
> routing). This is very similar to the current WebSocket session design
> where the client binds to one connection. The difference is that we are
> binding to one endpoint instead in HTTP.
>
> This design is intentionally server topology agnostic and works
> identically whether the backend is a single server, a cluster with a load
> balancer, a managed service, or a serverless deployment. The server
> implementation only needs to manage local transaction state. Servers don't
> need to share transaction state or implement distributed locking. Each
> server manages only its own transactions, which should
> simplify implementations for providers. Providers can deploy behind any
> load balancing strategy without protocol changes. The X-Transaction-Id
> header provides a standard mechanism for any infrastructure to achieve
> affinity.
>
> Transaction control uses Gremlin script syntax ("g.tx().begin()",
> "g.tx().commit()", "g.tx().rollback()") rather than dedicated REST
> endpoints (e.g., "POST /transaction/begin", "DELETE /transaction/{id}").
> This approach was chosen for several reasons:
> - Consistency with existing API: All operations flow through the same
> /gremlin endpoint with the same request format. GLVs/drivers/clients don't
> need to implement multiple endpoint patterns or handle different
> request/response structures for transaction control vs. query execution.
> - Script and Traversal compatible: This allows for scripts to have the same
> functionality as traversals.
>
> Please respond if you have any questions or comments.
>
> Thanks,
> Ken
>
