Hadoop's existing RPC transport supports multiplexing. If a client issues a request to a server and then, before a response has been generated, issues a second request (typically from a separate thread), the second request can be sent and its response received before the first response is delivered. This is implemented by tagging each request and response with a call ID.
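To make the call-ID idea concrete, here is a minimal sketch of how a client might match out-of-order responses to their requests. This is not Hadoop's actual code: the class and method names (`PendingCalls`, `send`, `receive`) are made up for illustration, and no real network I/O or threading is shown.

```python
import itertools


class PendingCalls:
    """Tracks in-flight requests on one connection, keyed by call ID."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}  # call ID -> request still awaiting its response

    def send(self, request):
        """Tag an outgoing request with a fresh call ID and remember it."""
        call_id = next(self._next_id)
        self._pending[call_id] = request
        return call_id

    def receive(self, call_id, response):
        """Match an incoming response to its request, in any arrival order."""
        request = self._pending.pop(call_id)
        return request, response


calls = PendingCalls()
first = calls.send("slow request")
second = calls.send("fast request")
# The second response arrives before the first; the call ID disambiguates.
early = calls.receive(second, "fast response")
late = calls.receive(first, "slow response")
```

Without the call ID, the client could only assume that responses arrive in the order the requests were sent.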
Such behavior might be natural in some transports (e.g., UDP) and unnatural in others (e.g., HTTP), so in general it ought to be optional and transport-specific. For example, an HTTP-based client should, if a second call is made while one is already in progress, either block or open a second connection. Such behavior is automatic with many HTTP client libraries.
I propose that, in Avro, to permit simple implementations, multiplexing be optional for TCP-socket-based clients and servers (including those with SASL or TLS layered on top). The rule would be: if a client sets "Call-Id" in a request's metadata to some value, then the server is required to set "Call-Id" in the corresponding response's metadata to the same value.
A simple client that does not implement multiplexing would not set this metadata field. Such a client would simply send a request and wait for its response. It should not submit a second request over the connection until the first response is received, as it would not be able to distinguish the second response from the first. The server need not handle this specially, although a friendly multiplexing server might report an error if a second request is issued without a "Call-Id" before the first response has been sent. A simple server would never deliver responses out of order, but would still copy any "Call-Id" from the request so that it's compatible with multiplexing clients.
With this convention, the request and response payload can be just the framed call data as currently in the spec. The only addition we might make to fully define the wire format is to specify a magic number that's transmitted on connection open. This would (a) permit a server to automatically distinguish RPC requests from, e.g., HTTP/HTML requests, so that a single port can be used to service both, and (b) future-proof the wire format, since we can use the magic number as a version.
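As a rough illustration of the "Call-Id" rule and the magic-number idea above, a server-side sketch might look like the following. The magic value, the list of HTTP methods, and the function names are all assumptions made for this example; nothing here is defined by the spec.

```python
CALL_ID = "Call-Id"

# Hypothetical magic number: three tag bytes plus a one-byte version.
HYPOTHETICAL_MAGIC = b"AVR\x01"

# A few HTTP request-line prefixes, used only to sniff the protocol.
HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ", b"OPTIONS ")


def response_metadata(request_meta):
    """Build response metadata, echoing "Call-Id" only if the client set it."""
    meta = {}
    if CALL_ID in request_meta:
        meta[CALL_ID] = request_meta[CALL_ID]
    return meta


def classify_connection(first_bytes):
    """Guess the protocol from the first bytes sent on a new connection."""
    if first_bytes.startswith(HYPOTHETICAL_MAGIC):
        return "rpc"
    if first_bytes.startswith(HTTP_METHODS):
        return "http"
    return "unknown"


# A multiplexing client tags its request; a simple client does not.
tagged = response_metadata({"Call-Id": b"7", "Other": b"x"})
untagged = response_metadata({})
```

Because the server only echoes the field, the same code serves both simple and multiplexing clients, and a version change could be signalled by altering the final byte of the magic number.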
Hopefully future changes to the wire format can mostly be handled by adding data to the handshake metadata, but, should that prove insufficient, the magic number could be incremented.
Arguably, in retrospect, the handshake should be just metadata, with the other current fields moved into the metadata. The handshake would get slightly bigger and slower to process, since, e.g., "Client-Hash" and "Server-Hash" would be sent as strings with each request, but it might be simpler to evolve.
On a related note, would it be useful to standardize on a naming convention for RPC service addresses? For example, a TCP-based, SASL-authenticated server running on host:port might be named tcp-sasl://u...@host:port/, an HTTP-based server might be named http://host:port/, etc. Applications could use this to name servers. The specification would define a small set of these.
Doug