I wanted to chime in on a few things, since Avro is a candidate for the HBase RPC.

I am not sure that "browser compatibility" is a legitimate requirement for this kind of thing. It is at odds with high performance in a number of areas, and isn't the driving factor for using HTTP anyway.

Security - you can get the advantage of security standards, such as the X.509 SSL cert, without actually using HTTPS.

Headers - I don't really think providing a caching mechanism built into the RPC layer is a top requirement. We'd then have to build a GET/POST idempotent flag into the Avro IDL, and everyone would have to get it right, etc. Considering my top requirement is "make bulk data access and RPC rate/sec as high as possible", I'm not sure caching fits in here, and it can work against that.

On Tue, Sep 29, 2009 at 8:06 PM, Scott Carey <sc...@richrelevance.com> wrote:

> On 9/29/09 2:57 PM, "stack" <st...@duboce.net> wrote:
>
>> On Tue, Sep 29, 2009 at 2:08 PM, Doug Cutting <cutt...@apache.org> wrote:
>>
>>> Alternately, we could try to make Avro's RPC more HTTP-friendly, and pull
>>> stuff out of Avro's payload into HTTP headers. The downside of that would
>>> be that, if we still wish to support non-HTTP transports, we'd end up with
>>> duplicated logic.
>>
>> There would be loads of upside I'd imagine if there was a natural mapping of
>> avro payload specifiers and metadata up into http headers in terms of
>> visibility
>
> There are some very serious disadvantages to headers if overused.
>
> I highly advise keeping what goes into the URL and headers very specific to
> support well defined features for this specific transport type. Otherwise,
> put it in the data payload for all transports.
>
> A couple header disadvantages:
> * Limited character set allowed. You can't put any data in there you want,
> and you can end up with an inefficient encoding mess that is not easy to
> read.
> * Headers don't take advantage of other transport features. For example,
> Content-Encoding:gzip provides gzip compression support for the data
> payload, but you can't compress the headers in HTTP.
>
> On the other hand, custom headers can be handy ways to implement transport
> specific handshakes or advertise capabilities (which helps build in
> cross-version compatibility).
> But browsers only work with the standard ones, so whatever 'browser
> requirement' is out there is going to be a limited subset no matter how you
> do it.
>
> This thread brings up the security features. Payload encryption does not
> seem to be a transport feature -- but it could be done via something like
> Content-Encoding (X-Avro-Content-Encrypted?). It seems to fit better IMO
> within the payload itself, or at the socket / network level via SSH or a
> secure tunnel.
>
> Authentication is a better fit for the transport layer -- but as mentioned
> elsewhere if it has to be done for all transports, could it fit in the
> payload somehow?
>
>> So, are we talking about doing something like following for a
>> request/response:
>>
>> GET /avro/org.apache.hadoop.hbase.RegionServer HTTP/1.1
>> Host: www.example.com
>>
>>
>> HTTP/1.1 200 OK
>> Date: Mon, 23 May 2005 22:38:34 GMT
>> Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
>> Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
>> Etag: "3f80f-1b6-3e1cb03b"
>> Accept-Ranges: bytes
>> Content-Length: 438
>> Connection: close
>> Content-Type: X-avro/binary
>
> It's acceptable to drop a lot of the headers above. Some of them are useful
> to implement extended functionality -- the Etag can be used for caching if
> that were desired, for example. Keep-Alive connections and chunked
> responses are nice built-ins too.
>
>> ... or some variation on above on each and every RPC?
>>
>> St.Ack
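
A rough sketch of the header-compression point quoted above, in Python. It is only an illustration, not anything from Avro or HBase: the `X-Avro-Meta` header name, the `build_response` helper, and the sample metadata string are all hypothetical. It builds the same response twice, once with the metadata in a custom header and once with it in the gzipped body, and compares the bytes that would go on the wire.

```python
import gzip


def build_response(headers, payload):
    """Return (head_bytes, body_bytes) for an HTTP/1.1 response.

    Only the body is gzip-compressed; the status line and headers are
    sent as plain text, which is how Content-Encoding works in HTTP/1.1.
    """
    body = gzip.compress(payload)
    all_headers = dict(headers)
    all_headers["Content-Encoding"] = "gzip"
    all_headers["Content-Length"] = str(len(body))
    head = "HTTP/1.1 200 OK\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in all_headers.items())
    head += "\r\n"
    return head.encode("ascii"), body


# Hypothetical, highly repetitive metadata (stands in for schema info etc.).
metadata = "schema-fingerprint=" + "ab" * 200

# Variant 1: metadata carried in a custom (hypothetical) header, empty body.
header_variant = build_response({"X-Avro-Meta": metadata}, b"")

# Variant 2: metadata carried in the payload, so gzip can shrink it.
body_variant = build_response({}, metadata.encode("ascii"))

header_total = len(header_variant[0]) + len(header_variant[1])
body_total = len(body_variant[0]) + len(body_variant[1])

print("metadata in header:", header_total, "bytes on the wire")
print("metadata in body:  ", body_total, "bytes on the wire")
```

The exact sizes depend on how compressible the metadata is; the point is only that whatever goes into a header is sent uncompressed on every request.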