One additional benefit of using HTTP is that people are always working
to improve performance, and not only by optimizing servers -- see Google's SPDY:
http://www.readwriteweb.com/archives/spdy_google_wants_to_speed_up_the_web.php
Multiplexed requests, compressed headers, etc...
Patrick
On 10/14/09 9:37 AM, "Doug Cutting" wrote:
Kan Zhang wrote:
One problem I see with using HTTP is that it's expensive to provide data
encryption. We're currently adding 2 authentication mechanisms (Kerberos and
DIGEST-MD5) to our existing RPC. Both of them can provide data encryption
for subsequent communication over the authenticated channel.
On 10/9/09 12:56 PM, "Doug Cutting" wrote:
> Sanjay Radia wrote:
>> Will the RPC over HTTP be transparent so that we can replace it with a
>> different layer if needed?
>
> Yes.
>
>> My worry was the separation of data and checksums; someone had mentioned
>> that one could do this over 2 RPCs - that is not transparent.
On 10/9/09 10:49 AM, "Doug Cutting" wrote:
>
>> It is an interesting question how much we
>> depend on being able to answer queries out of order. There are some
>> parts of the code where overlapping requests from the same client
>> matter. In particular, the terasort scheduler uses threads to access the namenode.
Sanjay Radia wrote:
Will the RPC over HTTP be transparent so that we can replace it with a
different layer if needed?
Yes.
My worry was the separation of data and checksums; someone had mentioned
that one could do this over 2 RPCs - that is not transparent.
That was suggested as a possibility ...
> iThreadedHttpConnectionManager.html#setMaxConnectionsPerHost%28int%29
>
> Connections are not free of course, but Jetty has been benchmarked at
> 20,000 concurrent connections:
>
> http://cometdaily.com/2008/01/07/2-reasons-that-comet-scales/
>
>> In short, I think
In short, I think that an HTTP transport is great for playing with, but
I don't think you can assume it will work as the primary transport.
I agree, we cannot assume it. But it's easy to try it and see how it
fares. Any investment in getting it working is perhaps not ...
In particular, the terasort scheduler uses threads to access
the namenode. That would stop providing any pipelining, which I
believe would be significant.
In short, I think that an HTTP transport is great for playing with,
but I don't think you can assume it will work as the primary transport.
-- Owen
>> With respect to Avro/Hadoop, I suspect requests from clients to be time
>> clustered.
>
> That was my thought as well. The thing that gets me is that in the case
> of Hadoop (and the related subprojects) the clients utilizing this
> particular HTTP connection are probably going to be pretty sm
On 10/5/09 1:47 PM, "Ryan Rawson" wrote:
> I have a question about these headers... will they impact the ability to do
> many, but small, rpcs? Imagine you'd need to support 5,000 to 50,000
> rpcs/second. Would this help or hinder?
>
As long as the HTTP response and request fit in one network ...
Scott Carey wrote:
> Even in the beacon case, if the browser is likely to send another request
> shortly, it cuts the effective network latency in half.
Which is generally not the case in the beacon / ad server use case. That
was the only point I was making. That's beside the point, though. I
think ...
On 10/5/09 1:53 PM, "Eric Sammer" wrote:
> Ryan:
>
> Certainly keep alive will help in this case, if that's what you're
> referring to. The server holds the socket for N seconds or M requests,
> which ever comes first. What you're saving with KA is the connection
> setup / tear down. If you have a lot of cases where the client makes a
> single request ...
Sanjay Radia wrote:
What about out-of-order exchange? Will we be able to support that with
HTTP transport?
Out-of-order exchange was originally added to Hadoop's RPC when it was a
part of Nutch. It's an important optimization for distributed search,
but it's not clear how ...
Ryan Rawson wrote:
That's good to know. I thought keep-alive would help... but I was also talking
about the overhead of a header where the payload is smaller than the framing.
E.g.: 8-byte requests, excluding which RPC. This seems like we could be hurt since
the headers are potentially 5x the size of our payload/request params.
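Ryan's framing worry is easy to put numbers on. Below is a back-of-the-envelope sketch; the header set and the avro/binary content type are a hypothetical minimum chosen for illustration, not what an Avro HTTP transport would necessarily emit:

```java
public class HeaderOverhead {
    // A minimal hypothetical HTTP/1.1 request header block for an 8-byte RPC payload.
    public static String headers() {
        return "POST /avro/org.apache.hadoop.hdfs.NameNode HTTP/1.1\r\n"
             + "Host: nn.example.com\r\n"
             + "Content-Type: avro/binary\r\n"
             + "Content-Length: 8\r\n"
             + "\r\n";
    }

    public static void main(String[] args) {
        int payloadBytes = 8;
        int headerBytes = headers().length();  // 123 bytes for this minimal set
        System.out.println(headerBytes + " header bytes for " + payloadBytes
                + " payload bytes (" + headerBytes / payloadBytes + "x)");
    }
}
```

Even with this stripped-down header set, the framing is roughly 15x an 8-byte payload, which supports the concern; real requests carrying Date, Server, User-Agent, etc. would be larger still.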
Ryan:
Certainly keep-alive will help in this case, if that's what you're
referring to. The server holds the socket for N seconds or M requests,
whichever comes first. What you're saving with KA is the connection
setup / tear down. If you have a lot of cases where the client makes a
single request ...
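Eric's point about what keep-alive saves can be sketched with a toy cost model (the numbers are hypothetical; the model charges one extra round trip for TCP setup on every request when connections are not reused):

```java
public class KeepAliveMath {
    // Toy model: every request costs one round trip; without keep-alive,
    // each request also pays one extra round trip for TCP connection setup.
    public static double totalMs(int requests, double rttMs, boolean keepAlive) {
        double setupMs = keepAlive ? rttMs : requests * rttMs;
        return setupMs + requests * rttMs;
    }

    public static void main(String[] args) {
        System.out.println("no keep-alive: " + totalMs(100, 1.0, false) + " ms");
        System.out.println("keep-alive:    " + totalMs(100, 1.0, true) + " ms");
    }
}
```

For back-to-back requests this is the "cuts the effective network latency in half" effect Scott mentions in the thread: under this model, 100 requests at a 1 ms round trip cost 200 ms cold versus 101 ms with a reused connection.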
I have a question about these headers... will they impact the ability to do
many, but small, rpcs? Imagine you'd need to support 5,000 to 50,000
rpcs/second. Would this help or hinder?
On Oct 5, 2009 4:44 PM, "Eric Sammer" wrote:
Doug Cutting wrote:
> More or less. Except we can probably arrange to omit most of those
> response headers except Content-Length. Are any others strictly required?
Content-Type and Server are probably unavoidable. Some of the others are
extremely helpful during development / debugging / etc. It ...
Choosing Grizzly is independent of choosing HTTP as a wire transport or choosing a server.
Agreed.
Hence the main advantages that remain for HTTP transport are
1) language-independent spec for the protocol. The message headers
will be in Avro so that is easy and the message exchange should be
fairly straightforward. I see this as a minor advantage ...
On Sep 29, 2009, at 2:08 PM, Doug Cutting wrote:
...
Alternately, we could try to make Avro's RPC more HTTP-friendly, and
pull stuff out of Avro's payload into HTTP headers. The downside of
that would be that, if we still wish to support non-HTTP transports,
we'd end up with duplicated logic.
I wanted to chime in on a few things, since avro is a candidate for
the HBase RPC.
I am not sure that "browser compatibility" is a legitimate requirement
for this kind of thing. It is at odds with high performance in a
number of areas, and isn't the driving factor for using HTTP anyways.
Security ...
On 9/29/09 2:57 PM, "stack" wrote:
> On Tue, Sep 29, 2009 at 2:08 PM, Doug Cutting wrote:
>
>>
>> Alternately, we could try to make Avro's RPC more HTTP-friendly, and pull
>> stuff out of Avro's payload into HTTP headers. The downside of that would
>> be that, if we still wish to support non-HTTP transports, we'd end up
>> with duplicated logic.
BTW, java.net.URLConnection is the likely bottleneck there - it stinks
performance-wise. The Apache Commons HTTP client is much faster. For an
example, try it in JMeter and switch from one connector to the other.
On 9/29/09 4:17 PM, "Doug Cutting" wrote:
stack wrote:
> So, are we talking about doing something like the following for a
> request/response:
Out of curiosity, do we have such numbers for the current Hadoop RPC?
On 9/29/09 4:17 PM, "Doug Cutting" wrote:
stack wrote:
> So, are we talking about doing something like the following for a
> request/response:
>
> GET /avro/org.apache.hadoop.hbase.RegionServer HTTP/1.1
> Host: www.example.com
Raghu Angadi wrote:
Does this mean the current Avro RPC transport (an improved version of Hadoop
RPC) can still exist as long as it is supported by developers?
Sure, folks can create new transports for Avro. There is, for example,
in Hadoop Common some code that tunnels Avro RPCs inside Hadoop RPCs.
stack wrote:
So, are we talking about doing something like the following for a
request/response:
GET /avro/org.apache.hadoop.hbase.RegionServer HTTP/1.1
Host: www.example.com
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
Last-Modified: ...
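For what the client side of such an exchange might look like in code, here is a sketch that POSTs an opaque binary payload to a per-protocol URL. The /avro/<protocol> path layout and the avro/binary content type are assumptions for illustration, since the thread had not settled the details:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class AvroHttpClientSketch {
    // POST the serialized request to a per-protocol URL and return the raw response body.
    public static byte[] call(String baseUrl, String protocol, byte[] request)
            throws IOException {
        URL url = new URL(baseUrl + "/avro/" + protocol);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "avro/binary");
        conn.setFixedLengthStreamingMode(request.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(request);
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        }
    }
}
```

Note this uses POST rather than the GET shown above, since an RPC request carries a body; keep-alive is handled transparently by the HTTP layer, so repeated calls can reuse the connection.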
-129 and it seems like a great example of using HTTP
transport.
Does this mean the current Avro RPC transport (an improved version of Hadoop
RPC) can still exist as long as it is supported by developers?
Where does security lie: Avro or the transport layer?
If it is part of Avro: the transport layer does ...
On Tue, Sep 29, 2009 at 2:08 PM, Doug Cutting wrote:
>
> Alternately, we could try to make Avro's RPC more HTTP-friendly, and pull
> stuff out of Avro's payload into HTTP headers. The downside of that would
> be that, if we still wish to support non-HTTP transports, we'd end up with
> duplicated logic.
stack wrote:
What do you think the path on the first line will look like? Will it be a
method name or will it be customizable?
Avro RPC currently includes the message name in the payload, so, unless
that changes, for Avro RPC, we'd probably use a different URL per
protocol. As a convention we might ...
On Tue, Sep 29, 2009 at 12:43 PM, Doug Cutting wrote:
>
> The question I'm asking now is about the wire format, whether we wish to
> precede each RPC request with something like "GET
> /avro/org.apache.hadoop.hdfs.NameNode HTTP/1.1\n" and each response with
> "HTTP/1.1 200 OK\n", plus a couple of
Sanjay Radia wrote:
Wrt connection pooling/async servers: Can't we use the same libraries
that Jetty and Tomcat use?
Grizzly?
Grizzly also supports HTTP. Choosing Grizzly is independent of choosing
HTTP as a wire transport or choosing a server.
The question I'm asking now is about the wire format ...
On Sep 28, 2009, at 3:42 PM, Doug Cutting wrote:
Owen O'Malley wrote:
> I've got concerns about this. Both tactical and strategic. The tactical
> problem is that I need to get security (both Kerberos and token) into
> 0.22. I'd really like to get Avro RPC into 0.22. I'd like both to be
> ...
Owen O'Malley wrote:
I've got concerns about this. Both tactical and strategic. The tactical
problem is that I need to get security (both Kerberos and token) into
0.22. I'd really like to get Avro RPC into 0.22. I'd like both to be
done roughly in 5 months. If you switch off of the current RPC ...
On Sep 11, 2009, at 2:41 PM, Doug Cutting wrote:
I'm considering an HTTP-based transport for Avro as the preferred,
high-performance option.
HTTP has lots of advantages. In particular, it already has
- lots of authentication, authorization and encryption support;
- highly optimized servers;
On Sep 11, 2009, at 2:41 PM, Doug Cutting wrote:
I'm considering an HTTP-based transport for Avro as the preferred,
high-performance option.
I've got concerns about this. Both tactical and strategic. The
tactical problem is that I need to get security (both Kerberos and
token) into 0.22 ...
Scott Carey wrote:
HTTP is very useful and typically performs very well. It has lots of
things built-in too. In addition to what you mention, it has a
caching mechanism built-in, range queries, and all sorts of ways to
tag along state if needed. To top it off there are a lot of testing
and de
Ok, I have some thoughts on this. I might be misinterpreting some use cases
here however.
HTTP is very useful and typically performs very well. It has lots of things
built-in too. In addition to what you mention, it has a caching mechanism
built-in, range queries, and all sorts of ways to tag along state if needed.
I'm considering an HTTP-based transport for Avro as the preferred,
high-performance option.
HTTP has lots of advantages. In particular, it already has
- lots of authentication, authorization and encryption support;
- highly optimized servers;
- monitoring, logging, etc.
Tomcat and other ser
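As a rough illustration of how little server code an HTTP transport needs, here is a sketch using the JDK's built-in com.sun.net.httpserver, standing in for the Jetty/Tomcat options discussed; the echo handler and the /avro/Echo path are hypothetical:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class EchoRpcServer {
    // Start an HTTP endpoint on an ephemeral port that echoes the request
    // body back, standing in for a real RPC responder.
    public static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/avro/Echo", exchange -> {
            byte[] body = exchange.getRequestBody().readAllBytes();
            exchange.getResponseHeaders().set("Content-Type", "avro/binary");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Authentication, TLS, logging and monitoring can then be layered on by any standard HTTP server or proxy placed in front, which is the advantage being argued for in the thread.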