On Fri, Jun 3, 2011 at 3:38 PM, Andrew Purtell <[email protected]> wrote:
> I have patches for HDFS-347 and HDFS-941 (and HDFS-918) for CDH3U0.

Does your 347 patch do security? Or is it just the one where it sneaks around back?

Have you tested the others under real load for a couple days?

>
>   - Andy
>
>> From: Doug Meil <[email protected]>
>> Subject: RE: HDFS-1599 status? (HDFS tickets to improve HBase)
>> To: "[email protected]" <[email protected]>
>> Date: Friday, June 3, 2011, 12:50 PM
>> Thanks everybody for commenting on
>> this thread.
>>
>> We'd certainly like to lobby for movement on these two
>> tickets, and although we don't have anybody that is familiar
>> with the source code we'd be happy to perform some tests to get
>> some performance numbers.
>>
>> Per Kihwal's comments, it sounds like HDFS-941 needs to get
>> re-worked because the patch is stale.
>>
>> The patch for HDFS-347 sounds like it's still usable.
>>
>> So what else is needed to push this effort forward?
>> Is it beneficial to get more numbers on HDFS-347 and keep
>> lobbying on the ticket, and/or is there another path that
>> should be taken (plying with beer, free Cleveland Indians
>> tickets, harassing phone calls, etc.)?
>>
>>
>>
>> -----Original Message-----
>> From: Dhruba Borthakur [mailto:[email protected]]
>>
>> Sent: Friday, June 03, 2011 3:00 PM
>> To: [email protected]
>> Subject: Re: HDFS-1599 status? (HDFS tickets to improve
>> HBase)
>>
>> I completely agree with Ryan. Most of the measurements in
>> HDFS-347 are point comparisons: data rate over socket,
>> single-threaded sequential read from datanode,
>> single-threaded random read from datanode, etc. These
>> measurements are good, but when you run the entire HBase
>> system at load, you definitely see a 3X performance
>> improvement when reading data locally (instead of going
>> through the datanode).
>>
>> -dhruba
>>
>> On Fri, Jun 3, 2011 at 11:08 AM, Ryan Rawson <[email protected]>
>> wrote:
>>
>> > Could you explain your HDFS-347 comment more?  I don't think people
>> > suggested that the socket itself was the primary issue, but dealing
>> > with the datanode and the socket and everything was really slow.  It's
>> > hard to separate concerns and test only 1 thing at a time - for
>> > example you said 'local socket comm isn't the problem', but there is
>> > no way to build a test that uses a local socket but not the datanode.
>> >
>> > The basic fact is that datanode adds a lot of overhead, and under high
>> > concurrency that overhead grows.
>> >
>> >
>> >
>> > On Fri, Jun 3, 2011 at 7:07 AM, Kihwal Lee <[email protected]> wrote:
>> > > HDFS-941
>> > > The trunk has moved on so the patch won't apply.  There have been
>> > > significant changes in HDFS lately, so it will require more than a
>> > > simple rebase/merge.  If the original assignee is busy, I am willing
>> > > to help.
>> > >
>> > > HDFS-347
>> > > The analysis is pointing out that local socket communication is
>> > > actually not the problem. The initial assumption of local socket
>> > > being slow should be ignored and the design should be revisited.
>> > >
>> > > I agree that improving local pread performance is critical.  Based
>> > > on my experiments, HDFS-941 helps a lot and the communication
>> > > channel was no longer the bottleneck.
>> > >
>> > > Kihwal
>> > >
>> > >
>> > > On 6/2/11 4:00 PM, "Doug Meil" <[email protected]> wrote:
>> > >
>> > > Hi folks, I was wondering if there was any movement on any of these
>> > > HDFS tickets for HBase.  The umbrella ticket is HDFS-1599, but the
>> > > last comment from stack back in Feb highlighted interest in several
>> > > tickets:
>> > >
>> > > 1) HDFS-918 (use single selector)
>> > >    a. Last comment Jan 2011
>> > >
>> > > 2) HDFS-941 (reuse of connection)
>> > >    a. Patch available as of April 2011
>> > >    b. But ticket still unresolved.
>> > >
>> > > 3) HDFS-347 (local reads)
>> > >    a. Discussion seemed to end in March 2011 with a huge comment
>> > >       saying that there was no performance benefit.
>> > >    b. I'm working my way through this comment/report, but intuitively
>> > >       it seems like it would be a good idea since, as the other
>> > >       comments in the ticket stated, the RS reads locally just about
>> > >       every time.
>> > >
>> > > Doug Meil
>> > > Chief Software Architect, Explorys
>> > > [email protected]
>> > >
>> > >
>> > >
>> >
>>
>>
>>
>> --
>> Connect to me at http://www.facebook.com/dhruba
>>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
