I completely agree with Ryan. Most of the measurements in HDFS-347 are point
comparisons: data rate over a socket, single-threaded sequential read from the
datanode, single-threaded random read from the datanode, etc. These measurements
are good, but when you run the entire HBase system under load, you definitely
see a 3X performance improvement when reading data locally (instead of going
through the datanode).

-dhruba

On Fri, Jun 3, 2011 at 11:08 AM, Ryan Rawson <ryano...@gmail.com> wrote:

> Could you explain your HDFS-347 comment more?  I don't think people
> suggested that the socket itself was the primary issue, but dealing
> with the datanode and the socket and everything was really slow.  It's
> hard to separate concerns and test only 1 thing at a time - for
> example you said 'local socket comm isn't the problem', but there is no
> way to build a test that uses a local socket but not the datanode.
>
> The basic fact is that the datanode adds a lot of overhead, and under high
> concurrency that overhead grows.
>
>
>
> On Fri, Jun 3, 2011 at 7:07 AM, Kihwal Lee <kih...@yahoo-inc.com> wrote:
> > HDFS-941
> > The trunk has moved on so the patch won't apply.  There have been
> > significant changes in HDFS lately, so it will require more than a simple
> > rebase/merge.  If the original assignee is busy, I am willing to help.
> >
> > HDFS-347
> > The analysis is pointing out that local socket communication is actually
> > not the problem. The initial assumption that the local socket is slow
> > should be ignored and the design should be revisited.
> >
> > I agree that improving local pread performance is critical.  Based on my
> > experiments, HDFS-941 helps a lot and the communication channel is no
> > longer the bottleneck.
> >
> > Kihwal
> >
> >
> > On 6/2/11 4:00 PM, "Doug Meil" <doug.m...@explorysmedical.com> wrote:
> >
> > Hi folks, I was wondering if there was any movement on any of these HDFS
> > tickets for HBase.  The umbrella ticket is HDFS-1599, but the last comment
> > from stack back in Feb highlighted interest in several tickets:
> >
> >
> > 1) HDFS-918 (use single selector)
> >    a. Last comment Jan 2011
> >
> > 2) HDFS-941 (reuse of connection)
> >    a. Patch available as of April 2011
> >    b. But ticket still unresolved.
> >
> > 3) HDFS-347 (local reads)
> >    a. Discussion seemed to end in March 2011 with a huge comment
> >       saying that there was no performance benefit.
> >    b. I'm working my way through this comment/report, but intuitively
> >       it seems like it would be a good idea since, as the other comments
> >       in the ticket stated, the RS reads locally just about every time.
> >
> >
> > Doug Meil
> > Chief Software Architect, Explorys
> > doug.m...@explorys.com
> >
> >
> >
>



-- 
Connect to me at http://www.facebook.com/dhruba
