On Sat, Aug 28, 2010 at 1:29 PM, Ryan Rawson <[email protected]> wrote:

> a production server should be CPU bound, with memory caching etc.  Our
> prod systems do see a reasonable load, and jstack always shows some
> kind of wait generally...
>
> but we need more IO pushdown into HDFS.  For example if we are loading
> regions, why not do N at the same time?  That figure N is probably
> more dependent on how many disks/node you have than anything else
> really.
>
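Inline: the N-at-a-time idea could be as simple as a bounded pool sized by spindle count rather than cores -- a rough sketch, where loadRegion and the region names are stand-ins, not the actual HBase open path:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelRegionOpen {
    // Stand-in for the real region-open work; here it just counts calls.
    static final AtomicInteger opened = new AtomicInteger();

    static void loadRegion(String regionName) {
        opened.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        // Up to N loads in flight, with N tied to disks/node, not cores.
        int disksPerNode = 12;
        ExecutorService pool = Executors.newFixedThreadPool(disksPerNode);
        for (String r : List.of("region-a", "region-b", "region-c", "region-d")) {
            pool.submit(() -> loadRegion(r));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("opened " + opened.get() + " regions");
    }
}
```

The interesting tuning question is just what N should be per node; everything else is a stock fixed-size executor.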
> For simple reads (eg: hfile) would it really be that hard to retrofit
> some kind of async netty based API on top of the existing DFSClient
> logic?
>

Would probably be a duplication rather than a retrofit, but it's probably
doable -- the protocol is pretty simple for reads, and failure/retry is much
less complicated than it is for writes (though still pretty complicated).
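For illustration only, here's the callback-driven shape such an async read API might take, sketched against a plain local file with JDK NIO.2 -- the real DFSClient/datanode read protocol is of course a lot more involved than this:

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;

public class AsyncReadSketch {
    static volatile int bytesRead = -1;

    public static void main(String[] args) throws Exception {
        // Local file standing in for an hfile block on a datanode.
        Path p = Files.createTempFile("block", ".dat");
        Files.write(p, "hfile-block-bytes".getBytes());

        CountDownLatch done = new CountDownLatch(1);
        ByteBuffer buf = ByteBuffer.allocate(64);
        AsynchronousFileChannel ch =
                AsynchronousFileChannel.open(p, StandardOpenOption.READ);
        // read() returns immediately; the handler fires on completion, so a
        // single thread could keep many reads in flight instead of parking
        // one thread per read.
        ch.read(buf, 0, null, new CompletionHandler<Integer, Void>() {
            @Override public void completed(Integer n, Void attachment) {
                bytesRead = n;
                done.countDown();
            }
            @Override public void failed(Throwable t, Void attachment) {
                done.countDown();
            }
        });
        done.await();
        ch.close();
        Files.delete(p);
        System.out.println("async read returned " + bytesRead + " bytes");
    }
}
```

A netty version would look much the same from the caller's side: issue the read, get the bytes (or the failure, for the retry logic to handle) in a callback.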


>
> -ryan
>
> On Sat, Aug 28, 2010 at 1:11 PM, Todd Lipcon <[email protected]> wrote:
> > Depending on the workload, parallelism doesn't seem to matter much. On my
> > 8-core Nehalem test cluster with 12 disks each, I'm always network bound
> far
> > before I'm CPU bound for most benchmarks. ie jstacks show threads mostly
> > waiting for IO to happen, not blocked on locks.
> >
> > Is that not the case for your production boxes?
> >
> > On Sat, Aug 28, 2010 at 1:07 PM, Ryan Rawson <[email protected]> wrote:
> >
> >> bigtable was written for 1 core machines, with ~ 100 regions per box.
> >> Thanks to CMS we generally can't run on < 4 cores, and at this point
> >> 16 core machines (with HTT) are becoming pretty standard.
> >>
> >> The question is, how do we leverage the ever-increasing sizes of
> >> machines and differentiate ourselves from bigtable?  What did google
> >> do (if anything) to adapt to the 16 core machines?  We should be able
> >> to do quite a bit on a 20 or 40 node cluster.
> >>
> >> more thread parallelism?
> >>
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
