On Sat, Aug 28, 2010 at 1:29 PM, Ryan Rawson <[email protected]> wrote:
> a production server should be CPU bound, with memory caching etc. Our
> prod systems do see a reasonable load, and jstack always shows some
> kind of wait, generally...
>
> but we need more IO pushdown into HDFS. For example, if we are loading
> regions, why not do N at the same time? That figure N probably depends
> more on how many disks/node you have than on anything else.
>
> For simple reads (e.g. hfile), would it really be that hard to retrofit
> some kind of async netty-based API on top of the existing DFSClient
> logic?

It would probably be a duplication rather than a retrofit, but it's
doable -- the protocol is pretty simple for reads, and failure/retry is
much less complicated than for writes (though still pretty complicated).
Rough sketches of both ideas are at the bottom of this mail.

> -ryan
>
> On Sat, Aug 28, 2010 at 1:11 PM, Todd Lipcon <[email protected]> wrote:
> > Depending on the workload, parallelism doesn't seem to matter much. On my
> > 8-core Nehalem test cluster with 12 disks each, I'm always network bound
> > far before I'm CPU bound for most benchmarks, i.e. jstacks show threads
> > mostly waiting for IO to happen, not blocked on locks.
> >
> > Is that not the case for your production boxes?
> >
> > On Sat, Aug 28, 2010 at 1:07 PM, Ryan Rawson <[email protected]> wrote:
> >
> >> bigtable was written for 1-core machines, with ~100 regions per box.
> >> Thanks to CMS we generally can't run on < 4 cores, and at this point
> >> 16-core machines (with HTT) are becoming pretty standard.
> >>
> >> The question is, how do we leverage the ever-increasing sizes of
> >> machines and differentiate ourselves from bigtable? What did Google
> >> do (if anything) to adapt to 16-core machines? We should be able to
> >> do quite a bit on a 20- or 40-node cluster.
> >>
> >> more thread parallelism?
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera
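
A minimal sketch of the parallel region-open idea, assuming a
hypothetical Region handle whose open() does the blocking HFile/log
work. Every name here (ParallelRegionOpener, openAll, Region) is made
up for illustration, and the one-opener-thread-per-spindle sizing is
the guess from this thread, not a measured figure:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRegionOpener {
  private final ExecutorService pool;

  public ParallelRegionOpener(int disksPerNode) {
    // One opener thread per spindle: enough to keep each disk busy
    // without oversubscribing seeks.
    this.pool = Executors.newFixedThreadPool(disksPerNode);
  }

  // Queue every region and block until the last one is open.
  public void openAll(List<Region> regions) throws InterruptedException {
    for (final Region r : regions) {
      pool.execute(new Runnable() {
        public void run() {
          r.open();  // blocking HFile reads, log replay, etc.
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
  }

  /** Placeholder for whatever the real region handle looks like. */
  public interface Region {
    void open();
  }
}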
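And a sketch of the calling convention an async positional-read API
might expose on top of the blocking client. This version just parks a
pool thread per outstanding request as a stand-in; a real netty-based
version would drive the datanode read protocol from an event loop
instead of blocking anywhere. AsyncHFileReader, ReadCallback, and
BlockingReader are hypothetical names, with BlockingReader mirroring
the positional-read shape of FSDataInputStream:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncHFileReader {
  /** Fired off the caller's thread when the read completes or fails. */
  public interface ReadCallback {
    void onData(byte[] buf, int length);
    void onError(Throwable t);
  }

  /** Minimal stand-in for the blocking positional read being wrapped. */
  public interface BlockingReader {
    int read(long position, byte[] buffer, int offset, int length)
        throws java.io.IOException;
  }

  private final ExecutorService ioPool = Executors.newCachedThreadPool();
  private final BlockingReader reader;

  public AsyncHFileReader(BlockingReader reader) {
    this.reader = reader;
  }

  // Issue a positional read and return immediately; preads carry their
  // own offset, so many reads against one file can be in flight at once.
  public void read(final long offset, final int len, final ReadCallback cb) {
    ioPool.execute(new Runnable() {
      public void run() {
        try {
          byte[] buf = new byte[len];
          int n = reader.read(offset, buf, 0, len);
          cb.onData(buf, n);
        } catch (Throwable t) {
          cb.onError(t);
        }
      }
    });
  }
}

The point of the callback shape is that a scanner could issue reads for
the next several blocks before it needs them, which is where the IO
pushdown would come from.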
