On Wed, Oct 20, 2010 at 7:29 AM, Stack <[email protected]> wrote:
> Hey Dan:
>
> On Wed, Oct 20, 2010 at 2:09 AM, Dan Harvey <[email protected]>
> wrote:
> > Hey,
> >
> > We're just looking into ways to run multiple instances/versions of HBase
> for
> > testing/development and were wondering how other people have gone about
> > doing this.
> >
>
> Development of replication feature has made it so tests now can put up
> multiple concurrent clusters.  See TestHBaseClusterUtility, which
> starts up three clusters in a single JVM, each homed on its own
> directory in a single zookeeper instance and each running its own
> hdfs (having them share an hdfs should work too, though it might need
> some HBaseTestingUtility fixup).
>
> At SU, there are multiple clusters: a serving cluster for low-latency
> (replicating to backup cluster) and then a cluster for MR jobs, dev
> clusters, etc.  Generally these don't share hdfs though again
> clusters with like SLAs could.
>
> > If we used just one hadoop cluster then we can have a different paths /
> user
> > for each hbase instance, and then have a set of zookeeper nodes for each
> > instance (or run multiple zk's on each server binding to different hosts
> for
> > each instance..).
>
> You could do that. Have all share same zk ensemble (Run one per
> datacenter?)
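To make the shared-ensemble idea concrete, here is a minimal sketch of an hbase-site.xml for one instance sharing a single HDFS and a single zookeeper ensemble with other instances. The property names (hbase.rootdir, zookeeper.znode.parent, hbase.zookeeper.quorum) are standard HBase settings; the paths, instance name, and host names are made-up examples:

```xml
<!-- hbase-site.xml for a hypothetical instance "dev1"; assumes one
     shared HDFS and one shared zookeeper ensemble -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <!-- each instance gets its own directory on the shared HDFS -->
    <value>hdfs://namenode:8020/hbase-dev1</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <!-- each instance is homed on its own znode in the shared ensemble -->
    <value>/hbase-dev1</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
</configuration>
```

Each additional instance would repeat this with its own rootdir and parent znode, so the instances stay isolated while sharing the same HDFS and zookeeper hardware.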
>
> > If we used multiple hadoop clusters then the only difference would be
> just
> > using different hdfs for storing the data.
> >
> > Does anyone have experiences with problems or benefits to either of the
> > above?
> >
> > I'm tempted to go towards the single cluster for more efficient use of
> > hardware but I'm not sure if that's a good idea or not.
> >
>
> At SU the cluster serving the frontend is distinct from the cluster
> running the heavy-duty MR jobs. When a big MR job started up, the
> front-end latency tended to suffer.  There might be some ratio of HDFS
> nodes to HBase nodes that would let the low-latency and MR clusters
> share HDFS, but I've not done the work to figure it out.
>
I think a low-latency requirement (even a loose one) rules out running
any heavy M/R job in the same cell, and here is the reason: if a heavy
M/R task starts running on a machine, it may peg the CPU, evict memory,
and so on, which makes access to the data served by that RS much higher
latency than normal.
Any comment?
>
> St.Ack
>