Thanks for the info, Ryan.

Does HBase run major compactions automatically, or do I need to trigger them manually?
If it's automatic, how frequently is it performed?

I am running with a replication factor of 1.

Thanks,
-- Weiwei

On Mon, Mar 14, 2011 at 3:18 PM, Ryan Rawson <ryano...@gmail.com> wrote:

> HDFS does the data rebalancing. Over time, as major compactions run and new
> data comes in, files are written first to the local node and then to
> remote nodes.
>
> What's the replication factor you are running?  HDFS on 2 nodes is
> tricky, since you can either choose r=1 (no data protection) or r=2
> (all writes go to both nodes).
>
> The sweet spot is above 6 nodes, alas.
>
> -ryan
>
> On Mon, Mar 14, 2011 at 3:12 PM, Weiwei Xiong <xion...@gmail.com> wrote:
> > Sorry, I forgot to mention: I am using HBase 0.90.1 over HDFS 0.20.append.
> > Thanks,
> > -- Weiwei
> >
> > On Mon, Mar 14, 2011 at 3:10 PM, Weiwei Xiong <xion...@gmail.com> wrote:
> >>
> >> Thanks very much for your replies.
> >> Something was unclear in my previous emails. I started one node first
> >> and added another later, and some regions had already been created on
> >> the first node. Then I started to import more data into the same
> >> table and found that it's always the first node that keeps serving the
> >> data writes.
> >> Actually, I was expecting that the region data would be rebalanced to
> >> the other data node. And I did see in the master log that the HBase master
> >> is trying to unassign some regions from the overloaded node and reassign
> >> them to the less-loaded node. But the real data was never migrated.
> >> I think I observed the region index and cache rebalancing in the master
> >> log (correct me if I am wrong).  Does anyone know how frequently this
> >> happens?
> >> Another question: does HBase support data and I/O rebalancing, or
> >> should I rely on HDFS to do data rebalancing? I guess HBase should also
> >> support data rebalancing; otherwise, every time I restart HBase the
> >> regions will have to be rebalanced again. Could someone tell me how to
> >> configure or program HBase to do data rebalancing?
> >> Thanks,
> >> -- Weiwei
> >> On Mon, Mar 14, 2011 at 2:43 PM, Ryan Rawson <ryano...@gmail.com> wrote:
> >>>
> >>> What version of HBase are you testing?
> >>>
> >>> Is it literally 0 vs N assignments?
> >>>
> >>> On Mon, Mar 14, 2011 at 1:18 PM, Weiwei Xiong <xion...@gmail.com> wrote:
> >>> > Thanks!
> >>> >
> >>> > I checked the master log and found some info like this:
> >>> > " timestamp ***, INFO org.apache.hadoop.hbase.master.HMaster: balance
> >>> > hri=***, src=***, dst=*** "
> >>> >
> >>> > So I assume the balancer is running. There are no failure messages
> >>> > there, but I didn't see the regions actually get balanced as the log
> >>> > states.
> >>> >
> >>> > Is it possible that the balancing doesn't work because I keep dumping
> >>> > data into the table?
> >>> >
> >>> > Thanks,
> >>> > -- Weiwei
> >>> >
> >>> > On Mon, Mar 14, 2011 at 12:15 PM, Stack <st...@duboce.net> wrote:
> >>> >
> >>> >> Check the master log.  See if the load balancer is running or not.
> >>> >> It usually runs every 5 minutes by default.  It may not run if regions
> >>> >> are transitioning.  It'll log regardless.
> >>> >>
> >>> >> St.Ack
> >>> >>
> >>> >> On Mon, Mar 14, 2011 at 10:50 AM, Weiwei Xiong <xion...@gmail.com> wrote:
> >>> >> > Hi,
> >>> >> >
> >>> >> > I recently set up a 2-node Hadoop and HBase cluster and am trying to
> >>> >> > load data into my HBase table using the HBase client.
> >>> >> >
> >>> >> > The issue that bothers me is that the data is always written to one
> >>> >> > node of the cluster, i.e., all the regions of the HBase table are on
> >>> >> > one node.
> >>> >> >
> >>> >> > Is there any configuration I need to change to make the load
> >>> >> > balanced?
> >>> >> >
> >>> >> > Thanks,
> >>> >> > -- w
> >>> >> >
> >>> >>
> >>> >
> >>
> >
> >
>
