Major compaction runs once a day by default. You can also trigger it manually in the HBase shell by typing:

hbase(main):001:0> major_compact "table_name"
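
For reference, the automatic interval is set by the hbase.hregion.majorcompaction property, in milliseconds. Below is a minimal hbase-site.xml sketch, assuming the 0.90-era default of 86400000 ms (one day). The default may differ in other releases, so check the hbase-default.xml shipped with your version.

```xml
<!-- hbase-site.xml fragment (illustrative sketch, not a tuned recommendation) -->
<configuration>
  <property>
    <!-- Time between automatic major compactions, in milliseconds. -->
    <name>hbase.hregion.majorcompaction</name>
    <!-- Assumed default for this era: 86400000 ms = 1 day. -->
    <!-- A value of 0 disables time-based major compactions, -->
    <!-- leaving only manual runs via major_compact in the shell. -->
    <value>86400000</value>
  </property>
</configuration>
```

The region servers need to pick up the changed file (typically via a restart) for a new value to take effect.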

-ryan


On Mon, Mar 14, 2011 at 3:25 PM, Weiwei Xiong <xion...@gmail.com> wrote:
> Thanks for your info Ryan.
> Does HBase do major compaction regularly or do I need to manually do this?
> If it's automatic, how frequently is it performed?
> I am running 1 replication.
> Thanks,
> -- Weiwei
>
> On Mon, Mar 14, 2011 at 3:18 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>
>> HDFS does the data rebalancing. Over time, as major compactions run and new
>> data comes in, files are written first to the local node and then to
>> remote nodes.
>>
>> What's the replication factor you are running?  HDFS on 2 nodes is
>> tricky, since you can either choose r=1 (no data protection) or r=2
>> (all writes go to both nodes).
>>
>> The sweet spot is above 6 nodes, alas.
>>
>> -ryan
>>
>> On Mon, Mar 14, 2011 at 3:12 PM, Weiwei Xiong <xion...@gmail.com> wrote:
>> > Sorry I forgot to mention. I am using HBase 0.90.1 over HDFS 0.20.append
>> > Thanks,
>> > -- Weiwei
>> >
>> > On Mon, Mar 14, 2011 at 3:10 PM, Weiwei Xiong <xion...@gmail.com> wrote:
>> >>
>> >> Thanks very much for your replies.
>> >> Something was unclear in my previous emails. I started one node first
>> >> and added another later. Some regions had already been created on the
>> >> first node. Then I started to import more data into the same table and
>> >> found that it's always the first node that keeps serving the data
>> >> writes.
>> >> Actually I was expecting that the region data would be rebalanced to
>> >> the other data node. And I did see in the master log that the HBase
>> >> master is trying to unassign some regions from the overloaded node and
>> >> reassign them to the less-loaded node. But the real data was never
>> >> migrated.
>> >> I think I observed the region index and cache rebalancing in the master
>> >> log (correct me if I am wrong).  Does anyone know how frequently this
>> >> happens?
>> >> Another question: does HBase support data and I/O rebalancing, or
>> >> should I rely on HDFS to do data rebalancing? I guess HBase should also
>> >> support data rebalancing; otherwise every time I restart HBase the
>> >> regions will have to be rebalanced again. Can someone tell me how to
>> >> configure or program HBase to do data rebalancing?
>> >> Thanks,
>> >> -- Weiwei
>> >> On Mon, Mar 14, 2011 at 2:43 PM, Ryan Rawson <ryano...@gmail.com>
>> >> wrote:
>> >>>
>> >>> What version of HBase are you testing?
>> >>>
>> >>> Is it literally 0 vs N assignments?
>> >>>
>> >>> On Mon, Mar 14, 2011 at 1:18 PM, Weiwei Xiong <xion...@gmail.com>
>> >>> wrote:
>> >>> > Thanks!
>> >>> >
>> >>> > I checked the master log and found some info like this:
>> >>> > " timestamp ***, INFO org.apache.hadoop.hbase.master.HMaster:
>> >>> > balance hri=***, src=***, dst=*** "
>> >>> >
>> >>> > So I assume the balancer is running. There's no failure info there,
>> >>> > but I didn't see the regions actually being balanced as the log
>> >>> > states.
>> >>> >
>> >>> > Is it possible that the balancing won't work because I keep dumping
>> >>> > data into the table?
>> >>> >
>> >>> > Thanks,
>> >>> > -- Weiwei
>> >>> >
>> >>> > On Mon, Mar 14, 2011 at 12:15 PM, Stack <st...@duboce.net> wrote:
>> >>> >
>> >>> >> Check the master log.  See if the load balancer is running or not.
>> >>> >> It usually runs every 5 minutes by default.  It may not run if
>> >>> >> regions are transitioning.  It'll log regardless.
>> >>> >>
>> >>> >> St.Ack
>> >>> >>
>> >>> >> On Mon, Mar 14, 2011 at 10:50 AM, Weiwei Xiong <xion...@gmail.com>
>> >>> >> wrote:
>> >>> >> > Hi,
>> >>> >> >
>> >>> >> > I recently set up a 2-node Hadoop and HBase cluster and am trying
>> >>> >> > to load data into my HBase table using the HBase client.
>> >>> >> >
>> >>> >> > The issue that bothers me is that the data are always written to
>> >>> >> > one node of the cluster, i.e., all the regions of the HBase table
>> >>> >> > are on one node.
>> >>> >> >
>> >>> >> > Is there any configuration I need to change to make the load
>> >>> >> > balanced?
>> >>> >> >
>> >>> >> > Thanks,
>> >>> >> > -- w
>> >>> >> >
>> >>> >>
>> >>> >
>> >>
>> >
>> >
>
>
