HBase region size

2011-06-28 Thread Aditya Karanth A
Hi, We have been using Hadoop in our project as a DFS cluster to store some critical information. This critical information is stored as zip files of about 3-5 MB each. The number of these files would grow to more than a billion, and to more than 1 petabyte of storage. We are aware

HBase region size config

2011-06-28 Thread Aditya Karanth A
Hi, We have been using Hadoop in our project as a DFS cluster to store some critical information. This critical information is stored as zip files of about 3-5 MB each. The number of these files would grow to more than a billion, and to more than 1 petabyte of storage. We are

RE: HBase region size

2011-06-28 Thread Buttler, David
My understanding is the following (which will hopefully be corrected by those with more experience): * you should try to limit your cell size to less than 1 MB. This is not a hard and fast rule, but there are certain limits you don't want to exceed: you don't want a row exceeding your region

descaling hbase

2011-06-28 Thread Sam Seigal
Hi All, I have a 14-node cluster set up for HBase. Someone else in my office needs to use some of these machines and I would like to descale my cluster from 14 to 6 machines. Is there an efficient way to do this? Since there is data residing on the machines I want to get rid of, are there

RE: descaling hbase

2011-06-28 Thread Michael Segel
Yeah, but you don't want to drop all of the machines at the same time. When you decommission a node, you need to give the cluster time to rebalance before dropping a second node. That is of course if you don't mind losing any data. :-) Date: Tue, 28 Jun 2011 10:33:39 -0700 Subject: Re:

Re: descaling hbase

2011-06-28 Thread Jean-Daniel Cryans
That's why you should use the DN decommissioning feature that I referred to. You can do it for 20 machines all at the same time if you want as long as you have the capacity. J-D On Tue, Jun 28, 2011 at 11:36 AM, Michael Segel michael_se...@hotmail.com wrote: Yeah, but you don't want to drop
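
The "as long as you have the capacity" condition above can be checked with simple arithmetic before decommissioning several nodes at once. The following is a minimal sketch, not part of the thread: the function name, the per-node figures, and the 80% headroom factor are all hypothetical assumptions for illustration, and it assumes data is spread evenly across nodes.

```python
def can_decommission(used_tb_per_node, capacity_tb_per_node,
                     total_nodes, nodes_to_remove, headroom=0.8):
    """Return True if the remaining nodes can hold all current data.

    Assumptions (hypothetical, for illustration only):
    - every node holds roughly the same amount of data,
    - 'used' already includes HDFS replication,
    - we only want to fill remaining disks up to `headroom` of capacity.
    """
    remaining = total_nodes - nodes_to_remove
    if remaining <= 0:
        return False
    # All data currently stored must fit on the nodes that stay.
    total_used = used_tb_per_node * total_nodes
    usable = capacity_tb_per_node * remaining * headroom
    return total_used <= usable


# Going from 14 to 6 machines, as in the question, with made-up disk figures.
print(can_decommission(1.0, 4.0, total_nodes=14, nodes_to_remove=8))
print(can_decommission(2.0, 4.0, total_nodes=14, nodes_to_remove=8))
```

If the check fails, the nodes would need to be removed in smaller batches or the data reduced first; the real limits depend on the cluster's actual disk usage and replication settings.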

Re: Master died after failed assignment of regionserver

2011-06-28 Thread Jean-Daniel Cryans
Looks like I'll be fixing it in the context of HBASE-3984. J-D On Mon, Jun 27, 2011 at 10:21 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: Yeah the BulkAssigner uses an AssignmentManager.assign method that won't retry if it gets an exception because originally it was only used during

What is the right way to perform a cluster restart?

2011-06-28 Thread Shrijeet Paliwal
Hi Users, We are running the following code. HBase version: 0.90.3 with HBASE-3777, HBASE-2937 and HBASE-3855 on top. Hadoop version: CDH3B3. I am trying to figure out the right way to perform a cluster restart in case we want to push a patched jar or a configuration tweak. I have tried

HMaster crashes during BulkLoad

2011-06-28 Thread Gan, Xiyun
When using HFileOutputFormat, HMaster often crashes. I guess this critical problem is caused by the timeout of Zookeeper. Does anyone know what may have caused this? How might I prevent this from happening again? HBase Version : 0.90.3, r1100350 Hadoop Version : 0.20.3-SNAPSHOT, r1057313

Re: What is the right way to perform a cluster restart?

2011-06-28 Thread Stack
On Tue, Jun 28, 2011 at 6:46 PM, Shrijeet Paliwal shrij...@rocketfuel.com wrote: I am trying to figure out the right way to perform a cluster restart in case we want to push a patched jar or a configuration tweak. I have tried http://wiki.apache.org/hadoop/Hbase/RollingRestart among other things

Re: HMaster crashes during BulkLoad

2011-06-28 Thread Stack
Yes. Your master is timing out against your zk. What's going on? ZK nodes have MR running on them? Your master is doing long GC pauses? St.Ack On Tue, Jun 28, 2011 at 7:46 PM, Gan, Xiyun ganxi...@gmail.com wrote: When using HFileOutputFormat, HMaster often crashes. I guess this critical

Re: HMaster crashes during BulkLoad

2011-06-28 Thread Gan, Xiyun
Thanks Stack. Yeah, I have an M/R job on the ZooKeeper nodes. Is it necessary to separate zk nodes and tasktrackers? On Wed, Jun 29, 2011 at 11:24 AM, Stack st...@duboce.net wrote: Yes. Your master is timing out against your zk. What's going on? ZK nodes have MR running on them? Your master is

Re: HMaster crashes during BulkLoad

2011-06-28 Thread Stack
I'm just guessing the MR job is vampiring i/o from the zk processes. Can you check? St.Ack On Tue, Jun 28, 2011 at 8:29 PM, Gan, Xiyun ganxi...@gmail.com wrote: Thanks Stack. Yeah, I have an M/R job on the ZooKeeper nodes. Is it necessary to separate zk nodes and tasktrackers? On Wed, Jun 29,

Re: HMaster crashes during BulkLoad

2011-06-28 Thread Gan, Xiyun
I use HFileOutputFormat.configureIncrementalLoad() to run a MR job, which lasts about 10 minutes. On Wed, Jun 29, 2011 at 11:53 AM, Stack st...@duboce.net wrote: I'm just guessing the MR job is vampiring i/o from the zk processes. Can you check? St.Ack On Tue, Jun 28, 2011 at 8:29 PM, Gan,

HBase Read and Write Issues in Multithreaded Environments

2011-06-28 Thread Srikanth P. Shreenivas
Hi, We are using an HBase 0.20.3 (hbase-0.20-0.20.3-1.cloudera.noarch.rpm) cluster in distributed mode with Hadoop 0.20.2 (hadoop-0.20-0.20.2+320-1.noarch). We are using pretty much the default configuration, and the only thing we have customized is that we have allocated 4GB RAM in

Re: HBase Read and Write Issues in Multithreaded Environments

2011-06-28 Thread Stack
Can you upgrade? That release is 18 months old. A bunch has happened in the meantime. For retries exhausted, check what's going on on the remote regionserver that you are trying to write to. It's probably struggling and that's why requests are not going through -- or the client missed the fact