Hi,
We have been using Hadoop in our project as a DFS cluster to store some
critical information.
This critical information is stored as zip files of about 3-5 MB each. The
number of these files would grow to more than a billion, amounting to more
than 1 petabyte of storage. We are aware
My understanding is the following (which will hopefully be corrected by those
with more experience):
* you should try to limit your cell size to less than 1 MB. This is not a hard
and fast rule, but there are certain limits you don't want to exceed: you don't
want a row exceeding your region
Hi All,
I have a 14 node cluster setup for HBase. Someone else in my office needs to
use some of these machines, so I would like to scale my cluster down from 14
to 6 machines.
Is there an efficient way to do this? Since there is data residing on the
machines I want to get rid of, are there
Yeah, but you don't want to drop all of the machines at the same time. When you
decommission a node, you need to give the cluster time to rebalance before
dropping a second node.
That is of course if you don't mind losing any data.
:-)
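The one-node-at-a-time approach above can be sketched as a loop. Note that
decommission_node and is_rebalanced below are hypothetical stand-ins for
whatever mechanism you actually use (e.g. the HDFS exclude file plus
-refreshNodes), not real APIs:

```python
import time

def decommission_node(host):
    """Hypothetical helper: add host to the exclude list and refresh nodes."""
    print("decommissioning %s" % host)

def is_rebalanced(host):
    """Hypothetical helper: true once host's blocks are fully re-replicated."""
    return True  # stub for illustration only

def staged_decommission(hosts, poll_secs=0):
    """Drop nodes one at a time, waiting for re-replication between drops."""
    removed = []
    for host in hosts:
        decommission_node(host)
        # Give the cluster time to rebalance before touching the next node.
        while not is_rebalanced(host):
            time.sleep(poll_secs)
        removed.append(host)
    return removed
```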
Date: Tue, 28 Jun 2011 10:33:39 -0700
Subject: Re:
That's why you should use the DN decommissioning feature that I
referred to. You can do it for 20 machines all at the same time if you
want as long as you have the capacity.
J-D
On Tue, Jun 28, 2011 at 11:36 AM, Michael Segel
michael_se...@hotmail.com wrote:
Yeah, but you don't want to drop
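For anyone following along, the DN decommissioning feature J-D mentions works
through an exclude file on the NameNode; the path below is an example, not a
required value:

```xml
<!-- hdfs-site.xml: point the NameNode at an exclude file -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

List the hostnames you want to retire in that file (one per line) and run
`hadoop dfsadmin -refreshNodes`. The nodes then show up as "Decommissioning"
in the NameNode UI until their blocks have been re-replicated elsewhere.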
Looks like I'll be fixing it in the context of HBASE-3984.
J-D
On Mon, Jun 27, 2011 at 10:21 AM, Jean-Daniel Cryans
jdcry...@apache.org wrote:
Yeah, the BulkAssigner uses an AssignmentManager.assign method that
won't retry if it gets an exception because originally it was only
used during
Hi Users,
We are running the following code.
Hbase version : 0.90.3 with HBASE-3777, HBASE-2937 and HBASE-3855 on top
Hadoop version: CDH3B3
I am trying to figure the right way to perform cluster restart in case
we want to push a patched jar or a configuration tweak. I have tried
When using HFileOutputFormat, the HMaster often crashes.
I guess this is caused by a ZooKeeper session timeout.
Does anyone know what may have caused this? How might I prevent this
from happening again?
HBase Version : 0.90.3, r1100350
Hadoop Version : 0.20.3-SNAPSHOT, r1057313
On Tue, Jun 28, 2011 at 6:46 PM, Shrijeet Paliwal
shrij...@rocketfuel.com wrote:
I am trying to figure the right way to perform cluster restart in case
we want to push a patched jar or a configuration tweak. I have tried
http://wiki.apache.org/hadoop/Hbase/RollingRestart among
other things
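For what it's worth, a rolling restart along the lines of that wiki page
looks roughly like the following. The script names come from the HBase bin/
directory of that era and the install path is made up; treat this as a
sketch, not a recipe:

```shell
# On the master, after deploying the new jar/config to every node:
bin/hbase-daemon.sh stop master
bin/hbase-daemon.sh start master

# Then restart region servers one at a time so regions can reassign:
for rs in $(cat conf/regionservers); do
  ssh "$rs" "cd /usr/lib/hbase && \
    bin/hbase-daemon.sh stop regionserver && \
    bin/hbase-daemon.sh start regionserver"
  sleep 60   # give the master time to reassign before the next one
done
```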
Yes. Your master is timing out against your zk. What's going on? ZK
nodes have MR running on them? Your master is doing long GC pauses?
St.Ack
On Tue, Jun 28, 2011 at 7:46 PM, Gan, Xiyun ganxi...@gmail.com wrote:
When using HFileOutputFormat, HMaster often crashes.
I guess this critical
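If separating ZK from the tasktrackers isn't possible right away, raising the
session timeout can paper over short stalls (the property exists in that era's
hbase-default.xml; the value here is only an example, in milliseconds):

```xml
<!-- hbase-site.xml -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value> <!-- ms; raise with care, this delays failure detection -->
</property>
```

That said, the real fix is keeping heavy MR i/o off the quorum nodes, as the
replies below say.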
Thanks Stack.
Yeah, I have an M/R job running on the ZooKeeper nodes. Is it necessary to
separate ZK nodes and tasktrackers?
On Wed, Jun 29, 2011 at 11:24 AM, Stack st...@duboce.net wrote:
Yes. Your master is timing out against your zk. What's going on? ZK
nodes have MR running on them? Your master is
I'm just guessing the MR job is vampiring i/o from the zk processes.
Can you check?
St.Ack
On Tue, Jun 28, 2011 at 8:29 PM, Gan, Xiyun ganxi...@gmail.com wrote:
Thanks Stack.
Yeah, I have an M/R job running on the ZooKeeper nodes. Is it necessary to
separate ZK nodes and tasktrackers?
On Wed, Jun 29,
I use HFileOutputFormat.configureIncrementalLoad() to run an M/R job,
which lasts about 10 minutes.
On Wed, Jun 29, 2011 at 11:53 AM, Stack st...@duboce.net wrote:
I'm just guessing the MR job is vampiring i/o from the zk processes.
Can you check?
St.Ack
On Tue, Jun 28, 2011 at 8:29 PM, Gan,
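For context, a configureIncrementalLoad() driver is usually shaped like the
sketch below. This is written against the 0.90-era API; MyMapper, the table
name, and the paths are all made up for illustration, and it obviously needs
the Hadoop/HBase jars on the classpath:

```java
// Sketch only: assumes a mapper emitting (ImmutableBytesWritable, Put).
Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "bulk-load-example");
job.setJarByClass(MyMapper.class);                  // MyMapper is hypothetical
job.setMapperClass(MyMapper.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(Put.class);
FileInputFormat.addInputPath(job, new Path("/in"));      // example input path
FileOutputFormat.setOutputPath(job, new Path("/hfiles")); // example output path
HTable table = new HTable(conf, "mytable");               // example table name
// Sets the reducer, total-order partitioner and output format so the job
// writes one set of HFiles per existing region:
HFileOutputFormat.configureIncrementalLoad(job, table);
job.waitForCompletion(true);
```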
Hi,
We are using HBase 0.20.3 (hbase-0.20-0.20.3-1.cloudera.noarch.rpm) cluster in
distributed mode with Hadoop 0.20.2 (hadoop-0.20-0.20.2+320-1.noarch).
We are using pretty much the default configuration; the only thing we have
customized is that we have allocated 4GB of RAM in
Can you upgrade? That release is 18 months old. A bunch has
happened in the meantime.
For the retries-exhausted errors, check what's going on on the remote
regionserver that you are trying to write to. It's probably struggling and
that's why requests are not going through -- or the client missed the fact