Hi,

We are using the latest releases, Hadoop 0.20.2 and HBase 0.20.3. Thank
you for your help.

Tuan Nguyen.

On Sat, Mar 20, 2010 at 8:44 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> What HBase version are you using?
>
> On Saturday, March 20, 2010, Tuan Nguyen <tua...@gmail.com> wrote:
> > Hi,
> >
> > We are running a stress test to evaluate HBase. The test runs fine and
> > completes, but we have a small problem with one node. Here is our
> > configuration and the problems we observed:
> >
> > 1. We have 1 master and 4 slaves. The master runs both the namenode and
> > the HBase master; each slave runs both a datanode and a region server.
> > 2. We have set the datanode xceiver limit to 8192 and enabled LZO
> > compression (see the sketch at the end of this list).
> > 3. From another machine, we create 8 threads that write data into the
> > cluster; each record is about 5 KB to 100 KB.
> > 4. The test runs fine for the first 2-3 hours, but then one of the nodes
> > logs the following warning:
> >
> > 2010-03-19 20:26:22,814 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_9088042710721149043_145344 file /mnt/moom/hadoop/0.20.1/dfs/data/current/subdir6/subdir33/blk_9088042710721149043
> > 2010-03-19 20:26:22,846 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> > java.io.IOException: Error in deleting blocks.
> >         at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:1361)
> >         at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:868)
> >         at org.apache.hadoop.hdfs.server.datanode.DataNode.processCommand(DataNode.java:830)
> >         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:710)
> >         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1186)
> >         at java.lang.Thread.run(Thread.java:619)
> >
> > 5. After the warning, I no longer see the INFO "Deleting block
> > blk_xxxxxxxxxxxxxxxxxxxxxxxxx" messages on this node, and we lose disk
> > space on this datanode very fast. I guess this is because HBase compacts
> > the regions and deletes the old region files, but the datanode is unable
> > to reclaim the freed blocks.
> >
> > 6. After 5-6 hours, the datanode completely runs out of space, but the
> > test continues running at a slower insert rate.
> > 7. The entire test finishes after 14 hours.
> > 8. Right after the test finishes, this datanode resumes deleting the
> > invalidated blocks and reclaims the space.
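> >
> > For reference, the table setup and writer threads are roughly along the
> > lines of the sketch below (assuming the plain 0.20 client/admin API). It
> > is only an illustration of the test, not our exact code: the table name,
> > column family, and row key format ("stress_test", "data:payload",
> > "row-<thread>-<seq>") are placeholders. The xceiver limit in point 2 is
> > the usual dfs.datanode.max.xcievers setting in hdfs-site.xml.
> >
> > import java.util.Random;
> > import org.apache.hadoop.hbase.HBaseConfiguration;
> > import org.apache.hadoop.hbase.HColumnDescriptor;
> > import org.apache.hadoop.hbase.HTableDescriptor;
> > import org.apache.hadoop.hbase.client.HBaseAdmin;
> > import org.apache.hadoop.hbase.client.HTable;
> > import org.apache.hadoop.hbase.client.Put;
> > import org.apache.hadoop.hbase.io.hfile.Compression;
> > import org.apache.hadoop.hbase.util.Bytes;
> >
> > public class StressWriter implements Runnable {
> >     private final int id;
> >
> >     StressWriter(int id) { this.id = id; }
> >
> >     // Each thread keeps inserting records of 5 KB to 100 KB.
> >     public void run() {
> >         try {
> >             HTable table = new HTable(new HBaseConfiguration(), "stress_test");
> >             Random rnd = new Random();
> >             for (long i = 0; ; i++) {
> >                 byte[] value = new byte[5 * 1024 + rnd.nextInt(95 * 1024)];
> >                 rnd.nextBytes(value);
> >                 Put put = new Put(Bytes.toBytes("row-" + id + "-" + i));
> >                 put.add(Bytes.toBytes("data"), Bytes.toBytes("payload"), value);
> >                 table.put(put);
> >             }
> >         } catch (Exception e) {
> >             e.printStackTrace();
> >         }
> >     }
> >
> >     public static void main(String[] args) throws Exception {
> >         // Create the test table with an LZO-compressed column family.
> >         HBaseConfiguration conf = new HBaseConfiguration();
> >         HTableDescriptor desc = new HTableDescriptor("stress_test");
> >         HColumnDescriptor family = new HColumnDescriptor("data");
> >         family.setCompressionType(Compression.Algorithm.LZO);
> >         desc.addFamily(family);
> >         HBaseAdmin admin = new HBaseAdmin(conf);
> >         if (!admin.tableExists("stress_test")) {
> >             admin.createTable(desc);
> >         }
> >         // Start the 8 writer threads that generate the load.
> >         for (int t = 0; t < 8; t++) {
> >             new Thread(new StressWriter(t)).start();
> >         }
> >     }
> > }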
> >
> > We ran the test twice, and the same problem occurred on the same node. I
> > wonder what could be causing this problem, and whether there is any
> > configuration parameter we can tune to fix it.
> >
> > Thanks for your help!
> > Tuan Nguyen.
> >
>
