Adding new disks to an Hadoop Cluster

2011-05-10 Thread Pete Haidinyak

Hi all,
   When you add a disk to a Hadoop data node do you have to bounce the  
node (restart mapreduce and dfs) before Hadoop can use the new disk?


Thanks

-Pete



Re: Adding new disks to an Hadoop Cluster

2011-05-10 Thread lohit
Yes, you have to bounce the datanode so that it can start using the disk. Also
note that you have to tell the datanode to use this disk via the dfs.data.dir config
parameter in hdfs-site.xml. The same goes for the tasktracker: if you want it to use
this disk for its temp output, you have to tell it via mapred-site.xml
(a sketch of both settings follows below).
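
For illustration only, here is roughly what those settings might look like. The mount paths are made up, any existing directories on the node must stay in the comma-separated list, and mapred.local.dir is named as an assumption since the reply above only points at mapred-site.xml:

<!-- hdfs-site.xml: append the new disk to the datanode's storage list -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/disk1/dfs/data,/data/disk2/dfs/data</value>
</property>

<!-- mapred-site.xml: local/temp space for the tasktracker -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/disk1/mapred/local,/data/disk2/mapred/local</value>
</property>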

2011/5/9 Pete Haidinyak javam...@cox.net

 Hi all,
   When you add a disk to a Hadoop data node do you have to bounce the node
 (restart mapreduce and dfs) before Hadoop can use the new disk?

 Thanks

 -Pete




-- 
Have a Nice Day!
Lohit


Re: A question about client

2011-05-10 Thread Gaojinchao
Hbase version: 0.90.2 .
I merged patches:
HBASE-3773  Set ZK max connections much higher in 0.90
HBASE-3771  All jsp pages don't clean their HBA
HBASE-3783  hbase-0.90.2.jar exists in hbase root and in 'lib/'
HBASE-3756  Can't move META or ROOT from shell
HBASE-3744  createTable blocks until all regions are out of transition
HBASE-3712  HTable.close() doesn't shutdown thread pool
HBASE-3750  HTablePool.putTable() should call 
tableFactory.releaseHTableInterface() for discarded table
HBASE-3722  A lot of data is lost when name node crashed
HBASE-3800  If HMaster is started after NN without starting DN in Hbase 
090.2 then HMaster is not able to start due to AlreadyCreatedException for 
/hbase/hbase.version
HBASE-3749  Master can't exit when open port failed

-----Original Message-----
From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: May 10, 2011 1:17
To: user@hbase.apache.org
Subject: Re: A question about client

TreeMap isn't concurrent and it seems it was used that way? I know you
guys are testing a bunch of different things at the same time so which
HBase version and which patches were you using when you got that?

Thx,

J-D

On Mon, May 9, 2011 at 5:22 AM, Gaojinchao gaojinc...@huawei.com wrote:
    I used YCSB to put data and it threw an exception.
    Who can give me some suggestions?

   Hbase Code:
      // Cut the cache so that we only get the part that could contain
      // regions that match our key
      SoftValueSortedMap<byte[], HRegionLocation> matchingRegions =
        tableLocations.headMap(row);

      // if that portion of the map is empty, then we're done. otherwise,
      // we need to examine the cached location to verify that it is
      // a match by end key as well.
      if (!matchingRegions.isEmpty()) {
        HRegionLocation possibleRegion =
          matchingRegions.get(matchingRegions.lastKey());

    ycsb client log:

    [java] begin StatusThread run
     [java] java.util.NoSuchElementException
     [java]     at java.util.TreeMap.key(TreeMap.java:1206)
     [java]     at java.util.TreeMap$NavigableSubMap.lastKey(TreeMap.java:1435)
     [java]     at 
 org.apache.hadoop.hbase.util.SoftValueSortedMap.lastKey(SoftValueSortedMap.java:131)
     [java]     at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getCachedLocation(HConnectionManager.java:841)
     [java]     at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:664)
     [java]     at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
     [java]     at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
     [java]     at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
     [java]     at 
 org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
     [java]     at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
     [java]     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:665)
     [java]     at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source)
     [java]     at com.yahoo.ycsb.db.HBaseClient.insert(Unknown Source)
     [java]     at com.yahoo.ycsb.DBWrapper.insert(Unknown Source)
     [java]     at com.yahoo.ycsb.workloads.MyWorkload.doInsert(Unknown Source)
     [java]     at com.yahoo.ycsb.ClientThread.run(Unknown Source)



Re: Hmaster is OutOfMemory

2011-05-10 Thread Gaojinchao
If the cluster has 100K regions and you restart it, the Master will need a lot of
memory.


-----Original Message-----
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: May 10, 2011 13:58
To: user@hbase.apache.org
Subject: Re: Hmaster is OutOfMemory

2011/5/9 Gaojinchao gaojinc...@huawei.com:
 Hbase version : 0.90.3RC0

 It happened when creating a table with regions.
 I find that the master needs a lot of memory at startup when the cluster has 100K regions.

Do you need to have 100k regions in the cluster Gao?  Or, you are just
testing how we do w/ 100k regions?


 It seems like it's the ZK ClientCnxn.

 It seems the master's region assignment needs improvement.


 top -c | grep 5834
 5834 root  20   0 8875m 7.9g  11m S2 50.5  33:53.19 
 /opt/jdk1.6.0_22/bin/java -Xmx8192m -ea -XX:+UseConcMarkSweepGC 
 -XX:+CMSIncrementalMode


You probably don't need CMSIncrementalMode if your hardware has >= 4 CPUs.

Where do you see heap used in the below?  I just see stats on your
heap config and a snapshot of what is currently in use.  Seems to be
5G of your 8G heap (~60%).  If you do a full GC, does this go down?

In 0.90.x, the HBase Master keeps an 'image' of the cluster in HMaster
RAM.  I'd doubt this takes up 5G but I haven't measured it, so perhaps
it could.  Is this a problem for you Gao?  You do have 100k regions.

St.Ack

 Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize  = 8589934592 (8192.0MB)
   NewSize  = 21757952 (20.75MB)
   MaxNewSize   = 174456832 (166.375MB)
   OldSize  = 65404928 (62.375MB)
   NewRatio = 7
   SurvivorRatio= 8
   PermSize = 21757952 (20.75MB)
   MaxPermSize  = 88080384 (84.0MB)

 Heap Usage:
 New Generation (Eden + 1 Survivor Space):
   capacity = 100335616 (95.6875MB)
   used = 47094720 (44.91302490234375MB)
   free = 53240896 (50.77447509765625MB)
   46.93719127612671% used
 Eden Space:
   capacity = 89194496 (85.0625MB)
   used = 35953600 (34.28802490234375MB)
   free = 53240896 (50.77447509765625MB)
   40.30921369856723% used
 From Space:
   capacity = 11141120 (10.625MB)
   used = 11141120 (10.625MB)
   free = 0 (0.0MB)
   100.0% used
 To Space:
   capacity = 11141120 (10.625MB)
   used = 0 (0.0MB)
   free = 11141120 (10.625MB)
   0.0% used
 concurrent mark-sweep generation:
   capacity = 8415477760 (8025.625MB)
   used = 5107249280 (4870.6524658203125MB)
   free = 3308228480 (3154.9725341796875MB)
   60.68876213155128% used
 Perm Generation:
   capacity = 31199232 (29.75390625MB)
   used = 18681784 (17.81633758544922MB)
   free = 12517448 (11.937568664550781MB)
   59.87898676480241% used


 -----Original Message-----
 From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans
 Sent: May 10, 2011 1:20
 To: user@hbase.apache.org
 Subject: Re: Hmaster is OutOfMemory

 It looks like the master entered a GC loop of death (since there are a
 lot of "We slept 76166ms" messages) and finally died. Was it splitting
 logs? Did you get a heap dump? Did you inspect it and can you tell
 what was using all that space?

 Thx,

 J-D

 2011/5/8 Gaojinchao gaojinc...@huawei.com:
 Hbase version 0.90.2:
 HMaster has 8G memory. It seems like that's not enough? Why does it need so much
 memory? (50K regions)

 Another issue: the log message is wrong. It says see
 http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A9 but it should say see
 http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A8

 Hmaster logs:

 2011-05-06 19:31:09,924 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:2-0x12fc3a17c070022 Creating (or updating) unassigned node for 
 2f19f33ae3f21ac4cb681f1662767d0c with OFFLINE state
 2011-05-06 19:31:09,924 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
 76166ms instead of 6ms, this is likely due to a long garbage collecting 
 pause and it's usually bad, see 
 http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A9
 2011-05-06 19:31:09,924 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 
 16697ms instead of 1000ms, this is likely due to a long garbage collecting 
 pause and it's usually bad, see 
 http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A9
 2011-05-06 19:31:09,932 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=ufdr,211007,1304669377398.696f124cc6ff82302f735c8413c6ac0b. 
 state=CLOSED, ts=1304681364406
 2011-05-06 19:31:09,932 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:2-0x12fc3a17c070022 Creating (or updating) unassigned node for 
 696f124cc6ff82302f735c8413c6ac0b with OFFLINE state
 2011-05-06 19:31:22,942 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
 ufdr,071415,1304668656420.aa026fbb27a25b0fe54039c00108dad6. on 
 157-5-100-9,20020,1304678135900
 2011-05-06 19:31:22,942 DEBUG 
 org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
 event for 7a75bac2028fba1529075225a3755c4c; deleting unassigned node
 2011-05-06 19:31:22,942 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 

HBase filtered scan problem

2011-05-10 Thread Stefan Comanita
Hi all, 

I want to do a scan over a number of rows, each row having multiple columns, and
I want to filter out some of these columns based on their values. For example, if
I have the following rows:

plainRow:col:value1    column=T:19, timestamp=19, value=
plainRow:col:value1    column=T:2,  timestamp=2,  value=U
plainRow:col:value1    column=T:3,  timestamp=3,  value=U
plainRow:col:value1    column=T:4,  timestamp=4,  value=

and

secondRow:col:value1   column=T:1, timestamp=1, value=
secondRow:col:value1   column=T:2, timestamp=2, value=
secondRow:col:value1   column=T:3, timestamp=3, value=U
secondRow:col:value1   column=T:4, timestamp=4, value=


and I want to select all the rows, but only with the columns that don't have the
value "U", something like:

plainRow:col:value1    column=T:19, timestamp=19, value=
plainRow:col:value1    column=T:4,  timestamp=4,  value=
secondRow:col:value1   column=T:1,  timestamp=1,  value=
secondRow:col:value1   column=T:2,  timestamp=2,  value=
secondRow:col:value1   column=T:4,  timestamp=4,  value=

and to achieve this, I try the following:

Scan scan = new Scan();

scan.setStartRow(stringToBytes(rowIdentifier));
scan.setStopRow(stringToBytes(rowIdentifier + Constants.MAX_CHAR));
scan.addFamily(Constants.TERM_VECT_COLUMN_FAMILY);

if (includeFilter) {
    Filter filter = new ValueFilter(CompareOp.EQUAL,
        new BinaryComparator(stringToBytes("U")));
    scan.setFilter(filter);
}

and if I execute this scan I get the rows with the columns having the value "U",
which is correct. But when I set CompareOp.NOT_EQUAL and expect to get the other
columns, it doesn't work the way I want: it gives me back all the rows,
including the ones which have the value "U". The same happens when I use:
Filter filter = new ValueFilter(CompareOp.EQUAL, new
BinaryComparator(stringToBytes("")));

I should mention that the columns have the values "U" and "" (the empty string), and that
I also saw the same behaviour with the RegexStringComparator and SubstringComparator.

Any idea would be very much appreciated, sorry for the long mail, thank you.

Stefan Comanita

Mapping Object-HBase data Framework!

2011-05-10 Thread Kobla Gbenyo

Hello,

I am new to this list and have started testing HBase. I downloaded and installed
HBase successfully and now I am looking for a framework which can help
me perform CRUD operations (create, read, update and delete). Through
my research I found JDO, but I could not find much support for it. Are there
any other frameworks to perform my CRUD operations, or is there more
support for JDO with HBase? (For information, I am using Maven for building.)


Cheers,

--
Kobla.




Re: Error of Got error in response to OP_READ_BLOCK for file

2011-05-10 Thread Jean-Daniel Cryans
Data cannot be corrupted at all, since the files in HDFS are immutable
and CRC'ed (unless you are able to lose all 3 copies of every block).

Corruption would happen at the metadata level, where the .META.
table, which contains the regions for the tables, would lose rows. This
is a likely scenario if the region server holding that region dies of
GC, since the Hadoop version you are using along with HBase 0.20.6 doesn't
support appends, meaning that the write-ahead log would be missing
data that, obviously, cannot be replayed.

The best advice I can give you is to upgrade.

J-D

On Tue, May 10, 2011 at 5:44 AM, Stanley Xu wenhao...@gmail.com wrote:
 Thanks J-D. I'm a little more confused: it looks like when we have a corrupt
 HBase table or some inconsistent data, we get lots of messages like
 that. But even if the HBase table is fine, we still get some lines of
 messages like that.

 How could I identify whether it comes from corruption in the data or just some
 mis-hit in the scenario you mentioned?



 On Tue, May 10, 2011 at 6:23 AM, Jean-Daniel Cryans 
 jdcry...@apache.orgwrote:

 Very often the "cannot open filename" happens when the region in
 question was reopened somewhere else and that region was compacted. As
 to why it was reassigned, most of the time it's because of garbage
 collections taking too long. The master log should have all the
 required evidence, and the region server should print some "slept for
 Xms" (where X is some number of ms) messages before everything goes
 bad.

 Here are some general tips on debugging problems in HBase
 http://hbase.apache.org/book/trouble.html

 J-D

 On Sat, May 7, 2011 at 2:10 AM, Stanley Xu wenhao...@gmail.com wrote:
  Dear all,
 
  We were using HBase 0.20.6 in our environment, and it is pretty stable in
  the last couple of month, but we met some reliability issue from last
 week.
  Our situation is very like the following link.
 
 http://search-hadoop.com/m/UJW6Efw4UW/Got+error+in+response+to+OP_READ_BLOCK+for+filesubj=HBase+fail+over+reliability+issues
 
  When we use a hbase client to connect to the hbase table, it looks stuck
  there. And we can find the logs like
 
  WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.24.166.74:50010
  for file /hbase/users/73382377/data/312780071564432169 for block
  -4841840178880951849:java.io.IOException: Got error in response to
  OP_READ_BLOCK for file /hbase/users/73382377/data/312780071564432169 for
  block -4841840178880951849
 
  INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 40 on 60020, call
  get([B@25f907b4, row=963aba6c5f351f5655abdc9db82a4cbd, maxVersions=1,
  timeRange=[0,9223372036854775807), families={(family=data, columns=ALL})
  from 10.24.117.100:2365: error: java.io.IOException: Cannot open filename
  /hbase/users/73382377/data/312780071564432169
  java.io.IOException: Cannot open filename
  /hbase/users/73382377/data/312780071564432169
 
 
  WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
  10.24.166.74:50010, storageID=DS-14401423-10.24.166.74-50010-1270741415211,
  infoPort=50075, ipcPort=50020):
  Got exception while serving blk_-4841840178880951849_50277 to /10.25.119.113:
  java.io.IOException: Block blk_-4841840178880951849_50277 is not valid.
 
  in the server side.
 
  And if we do a flush and then a major compaction on the .META., the
  problem just went away, but will appear again some time later.
 
  At first we guess it might be the problem of xceiver. So we set the
 xceiver
  to 4096 as the link here.
  http://ccgtech.blogspot.com/2010/02/hadoop-hdfs-deceived-by-xciever.html
 
  But we still get the same problem. It looks that a restart of the whole
  HBase cluster will fix the problem for a while, but actually we could not
  say always trying to restart the server.
 
  I am waiting online, will really appreciate any message.
 
 
  Best wishes,
  Stanley Xu
 




Re: Mapping Object-HBase data Framework!

2011-05-10 Thread Jean-Daniel Cryans
Most users I know rolled out their own since it doesn't require a very
big layer on top of HBase (since it's all simple queries) and it's
tailored to their own environment.

For JDO there's DataNucleus that supports HBase.
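
FWIW, a minimal sketch of what such a hand-rolled layer might look like against the plain 0.90.x client API; the UserDao class, the "users" table and the "info" family below are made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class UserDao {
  private static final byte[] CF = Bytes.toBytes("info");  // hypothetical column family
  private final HTable table;

  public UserDao(Configuration conf) throws Exception {
    this.table = new HTable(conf, "users");                // hypothetical table name
  }

  // Create: a Put on the row key.
  public void create(String id, String name) throws Exception {
    Put put = new Put(Bytes.toBytes(id));
    put.add(CF, Bytes.toBytes("name"), Bytes.toBytes(name));
    table.put(put);
  }

  // Read: a Get on the row key.
  public String read(String id) throws Exception {
    Result r = table.get(new Get(Bytes.toBytes(id)));
    byte[] v = r.getValue(CF, Bytes.toBytes("name"));
    return v == null ? null : Bytes.toString(v);
  }

  // Update: in HBase this is just another Put on the same row/column.
  public void update(String id, String name) throws Exception {
    create(id, name);
  }

  // Delete: a Delete on the row key.
  public void delete(String id) throws Exception {
    table.delete(new Delete(Bytes.toBytes(id)));
  }
}

Usage would be roughly new UserDao(HBaseConfiguration.create()), with one
instance per thread since HTable isn't thread-safe.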

J-D

On Tue, May 10, 2011 at 2:34 AM, Kobla Gbenyo ko...@riastudio.fr wrote:
 Hello,

 I am new at this list and I start testing HBase. I download and install
 HBase successfully and now I am looking for a framework which can help me
 performing CRUD operations (create, read, update and delete). Through my
 research, I found JDO but I do not find more support on it. There are any
 other frameworks to perform my CRUD operations or are there more supports on
 JDO for HBASE? (for information, I am using maven for building)

 Cheers,

 --
 Kobla.





Re: A question about client

2011-05-10 Thread Jean-Daniel Cryans
Are you running a modified YCSB by any chance? Because last time I
looked at that code it didn't share the HTables between threads and it
looks like it's doing something like that.

Looking deeper at the code, the NoSuchElementException is thrown
because the map is empty. This is what that code looks like:

  if (!matchingRegions.isEmpty()) {
HRegionLocation possibleRegion =
  matchingRegions.get(matchingRegions.lastKey());

So to me it seems that the only way you would get this exception is if
someone emptied the map between the isEmpty call and lastKey which
shouldn't happen if HTables aren't shared.

The only other way it seems it could happen, and it's a stretch, is
that since the regions are kept in a SoftValueSortedMap then the GC
would have removed the elements you needed exactly between those two
lines...  Is it easy for you to recreate the issue?
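
For reference, a rough sketch of the usual pattern to avoid sharing one HTable
across threads: either check tables in and out of an HTablePool per operation
(shown below), or simply create a new HTable(conf, ...) inside each thread. The
"usertable" name and the toy puts are just for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PerThreadTables {
  public static void main(String[] args) {
    final Configuration conf = HBaseConfiguration.create();
    // One pool shared by all threads; each thread checks a table out and back in,
    // so no single HTable instance is ever used by two threads at once.
    final HTablePool pool = new HTablePool(conf, 10);

    Runnable worker = new Runnable() {
      public void run() {
        try {
          HTableInterface table = pool.getTable("usertable");   // hypothetical table
          try {
            Put put = new Put(Bytes.toBytes(Thread.currentThread().getName()));
            put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
            table.put(put);
          } finally {
            pool.putTable(table);   // return it; never hand this instance to another thread
          }
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    };

    for (int i = 0; i < 4; i++) {
      new Thread(worker).start();
    }
  }
}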

Thx a bunch,

J-D

On Mon, May 9, 2011 at 11:34 PM, Gaojinchao gaojinc...@huawei.com wrote:
 Hbase version: 0.90.2 .
 I merged patches:
 HBASE-3773  Set ZK max connections much higher in 0.90
 HBASE-3771  All jsp pages don't clean their HBA
 HBASE-3783  hbase-0.90.2.jar exists in hbase root and in 'lib/'
 HBASE-3756  Can't move META or ROOT from shell
 HBASE-3744  createTable blocks until all regions are out of transition
 HBASE-3712  HTable.close() doesn't shutdown thread pool
 HBASE-3750  HTablePool.putTable() should call 
 tableFactory.releaseHTableInterface() for discarded table
 HBASE-3722  A lot of data is lost when name node crashed
 HBASE-3800  If HMaster is started after NN without starting DN in Hbase 
 090.2 then HMaster is not able to start due to AlreadyCreatedException for 
 /hbase/hbase.version
 HBASE-3749  Master can't exit when open port failed

 -----Original Message-----
 From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel Cryans
 Sent: May 10, 2011 1:17
 To: user@hbase.apache.org
 Subject: Re: A question about client

 TreeMap isn't concurrent and it seems it was used that way? I know you
 guys are testing a bunch of different things at the same time so which
 HBase version and which patches were you using when you got that?

 Thx,

 J-D

 On Mon, May 9, 2011 at 5:22 AM, Gaojinchao gaojinc...@huawei.com wrote:
I used ycsb to put data and threw exception.
Who can give me some suggestion?

   Hbase Code:
  // Cut the cache so that we only get the part that could contain
  // regions that match our key
  SoftValueSortedMap<byte[], HRegionLocation> matchingRegions =
tableLocations.headMap(row);

  // if that portion of the map is empty, then we're done. otherwise,
  // we need to examine the cached location to verify that it is
  // a match by end key as well.
  if (!matchingRegions.isEmpty()) {
HRegionLocation possibleRegion =
  matchingRegions.get(matchingRegions.lastKey());

ycsb client log:

[java] begin StatusThread run
 [java] java.util.NoSuchElementException
 [java] at java.util.TreeMap.key(TreeMap.java:1206)
 [java] at 
 java.util.TreeMap$NavigableSubMap.lastKey(TreeMap.java:1435)
 [java] at 
 org.apache.hadoop.hbase.util.SoftValueSortedMap.lastKey(SoftValueSortedMap.java:131)
 [java] at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getCachedLocation(HConnectionManager.java:841)
 [java] at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:664)
 [java] at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
 [java] at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
 [java] at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
 [java] at 
 org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
 [java] at 
 org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
 [java] at org.apache.hadoop.hbase.client.HTable.put(HTable.java:665)
 [java] at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source)
 [java] at com.yahoo.ycsb.db.HBaseClient.insert(Unknown Source)
 [java] at com.yahoo.ycsb.DBWrapper.insert(Unknown Source)
 [java] at com.yahoo.ycsb.workloads.MyWorkload.doInsert(Unknown 
 Source)
 [java] at com.yahoo.ycsb.ClientThread.run(Unknown Source)




Re: Error of Got error in response to OP_READ_BLOCK for file

2011-05-10 Thread Stanley Xu
Thanks J-D. We are using Hadoop 0.20.2 with quite a few patches. Could
you please tell me which patches the WAL requires? Do we need all the
patches in branch-0.20-append? I thought we had only applied the patch that
adds support for the append function.

Thanks.

On Wed, May 11, 2011 at 12:50 AM, Jean-Daniel Cryans jdcry...@apache.orgwrote:

 Data cannot be corrupted at all, since the files in HDFS are immutable
 and CRC'ed (unless you are able to lose all 3 copies of every block).

 Corruption would happen at the metadata level, whereas the .META.
 table which contains the regions for the tables would lose rows. This
 is a likely scenario if the region server holding that region dies of
 GC since the hadoop version you are using along hbase 0.20.6 doesn't
 support appends, meaning that the write-ahead log would be missing
 data that, obviously, cannot be replayed.

 The best advice I can give you is to upgrade.

 J-D

 On Tue, May 10, 2011 at 5:44 AM, Stanley Xu wenhao...@gmail.com wrote:
  Thanks J-D. A little more confused that is it looks when we have a
 corrupt
  hbase table or some inconsistency data, we will got lots of message like
  that. But if the hbase table is proper, we will also get some lines of
  messages like that.
 
  How could I identify if it comes from a corruption in data or just some
  mis-hit in the scenario you mentioned?
 
 
 
  On Tue, May 10, 2011 at 6:23 AM, Jean-Daniel Cryans jdcry...@apache.org
 wrote:
 
  Very often the cannot open filename happens when the region in
  question was reopened somewhere else and that region was compacted. As
  to why it was reassigned, most of the time it's because of garbage
  collections taking too long. The master log should have all the
  required evidence, and the region server should print some slept for
  Xms (where X is some number of ms) messages before everything goes
  bad.
 
  Here are some general tips on debugging problems in HBase
  http://hbase.apache.org/book/trouble.html
 
  J-D
 
  On Sat, May 7, 2011 at 2:10 AM, Stanley Xu wenhao...@gmail.com wrote:
   Dear all,
  
   We were using HBase 0.20.6 in our environment, and it is pretty stable
 in
   the last couple of month, but we met some reliability issue from last
  week.
   Our situation is very like the following link.
  
 
 http://search-hadoop.com/m/UJW6Efw4UW/Got+error+in+response+to+OP_READ_BLOCK+for+filesubj=HBase+fail+over+reliability+issues
  
   When we use a hbase client to connect to the hbase table, it looks
 stuck
   there. And we can find the logs like
  
   WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.24.166.74:50010
   for file /hbase/users/73382377/data/312780071564432169 for block
   -4841840178880951849:java.io.IOException: Got error in response to
   OP_READ_BLOCK for file /hbase/users/73382377/data/312780071564432169 for
   block -4841840178880951849
  
   INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 40 on 60020, call
   get([B@25f907b4, row=963aba6c5f351f5655abdc9db82a4cbd, maxVersions=1,
   timeRange=[0,9223372036854775807), families={(family=data, columns=ALL})
   from 10.24.117.100:2365: error: java.io.IOException: Cannot open filename
   /hbase/users/73382377/data/312780071564432169
   java.io.IOException: Cannot open filename
   /hbase/users/73382377/data/312780071564432169
  
  
   WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
   10.24.166.74:50010, storageID=DS-14401423-10.24.166.74-50010-1270741415211,
   infoPort=50075, ipcPort=50020):
   Got exception while serving blk_-4841840178880951849_50277 to /10.25.119.113:
   java.io.IOException: Block blk_-4841840178880951849_50277 is not valid.
  
   in the server side.
  
   And if we do a flush and then a major compaction on the .META., the
   problem just went away, but will appear again some time later.
  
   At first we guess it might be the problem of xceiver. So we set the
  xceiver
   to 4096 as the link here.
  
 http://ccgtech.blogspot.com/2010/02/hadoop-hdfs-deceived-by-xciever.html
  
   But we still get the same problem. It looks that a restart of the
 whole
   HBase cluster will fix the problem for a while, but actually we could
 not
   say always trying to restart the server.
  
   I am waiting online, will really appreciate any message.
  
  
   Best wishes,
   Stanley Xu