Is this a testing install? If so, remove the hbase dir in HDFS and start over.
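
If it is just a test install, something like the following gives you a clean
slate (a rough sketch only; it assumes the default hbase.rootdir of /hbase and
that you run it from $HBASE_HOME, so adjust the paths for your layout):

$ bin/stop-hbase.sh        # stop HBase so nothing is writing to the dir
$ hadoop fs -rmr /hbase    # remove whatever hbase.rootdir points at
$ bin/start-hbase.sh       # start fresh; -ROOT- and .META. get recreated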

Otherwise, when the PE run fails, what does the master log say?
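
Something like this usually pulls out the interesting bits around the failure
(a sketch only; it assumes the default log location under $HBASE_HOME/logs and
the standard hbase-<user>-master-<host>.log naming):

$ grep -iE 'error|exception|split|TestTable' logs/hbase-*-master-*.log | tail -50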

In 0.20.5 we moved things around so some more messages show at INFO level,
which could explain some of the differences you are seeing.
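
On the TestTable that "exists" but does not show in 'list', it may be worth
checking whether stale rows were left behind in .META. after the drop. One
quick way to look, just a sketch run from $HBASE_HOME:

$ echo "scan '.META.', {COLUMNS => ['info:regioninfo']}" | bin/hbase shell

If TestTable rows still come back, the catalog was not cleaned up on the drop,
which could explain both the TableExistsException and the phantom assignments
you describe.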

Stack



On Jun 29, 2010, at 6:21 AM, Stanislaw Kogut <sko...@sistyma.net> wrote:

> Hi everyone!
> 
> Has anyone noticed the same behaviour of hbase-0.20.5 after upgrading from
> 0.20.3?
> 
> $hadoop jar hbase/hbase-0.20.5-test.jar sequentialWrite 1
> 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client
> environment:zookeeper.version=3.2.2-888565, built on 12/08/2009 21:51 GMT
> 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client environment:host.name
> =se002.cluster.local
> 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client
> environment:java.version=1.6.0_20
> 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client
> environment:java.vendor=Sun Microsystems Inc.
> 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client
> environment:java.home=/usr/java/jdk1.6.0_20/jre
> 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client
> environment:java.class.path=/opt/hadoop/common/bin/../conf:/usr/java/latest/lib/tools.jar:/opt/hadoop/common/bin/..:/opt/hadoop/common/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/common/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/common/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/common/bin/../lib/commons-el-1.0.jar:/opt/hadoop/common/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/common/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/common/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/common/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/common/bin/../lib/core-3.1.1.jar:/opt/hadoop/common/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/common/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/common/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/common/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/common/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/common/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/common/bin/../lib/junit-3.8.1.jar:/opt/hadoop/common/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/common/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/common/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/common/bin/../lib/oro-2.0.8.jar:/opt/hadoop/common/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/common/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/common/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/common/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/common/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/common/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hadoop/hbase/lib/zookeeper-3.2.2.jar:/opt/hadoop/hbase/conf:/opt/hadoop/hbase/hbase-0.20.5.jar
> 10/06/29 16:03:22 INFO hbase.PerformanceEvaluation: Table {NAME =>
> 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS
> => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}]} created
> 10/06/29 16:03:22 INFO hbase.PerformanceEvaluation: Start class
> org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at offset
> 0 for 1048576 rows
> 10/06/29 16:03:37 INFO hbase.PerformanceEvaluation: 0/104857/1048576
> 10/06/29 16:03:52 INFO hbase.PerformanceEvaluation: 0/209714/1048576
> 10/06/29 16:04:09 INFO hbase.PerformanceEvaluation: 0/314571/1048576
> 10/06/29 16:04:27 INFO hbase.PerformanceEvaluation: 0/419428/1048576
> 10/06/29 16:06:06 ERROR hbase.PerformanceEvaluation: Failed
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> region server Some server, retryOnlyOne=true, index=0, islastrow=false,
> tries=9, numtries=10, i=0, listsize=9650, region=TestTable,,1277816601856
> for region TestTable,,1277816601856, row '0000511450', but failed after 10
> attempts.
> Exceptions:
> 
>    at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1149)
>    at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230)
>    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:621)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:637)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:889)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation.runNIsOne(PerformanceEvaluation.java:907)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:939)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation.doCommandLine(PerformanceEvaluation.java:1036)
>    at
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:1061)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> 
> Looks like it happens on region splits.
> 
> Also, some other strange things:
> 1. After writing something to TestTable, some regionservers log these:
> 2010-06-29 16:05:06,458 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
> .META.,,1
> 
> After that, an additional .META. region shows up on this server in the
> 'status detailed' output. What's more, sometimes it opens not only .META.,
> but also other regions from data tables, such as the TestTable regions from
> PerformanceEvaluation.
> 
> 2. After disabling and dropping a table, its regions still show as assigned
> to regionservers in 'status detailed'.
> 
> 3. After 3-4 attempts to write into TestTable from PerformanceEvaluation,
> another strange thing happens:
> 10/06/29 16:01:59 ERROR hbase.PerformanceEvaluation: Failed
> org.apache.hadoop.hbase.TableExistsException:
> org.apache.hadoop.hbase.TableExistsException: TestTable
> 
> but the table does not exist. You cannot disable or drop it, and the hbase
> shell does not show it in the 'list' output. But you also cannot create it,
> because it "exists" and its regions are assigned to regionservers. Note that
> nobody dropped this table.
> 
> I spent some time trying to find out why all of this happens, playing around
> with hadoop cluster versions (first Cloudera, then "vanilla" 0.20.2), but I
> still see the issue. I hope someone can help find the cause.
> 
> -- 
> Regards,
> Stanislaw Kogut
