Re: What can corrupt HBase table and what is Cannot find row in .META. for table?

2014-11-19 Thread Serega Sheypak
Hi, I'm using the Java API. I see the mentioned exception in the Java log.
I'll provide full stacktrace next time.

2014-11-19 1:01 GMT+03:00 Ted Yu yuzhih...@gmail.com:

 The thread you mentioned was more about the Thrift API than about
 TableNotFoundException.

 Can you show us the stack trace of TableNotFoundException (vicinity of app
 log around the exception) ?
 Please also check master log / meta region server log.

 I assume you could access the table using hbase shell.

 Cheers

 On Tue, Nov 18, 2014 at 12:57 PM, Serega Sheypak serega.shey...@gmail.com
 
 wrote:

  hi, sometimes I get this in my web application log:
  org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META.
  for table: my_favorite_table
 
  I've found this
 
 
 http://grokbase.com/t/hbase/user/143bn79wf2/cannot-find-row-in-meta-for-table
 
  I did run hbase hbck
  result:
  0 inconsistencies detected.
  Status: OK
 
  What can I try next?
  I'm using Cloudera CDH 5.2 HBase 0.98
 
  Thanks!
 



hbase: secure login and connection management

2014-11-19 Thread Bogala, Chandra Reddy
Hi,
  I am trying to log in to a secure cluster with keytabs using the methods 
below. It works fine while the token has not expired. My process runs for a long 
time (a web app in Tomcat). I keep getting the exceptions below after the token 
expires, and the connection fails when the user tries to view data from the web page.
What is a better way of handling connections? How can I refresh keys 
automatically? Is there a Spring implementation for managing connections? If 
yes, can you share sample code?


UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab(hbase.myclient.principal, 
hbase.myclient.keytab);

2014-11-13 08:25:49,899 ERROR [org.apache.hadoop.security.UserGroupInformation] 
PriviledgedActionException as u...@mycompany.com (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2014-11-13 08:25:49,900 WARN [org.apache.hadoop.ipc.RpcClient] Exception 
encountered while connecting to the server : javax.security.sasl.SaslException: 
GSS initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)]
javax.security.sasl.SaslException: GSS initiate failed
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism 
level: Failed to find any Kerberos tgt)

Thanks,
Chandra





Re: hbase: secure login and connection management

2014-11-19 Thread Matteo Bertozzi
Take a look at the patch added to
https://issues.apache.org/jira/browse/HBASE-12366
There will be a new AuthUtil.launchAuthChore() which should help in your
case.
(The doc patch is here https://issues.apache.org/jira/browse/HBASE-12528)

Matteo
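Until the AuthUtil chore from HBASE-12366 is available, a common workaround is to schedule a periodic relogin from the keytab on the client side. The sketch below shows only the scheduling scaffolding with a stubbed `Runnable`; in a real Hadoop/HBase client that runnable would call `UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab()`, and the refresh period (a placeholder here) must be set comfortably below the ticket lifetime.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: periodically refresh Kerberos credentials so a long-running web
// app never holds an expired TGT. The relogin body is stubbed; a real
// deployment would invoke
// UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab().
public class ReloginChore {
    public static ScheduledExecutorService start(Runnable relogin,
                                                 long periodSeconds) {
        ScheduledExecutorService ses =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "kerberos-relogin");
                t.setDaemon(true); // don't keep the JVM alive for this chore
                return t;
            });
        ses.scheduleAtFixedRate(relogin, periodSeconds, periodSeconds,
                                TimeUnit.SECONDS);
        return ses;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch fired = new CountDownLatch(2);
        // Stub relogin action; replace with the UGI call in a real client.
        ScheduledExecutorService ses = start(fired::countDown, 1);
        fired.await(10, TimeUnit.SECONDS);
        ses.shutdownNow();
        System.out.println("relogin chore fired "
            + (2 - fired.getCount()) + " times");
    }
}
```

Note that relogin must happen on the same login context the HBase client uses; a chore in a separate classloader (common in Tomcat) refreshes nothing.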








RPC Timeout - DoNotRetryIOException

2014-11-19 Thread xuge...@longshine.com
Hello:
I have also encountered this exception. Do you have a solution? Please 
tell me.
   Thanks.


xuge...@longshine.com


[ANNOUNCE] HBase 0.98.8 is now available for download

2014-11-19 Thread Andrew Purtell
Apache HBase 0.98.8 is now available for download. Get it from an Apache
mirror [1] or Maven repository.

The list of changes in this release can be found in the release notes [2]
or following this announcement. This release contains a fix for a security
issue, please see HBASE-12536 [3] for more detail.

Thanks to all who contributed to this release.

Best,
The HBase Dev Team

1. http://www.apache.org/dyn/closer.cgi/hbase/
2. http://s.apache.org/44a
3. https://issues.apache.org/jira/browse/HBASE-12536

-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)


Re: HBase concurrent.RejectedExecutionException

2014-11-19 Thread Nicolas Liochon
Hi Arul,

It's a pure client exception: it means that the client has not even tried
to send the query to the server; it failed before that.
Why the client failed is another question.
I see that the pool size is 7; have you changed the default configuration?

Cheers,

Nicolas
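The JDK behavior behind this log is easy to reproduce outside HBase: once a ThreadPoolExecutor has begun shutting down, any further submit is refused by the default AbortPolicy. A minimal, HBase-free demonstration (the pool size of 7 just mirrors the log):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Demonstrates the client-side failure mode: submitting work to an executor
// that is shutting down throws RejectedExecutionException before any request
// could reach a server.
public class RejectionDemo {
    public static boolean submitAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(7);
        pool.shutdown(); // pool enters the "Shutting down" state seen in the log
        try {
            pool.submit(() -> { }); // rejected: the task never runs
            return false;
        } catch (RejectedExecutionException expected) {
            return true;            // the pool refused the task
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdown()); // prints true
    }
}
```

One likely explanation in an HBase 0.96 client is that the shared connection (and its pool) was closed while requests were still in flight, so checking the application's connection lifecycle is a reasonable first step; this is an inference from the log, not something the thread confirms.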

On Tue, Nov 18, 2014 at 7:29 AM, Arul Ramachandran arkup...@gmail.com
wrote:

 Hi Ted,

 I don't have the load metrics at the moment... are you suggesting this
 could be load related?

 Thanks

 On Mon, Nov 17, 2014 at 5:34 PM, Ted Yu yuzhih...@gmail.com wrote:

  What was the load on rs3.world.com,60020,1414690096750 around 09:49:30 ?
 
  Cheers
 
  On Mon, Nov 17, 2014 at 4:59 PM, Arul Ramachandran arkup...@gmail.com
  wrote:
 
   Hi,
  
   Our Hbase application gets the following exception - HBase
   0.96.1.2.0.6.1-101-hadoop2. I looked at the region server log and
 nothing
   unusual is happening.
  
   Any pointers on what else I can check?  Thanks!
  
   2014-11-16 09:49:37,686  WARN
   [hbase-connection-shared-executor-pool644-t158]
   AsyncProcess.sendMultiAction(AsyncProcess.java:511) - The task was
  rejected
   by the pool. This is unexpected. Server is rs3.world.com
   ,60020,1414690096750
  
   java.util.concurrent.RejectedExecutionException: Task
   java.util.concurrent.FutureTask@f48544d rejected from
   java.util.concurrent.ThreadPoolExecutor@10916cfd[Shutting down, pool
  size
   =
   7, active threads = 7, queued tasks = 0, completed tasks = 932]
  
   at
  
  
 
 java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
  
   at
  
 
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
  
   at
  
  
 
 java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
  
   at
  
  
 
 java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
  
   at
  
  
 
 org.apache.hadoop.hbase.client.AsyncProcess.sendMultiAction(AsyncProcess.java:506)
  
   at
  
 org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:461)
  
   at
  
  
 
 org.apache.hadoop.hbase.client.AsyncProcess.receiveMultiAction(AsyncProcess.java:700)
  
   at
  
  
 
 org.apache.hadoop.hbase.client.AsyncProcess.access$300(AsyncProcess.java:89)
  
   at
  
 org.apache.hadoop.hbase.client.AsyncProcess$1.run(AsyncProcess.java:498)
  
   at
   java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  
   at
  
  
 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  
   at
  
  
 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  
   at java.lang.Thread.run(Thread.java:744)
  
   2014-11-16 09:49:37,686  INFO [qtp1232775351-168005]
   AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:743) - :
  Waiting
   for the global number of running tasks to be equals or less than 0,
   tasksSent=3, tasksDone=2, currentTasksDone=2,
 tableName=group_sku_mapping
  
   2014-11-16 09:49:37,686  INFO [qtp1232775351-168006]
   AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:743) - :
  Waiting
   for the global number of running tasks to be equals or less than 0,
   tasksSent=3, tasksDone=2, currentTasksDone=2,
 tableName=group_sku_mapping
  
   2014-11-16 09:49:37,686  WARN
   [hbase-connection-shared-executor-pool644-t159]
   AsyncProcess.sendMultiAction(AsyncProcess.java:511) - The task was
  rejected
   by the pool. This is unexpected. Server is rs3.world.com
   ,60020,1414690096750
  
 



scan column qualifiers in column family

2014-11-19 Thread beeshma r
Hi

I need to find whether a particular column qualifier is present in a column
family, so I wrote code like this.

As per the documentation:

public boolean containsColumn(byte[] family,
 byte[] qualifier)

Checks for existence of a value for the specified column (empty or not).
Parameters:
  family - family name
  qualifier - column qualifier
Returns: true if at least one value exists in the result, false if not

// my code

public static boolean search_column(String mail) throws IOException
{
    HTable testTable = new HTable(frinds_util.get_config(),
        "people"); // configuration
    byte[] email_b = Bytes.toBytes(mail); // column qualifier
    byte[] colmnfamily = Bytes.toBytes(colmn_fam); // column family
    Scan scan_col = new Scan(Bytes.toBytes(colmn_fam), email_b);
    ResultScanner results = testTable.getScanner(scan_col);
    Result result = results.next();

    if (result.containsColumn(colmnfamily, email_b)) // check whether the
                                                     // column is present
    {
        System.out.println("column is present");
        ret = true;
    }
    return ret;
}

my build failed with the output below:



java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.NullPointerException
at
org.freinds_rep.java.Insert_friend.search_column(Insert_friend.java:106)
at org.freinds_rep.java.Insert_friend.main(Insert_friend.java:156)
... 6 more
[WARNING] thread Thread[org.freinds_rep.java.Insert_friend.main(
127.0.0.1:2181),5,org.freinds_rep.java.Insert_friend] was interrupted but
is still alive after waiting at least 15000msecs
[WARNING] thread Thread[org.freinds_rep.java.Insert_friend.main(
127.0.0.1:2181),5,org.freinds_rep.java.Insert_friend] will linger despite
being asked to die via interruption
[WARNING] NOTE: 1 thread(s) did not finish despite being asked to  via
interruption. This is not a problem with exec:java, it is a problem with
the running code. Although not serious, it should be remedied.
[WARNING] Couldn't destroy threadgroup
org.codehaus.mojo.exec.ExecJavaMojo$IsolatedThreadGroup[name=org.freinds_rep.java.Insert_friend,maxpri=10]
java.lang.IllegalThreadStateException
at java.lang.ThreadGroup.destroy(ThreadGroup.java:775)
at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:328)
at
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
at
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
at
org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
at
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
at
org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
at
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
at
org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
[INFO]

[INFO] BUILD FAILURE
[INFO]

[INFO] Total time: 1:23.294s
[INFO] Finished at: Wed Nov 19 09:08:48 PST 2014
[INFO] Final Memory: 10M/137M





Any idea how to solve this?


Re: scan column qualifiers in column family

2014-11-19 Thread Ted Yu
bq. org.freinds_rep.java.Insert_friend.search_column(Insert_friend.java:106)

Does line 106 correspond to result.containsColumn() call ?
If so, result was null.
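The fix implied here is to null-check before touching the result: ResultScanner.next() returns null when the scan is exhausted (including when nothing matched at all). The sketch below is HBase-free; the tiny stand-in `Scanner` interface merely mimics the same null-at-end contract to show the guard:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// ResultScanner.next() returns null once the scan has no more rows, so the
// result must be null-checked before calling containsColumn(). This stand-in
// Scanner interface has the same null-at-end contract.
public class NullGuardDemo {
    interface Scanner<T> { T next(); }

    // Consume a scanner safely: stop as soon as next() yields null.
    static <T> int countRows(Scanner<T> scanner) {
        int rows = 0;
        T row;
        while ((row = scanner.next()) != null) { // the missing null guard
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) {
        Deque<String> data = new ArrayDeque<>();
        data.add("row1");
        data.add("row2");
        Scanner<String> scanner = data::poll; // poll() returns null when empty
        System.out.println(countRows(scanner)); // prints 2
    }
}
```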


can't start region server after crash

2014-11-19 Thread Li Li
I am running a single-node pseudo-distributed HBase cluster on top of a
pseudo-distributed Hadoop. Hadoop is 1.2.1 and the HDFS replication factor is
1. The HBase version is 0.98.5.
Last night, I found that the region server had crashed (the process was gone).
I found many logs saying:
[JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host
machine (eg GC): pause of approximately 2176ms

GC pool 'ParNew' had collection(s): count=1 time=0ms

Then I used ./bin/stop-hbase.sh to stop it and start-hbase.sh to restart it.
Then I saw many logs in the region server like:

wal.HLogSplitter: Creating writer
path=hdfs://192.168.10.121:9000/hbase/data/default/baiducrawler.webpage/5e7f8f9c63c12a70892f3a774e3186f4/recovered.edits/0121515.temp
region=5e7f8f9c63c12a70892f3a774e3186f4

The CPU usage was high and the disk read/write speed was 20 MB/s, so I let it
run and went home.
This morning, I found the region server had crashed again and found these logs:

hdfs.DFSClient: Failed to close file
/hbase/data/default/baiducrawler.webpage/1a4628670035e53d38f87b534b3302bf/recovered.edits/0116237.temp

org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/hbase/data/default/baiducrawler.webpage/1a4628670035e53d38f87b534b3302bf/recovered.edits/0116237.temp
could only be replicated to 0 nodes, instead of 1

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)

at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)

at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)


at org.apache.hadoop.ipc.Client.call(Client.java:1113)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)

at com.sun.proxy.$Proxy8.addBlock(Unknown Source)

at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)

at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)

at com.sun.proxy.$Proxy8.addBlock(Unknown Source)

at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)

at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)

at com.sun.proxy.$Proxy9.addBlock(Unknown Source)

at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)

at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)

at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)

at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023


Re: can't start region server after crash

2014-11-19 Thread Li Li
Also, in the HDFS UI I found "Number of Under-Replicated Blocks: 497741".
It seems there are many bad blocks. Is there any method to rescue the good data?



Re: can't start region server after crash

2014-11-19 Thread Ted Yu
Have you tried using fsck?

Cheers




Re: can't start region server after crash

2014-11-19 Thread Li Li
I have tried that and found that many files' replication factor is 3
(dfs.replication is 1 in hdfs-site.xml). So I am trying to set it to 1 now.
There are so many files that it has been running for more than 30 minutes and
still has not finished.
I will try fsck later.




Re: can't start region server after crash

2014-11-19 Thread Li Li
hadoop fsck /
 Status: HEALTHY
  Total size:    1382743735840 B
  Total dirs:    1127
  Total files:   476753
  Total blocks (validated):      490085 (avg. block size 2821436 B)
  Minimally replicated blocks:   490085 (100.0 %)
  Over-replicated blocks:        0 (0.0 %)
  Under-replicated blocks:       0 (0.0 %)
  Mis-replicated blocks:         0 (0.0 %)
  Default replication factor:    1
  Average block replication:     1.0
  Corrupt blocks:                0
  Missing replicas:              0 (0.0 %)
  Number of data-nodes:          1
  Number of racks:               1
 FSCK ended at Thu Nov 20 13:57:44 CST 2014 in 9065 milliseconds
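
In case it helps to automate this check, the counters in the summary above can be pulled out of the plain fsck report with a small script. A minimal Python sketch (assuming the stock `hadoop fsck` summary format shown above; nothing here is HBase-specific):

```python
import re

def parse_fsck_summary(report: str) -> dict:
    """Extract numeric 'Name: value' counters from a `hadoop fsck` summary."""
    counters = {}
    for line in report.splitlines():
        # Lines look like "  Under-replicated blocks:       0 (0.0 %)"
        m = re.match(r"\s*([A-Za-z][\w ().-]*?):\s+([\d.]+)", line)
        if m:
            counters[m.group(1).strip()] = float(m.group(2))
    return counters

report = """
 Status: HEALTHY
  Total blocks (validated):      490085 (avg. block size 2821436 B)
  Under-replicated blocks:       0 (0.0 %)
  Corrupt blocks:                0
  Missing replicas:              0 (0.0 %)
"""
c = parse_fsck_summary(report)
print(c["Under-replicated blocks"], c["Corrupt blocks"])  # prints: 0.0 0.0
```

A cron job could run `hadoop fsck /`, feed the text through this, and alert when `Corrupt blocks` or `Under-replicated blocks` is nonzero.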

On Thu, Nov 20, 2014 at 11:25 AM, Ted Yu yuzhih...@gmail.com wrote:
 Have you tried using fsck ?

 Cheers

 On Wed, Nov 19, 2014 at 6:56 PM, Li Li fancye...@gmail.com wrote:

  Also, in the HDFS UI I found "Number of Under-Replicated Blocks: 497741".
  It seems there are many bad blocks. Is there any way to rescue the good
  data?

 On Thu, Nov 20, 2014 at 10:52 AM, Li Li fancye...@gmail.com wrote:
   I am running a single-node pseudo-distributed HBase cluster on top of a
  pseudo-distributed Hadoop.
   Hadoop is 1.2.1 and the HDFS replication factor is 1. The HBase
   version is 0.98.5.
   Last night I found the region server had crashed (the process was gone).
   I found many log lines like:
   [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host
   machine (eg GC): pause of approximately 2176ms
  
   GC pool 'ParNew' had collection(s): count=1 time=0ms
  
   Then I used ./bin/stop-hbase.sh to stop it and start-hbase.sh to
  restart it.
   Then I saw many log lines in the region server like:
  
   wal.HLogSplitter: Creating writer
   path=hdfs://
  192.168.10.121:9000/hbase/data/default/baiducrawler.webpage/5e7f8f9c63c12a70892f3a774e3186f4/recovered.edits/0121515.temp
   region=5e7f8f9c63c12a70892f3a774e3186f4
  
   The CPU usage was high and the disk read/write speed was 20MB/s, so I let
   it run and went home.
   This morning I found the region server had crashed again, with these logs:
 
   hdfs.DFSClient: Failed to close file
  /hbase/data/default/baiducrawler.webpage/1a4628670035e53d38f87b534b3302bf/recovered.edits/0116237.temp
   org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
  /hbase/data/default/baiducrawler.webpage/1a4628670035e53d38f87b534b3302bf/recovered.edits/0116237.temp
   could only be replicated to 0 nodes, instead of 1
       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
       at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
       at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:606)
       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:415)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
  
       at org.apache.hadoop.ipc.Client.call(Client.java:1113)
       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
       at com.sun.proxy.$Proxy8.addBlock(Unknown Source)
       at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:606)
       at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
       at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
       at com.sun.proxy.$Proxy8.addBlock(Unknown Source)
       at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:606)
       at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:294)
       at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023
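
With a replication factor of 1, "could only be replicated to 0 nodes, instead of 1" usually means the single datanode is down, excluded by the namenode, or out of disk space, so one quick sanity check is free space on the volume holding the datanode's data directory. A minimal sketch, assuming a POSIX host; the path below is a placeholder, not an actual dfs.data.dir value:

```python
import shutil

def has_free_space(path: str, min_free_bytes: int = 1 << 30) -> bool:
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` free; WAL-split writers need headroom to make progress."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes

# Placeholder path -- substitute the dfs.data.dir from your hdfs-site.xml.
print(has_free_space("/", min_free_bytes=1 << 20))
```

If this reports False, freeing space (or pointing dfs.data.dir at a bigger volume) and restarting the datanode is a reasonable first step before digging into namenode logs.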