Forgot to mention that. It's version 0.92.1 (Cloudera CDH4.1.1), running on 
64-bit CentOS 6 with Java 1.6.0_31.

On Dec 14, 2012, at 5:31 PM, lars hofhansl <lhofha...@yahoo.com> wrote:

> Hey Bryan, 
> 
> 
> which version of HBase is this?
> 
> -- Lars
> 
> 
> 
> ________________________________
> From: Bryan Keller <brya...@gmail.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org> 
> Sent: Friday, December 14, 2012 2:59 PM
> Subject: HBaseClient.call() hang
> 
> I have encountered a problem with HBaseClient.call() hanging. This occurs 
> when one of my regionservers goes down while performing a table scan.
> 
> What exacerbates this problem is that the scan I am performing uses filters, 
> and the region size of the table is large (4 GB). Because of this, it can take 
> several minutes for a row to be returned when calling scanner.next(). 
> Apparently there is no keep-alive message being sent back to the scanner 
> while the region server is busy, so I had to increase the hbase.rpc.timeout 
> value to a large number (60 min); otherwise the next() call will time out 
> waiting for the regionserver to send something back.
> 
> The result is that this HBaseClient.call() hang is made much worse, because 
> it won't time out for 60 minutes.
> 
> I have a couple of questions:
> 
> 1. Any thoughts on why HBaseClient.call() is getting stuck? I noticed 
> that call.wait() is not using any timeout, so it will wait indefinitely until 
> interrupted externally.
> 
> 2. Is there a solution where I do not need to set hbase.rpc.timeout to a very 
> large number? My only thought would be to forgo using filters and do the 
> filtering client side, which seems pretty inefficient.
> 
> Here is a stack dump of the thread that was hung:
> 
> Thread 10609: (state = BLOCKED)
> - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
> - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
> - org.apache.hadoop.hbase.ipc.HBaseClient.call(org.apache.hadoop.io.Writable, 
> java.net.InetSocketAddress, java.lang.Class, 
> org.apache.hadoop.hbase.security.User, int) @bci=51, line=904 (Interpreted 
> frame)
> - 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(java.lang.Object,
>  java.lang.reflect.Method, java.lang.Object[]) @bci=52, line=150 (Interpreted 
> frame)
> - $Proxy12.next(long, int) @bci=26 (Interpreted frame)
> - org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=72, line=92 
> (Interpreted frame)
> - org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=1, line=42 
> (Interpreted frame)
> - 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(org.apache.hadoop.hbase.client.ServerCallable)
>  @bci=36, line=1325 (Interpreted frame)
> - org.apache.hadoop.hbase.client.HTable$ClientScanner.next() @bci=117, 
> line=1299 (Compiled frame)
> - org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue() 
> @bci=41, line=150 (Interpreted frame)
> - org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue() @bci=4, 
> line=142 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue() 
> @bci=4, line=458 (Interpreted frame)
> - org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue() @bci=4, 
> line=76 (Interpreted frame)
> - org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue() 
> @bci=4, line=85 (Interpreted frame)
> - 
> org.apache.hadoop.mapreduce.Mapper.run(org.apache.hadoop.mapreduce.Mapper$Context)
>  @bci=6, line=139 (Interpreted frame)
> - 
> org.apache.hadoop.mapred.MapTask.runNewMapper(org.apache.hadoop.mapred.JobConf,
>  org.apache.hadoop.mapreduce.split.JobSplit$TaskSplitIndex, 
> org.apache.hadoop.mapred.TaskUmbilicalProtocol, 
> org.apache.hadoop.mapred.Task$TaskReporter) @bci=201, line=645 (Interpreted 
> frame)
> - org.apache.hadoop.mapred.MapTask.run(org.apache.hadoop.mapred.JobConf, 
> org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=100, line=325 
> (Interpreted frame)
> - org.apache.hadoop.mapred.Child$4.run() @bci=29, line=268 (Interpreted frame)
> - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction,
>  java.security.AccessControlContext) @bci=0 (Interpreted frame)
> - javax.security.auth.Subject.doAs(javax.security.auth.Subject, 
> java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
> - 
> org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction)
>  @bci=14, line=1332 (Interpreted frame)
> - org.apache.hadoop.mapred.Child.main(java.lang.String[]) @bci=776, line=262 
> (Interpreted frame)
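
For readers following the stack dump above: the hung thread is parked in Object.wait() with no timeout, which is the behavior question 1 asks about. Below is a minimal sketch of the general alternative, a wait bounded by a deadline. It is not HBase's code and not a proposed patch; BoundedCallWait, PendingCall, complete() and await() are hypothetical names made up for illustration, and only the JDK's Object.wait(long) and notifyAll() are real APIs here.

```java
import java.util.concurrent.TimeoutException;

// Illustration only: hypothetical names, not the HBaseClient implementation.
class BoundedCallWait {

    /** Hypothetical stand-in for a pending RPC whose response may never arrive. */
    static final class PendingCall {
        private boolean done;
        private Object value;

        /** Called by the (hypothetical) response-reader thread. */
        synchronized void complete(Object v) {
            value = v;
            done = true;
            notifyAll();
        }

        /** Waits for completion, but gives up once timeoutMillis has elapsed. */
        synchronized Object await(long timeoutMillis)
                throws TimeoutException, InterruptedException {
            long deadline = System.nanoTime() + timeoutMillis * 1000000L;
            // Loop to guard against spurious wakeups.
            while (!done) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new TimeoutException(
                        "no response within " + timeoutMillis + " ms");
                }
                // wait(0) would block forever, so never pass less than 1 ms.
                wait(Math.max(1L, remainingNanos / 1000000L));
            }
            return value;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PendingCall call = new PendingCall(); // nobody ever completes this call
        try {
            call.await(200);
        } catch (TimeoutException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

With an unbounded wait(), the same main() would block until the thread was interrupted externally. Note that a bounded wait only limits how long the caller blocks; it does not by itself address the slow filtered scan that motivated the 60-minute hbase.rpc.timeout.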
