[ https://issues.apache.org/jira/browse/HBASE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947790#comment-14947790 ]
Stephen Yuan Jiang commented on HBASE-14533: -------------------------------------------- [~stack], is this the same as HBASE-14196? > Thrift client gets "AsyncProcess: Failed to get region location .... closed" > ---------------------------------------------------------------------------- > > Key: HBASE-14533 > URL: https://issues.apache.org/jira/browse/HBASE-14533 > Project: HBase > Issue Type: Bug > Components: REST, Thrift > Affects Versions: 1.0.0 > Reporter: stack > Attachments: test.patch > > > An internal python client has been getting below stack trace since > HBASE-134347 > {code} > 2015-09-30 11:27:31,670 runner ERROR : scheduler > executor error > 2015-09-30 11:27:31,674 runner ERROR : Traceback (most > recent call last): > File > "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/runner.py", > line 82, in run > fetch_list = self.__scheduler_executor.run() > File > "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/scheduler.py", > line 35, in run > with self.__fetch_db_dao.get_scanner() as scanner: > File > "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_dao.py", > line 57, in get_scanner > caching=caching, field_filter_list=field_filter_list) > File > "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_client_template.py", > line 104, in get_entity_scanner > self.__fix_cfs(self.__filter_columns(field_filter_list)), caching) > File > "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_entity_scanner.py", > line 81, in open > self.__scanner_id = client.scannerOpenWithScan(table_name, scan) > File > "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", > line 1494, in scannerOpenWithScan > return self.recv_scannerOpenWithScan() > File > "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", > line 1518, in recv_scannerOpenWithScan > raise result.io > IOError: > IOError(message="org.apache.hadoop.hbase.client.RetriesExhaustedException: > Can't get the location\n\tat > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)\n\tat > > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149)\n\tat > > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)\n\tat > > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)\n\tat > > org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293)\n\tat > > org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)\n\tat > > org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)\n\tat > > org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)\n\tat > org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)\n\tat > org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1446)\n\tat > sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)\n\tat > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat > java.lang.reflect.Method.invoke(Method.java:606)\n\tat > org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)\n\tat > com.sun.proxy.$Proxy14.scannerOpenWithScan(Unknown Source)\n\tat > org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4609)\n\tat > > org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4593)\n\tat > org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat > org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat > org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)\n\tat > > org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)\n\tat > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat > java.lang.Thread.run(Thread.java:745)\nCaused by: java.io.IOException: > hconnection-0xa8e1bf9 closed\n\tat > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)\n\tat > > org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)\n\t... > 23 more\n") > {code} > On the thrift server side we see this: > {code} > 2015-09-30 07:22:59,427 ERROR org.apache.hadoop.hbase.client.AsyncProcess: > Failed to get region location > java.io.IOException: hconnection-0x4142991e closed > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117) > at > org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:369) > at > org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320) > at > org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206) > at > org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183) > at > org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1496) > at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1107) > at > org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1256) > at > org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1209) > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67) > at com.sun.proxy.$Proxy14.mutateRow(Unknown Source) > at > org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4334) > at > org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4318) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502) > at > org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > HBASE-13437 has us actual execute a close on timeout -- before we'd mark > connection closed but would never call close on it. > A background chore is going around stamping Connections in the > ConnectionCache as 'closed' if they have not been used in ten minutes. The > 'close' can come in at any time..... In particular between the point at which > we get the table/connection and when we go to use it: i.e. flush puts. It is > at the flush puts point that we get the above 'AsyncProcess: Failed to get > region location' (It is not a failure to find region location but rather our > noticing that the connection has been closed). > Attempts at reproducing this issue locally letting the Connection timeout can > generate the above exception if a certain dance is done but it is hard to do; > I am not reproducing the actual usage by the aforementioned client. > Next steps would be setting up python client talking via thrift and then try > using connection after it has been evicted from the connection cache. Another > thing to try is a pool of connections on the python side...connections are > identified by user and table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)