Lars Hofhansl created PHOENIX-3204:
--------------------------------------
Summary: Scanner lease timeout exception during UPSERT INTO t1 SELECT * FROM t2
Key: PHOENIX-3204
URL: https://issues.apache.org/jira/browse/PHOENIX-3204
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.8.0
Reporter: Lars Hofhansl
Fix For: 4.9.0, 4.8.1
We ran some large-ish Phoenix tests on a large-ish cluster.
We had loaded 1bn rows into table t1 and then tried to load the same data into a
different table t2. The UPSERT failed about halfway through with various
exceptions like the following.
Of particular interest are the exceptions in the close() chain, as Phoenix could
probably ignore those.
Just filing here for reference.
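To illustrate the "Phoenix could probably ignore those" point: the failure happens while closing a scanner that the server has already discarded, so a close() wrapper could swallow exactly that case. This is a minimal standalone sketch, not actual Phoenix code; `closeQuietly` and the message-matching heuristic are hypothetical.

{code}
import java.io.Closeable;
import java.io.IOException;

public class QuietClose {
    /**
     * Closes the resource, ignoring failures whose message indicates the
     * scanner is already gone on the server (UnknownScannerException, e.g.
     * after a lease expiry). Any other IOException is still propagated.
     */
    static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException e) {
            String msg = String.valueOf(e.getMessage());
            if (!msg.contains("UnknownScannerException")) {
                throw new RuntimeException(e); // a real error, rethrow
            }
            // lease already expired / scanner already closed: safe to ignore
        }
    }

    public static void main(String[] args) {
        // Simulate ScannerCallable.close() failing the way the log below shows
        Closeable expired = () -> {
            throw new IOException(
                "org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '49'");
        };
        closeQuietly(expired); // swallowed, does not propagate
        System.out.println("close ignored as expected");
    }
}
{code}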
{code}
2016-08-23 00:56:41,851 INFO [phoenix-1-thread-653] client.AsyncProcess - #4816618, waiting for some tasks to finish. Expected max=0, tasksSent=14, tasksDone=13, currentTasksDone=13, retries=13 hasError=false, tableName=null
2016-08-23 00:56:41,851 INFO [phoenix-1-thread-653] client.AsyncProcess - Left over 1 task(s) are processed on server(s): [host17-8,60020,1471339710192]
2016-08-23 00:56:41,851 INFO [phoenix-1-thread-653] client.AsyncProcess - Regions against which left over task(s) are processed: [TEST3,97432328$#},1471913672240.9f2fd6435921da1a203daf52f0116a34.]
2016-08-23 00:56:41,859 INFO [phoenix-1-thread-293] client.AsyncProcess - #4816622, waiting for some tasks to finish. Expected max=0, tasksSent=14, tasksDone=13, currentTasksDone=13, retries=13 hasError=false, tableName=null
2016-08-23 00:56:41,859 INFO [phoenix-1-thread-293] client.AsyncProcess - Left over 1 task(s) are processed on server(s): [host17-8,60020,1471339710192]
2016-08-23 00:56:41,859 INFO [phoenix-1-thread-293] client.AsyncProcess - Regions against which left over task(s) are processed: [TEST3,97432328$#},1471913672240.9f2fd6435921da1a203daf52f0116a34.]
2016-08-23 00:56:41,860 INFO [80-shared--pool2-t90] client.AsyncProcess - #4816618, table=TEST3, attempt=14/35 SUCCEEDED on host17-8,60020,1471339710192, tracking started Tue Aug 23 00:55:12 GMT 2016
2016-08-23 00:56:41,861 DEBUG [phoenix-1-thread-653] token.AuthenticationTokenSelector - No matching token found
2016-08-23 00:56:41,861 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - Creating SASL GSSAPI client. Server's Kerberos principal name is hbase/host28-20@
2016-08-23 00:56:41,862 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - Have sent token of size 730 from initSASLContext.
2016-08-23 00:56:41,863 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - Will read input token of size 108 for processing by initSASLContext
2016-08-23 00:56:41,863 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - Will send token of size 0 from initSASLContext.
2016-08-23 00:56:41,863 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - Will read input token of size 32 for processing by initSASLContext
2016-08-23 00:56:41,863 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - Will send token of size 32 from initSASLContext.
2016-08-23 00:56:41,863 DEBUG [phoenix-1-thread-653] security.HBaseSaslRpcClient - SASL client context established. Negotiated QoP: auth
2016-08-23 00:56:41,865 INFO [phoenix-1-thread-653] client.ClientScanner - For hints related to the following exception, please try taking a look at: https://hbase.apache.org/book.html#trouble.client.scantimeout
2016-08-23 00:56:41,865 WARN [phoenix-1-thread-653] client.ScannerCallable - Ignore, probably already closed
org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '49'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
        at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:305)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:309)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:58)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
        at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:493)
        at org.apache.phoenix.iterate.ScanningResultIterator.close(ScanningResultIterator.java:49)
        at org.apache.phoenix.iterate.TableResultIterator.close(TableResultIterator.java:113)
        at org.apache.phoenix.jdbc.PhoenixResultSet.close(PhoenixResultSet.java:162)
        at org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertCompiler.java:224)
        at org.apache.phoenix.compile.UpsertCompiler.access$000(UpsertCompiler.java:110)
        at org.apache.phoenix.compile.UpsertCompiler$UpsertingParallelIteratorFactory.mutate(UpsertCompiler.java:254)
        at org.apache.phoenix.compile.MutatingParallelIteratorFactory.newIterator(MutatingParallelIteratorFactory.java:59)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:114)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:106)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '49'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1512)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1714)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1784)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32938)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:307)
        ... 19 more
2016-08-23 00:56:41,869 INFO [80-shared--pool2-t15] client.AsyncProcess - #4816622, table=TEST3, attempt=14/35 SUCCEEDED on host17-8,60020,1471339710192, tracking started Tue Aug 23 00:55:12 GMT 2016
2016-08-23 00:56:41,869 INFO [phoenix-1-thread-293] client.ClientScanner - For hints related to the following exception, please try taking a look at: https://hbase.apache.org/book.html#trouble.client.scantimeout
2016-08-23 00:56:41,870 WARN [phoenix-1-thread-293] client.ScannerCallable - Ignore, probably already closed
org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '107'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
        at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:305)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:309)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:58)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
        at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:493)
        at org.apache.phoenix.iterate.ScanningResultIterator.close(ScanningResultIterator.java:49)
        at org.apache.phoenix.iterate.TableResultIterator.close(TableResultIterator.java:113)
        at org.apache.phoenix.jdbc.PhoenixResultSet.close(PhoenixResultSet.java:162)
        at org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertCompiler.java:224)
        at org.apache.phoenix.compile.UpsertCompiler.access$000(UpsertCompiler.java:110)
        at org.apache.phoenix.compile.UpsertCompiler$UpsertingParallelIteratorFactory.mutate(UpsertCompiler.java:254)
        at org.apache.phoenix.compile.MutatingParallelIteratorFactory.newIterator(MutatingParallelIteratorFactory.java:59)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:114)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:106)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '107'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1512)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1714)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1784)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32938)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:307)
        ... 19 more
2016-08-23 00:56:41,890 INFO [phoenix-1-thread-174] client.AsyncProcess - #4816630, waiting for some tasks to finish. Expected max=0, tasksSent=14, tasksDone=13, currentTasksDone=13, retries=13 hasError=false, tableName=null
2016-08-23 00:56:41,890 INFO [phoenix-1-thread-174] client.AsyncProcess - Left over 1 task(s) are processed on server(s): [host17-8,60020,1471339710192]
2016-08-23 00:56:41,890 INFO [phoenix-1-thread-174] client.AsyncProcess - Regions against which left over task(s) are processed: [TEST3,97432328$#},1471913672240.9f2fd6435921da1a203daf52f0116a34.]
2016-08-23 00:56:41,900 INFO [0-shared--pool2-t149] client.AsyncProcess - #4816630, table=TEST3, attempt=14/35 SUCCEEDED on host17-8,60020,1471339710192, tracking started Tue Aug 23 00:55:12 GMT 2016
2016-08-23 00:56:41,901 DEBUG [phoenix-1-thread-174] token.AuthenticationTokenSelector - No matching token found
2016-08-23 00:56:41,901 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - Creating SASL GSSAPI client. Server's Kerberos principal name is hbase/host12-5@
2016-08-23 00:56:41,902 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - Have sent token of size 729 from initSASLContext.
2016-08-23 00:56:41,903 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - Will read input token of size 108 for processing by initSASLContext
2016-08-23 00:56:41,903 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - Will send token of size 0 from initSASLContext.
2016-08-23 00:56:41,903 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - Will read input token of size 32 for processing by initSASLContext
2016-08-23 00:56:41,903 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - Will send token of size 32 from initSASLContext.
2016-08-23 00:56:41,903 DEBUG [phoenix-1-thread-174] security.HBaseSaslRpcClient - SASL client context established. Negotiated QoP: auth
2016-08-23 00:56:41,904 INFO [phoenix-1-thread-174] client.ClientScanner - For hints related to the following exception, please try taking a look at: https://hbase.apache.org/book.html#trouble.client.scantimeout
2016-08-23 00:56:41,905 WARN [phoenix-1-thread-174] client.ScannerCallable - Ignore, probably already closed
org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '48'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
        at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:305)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:309)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:58)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
        at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:493)
        at org.apache.phoenix.iterate.ScanningResultIterator.close(ScanningResultIterator.java:49)
        at org.apache.phoenix.iterate.TableResultIterator.close(TableResultIterator.java:113)
        at org.apache.phoenix.jdbc.PhoenixResultSet.close(PhoenixResultSet.java:162)
        at org.apache.phoenix.compile.UpsertCompiler.upsertSelect(UpsertCompiler.java:224)
        at org.apache.phoenix.compile.UpsertCompiler.access$000(UpsertCompiler.java:110)
        at org.apache.phoenix.compile.UpsertCompiler$UpsertingParallelIteratorFactory.mutate(UpsertCompiler.java:254)
        at org.apache.phoenix.compile.MutatingParallelIteratorFactory.newIterator(MutatingParallelIteratorFactory.java:59)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:114)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:106)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownScannerException): org.apache.hadoop.hbase.UnknownScannerException: Unknown scanner '48'. This can happen due to any of the following reasons: a) Scanner id given is wrong, b) Scanner lease expired because of long wait between consecutive client checkins, c) Server may be closing down, d) RegionServer restart during upgrade.
If the issue is due to reason (b), a possible fix would be increasing the value of 'hbase.client.scanner.timeout.period' configuration.
        at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3228)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32492)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2208)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
        at java.lang.Thread.run(Thread.java:745)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1512)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1714)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1784)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32938)
        at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:307)
        ... 19 more
{code}
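For reference, the exception's reason (b) points at the client-side scanner lease timeout, which is configurable. A hypothetical hbase-site.xml override might look like this (the property name comes from the log above; the value shown is illustrative only, not a tested recommendation):

{code}
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <!-- scanner lease timeout in milliseconds; value below is illustrative -->
  <value>120000</value>
</property>
{code}

Note this should match the server-side lease period as well, or the server will still expire the scanner first.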
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)