[
https://issues.apache.org/jira/browse/PHOENIX-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057160#comment-14057160
]
Vikas Vishwakarma commented on PHOENIX-998:
-------------------------------------------
I saw similar issues with a native HBase API based client loader, and I could
correlate them to GC pauses in the RegionServers. Log traces are given below.
From the client logs, a read batch failed on the RegionServer machine at
11:13:29:
===========
[Wed Jul 09 11:13:29 GMT 2014
com.salesforce.hbase.opthreads.FetchScanRunTimeThread run SEVERE]
FetchScanRunTimeThread: java.lang.RuntimeException:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=1, exceptions:
Wed Jul 09 11:13:29 GMT 2014,
org.apache.hadoop.hbase.client.RpcRetryingCaller@2c109a0a,
java.net.SocketTimeoutException: Call to ---hostname masked--- failed because
java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel
to be ready for read. ch : java.nio.channels.SocketChannel[---hostname
masked---]
From the RegionServer logs at the same time, there are two consecutive GC
pauses totaling roughly 3.4 seconds, followed by an RpcServer.responder
asyncWrite failure:
===========
2014-07-09 11:13:21,372 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor:
Detected pause in JVM or host machine (eg GC): pause of approximately 1552ms
2014-07-09 11:13:29,981 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor:
Detected pause in JVM or host machine (eg GC): pause of approximately 1846ms
2014-07-09 11:13:30,040 WARN org.apache.hadoop.ipc.RpcServer:
RpcServer.listener,port=60020: count of bytes read: 0
2014-07-09 11:13:30,041 WARN org.apache.hadoop.ipc.RpcServer:
RpcServer.respondercallId: 30901 service: ClientService methodName: Scan size:
24 connection: 10.230.229.22:57115: output error
2014-07-09 11:13:30,042 INFO org.apache.hadoop.ipc.RpcServer:
RpcServer.responder: asyncWrite
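The "2000 millis timeout" in both traces suggests the client RPC timeout was
tuned down to 2000 ms, which is shorter than the combined GC pauses above. A
minimal mitigation sketch, assuming that timeout comes from hbase.rpc.timeout
(the exact values below are illustrative assumptions, not taken from these
logs): raise the client timeout above the worst-case pause so a single GC no
longer burns through the retry budget.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;

public class TimeoutTuning {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Assumed: hbase.rpc.timeout had been lowered to 2000 ms (HBase
        // default is 60000). Set it above the worst-case observed GC
        // pause (~3.4 s in the RegionServer logs above).
        conf.setInt("hbase.rpc.timeout", 10000);
        // Bound the retry storm visible in the client logs (attempt=12/350).
        conf.setInt("hbase.client.retries.number", 10);
        HConnection connection = HConnectionManager.createConnection(conf);
        // ... run the loader against this connection, then close it.
        connection.close();
    }
}

The alternative, of course, is to tune GC on the RegionServers so pauses stay
well under the configured timeout; the right balance depends on the workload.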
> SocketTimeoutException under high concurrent write access to phoenix indexed
> table
> ----------------------------------------------------------------------------------
>
> Key: PHOENIX-998
> URL: https://issues.apache.org/jira/browse/PHOENIX-998
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.0.0
> Environment: HBase 0.98.1-SNAPSHOT, Hadoop 2.3.0-cdh5.0.0
> Reporter: wangxianbin
> Priority: Critical
>
> We have a small HBase cluster with one master and six slaves. We are testing
> Phoenix indexed-table concurrent write performance with four write clients;
> each client runs 100 threads, and each thread holds one Phoenix JDBC
> connection. We encounter the SocketTimeoutException below, and it retries for
> a very long time. How can I deal with such an issue? (A sketch of this client
> pattern appears after the quoted logs below.)
> 2014-05-22 17:22:58,490 INFO
> [storm4.org,60020,1400750242045-index-writer--pool3-t10] client.AsyncProcess:
> #16016, waiting for some tasks to finish. Expected max=0, tasksSent=13,
> tasksDone=12, currentTasksDone=12, retries=11 hasError=false,
> tableName=IPHOENIX10M
> 2014-05-22 17:23:00,436 INFO
> [storm4.org,60020,1400750242045-index-writer--pool3-t6] client.AsyncProcess:
> #16027, waiting for some tasks to finish. Expected max=0, tasksSent=13,
> tasksDone=12, currentTasksDone=12, retries=11 hasError=false,
> tableName=IPHOENIX10M
> 2014-05-22 17:23:00,440 INFO
> [storm4.org,60020,1400750242045-index-writer--pool3-t1] client.AsyncProcess:
> #16013, waiting for some tasks to finish. Expected max=0, tasksSent=13,
> tasksDone=12, currentTasksDone=12, retries=11 hasError=false,
> tableName=IPHOENIX10M
> 2014-05-22 17:23:00,449 INFO
> [storm4.org,60020,1400750242045-index-writer--pool3-t7] client.AsyncProcess:
> #16028, waiting for some tasks to finish. Expected max=0, tasksSent=13,
> tasksDone=12, currentTasksDone=12, retries=11 hasError=false,
> tableName=IPHOENIX10M
> 2014-05-22 17:23:00,473 INFO
> [storm4.org,60020,1400750242045-index-writer--pool3-t8] client.AsyncProcess:
> #16020, waiting for some tasks to finish. Expected max=0, tasksSent=13,
> tasksDone=12, currentTasksDone=12, retries=11 hasError=false,
> tableName=IPHOENIX10M
> 2014-05-22 17:23:00,494 INFO [htable-pool20-t13] client.AsyncProcess:
> #16016, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception:
> java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed
> because java.net.SocketTimeoutException: 2000 millis timeout while waiting
> for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017
> remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156,
> tracking started Thu May 22 17:21:32 CST 2014, retrying after 20189 ms,
> replay 1 ops.
> 2014-05-22 17:23:02,439 INFO
> [storm4.org,60020,1400750242045-index-writer--pool3-t4] client.AsyncProcess:
> #16022, waiting for some tasks to finish. Expected max=0, tasksSent=13,
> tasksDone=12, currentTasksDone=12, retries=11 hasError=false,
> tableName=IPHOENIX10M
> 2014-05-22 17:23:02,496 INFO [htable-pool20-t3] client.AsyncProcess: #16013,
> table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception:
> java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed
> because java.net.SocketTimeoutException: 2000 millis timeout while waiting
> for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017
> remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156,
> tracking started Thu May 22 17:21:32 CST 2014, retrying after 20001 ms,
> replay 1 ops.
> 2014-05-22 17:23:02,496 INFO [htable-pool20-t16] client.AsyncProcess:
> #16028, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception:
> java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed
> because java.net.SocketTimeoutException: 2000 millis timeout while waiting
> for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017
> remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156,
> tracking started Thu May 22 17:21:37 CST 2014, retrying after 20095 ms,
> replay 1 ops.
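For reference, a minimal sketch of the write pattern described in the report:
each thread holds its own Phoenix JDBC connection and upserts into the indexed
table. The table name IPHOENIX10M comes from the logs above, but its column
schema, the ZooKeeper quorum host, and the row counts below are assumptions for
illustration only (the Phoenix 4.x client jar must be on the classpath; its
JDBC driver self-registers).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class IndexedWriteLoad {
    public static void main(String[] args) {
        final int THREADS = 100;                    // threads per client, as in the report
        final String URL = "jdbc:phoenix:zk-host";  // ZK quorum is an assumption
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            final int threadId = t;
            pool.submit(new Runnable() {
                @Override
                public void run() {
                    // Assumed schema: IPHOENIX10M(ID BIGINT PRIMARY KEY, VAL VARCHAR)
                    // with a secondary index on VAL.
                    try (Connection conn = DriverManager.getConnection(URL);
                         PreparedStatement ps = conn.prepareStatement(
                             "UPSERT INTO IPHOENIX10M (ID, VAL) VALUES (?, ?)")) {
                        for (int i = 0; i < 100000; i++) {
                            ps.setLong(1, (long) threadId * 100000 + i);
                            ps.setString(2, "val-" + i);
                            ps.executeUpdate();
                            if (i % 1000 == 0) {
                                conn.commit();  // Phoenix buffers mutations until commit
                            }
                        }
                        conn.commit();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }
        pool.shutdown();
    }
}

Under this load, every commit fans out index updates from the RegionServers
themselves (the index-writer pool threads in the logs above), so a GC pause on
one server stalls both client writes and server-side index writes at once.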