[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930081#comment-15930081
 ] 

Ted Yu commented on HBASE-17798:


Did you put the proposed change in your cluster ?

If so, what effect do you observe ?

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: HBASE-17798-0.98-v1.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-master-v1.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930968#comment-15930968
 ] 

Andrew Purtell commented on HBASE-17798:


+1

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: HBASE-17798-0.98-v1.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-master-v1.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-17 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930969#comment-15930969
 ] 

Andrew Purtell commented on HBASE-17798:


If I can make a suggestion, though, change:
{code}
LOG.error(getName() + ": unexpectedly exception", e);
{code}
to
{code}
LOG.error(getName() + ": error in Reader", ex);
{code}
which is the same error message we use for IOException. 

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: HBASE-17798-0.98-v1.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-master-v1.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-18 Thread Guangxu Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931167#comment-15931167
 ] 

Guangxu Cheng commented on HBASE-17798:
---

{quote}
Did you put the proposed change in your cluster ?
If so, what effect do you observe ?
{quote}
Before fix, there are two effects:
1. A large number of requests are not processed and not closed due to the 
reader abort. And the number of TCP connections in the ESTABLISHED state of the 
RS is increasing as the picture(connections.png) shows.
2. The client has many SocketTimeoutException.
After fix, these problems do not exist.


> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: connections.png, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-18 Thread Guangxu Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931171#comment-15931171
 ] 

Guangxu Cheng commented on HBASE-17798:
---

Update patch as [~apurtell] suggestions.

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: connections.png, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933933#comment-15933933
 ] 

Ted Yu commented on HBASE-17798:


In the future, when you have image and patch to attach, attach patch last.
Otherwise:
{code}
  https://issues.apache.org/jira/secure/attachment/12859422/connections.png -> 
Downloaded
ERROR: Unsure how to process HBASE-17798.
{code}

> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
> Attachments: 17798-master-v2.patch, connections.png, 
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch, 
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934036#comment-15934036
 ] 

Hadoop QA commented on HBASE-17798:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
17s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
49s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
43s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 9s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 100m 49s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 140m 42s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859685/17798-master-v2.patch 
|
| JIRA Issue | HBASE-17798 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux e71a85eb1e35 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / a41b185 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6177/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6177/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/6177/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/H

[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934929#comment-15934929
 ] 

Hudson commented on HBASE-17798:


FAILURE: Integrated in Jenkins build HBase-1.4 #678 (See 
[https://builds.apache.org/job/HBase-1.4/678/])
HBASE-17798 RpcServer.Listener.Reader can abort due to (tedyu: rev 
9726c71681c0b8b22e83b056102803646b8d50c2)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Attachments: 17798-master-v2.patch, connections.png, 
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch, 
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2017-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935215#comment-15935215
 ] 

Hudson commented on HBASE-17798:


SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #2714 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/2714/])
HBASE-17798 RpcServer.Listener.Reader can abort due to (tedyu: rev 
1cfd22bf43c9b64afae35d9bf16f764d0da80cab)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/SimpleRpcServer.java


> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.3.0, 1.2.4, 0.98.24
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
> Fix For: 1.4.0, 2.0
>
> Attachments: 17798-master-v2.patch, connections.png, 
> HBASE-17798-0.98-v1.patch, HBASE-17798-0.98-v2.patch, 
> HBASE-17798-branch-1-v1.patch, HBASE-17798-branch-1-v2.patch, 
> HBASE-17798-master-v1.patch, HBASE-17798-master-v2.patch
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2018-12-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720738#comment-16720738
 ] 

Hudson commented on HBASE-17798:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #509 (See 
[https://builds.apache.org/job/HBase-1.3-IT/509/])
HBASE-17798 RpcServer.Listener.Reader can abort due to (apurtell: rev 
16699efea7fda08f242d97a3759cffc7610b3274)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.2.4, 0.98.24, 2.0.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 17798-master-v2.patch, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch, connections.png
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17798) RpcServer.Listener.Reader can abort due to CancelledKeyException

2018-12-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721125#comment-16721125
 ] 

Hudson commented on HBASE-17798:


Results for branch branch-1.3
[build #576 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/576/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/576//General_Nightly_Build_Report/]


(/) {color:green}+1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/576//JDK7_Nightly_Build_Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/576//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> RpcServer.Listener.Reader can abort due to CancelledKeyException
> 
>
> Key: HBASE-17798
> URL: https://issues.apache.org/jira/browse/HBASE-17798
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.0, 1.2.4, 0.98.24, 2.0.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: 1.4.0, 1.3.3, 2.0.0
>
> Attachments: 17798-master-v2.patch, HBASE-17798-0.98-v1.patch, 
> HBASE-17798-0.98-v2.patch, HBASE-17798-branch-1-v1.patch, 
> HBASE-17798-branch-1-v2.patch, HBASE-17798-master-v1.patch, 
> HBASE-17798-master-v2.patch, connections.png
>
>
> In our production cluster(0.98), some of the requests were unacceptable 
> because RpcServer.Listener.Reader were aborted.
> getReader() will return the next reader to deal with request.
> The implementation of getReader() as below:
> {code:title=RpcServer.java|borderStyle=solid}
> // The method that will return the next reader to work with
> // Simplistic implementation of round robin for now
> Reader getReader() {
>   currentReader = (currentReader + 1) % readers.length;
>   return readers[currentReader];
> }
> {code}
> If one of the readers abort, then it will lead to fall on the reader's 
> request will never be dealt with.
> Why does RpcServer.Listener.Reader abort?We add the debug log to get it.
> After a while, we got the following exception:
> {code}
> 2017-03-10 08:05:13,247 ERROR [RpcServer.reader=3,port=60020] ipc.RpcServer: 
> RpcServer.listener,port=60020: unexpectedly error in Reader(Throwable)
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
> at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
> at java.nio.channels.SelectionKey.isReadable(SelectionKey.java:289)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:592)
> at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:566)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> So, when deal with the request in reader, we should handle 
> CanceledKeyException.
> --
> versions 1.x and 2.0 will log and retrun when dealing with the 
> InterruptedException in Reader#doRunLoop after HBASE-10521. It will lead to 
> the same problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)