[GitHub] [hadoop] hadoop-yetus commented on issue #1011: HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memor…

2019-07-06 Thread GitBox
hadoop-yetus commented on issue #1011: HDFS-14313. Get hdfs used space from 
FsDatasetImpl#volumeMap#ReplicaInfo in memor…
URL: https://github.com/apache/hadoop/pull/1011#issuecomment-508973639
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|:--------|:-------|
   | 0 | reexec | 53 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 3 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 72 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1116 | trunk passed |
   | +1 | compile | 1138 | trunk passed |
   | +1 | checkstyle | 154 | trunk passed |
   | +1 | mvnsite | 156 | trunk passed |
   | +1 | shadedclient | 1037 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 125 | trunk passed |
   | 0 | spotbugs | 174 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 292 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | +1 | mvninstall | 106 | the patch passed |
   | +1 | compile | 1084 | the patch passed |
   | +1 | javac | 1084 | the patch passed |
   | -0 | checkstyle | 154 | root: The patch generated 10 new + 245 unchanged - 1 fixed = 255 total (was 246) |
   | +1 | mvnsite | 151 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 717 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 121 | the patch passed |
   | +1 | findbugs | 308 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 524 | hadoop-common in the patch passed. |
   | -1 | unit | 6851 | hadoop-hdfs in the patch failed. |
   | -1 | asflicense | 56 | The patch generated 1 ASF License warnings. |
   | | | 14201 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestFSImage |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.TestReconstructStripedFile |
   |   | hadoop.hdfs.web.TestWebHdfsTimeouts |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1011 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux f6e72ffb8af4 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9c90729 |
   | Default Java | 1.8.0_212 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/5/artifact/out/diff-checkstyle-root.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/5/testReport/ |
   | asflicense | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/5/artifact/out/patch-asflicense-problems.txt |
   | Max. process+thread count | 3803 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/5/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] hadoop-yetus commented on issue #1011: HDFS-14313. Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memor…

2019-07-06 Thread GitBox
hadoop-yetus commented on issue #1011: HDFS-14313. Get hdfs used space from 
FsDatasetImpl#volumeMap#ReplicaInfo in memor…
URL: https://github.com/apache/hadoop/pull/1011#issuecomment-508972755
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|:--------|:-------|
   | 0 | reexec | 32 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 3 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 63 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1090 | trunk passed |
   | +1 | compile | 1113 | trunk passed |
   | +1 | checkstyle | 133 | trunk passed |
   | +1 | mvnsite | 147 | trunk passed |
   | +1 | shadedclient | 934 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 112 | trunk passed |
   | 0 | spotbugs | 164 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 277 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 25 | Maven dependency ordering for patch |
   | +1 | mvninstall | 101 | the patch passed |
   | +1 | compile | 978 | the patch passed |
   | +1 | javac | 978 | the patch passed |
   | -0 | checkstyle | 133 | root: The patch generated 10 new + 245 unchanged - 1 fixed = 255 total (was 246) |
   | +1 | mvnsite | 145 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 613 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 112 | the patch passed |
   | +1 | findbugs | 309 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 545 | hadoop-common in the patch failed. |
   | -1 | unit | 5617 | hadoop-hdfs in the patch failed. |
   | -1 | asflicense | 47 | The patch generated 1 ASF License warnings. |
   | | | 12504 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ha.TestZKFailoverController |
   |   | hadoop.util.TestBasicDiskValidator |
   |   | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
   |   | hadoop.metrics2.sink.TestRollingFileSystemSinkWithSecureHdfs |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.web.TestWebHdfsTimeouts |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1011 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux ae72623b7916 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9c90729 |
   | Default Java | 1.8.0_212 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/artifact/out/diff-checkstyle-root.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/testReport/ |
   | asflicense | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/artifact/out/patch-asflicense-problems.txt |
   | Max. process+thread count | 4207 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1011/6/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16403) Start a new statistical rpc queue and make the Reader's pendingConnection queue runtime-replaceable

2019-07-06 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879790#comment-16879790
 ] 

He Xiaoqiao commented on HADOOP-16403:
--

Thanks [~LiJinglun] for your response. Did you backport HDFS-6763? I have met 
this issue once, and it was resolved by applying HDFS-6763. Please try 
applying that patch; further discussion is welcome.

> Start a new statistical rpc queue and make the Reader's pendingConnection 
> queue runtime-replaceable
> ---
>
> Key: HADOOP-16403
> URL: https://issues.apache.org/jira/browse/HADOOP-16403
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HADOOP-16403.001.patch, MetricLinkedBlockingQueueTest.pdf
>
>
> I have an HA cluster with 2 NameNodes. The NameNode's metadata is quite big, 
> so after the active NameNode dies it takes the standby more than 40s to 
> become active. Many requests (TCP connect requests and RPC requests) from 
> DataNodes, clients and ZKFC time out and start retrying. The sudden request 
> flood lasts for the next 2 minutes, until finally all requests are either 
> handled or run out of retries. 
>  Adjusting the RPC-related settings might strengthen the NameNode and solve 
> this problem, and the key point is finding the bottleneck. The RPC server 
> can be described as below:
> {noformat}
> Listener -> Readers' queues -> Readers -> callQueue -> Handlers{noformat}
> By sampling some failed clients, I find many of them got 
> ConnectTimeoutException, caused by a TCP connect request that went 
> unanswered for 20s. I think the reader queue may be full and blocking the 
> listener from handling new connections. Both slow handlers and slow readers 
> can block the whole processing pipeline, and I need to know which it is. I 
> think *a queue that computes the QPS, writes a log when the queue is full, 
> and can be replaced easily* will help. 
>  I find the nice work in HADOOP-10302 implementing a runtime-swapped queue. 
> Using it for the Reader's queue makes the reader queue runtime-swappable 
> automatically. The QPS computation could be done by implementing a subclass 
> of LinkedBlockingQueue that does the computation while put/take/... happens. 
> The QPS data will show up in JMX.
>  
>  
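To make the quoted proposal concrete, here is a minimal, hypothetical sketch of the kind of counting LinkedBlockingQueue subclass the description suggests. The class and field names are illustrative only (they are not taken from HADOOP-16403.001.patch), and it covers only the counting part; logging when the queue is full and JMX publication would sit on top of it.

{code:java}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.LongAdder;

/**
 * Illustrative sketch: a LinkedBlockingQueue that counts put/take operations
 * so a QPS figure can be derived and exposed (e.g. via a metrics/JMX bean).
 * Not the implementation attached to HADOOP-16403.
 */
public class CountingBlockingQueue<E> extends LinkedBlockingQueue<E> {
  private final LongAdder puts = new LongAdder();
  private final LongAdder takes = new LongAdder();

  public CountingBlockingQueue(int capacity) {
    super(capacity);
  }

  @Override
  public void put(E e) throws InterruptedException {
    super.put(e);
    puts.increment();        // count successful enqueues
  }

  @Override
  public E take() throws InterruptedException {
    E e = super.take();
    takes.increment();       // count successful dequeues
    return e;
  }

  /** Totals that a metrics thread could sample periodically to compute QPS. */
  public long totalPuts()  { return puts.sum(); }
  public long totalTakes() { return takes.sum(); }
}
{code}

A real version would also override offer/poll and their timed variants so every enqueue/dequeue path is counted.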



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16403) Start a new statistical rpc queue and make the Reader's pendingConnection queue runtime-replaceable

2019-07-06 Thread Jinglun (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879636#comment-16879636
 ] 

Jinglun commented on HADOOP-16403:
--

Hi [~hexiaoqiao], sorry for the late reply. It's a Xiaomi internal version based 
on Hadoop 2.6. The NameNode holds 140,984,043 inodes and 351,238,718 blocks, 
serves 3000+ DataNodes, and runs with a 100G heap. Version 2.6 is very old, but 
we have backported many updates from later versions, so it's not really that 
old.

I'm considering profiling the failover transition and the long-tail effect 
caused by the long transition; the queue is a tool for that. Maybe replaying 
the edit log and handling the postponed block reports cost too much time. What 
do you think?

Any advice on how to reproduce the situation and how to troubleshoot the 
problem? Looking forward to your suggestions. :)

> Start a new statistical rpc queue and make the Reader's pendingConnection 
> queue runtime-replaceable
> ---
>
> Key: HADOOP-16403
> URL: https://issues.apache.org/jira/browse/HADOOP-16403
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HADOOP-16403.001.patch, MetricLinkedBlockingQueueTest.pdf
>
>
> I have an HA cluster with 2 NameNodes. The NameNode's metadata is quite big, 
> so after the active NameNode dies it takes the standby more than 40s to 
> become active. Many requests (TCP connect requests and RPC requests) from 
> DataNodes, clients and ZKFC time out and start retrying. The sudden request 
> flood lasts for the next 2 minutes, until finally all requests are either 
> handled or run out of retries. 
>  Adjusting the RPC-related settings might strengthen the NameNode and solve 
> this problem, and the key point is finding the bottleneck. The RPC server 
> can be described as below:
> {noformat}
> Listener -> Readers' queues -> Readers -> callQueue -> Handlers{noformat}
> By sampling some failed clients, I find many of them got 
> ConnectTimeoutException, caused by a TCP connect request that went 
> unanswered for 20s. I think the reader queue may be full and blocking the 
> listener from handling new connections. Both slow handlers and slow readers 
> can block the whole processing pipeline, and I need to know which it is. I 
> think *a queue that computes the QPS, writes a log when the queue is full, 
> and can be replaced easily* will help. 
>  I find the nice work in HADOOP-10302 implementing a runtime-swapped queue. 
> Using it for the Reader's queue makes the reader queue runtime-swappable 
> automatically. The QPS computation could be done by implementing a subclass 
> of LinkedBlockingQueue that does the computation while put/take/... happens. 
> The QPS data will show up in JMX.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16403) Start a new statistical rpc queue and make the Reader's pendingConnection queue runtime-replaceable

2019-07-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879628#comment-16879628
 ] 

Hadoop QA commented on HADOOP-16403:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} | {color:red} HADOOP-16403 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-16403 |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/16370/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Start a new statistical rpc queue and make the Reader's pendingConnection 
> queue runtime-replaceable
> ---
>
> Key: HADOOP-16403
> URL: https://issues.apache.org/jira/browse/HADOOP-16403
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HADOOP-16403.001.patch, MetricLinkedBlockingQueueTest.pdf
>
>
> I have an HA cluster with 2 NameNodes. The NameNode's metadata is quite big, 
> so after the active NameNode dies it takes the standby more than 40s to 
> become active. Many requests (TCP connect requests and RPC requests) from 
> DataNodes, clients and ZKFC time out and start retrying. The sudden request 
> flood lasts for the next 2 minutes, until finally all requests are either 
> handled or run out of retries. 
>  Adjusting the RPC-related settings might strengthen the NameNode and solve 
> this problem, and the key point is finding the bottleneck. The RPC server 
> can be described as below:
> {noformat}
> Listener -> Readers' queues -> Readers -> callQueue -> Handlers{noformat}
> By sampling some failed clients, I find many of them got 
> ConnectTimeoutException, caused by a TCP connect request that went 
> unanswered for 20s. I think the reader queue may be full and blocking the 
> listener from handling new connections. Both slow handlers and slow readers 
> can block the whole processing pipeline, and I need to know which it is. I 
> think *a queue that computes the QPS, writes a log when the queue is full, 
> and can be replaced easily* will help. 
>  I find the nice work in HADOOP-10302 implementing a runtime-swapped queue. 
> Using it for the Reader's queue makes the reader queue runtime-swappable 
> automatically. The QPS computation could be done by implementing a subclass 
> of LinkedBlockingQueue that does the computation while put/take/... happens. 
> The QPS data will show up in JMX.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16403) Start a new statistical rpc queue and make the Reader's pendingConnection queue runtime-replaceable

2019-07-06 Thread Jinglun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinglun updated HADOOP-16403:
-
Attachment: MetricLinkedBlockingQueueTest.pdf

> Start a new statistical rpc queue and make the Reader's pendingConnection 
> queue runtime-replaceable
> ---
>
> Key: HADOOP-16403
> URL: https://issues.apache.org/jira/browse/HADOOP-16403
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HADOOP-16403.001.patch, MetricLinkedBlockingQueueTest.pdf
>
>
> I have an HA cluster with 2 NameNodes. The NameNode's metadata is quite big, 
> so after the active NameNode dies it takes the standby more than 40s to 
> become active. Many requests (TCP connect requests and RPC requests) from 
> DataNodes, clients and ZKFC time out and start retrying. The sudden request 
> flood lasts for the next 2 minutes, until finally all requests are either 
> handled or run out of retries. 
>  Adjusting the RPC-related settings might strengthen the NameNode and solve 
> this problem, and the key point is finding the bottleneck. The RPC server 
> can be described as below:
> {noformat}
> Listener -> Readers' queues -> Readers -> callQueue -> Handlers{noformat}
> By sampling some failed clients, I find many of them got 
> ConnectTimeoutException, caused by a TCP connect request that went 
> unanswered for 20s. I think the reader queue may be full and blocking the 
> listener from handling new connections. Both slow handlers and slow readers 
> can block the whole processing pipeline, and I need to know which it is. I 
> think *a queue that computes the QPS, writes a log when the queue is full, 
> and can be replaced easily* will help. 
>  I find the nice work in HADOOP-10302 implementing a runtime-swapped queue. 
> Using it for the Reader's queue makes the reader queue runtime-swappable 
> automatically. The QPS computation could be done by implementing a subclass 
> of LinkedBlockingQueue that does the computation while put/take/... happens. 
> The QPS data will show up in JMX.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16403) Start a new statistical rpc queue and make the Reader's pendingConnection queue runtime-replaceable

2019-07-06 Thread Jinglun (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879627#comment-16879627
 ] 

Jinglun commented on HADOOP-16403:
--

 
I ran a test comparing MetricLinkedBlockingQueue and LinkedBlockingQueue. I 
start 256 producers putting entries into the queue and 256 consumers consuming 
from it (each producer/consumer has its own thread). With 100,000,000 entries 
in total, it takes the LinkedBlockingQueue 18180ms to finish and the 
MetricLinkedBlockingQueue 62777ms. If I disable the LOG in 
MetricLinkedBlockingQueue, it takes 28720ms to finish. See the table below.
||Queue||Time cost (ms)||Consume rate (puts/second)||
|LinkedBlockingQueue|18180|5500550|
|MetricLinkedBlockingQueue|62777|1592940|
|MetricLinkedBlockingQueue (without log)|28270|3537318|

Though there is significant overhead in using MetricLinkedBlockingQueue, it 
still won't be a problem, because the NameNode's RPC handling rate hardly 
reaches 30,000 per second, which is much lower than MetricLinkedBlockingQueue's 
limit of 1,592,940.
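For reference, here is a rough, scaled-down sketch of the kind of producer/consumer timing harness described above. It is not the attached MetricLinkedBlockingQueueTest; the thread counts stay at 256 but the entry total is reduced, and it uses the plain JDK LinkedBlockingQueue (a metric-counting wrapper could be dropped in for comparison).

{code:java}
import java.util.concurrent.*;

/** Rough timing harness: N producers put entries, N consumers take them. */
public class QueueBenchmark {
  public static void main(String[] args) throws Exception {
    final int producers = 256, consumers = 256;
    final long total = 10_000_000L;               // scaled down from 100,000,000
    final BlockingQueue<Long> queue = new LinkedBlockingQueue<>(65536);
    final long perProducer = total / producers;   // == total / consumers here
    ExecutorService pool = Executors.newFixedThreadPool(producers + consumers);
    CountDownLatch done = new CountDownLatch(consumers);

    long start = System.currentTimeMillis();
    for (int i = 0; i < producers; i++) {
      pool.submit(() -> {
        for (long n = 0; n < perProducer; n++) {
          queue.put(n);                           // blocks when the queue is full
        }
        return null;
      });
    }
    for (int i = 0; i < consumers; i++) {
      pool.submit(() -> {
        for (long n = 0; n < perProducer; n++) {
          queue.take();                           // blocks when the queue is empty
        }
        done.countDown();
        return null;
      });
    }
    done.await();
    long elapsedMs = System.currentTimeMillis() - start;
    System.out.println("Moved " + perProducer * producers + " entries in "
        + elapsedMs + " ms (" + (perProducer * producers * 1000L
        / Math.max(elapsedMs, 1)) + " entries/second)");
    pool.shutdown();
  }
}
{code}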

> Start a new statistical rpc queue and make the Reader's pendingConnection 
> queue runtime-replaceable
> ---
>
> Key: HADOOP-16403
> URL: https://issues.apache.org/jira/browse/HADOOP-16403
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jinglun
>Priority: Major
> Attachments: HADOOP-16403.001.patch
>
>
> I have an HA cluster with 2 NameNodes. The NameNode's metadata is quite big, 
> so after the active NameNode dies it takes the standby more than 40s to 
> become active. Many requests (TCP connect requests and RPC requests) from 
> DataNodes, clients and ZKFC time out and start retrying. The sudden request 
> flood lasts for the next 2 minutes, until finally all requests are either 
> handled or run out of retries. 
>  Adjusting the RPC-related settings might strengthen the NameNode and solve 
> this problem, and the key point is finding the bottleneck. The RPC server 
> can be described as below:
> {noformat}
> Listener -> Readers' queues -> Readers -> callQueue -> Handlers{noformat}
> By sampling some failed clients, I find many of them got 
> ConnectTimeoutException, caused by a TCP connect request that went 
> unanswered for 20s. I think the reader queue may be full and blocking the 
> listener from handling new connections. Both slow handlers and slow readers 
> can block the whole processing pipeline, and I need to know which it is. I 
> think *a queue that computes the QPS, writes a log when the queue is full, 
> and can be replaced easily* will help. 
>  I find the nice work in HADOOP-10302 implementing a runtime-swapped queue. 
> Using it for the Reader's queue makes the reader queue runtime-swappable 
> automatically. The QPS computation could be done by implementing a subclass 
> of LinkedBlockingQueue that does the computation while put/take/... happens. 
> The QPS data will show up in JMX.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org