[ https://issues.apache.org/jira/browse/HADOOP-16266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran reopened HADOOP-16266: ------------------------------------- This seems to be the patch which is breaking my test runs on a PR now rebased to trunk {{org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitMRJob}} works by bringing up a miniYarn cluster, except now it doesn't. Instead I get a stack trace (shortened) {code} 2019-05-31 21:31:05,544 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,554 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,554 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,554 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,554 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,554 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,554 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,555 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,555 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,542 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,555 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,556 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,556 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server.java:doRunLoop(1546)) - Exception in Responder java.io.IOException: Invalid argument at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at org.apache.hadoop.ipc.Server$Responder.doRunLoop(Server.java:1485) at org.apache.hadoop.ipc.Server$Responder.run(Server.java:1468) 2019-05-31 21:31:05,557 [IPC Server Responder] WARN ipc.Server (Server {code} * This is happening in arg validation in the select() call, which is being caught, swallowed and repeated, forever. * Searching for "java.io.IOException: Invalid argument" + select I see this surfaces when the timeout is too big [https://bugs.openjdk.java.net/browse/JDK-8165000] * And looking at our code, I can see this happening in writeSelector.select(PURGE_INTERVAL_NANOS). * which is a method that takes milliseconds, not nanos. This leads to a hypothesis "this patch passes a timeout 10e6 times too big, the native code rejects and then server just retries forever hoping it will somehow recover" I'm reopening and making it your homework to fix. Sorry. > Add more fine-grained processing time metrics to the RPC layer > -------------------------------------------------------------- > > Key: HADOOP-16266 > URL: https://issues.apache.org/jira/browse/HADOOP-16266 > Project: Hadoop Common > Issue Type: Improvement > Components: ipc > Reporter: Christopher Gregorian > Assignee: Christopher Gregorian > Priority: Minor > Labels: rpc > Fix For: 3.3.0 > > Attachments: HADOOP-16266.001.patch, HADOOP-16266.002.patch, > HADOOP-16266.003.patch, HADOOP-16266.004.patch, HADOOP-16266.005.patch, > HADOOP-16266.006.patch, HADOOP-16266.007.patch, HADOOP-16266.008.patch, > HADOOP-16266.009.patch, HADOOP-16266.010.patch, HADOOP-16266.011.patch > > > Splitting off of HDFS-14403 to track the first part: introduces more > fine-grained measuring of how a call's processing time is split up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org