[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDFS-7175:
------------------------------------
    Target Version/s: 3.3.0  (was: )
   Affects Version/s: 3.3.0
              Status: Patch Available  (was: Open)

> Client-side SocketTimeoutException during Fsck
> ----------------------------------------------
>
>                 Key: HDFS-7175
>                 URL: https://issues.apache.org/jira/browse/HDFS-7175
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.3.0
>            Reporter: Carl Steinbach
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-7157.004.patch, HDFS-7175.2.patch,
> HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch
>
> HDFS-2538 disabled status reporting for the fsck command (it can optionally
> be enabled with the -showprogress option). We have observed that without
> status reporting the client will abort with a read timeout:
> {noformat}
> [hdfs@lva1-hcl0030 ~]$ hdfs fsck /
> Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070
> 14/09/30 06:03:41 WARN security.UserGroupInformation:
> PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS)
> cause:java.net.SocketTimeoutException: Read timed out
> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
> 	at java.net.SocketInputStream.socketRead0(Native Method)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:152)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:122)
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> 	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
> 	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
> 	at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312)
> 	at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72)
> 	at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149)
> 	at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> 	at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346)
> {noformat}
> Since there is nothing for the client to read, it will abort whenever the
> time required to complete the fsck operation exceeds the client's read
> timeout setting.
> I can think of a couple of ways to fix this:
> # Set an infinite read timeout on the client side (not a good idea!).
> # Have the server side write (and flush) zeros to the wire and instruct the
> client to ignore these characters instead of echoing them.
> # It's possible that flushing an empty buffer on the server side will trigger
> an HTTP response with a zero-length payload. This may be enough to keep the
> client from hanging up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
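The second proposed fix (server writes keep-alive zeros, client drops them instead of echoing) can be sketched as follows. This is a minimal illustration only, not the actual DFSck code: the `KeepAliveFilter` class and `stripKeepAlive` method are hypothetical names, and the assumption is that the server interleaves NUL (0x00) bytes into the fsck output purely to keep the HTTP connection alive.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical sketch of fix option 2 from HDFS-7175: the server flushes
 * NUL bytes to the wire while fsck runs, and the client strips them from
 * the stream before echoing output to the user.
 */
public class KeepAliveFilter {

    /** Read the whole stream, dropping NUL keep-alive filler bytes. */
    public static String stripKeepAlive(InputStream in) throws IOException {
        StringBuilder out = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b != 0) {          // 0x00 bytes are keep-alive filler, not output
                out.append((char) b);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated wire bytes: real output interleaved with keep-alive zeros.
        byte[] wire = {'f', 's', 0, 'c', 0, 0, 'k'};
        System.out.println(stripKeepAlive(new ByteArrayInputStream(wire)));
    }
}
```

Because the client keeps receiving (and discarding) bytes, its socket read timeout never fires, no matter how long the NameNode takes to finish the fsck. The trade-off against option 3 is that option 2 requires a client-side change, while a zero-length flush would work with unmodified clients if the servlet container actually emits chunks for empty flushes.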