Hello Krishna,
Judging from that stack trace, it looks like you have written a client
application that uses Apache HTTP Components to make HTTP calls to WebHDFS.
The exception is a client-side timeout. I recommend troubleshooting this from
the perspective of that client application. The HDFS configuration properties
that you mentioned would not control the socket timeout for a custom application
like this.
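To illustrate why this is client-side: the "Read timed out" message comes from
the read timeout (SO_TIMEOUT) that the client sets on its own socket. Here is a
minimal sketch using only java.net classes. The class name ReadTimeoutDemo and
the 200 ms value are made up for illustration, and the silent local server just
stands in for a peer that is slow to respond.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server that never replies and reports whether
     * the client's read gave up with a SocketTimeoutException.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        // The server socket only listens. The OS completes the TCP handshake
        // from the listen backlog, but nothing ever writes a response.
        try (ServerSocket silentServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort())) {
            // Client-side read timeout; no server-side setting changes this.
            client.setSoTimeout(soTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks until data arrives or the timeout elapses
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

Raising the value passed to setSoTimeout is the only thing that makes this
client wait longer, which is why the fix belongs in the client application.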
I'm not sure what Apache HTTP Components uses by default for socket timeouts.
You can explicitly control the timeout programmatically:
https://hc.apache.org/httpcomponents-client-ga/httpclient/apidocs/org/apache/http/client/config/RequestConfig.Builder.html#setSocketTimeout(int)
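A minimal sketch of that setting, assuming HttpClient 4.3 or later (the
versions that provide RequestConfig) is on the classpath. The class name
WebHdfsTimeouts and the timeout values are hypothetical; choose values that
fit your workload.

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class WebHdfsTimeouts {

    /** Builds a request configuration with explicit timeouts in milliseconds. */
    static RequestConfig requestConfig(int connectTimeoutMillis, int socketTimeoutMillis) {
        return RequestConfig.custom()
                .setConnectTimeout(connectTimeoutMillis) // time allowed to establish the connection
                .setSocketTimeout(socketTimeoutMillis)   // max wait between data packets on a read
                .build();
    }

    /** Creates a client that applies the given configuration to every request by default. */
    static CloseableHttpClient newClient(RequestConfig config) {
        return HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Illustrative values only: 10 s to connect, 120 s read timeout.
        RequestConfig config = requestConfig(10_000, 120_000);
        try (CloseableHttpClient client = newClient(config)) {
            System.out.println("socket timeout (ms): " + config.getSocketTimeout());
        }
    }
}
```

The stack trace shows AbstractHttpClient and DefaultRequestDirector, which
belong to the older client API. If the application is built on that API, the
equivalent call should be HttpConnectionParams.setSoTimeout(client.getParams(),
millis).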
You also have the option of writing a custom socket factory class if you need
to do very customized socket configuration:
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html#d5e431
There is one known issue that can impact WebHDFS throughput:
https://issues.apache.org/jira/browse/HDFS-8696
This issue affects Apache Hadoop 2.7.x releases, and the fix is currently
targeted for Apache Hadoop 2.8.0.
I hope this helps.
--Chris Nauroth
From: Krishna Kishore Bonagiri <write2kish...@gmail.com>
Date: Wednesday, December 9, 2015 at 2:17 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Socket Timeout Exception while multiple concurrent applications are
reading HDFS data through WebHDFS interface
Hi,
We are seeing this SocketTimeoutException while a number of concurrent
applications (probably about 50 of them) are trying to read HDFS data through
the WebHDFS interface. Are there any parameters we can tune so it doesn't happen?
An exception occurred: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.read(SocketInputStream.java:163)
at java.net.SocketInputStream.read(SocketInputStream.java:133)
at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
at org.apache.http.impl.conn.LoggingSessionInputBuffer.readLine(LoggingSessionInputBuffer.java:115)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at com.ibm.iis.cc.filesystem.impl.webhdfs.WebHDFS.appendFromBuffer(WebHDFS.java:306)
at com.ibm.iis.cc.filesystem.impl.webhdfs.WebHDFS.writeFromStream(WebHDFS.java:198)
at com.ibm.iis.cc.filesystem.AbstractFileSystem.writeFromStream(AbstractFileSystem.java:45)
at com.ibm.iis.cc.filesystem.FileSystem$Uploader.call(FileSystem.java:3393)
at com.ibm.iis.cc.filesystem.FileSystem$Uploader.call(FileSystem.java:3358)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1176)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:853)
We have tried increasing the values of these parameters, but there is no change.
1) dfs.datanode.handler.count
2) dfs.client.socket-timeout (the new parameter to define the socket timeout)
3) dfs.socket.timeout (the deprecated parameter)
4) dfs.datanode.socket.write.timeout
Thanks,
Kishore