Hi,
We're running into issues where we are seeing timeouts when
writing/reading a lot of HDFS data. (Hadoop is version CDH4B3 and HDFS
appending is enabled.) The types of exceptions vary a lot, but most of
the time they occur when a DFSClient writes data into the datanode
pipeline.
For example, one datanode logs "Exception in receiveBlock for block
blk_5476601577216704980_62953994 java.io.EOFException: while trying to
read 65557 bytes" and the other side logs "writeBlock
blk_5476601577216704980_62953994 received exception
java.net.SocketTimeoutException: Read timed out". That's it.
We cannot seem to determine the exact problem. The read timeout is at
its default (60 sec). The open-files limit and the number of xceivers
have both been raised considerably. A full GC never takes longer than a
second.
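
For reference, the relevant hdfs-site.xml knobs look roughly like this
(property names are the stock ones for this Hadoop generation; the
values here are illustrative, not necessarily what we run):

    <!-- hdfs-site.xml (sketch; values illustrative) -->
    <property>
      <name>dfs.socket.timeout</name>
      <!-- DFS read timeout in ms; 60000 = the 60 sec default we are on -->
      <value>60000</value>
    </property>
    <property>
      <name>dfs.datanode.socket.write.timeout</name>
      <!-- datanode/pipeline write timeout in ms -->
      <value>480000</value>
    </property>
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <!-- raised well above the default; note the historical spelling -->
      <value>4096</value>
    </property>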
However, we are seeing a lot of dropped packets on the network
interface. Could these problems be related?
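
If it helps to quantify, the drop counts can be read off the standard
interface counters, e.g. (eth0 is just an example interface name):

    # per-interface drop counters (RX-DRP / TX-DRP columns)
    netstat -i
    # kernel counters for a single interface
    ifconfig eth0 | grep -i drop
    # NIC/driver-level statistics, where the driver exposes them
    ethtool -S eth0 | grep -i drop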
Any advice would be appreciated.
Ferdy.