If just changing the buffer to 4KB makes a big difference, could you at a minimum file a JIRA to change that buffer size? I know that it is not a final fix, but it sure seems like a very nice Band-Aid to put on until we can get to the root of the issue.
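Something along these lines, I'd imagine. This is only a sketch; the constant name and the shape of the copy loop are from memory rather than checked against the 0.20.203 source:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    // Sketch of the MapOutputServlet copy loop with the proposed band-aid.
    // With chunked transfer encoding, each write() below ends up as roughly
    // one HTTP chunk, so this constant is the chunk size Sven is describing.
    public final class ShuffleCopySketch {
        private static final int MAX_BYTES_TO_READ = 4 * 1024; // was 64 * 1024

        static void copy(InputStream mapOutputIn, OutputStream outStream)
                throws IOException {
            byte[] buffer = new byte[MAX_BYTES_TO_READ];
            int len;
            while ((len = mapOutputIn.read(buffer)) > 0) {
                outStream.write(buffer, 0, len);
            }
        }
    }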
--Bobby Evans

On 1/27/12 9:23 PM, "Sven Groot" <sgr...@gmail.com> wrote:

Hi Nick,

Thanks for your reply. I don't think what you are saying is related: the problem happens while the data is being transferred, and nothing is deserialized during that step. Note that my code isn't involved at all; it's purely Hadoop's own code that's running here.

I have done additional work to find the cause, and it's definitely in Jetty. I created a simple test with Jetty that transfers a file in a manner similar to Hadoop, and it shows the same behavior. It appears to be linked to the buffer size Jetty uses for chunked transfer encoding. Hadoop uses a hardcoded 64KB buffer for that, which exhibits the problem. If I change the buffer to 4KB, Jetty's transfer speed increases by an order of magnitude.

I have posted a question on StackOverflow about this behavior in Jetty: http://stackoverflow.com/questions/9031311/slow-transfers-in-jetty-with-chunked-transfer-encoding-at-certain-buffer-size. So far, there are no answers.

I've always found it a strange decision that Hadoop uses HTTP to transfer intermediate data. Let's just say that this issue reinforces that opinion.

Regards,
Sven

From: Nussbaum, Nick [mailto:nick.nussb...@fticonsulting.com]
Sent: Saturday, January 28, 2012 3:52
To: sgr...@gmail.com
Subject: FW: Reduce shuffle data transfer takes excessively long

I'm no expert on Hadoop, but I have already encountered a surprising gotcha that may be your problem. If you repeatedly use a function like String.getBytes() (http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#getBytes%28%29) that needs to know the default OS character set, it can take a surprisingly long time. I speculate this is due to having to jump through hoops in various sandboxes to read the OS default locale. If that is the case, getting the system locale and charset once and specifying them explicitly in the call to getBytes() (or whatever you're using) should make a big difference.

Let me know if it works for you.

-Nick

From: Sven Groot [mailto:sgr...@gmail.com]
Sent: Thursday, January 26, 2012 10:25 PM
To: mapreduce-user@hadoop.apache.org
Subject: Reduce shuffle data transfer takes excessively long

Hello,

I have been profiling the performance of certain parts of Hadoop 0.20.203.0. For this purpose, I have set up a simple cluster that uses one node as the NameNode/JobTracker and one node as the sole DataNode/TaskTracker.

In this experiment, I run a job consisting of a single map task and a single reduce task. Both simply use the default Mapper/Reducer implementations (the identity functions). The input of the job is a file with a single 256MB block. Therefore, the output of the map task is 256MB, and the reduce task must shuffle that 256MB from the local host. To my surprise, shuffling this amount of data takes around 9 seconds, which is excessively slow.

First I turned my attention to the ReduceTask.ReduceOutputCopier. I determined that about 1.1 seconds is spent calculating checksums (the expected value), and the remaining time is spent reading from the stream returned by URLConnection.getInputStream(). Simple tests with URLConnection could not reproduce the issue except when actually reading from the TaskTracker's MapOutputServlet, so the problem seemed to be on the server side: reading the same amount of data from any other local web server takes only 0.2s.
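The standalone test was essentially the following (a minimal sketch: the URL is a command-line placeholder, and the real shuffle request carries the map/reduce query parameters the servlet expects):

    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLConnection;

    // Drain an HTTP response and report the elapsed time. No checksumming,
    // no deserialization -- just read() until EOF, like the copier's raw read.
    public final class DrainTimer {
        public static void main(String[] args) throws Exception {
            URLConnection conn = new URL(args[0]).openConnection();
            byte[] buf = new byte[64 * 1024];
            long total = 0;
            long start = System.nanoTime();
            InputStream in = conn.getInputStream();
            try {
                int n;
                while ((n = in.read(buf)) > 0) {
                    total += n;
                }
            } finally {
                in.close();
            }
            double secs = (System.nanoTime() - start) / 1e9;
            System.out.printf("read %d bytes in %.2f s (%.1f MB/s)%n",
                    total, secs, total / secs / (1024.0 * 1024.0));
        }
    }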
I inserted some measurements into the MapOutputServlet and determined that 0.1s was spent reading the intermediate file (unsurprising, as it was still in the page cache) and 7.7s was spent writing to the stream returned by response.getOutputStream(). The slowdown therefore appears to be in Jetty.

CPU usage during the transfer is low, so it feels like the transfer is being throttled somehow. But if that's the case, I can't figure out how it's happening. There's nothing in the source code to suggest that Hadoop is deliberately throttling anything, and as far as I know Jetty doesn't throttle by default.

I was seeing some warnings in the tasktracker log file related to this: http://wiki.eclipse.org/Jetty/Feature/JVM_NIO_Bug. However, running Hadoop under Java 7 made those warnings disappear while the transfer remained slow, so I don't think that's it.

I'm out of ideas as to what could be causing this. Any insights?

Regards,
Sven
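P.S. For reference, this is the gist of the instrumentation (simplified; the variable names are mine rather than the servlet's, and the 64KB buffer mirrors the hardcoded one):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    // Time the read side (intermediate file) and the write side
    // (response.getOutputStream()) of the copy loop separately.
    final class TimedCopy {
        static void copy(InputStream fileIn, OutputStream httpOut)
                throws IOException {
            byte[] buf = new byte[64 * 1024];
            long readNanos = 0, writeNanos = 0;
            while (true) {
                long t0 = System.nanoTime();
                int n = fileIn.read(buf);
                readNanos += System.nanoTime() - t0;
                if (n <= 0) break;
                long t1 = System.nanoTime();
                httpOut.write(buf, 0, n);
                writeNanos += System.nanoTime() - t1;
            }
            System.out.printf("read: %.1f s, write: %.1f s%n",
                    readNanos / 1e9, writeNanos / 1e9);
        }
    }

Wrapping only the read() and write() calls keeps the page-cache read time and the Jetty write time cleanly separated, which is how I got the 0.1s/7.7s split above.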