[ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Saab updated HADOOP-2419:
------------------------------

    Attachment: MapRunnableTest.java

This is a sample MR job that mimics what nutch's Fetcher2 does with threads and queuing URLs to be fetched. It takes 2 arguments on the command line: an input path and an output path. All it does is read records via TextInputFormat and queue them up for threads, which then write them back out as they came in.

The sample run I just did came back with the following exceptions. Reverting HADOOP-1965 allows the job to finish without error.

java.lang.ArrayIndexOutOfBoundsException
        at java.lang.System.arraycopy(Native Method)
        at org.apache.hadoop.mapred.MergeSorter.sort(MergeSorter.java:45)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:446)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:690)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2016)

java.lang.ArrayIndexOutOfBoundsException: 6
        at org.apache.hadoop.mapred.BasicTypeSorterBase.compare(BasicTypeSorterBase.java:133)
        at org.apache.hadoop.mapred.MergeSorter.compare(MergeSorter.java:59)
        at org.apache.hadoop.mapred.MergeSorter.compare(MergeSorter.java:35)
        at org.apache.hadoop.util.MergeSort.mergeSort(MergeSort.java:46)
        at org.apache.hadoop.util.MergeSort.mergeSort(MergeSort.java:56)
        at org.apache.hadoop.mapred.MergeSorter.sort(MergeSorter.java:46)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:446)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:690)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2016)

> HADOOP-1965 breaks nutch
> ------------------------
>
>                 Key: HADOOP-2419
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2419
>             Project: Hadoop
>          Issue Type: Bug
>            Reporter: Paul Saab
>         Attachments: MapRunnableTest.java
>
>
> When running nutch on trunk, nutch is unable to complete a fetch and the following exceptions are raised:
> java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.nutch.protocol.Content.readFields(Content.java:158)
>         at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:38)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spill(MapTask.java:536)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:474)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$100(MapTask.java:248)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:413)
> Exception in thread "SortSpillThread" java.lang.NegativeArraySizeException
>         at org.apache.hadoop.io.Text.readString(Text.java:388)
>         at org.apache.nutch.metadata.Metadata.readFields(Metadata.java:243)
>         at org.apache.nutch.protocol.Content.readFields(Content.java:151)
>         at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:38)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spill(MapTask.java:536)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:474)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$100(MapTask.java:248)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:413)
> After reverting HADOOP-1965 nutch works just fine.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.