Brian Whitman wrote:
> On Jan 19, 2007, at 4:29 AM, Andrzej Bialecki wrote:
>> Could you guys come up with exact data that causes this bug (primarily
>> I'm interested in a seed list, because then I can see that you simply
>> use the crawl tool, and finally try to run mergesegs). Thanks!

I am also seeing NPEs in SegmentReader and Indexer. I'm not 100% sure yet what exactly causes them, but they happen when Hadoop "spills". I got rid of them with a little patching:

- added/changed Mapper.map to return ObjectWritable instead of a plain object
- patched SequenceFile slightly because of an NPE in SequenceFile.Sorter.MergeQueue

However, I cannot find in the Hadoop change logs which change is causing these problems for Nutch.

--
Sami Siren