Hey,
I've figured out what the problem was. Somehow, while the data was being transferred between servers, it got very slightly corrupted, which caused this error.
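For anyone who hits the same thing, one way to catch this kind of transfer corruption is to compare a checksum of each file before and after the copy. The following is only a minimal sketch: the class name ChecksumSketch and its methods are made up for illustration and are not part of Nutch, and the two file paths are whatever copies you want to compare.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Nutch): compares SHA-256 digests of two files.
public class ChecksumSketch {

    // SHA-256 is required to be available on every Java platform.
    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Render bytes as lowercase hex, two digits per byte.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Digest of an in-memory byte array.
    public static String sha256Hex(byte[] data) {
        return toHex(newDigest().digest(data));
    }

    // Digest of a file, streamed in 8 KB chunks so large files are fine.
    public static String sha256Hex(Path file) throws IOException {
        MessageDigest md = newDigest();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ChecksumSketch <original-file> <copied-file>");
            return;
        }
        String original = sha256Hex(Paths.get(args[0]));
        String copied = sha256Hex(Paths.get(args[1]));
        System.out.println(original.equals(copied) ? "match" : "MISMATCH");
    }
}
```

Running it against the same file on the source and destination servers (or simply comparing the printed digests) would flag a copy that differs by even one byte.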
Luke
On 04/01/2005 09:17 AM, Luke Baker wrote:
Hey,
When updating the db with a certain segment, I get the following error:
050330 235110 Processing pagesByURL: Sorted 28711.666716283053 instructions/second
Exception in thread "main" java.io.IOException: key out of order: gopher://Gopher.wkap..l:70/11gopher_root%3A%5B_journal._jrnl.acbi%5D after gopher://Gopher.wkap.nl/11gopher_root%3a%5b_journal._jrnl.biph%5d
at org.apache.nutch.io.MapFile$Writer.checkKey(MapFile.java:128)
at org.apache.nutch.io.MapFile$Writer.append(MapFile.java:114)
at org.apache.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:635)
at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:557)
at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:318)
at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:368)
Has anyone seen this before? Is it a problem with my webdb or the segment? I tried this same segment on a different webdb and got the same error (key out of order), but with different URLs referenced.
Luke
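To illustrate why a single damaged character can trigger "key out of order", here is a rough sketch of the kind of check a sorted-file writer performs on each appended key. It is an assumption-laden illustration, not the Nutch source: KeyOrderSketch and inOrder are hypothetical names, and plain String comparison stands in for whatever comparator the real writer uses. The two keys are copied from the exception message above; the "repaired" key is only a guess at what the first key looked like before it was damaged.

```java
// Hypothetical illustration (not Nutch code) of an append-time ordering check.
public class KeyOrderSketch {

    // True when `next` may be appended after `previous` in a file whose keys
    // must arrive in non-decreasing order. A null `previous` means no key has
    // been written yet.
    public static boolean inOrder(String previous, String next) {
        return previous == null || previous.compareTo(next) <= 0;
    }

    public static void main(String[] args) {
        // Keys taken verbatim from the exception message.
        String previous = "gopher://Gopher.wkap.nl/11gopher_root%3a%5b_journal._jrnl.biph%5d";
        String reported = "gopher://Gopher.wkap..l:70/11gopher_root%3A%5B_journal._jrnl.acbi%5D";

        // Assumption: a guess at the undamaged key, with "wkap.nl" restored.
        String repaired = "gopher://Gopher.wkap.nl:70/11gopher_root%3A%5B_journal._jrnl.acbi%5D";

        // '.' sorts before 'n', so the reported key compares lower than the previous one.
        System.out.println("reported key in order? " + inOrder(previous, reported));

        // ':' sorts after '/', so the guessed original would have compared higher.
        System.out.println("repaired key in order? " + inOrder(previous, repaired));
    }
}
```

Under these assumptions the reported key fails the check while the guessed original passes it, which would be consistent with the failure coming from damaged data rather than from the webdb or the segment logic itself.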
