[ https://issues.apache.org/jira/browse/NUTCH-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated NUTCH-676:
------------------------------

    Attachment: 0001-NUTCH-676-Replace-MapWritable-implementation-with-t.patch

NUTCH-676: Replace MapWritable implementation with the one from Hadoop, but retaining old class IDs from Nutch.
The test is changed as well, because it assumes broken behavior in MapWritable.

> MapWritable is written inefficiently and confusingly
> ----------------------------------------------------
>
>                 Key: NUTCH-676
>                 URL: https://issues.apache.org/jira/browse/NUTCH-676
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 0.9.0
>            Reporter: Todd Lipcon
>            Priority: Minor
>         Attachments: 0001-NUTCH-676-Replace-MapWritable-implementation-with-t.patch
>
>
> The MapWritable implementation in o.a.n.crawl is written confusingly: it
> maintains its own internal linked list, which I think may have a bug somewhere
> (I'm getting an NPE in certain cases in the code, though it's hard to track
> down).
> Can anyone comment as to why MapWritable is written the way it is, rather
> than just using a HashMap, or a LinkedHashMap if consistent ordering is
> important? I imagine that would improve performance.
> What about just using the Hadoop MapWritable? Obviously that would break some
> backwards compatibility, but it may be a good idea at some point to reduce
> confusion (I didn't realize that Nutch had its own impl until a few minutes
> ago).
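
To illustrate the LinkedHashMap question raised in the description, here is a minimal sketch of a map that serializes itself in insertion order by delegating to java.util.LinkedHashMap instead of maintaining a hand-written linked list. It is not the Nutch or the Hadoop MapWritable and is not taken from the attached patch. The class name OrderedMapSketch, its String-only keys and values, and the roundTrip helper are all assumptions made for illustration. The write/readFields pair only mirrors the general shape of a Writable-style class, using the plain java.io DataOutput and DataInput interfaces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: NOT the Nutch or Hadoop MapWritable.
class OrderedMapSketch {
    // LinkedHashMap keeps insertion order, so no custom linked list is needed.
    private final Map<String, String> entries = new LinkedHashMap<>();

    void put(String key, String value) {
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }

    // Keys in the order they were first inserted.
    List<String> keys() {
        return new ArrayList<>(entries.keySet());
    }

    // Serialize as: entry count, then each key/value pair in iteration order.
    void write(DataOutput out) throws IOException {
        out.writeInt(entries.size());
        for (Map.Entry<String, String> e : entries.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    // Replace the current contents with what is read from the stream.
    void readFields(DataInput in) throws IOException {
        entries.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            entries.put(key, value);
        }
    }

    // Helper (an assumption for this sketch): serialize to bytes and read back.
    static OrderedMapSketch roundTrip(OrderedMapSketch source) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            source.write(new DataOutputStream(buffer));
            OrderedMapSketch copy = new OrderedMapSketch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the ordering comes from the backing collection, swapping LinkedHashMap for HashMap would be a one-line change if consistent ordering turned out not to matter. A real replacement would also have to handle arbitrary Writable keys and values and the existing class IDs, which this sketch leaves out.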