[ https://issues.apache.org/jira/browse/NUTCH-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Joyce resolved NUTCH-2182.
----------------------------------
    Resolution: Fixed

Resolved in r1720466

> Make reverseUrlDirs file dumper option hash the URL for consistency
> -------------------------------------------------------------------
>
>                 Key: NUTCH-2182
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2182
>             Project: Nutch
>          Issue Type: Improvement
>          Components: tool
>    Affects Versions: 1.11
>            Reporter: Michael Joyce
>            Assignee: Michael Joyce
>             Fix For: 1.12
>
>         Attachments: NUTCH-2182_joyce_8Dec2015.patch
>
>
> At the moment the "reverseUrlDirs" option for FileDumper is terribly brittle
> and fails on a fair number of edge cases. A more robust way to handle the
> reverse URL approach to dumping a file is to reverse the server part and hash
> the URL to use as the file name. This gives us a nice split of files while
> avoiding a number of likely classes of problems that cause dumps to fail.
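
For readers who want to see the idea in the description concretely, here is a minimal sketch of "reverse the server part and hash the URL". It is not the code from the attached patch: the class and method names (ReverseUrlPathSketch, reverseHost, dumpPath) are hypothetical, and the choice of SHA-256 is an assumption, since the issue text does not say which hash is used.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: names and hash choice are assumptions, not Nutch's API.
public class ReverseUrlPathSketch {

    // Reverse the dot-separated host labels into a directory path,
    // e.g. "www.example.com" becomes "com/example/www".
    public static String reverseHost(String host) {
        List<String> parts = Arrays.asList(host.split("\\."));
        Collections.reverse(parts);
        return String.join("/", parts);
    }

    // Hex-encoded SHA-256 of the string (assumed hash; the issue names none).
    public static String sha256Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Relative dump path: reversed host as directories, URL hash as file name.
    // URI.create throws IllegalArgumentException for syntactically invalid URLs.
    public static String dumpPath(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }
        return reverseHost(host) + "/" + sha256Hex(url);
    }

    public static void main(String[] args) {
        System.out.println(dumpPath("http://www.example.com/a/b?c=d"));
    }
}
```

Because the file name is a fixed-length hex digest, characters in the URL path or query string that are awkward in file names never reach the file system, which is presumably the kind of edge case the description has in mind.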