> On Sept. 9, 2014, 11:40 p.m., Lewis McGibbney wrote:
> >

Thanks Lewis, I'll address these right away.

- Chris

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9119/#review52796
-----------------------------------------------------------

On Sept. 6, 2014, 4:57 a.m., Chris Mattmann wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9119/
> -----------------------------------------------------------
>
> (Updated Sept. 6, 2014, 4:57 a.m.)
>
>
> Review request for nutch and Julien Le Dem.
>
>
> Bugs: NUTCH-1526
>     https://issues.apache.org/jira/browse/NUTCH-1526
>
>
> Repository: nutch
>
>
> Description
> -------
>
> Will contain the patch the SegmentContentDumperTool described in NUTCH-1526:
>
> ./bin/nutch org.apache.nutch.tools.SegmentContentDumper [options]
>    -segmentRootDir full file path to the root segment directory, e.g.,
>        crawl/segments
>    -regexUrlPattern a regex URL pattern to select URL keys to dump from the
>        content DB in each segment
>    -outputDir The output directory to write file names to.
>    -metadata --key=value where key is a Content Metadata key and value is a
>        value to check.
>
>
> Diffs
> -----
>
>   ./trunk/src/java/org/apache/nutch/tools/FileDumper.java PRE-CREATION
>
> Diff: https://reviews.apache.org/r/9119/diff/
>
>
> Testing
> -------
>
> Testing it on DARPA XDATA XNET.
>
>
> Thanks,
>
> Chris Mattmann
>
>