-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9119/
-----------------------------------------------------------

(Updated Sept. 6, 2014, 4:57 a.m.)


Review request for nutch and Julien Le Dem.


Bugs: NUTCH-1526
    https://issues.apache.org/jira/browse/NUTCH-1526


Repository: nutch


Description
-------

Will contain the patch for the SegmentContentDumperTool described in NUTCH-1526:

./bin/nutch org.apache.nutch.tools.SegmentContentDumper [options]
   -segmentRootDir  full file path to the root segment directory, e.g., crawl/segments
   -regexUrlPattern a regex URL pattern to select URL keys to dump from the content DB in each segment
   -outputDir       The output directory to write file names to.
   -metadata        --key=value where key is a Content Metadata key and value is a value to check.
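
To make the intended selection semantics concrete, here is a minimal, standalone sketch of how the -regexUrlPattern and -metadata options could combine. It is not the code in the patch and uses no Nutch classes: the class name SegmentDumpFilterSketch, the method selectUrls, the in-memory map of URL to metadata, and the choice of a full-string regex match are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the NUTCH-1526 patch.
public class SegmentDumpFilterSketch {

    /**
     * Returns the URL keys that match urlPattern (full-string match, an
     * assumption) and whose metadata maps metaKey to exactly metaValue.
     * A null metaKey skips the metadata check entirely.
     */
    static List<String> selectUrls(Map<String, Map<String, String>> records,
                                   Pattern urlPattern,
                                   String metaKey,
                                   String metaValue) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : records.entrySet()) {
            boolean urlOk = urlPattern.matcher(entry.getKey()).matches();
            boolean metaOk = metaKey == null
                || (metaValue != null && metaValue.equals(entry.getValue().get(metaKey)));
            if (urlOk && metaOk) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Stand-in for the content DB of one segment: URL key -> content metadata.
        Map<String, Map<String, String>> records = new LinkedHashMap<>();
        records.put("http://example.com/a.pdf",
            Collections.singletonMap("Content-Type", "application/pdf"));
        records.put("http://example.com/b.html",
            Collections.singletonMap("Content-Type", "text/html"));
        records.put("http://other.org/c.pdf",
            Collections.singletonMap("Content-Type", "application/pdf"));

        Pattern pattern = Pattern.compile("http://example\\.com/.*");
        // Selects only URLs under example.com whose Content-Type is application/pdf.
        System.out.println(selectUrls(records, pattern, "Content-Type", "application/pdf"));
    }
}
```

In this sketch a record is selected only when both conditions hold; whether the actual tool combines the two options with AND, or matches the regex with find() rather than matches(), is not stated in this request.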


Diffs (updated)
-----

  ./trunk/src/java/org/apache/nutch/tools/FileDumper.java PRE-CREATION 

Diff: https://reviews.apache.org/r/9119/diff/


Testing
-------

Testing it on DARPA XDATA XNET.


Thanks,

Chris Mattmann
