Kim Whitehall created NUTCH-2100:
------------------------------------

             Summary: Nutch dump command doesn't dump anything
                 Key: NUTCH-2100
                 URL: https://issues.apache.org/jira/browse/NUTCH-2100
             Project: Nutch
          Issue Type: Bug
            Reporter: Kim Whitehall


When running the command
nutch dump -segment segment -outputDir dumpFolder -mimeStats

I receive the following output:
Dumper File Stats: 
TOTAL Stats:
[
]

The log indicates that every directory inside the segment is being skipped.
Note that if I use nutch readseg -dump, I can see that the segment does contain content.

The log is shown below:
2015-09-15 20:10:56,142 INFO  tools.FileDumper - Accepting all mimetypes.
2015-09-15 20:10:56,782 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-09-15 20:10:57,057 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/crawl_generate]
2015-09-15 20:10:57,057 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/crawl_generate/content/part-00000/data]: no data directory present
2015-09-15 20:10:57,057 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/crawl_fetch]
2015-09-15 20:10:57,057 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/crawl_fetch/content/part-00000/data]: no data directory present
2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/content]
2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/content/content/part-00000/data]: no data directory present
2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/parse_text]
2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/parse_text/content/part-00000/data]: no data directory present
2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/parse_data]
2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/parse_data/content/part-00000/data]: no data directory present
2015-09-15 20:10:57,058 INFO  tools.FileDumper - Processing segment: [/.../segments/20150915195411/crawl_parse]
2015-09-15 20:10:57,058 WARN  tools.FileDumper - Skipping segment: [/.../segments/20150915195411/crawl_parse/content/part-00000/data]: no data directory present
2015-09-15 20:10:57,059 INFO  tools.FileDumper - Dumper File Stats: 
TOTAL Stats:
[
]
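
One possible reading of the log above: each subdirectory of the path given to -segment seems to be treated as a segment of its own, and content/part-00000/data is then looked up beneath it. The Python sketch below is not the FileDumper code. It only mimics that path pattern, taken from the log lines, on a simplified fake segment layout. The helper names find_content_data and make_fake_segment are hypothetical, and the real on-disk layout may differ.

```python
import tempfile
from pathlib import Path

# Subdirectory names taken from the log output in this report.
SEGMENT_SUBDIRS = ("crawl_generate", "crawl_fetch", "content",
                   "parse_text", "parse_data", "crawl_parse")


def find_content_data(root):
    """Hypothetical helper: for each child directory of `root`, report whether
    <child>/content/part-00000/data exists (the path shape seen in the log)."""
    results = {}
    for child in sorted(p for p in root.iterdir() if p.is_dir()):
        data = child / "content" / "part-00000" / "data"
        results[child.name] = data.exists()
    return results


def make_fake_segment(segments_dir, name):
    """Hypothetical helper: build a simplified stand-in for one segment."""
    seg = segments_dir / name
    for sub in SEGMENT_SUBDIRS:
        (seg / sub).mkdir(parents=True)
    part = seg / "content" / "part-00000"
    part.mkdir()
    (part / "data").touch()  # placeholder for the real content data
    return seg


with tempfile.TemporaryDirectory() as tmp:
    segments_dir = Path(tmp) / "segments"
    segment = make_fake_segment(segments_dir, "20150915195411")

    # Pointing at a single segment: every child is checked and none matches.
    single = find_content_data(segment)
    # Pointing at the parent directory: the segment itself is the child.
    parent = find_content_data(segments_dir)

print(single)
print(parent)
```

If this reading is right, the lookup only succeeds when the path is the parent directory that holds the segments, which would be consistent with every directory being skipped in the log. This is an inference from the log, not a confirmed diagnosis.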



