Hello,

I need some help. I am using Nutch 0.9 with Lucene 2.2 and Hadoop 0.15.0.

I have commented out the dedup line, so I am able to crawl the site
http://www.traguiden.se (other sites work properly), but it is not
indexed properly. If I uncomment the dedup line, I get the exception
below.

Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
        at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)

When I crawl this site, only a file named segments is created in the
index folder. The files listed below, which I need, are not generated.
I have been trying to solve this for nearly a week.


_0.fdt
_0.tis
_0.fdx
_0.prx
._0.fdt.crc
_0.tii
._0.fdx.crc
_0.nrm
._0.fnm.crc
._0.frq.crc
_0.fnm
_0.frq
._0.nrm.crc
._0.tii.crc
._0.tis.crc
._0.prx.crc 
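
To report exactly which of these files are absent, a small check along
the following lines could be run against the index folder. This is only
a sketch: the class name IndexFileCheck, the method missingFiles, and the
default path crawl/index are assumptions made for illustration and are
not part of Nutch or Lucene. It looks only for the eight data files
listed above, not the .crc checksum files.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class IndexFileCheck {
    // Extensions taken from the file list above (checksum files omitted).
    static final String[] EXTENSIONS =
        {"fdt", "fdx", "fnm", "frq", "nrm", "prx", "tii", "tis"};

    // Returns the expected file names that do not exist in indexDir.
    // segmentPrefix is the segment name, e.g. "_0" as in the list above.
    static List<String> missingFiles(File indexDir, String segmentPrefix) {
        List<String> missing = new ArrayList<>();
        for (String ext : EXTENSIONS) {
            String name = segmentPrefix + "." + ext;
            if (!new File(indexDir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // "crawl/index" is a hypothetical default; pass the real path instead.
        File indexDir = new File(args.length > 0 ? args[0] : "crawl/index");
        System.out.println("Missing: " + missingFiles(indexDir, "_0"));
    }
}
```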

Any response with a solution would be appreciated.

Thanks
S Patil.
-- 
View this message in context: 
http://www.nabble.com/files-are-not-generated-in-index-folder-by-indexer-for-the-site-http%3A--www.traguiden.se%28for-other-sites-its-working-good%29-while-crwaling-tp14330778p14330778.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.
