Hi,

This is most likely a URL filter issue: the map phase read all 66906 input records but emitted none. Check all your URL filters. There's also a test program for URL filtering. Try it out; see:
http://wiki.apache.org/nutch/CommandLineOptions

Cheers,

ps. Moved to user@nutch as it's more appropriate there.

> I have problems with running injector in nutch-1.4 on hadoop, same
> command with nutch-1.3 works fine. As you can see, list of URLs is
> loaded from hdfs correctly Map input records=66906 but no records are on
> map ouput. Could it be some problems with broken filtering?
>
> ponto:(crawler)runtime/deploy>bin/nutch inject /czcrawl/db /czcrawl/seeds
> 11/10/13 17:56:25 INFO crawl.Injector: Injector: starting at 2011-10-13 17:56:25
> 11/10/13 17:56:25 INFO crawl.Injector: Injector: crawlDb: /czcrawl/db
> 11/10/13 17:56:25 INFO crawl.Injector: Injector: urlDir: /czcrawl/seeds
> 11/10/13 17:56:25 INFO crawl.Injector: Injector: Converting injected urls to crawl db entries.
> 11/10/13 17:56:28 INFO mapred.FileInputFormat: Total input paths to process : 1
> 11/10/13 17:56:29 INFO mapred.JobClient: Running job: job_201110091645_0032
> 11/10/13 17:56:30 INFO mapred.JobClient: map 0% reduce 0%
> 11/10/13 17:56:52 INFO mapred.JobClient: map 50% reduce 0%
> 11/10/13 17:56:53 INFO mapred.JobClient: map 100% reduce 0%
> 11/10/13 17:57:05 INFO mapred.JobClient: map 100% reduce 100%
> 11/10/13 17:57:10 INFO mapred.JobClient: Job complete: job_201110091645_0032
> 11/10/13 17:57:10 INFO mapred.JobClient: Counters: 27
> 11/10/13 17:57:10 INFO mapred.JobClient: Job Counters
> 11/10/13 17:57:10 INFO mapred.JobClient: Launched reduce tasks=1
> 11/10/13 17:57:10 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=20455
> 11/10/13 17:57:10 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Rack-local map tasks=1
> 11/10/13 17:57:10 INFO mapred.JobClient: Launched map tasks=2
> 11/10/13 17:57:10 INFO mapred.JobClient: Data-local map tasks=1
> 11/10/13 17:57:10 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10356
> 11/10/13 17:57:10 INFO mapred.JobClient: File Input Format Counters
> 11/10/13 17:57:10 INFO mapred.JobClient: Bytes Read=1283144
> 11/10/13 17:57:10 INFO mapred.JobClient: File Output Format Counters
> 11/10/13 17:57:10 INFO mapred.JobClient: Bytes Written=86
> 11/10/13 17:57:10 INFO mapred.JobClient: FileSystemCounters
> 11/10/13 17:57:10 INFO mapred.JobClient: FILE_BYTES_READ=6
> 11/10/13 17:57:10 INFO mapred.JobClient: HDFS_BYTES_READ=1283358
> 11/10/13 17:57:10 INFO mapred.JobClient: FILE_BYTES_WRITTEN=89486
> 11/10/13 17:57:10 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=86
> 11/10/13 17:57:10 INFO mapred.JobClient: Map-Reduce Framework
> 11/10/13 17:57:10 INFO mapred.JobClient: Map output materialized bytes=12
> 11/10/13 17:57:10 INFO mapred.JobClient: Map input records=66906
> 11/10/13 17:57:10 INFO mapred.JobClient: Reduce shuffle bytes=6
> 11/10/13 17:57:10 INFO mapred.JobClient: Spilled Records=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Map output bytes=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Map input bytes=1280141
> 11/10/13 17:57:10 INFO mapred.JobClient: Combine input records=0
> 11/10/13 17:57:10 INFO mapred.JobClient: SPLIT_RAW_BYTES=214
> 11/10/13 17:57:10 INFO mapred.JobClient: Reduce input records=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Reduce input groups=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Combine output records=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Reduce output records=0
> 11/10/13 17:57:10 INFO mapred.JobClient: Map output records=0
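
To see how a URL filter can turn 66906 input records into 0 output records, it may help to run a few seed URLs through the filter rules by hand. The following is a minimal Python sketch, not Nutch's own test program. It assumes the rule format used by Nutch's regex URL filter: one rule per line, prefixed with `+` (accept) or `-` (reject), `#` for comments, the first rule whose pattern is found in the URL decides, and a URL matching no rule is rejected. The rules and seed URLs below are hypothetical, and Python's `re` is only an approximation of the Java regex engine that Nutch uses.

```python
import re

def parse_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """The first rule whose pattern is found in the URL decides.

    A URL that matches no rule at all is rejected.
    """
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False

# Hypothetical rules, for illustration only (not taken from the thread).
RULES = r"""
# skip non-web schemes
-^(file|ftp|mailto):
# skip URLs containing query-like characters
-[?*!@=]
# accept only hosts under .cz
+^https?://([a-z0-9-]+\.)*[a-z0-9-]+\.cz(/|$)
"""

# Hypothetical seed URLs, for illustration only.
SEEDS = [
    "http://www.example.cz/",
    "http://example.com/",
    "http://www.example.cz/page?id=1",
    "ftp://example.cz/file",
]

rules = parse_rules(RULES)
kept = [u for u in SEEDS if accepts(rules, u)]
dropped = [u for u in SEEDS if not accepts(rules, u)]

print(f"kept {len(kept)} of {len(SEEDS)} URLs")
for u in dropped:
    print("rejected:", u)
```

If every seed ends up in `dropped`, the rule set has the same effect as the job above: all input records are read, none are emitted. A common cause under these assumptions is a missing or too narrow `+` rule, since URLs that match nothing are rejected.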