You can ignore mapred.input.subdir; I find it is an unneeded option.

Now that the mapred branch has been merged into trunk, the documentation
needs clarifying: the input is now specified as a directory, and all
files in that directory are treated as input files (no wildcard needed).
I will put that on my ToDo list.

mapred.input.dir is an abstract path that is resolved against either the
local OS filesystem or NDFS, depending on which is in use: if
fs.default.name is "local", the local OS fs is used; otherwise
fs.default.name is something like domainOfMyMasterNode:port and NDFS is
used.

To use NDFS, you need to copy your input file(s) from your local fs to NDFS:

  bin/nutch ndfs -put /home/peb/urls_localfs/oneFILENAME  /urls

The destination path "/urls" is arbitrary and is created as a side effect
of the file -put.  Repeat this for each file you have.
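
If you have many files, the repetition can be scripted. What follows is
only a sketch: put_all and the NUTCH variable are names I made up for
illustration, and I am assuming that repeated -put calls to the same
destination behave as described above, which I have not verified here.

```shell
#!/bin/sh
# Sketch: run one "ndfs -put" per regular file found in a local directory.
# put_all and NUTCH are illustrative names, not part of Nutch itself.
put_all() {
  src_dir=$1   # local directory holding the URL files
  dest=$2      # NDFS destination path, e.g. /urls

  for f in "$src_dir"/*; do
    # Skip subdirectories, and the literal pattern if nothing matched.
    [ -f "$f" ] || continue
    # Same command as above, once per file; stop at the first failure.
    "${NUTCH:-bin/nutch}" ndfs -put "$f" "$dest" || return 1
  done
}

put_all /home/peb/urls_localfs /urls
```

Run it from the Nutch install directory so that bin/nutch resolves, or set
NUTCH to the full path of the script.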

Paul

Lukas Vlcek wrote:
> java.io.IOException: No input directories specified in: NutchConf:
> nutch-default.xml , mapred-default.xml ,
> /home/lukas/nutch/mapred/local/localRunner/job_4zwds6.xml ,
> nutch-site.xml
>         at org.apache.nutch.mapred.InputFormatBase.listFiles(InputFormatBase.java:85)
>         at org.apache.nutch.mapred.InputFormatBase.getSplits(InputFormatBase.java:95)
>         at org.apache.nutch.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:63)
> 051220 204249 Running job: job_4zwds6
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:308)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:102)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:101)



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
