All, I have a couple of websites that I need to crawl, and the command line below used to work, I think. Solr is up and running with no problems, and the crawl itself completes, but I need the results pushed into Solr after the crawl. Does anyone know how to make that happen, or what I'm doing wrong? The errors below are being thrown from Hadoop, which I am not using at all.
$ bin/nutch crawl urls -dir crawl -threads 10 -depth 100 -topN 50 -solrindex http://localhost:8983/solr
crawl started in: crawl
rootUrlDir = http://localhost:8983/solr
threads = 10
depth = 100
indexer=lucene
topN = 50
Injector: starting at 2010-12-20 15:23:25
Injector: crawlDb: crawl/crawldb
Injector: urlDir: http://localhost:8983/solr
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: No FileSystem for scheme: http
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:169)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:124)
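One thing I notice in the output: rootUrlDir is being set to the Solr URL rather than my urls directory, and the Injector then fails trying to open http:// as a filesystem, so maybe -solrindex is not a flag my build recognizes and the URL is falling through as a positional argument. In case it helps anyone spot the problem, here is the two-step sequence I believe is the documented alternative, with the solrindex arguments written from memory, so treat the order as unverified:

```shell
# Step 1: run the crawl with no Solr option at all (same settings as above).
bin/nutch crawl urls -dir crawl -threads 10 -depth 100 -topN 50

# Step 2: push the finished crawl into Solr afterwards.
# Argument order (solr URL, crawldb, linkdb, segments) is my assumption
# from memory of the Nutch wiki; please correct me if it differs.
bin/nutch solrindex http://localhost:8983/solr crawl/crawldb crawl/linkdb crawl/segments/*
```

If the one-step command does support a Solr flag in my version, I'd appreciate the exact spelling and where it goes relative to the urls directory.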