Re: [Error Crawling Job Failed] NUTCH 1.9

Muhamad Muchlis Mon, 03 Nov 2014 02:48:12 -0800

Like this?

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
 <name>http.agent.name</name>
 <value>My Nutch Spider</value>
</property>

<property>
 <name>solr.server.url</name>
 <value>http://localhost:8983/solr/</value>
</property>


</configuration>
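
A quick way to confirm that a file like this parses and actually carries the property is sketched below in Python. This is only an illustration under stated assumptions: the helper name `read_properties` and the inline `SAMPLE` string are hypothetical, the sample uses a plain URL as the value, and the script only reads the XML; it does not talk to Nutch or Solr.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config.

    Hypothetical helper for illustration; not part of Nutch.
    """
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Assumed sample mirroring the nutch-site.xml in this message.
SAMPLE = """<?xml version="1.0"?>
<configuration>
<property>
 <name>http.agent.name</name>
 <value>My Nutch Spider</value>
</property>
<property>
 <name>solr.server.url</name>
 <value>http://localhost:8983/solr/</value>
</property>
</configuration>
"""

props = read_properties(SAMPLE)
print(props.get("solr.server.url", "solr.server.url is not set"))
```

To check a real file, the same helper could be fed the contents of conf/nutch-site.xml instead of the sample string.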


On Mon, Nov 3, 2014 at 5:41 PM, Markus Jelsma <markus.jel...@openindex.io>
wrote:

> You can set solr.server.url in your nutch-site.xml or pass it via command
> line as -Dsolr.server.url=<URL>
>
>
>
> -----Original message-----
> > From:Muhamad Muchlis <tru3....@gmail.com>
> > Sent: Monday 3rd November 2014 11:37
> > To: user@nutch.apache.org
> > Subject: Re: [Error Crawling Job Failed] NUTCH 1.9
> >
> > Hi Markus,
> >
> > Where can I find the Solr URL setting?  -D
> >
> > On Mon, Nov 3, 2014 at 5:31 PM, Markus Jelsma <markus.jel...@openindex.io>
> > wrote:
> >
> > > Well, here it is:
> > > java.lang.RuntimeException: Missing SOLR URL. Should be set via
> > > -Dsolr.server.url
> > >
> > >
> > >
> > > -----Original message-----
> > > > From:Muhamad Muchlis <tru3....@gmail.com>
> > > > Sent: Monday 3rd November 2014 10:58
> > > > To: user@nutch.apache.org
> > > > Subject: Re: [Error Crawling Job Failed] NUTCH 1.9
> > > >
> > > > 2014-11-03 16:56:06,530 INFO  indexer.IndexingJob - Indexer: starting at 2014-11-03 16:56:06
> > > > 2014-11-03 16:56:06,582 INFO  indexer.IndexingJob - Indexer: deleting gone documents: false
> > > > 2014-11-03 16:56:06,582 INFO  indexer.IndexingJob - Indexer: URL filtering: false
> > > > 2014-11-03 16:56:06,582 INFO  indexer.IndexingJob - Indexer: URL normalizing: false
> > > > 2014-11-03 16:56:06,800 ERROR solr.SolrIndexWriter - Missing SOLR URL. Should be set via -D solr.server.url
> > > > SOLRIndexWriter
> > > > solr.server.url : URL of the SOLR instance (mandatory)
> > > > solr.commit.size : buffer size when sending to SOLR (default 1000)
> > > > solr.mapping.file : name of the mapping file for fields (default
> > > > solrindex-mapping.xml)
> > > > solr.auth : use authentication (default false)
> > > > solr.auth.username : use authentication (default false)
> > > > solr.auth : username for authentication
> > > > solr.auth.password : password for authentication
> > > >
> > > > 2014-11-03 16:56:06,802 ERROR indexer.IndexingJob - Indexer:
> > > > java.lang.RuntimeException: Missing SOLR URL. Should be set via -D
> > > > solr.server.url
> > > > SOLRIndexWriter
> > > > solr.server.url : URL of the SOLR instance (mandatory)
> > > > solr.commit.size : buffer size when sending to SOLR (default 1000)
> > > > solr.mapping.file : name of the mapping file for fields (default
> > > > solrindex-mapping.xml)
> > > > solr.auth : use authentication (default false)
> > > > solr.auth.username : use authentication (default false)
> > > > solr.auth : username for authentication
> > > > solr.auth.password : password for authentication
> > > >
> > > > at org.apache.nutch.indexwriter.solr.SolrIndexWriter.setConf(SolrIndexWriter.java:192)
> > > > at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:159)
> > > > at org.apache.nutch.indexer.IndexWriters.<init>(IndexWriters.java:57)
> > > > at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:91)
> > > > at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
> > > > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
> > > >
> > > >
> > > > On Mon, Nov 3, 2014 at 3:41 PM, Markus Jelsma <markus.jel...@openindex.io>
> > > > wrote:
> > > >
> > > > > Hi - see the logs for more details.
> > > > > Markus
> > > > >
> > > > > -----Original message-----
> > > > > > From:Muhamad Muchlis <tru3....@gmail.com>
> > > > > > Sent: Monday 3rd November 2014 9:15
> > > > > > To: user@nutch.apache.org
> > > > > > Subject: [Error Crawling Job Failed] NUTCH 1.9
> > > > > >
> > > > > > Hello.
> > > > > >
> > > > > > I get an error message when I run the command:
> > > > > >
> > > > > > *crawl seed/seed.txt crawl -depth 3 -topN 5*
> > > > > >
> > > > > >
> > > > > > Error Message :
> > > > > >
> > > > > > SOLRIndexWriter
> > > > > > solr.server.url : URL of the SOLR instance (mandatory)
> > > > > > > solr.commit.size : buffer size when sending to SOLR (default 1000)
> > > > > > solr.mapping.file : name of the mapping file for fields (default
> > > > > > solrindex-mapping.xml)
> > > > > > solr.auth : use authentication (default false)
> > > > > > solr.auth.username : use authentication (default false)
> > > > > > solr.auth : username for authentication
> > > > > > solr.auth.password : password for authentication
> > > > > >
> > > > > >
> > > > > > Indexer: java.io.IOException: Job failed!
> > > > > > > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
> > > > > > > at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:114)
> > > > > > > at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:176)
> > > > > > > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > > > > at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:186)
> > > > > >
> > > > > >
> > > > > > > Can anyone explain why this happened?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Best regards
> > > > > >
> > > > > > M.Muchlis
> > > > > >
> > > > >
> > > >
> > >
>
