So now it gets interesting:
I copied the jars into the Nutch lib directory and ran a successful 'ant job'.
bin/nutch solrindex still raises a java.io.IOException: Job failed!
The interesting part is in the solr output:
Apr 19, 2011 11:57:31 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[http:/urltoindex.com/, http:/urltoindex.com/2010/04/12.html, ... (162 adds)]} 0 1746
Apr 19, 2011 11:57:31 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update params={wt=javabin&version=1} status=0 QTime=1746
So it seems that Nutch does pass documents to Solr.
But the nutch log states the following:
2011-04-19 12:08:05,221 WARN mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:466)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:75)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:474)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-04-19 12:08:06,042 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
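For context, that "Invalid version" message is (as far as I understand it) SolrJ checking the leading version byte of the server's javabin response against the version the client speaks: Solr 3.1 writes javabin v2, while an old SolrJ 1.4 client expects v1. A minimal sketch of that check (simplified stand-in, not Solr's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class JavabinVersionCheck {
    // Assumption: the old SolrJ 1.4 client expects javabin version 1.
    static final int CLIENT_VERSION = 1;

    // Simplified stand-in for JavaBinCodec.unmarshal()'s leading version check.
    static void unmarshal(InputStream in) throws IOException {
        int version = in.read();
        if (version != CLIENT_VERSION) {
            throw new RuntimeException(
                "Invalid version or the data in not in 'javabin' format");
        }
    }

    public static void main(String[] args) throws IOException {
        // A Solr 3.1 server replies with javabin version 2 as the first byte.
        byte[] serverResponse = { 2 };
        try {
            unmarshal(new ByteArrayInputStream(serverResponse));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Which is why the mismatch shows up only on the client side while Solr itself logs a clean add.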
What could be wrong with that simple setup?
Could there be other jar incompatibilities lurking somewhere?
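One way to rule that out is to ask the JVM where it actually loaded a class from. This is a hypothetical helper (the name is mine); run inside the job with the Solr codec class, it would show whether an old solrj jar is still winning on the classpath:

```java
import java.security.CodeSource;

public class WhichJar {
    // Hypothetical helper: report the location a class was loaded from.
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap classpath)"
                               : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // In a real Nutch job run, this line would reveal which solrj jar wins:
        System.out.println(locationOf("org.apache.solr.common.util.JavaBinCodec"));
        System.out.println(locationOf("java.lang.String")); // bootstrap sanity check
    }
}
```

If the printed path points at the old jar bundled in the .job file rather than the 3.1.0 jar you copied in, the job file was not regenerated.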
> Copying to Nutch lib does not necessarily mean that this is what is being
> used until you generate a new job file with 'ant job'. Nutch 1.3 clarifies
> this by separating a deployed environment from a local one and it would be a
> good idea to use it (it will be released pretty soon anyway).
>
> I am starting to believe that we should upgrade the version of Solr in 1.3
> before it is released, as I expect that many people will be using it.
>
>
> On 19 April 2011 14:19, Max Stricker <[email protected]> wrote:
>
>>
>>> You must upgrade Nutch's two SolrJ jars to 3.1.
>>
>> That's what I already tried. I copied the SolrJ jars from the Solr 3.1.0
>> release to nutch/lib and removed the old ones, so Nutch now uses
>> apache-solr-solrj-3.1.0.jar. The error, however, remains the same.
>>
>> Any other suggestions?
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com