Check line 79 of your Solr schema:
http://svn.apache.org/viewvc/nutch/branches/branch-1.3/conf/schema.xml?view=markup

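The content field there is indexed but not stored, so its text is searchable 
but is not returned in results. Maybe we should configure the field to be 
stored in 1.4; I can imagine this causes a lot of headaches for new users. 
Also, highlighting will never work with unstored fields.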
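
A minimal sketch of the change, assuming line 79 is the content field
definition and that it currently carries stored="false" (check your own copy
of schema.xml, since the type name and attribute order shown here are
assumptions and may differ):

```xml
<!-- Assumed form of the content field; stored="true" is the only intended change. -->
<field name="content" type="text" stored="true" indexed="true"/>
```

After editing the schema, Solr needs to be restarted and the segments
re-indexed with solrindex, because documents indexed earlier keep no stored
copy of the field.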

On Monday 19 September 2011 11:02:17 Jann Forrer wrote:
> Hi
> 
> I tried to run Nutch 1.3 together with Solr 3.x according to
> http://wiki.apache.org/nutch/NutchTutorial.
> 
> That worked as described, but if I try to search the index using the Solr
> admin interface I always get an empty result.
> 
> http://localhost:8983/solr/admin/schema.jsp
> 
> Using the Schema Browser I see entries in different fields (e.g. the url
> field) but the content field is empty. I looked for similar problems on
> the mailing list but didn't find a solution.
> 
> Here is what I did:
> 
> 1.) ./bin/nutch crawl urls -dir crawl -depth 3 -topN 5
> 2.) Dumped the segment (./bin/nutch readseg -dump
> crawl/segments/20110916124747 test). The dump
>       also contained the content of the web pages. All seems to be OK here.
> 3.) Copied the Nutch schema.xml to the Solr conf directory
> 4.) bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb
> crawl/linkdb crawl/segments/*
> 5.) Then tried to search using http://localhost:8983/solr/admin/
> but didn't find any HTML content.
>       However, if there was a PDF file to crawl, its content was found.
> 
> BTW, using Nutch 1.1 and Solr 1.4.1 all worked as expected. I could use
> these versions, but I am upgrading from an older Nutch version and it
> would be nice if I could use the newer version, where Nutch and Solr
> are better integrated.
> 
> Any ideas what might be wrong?
> 
> Jann

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
