Yes, I think what Markus has pointed out is the problem, Jann. This means you need to change the stored and indexed values for that field to true in your schema and then re-index your data.
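
For illustration, the change would look roughly like the fragment below. This is only a sketch: I am assuming the field in question is the content field and that it is declared with type="text", so please check the actual declaration at line 79 of your own schema.xml rather than copying this verbatim.

```xml
<!-- Sketch only: the field name and type are assumptions, check your schema.xml.
     Assumed current declaration (unstored: the text is indexed for search
     but is not returned in results):
     <field name="content" type="text" stored="false" indexed="true"/>
-->
<!-- Changed declaration (stored: the text can be returned and highlighted) -->
<field name="content" type="text" stored="true" indexed="true"/>
```

After editing the schema, Solr has to be restarted so it picks up the change, and the solrindex step has to be run again so that the existing documents are re-indexed with the stored value.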

Markus, out of interest, do you know the pros and cons if we were to make this the default in the Nutch schema? For example, with small indexes I wouldn't imagine there would be much difference, but with non-trivially sized indexes I would imagine it would be a different story... Any thoughts?

On Mon, Sep 19, 2011 at 2:54 PM, Markus Jelsma <[email protected]> wrote:

> Check line 79 of your Solr schema:
>
> http://svn.apache.org/viewvc/nutch/branches/branch-1.3/conf/schema.xml?view=markup
>
> Maybe we should configure the field to be stored in 1.4. I can imagine this
> causes a lot of headaches for new users. Also highlighting will never work
> with unstored fields.
>
> On Monday 19 September 2011 11:02:17 Jann Forrer wrote:
> > Hi
> >
> > I tried to run nutch-1.3 together with solr 3.x according to
> > http://wiki.apache.org/nutch/NutchTutorial.
> >
> > That worked as described, but if I try to search the index using the Solr
> > admin interface I always get an empty result.
> >
> > http://localhost:8983/solr/admin/schema.jsp
> >
> > Using the Schema Browser I see entries in different fields (e.g. the url
> > field) but the content field is empty. I was looking for a similar problem
> > on the mailing list but I didn't find a solution for this problem.
> >
> > Here is what I did:
> >
> > 1.) ./bin/nutch crawl urls -dir crawl -depth 3 -topN 5
> > 2.) Dumping the segment (./bin/nutch readseg -dump
> >     crawl/segments/20110916124747 test). The script did also dump the
> >     content of the web pages. All seems to be ok here.
> > 3.) Copy the nutch schema.xml to the solr conf directory
> > 4.) bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb
> >     crawl/linkdb crawl/segments/*
> > 5.) And then trying to search using http://localhost:8983/solr/admin/.
> >     but didn't find any HTML-content.
> >     However if there was a pdf-File to crawl, this pdf-Content is found.
> >
> > BTW. Using Nutch 1.1 and solr 1.4.1 all worked as expected. I could use
> > these versions but I am upgrading from an older Nutch version and it would
> > be nice if I could use the newer version where nutch and solr are better
> > integrated.
> >
> > Any ideas what might be wrong?
> >
> > Jann
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350

--
*Lewis*

