creating solr index from nutch segments, no errors, no results

2011-11-15 Thread Armin Schleicher

hi there,

i am trying to create a fulltext index over internet archive .warc 
files. the whole procedure (described below) seems to work fine and i 
do not get any errors or warnings, but no data is being passed to solr, 
or at least q=*:* returns nothing. i double checked that the nutch 
schema.xml is in the right place, and when i dump the segments into a 
text file, all the data is there...
i create the segments using the nutchwax import command from *.warc.gz 
files created by archive-it! (heritrix) and then create the crawldb and 
linkdb using the nutch updatedb and invertlinks commands.

here is my procedure:

*create solrindex*

   sh /nutch-1.3/runtime/local/bin/nutch solrindex \
   http://127.0.0.1:8983/solr/ /crawldb /linkdb /segments_test/


*nutch output:*

   SolrIndexer: starting at 2011-11-15 08:45:53
   SolrIndexer: finished at 2011-11-15 08:45:57, elapsed: 00:00:03

*this is the resulting solr/jetty output:*

   15.11.2011 08:45:57 org.apache.solr.update.DirectUpdateHandler2 commit
   INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher init
   INFO: Opening Searcher@3d015a9e main
   15.11.2011 08:45:57 org.apache.solr.update.DirectUpdateHandler2 commit
   INFO: end_commit_flush
   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
   
   fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming result for Searcher@3d015a9e main
   
   fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
   
   filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming result for Searcher@3d015a9e main
   
   filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
   
   queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming result for Searcher@3d015a9e main
   
   queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
   
   documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
   INFO: autowarming result for Searcher@3d015a9e main
   
   documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}

   15.11.2011 08:45:57 org.apache.solr.core.QuerySenderListener newSearcher
   INFO: QuerySenderListener sending requests to Searcher@3d015a9e main
   15.11.2011 08:45:57 org.apache.solr.core.QuerySenderListener newSearcher
   INFO: QuerySenderListener done.
   15.11.2011 08:45:57 org.apache.solr.core.SolrCore registerSearcher
   INFO: [] Registered new searcher Searcher@3d015a9e main
   15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher close
   INFO: Closing Searcher@4743bf3d main
   
   fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
   
   filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
   
   

Re: creating solr index from nutch segments, no errors, no results

2011-11-15 Thread Michael Kuhlmann
I don't know much about nutch, but it looks like there's simply a commit 
missing at the end.


Try sending a commit, e.g. by executing:

curl http://host:port/solr/core/update -H "Content-Type: text/xml" \
--data-binary '<commit />'
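
For anyone who would rather script the check, here is a rough Python sketch of the same idea: send the commit, then ask Solr how many documents it actually holds. It is only an illustration under assumptions: the base URL is taken from the solrindex command earlier in the thread, the helper names (build_commit_request, parse_num_found, count_documents) are made up for this example, and it assumes Solr answers the select request in its XML format. Whether a missing commit is really the cause is not confirmed here.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: base URL copied from the solrindex command in this thread.
SOLR_URL = "http://127.0.0.1:8983/solr"


def build_commit_request(solr_url: str = SOLR_URL) -> urllib.request.Request:
    """Build (but do not send) the POST that asks Solr to commit."""
    return urllib.request.Request(
        url=solr_url.rstrip("/") + "/update",
        data=b"<commit/>",
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def parse_num_found(response_xml: str) -> int:
    """Extract the numFound attribute from a Solr XML query response."""
    root = ET.fromstring(response_xml)
    result = root.find("result")
    if result is None:
        raise ValueError("no <result> element in response")
    return int(result.attrib["numFound"])


def count_documents(solr_url: str = SOLR_URL) -> int:
    """Send a commit, then run q=*:* with rows=0 and return the hit count."""
    with urllib.request.urlopen(build_commit_request(solr_url)) as resp:
        resp.read()  # drain the commit response; its body is not needed
    query = solr_url.rstrip("/") + "/select?q=*:*&rows=0&wt=xml"
    with urllib.request.urlopen(query) as resp:
        return parse_num_found(resp.read().decode("utf-8"))
```

Calling count_documents() against a running Solr returns the number of indexed documents. If it is still 0 after the commit, the documents most likely never reached Solr in the first place.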


-Kuli

On 15.11.2011 09:11, Armin Schleicher wrote:

hi there,

[...]