Sharded Index Creation Magic?

Nick Dimiduk Mon, 13 Jul 2009 13:30:50 -0700

Hello!

I'm working with Solr-1.3.0 using a sharded index for distributed,
aggregated search. I've successfully run through the example described in
the DistributedSearch wiki page. I have built an index from a corpus of some
50 million documents in an HBase table and created 7 shards using
org.apache.hadoop.hbase.mapred.BuildTableIndex. I can deploy any one of
these shards to a single Solr instance and happily search the index after
tweaking the schema appropriately. However, when I search across all
deployed shards using the &shards= query parameter (
http://host00:8080/solr/select?shards=host00:8080/solr,host01:8080/solr&q=body\%3A%3Aterm),
I get a NullPointerException:


java.lang.NullPointerException
        at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:421)
        at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:265)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:264)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)

Debugging into the QueryComponent.mergeIds() method reveals that
sreq.responses (line 356) contains one response for each shard specified,
each with the number of results returned by the independent queries. The
problems begin at line 370, where the SolrDocument instance has only a
score field. That proves problematic on the following line, where the id
is requested: because the SolrDocument lacks the designated ID field (from
my schema), the document cannot be added to the results queue.
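
To make the failure mode concrete, here is a minimal Java sketch of what
seems to be happening. It does not reproduce Solr's code: a plain Map stands
in for SolrDocument, the field name "id" is a hypothetical placeholder for
the schema's unique key, and the toString() call is only an assumption about
how a missing id might end up as a NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

public class MergeIdsSketch {
    // Hypothetical name of the schema's unique key field.
    static final String ID_FIELD = "id";

    // Simplified stand-in for the merge step: fetch the document's unique key
    // and turn it into a string key for the merged result set.
    static String mergeKey(Map<String, Object> shardDoc) {
        Object id = shardDoc.get(ID_FIELD); // null when the shard returned no id field
        return id.toString();               // assumed dereference; throws NullPointerException on null
    }

    // Reports whether merging this document would fail with a NullPointerException.
    static boolean failsToMerge(Map<String, Object> shardDoc) {
        try {
            mergeKey(shardDoc);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A document like the one seen in the debugger: score only, no id.
    static Map<String, Object> scoreOnlyDoc() {
        Map<String, Object> doc = new HashMap<>();
        doc.put("score", 1.0f);
        return doc;
    }

    // The same document with a retrievable id field ("row-1" is made-up data).
    static Map<String, Object> docWithId() {
        Map<String, Object> doc = scoreOnlyDoc();
        doc.put(ID_FIELD, "row-1");
        return doc;
    }

    public static void main(String[] args) {
        System.out.println("score-only doc fails: " + failsToMerge(scoreOnlyDoc()));
        System.out.println("doc with id fails:    " + failsToMerge(docWithId()));
    }
}
```

In this sketch the only difference between the two documents is whether the
id value comes back with the shard response, which matches what I see in the
debugger.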

Because the example on the wiki works by loading the documents directly into
Solr for indexing, I have come to the conclusion that Solr's own indexing
step performs some extra magic that my index generation process lacks.

Thanks for the help!
