Re: Cross index join query performance

Joel Bernstein Thu, 26 Sep 2013 15:51:02 -0700

It looks like you are using int join keys, so you may want to check out
SOLR-4787, specifically the hjoin and bjoin implementations.


These perform well when the query against the fromIndex returns a large
number of results. If it returns only a small number of results, the
standard join will be faster.
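
As a rough illustration of the two approaches, here is a toy sketch in
Python. It is not the Solr or SOLR-4787 code: the function names and the
dict/list data layout are made up for the example. It assumes one int join
key per document, as with docid below.

```python
from typing import Dict, List, Set


def term_walk_join(from_terms: Dict[int, List[int]],
                   from_hits: Set[int],
                   to_terms: Dict[int, List[int]]) -> Set[int]:
    """Toy model of a term-enumeration join (hypothetical, not Solr's code).

    from_terms / to_terms map each join key to the doc ids that contain it.
    The loop visits every distinct key of the 'from' field, so its cost
    grows with the number of keys, even when few of them match.
    """
    joined: Set[int] = set()
    for key, from_docs in from_terms.items():
        # Keep the key only if one of its docs is in the 'from' result set.
        if any(doc in from_hits for doc in from_docs):
            joined.update(to_terms.get(key, ()))
    return joined


def hash_join(from_keys: List[int],
              from_hits: Set[int],
              to_keys: List[int]) -> Set[int]:
    """Toy model of a hash join over int keys (hypothetical, not SOLR-4787).

    from_keys[doc] / to_keys[doc] hold the single int join key of each doc.
    The keys of the 'from' hits go into a hash set, then one pass over the
    'to' side keeps the docs whose key is in that set.
    """
    wanted = {from_keys[doc] for doc in from_hits}
    return {doc for doc, key in enumerate(to_keys) if key in wanted}
```

Given the same data, both functions return the same set of 'to' documents.
They differ only in what they iterate over: every distinct join key in the
first case, the 'from' hits plus one pass over the 'to' documents in the
second.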


On Wed, Sep 25, 2013 at 3:39 PM, Peter Keegan <peterlkee...@gmail.com> wrote:

> I forgot to mention - this is Solr 4.3
>
> Peter
>
>
>
> On Wed, Sep 25, 2013 at 3:38 PM, Peter Keegan <peterlkee...@gmail.com> wrote:
>
> > I'm doing a cross-core join query and the join query is 30X slower than
> > each of the 2 individual queries. Here are the queries:
> >
> > Main query: http://localhost:8983/solr/mainindex/select?q=title:java
> > QTime: 5 msec
> > hit count: 1000
> >
> > Sub query: http://localhost:8983/solr/subindex/select?q=+fld1:[0.1 TO 0.3]
> > QTime: 4 msec
> > hit count: 25K
> >
> > Join query:
> >
> > http://localhost:8983/solr/mainindex/select?q=title:java&fq={!join fromIndex=mainindex toIndex=subindex from=docid to=docid}fld1:[0.1 TO 0.3]
> > QTime: 160 msec
> > hit count: 205
> >
> > Here are the index specs:
> >
> > mainindex size: 117K docs, 1 segment
> > mainindex schema:
> >    <field name="docid" type="int" indexed="true" stored="true"
> > required="true" multiValued="false" />
> >    <field name="title" type="text_en_splitting" indexed="true"
> > stored="true" multiValued="false" />
> >    <uniqueKey>docid</uniqueKey>
> >
> > subindex size: 117K docs, 1 segment
> > subindex schema:
> >    <field name="docid" type="int" indexed="true" stored="true"
> > required="true" multiValued="false" />
> >    <field name="fld1" type="float" indexed="true" stored="true"
> > required="false" multiValued="false" />
> >    <uniqueKey>docid</uniqueKey>
> >
> > With debugQuery=true I see:
> >   "debug":{
> >     "join":{
> >       "{!join from=docid to=docid fromIndex=subindex}fld1:[0.1 TO 0.3]":{
> >         "time":155,
> >         "fromSetSize":24742,
> >         "toSetSize":24742,
> >         "fromTermCount":117810,
> >         "fromTermTotalDf":117810,
> >         "fromTermDirectCount":117810,
> >         "fromTermHits":24742,
> >         "fromTermHitsTotalDf":24742,
> >         "toTermHits":24742,
> >         "toTermHitsTotalDf":24742,
> >         "toTermDirectCount":24627,
> >         "smallSetsDeferred":115,
> >         "toSetDocsAdded":24742}},
> >
> > Via profiler and debugger, I see 150 msec spent in the outer
> > 'while(term!=null)' loop in: JoinQueryWeight.getDocSet(). This seems like a
> > lot of time to join the bitsets. Does this seem right?
> >
> > Peter
> >
> >
>



-- 
Joel Bernstein
Professional Services LucidWorks
