[ 
https://issues.apache.org/jira/browse/LUCENE-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189375#comment-13189375
 ] 

Jason Rutherglen commented on LUCENE-3602:
------------------------------------------

I was reviewing this issue for use in cases where Solr's join implementation 
may not be the right choice.

In this Lucene Join implementation, a new BytesRefHash is built per query (and 
cannot be reused).  This could generate a fair amount of garbage quickly.  

Also, the sort comparison using BRH is per byte (not as cheap as an ord 
compare).  We can probably replace BytesRefHash with DocTermsIndex and compare 
DTI's ords instead.  Then we would avoid copying the bytes into a BRH per 
query, and the comparison would be faster.
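
To illustrate the difference in comparison cost, here is a minimal sketch in 
plain Java.  It uses neither BytesRefHash nor DocTermsIndex; the class and 
method names are hypothetical, and a sorted byte[][] stands in for a term 
index in which a term's ord is its position.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class OrdCompareSketch {

    /** Byte-wise compare: cost grows with the length of the shared prefix. */
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF); // unsigned byte order
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Ord compare: one int comparison, valid only within the same sorted index. */
    public static int compareOrds(int ordA, int ordB) {
        return Integer.compare(ordA, ordB);
    }

    /** Hypothetical stand-in for a term index: distinct terms in unsigned byte order. */
    public static byte[][] sortedTerms(String... terms) {
        byte[][] out = Arrays.stream(terms)
                .distinct()
                .map(t -> t.getBytes(StandardCharsets.UTF_8))
                .toArray(byte[][]::new);
        Arrays.sort(out, OrdCompareSketch::compareBytes);
        return out;
    }

    /** The ord of a term is its position in the sorted array (negative if absent). */
    public static int ordOf(byte[][] sorted, String term) {
        byte[] key = term.getBytes(StandardCharsets.UTF_8);
        return Arrays.binarySearch(sorted, key, OrdCompareSketch::compareBytes);
    }

    public static void main(String[] args) {
        byte[][] index = sortedTerms("user:1002", "user:1001", "user:2000");
        int a = ordOf(index, "user:1001");
        int b = ordOf(index, "user:1002");
        // Both comparisons agree on ordering; the ord compare never touches the bytes.
        boolean sameOrder = Integer.signum(compareOrds(a, b))
                == Integer.signum(compareBytes(index[a], index[b]));
        System.out.println("ord and byte order agree: " + sameOrder);
    }
}
```

The ords are only comparable within the index they came from, so a real 
implementation would have to deal with that per segment.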

Additionally, for a join with many terms, the number of postings could become a 
factor in performance.  Because we are not caching bitsets like Solr does, it 
seems like an excellent occasion for a faster, less-compressed codec.

Further, to save on term seeking, if the term state were cached (e.g., the file 
pointers into the postings lists), the iteration over the terms dict would no 
longer be needed.
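
A rough sketch of that caching idea, again in plain Java rather than against 
the Lucene API: a sorted String[] stands in for the terms dict, a long[] for 
the postings file pointers, and every name here is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class TermStateCacheSketch {
    private final String[] sortedTerms;    // stand-in for the terms dict
    private final long[] postingsPointers; // postingsPointers[i] belongs to sortedTerms[i]
    private final Map<String, Long> cache = new HashMap<>();
    private int dictSeeks = 0;

    public TermStateCacheSketch(String[] sortedTerms, long[] postingsPointers) {
        if (sortedTerms.length != postingsPointers.length) {
            throw new IllegalArgumentException("terms and pointers must align");
        }
        this.sortedTerms = sortedTerms;
        this.postingsPointers = postingsPointers;
    }

    /** Returns the postings pointer for the term, or -1 if the term is absent. */
    public long postingsPointer(String term) {
        Long cached = cache.get(term);
        if (cached != null) {
            return cached; // cache hit: no seek in the terms dict
        }
        dictSeeks++;
        int idx = Arrays.binarySearch(sortedTerms, term);
        long pointer = idx >= 0 ? postingsPointers[idx] : -1L;
        cache.put(term, pointer); // costs RAM, saves the seek next time
        return pointer;
    }

    /** Number of lookups that actually had to search the terms dict. */
    public int dictSeeks() {
        return dictSeeks;
    }

    public static void main(String[] args) {
        TermStateCacheSketch dict = new TermStateCacheSketch(
                new String[] {"apple", "banana", "cherry"},
                new long[] {0L, 128L, 512L});
        dict.postingsPointer("banana"); // searches the dict
        dict.postingsPointer("banana"); // served from the cache
        System.out.println("dict seeks: " + dict.dictSeeks());
    }
}
```

The cache is unbounded in this sketch; anything real would need a size limit 
and invalidation when the underlying index changes.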

Granted, all this requires more RAM; however, in many cases (e.g., mine) that 
would be acceptable.
                
> Add join query to Lucene
> ------------------------
>
>                 Key: LUCENE-3602
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3602
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/join
>            Reporter: Martijn van Groningen
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3602.patch, LUCENE-3602.patch, LUCENE-3602.patch, 
> LUCENE-3602.patch, LUCENE-3602.patch, LUCENE-3602.patch, LUCENE-3602.patch, 
> LUCENE-3602.patch, LUCENE-3602.patch, LUCENE-3602.patch
>
>
> Solr has had a (pseudo) join query for a while now. I think this should also 
> be available in Lucene.
