[ 
https://issues.apache.org/jira/browse/LUCENE-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182269#comment-13182269
 ] 

Michael McCandless commented on LUCENE-3602:
--------------------------------------------

Wow, the new patch is tiny -- just 2 static methods!

Right now you do 3 passes -- 1st pass records fromSearcher's docIDs
matching fromQuery; 2nd pass translates those matching docIDs into
the joinable terms in fromSearcher.fromField; 3rd pass then records
toSearcher docIDs matching those terms in toField.

But I think the first 2 passes could be combined?  I.e., as you collect
each hit from fromQuery, instead of recording the docID, look up the
term in fromField for that doc and save that instead?  Then you don't
need to save the fromSearcher docIDs at all.  (The 3rd pass would stay
the same.)
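
A minimal sketch of that combined pass, in plain Java rather than
against the Lucene API.  The class name JoinTermCollector, the
collect(int) method and the String[] standing in for a per-document
fromField lookup are all assumptions made for illustration; they are
not taken from the patch.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the combined 1st/2nd pass; not the Lucene API.
class JoinTermCollector {
    // Assumption: fromFieldByDoc[docID] is the single join term that the
    // document holds in fromField, or null if it has none.
    private final String[] fromFieldByDoc;
    private final Set<String> joinTerms = new LinkedHashSet<>();

    JoinTermCollector(String[] fromFieldByDoc) {
        this.fromFieldByDoc = fromFieldByDoc;
    }

    // Called once per hit of fromQuery: record the term, never the docID.
    void collect(int docID) {
        String term = fromFieldByDoc[docID];
        if (term != null) {
            joinTerms.add(term);
        }
    }

    // The de-duplicated terms that the 3rd pass will match against toField.
    Set<String> joinTerms() {
        return joinTerms;
    }
}
```

Since only the term set survives the pass, memory is bounded by the
number of distinct join terms rather than by the number of hits.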

Also, instead of making a top-level bit set as the return
result... could it be an ordinary Filter?  Then the 3rd pass would be
implemented in Filter.getDocIdSet, and the Filter instance would hold
onto the terms computed by the combined 1st/2nd pass?
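
The shape of that idea, sketched in plain Java under stated
assumptions: JoinTermsFilterSketch and its docIdSet method are
hypothetical and only mimic the role a Lucene Filter would play, and a
String[] again stands in for the per-document toField values of one
reader.

```java
import java.util.BitSet;
import java.util.Set;

// Hypothetical analogue of a Filter that carries the join terms; not the Lucene API.
class JoinTermsFilterSketch {
    // Terms gathered by the combined 1st/2nd pass.
    private final Set<String> joinTerms;

    JoinTermsFilterSketch(Set<String> joinTerms) {
        this.joinTerms = joinTerms;
    }

    // The 3rd pass, run lazily and once per reader instead of up front.
    // Assumption: toFieldByDoc[docID] is that reader's toField term, or null.
    BitSet docIdSet(String[] toFieldByDoc) {
        BitSet matches = new BitSet(toFieldByDoc.length);
        for (int doc = 0; doc < toFieldByDoc.length; doc++) {
            String term = toFieldByDoc[doc];
            if (term != null && joinTerms.contains(term)) {
                matches.set(doc);
            }
        }
        return matches;
    }
}
```

The same instance can then be applied to several readers, and no
top-level bit set has to be built before the search runs.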

I think this is a great step forward over previous patch... so tiny
too :)

The 1st/2nd pass would have the "expected" cost, i.e. on the order of
the number of hits matching fromQuery.  But the 3rd pass has a high
cost even for tiny queries, since it visits every doc in toSearcher
and checks whether its toField terms are in the set.  We might be able
to improve on that somehow... e.g. if the number of terms is small,
really you want to invert that process (i.e. visit the postings and
gather all matching docs), either with an OR query or with TermsFilter
(in modules/queries)?  Hmm, in fact, isn't this just a MultiTermQuery?
We could then use auto rewrite mode to rewrite it as a filter or a
small BooleanQuery?
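
To make the cost difference concrete, here is a rough sketch of both
strategies in plain Java.  Nothing here is Lucene API: the class
ToSideMatcher, the String[] of per-document toField values and the
Map used as a toy postings index are all assumptions for illustration.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.Set;

// Hypothetical comparison of the two ways to run the 3rd pass; not the Lucene API.
class ToSideMatcher {
    // Current approach: visit every doc and test its term against the set.
    // Cost grows with the number of docs, however few join terms there are.
    static BitSet scanAllDocs(String[] toFieldByDoc, Set<String> joinTerms) {
        BitSet matches = new BitSet(toFieldByDoc.length);
        for (int doc = 0; doc < toFieldByDoc.length; doc++) {
            String term = toFieldByDoc[doc];
            if (term != null && joinTerms.contains(term)) {
                matches.set(doc);
            }
        }
        return matches;
    }

    // Inverted approach: for each join term, walk that term's postings list.
    // Cost grows with the total length of the postings actually visited.
    static BitSet visitPostings(Map<String, int[]> postings, Set<String> joinTerms) {
        BitSet matches = new BitSet();
        for (String term : joinTerms) {
            int[] docs = postings.get(term);
            if (docs != null) {
                for (int doc : docs) {
                    matches.set(doc);
                }
            }
        }
        return matches;
    }
}
```

Both methods return the same docs; which one is cheaper depends on
how many join terms there are relative to the size of the index,
which is the trade-off an automatic rewrite would have to decide.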

                
> Add join query to Lucene
> ------------------------
>
>                 Key: LUCENE-3602
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3602
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/join
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3602.patch, LUCENE-3602.patch, LUCENE-3602.patch
>
>
> Solr has had a (pseudo) join query for a while now. I think this should
> also be available in Lucene.
