[jira] [Commented] (LUCENE-3602) Add join query to Lucene

Michael McCandless (Commented) (JIRA) Mon, 12 Dec 2011 11:12:04 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167686#comment-13167686
 ]


Michael McCandless commented on LUCENE-3602:
--------------------------------------------

{quote}
This is in the case if your from query was cached and your toSearch's
bitset isn't, which is a likely scenario.
{quote}

Hmm, can you describe this?  Do you mean the app sometimes uses 
fromSearcher.fromQuery's results directly, without joining?

{quote}
It just matches docs from one side to the to side. That is all... So static 
method / filter should be able to do the job.
I'm not sure, but if it is a query it might be able to one day encapsulate the 
joining in the Lucene query language?
{quote}

Yeah... the core API is really the join method, to translate top-level docIDs 
in fromSearcher over to toSearcher's top-level docIDs.
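
To make the shape of that core join step concrete, here is a minimal sketch in plain Java. It is not the code from the attached patch and uses no Lucene classes: the class name JoinSketch, the method signature, and the use of String[] arrays as stand-ins for per-document join-field values (which a real implementation would read from the index) are all assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the API from the LUCENE-3602 patch.
final class JoinSketch {

    /**
     * Translates matching top-level docIDs on the "from" side into
     * top-level docIDs on the "to" side via a shared join key.
     *
     * fromKeys[doc] / toKeys[doc] stand in for the join field's value
     * per top-level docID (null means the doc has no value).
     */
    static BitSet join(BitSet fromHits, String[] fromKeys, String[] toKeys) {
        // Step 1: collect the join keys of every matching "from" doc.
        Set<String> keys = new HashSet<>();
        for (int doc = fromHits.nextSetBit(0); doc >= 0; doc = fromHits.nextSetBit(doc + 1)) {
            if (doc < fromKeys.length && fromKeys[doc] != null) {
                keys.add(fromKeys[doc]);
            }
        }
        // Step 2: mark every "to" doc whose key was collected.
        BitSet toHits = new BitSet(toKeys.length);
        for (int doc = 0; doc < toKeys.length; doc++) {
            if (toKeys[doc] != null && keys.contains(toKeys[doc])) {
                toHits.set(doc);
            }
        }
        return toHits;
    }
}
```

Note that step 2 walks every document on the "to" side regardless of how few "from" docs matched, which is the per-query cost discussed further down.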

The AdjustedDISI (maybe rename SliceDISI?  SubReaderDISI?  ie, something to 
indicate it "slices" a sub-reader's portion of the top-level docID space) can 
then be used to translate back into a per-segment Filter.
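
As a rough illustration of that "slicing" idea, the sketch below walks only one sub-reader's window of a top-level bit set and returns segment-local docIDs. It is plain Java, not the AdjustedDISI from the patch: the class name SliceIteratorSketch, the docBase/maxDoc constructor parameters, and the locally defined NO_MORE_DOCS sentinel are assumptions for illustration.

```java
import java.util.BitSet;

// Hypothetical illustration only; not the AdjustedDISI class from the patch.
final class SliceIteratorSketch {
    // Sentinel returned once the slice is exhausted (defined locally for this sketch).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final BitSet topLevel; // matches expressed as top-level docIDs
    private final int docBase;     // first top-level docID owned by this sub-reader
    private final int end;         // one past the last top-level docID it owns
    private int next;              // next top-level docID to examine
    private boolean exhausted;

    SliceIteratorSketch(BitSet topLevel, int docBase, int maxDoc) {
        this.topLevel = topLevel;
        this.docBase = docBase;
        this.end = docBase + maxDoc;
        this.next = docBase;
    }

    /** Returns the next matching docID relative to this sub-reader, or NO_MORE_DOCS. */
    int nextDoc() {
        if (exhausted) {
            return NO_MORE_DOCS;
        }
        int top = topLevel.nextSetBit(next);
        if (top < 0 || top >= end) {
            exhausted = true; // no more matches inside this sub-reader's window
            return NO_MORE_DOCS;
        }
        next = top + 1;
        return top - docBase; // rebase from top-level to segment-local docID
    }
}
```

A per-segment Filter could then hand out one such iterator per sub-reader, each constructed with that sub-reader's own docBase and maxDoc.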

I think it would be cleaner as a Filter...?  This is actually similar to 
DuplicateFilter, which also must operate on top-level docIDs (since dups can 
happen in different segments).

bq.  Would be nice if the user can choose between a more ram but faster variant 
and a less ram but slower variant.

Yeah I agree... but what worries me is just how slow this non-RAM version will 
be.  Ie, it must do the full join and uninvert every time; so even if your 
fromQuery only matches a tiny number of docs... you pay the massive cost of 
the full join.  Even better than using FieldCache/DocValues/DocTermOrds 
(FC/DV/DTO) to map docID -> term(s) per query, we could hold in RAM the join 
result itself (docID -> docID(s)) in some form, then the query just directly 
maps the docIDs w/o having to look up terms again.
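
One possible form for such a RAM-resident join result is sketched below in plain Java. It is an assumption for illustration, not something from the patch: the class name CachedJoinSketch, the int[][] layout, and the String[] arrays standing in for per-document join keys are all hypothetical. The point is only that term lookups happen once, at build time, and each query then maps docIDs directly.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; one possible shape for a cached join result.
final class CachedJoinSketch {
    // fromToDocs[fromDoc] holds the top-level "to" docIDs joined to fromDoc.
    private final int[][] fromToDocs;

    /** Builds the docID -> docID(s) table once; keys are consulted only here. */
    CachedJoinSketch(String[] fromKeys, String[] toKeys) {
        Map<String, List<Integer>> toDocsByKey = new HashMap<>();
        for (int doc = 0; doc < toKeys.length; doc++) {
            if (toKeys[doc] != null) {
                toDocsByKey.computeIfAbsent(toKeys[doc], k -> new ArrayList<>()).add(doc);
            }
        }
        fromToDocs = new int[fromKeys.length][];
        for (int doc = 0; doc < fromKeys.length; doc++) {
            List<Integer> docs = fromKeys[doc] == null ? null : toDocsByKey.get(fromKeys[doc]);
            fromToDocs[doc] = docs == null
                    ? new int[0]
                    : docs.stream().mapToInt(Integer::intValue).toArray();
        }
    }

    /** Per-query work: proportional to the matching "from" docs and their targets. */
    BitSet join(BitSet fromHits) {
        BitSet toHits = new BitSet();
        for (int doc = fromHits.nextSetBit(0); doc >= 0; doc = fromHits.nextSetBit(doc + 1)) {
            if (doc >= fromToDocs.length) {
                break; // bits beyond the known "from" docs are ignored
            }
            for (int toDoc : fromToDocs[doc]) {
                toHits.set(toDoc);
            }
        }
        return toHits;
    }
}
```

The trade-off is the one raised above: the table costs memory proportional to the number of joined doc pairs and must be rebuilt when either index changes, but a fromQuery that matches only a few docs no longer pays for a full join.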

Stepping back a bit... do we know how this impl compares to how ElasticSearch 
does joins?  And to how Solr does...?
                
> Add join query to Lucene
> ------------------------
>
>                 Key: LUCENE-3602
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3602
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/join
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3602.patch, LUCENE-3602.patch
>
>
> Solr has had a (pseudo) join query for a while now. I think this should also 
> be available in Lucene.  

