[ https://issues.apache.org/jira/browse/LUCENE-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167686#comment-13167686 ]
Michael McCandless commented on LUCENE-3602:
--------------------------------------------
{quote}
This is in the case if your from query was cached and your toSearch's
bitset isn't, which is a likely scenario.
{quote}
Hmm, can you describe this? You mean the app sometimes actually uses the
fromSearcher.fromQuery's results directly, without joining?
{quote}
It just matches docs from one side to the to side. That is all... So static
method / filter should be able to do the job.
I'm not sure, but if it is a query it might be able to one day encapsulate the
joining in the Lucene query language?
{quote}
Yeah... the core API is really the join method, which translates top-level docIDs
in fromSearcher over to toSearcher's top-level docIDs.
The AdjustedDISI (maybe rename to SliceDISI? SubReaderDISI? i.e., something to
indicate it "slices" a sub-reader's portion of the top-level docID space) can
then be used to translate back into a per-segment Filter.
I think it would be cleaner as a Filter...? This is actually similar to
DuplicateFilter, which also must operate on top-level docIDs (since dups can
happen in different segments).
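To make the "slicing" idea concrete, here is a minimal sketch in plain Java. It
is not the patch's AdjustedDISI and it uses no Lucene classes: java.util.BitSet
stands in for the top-level docID set, and the class name SegmentSlice and the
parameters docBase and maxDoc are hypothetical names chosen for illustration.

```java
import java.util.BitSet;

/** Hypothetical helper (not a Lucene class): one segment's view of a top-level docID set. */
final class SegmentSlice {
    private SegmentSlice() {}

    /**
     * Assumes the segment occupies [docBase, docBase + maxDoc) in the
     * top-level docID space. Returns the matching docs re-based so that
     * the segment's first doc is 0.
     */
    static BitSet slice(BitSet topLevelDocs, int docBase, int maxDoc) {
        BitSet local = new BitSet(maxDoc);
        int end = docBase + maxDoc;
        // Walk only the set bits that fall inside this segment's range.
        for (int doc = topLevelDocs.nextSetBit(docBase);
             doc >= 0 && doc < end;
             doc = topLevelDocs.nextSetBit(doc + 1)) {
            local.set(doc - docBase); // translate top-level docID to segment-local docID
        }
        return local;
    }
}
```

A per-segment Filter along these lines would hand back the sliced set for
whichever segment it is asked about, while the join itself still runs once
against top-level docIDs.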
bq. Would be nice if the user can choose between a more ram but faster variant
and a less ram but slower variant.
Yeah, I agree... but what worries me is just how slow this non-RAM version will
be. I.e., it must do the full join and uninvert every time, so even if your
fromQuery only matches a tiny number of docs... you pay the massive cost of the
full join. Even better than using FC/DV/DTO to map docID -> term(s) per query,
we could hold in RAM the join result itself (docID -> docID(s)) in some form;
then the query just directly maps the docIDs w/o having to look up terms again.
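A rough sketch of what holding the join result in RAM could look like, again in
plain Java with no Lucene classes. The class CachedJoinTable and its int[][]
layout are assumptions made for illustration only; how the table would be built
and kept current across reopens is left out.

```java
import java.util.BitSet;

/** Hypothetical sketch (not a Lucene class): a precomputed docID -> docID(s) join table. */
final class CachedJoinTable {
    // toDocs[fromDoc] lists the top-level "to" docIDs joined to fromDoc (empty if none).
    private final int[][] toDocs;

    CachedJoinTable(int[][] toDocs) {
        this.toDocs = toDocs;
    }

    /**
     * Maps the docs matched by the from query onto the "to" side.
     * Work is proportional to the number of matched from docs and their
     * join targets; no terms are looked up at query time.
     */
    BitSet join(BitSet fromMatches, int toMaxDoc) {
        BitSet result = new BitSet(toMaxDoc);
        for (int from = fromMatches.nextSetBit(0);
             from >= 0 && from < toDocs.length;
             from = fromMatches.nextSetBit(from + 1)) {
            for (int to : toDocs[from]) {
                result.set(to);
            }
        }
        return result;
    }
}
```

With a table like this, a fromQuery that matches only a few docs touches only
those rows, at the price of keeping the whole mapping resident in memory.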
Stepping back a bit... do we know how this impl compares to how ElasticSearch
does joins? And to how Solr does...?
> Add join query to Lucene
> ------------------------
>
> Key: LUCENE-3602
> URL: https://issues.apache.org/jira/browse/LUCENE-3602
> Project: Lucene - Java
> Issue Type: New Feature
> Components: modules/join
> Reporter: Martijn van Groningen
> Attachments: LUCENE-3602.patch, LUCENE-3602.patch
>
>
> Solr has had a (pseudo) join query for a while now. I think this should also be
> available in Lucene.