[jira] [Commented] (LUCENE-2454) Nested Document query support

Michael McCandless (JIRA) Tue, 21 Jun 2011 03:35:12 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052459#comment-13052459
 ]


Michael McCandless commented on LUCENE-2454:
--------------------------------------------

{quote}
bq. It uses 2 passes if you also want to collect child docs per parent

I tend to work with distributed indexes so it involves a 2 pass op anyway - one 
to understand best parents across the multiple shards first then the 
perparentlimitedquery to ensure we only pay the retrieve costs for those 
parents that make the final cut.
{quote}

The distributed case can still be done single pass, using LUCENE-3171,
ie each shard returns the top groups and then they are merged in the
front.  This should be substantially faster than doing a 2nd pass out
to all shards.

Also, we now have TopDocs.merge/TopGroups.merge to support this use
case.

bq. This overlaps with the BlockJoinQuery of LUCENE-3171, this issue might even 
be closed as duplicate of that one. Which one is preferred?

I think they are likely dups of one another and I agree we need to
make sure all important use cases are covered.

bq. Apps commonly need to return a selection of both matching and non-matching 
children along with the "best" parents.

LUCENE-3171 can do this as well, with the same approach as here, ie
doing 2 passes with two different child queries.

However, I think for both this issue and for LUCENE-3171, this means
each child doc must have the parent's PK indexed against it, right?
Ie, for that 2nd query you need some way to return all child docs
under any of the top parents, so the child query is "parentID MUST be
in XX, YY, ZZ" and "childDoc SHOULD XYZ".

In fact, we could make this a single pass capability with LUCENE-3171
and without requireing each child doc index its parent PK, ie also
pull & sort all other non-matching children under any top parent,
because collction within each parent is done when you retrieve the
TopGroups, but this can be a later enhancement.


> Nested Document query support
> -----------------------------
>
>                 Key: LUCENE-2454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2454
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 3.0.2
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>         Attachments: LUCENE-2454.patch, LUCENE-2454.patch, 
> LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in 
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-2454) Nested Document query support

Reply via email to