[ https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052459#comment-13052459 ]
Michael McCandless commented on LUCENE-2454:
--------------------------------------------

{quote}
bq. It uses 2 passes if you also want to collect child docs per parent

I tend to work with distributed indexes so it involves a 2 pass op anyway - one to understand best parents across the multiple shards first then the perparentlimitedquery to ensure we only pay the retrieve costs for those parents that make the final cut.
{quote}

The distributed case can still be done in a single pass using LUCENE-3171, i.e. each shard returns its top groups and these are then merged on the front end. This should be substantially faster than doing a 2nd pass out to all shards. Also, we now have TopDocs.merge/TopGroups.merge to support this use case.

bq. This overlaps with the BlockJoinQuery of LUCENE-3171, this issue might even be closed as duplicate of that one. Which one is preferred?

I think they are likely dups of one another, and I agree we need to make sure all important use cases are covered.

bq. Apps commonly need to return a selection of both matching and non-matching children along with the "best" parents.

LUCENE-3171 can do this as well, with the same approach as here, i.e. doing 2 passes with two different child queries.

However, I think for both this issue and for LUCENE-3171, this means each child doc must have the parent's PK indexed against it, right? I.e., for that 2nd query you need some way to return all child docs under any of the top parents, so the child query is "parentID MUST be in XX, YY, ZZ" and "childDoc SHOULD XYZ".

In fact, we could make this a single-pass capability with LUCENE-3171, without requiring each child doc to index its parent's PK, i.e. also pull & sort all other non-matching children under any top parent, because collection within each parent is done when you retrieve the TopGroups. But this can be a later enhancement.
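
To make the single-pass distributed idea concrete, here is a minimal, library-independent sketch of the merge step: each shard returns its own top parents already sorted by descending score, and the front end merges those lists into a global top N. This is not Lucene code and does not use TopDocs.merge or TopGroups.merge; the names ParentHit and merge_top_parents, and the tie-break on shard index, are assumptions made purely for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ParentHit:
    """Hypothetical per-shard result: one top parent (group) and its score."""
    shard: int       # index of the shard that produced this hit
    parent_id: str   # hypothetical parent primary key
    score: float     # score of this parent on that shard


def merge_top_parents(per_shard: List[List[ParentHit]], top_n: int) -> List[ParentHit]:
    """Merge per-shard lists (each sorted by descending score) into a global top N.

    Ties on score are broken by shard index so the result is deterministic.
    """
    merged = heapq.merge(*per_shard, key=lambda h: (-h.score, h.shard))
    return list(itertools.islice(merged, top_n))


if __name__ == "__main__":
    shard0 = [ParentHit(0, "p1", 9.0), ParentHit(0, "p4", 2.5)]
    shard1 = [ParentHit(1, "p7", 7.0), ParentHit(1, "p2", 3.0)]
    for hit in merge_top_parents([shard0, shard1], top_n=3):
        print(hit.shard, hit.parent_id, hit.score)
```

Because every shard's list is already sorted, the merge only needs to look at the head of each list, so no second round trip to the shards is needed just to pick the winning parents.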
> Nested Document query support
> -----------------------------
>
>                 Key: LUCENE-2454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2454
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 3.0.2
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>         Attachments: LUCENE-2454.patch, LUCENE-2454.patch, LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene