[jira] [Commented] (LUCENE-4768) Child Traversable To Parent Block Join Query

Mark Harwood (JIRA) Mon, 11 Feb 2013 03:35:16 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-4768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575740#comment-13575740
 ]


Mark Harwood commented on LUCENE-4768:
--------------------------------------

As with any discussion about nested queries, you need to be very clear about
the required logic. When you talk about matching f1:A or f1:B, are we talking
about matches on the same child doc, or possibly matches on different child
docs of the same parent? The examples don't make this clear.
If we assume your child-based criteria are focused on examining the contents of
single children (as opposed to combining f1:A on one child doc with f1:B on a
different child doc), then a BooleanQuery that combines these child query
elements will already be sufficient for skipping through children.
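
To make the same-child versus different-child distinction concrete, here is a
minimal sketch in plain Java. It is a toy model, not Lucene code: the class and
method names are hypothetical, a parent block is modelled as a list of
children, and each child is modelled as the set of values in a multi-valued f1
field (an assumption made only so that one child can hold both A and B).

```java
import java.util.List;
import java.util.Set;

// Toy model, not Lucene code: a parent "block" is the list of its children,
// and each child is the set of values held in its (assumed multi-valued) f1 field.
final class ChildMatchSemantics {

    // True only if ONE child carries both values, i.e. the conjunction is
    // evaluated against each child document on its own.
    static boolean sameChildMatch(List<Set<String>> children, String a, String b) {
        for (Set<String> child : children) {
            if (child.contains(a) && child.contains(b)) {
                return true;
            }
        }
        return false;
    }

    // True if both values occur somewhere under the parent, possibly on
    // different child documents.
    static boolean anyChildrenMatch(List<Set<String>> children, String a, String b) {
        boolean sawA = false;
        boolean sawB = false;
        for (Set<String> child : children) {
            sawA |= child.contains(a);
            sawB |= child.contains(b);
        }
        return sawA && sawB;
    }

    public static void main(String[] args) {
        // A and B live on different children of the same parent.
        List<Set<String>> split = List.of(Set.of("A"), Set.of("B"));
        // A and B live on the same child.
        List<Set<String>> together = List.of(Set.of("A", "B"), Set.of("C"));

        System.out.println(sameChildMatch(split, "A", "B"));      // false
        System.out.println(anyChildrenMatch(split, "A", "B"));    // true
        System.out.println(sameChildMatch(together, "A", "B"));   // true
        System.out.println(anyChildrenMatch(together, "A", "B")); // true
    }
}
```

The two readings disagree only for the first parent, which is exactly the case
the examples in the issue leave open.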

I'm not really sure what you are trying to optimize with skipping anyway:
parent-child combos are limited to what fits into a single segment, which is in
turn limited by RAM. Because of these constraints you don't generally get
parents with "many many" children. The "nextDoc" calls you are trying to skip
step through a compressed block of child doc IDs (gap-encoded varints) that is
read off disk in 1K chunks (if I recall the default Directory settings
correctly). Chances are high that the limited number of child doc IDs belonging
to each parent are already in RAM as part of normal disk access patterns, so
there is no real saving in disk IO. Are you sure this is a performance
bottleneck?
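
For readers unfamiliar with the term, the sketch below shows the general idea
of gap-encoded varints in plain Java. It illustrates the technique only and
does not reproduce Lucene's actual postings format; the class and method names
are hypothetical. Sorted doc IDs are stored as differences from the previous
ID, and each difference is written in 7-bit groups with the high bit marking
"more bytes follow".

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Illustration only: delta ("gap") encoding of sorted doc IDs followed by
// variable-length byte encoding. Not Lucene's real on-disk format.
final class GapVarintSketch {

    // Encodes ascending, non-negative doc IDs as varint-encoded gaps.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            int gap = docId - previous;
            previous = docId;
            // Emit 7 bits at a time, low bits first; the high bit flags continuation.
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    // Decodes the byte stream back into absolute doc IDs by summing the gaps.
    static List<Integer> decode(byte[] bytes) {
        List<Integer> docIds = new ArrayList<>();
        int previous = 0;
        int pos = 0;
        while (pos < bytes.length) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += gap;
            docIds.add(previous);
        }
        return docIds;
    }

    public static void main(String[] args) {
        // Child doc IDs under a few parents tend to be close together,
        // so most gaps fit in a single byte.
        byte[] encoded = encode(new int[] {100, 101, 105, 300});
        System.out.println(encoded.length);  // 5
        System.out.println(decode(encoded)); // [100, 101, 105, 300]
    }
}
```

Because each ID is recovered by adding a gap to the previous one, such a block
is decoded sequentially, and the handful of bytes covering one parent's
children is small compared with a 1K read.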



                
> Child Traversable To Parent Block Join Query
> --------------------------------------------
>
>                 Key: LUCENE-4768
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4768
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>         Environment: trunk
> git rev-parse HEAD
> 5cc88eaa41eb66236a0d4203cc81f1eed97c9a41
>            Reporter: Vadim Kirilchuk
>         Attachments: LUCENE-4768-draft.patch
>
>
> Hi everyone!
> Let me describe what I am trying to do:
> I have hierarchical documents ('car model' as parent, 'trim' as child) and
> use block join queries to retrieve them. However, I am not happy with the
> current behavior of ToParentBlockJoinQuery, which goes through all of a
> parent's children during the nextDoc call (accumulating scores and freqs).
> Consider the following example: you have a query with a custom post condition
> on top of such a BJQ, and during the post condition you traverse the scorer
> tree (doc-at-a-time) and want to manually advance the BJQ's child scorers one
> by one until the condition passes or the current parent has no more children.
> I am attaching a patch with a query (and some tests) similar to
> ToParentBlockJoin but with the ability to traverse children. (I have to do a
> weird instanceof check and cast inside my code.) This is only a draft, and I
> will be glad to hear whether anyone needs it or how we can improve it.
> P.S. I believe that the proposed query is more generic (lower level) than
> ToParentBJQ, and that ToParentBJQ could extend it and call nextChild()
> internally during nextDoc().
> Also, I think that the problem of traversing hierarchical documents is more
> complex, as Lucene has only the nextDoc API. What do you think about making
> the API more hierarchy-aware? A one-level document is a special case of a
> multi-level document, but not vice versa. WDYT?
> Thanks in advance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
