[jira] [Commented] (SOLR-12519) Support Deeply Nested Docs In Child Documents Transformer

David Smiley (JIRA) Thu, 09 Aug 2018 21:51:23 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575751#comment-16575751
 ]


David Smiley commented on SOLR-12519:
-------------------------------------

[~moshebla] I pushed a feature branch, as you can see.  I took your PR, and 
went a bit further with it.  

There's no "hierarchy" transformer param, or anonChildDocs transformer param 
either; it's driven by the presence of the nest path in the schema.  No new 
params from before.

One of the biggest changes you'll notice is I ripped out ToChildBlockJoinQuery 
which was really unnecessary for a scenario like this where we're really just 
doing an ID lookup.  But the silly thing is that we already have the low-level 
Lucene document ID for the root document in a parameter.  So we don't even need 
the uniqueKey field either.  It scans from the first child ID (even if it 
didn't match the childFilter).  This sets us up well for a future feature 
you've spoken of to get all prior sibling child documents.  It's unfortunate 
that we fetch the path only to potentially not need it if we haven't yet 
reached the first child matching the filter, but that should be a relatively 
minor cost (DocValues are designed for fast access).
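
To make the scan concrete, here is a minimal sketch of the idea in plain Java. 
It is not the code on the branch: java.util.BitSet stands in for Lucene's bit 
sets, the class and method names are hypothetical, and it assumes the usual 
block layout where children are indexed immediately before their root and a 
bit set marking all root documents is available.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Sketch only: models one index segment where children directly precede their root. */
final class ChildScanSketch {

    /**
     * Returns the doc IDs of the children of rootDocId that match childFilter.
     * rootDocs marks every root document (assumed to be available);
     * childFilter marks every document matching the child filter.
     */
    static List<Integer> matchingChildren(BitSet rootDocs, BitSet childFilter, int rootDocId) {
        // The block starts right after the previous root; previousSetBit
        // returns -1 when there is none, which makes the first child doc 0.
        int firstChildId = rootDocs.previousSetBit(rootDocId - 1) + 1;
        List<Integer> matches = new ArrayList<>();
        // Visit every child in the block, including those the filter rejects.
        for (int docId = firstChildId; docId < rootDocId; docId++) {
            if (childFilter.get(docId)) {
                matches.add(docId);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        BitSet rootDocs = new BitSet();
        rootDocs.set(3); // docs 0..2 are children of root 3
        rootDocs.set(7); // docs 4..6 are children of root 7
        BitSet childFilter = new BitSet();
        childFilter.set(1);
        childFilter.set(4);
        childFilter.set(6);
        System.out.println(matchingChildren(rootDocs, childFilter, 7));
    }
}
```

No uniqueKey lookup and no join query are involved here; the root's doc ID 
plus the root bit set are enough to bound the scan.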

There's a nocommit about supporting "limit".  If there is a limit... we may 
want to scan backwards from the root ID to detect which of them are in the 
child filter so we know how far back to go, and then perhaps initialize the 
loop from that point.  That doesn't sound too bad.  It's also an improvement 
over the previous limit processing which would go forward from the first... 
which seems worse than starting from the root.  This ought to be tested.
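
A rough sketch of that backward scan, again in plain Java with java.util.BitSet 
standing in for Lucene's bit sets.  The class, method, and parameter names are 
hypothetical, and this is only one reading of the idea, not something that 
exists on the branch.

```java
import java.util.BitSet;

/** Sketch only: find where the forward loop should start when a limit is set. */
final class LimitScanSketch {

    /**
     * Walks backwards from the doc just before the root and returns the doc ID
     * of the limit-th matching child encountered, or of the earliest matching
     * child if fewer than limit match.  Returns rootDocId (an empty range)
     * when nothing matches or limit is not positive.
     */
    static int startDocForLimit(BitSet childFilter, int firstChildId, int rootDocId, int limit) {
        int start = rootDocId;
        int found = 0;
        for (int docId = rootDocId - 1; docId >= firstChildId && found < limit; docId--) {
            if (childFilter.get(docId)) {
                found++;
                start = docId; // the forward loop can begin here
            }
        }
        return start;
    }

    public static void main(String[] args) {
        BitSet childFilter = new BitSet();
        childFilter.set(4);
        childFilter.set(6);
        // Children of root 7 are docs 4..6; with limit 1 the forward loop starts at doc 6.
        System.out.println(startDocForLimit(childFilter, 4, 7, 1));
    }
}
```

Under this reading, a limit keeps the matching children closest to the root 
rather than the first ones in the block, which is exactly the behavior that 
would need a test.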

I definitely made other changes too.  Maybe you can see these changes easiest 
in your IDE by comparing the branches.  At least I can do so easily in IntelliJ.

> Support Deeply Nested Docs In Child Documents Transformer
> ---------------------------------------------------------
>
>                 Key: SOLR-12519
>                 URL: https://issues.apache.org/jira/browse/SOLR-12519
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Priority: Major
>         Attachments: SOLR-12519-no-commit.patch
>
>          Time Spent: 16.5h
>  Remaining Estimate: 0h
>
> As discussed in SOLR-12298, to make use of the meta-data fields in 
> SOLR-12441, there needs to be a smarter child document transformer, which 
> provides the ability to rebuild the original nested documents' structure.
>  In addition, I propose that the transformer also have the ability to 
> bring only some of the original hierarchy, to prevent unnecessary block join 
> queries. e.g.
> {code}  {"a": "b", "c": [ {"e": "f"}, {"e": "g"} , {"h": "i"} ]} {code}
>  In case my query is for all the children of "a:b" which contain the key "e" 
> in them, the query will be broken into two parts:
>  1. The parent query "a:b"
>  2. The child query "e:*".
> If the only children flag is on, the transformer will return the following 
> documents:
>  {code}[ {"e": "f"}, {"e": "g"} ]{code}
> In case the flag was not turned on (perhaps the default state), the whole 
> document hierarchy will be returned, containing only the matching children:
> {code}{"a": "b", "c": [ {"e": "f"}, {"e": "g"} ]}{code}



