[ 
https://issues.apache.org/jira/browse/SOLR-12441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526208#comment-16526208
 ] 

Jan Høydahl commented on SOLR-12441:
------------------------------------

Elastic will always index nested objects as plain flat fields on the main 
document unless the mapping (schema) [explicitly defines a particular json-path 
as 
"nested"|https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html].
 I think this explicit definition makes sense for several reasons. We also need 
to make sure that users don't index two docs where one is adding a simple value 
to the "myChildren" field while another document adds a nested document below 
the same field. So it sounds like schema should have a way to define 
{{nested=true}} for certain fields or path.to.field so that the URP can know 
how to interpret a doc. That would also remove the need for guessing based on 
presence of id field or whatever, you just ask the {{IndexSchema}}. 
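
To make the idea concrete, here is a minimal sketch of how an URP might consult an explicit declaration instead of guessing. The {{nestedPaths}} set and the {{interpretField}} function are hypothetical stand-ins for an {{IndexSchema}} lookup, not existing Solr APIs; the sketch only illustrates the intended behaviour.

```javascript
// Hypothetical stand-in for asking the IndexSchema which paths are nested=true.
const nestedPaths = new Set(["myChildren"]);

// True for {...} values, false for scalars, arrays and null.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Decide how to treat the value found at `path`.
// Throws if a path declared as nested receives a simple value.
function interpretField(path, value, nested = nestedPaths) {
  const values = Array.isArray(value) ? value : [value];
  if (nested.has(path)) {
    if (!values.every(isPlainObject)) {
      throw new Error(`Field "${path}" is declared nested but received a simple value`);
    }
    return "nested";
  }
  // Anything not declared nested is indexed as plain flat fields.
  return "flat";
}
```

With this, a document that puts a plain string into "myChildren" is rejected up front, while one that puts a sub-document there is treated as nested without inspecting whether it has an id field.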

We then also need to handle the case where a sub doc wants to use the same 
field name as a parent and those are different types, e.g.

{code:javascript}
{ "id": 1, 
  "name" : "john", 
  "address" : "London", 
  "child" : { 
    "name" : "peter", 
    "address" : { 
      "street" : "oxford st 3", 
      "zip" : "12345"}}}
{code}

In ES this is legal, since the default type-guessing will create the Lucene 
fields "name", "address", "child.name", "child.address.street" and 
"child.address.zip". And in the case of nested docs, I guess the "address" 
field name would not share the same type in the mapping.
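
As a rough illustration of the flat style, the sketch below derives dotted field names from a source document. The function name {{flattenFieldNames}} is made up for this example, and it only mimics the naming, not Elasticsearch's actual type-guessing.

```javascript
// Collect a dotted field name for every leaf value in the document.
function flattenFieldNames(doc, prefix = "") {
  const names = [];
  for (const [key, value] of Object.entries(doc)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      // Sub-object: recurse and qualify the child names with the parent path.
      names.push(...flattenFieldNames(value, path));
    } else {
      names.push(path);
    }
  }
  return names;
}

const doc = {
  id: 1,
  name: "john",
  address: "London",
  child: { name: "peter", address: { street: "oxford st 3", zip: "12345" } },
};

console.log(flattenFieldNames(doc));
// [ 'id', 'name', 'address', 'child.name', 'child.address.street', 'child.address.zip' ]
```

Because every nested name is fully qualified, the top-level "address" (a string) and "child.address" (an object) never collide.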

So in order to tackle this we'd need to make some changes to the auto-guessing 
logic, and add the ability to use a fully qualified field name for the nested 
parts of a document, if we'd like to support both flat-style and nested-style 
from the same source document.

> Add deeply nested documents URP
> -------------------------------
>
>                 Key: SOLR-12441
>                 URL: https://issues.apache.org/jira/browse/SOLR-12441
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Assignee: David Smiley
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> As discussed in 
> [SOLR-12298|https://issues.apache.org/jira/browse/SOLR-12298], there ought to 
> be an URP to add metadata fields to childDocuments in order to allow a 
> transformer to rebuild the original document hierarchy.
> {quote}I propose we add the following fields:
>  # __nestParent__
>  # _nestLevel_
>  # __nestPath__
> __nestParent__: This field will store the document's parent docId, to be 
> used for building the whole hierarchy, using a new document transformer, as 
> suggested by Jan on the mailing list.
> _nestLevel_: This field will store the level of the specified field in the 
> document, using an int value. This field can be used for the parentFilter, 
> eliminating the need to provide a parentFilter, which will be set by default 
> as "_level_:queriedFieldLevel".
> __nestPath__: This field will contain the full path, separated by a specific 
> reserved char e.g., '.'
>  for example: "first.second.third".
>  This will enable users to search for a specific path, or provide a regular 
> expression to search for fields sharing the same name in different levels of 
> the document, filtering using the level key if needed.
> {quote}
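
The metadata fields proposed in the quoted description could be populated roughly as sketched below. This is only an illustration under stated assumptions: every document carries an "id", children appear as sub-objects (or arrays of sub-objects), the root sits at level 0, and '.' is the path separator. The function name {{annotateNested}} is hypothetical and is not the URP's actual code.

```javascript
// True for {...} values, false for scalars, arrays and null.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Return the document and all of its descendants as a flat list,
// each descendant annotated with _nestParent_, _nestLevel_ and _nestPath_.
// Assumption: arrays hold either only scalars or only sub-documents.
function annotateNested(doc, parentId = null, level = 0, path = "") {
  const annotated = { _nestLevel_: level };
  if (parentId !== null) {
    annotated._nestParent_ = parentId; // docId of the enclosing document
    annotated._nestPath_ = path;       // e.g. "first.second.third"
  }
  const out = [annotated];
  for (const [key, value] of Object.entries(doc)) {
    const children = (Array.isArray(value) ? value : [value]).filter(isPlainObject);
    if (children.length === 0) {
      annotated[key] = value; // ordinary field, copy as-is
      continue;
    }
    const childPath = path ? `${path}.${key}` : key;
    for (const child of children) {
      out.push(...annotateNested(child, doc.id, level + 1, childPath));
    }
  }
  return out;
}
```

A document transformer could then rebuild the hierarchy by following {{_nestParent_}} upwards and using {{_nestPath_}} to decide under which key each child belongs.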



