[
https://issues.apache.org/jira/browse/SOLR-12441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526208#comment-16526208
]
Jan Høydahl commented on SOLR-12441:
------------------------------------
Elastic will always index nested objects as plain flat fields on the main
document unless the mapping (schema) [explicitly defines a particular json-path
as
"nested"|https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html].
I think this explicit definition makes sense for several reasons. We also need
to make sure that users don't index two docs where one is adding a simple value
to the "myChildren" field while another document adds a nested document below
the same field. So it sounds like schema should have a way to define
{{nested=true}} for certain fields or path.to.field so that the URP can know
how to interpret a doc. That would also remove the need for guessing based on
the presence of an id field or whatever; you just ask the {{IndexSchema}}.
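A minimal sketch of that lookup, in Python rather than Solr's Java for brevity. The names {{NESTED_PATHS}} and {{classify_value}} are hypothetical and are not part of the {{IndexSchema}} API; the sketch only assumes the schema can answer "is this path declared nested?".

```python
# Hypothetical: paths that the schema declares with nested=true.
# This set stands in for a schema lookup; it is not a real Solr API.
NESTED_PATHS = {"myChildren", "child.address"}


def classify_value(path, value, nested_paths=NESTED_PATHS):
    """Decide how to index a value by asking the schema, not by guessing.

    Returns "nested" when the path is declared nested, otherwise "flat".
    Raises ValueError when a simple value is sent to a path declared nested,
    which is the conflicting case described above.
    """
    is_object = isinstance(value, dict) or (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(v, dict) for v in value)
    )
    if path in nested_paths:
        if not is_object:
            raise ValueError(f"{path!r} is declared nested but got a simple value")
        return "nested"
    # Undeclared paths are always flattened, even when the value is an object.
    return "flat"
```

With this shape, a doc that puts a plain string into "myChildren" is rejected up front instead of silently mixing with nested docs under the same field.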
We then also need to handle the case where a sub doc wants to use the same
field name as a parent and those are different types, e.g.
{code:javascript}
{
  "id": 1,
  "name": "john",
  "address": "London",
  "child": {
    "name": "peter",
    "address": {
      "street": "oxford st 3",
      "zip": "12345"
    }
  }
}
{code}
In ES this is legal, since the default type-guessing will create Lucene
fields "name", "address", "child.name", "child.address.street",
"child.address.zip". And in the case of nested docs, I guess the "address"
field name would not share the same type in the mapping.
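To make the flat-style naming concrete, here is a small Python sketch that flattens a JSON object into dotted field names. It only illustrates the naming scheme and is not how ES or Solr implement it; {{flatten}} is a hypothetical helper.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into a single {dotted.path: value} mapping."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-objects, extending the dotted prefix.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "id": 1,
    "name": "john",
    "address": "London",
    "child": {
        "name": "peter",
        "address": {"street": "oxford st 3", "zip": "12345"},
    },
}

fields = flatten(doc)
print(sorted(fields))
# ['address', 'child.address.street', 'child.address.zip', 'child.name', 'id', 'name']
```

Because every sub-field carries its full path, the top-level "address" (a string) and "child.address" (an object) never collide.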
So in order to tackle this we'd need to make some changes to the auto-guessing
logic, as well as add the ability to use a fully qualified field name for the
nested parts of a document, if we'd like to support both flat-style and
nested-style from the same source document.
> Add deeply nested documents URP
> -------------------------------
>
> Key: SOLR-12441
> URL: https://issues.apache.org/jira/browse/SOLR-12441
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: mosh
> Assignee: David Smiley
> Priority: Major
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> As discussed in
> [SOLR-12298|https://issues.apache.org/jira/browse/SOLR-12298], there ought to
> be an URP to add metadata fields to childDocuments in order to allow a
> transformer to rebuild the original document hierarchy.
> {quote}I propose we add the following fields:
> # __nestParent__
> # _nestLevel_
> # __nestPath__
> __nestParent__: This field will store the document's parent docId, to be
> used for building the whole hierarchy, using a new document transformer, as
> suggested by Jan on the mailing list.
> _nestLevel_: This field will store the level of the specified field in the
> document, using an int value. This field can be used for the parentFilter,
> eliminating the need to provide a parentFilter, which will be set by default
> as "_level_:queriedFieldLevel".
> __nestPath__: This field will contain the full path, separated by a specific
> reserved char e.g., '.'
> for example: "first.second.third".
> This will enable users to search for a specific path, or provide a regular
> expression to search for fields sharing the same name in different levels of
> the document, filtering using the level key if needed.
> {quote}
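The three proposed fields can be sketched as follows, in Python for brevity. This is an illustration under assumptions, not the URP itself: {{annotate}} is a hypothetical helper, every nested object is assumed to carry its own "id", and the exact field-name spelling ({{_nestParent_}}, {{_nestLevel_}}, {{_nestPath_}}) is assumed.

```python
def annotate(doc, parent_id=None, level=0, path=""):
    """Return the doc and all its descendants as a flat list.

    Each child doc is tagged with its parent's id, its depth, and its
    dot-separated path. Assumes every nested object has an "id" key.
    """
    own = {k: v for k, v in doc.items() if not isinstance(v, dict)}
    if parent_id is not None:
        # Only child documents receive the metadata fields.
        own["_nestParent_"] = parent_id
        own["_nestLevel_"] = level
        own["_nestPath_"] = path
    docs = [own]
    for key, value in doc.items():
        if isinstance(value, dict):
            child_path = f"{path}.{key}" if path else key
            docs.extend(annotate(value, doc["id"], level + 1, child_path))
    return docs


root = {"id": "1", "first": {"id": "2", "second": {"id": "3", "third": {"id": "4"}}}}
docs = annotate(root)
print(docs[3])
# {'id': '4', '_nestParent_': '3', '_nestLevel_': 3, '_nestPath_': 'first.second.third'}
```

A transformer could then rebuild the hierarchy by following the parent ids, or filter on the path or level to select docs at a particular position.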