[
https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730574#comment-16730574
]
David Smiley commented on SOLR-12768:
-------------------------------------
I toyed with creating a custom field type subclass but wound up thinking it's
rather unnecessary... if only Solr had some notion of implicit field types that
you could refer to without having to explicitly define them (as I mentioned on
the dev list). This isn't strictly necessary; I'm trying to both (a) avoid
implementation detail & bloat in users' schemas, and (b) use an approach
that could easily change in the future using luceneMatchVersion if we change
our mind on the default.
So I added this mechanism, with {{_nest_path_}} being an implicitly defined
field type that is registered on demand at first use via
{{IndexSchema.createImplicitFieldType()}}. I decided to factor out a static
method to create this specific field type and put it into
NestedUpdateProcessorFactory so as to keep related code together. I had to
make FieldType.setArgs public, which seemed fine. Eventually it'd be nice to
see primitive field types implicitly declared, which would be done directly in
the switch statement I added in createImplicitFieldType. I did _not_ enhance
the REST Schema mutation API to use this mechanism -- that's a follow-on TODO
and would need its own test.
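A rough sketch of the on-demand registration idea, as a toy model rather than
the patch itself. Everything here is an assumption made for illustration: the
classes ImplicitTypeSketch and ToyFieldType, the getFieldType lookup, and the
placeholder argument are invented, and only the method name
createImplicitFieldType and the switch on the type name mirror what is
described above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical stand-ins, not Solr's.
final class ImplicitTypeSketch {

  // Stand-in for a field type: a name plus its init args.
  static final class ToyFieldType {
    final String name;
    final Map<String, String> args;

    ToyFieldType(String name, Map<String, String> args) {
      this.name = name;
      this.args = args;
    }
  }

  // Types declared explicitly or already created implicitly.
  private final Map<String, ToyFieldType> fieldTypes = new HashMap<>();

  // Looks up a type, creating and registering an implicit one on first use.
  ToyFieldType getFieldType(String name) {
    ToyFieldType ft = fieldTypes.get(name);
    if (ft == null) {
      ft = createImplicitFieldType(name);
      if (ft != null) {
        fieldTypes.put(name, ft);
      }
    }
    return ft;
  }

  // Returns the implicit type for a known name, or null if there is none.
  static ToyFieldType createImplicitFieldType(String name) {
    switch (name) {
      case "_nest_path_":
        return createNestPathFieldType(name);
      // Further implicit (e.g. primitive) types would get their own cases.
      default:
        return null;
    }
  }

  // Kept as a separate static method so the nest-path specifics live together.
  static ToyFieldType createNestPathFieldType(String name) {
    Map<String, String> args = new HashMap<>();
    args.put("placeholderArg", "placeholderValue"); // invented, not a real Solr arg
    return new ToyFieldType(name, args);
  }

  // Number of types registered so far (explicit or implicit).
  int registeredCount() {
    return fieldTypes.size();
  }
}
```

In this model nothing is registered until the first lookup of
{{_nest_path_}}, and repeated lookups return the same instance, so a schema
that never uses nested paths carries no extra type.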
About the new analysis.... TestChildDocTransformerHierarchy has many test
methods that fail, and I have yet to update them to pass. Many fail because of a
combination of two factors: (a) they use an experimental syntax for the
"childFilter" parameter, defined in
org.apache.solr.response.transform.ChildDocTransformerFactory#processPathHierarchyQueryString,
that assumes a tokenization that allows certain queries to match, and (b) I
changed the text analysis in this patch, so (a)'s assumption no longer holds. I
think this can be fixed by a simple adjustment to the code building the query:
insert a leading wildcard if the input/query does not start with a '/'. Of
course leading wildcard queries are slow, but if the total number of unique
paths is very small (as I expect it should be) then it's fine. CC [~moshebla]
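The leading-wildcard adjustment could look roughly like the sketch below. This
is not the code in ChildDocTransformerFactory: the class NestPathPatternSketch
and both methods are hypothetical, the choice of "*/" (rather than a bare "*")
as the prefix is an assumption, and the matcher is a toy stand-in for a real
wildcard query, included only to show which paths would match.

```java
// Hypothetical helper: not the code in ChildDocTransformerFactory.
final class NestPathPatternSketch {

  // Leaves root-anchored input alone; otherwise prefixes a wildcard segment.
  // Assumption: "*/" is used so the match stays on a path-segment boundary.
  static String toPathPattern(String input) {
    if (input.startsWith("/")) {
      return input;
    }
    return "*/" + input;
  }

  // Toy matcher standing in for a wildcard query over the indexed path:
  // a "*/x" pattern matches any path ending in "/x"; anything else is exact.
  static boolean matches(String pattern, String path) {
    if (pattern.startsWith("*/")) {
      return path.endsWith(pattern.substring(1));
    }
    return path.equals(pattern);
  }
}
```

With these assumptions, an unanchored input such as "ingredients" becomes
"*/ingredients" and matches "/toppings/ingredients", while an input that
already starts with '/' is passed through and must match the full path.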
> Determine how _nest_path_ should be analyzed to support various use-cases
> -------------------------------------------------------------------------
>
> Key: SOLR-12768
> URL: https://issues.apache.org/jira/browse/SOLR-12768
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Blocker
> Fix For: master (8.0)
>
> Attachments: SOLR-12768.patch
>
>
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents
> support, and we loosely know what goes in it. From a DocValues perspective,
> we've got it down; though we might tweak it. From an indexing (text
> analysis) perspective, we're not quite sure yet, though we've got a test
> schema, {{schema-nest.xml}} with a decent shot at it. Ultimately, how we
> index it will depend on the query/filter use-cases we need to support. So
> we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we
> also potentially add a few tests for some of these cases, and/or if we also
> add a FieldType to make declaring it as easy as a one-liner. A FieldType
> would have other benefits too once we're ready to make querying on the path
> easier.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)