[ https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730574#comment-16730574 ]
David Smiley commented on SOLR-12768:
-------------------------------------

I toyed with creating a custom field type subclass but I wound up thinking it's rather unnecessary... if only Solr had some notion of implicit field types that you could refer to without having to explicitly define them (as I mentioned on the dev list). This isn't strictly necessary; I'm trying to both (a) avoid implementation detail & bloat in users' schemas, and (b) use an approach that could easily change in the future using luceneMatchVersion if we change our mind on the default.

So I added this mechanism, with {{_nest_path_}} being an implicitly defined field type that is registered on demand upon first use via {{IndexSchema.createImplicitFieldType()}}. I decided to factor out a static method to create this specific field type and put it into NestedUpdateProcessorFactory so as to keep related code together. I had to make FieldType.setArgs public, which seemed fine. Eventually it'd be nice to see primitive field types implicitly declared, which would be done directly in the switch statement I added in createImplicitFieldType. I did _not_ enhance the REST Schema mutation API to use this mechanism -- that's a follow-on TODO and would need its own test.

About the new analysis.... TestChildDocTransformerHierarchy has many test methods that fail. I have yet to update them to pass. Many fail because of a combination of two factors: (a) they use an experimental syntax for the "childFilter" parameter, defined in org.apache.solr.response.transform.ChildDocTransformerFactory#processPathHierarchyQueryString, that assumes a tokenization that allows certain queries to match, and (b) I changed the text analysis in this patch, so the assumption in (a) no longer holds. I think this can be fixed by a simple adjustment to the code building the query: insert a leading wildcard if the input/query does not start with a '/'.
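As a rough illustration of that adjustment (not the code from the attached patch), the idea could be sketched in plain Java as below. The class name {{NestPathPatternSketch}}, the method {{toPathPattern}}, and the exact wildcard form ({{*/}} prepended to the path) are all assumptions made for this example; the real change would live in the query-building code of ChildDocTransformerFactory and may look different.

```java
// Hypothetical sketch only: none of these names come from Solr or the patch.
public final class NestPathPatternSketch {

    private NestPathPatternSketch() {}

    /**
     * Returns the pattern to match against the nest path field.
     * A path that starts with '/' is treated as anchored at the root and is
     * returned unchanged; any other path gets a leading wildcard so that it
     * can match at any depth. The "*" + "/" prefix is an assumed convention.
     */
    static String toPathPattern(String childPath) {
        if (childPath == null || childPath.isEmpty()) {
            throw new IllegalArgumentException("childPath must be non-empty");
        }
        return childPath.startsWith("/") ? childPath : "*/" + childPath;
    }

    public static void main(String[] args) {
        System.out.println(toPathPattern("/toppings"));             // prints "/toppings"
        System.out.println(toPathPattern("toppings/ingredients"));  // prints "*/toppings/ingredients"
    }
}
```

The anchored case stays a cheap exact or prefix match; only relative paths pay for the leading wildcard.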
Of course leading wildcard queries are slow, but if the total number of unique paths is very small (as I expect it should be) then it's fine.

CC [~moshebla]

> Determine how _nest_path_ should be analyzed to support various use-cases
> -------------------------------------------------------------------------
>
>                 Key: SOLR-12768
>                 URL: https://issues.apache.org/jira/browse/SOLR-12768
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Blocker
>             Fix For: master (8.0)
>
>         Attachments: SOLR-12768.patch
>
>
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents
> support, and we loosely know what goes in it. From a DocValues perspective,
> we've got it down, though we might tweak it. From an indexing (text
> analysis) perspective, we're not quite sure yet, though we've got a test
> schema, {{schema-nest.xml}}, with a decent shot at it. Ultimately, how we
> index it will depend on the query/filter use-cases we need to support. So
> we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we
> also potentially add a few tests for some of these cases, and/or if we also
> add a FieldType to make declaring it as easy as a one-liner. A FieldType
> would have other benefits too once we're ready to make querying on the path
> easier.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)