[ 
https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16724260#comment-16724260
 ] 

David Smiley commented on SOLR-12768:
-------------------------------------

Simple proposal:
* Use a new FieldType subclass to simplify upgrades and enable ease of use
* Use one index token instead of path tokenizing at this stage.  This is 
lighter-weight when a user might not even need/want to query on it.  Instead, 
queries would use wildcards on it to express relationships.  Some day in the 
future, someone could make an easy to use query parser and/or query language 
that would build the appropriate wildcard patterns.
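
As a rough illustration of the wildcard idea (not Solr code), the Python sketch 
below emulates Lucene-style wildcard matching, where * matches any run of 
characters and ? matches exactly one, against whole single-token path values.  
The sample paths, the two wildcard patterns, and the helper names are 
assumptions made up for this example; backslash escaping is not handled.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a simple wildcard (* = any run, ? = one char) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern, token):
    """True if the wildcard pattern matches the entire token."""
    return wildcard_to_regex(pattern).fullmatch(token) is not None


# Hypothetical single-token path values, one per child document.
paths = ["toppings/", "toppings/ingredients/", "sauces/ingredients/"]

# Everything at or below "toppings/" (the * may also match nothing).
toppings_subtree = [p for p in paths if matches("toppings/*", p)]

# Any "ingredients" level that sits below some parent.
nested_ingredients = [p for p in paths if matches("*/ingredients/", p)]
```

Because the whole path is one token, each relationship is expressed as a single 
pattern over the full string, which is the kind of pattern a future query 
parser could generate.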

The index analyzer would simply be the indexed equivalent of:
{code:xml}
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <!-- remove the # and array index digits from the path: toppings#1/ingredients#/ turns to toppings/ingredients/ -->
      <filter class="solr.PatternReplaceFilterFactory" pattern="#\d*" replace="all"/>
{code}
Note that the last pattern is simplified and fixes a bug in the current test, 
whose pattern matches all digits instead of only those after a pound sign (#).  
I wrote a unit test for that fix.
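
To make the difference concrete, here is a small Python sketch of the 
replacement (Python's regex semantics for #\d* should match Java's here).  The 
"too broad" pattern is only a hypothetical stand-in for a pattern that strips 
every digit; it is not copied from the existing test schema, and the function 
names and the "layer2" path are invented for illustration.

```python
import re

# Pattern from the proposal: a pound sign plus any digits right after it.
PROPOSED = re.compile(r"#\d*")

# Hypothetical overly broad pattern: strips pound signs AND every digit.
TOO_BROAD = re.compile(r"#|\d+")


def normalize(path):
    """Drop array-index markers such as '#1' or a bare '#' from a path."""
    return PROPOSED.sub("", path)


def normalize_too_broad(path):
    """Same idea, but also eats digits that belong to a field name."""
    return TOO_BROAD.sub("", path)


# Example from the XML comment above.
from_comment = normalize("toppings#1/ingredients#/")

# A made-up field name containing a digit shows the difference.
kept_digit = normalize("layer2#0/ingredients#/")
lost_digit = normalize_too_broad("layer2#0/ingredients#/")
```

With the proposed pattern the digit in the hypothetical field name "layer2" 
survives, while the broader pattern would turn it into "layer".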

CC [~moshebla]


> Determine how _nest_path_ should be analyzed to support various use-cases
> -------------------------------------------------------------------------
>
>                 Key: SOLR-12768
>                 URL: https://issues.apache.org/jira/browse/SOLR-12768
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: David Smiley
>            Priority: Blocker
>             Fix For: master (8.0)
>
>
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents 
> support, and we loosely know what goes in it.  From a DocValues perspective, 
> we've got it down; though we might tweak it.  From an indexing (text 
> analysis) perspective, we're not quite sure yet, though we've got a test 
> schema, {{schema-nest.xml}} with a decent shot at it.  Ultimately, how we 
> index it will depend on the query/filter use-cases we need to support.  So 
> we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we 
> also potentially add a few tests for some of these cases, and/or if we also 
> add a FieldType to make declaring it as easy as a one-liner.  A FieldType 
> would have other benefits too once we're ready to make querying on the path 
> easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
