[ 
https://issues.apache.org/jira/browse/SOLR-16810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17724701#comment-17724701
 ] 

Shawn Heisey commented on SOLR-16810:
-------------------------------------

Solr does not support all characters in field names.  The documentation 
specifically warns against using characters like these:

[https://solr.apache.org/guide/solr/latest/indexing-guide/fields.html#field-properties]
{quote}Field names should consist of alphanumeric or underscore characters only 
and not start with a digit. This is not currently strictly enforced, but other 
field names will not have first class support from all components and back 
compatibility is not guaranteed. Names with both leading and trailing 
underscores (e.g., {{_version_}}) are reserved.
{quote}
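
As a rough illustration of that guideline, here is a minimal sketch of a check that could be run on field names before they are added to a schema. The class and method names are hypothetical and this is not Solr code. It assumes "alphanumeric" means ASCII letters and digits, and it treats any name of three or more characters with both a leading and a trailing underscore as reserved.

```java
import java.util.regex.Pattern;

public class FieldNameCheck {
    // Assumption: "alphanumeric" is read as ASCII letters and digits.
    // The first character may be a letter or underscore, never a digit.
    private static final Pattern RECOMMENDED =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** True if the name follows the documented recommendation. */
    static boolean isRecommended(String name) {
        return RECOMMENDED.matcher(name).matches();
    }

    /** True if the name has both a leading and a trailing underscore. */
    static boolean isReserved(String name) {
        return name.length() > 2 && name.startsWith("_") && name.endsWith("_");
    }

    public static void main(String[] args) {
        String[] samples = {"title_s", "2fast", "bad\u0014name", "has#20;hash", "_version_"};
        for (String s : samples) {
            System.out.println(isRecommended(s) + " " + isReserved(s));
        }
    }
}
```

Names that fail such a check, including the ones in this issue that contain {{#}} or control characters, fall outside what the documentation promises to support.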
 

> Under certain situations Solr produces managed schema XML that cannot be 
> loaded
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-16810
>                 URL: https://issues.apache.org/jira/browse/SOLR-16810
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Schema and Analysis
>    Affects Versions: 9.2.1
>            Reporter: Thiruvalluvan M. G.
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> While persisting the {{ManagedIndexSchema}} as XML, non-printable characters 
> in field names get escaped as {{#nn;}}, where {{nn}} is the decimal 
> representation of the non-printable character. For example, if the field name 
> has the byte {{0x14}}, it gets escaped as {{#20;}}. This is 
> indistinguishable from the literal {{#20;}} in the field name. If we have two 
> fields - one with the non-printable character and the other with the literal 
> string, two fields get generated with the same name. Loading the resulting 
> XML, naturally, causes an exception. To fix this, any occurrence of literal 
> {{#}} in the field name should be escaped, with, say, {{##}}.
> A second problem is that while escaping happens when generating XML, the 
> corresponding unescaping does not happen on loading it. This asymmetry should 
> be fixed as well.
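
The collision described in the issue, and the effect of the proposed {{##}} escaping, can be sketched as follows. This is a standalone illustration, not Solr's actual serialization code. The class and method names are hypothetical, and "non-printable" is assumed here to mean characters below {{0x20}}.

```java
public class FieldNameEscaping {

    /** Scheme as described in the issue: control chars become #nn; and '#' is left alone. */
    static String naiveEscape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c < 0x20) {
                sb.append('#').append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Proposed variant: additionally write a literal '#' as "##". */
    static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '#') {
                sb.append("##");
            } else if (c < 0x20) {
                sb.append('#').append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Inverse of escape(): "##" becomes '#', and "#nn;" becomes the character with code nn. */
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < escaped.length()) {
            char c = escaped.charAt(i);
            if (c != '#') {
                sb.append(c);
                i++;
            } else if (i + 1 < escaped.length() && escaped.charAt(i + 1) == '#') {
                sb.append('#');
                i += 2;
            } else {
                int end = escaped.indexOf(';', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated escape at index " + i);
                }
                sb.append((char) Integer.parseInt(escaped.substring(i + 1, end)));
                i = end + 1;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String withControlChar = "a\u0014b"; // contains the byte 0x14
        String withLiteral = "a#20;b";       // contains the literal text #20;

        // The naive scheme maps both names to the same string.
        System.out.println(naiveEscape(withControlChar).equals(naiveEscape(withLiteral)));

        // With '#' doubled, the two names stay distinct and round-trip.
        System.out.println(escape(withControlChar).equals(escape(withLiteral)));
        System.out.println(unescape(escape(withLiteral)).equals(withLiteral));
    }
}
```

In this sketch a {{#}} in the escaped output is always followed by either another {{#}} or a decimal number, so reading left to right is unambiguous. Pairing {{escape}} with a matching {{unescape}} on load also addresses the asymmetry mentioned at the end of the description.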



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
