[ 
https://issues.apache.org/jira/browse/SOLR-16810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17724862#comment-17724862
 ] 

Thiruvalluvan M. G. commented on SOLR-16810:
--------------------------------------------

[~gus] We are not escaping {{{}#{}}}. But we encode control characters {{0x00}} 
to {{0x1f}} as {{{}#nn;{}}}. I've explained it in the description of this 
ticket along with the ambiguity it creates. For example {{"\u0014"}} and 
{{"#20;"}} both get encoded as {{{}"#20"{}}}. If a schema had two fields with 
these names, then the XML has duplicate field name and the XML cannot be parsed 
any more.

> Under certain situations Solr produces managed schema XML that cannot be 
> loaded
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-16810
>                 URL: https://issues.apache.org/jira/browse/SOLR-16810
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Schema and Analysis
>    Affects Versions: 9.2.1
>            Reporter: Thiruvalluvan M. G.
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> While persisting the {{ManagedIndexSchema}} as XML, non-printable characters 
> in field names get escaped as {{{}#nn;{}}}, where {{nn}} is the decimal 
> representation of the non-printable character. For example, if the field name 
> has the byte {{{}0x14{}}}, it gets escaped as {{{}#20;{}}}. This in 
> indistinguishable from the literal {{#20;}} in the field name. If we have two 
> fields - one with the non-printable character and the other with the literal 
> string, two fields get generated with the same name. Loading the resulting 
> XML, naturally, causes an exception. To fix this, any occurrence of literal 
> {{#}} in the field name should be escaped, with say {{{}##{}}}.
> A second problem is that while escaping happens when generating XML, the 
> corresponding unescaping does not happen on loading it. This asymmetry should 
> be fixed as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to