Thiruvalluvan M. G. created SOLR-16810:
------------------------------------------

             Summary: Under certain situations Solr produces managed schema XML 
that cannot be loaded
                 Key: SOLR-16810
                 URL: https://issues.apache.org/jira/browse/SOLR-16810
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Schema and Analysis
    Affects Versions: 9.2.1
            Reporter: Thiruvalluvan M. G.


While persisting the {{ManagedIndexSchema}} as XML, non-printable characters in 
field names get escaped as {{#nn;}}, where {{nn}} is the decimal 
representation of the non-printable character. For example, if the field name 
has the byte {{0x14}}, it gets escaped as {{#20;}}. This is 
indistinguishable from the literal string {{#20;}} in a field name. If the schema 
has two fields, one with the non-printable character and the other with the 
literal string, the generated XML contains two fields with the same name, and 
loading it fails with an exception. To fix this, any occurrence of a literal 
{{#}} in the field name should itself be escaped, say as {{##}}.
A second problem is that while escaping happens when generating XML, the 
corresponding unescaping does not happen on loading it. This asymmetry should 
be fixed as well.
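
The two fixes proposed above can be sketched together as a reversible escape/unescape pair. This is only an illustration under assumptions: the class {{FieldNameEscapeSketch}} and its methods are hypothetical and not part of Solr, and only characters below {{0x20}} are treated as non-printable here.

```java
/** Hypothetical helper, not part of Solr: a reversible escape for field names. */
final class FieldNameEscapeSketch {

  /** Escapes a literal '#' as "##" and any char below 0x20 as "#<decimal>;". */
  static String escape(String name) {
    StringBuilder out = new StringBuilder(name.length() + 8);
    for (int i = 0; i < name.length(); i++) {
      char ch = name.charAt(i);
      if (ch == '#') {
        out.append("##"); // proposed fix: make the escape marker itself unambiguous
      } else if (ch < 0x20) {
        out.append('#').append((int) ch).append(';');
      } else {
        out.append(ch);
      }
    }
    return out.toString();
  }

  /** Inverse of escape(); throws IllegalArgumentException on malformed input. */
  static String unescape(String escaped) {
    StringBuilder out = new StringBuilder(escaped.length());
    int i = 0;
    while (i < escaped.length()) {
      char ch = escaped.charAt(i);
      if (ch != '#') {
        out.append(ch);
        i++;
      } else if (i + 1 < escaped.length() && escaped.charAt(i + 1) == '#') {
        out.append('#'); // "##" decodes to a literal '#'
        i += 2;
      } else {
        int end = escaped.indexOf(';', i + 1);
        if (end < 0) {
          throw new IllegalArgumentException("Unterminated escape at index " + i);
        }
        // NumberFormatException is a subclass of IllegalArgumentException
        int code = Integer.parseInt(escaped.substring(i + 1, end));
        if (code < 0 || code >= 0x20) {
          throw new IllegalArgumentException("Unexpected escape code " + code);
        }
        out.append((char) code);
        i = end + 1;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String withControlChar = "a\u0014"; // field name containing 0x14
    String withLiteral = "a#20;";       // field name containing the literal text
    System.out.println(escape(withControlChar)); // a#20;
    System.out.println(escape(withLiteral));     // a##20;
    System.out.println(unescape(escape(withLiteral)).equals(withLiteral)); // true
  }
}
```

With this scheme the two field names from the example no longer collide in the persisted form, and applying {{unescape}} on load restores the original names.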



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
