I had a quick look at the code and couldn't find anything that looked
suspicious, Christopher. There is some state, but it is private, created in
the class's static initializers, and not changed anywhere that I could find.
Interested to learn what's causing this issue in your environment. Keep us 
posted.
Cheers
Bruno


    On Friday, 17 January 2020, 7:40:44 am NZDT, Christopher Schultz 
<[email protected]> wrote:  
 
All,

In the past week, I've received reports of our servers starting to
incorrectly escape XML strings with consumer errors like this:

org.xml.sax.SAXParseException: The entity "rsquo" was referenced, but
not declared.

When looking at the raw text being generated, it's clear that, indeed,
the text is being escaped as if it were HTML (where the &rsquo; entity
is defined) instead of XML.
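
For reference, the consumer-side failure can be reproduced with the JDK's
built-in parser alone. This is only a minimal sketch: the class name
EntityCheck, the helper parses(), and the sample documents are made up for
illustration and are not taken from our application.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks whether a string is well-formed XML.
final class EntityCheck {

    /** Returns true if the text parses as well-formed XML, false otherwise. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps stderr quiet.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Includes SAXParseException for an undeclared entity.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // HTML-only named entity: not declared in plain XML.
        System.out.println(parses("<msg>It&rsquo;s</msg>"));   // false
        // Numeric character reference: always valid in XML.
        System.out.println(parses("<msg>It&#8217;s</msg>"));   // true
        // The raw character itself (U+2019) is also fine.
        System.out.println(parses("<msg>It\u2019s</msg>"));    // true
    }
}
```

Only the five predefined entities (amp, lt, gt, quot, apos) are available to
an XML document that has no DTD, which is why &rsquo; is rejected.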

The code path is a little convoluted, and I'm going to try to get the
smallest reproducible test case I can, but I thought I'd reach out
early to see if anyone has any "aha" guidance for me before I tear out
a whole lot of hair following this down the rabbit hole.

This is commons-text-1.1. I've looked at the release notes between 1.1
and 1.8 and I don't see anything immediately that looks like a bugfix.

The data is coming from a database, and the string is clearly correct,
and it includes a "typographic right apostrophe", which is accurately
&rsquo; in HTML.

The output is being generated by Apache Velocity, through a macro
which escapes XML for us. The code in the template looks like this:

#xmlEscape($foo)

Where $foo is a string value containing this character: ’

The xmlEscape macro is defined in our global macros file which gets
evaluated on startup:

#macro (xmlEscape $text)#if($text)$!modernEscape.escapeXml10($text.toString())#end#end

$modernEscape is an instance of
org.apache.commons.text.StringEscapeUtils in the global scope; it's
like "application" scope for webapps, but it's in Velocity.

When we first start our web application, all seems well. After some
time, this process breaks and we start emitting "&rsquo;" instead of "’".
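
One way to catch the moment a node goes bad might be to scan the generated
text for named entities that XML does not predefine. The sketch below is
only an illustration under that assumption; the class NamedEntityScan and
the method undeclaredEntities() are hypothetical names, not part of
commons-text or Velocity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic: lists named entities that plain XML does not define.
final class NamedEntityScan {

    // The five entities every XML document may use without a DTD.
    private static final Set<String> XML_PREDEFINED =
        new HashSet<>(Arrays.asList("amp", "lt", "gt", "quot", "apos"));

    // Matches named references such as &rsquo; but not numeric ones like &#8217;.
    private static final Pattern NAMED = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /** Returns the names of entities in the text that XML does not predefine. */
    static List<String> undeclaredEntities(String escaped) {
        List<String> found = new ArrayList<>();
        Matcher m = NAMED.matcher(escaped);
        while (m.find()) {
            if (!XML_PREDEFINED.contains(m.group(1))) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(undeclaredEntities("It&rsquo;s &amp; done")); // [rsquo]
        System.out.println(undeclaredEntities("a &lt; b &#8217;"));      // []
    }
}
```

Logging a non-empty result together with the node name and a timestamp
would at least narrow down when the switch to HTML-style output happens.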

I can find no evidence of any of the following:

1. multiple versions of commons-text library
2. multiple versions of org.apache.commons.text.StringEscapeUtils in
any library
3. any component replacing the value of $modernEscape
4. any component replacing the definition of the #xmlEscape macro

When the first report came in, we tried replicating the reporter's
experience and we could see it on one server node but not others. We
restarted that web application on that node and it started working
properly again.

Does StringEscapeUtils.escape* keep any state associated with what
it's doing? We aren't doing anything weird: just calling
StringEscapeUtils.escapeXml10 ... a lot of times, probably from many
threads.
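
To test the "many threads" angle in isolation, something like the harness
below could be used. It is a sketch only: EscapeStress, distinctOutputs()
and the stand-in escaper are hypothetical and exist so the example runs
with the JDK alone. In the real application the escaper argument would be
StringEscapeUtils::escapeXml10.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical stress harness: calls one escaper concurrently on one input.
final class EscapeStress {

    /** Runs the escaper from several threads and returns every distinct output seen. */
    static Set<String> distinctOutputs(UnaryOperator<String> escaper, String input,
                                       int threads, int callsPerThread) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> seen = ConcurrentHashMap.newKeySet();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < callsPerThread; i++) {
                        seen.add(escaper.apply(input));
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // propagate any exception thrown by a worker
            }
        } finally {
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in escaper (assumption): replace with StringEscapeUtils::escapeXml10.
        UnaryOperator<String> standIn = s ->
            s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        Set<String> outputs = distinctOutputs(standIn, "Tom & Jerry\u2019s <show>", 8, 10_000);
        // A stateless, thread-safe escaper yields exactly one distinct output.
        System.out.println(outputs.size() + " distinct output(s): " + outputs);
    }
}
```

If more than one distinct output ever shows up for the same input, that
would point at shared mutable state somewhere in the escaping path.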

Any ideas?

-chris
