On 11/30/22 13:44, Matthew Castrigno wrote:
Using SOLR 9.0 and the ScriptUpdateProcessor, it appears SOLR is erroneously
adding ",&#8203;" in the middle of a string field.
The script just logs the fields. If you compare the curl request with what is
logged you see the addition of many instances of ,&#8203; in the content field.
This only happens on the Logging tab of the admin UI. In the JavaScript
file at server/solr-webapp/webapp/js/angular/controllers/logging.js I
found the following line:
event.message = event.message.replace(/,/g, ',&#8203;');
HTML character code 8203 is the Unicode "zero width space" character. I
think the admin UI code is trying to make long comma-separated lists in
log entries word-wrap better, but the browser is treating the entity
as literal text rather than decoding it. This is NOT in the data
being indexed; it is only in the log display. It's definitely a display
bug, but it doesn't affect the data being indexed.
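
To illustrate what I suspect is going on, here is a minimal sketch in plain JavaScript. The function names are hypothetical and none of this is taken from the Solr source; escapeHtml is only a stand-in for whatever escaping the UI framework applies when it renders a value as text. I have not confirmed that this is the actual cause.

```javascript
// Hypothetical helpers for illustration; not code from the Solr admin UI.

// Appends the HTML entity after each comma, like the line quoted above.
function addBreaksWithEntity(message) {
  return message.replace(/,/g, ',&#8203;');
}

// Alternative: append the actual zero width space character (U+200B).
function addBreaksWithCharacter(message) {
  return message.replace(/,/g, ',\u200B');
}

// Stand-in (assumption) for the escaping a template engine performs
// when a value is rendered as text instead of as HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const sample = 'a,b,c';

// The ampersand of the entity gets escaped, so the browser shows the
// entity as visible text after every comma.
console.log(escapeHtml(addBreaksWithEntity(sample)));

// The real character passes through escaping unchanged and stays
// invisible, while still giving the browser a place to wrap the line.
console.log(escapeHtml(addBreaksWithCharacter(sample)));
```

If that guess is right, emitting the character itself rather than the entity would keep the word-wrap behavior without the visible text, but that would need to be checked against how the logging page actually renders messages.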
Here you can see the same thing happening with my server running
9.2.0-SNAPSHOT:
https://www.dropbox.com/s/77yc9bovxwaauu6/solr-logging-html-8203.png?dl=0
I checked solr.log and that text is NOT there. I bet if you check
solr.log you will also find that it is not there.
Requests to the URL in my screenshot that do not come from specific IP
addresses are blocked. Those requests never get beyond the reverse proxy.
Thanks,
Shawn