Oh, I got it, that's not an indexing error! It seems I need to remove all the control characters in the range [\x00-\x1F] (except \x09 TAB, \x0A LF, \x0D CR) first.
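
For anyone who hits the same problem, here is a minimal sketch of that cleanup step in Python. It assumes the XML is available as a string before it is posted to Solr. The helper name strip_invalid_xml_chars and the sample document are made up for illustration, and the pattern covers only the control-character range mentioned above (XML 1.0 also excludes a few other code points, such as \uFFFE and \uFFFF, which this does not handle).

```python
import re
import xml.etree.ElementTree as ET

# Control characters in \x00-\x1F that XML 1.0 does not allow:
# everything except TAB (\x09), LF (\x0A) and CR (\x0D).
INVALID_XML_CTRL = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with the disallowed control characters removed."""
    return INVALID_XML_CTRL.sub("", text)


# Hypothetical update document containing a raw Ctrl-Z (\x1A).
raw = '<add><doc><field name="isbn13">978\x1a0000000000</field></doc></add>'

try:
    ET.fromstring(raw)
    parsed_raw = True
except ET.ParseError:
    # The parser rejects the document before any indexing could happen.
    parsed_raw = False

clean = strip_invalid_xml_chars(raw)
root = ET.fromstring(clean)  # parses once the control character is gone
```

The try/except part only illustrates the point made in the quoted reply: the failure happens while the XML is being parsed, so it has to be fixed in the input itself.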
Thanks a lot!

________________________________
From: Shawn Heisey <apa...@elyograg.org>
Sent: Tuesday, June 9, 2020 3:19 PM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Re: Fw: TolerantUpdateProcessorFactory not functioning

I tried your example XML as it is shown in your original message, saved to a file named "foo.xml", and didn't have any trouble. I wasn't even using the tolerant update processor. I just fired up the techproducts example on a solr-8.3.0 download I already had, added a field named "isbn13" (string type) so the schema was compatible, and tried the following command:

    curl "http://localhost:8983/solr/techproducts/update" -H 'Content-Type: text/xml; charset=utf-8' -d @foo.xml

I then tried it again with the ^Z (which is two characters) replaced by an actual Ctrl-Z character. When I did that, I got exactly the same error you did.

A Ctrl-Z character (ascii code 26) is *NOT* a valid character for XML, which is why you're getting the error.

The tolerant update processor can't ignore errors in the actual format of the input ... it only ignores errors during *indexing*. This error occurred during the input parsing, not during indexing, so the update processor could not ignore it.

Thanks,
Shawn