Oh, I get it now: that's not an indexing error!
It seems I need to remove all the characters in the range [\x00-\x1F] (except \x09
TAB, \x0A LF, and \x0D CR) first.
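
A minimal sketch of that cleanup in Python, assuming the XML is available as a
string before it is posted. The function name and the sample value are made up
for illustration, and it only covers the control range mentioned above, not
every code point that XML 1.0 disallows.

```python
import re

# Control characters in \x00-\x1F that are not allowed in XML 1.0,
# i.e. everything except TAB (\x09), LF (\x0A) and CR (\x0D).
_INVALID_XML_CTRL = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with the disallowed control characters removed."""
    return _INVALID_XML_CTRL.sub("", text)


if __name__ == "__main__":
    # Hypothetical field value containing a real Ctrl-Z (\x1a).
    dirty = '<field name="isbn13">978\x1a0000000000</field>\n'
    print(repr(strip_invalid_xml_chars(dirty)))
```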

Thanks a lot!




________________________________
From: Shawn Heisey <apa...@elyograg.org>
Sent: Tuesday, June 9, 2020 3:19 PM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Re: Fw: TolerantUpdateProcessorFactory not functioning


I tried your example XML as it is shown in your original message, saved
to a file named "foo.xml", and didn't have any trouble.  I wasn't even
using the tolerant update processor.   I just fired up the techproducts
example on a solr-8.3.0 download I already had, added a field named
"isbn13" (string type) so the schema was compatible, and tried the
following command:

curl "http://localhost:8983/solr/techproducts/update" -H 'Content-Type: text/xml; charset=utf-8' -d @foo.xml

I then tried it again with the ^Z (which is two characters) replaced by
an actual Ctrl-Z character.  When I did that, I got exactly the same
error you did.

A Ctrl-Z character (ascii code 26) is *NOT* a valid character for XML,
which is why you're getting the error.
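
The difference between the two inputs can be reproduced outside Solr. This is
only a sketch using Python's standard-library XML parser, which is not the
parser Solr uses, and the helper name and sample documents are made up; it
simply shows that a conforming parser accepts the literal two-character "^Z"
but rejects a real Ctrl-Z.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents: one with the two characters '^' and 'Z',
# one with an actual Ctrl-Z (ASCII code 26, \x1a).
literal_caret_z = "<add><doc><field name='id'>a^Zb</field></doc></add>"
real_ctrl_z = "<add><doc><field name='id'>a\x1ab</field></doc></add>"

if __name__ == "__main__":
    print(is_well_formed(literal_caret_z))
    print(is_well_formed(real_ctrl_z))
```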

The tolerant update processor can't ignore errors in the actual format
of the input ... it only ignores errors during *indexing*.  This error
occurred during the input parsing, not during indexing, so the update
processor could not ignore it.

Thanks,
Shawn
