On 11/27/2012 1:07 PM, Timo Sirainen wrote:
On 27.11.2012, at 17.38, Daniel L. Miller wrote:

On 11/27/2012 7:28 AM, Daniel L. Miller wrote:
On 11/26/2012 10:08 PM, Timo Sirainen wrote:
On 27.11.2012, at 7.50, Timo Sirainen wrote:

Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8))
at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
I took a brief look at the code and, as usual, I'm probably wrong, but I
believe the protection comes from the xml_encode functions.  Could it be that
some Solr writes don't go through those functions because the data in question
is assumed not to need that processing?  Things like mailbox names, field
names, or UIDs SHOULDN'T contain any garbage, but maybe something is creeping
in?
I did go through the code looking for that a few times already but didn't 
notice anything. I went through it once more, and finally found the problem. :) 
http://hg.dovecot.org/dovecot-2.1/rev/6a97faf3e500
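
For anyone following the thread, the kind of protection being discussed can be
sketched roughly as below. This is only an illustration in Python, not
Dovecot's xml_encode code and not what the changeset above does. The names
xml_safe and decode_utf8_lossy are made up for the example, and replacing bad
input with U+FFFD is an assumption about the desired behaviour. The idea is
that every string written into the Solr XML goes through one function that
drops characters XML 1.0 forbids (such as the code 8 in the log) and then
escapes the markup characters.

```python
import re
from xml.sax.saxutils import escape

# Matches anything outside the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def decode_utf8_lossy(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


def xml_safe(text: str, replacement: str = "\uFFFD") -> str:
    """Replace characters illegal in XML 1.0, then escape &, < and >."""
    cleaned = _ILLEGAL_XML_CHARS.sub(replacement, text)
    return escape(cleaned)


if __name__ == "__main__":
    # A backspace (code 8) and an invalid UTF-8 byte in the same value.
    raw_value = b"INBOX\x08/spam\xff"
    print(repr(xml_safe(decode_utf8_lossy(raw_value))))
```

The point of the sketch is that mailbox names, field names and UIDs would go
through the same call as message bodies, so no value is written on the
assumption that it is already clean.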

:( Mine still breaks.  Both UTF-8 and Control-Char errors.

--
Daniel
