On 11/26/2012 10:08 PM, Timo Sirainen wrote:
On 27.11.2012, at 7.50, Timo Sirainen wrote:

Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8))
at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
http://dovecot.org/tmp/allchars.gz

If you send this mail to yourself and index it, does it fail? (Works for me.)

I think it works. I tried sending it two ways: as an (unzipped) attachment, and then with the command "sendmail -t dmil...@amfes.com < allchars". I don't know how else to do it.

Following that with "doveadm search -u dmil...@amfes.com mailbox INBOX text test" indexed a couple of new messages, presumably including these, without errors. Some of my other mailboxes continue to break.

I know you've got a filter that strips out control characters before the text is sent to Solr, so I'm left to assume one of the following:
1. Solr is breaking on its own.
2. I have a hardware problem that is corrupting memory (possible, but this server is using ECC, so I don't think so).
3. Somehow, in the communication with Solr, control characters are being introduced. Perhaps it's a maximum length or buffer issue?
4. Could it be attachment related?
5. Could it be zlib related, as in compressed mail, or a mix of compressed and uncompressed mail, being processed?
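
To help tell these cases apart, a check along the following lines could be run on the decoded text of a message from one of the failing mailboxes. This is only a rough sketch in Python, not Dovecot's actual filter: the function names are made up for illustration, and the allowed ranges are the XML 1.0 "Char" production, which I am assuming is what Solr's XML parser enforces when it rejects code 8.

```python
import re

# Anything outside the XML 1.0 "Char" production: tab, LF, CR,
# U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF are allowed.
# Code 8 (backspace) from the Solr error falls outside these ranges.
_XML_ILLEGAL = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(text):
    """Return (offset, code point) for every character XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _XML_ILLEGAL.finditer(text)]


def strip_illegal_xml_chars(text, replacement=" "):
    """Replace every XML-illegal character (hypothetical helper)."""
    return _XML_ILLEGAL.sub(replacement, text)


if __name__ == "__main__":
    sample = "subject\x08 line\twith a backspace"
    print(find_illegal_xml_chars(sample))
    print(repr(strip_illegal_xml_chars(sample)))
```

If the scan finds nothing in the messages that fail to index, that would point away from the mail content itself and toward something happening between Dovecot and Solr (items 3 to 5).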

--
Daniel
