On 11/26/2012 10:08 PM, Timo Sirainen wrote:
On 27.11.2012, at 7.50, Timo Sirainen wrote:
Nov 26, 2012 8:49:29 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal character ((CTRL-CHAR, code 8))
at [row,col {unknown-source}]: [1011144,197790]
Something's wrong. The Solr code was already supposed to catch all of these.
http://dovecot.org/tmp/allchars.gz
If you send this mail to yourself and index it, does it fail? (Works for me.)
I think it works. I tried sending it as an attachment (unzipped), and
then with the command "sendmail -t dmil...@amfes.com < allchars". I
don't know how else to do it.
Following that with "doveadm search -u dmil...@amfes.com mailbox INBOX
text test" indexed a couple of new messages, including, I assume, these,
without errors. Some of my other mailboxes continue to break.
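
To narrow down which messages in the failing mailboxes are involved,
something like the following could scan the raw message files for control
bytes that XML 1.0 does not allow. This is only a rough sketch: the function
names are made up, it looks at raw bytes only (no MIME decoding, so it would
miss control characters hidden inside base64 or quoted-printable parts), and
it assumes any compressed file is gzip, recognized by its magic bytes.

```python
import gzip
import re
import sys
from pathlib import Path

# Bytes below 0x20 that XML 1.0 forbids: everything except TAB, LF and CR.
ILLEGAL_CTRL = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# Assumption: compressed mail files are gzip and start with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_message(path):
    """Return the raw bytes of a message file, gunzipping it if needed."""
    data = Path(path).read_bytes()
    if data.startswith(GZIP_MAGIC):
        data = gzip.decompress(data)
    return data


def find_control_bytes(data):
    """Return (offset, byte value) for each XML-illegal control byte."""
    return [(m.start(), m.group()[0]) for m in ILLEGAL_CTRL.finditer(data)]


if __name__ == "__main__":
    # Usage (hypothetical script name): python scan_ctrl.py /path/to/cur/*
    for name in sys.argv[1:]:
        hits = find_control_bytes(read_message(name))
        if hits:
            offset, value = hits[0]
            print(f"{name}: {len(hits)} control byte(s), "
                  f"first is code {value} at offset {offset}")
```

A message flagged here is not necessarily the one Solr chokes on, but it
would at least give a short list of candidates to reindex one at a time.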
I know you've got a filter that strips out control characters prior to
sending to Solr, so I'm left to assume one of the following:
1. Solr is breaking on its own.
2. I have a hardware problem that is corrupting memory (possible, but
this server is using ECC, so I don't think so).
3. Somehow in the communication with Solr, control characters are being
introduced. Perhaps it's a maximum length or buffer issue?
4. Could it be attachment related?
5. Could it be zlib related - as in compressed mail, or a mix of
compressed & uncompressed mail, being processed?
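
For reference, this is roughly what I understand such a filter has to
accomplish. It is not Dovecot's actual code, just a Python sketch with a
made-up function name, based on the "Char" production in the XML 1.0
specification. Replacing with a space rather than deleting is my own choice
here, so that words on either side of a stripped character don't run together.

```python
import re

# Complement of the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# Anything matching this pattern is not allowed in an XML 1.0 document.
XML_ILLEGAL = re.compile(
    "[^\t\n\r\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def strip_xml_illegal(text, replacement=" "):
    """Replace every character that XML 1.0 forbids (hypothetical helper)."""
    return XML_ILLEGAL.sub(replacement, text)


if __name__ == "__main__":
    # Code 8 (backspace) is the character from the Solr error message.
    sample = "before\x08after"
    print(repr(strip_xml_illegal(sample)))
```

If every chunk of text passed through something equivalent to this before
being sent, a code 8 should never reach Solr, which is why I suspect the
character is getting in somewhere after the filtering step.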
--
Daniel