Hi.

We use the fts-solr plugin and have a problem with solr-1.3+
with HTMLStripWhitespaceTokenizerFactory (Solr schema attached).
Some maildirs contain messages with wrong "Content-Type:" fields in their
attachments.
For example:
"
Content-Type: TEXT/mspowerpoint; name="Zapatec_6zap_netvibes_1.ppt"
"
Indexing of these messages stops with "fts_solr: Indexing failed: 500 
Internal Server Error".
The Solr log shows:
"
SEVERE: java.io.IOException: Mark invalid
at java.io.BufferedReader.reset(BufferedReader.java:485)
"
(mailing list discussion: 
http://markmail.org/message/2fnfiwygvehjngyr#query:SEVERE%3A%20java.io.IOException%3A%20Mark%20invalid%20lucene+page:1+mid:2fnfiwygvehjngyr+state:results)

It looks like dovecot tries to index attachments like this one.
We also get the same error for some other messages.
Dovecot then stops indexing the mailbox, so every search causes a lag and CPU
load on the server.

So we need to make dovecot more tolerant of this error.
As a first step, it would be enough to simply skip the problematic messages for
which solr returns an error.
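
To make the idea concrete, here is a minimal sketch of the "skip and continue"
behaviour we have in mind. It is written in Python only for illustration and is
not dovecot code: index_messages, send_to_solr and the (uid, body) pairs are
hypothetical names, and send_to_solr is assumed to raise an exception when solr
answers with an error.

```python
def index_messages(messages, send_to_solr):
    """Index each message; skip the ones solr rejects instead of aborting.

    `messages` is an iterable of (uid, body) pairs and `send_to_solr` is a
    callable that raises an exception on an HTTP error. Both are hypothetical
    stand-ins for illustration, not dovecot or Solr APIs.
    """
    indexed, skipped = [], []
    for uid, body in messages:
        try:
            send_to_solr(uid, body)
        except Exception as err:  # e.g. "500 Internal Server Error"
            # Remember the failure and carry on with the next message.
            skipped.append((uid, str(err)))
            continue
        indexed.append(uid)
    return indexed, skipped
```

The skipped UIDs would presumably still have to be recorded as handled,
otherwise the same messages would be retried on every search.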

Let's discuss this issue, because it is a general problem.
We are ready to dig into the code where needed.

Regards,
Nikolai
