In some cases (the exact conditions are still unknown) dovecot sends binary 
data (attachments) to SOLR for indexing. This dramatically reduces the 
efficiency of the index and of FTS overall. 

In extreme cases (the example below is 20 MB) dovecot’s hardwired 60 s timeout 
is triggered during the HTTP exchange with SOLR on just a single file. This 
leaves the index unfinished, and during initial indexing the process is 
restarted over and over. With multiple affected mailboxes, even moderate usage 
can cause an I/O overload of the whole system.
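
To narrow down which messages might be affected, one could scan for large non-text MIME parts before indexing. The following is only a rough sketch in Python, not anything dovecot does itself: the function name large_binary_parts and the 10 MB threshold are made up for illustration, and it assumes the message can be parsed with the standard email package.

```python
from email.message import Message

# Hypothetical threshold, chosen for illustration only.
MAX_PART_BYTES = 10 * 1024 * 1024


def large_binary_parts(msg: Message, limit: int = MAX_PART_BYTES):
    """Return (content_type, decoded_size) for each non-text leaf part above `limit`."""
    hits = []
    for part in msg.walk():
        # Skip containers and text parts; only binary-like leaves are of interest.
        if part.is_multipart() or part.get_content_maintype() == "text":
            continue
        # decode=True undoes base64/quoted-printable so the size is the real payload size.
        payload = part.get_payload(decode=True) or b""
        if len(payload) > limit:
            hits.append((part.get_content_type(), len(payload)))
    return hits
```

Applied to every message of a mailbox (for example by iterating over mailbox.Maildir from the standard library), this would give a list of candidates to compare against the mailboxes whose indexing keeps restarting.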

Message example (doveadm fetch text): 
<https://filebin.ca/5oy5Wc1QrBK3/fetch-text.obfuscated.txt>
Corresponding raw log data: 
<https://filebin.ca/5oy6yqLSCr3H/rawlog.obfuscated.txt>

(Both files were processed with perl doveadm-obfuscate.pl 
<https://www.dovecot.org/tools/doveadm-obfuscate.pl>; the script doesn’t 
replace non-Latin characters, so they were replaced with ‘R’ manually.)

Workaround: there is a useful patch by John Fawcett 
<https://www.mail-archive.com/dovecot@dovecot.org/msg82296.html> that allows 
setting a maximum message body size for FTS indexing. It works perfectly, but 
affected messages are then completely ignored by FTS.

This bug report is a summarised result of this discussion 
<https://www.mail-archive.com/dovecot@dovecot.org/msg82599.html>. 
