Re: [Dovecot] "Header is huge" in fts-solr

2013-02-22 Thread Timo Sirainen
On 5.2.2013, at 15.58, Valery V. Sedletski wrote:

> Hi, Timo and all!
> 
> I am trying to index mail in a test mailbox using the fts_solr plugin for
> full-text search. On most mailboxes it works fine, but on some big
> messages I get warnings like the following, then an out-of-memory error
> from Solr, and then the indexer-worker process (or doveadm) crashes with an
> "assertion failed" error and a backtrace:
> 
> ==
> doveadm(valer...@test.afterlogic.com): Warning:
> fts-solr(valer...@test.afterlogic.com): Mailbox gmail.com UID=48 header
> size is huge

I'm not sure why Solr would run out of memory. If it handles huge message
bodies, I don't really see why it couldn't handle huge headers.

> doveadm(valer...@test.afterlogic.com): Panic: file
> ../../../../src/plugins/fts-solr/solr-connection.c: line 548
> (solr_connection_post_more): assertion failed: (maxfd >= 0)

This is hopefully fixed in v2.2, which uses its own lib-http instead of libcurl
(which I'm apparently not using correctly).

> So it seems that Dovecot tries to parse the messages in the mailbox and can't
> correctly determine where the message header ends. It therefore thinks that
> the message header is big and passes a very large amount of data to Solr.
> When trying to index it, Solr exhausts the available memory (though I have
> 8 GB of RAM on my machine, and Java eats more than 2 GB when indexing). Then
> the connections to Solr get closed and maxfd is invalid, hence the assertion
> fails.
> 
> Note also the following error
> 
> ==
> SEVERE: org.apache.solr.common.SolrException: undefined field text
> ==
> 
> before the out-of-memory error.

I don't know about that one.

> I also tried to tweak the decode2text.sh script to ignore all attachments
> bigger than 1 MB (just test whether the file is bigger than 1 MB, and if so,
> return "1"). This doesn't help. As I understand it, the problem is the big
> header, so attachments don't matter.

Yes.

> I have isolated the set of messages which cause this error (by their UIDs),
> so I can provide them as a test case; together they are about 40 MB as an
> archive. The error can be reproduced if you put all these messages into an
> empty mailbox and reindex it, via IMAP search or via "doveadm index -u  ".

Is it really a message with a huge header? Note that MIME headers are also
counted as headers.

Anyway, http://hg.dovecot.org/dovecot-2.1/rev/0a932ba1f01f hopefully helps?
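
One way to check whether a message's headers really are that large, with the
MIME part headers counted too, is a small script along these lines. This is
only a rough sketch using Python's standard email module, not anything from
Dovecot itself: the function names are made up for illustration, the byte
count is approximate, and the size at which fts-solr considers a header
"huge" is not stated in this thread, so the result has to be judged by eye.

```python
import email


def header_bytes(msg):
    """Approximate total size, in bytes, of all headers in msg.

    Counts the top-level header block plus the headers of every MIME part.
    """
    total = 0
    for part in msg.walk():  # yields msg itself, then each nested part
        for name, value in part.items():
            # Approximates "Name: value\r\n"; folded values keep their
            # embedded line breaks, so long folded headers are counted too.
            encoded = str(value).encode("utf-8", "replace")
            total += len(name) + 2 + len(encoded) + 2
    return total


def header_bytes_of_file(path):
    """Parse one message file (path is hypothetical) and return its header size."""
    with open(path, "rb") as f:
        return header_bytes(email.message_from_bytes(f.read()))
```

Running header_bytes_of_file() over the files for the affected UIDs should
show whether the header blocks themselves are large, or whether something
else in the message is being treated as header data.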



[Dovecot] "Header is huge" in fts-solr

2013-02-05 Thread Valery V. Sedletski
Hi, Timo and all!

I am trying to index mail in a test mailbox using the fts_solr plugin for
full-text search. On most mailboxes it works fine, but on some big
messages I get warnings like the following, then an out-of-memory error
from Solr, and then the indexer-worker process (or doveadm) crashes with an
"assertion failed" error and a backtrace:

==
doveadm(valer...@test.afterlogic.com): Warning:
fts-solr(valer...@test.afterlogic.com): Mailbox gmail.com UID=48 header
size is huge
doveadm(valer...@test.afterlogic.com): Warning:
fts-solr(valer...@test.afterlogic.com): Mailbox gmail.com UID=49 header
size is huge
doveadm(valer...@test.afterlogic.com): Panic: file
../../../../src/plugins/fts-solr/solr-connection.c: line 548
(solr_connection_post_more): assertion failed: (maxfd >= 0)
doveadm(valer...@test.afterlogic.com): Error: Raw backtrace:
/usr/mailsuite/lib/dovecot/libdovecot.so.0(+0x58f04) [0x7fe8a908af04] ->
/usr/mailsuite/lib/dovecot/libdovecot.so.0(default_error_handler+0)
[0x7fe8a908af93] -> /usr/mailsuite/lib/dovecot/libdovecot.so.0(i_fatal+0)
[0x7fe8a908b274] ->
/usr/mailsuite/lib/dovecot/lib21_fts_solr_plugin.so(solr_connection_post_more+0x2d2)
[0x7fe8a75fe973] ->
/usr/mailsuite/lib/dovecot/lib21_fts_solr_plugin.so(+0x4d03)
[0x7fe8a75f9d03] ->
/usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(fts_backend_update_build_more+0x77)
[0x7fe8a7c1d401] -> /usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(+0x7fe2)
[0x7fe8a7c1dfe2] -> /usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(+0x80d5)
[0x7fe8a7c1e0d5] -> /usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(+0x89e4)
[0x7fe8a7c1e9e4] ->
/usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(fts_build_mail+0x2b)
[0x7fe8a7c1ebf5] -> /usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(+0xe7cf)
[0x7fe8a7c247cf] -> /usr/mailsuite/lib/dovecot/lib20_fts_plugin.so(+0xe8ba)
[0x7fe8a7c248ba] ->
/usr/mailsuite/lib/dovecot/libdovecot-storage.so.0(mail_precache+0x25)
[0x7fe8a9379bc9] -> /usr/mailsuite/bin/doveadm() [0x4139de] ->
/usr/mailsuite/bin/doveadm() [0x413c17] -> /usr/mailsuite/bin/doveadm()
[0x413f18] -> /usr/mailsuite/bin/doveadm() [0x40fea6] ->
/usr/mailsuite/bin/doveadm(doveadm_mail_single_user+0x154) [0x410069] ->
/usr/mailsuite/bin/doveadm() [0x41090a] ->
/usr/mailsuite/bin/doveadm(doveadm_mail_try_run+0xac) [0x410b81] ->
/usr/mailsuite/bin/doveadm(main+0x28d) [0x41a92c] ->
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7fe8a8cc9ead] ->
/usr/mailsuite/bin/doveadm() [0x40f499]
==

And Solr log, at the same time:
==
2013-02-01 18:03:53.342:INFO::Logging to STDERR via
org.mortbay.log.StdErrLog
2013-02-01 18:03:53.425:INFO::jetty-6.1-SNAPSHOT
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
INFO: new SolrResourceLoader for deduced Solr Home: 'solr/'
01.02.2013 18:03:53 org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
01.02.2013 18:03:53 org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml:
/home/valerius/apache-solr-3.6.2/example/solr/solr.xml
01.02.2013 18:03:53 org.apache.solr.core.CoreContainer load
INFO: Loading CoreContainer using Solr Home: 'solr/'
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
INFO: new SolrResourceLoader for directory: 'solr/'
01.02.2013 18:03:53 org.apache.solr.core.CoreContainer create
INFO: Creating SolrCore '' using instanceDir: solr/.
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
INFO: new SolrResourceLoader for directory: 'solr/./'
01.02.2013 18:03:53 org.apache.solr.core.SolrConfig initLibs
INFO: Adding specified lib dirs to ClassLoader
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
replaceClassLoader
INFO: Adding
'file:/home/valerius/apache-solr-3.6.2/dist/apache-solr-cell-3.6.2.jar' to
classloader
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
replaceClassLoader
INFO: Adding
'file:/home/valerius/apache-solr-3.6.2/contrib/extraction/lib/poi-ooxml-3.8-beta4.jar'
to classloader
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
replaceClassLoader
INFO: Adding
'file:/home/valerius/apache-solr-3.6.2/contrib/extraction/lib/jdom-1.0.jar'
to classloader
01.02.2013 18:03:53 org.apache.solr.core.SolrResourceLoader
replaceClassLoader
INFO: Adding
'file:/home/valerius/apache-solr-3.6.2/cont