Re: [Dovecot] Full text search improvements

2013-12-05 Thread Steffen Kaiser
On Sat, 30 Nov 2013, Timo Sirainen wrote: 7. Don't index non-text data? For example, if there is a large block of base64 data or something else that definitely doesn't look like text, it's pretty useless to index it. Then again, we do want to index

Re: [Dovecot] Full text search improvements

2013-12-05 Thread Timo Sirainen
On 5.12.2013, at 10.40, Steffen Kaiser <skdove...@smail.inf.fh-brs.de> wrote: 9. Attachments can be translated to indexable UTF-8 text already with fts_decoder setting by doing it via a conversion script. This could also support Apache Tika server directly. This means some kind of MIME type

Re: [Dovecot] Full text search improvements

2013-12-04 Thread Metro Domain Admin
Substring match is important to us, so we'd love to see Squat reinstated with speed improvements. It seems like Solr can handle substrings as well ([Edge]NGramFilterFactory), but for small deployments, having the engine built right in is a plus.
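
To make the substring-matching point concrete, here is a minimal sketch of the general n-gram idea. It is not how Squat or Solr's NGramFilterFactory is actually implemented, and the class name NGramIndex and its methods are hypothetical names chosen for illustration. Each message body is split into overlapping 3-character slices; a query is split the same way, the posting sets are intersected, and the surviving candidates are verified with a plain substring test.

```python
from collections import defaultdict


def ngrams(text, n):
    """Return every contiguous n-character slice of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NGramIndex:
    """Toy in-memory substring index (hypothetical, for illustration only)."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> set of document ids
        self.docs = {}                    # document id -> lowercased text

    def add(self, doc_id, text):
        text = text.lower()
        self.docs[doc_id] = text
        for gram in ngrams(text, self.n):
            self.postings[gram].add(doc_id)

    def search(self, needle):
        needle = needle.lower()
        if len(needle) < self.n:
            # Too short to form an n-gram: fall back to scanning every document.
            return sorted(d for d, t in self.docs.items() if needle in t)
        # A document can only contain the needle if it contains all its n-grams.
        grams = ngrams(needle, self.n)
        candidates = set.intersection(*(self.postings.get(g, set()) for g in grams))
        # Sharing all n-grams is necessary but not sufficient, so verify.
        return sorted(d for d in candidates if needle in self.docs[d])


index = NGramIndex()
index.add(1, "Full text search improvements")
index.add(2, "Dovecot indexer")
print(index.search("prove"))  # matches inside the word "improvements"
```

The trade-off is index size: every position in the text contributes an entry, which is the usual cost of substring search compared with word-based indexing.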

Re: [Dovecot] Full text search improvements

2013-12-04 Thread Michael M Slusarz
Quoting Timo Sirainen <t...@iki.fi>: 1. Support for multiple languages. Use textcat while indexing to guess the language of the indexed data. FWIW, you could probably use the Content-Language header (if it exists) to at least give a hint. No guarantee it is correct, but it's a better

Re: [Dovecot] Full text search improvements

2013-12-02 Thread Mike Abbott
how [FTS indexing] could be improved for everyone in future For sites which set client_limit > 1 it would help performance not to stall for INDEXER_WAIT_MSECS when polling the indexer for input. Currently dovecot unwinds back out to the main command loop repeatedly to allow other clients to

Re: [Dovecot] Full text search improvements

2013-12-02 Thread Timo Sirainen
On 2.12.2013, at 20.50, Mike Abbott <michael.abb...@apple.com> wrote: how [FTS indexing] could be improved for everyone in future For sites which set client_limit > 1 it would help performance not to stall for INDEXER_WAIT_MSECS when polling the indexer for input. Currently dovecot unwinds

Re: [Dovecot] Full text search improvements

2013-12-02 Thread Gedalya
On 12/02/2013 02:41 PM, Timo Sirainen wrote: Currently I’m thinking that most of the reasons for client_limit > 1 can be avoided just by moving IMAP IDLE connections to a separate imap-idle process where they wait until they have more work to do. Do you think that would work for you also? I was

Re: [Dovecot] Full text search improvements

2013-12-02 Thread Mike Abbott
Do you think [moving IMAP IDLE connections to a separate imap-idle process] would work for you also? Probably. It always depends on the details. Forking a new imap process every time there's a little input to read or output to send might perform poorly under load. Having a pool of ready

Re: [Dovecot] Full text search improvements

2013-12-02 Thread Timo Sirainen
On 3.12.2013, at 0.09, Mike Abbott <michael.abb...@apple.com> wrote: Do you think [moving IMAP IDLE connections to a separate imap-idle process] would work for you also? Probably. It always depends on the details. Forking a new imap process every time there's a little input to read or

[Dovecot] Full text search improvements

2013-11-30 Thread Timo Sirainen
FTS indexing is something I hear about quite often nowadays. I’ve added some hacks to make it work better for some installations, but it’s about time to think about the whole design and how it could be improved for everyone in future. Here are some of my initial thoughts. Currently Dovecot supports