Re: Strategy for fts and Replication

2020-02-22 Thread Philon

Hi Francis,

My Solr instance runs on 1GB but uses less than 512MB. You might need to 
adjust the Java VM memory settings, but it's possible. I only have my own 
email, but that is 10-15 years of history, and search results including 
headers and body are instant.


Everything is on SSD, but I still think the search index fits into memory.


Philon

On 04.02.2020 11:46, Francis Augusto Medeiros-Logeay wrote:

Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a mailbox
with 15 years of e-mail, and searching takes a long time.

On 04.02.2020 09:39, Philon wrote:

Hi Francis,

Next to fts-solr there was fts-lucene, but the Lucene library it uses
seems heavily outdated, which is why the Dovecot docs also suggest
using Solr. Elasticsearch is probably similar to Solr, but the latter
is maintained by the Dovecot team.

I started by downloading the Solr binary distribution to Debian with a
JRE preinstalled, and things were running after about 10 minutes. Yes,
it's a bit more complicated to find the schema and edit things like
header size (in the tips section). It has been running quite nicely
since then and needs zero maintenance.


I will try again - I kept getting some weird errors, so I don't know
if that's why I wasn't seeing much improvement.



As the FTS indexes are kept separately in an external Solr instance,
I'd guess that it won't interfere with dsync. What I don't know is
whether dsync'ing would trigger indexing. This makes me wonder how one
could actually replicate the Solr instance!?


Good question. What I thought about doing was to install FTS on my
backup instance and, if things go fine, then install an FTS instance
on my production server - that is, if one doesn't interfere with the
other.

I will give Solr another shot - my main worry is whether Solr is
supported on ARM (my prod instance is running on ARM) - I know
Elasticsearch has an ARM build.

I thought about the Xapian engine, but since it requires dovecot 2.3,
I will have to wait.

Best,

Francis




Philon

On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay 
 wrote:


Hi there,

I successfully managed to replicate my mail server to another dovecot 
install using dsync, mainly for redundancy, and it works great.


I want to try to install fts, as some of the mailboxes have tens of 
thousands of messages, and it takes minutes to get some results when 
searching via IMAP on a Roundcube interface.


I want to experiment with fts-solr first, and initially on my redundant 
server, i.e., not on my main dovecot install. Is it ok to do this? I 
ask because I am afraid of how this whole reindexing on the redundant 
install will affect the production server.


Also, any tips on something other than fts-solr? I tried it once, but 
it was so hard to get it right, so many configurations, java, etc., 
that I'd rather try something else. I could also try fts-elastic or 
something like that, but, again, having to maintain an elasticsearch 
install might use more resources than I think it is worth. Any thoughts 
on that?


Best,

--
Francis



Re: Strategy for fts

2020-02-16 Thread Francis Augusto Medeiros-Logeay

This is very good news. I will certainly try it!

Thanks for that!

Best,

Francis

---
Francis Augusto Medeiros-Logeay
Oslo, Norway

On 15.02.2020 19:54, Joan Moreau wrote:

I updated fts-xapian to make it compatible with dovecot 2.2

On 2020-02-04 12:37, Peter Chiochetti wrote:


On 04.02.20 11:46, Francis Augusto Medeiros-Logeay wrote:


Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a
mailbox with 15 years of e-mail, and searching takes a long time.


Here, Solr itself searches a quarter million mails in a split second
and returns very good results. That is on an average, low-memory
machine.

If you don't mind deviating from the standard schema, you can change
it so that headers (from, to) get indexed in the body text. That can
help narrow results.

The only problem is searching through e.g. nested folders from IMAP:
something like ESEARCH would be nice -
https://tools.ietf.org/html/rfc6237

Peter

On 04.02.2020 09:39, Philon wrote:

Hi Francis,

Next to fts-solr there was fts-lucene, but the Lucene library it uses
seems heavily outdated, which is why the Dovecot docs also suggest
using Solr. Elasticsearch is probably similar to Solr, but the latter
is maintained by the Dovecot team.

I started by downloading the Solr binary distribution to Debian with a
JRE preinstalled, and things were running after about 10 minutes. Yes,
it's a bit more complicated to find the schema and edit things like
header size (in the tips section). It has been running quite nicely
since then and needs zero maintenance.

I will try again - I kept getting some weird errors, so I don't know
if that's why I wasn't seeing much improvement.

As the FTS indexes are kept separately in an external Solr instance,
I'd guess that it won't interfere with dsync. What I don't know is
whether dsync'ing would trigger indexing. This makes me wonder how one
could actually replicate the Solr instance!?

Good question. What I thought about doing was to install FTS on my
backup instance and, if things go fine, then install an FTS instance
on my production server - that is, if one doesn't interfere with the
other.

I will give Solr another shot - my main worry is whether Solr is
supported on ARM (my prod instance is running on ARM) - I know
Elasticsearch has an ARM build.

I thought about the Xapian engine, but since it requires dovecot
2.3, I will have to wait.

Best,

Francis

Philon

On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay
 wrote:

Hi there,

I successfully managed to replicate my mail server to another dovecot
install using dsync, mainly for redundancy, and it works great.

I want to try to install fts, as some of the mailboxes have tens of
thousands of messages, and it takes minutes to get some results when
searching via IMAP on a Roundcube interface.

I want to experiment with fts-solr first, and initially on my
redundant server, i.e., not on my main dovecot install. Is it ok to
do this? I ask because I am afraid of how this whole reindexing on
the redundant install will affect the production server.

Also, any tips on something other than fts-solr? I tried it once, but
it was so hard to get it right, so many configurations, java, etc.,
that I'd rather try something else. I could also try fts-elastic or
something like that, but, again, having to maintain an elasticsearch
install might use more resources than I think it is worth. Any
thoughts on that?

Best,

-- Francis


Re: Strategy for fts

2020-02-15 Thread Joan Moreau

I updated fts-xapian to make it compatible with dovecot 2.2

On 2020-02-04 12:37, Peter Chiochetti wrote:

On 04.02.20 11:46, Francis Augusto Medeiros-Logeay wrote:


Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a mailbox with 15 
years of e-mail, and searching takes a long time.


Here, Solr itself searches a quarter million mails in a split second and returns 
very good results. That is on an average, low-memory machine.

If you don't mind deviating from the standard schema, you can change it so that 
headers (from, to) get indexed in the body text. That can help narrow results.

The only problem is searching through e.g. nested folders from IMAP: something 
like ESEARCH would be nice - https://tools.ietf.org/html/rfc6237

Peter

On 04.02.2020 09:39, Philon wrote:

Hi Francis,

Next to fts-solr there was fts-lucene, but the Lucene library it uses
seems heavily outdated, which is why the Dovecot docs also suggest
using Solr. Elasticsearch is probably similar to Solr, but the latter
is maintained by the Dovecot team.

I started by downloading the Solr binary distribution to Debian with a
JRE preinstalled, and things were running after about 10 minutes. Yes,
it's a bit more complicated to find the schema and edit things like
header size (in the tips section). It has been running quite nicely
since then and needs zero maintenance.

I will try again - I kept getting some weird errors, so I don't know
if that's why I wasn't seeing much improvement.

As the FTS indexes are kept separately in an external Solr instance,
I'd guess that it won't interfere with dsync. What I don't know is
whether dsync'ing would trigger indexing. This makes me wonder how one
could actually replicate the Solr instance!?

Good question. What I thought about doing was to install FTS on my
backup instance and, if things go fine, then install an FTS instance
on my production server - that is, if one doesn't interfere with the
other.

I will give Solr another shot - my main worry is whether Solr is supported on 
ARM (my prod instance is running on ARM) - I know Elasticsearch has an ARM 
build.

I thought about the Xapian engine, but since it requires dovecot 2.3, I will 
have to wait.

Best,

Francis

Philon

On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay  
wrote:

Hi there,

I successfully managed to replicate my mail server to another dovecot install 
using dsync, mainly for redundancy, and it works great.

I want to try to install fts, as some of the mailboxes have tens of thousands 
of messages, and it takes minutes to get some results when searching via IMAP 
on a Roundcube interface.

I want to experiment with fts-solr first, and initially on my redundant server, 
i.e., not on my main dovecot install. Is it ok to do this? I ask because I am 
afraid of how this whole reindexing on the redundant install will affect the 
production server.

Also, any tips on something other than fts-solr? I tried it once, but it was so 
hard to get it right, so many configurations, java, etc., that I'd rather try 
something else. I could also try fts-elastic or something like that, but, 
again, having to maintain an elasticsearch install might use more resources 
than I think it is worth. Any thoughts on that?

Best,

-- Francis

Re: Strategy for fts

2020-02-05 Thread Francis Augusto Medeiros-Logeay



---
Francis Augusto Medeiros-Logeay
Oslo, Norway

On 04.02.2020 22:55, Peter Chiochetti wrote:

On 04.02.20 12:37, Peter Chiochetti wrote:

On 04.02.20 11:46, Francis Augusto Medeiros-Logeay wrote:

Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a mailbox 
with 15 years of e-mail, and searching takes a long time.


Here, Solr itself searches a quarter million mails in a split second 
and returns very good results. That is on an average, low-memory 
machine.




How much memory are you using, if I may ask? I have a really small 
server with only 2GB. I am thinking about migrating it, most likely to 
a 16GB instance, but haven't done so yet.


Best,

Francis




Re: Strategy for fts

2020-02-04 Thread Peter Chiochetti

On 04.02.20 12:37, Peter Chiochetti wrote:

On 04.02.20 11:46, Francis Augusto Medeiros-Logeay wrote:

Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a mailbox 
with 15 years of e-mail, and searching takes a long time.


Here, Solr itself searches a quarter million mails in a split second and 
returns very good results. That is on an average, low-memory machine.



Looking at the facts, it is closer to half a million mails in a 160GB 
Maildir, with lots of trash too, but no one to sort it out. The Solr index 
is 1.2 GB in size on disk. A tremendous ratio IMO.
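
To put that ratio into numbers, here is a small back-of-the-envelope sketch. It uses only the rounded figures quoted above and assumes decimal gigabytes; the variable names are made up for illustration.

```python
# Figures quoted in the message above (rounded by the author).
mail_count = 500_000          # "closer to half a million mails"
maildir_bytes = 160 * 10**9   # 160GB Maildir, assuming decimal gigabytes
index_bytes = 1.2 * 10**9     # 1.2 GB Solr index on disk

ratio = index_bytes / maildir_bytes
bytes_per_mail = index_bytes / mail_count

print(f"index is {ratio:.2%} of the Maildir")                  # index is 0.75% of the Maildir
print(f"about {bytes_per_mail:.0f} bytes of index per mail")   # about 2400 bytes of index per mail
```

So the index is under 1% of the mail store, roughly 2.4 kB per message on these numbers.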


In dovecot terms this is likely considered a small installation. We are 
a small team too :) and quite happy with the generous gift of dovecot, 
and Thunderbird BTW.



The only problem is searching through e.g. nested folders from IMAP: 
something like ESEARCH would be nice - https://tools.ietf.org/html/rfc6237


PS: There is powerful client-side search in some MUAs, yet sometimes 
server-side search comes in handy.


--
peter


Re: Strategy for fts

2020-02-04 Thread Peter Chiochetti

On 04.02.20 11:46, Francis Augusto Medeiros-Logeay wrote:

Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a mailbox 
with 15 years of e-mail, and searching takes a long time.


Here, Solr itself searches a quarter million mails in a split second and 
returns very good results. That is on an average, low-memory machine.


If you don't mind deviating from the standard schema, you can change it 
so that headers (from, to) get indexed in the body text. That can help 
narrow results.
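
As a rough illustration of querying Solr itself, bypassing IMAP, the sketch below builds a request URL for Solr's standard `/select` handler. Everything specific in it is an assumption rather than something stated in this thread: the base URL, the core name `dovecot`, the user name, and the field names `hdr`, `body`, `user`, `uid` and `box`, which should be checked against the schema actually installed.

```python
from urllib.parse import urlencode

def solr_select_url(base_url, core, user, terms, fields=("hdr", "body"), rows=10):
    """Build a Solr /select URL that looks for `terms` in the given fields."""
    # One clause per field, OR-ed together; the terms are sent as a quoted phrase.
    # Only double quotes are escaped here, which is enough for a sketch.
    phrase = '"' + terms.replace('"', '\\"') + '"'
    clauses = " OR ".join(f"{field}:{phrase}" for field in fields)
    params = {
        "q": f"({clauses})",
        "fq": f'user:"{user}"',  # restrict the search to one user's mail (assumed field)
        "fl": "uid,box",         # return only message UID and mailbox id (assumed fields)
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"

# Hypothetical host, core and user, for illustration only.
url = solr_select_url("http://localhost:8983/solr", "dovecot", "francis", "invoice")
print(url)
```

Opening the printed URL in a browser or with an HTTP client is a quick way to check that the index answers, independent of Dovecot.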


The only problem is searching through e.g. nested folders from IMAP: 
something like ESEARCH would be nice - https://tools.ietf.org/html/rfc6237



Peter



On 04.02.2020 09:39, Philon wrote:

Hi Francis,

Next to fts-solr there was fts-lucene, but the Lucene library it uses
seems heavily outdated, which is why the Dovecot docs also suggest
using Solr. Elasticsearch is probably similar to Solr, but the latter
is maintained by the Dovecot team.

I started by downloading the Solr binary distribution to Debian with a
JRE preinstalled, and things were running after about 10 minutes. Yes,
it's a bit more complicated to find the schema and edit things like
header size (in the tips section). It has been running quite nicely
since then and needs zero maintenance.


I will try again - I kept getting some weird errors, so I don't know if 
that's why I wasn't seeing much improvement.




As the FTS indexes are kept separately in an external Solr instance,
I'd guess that it won't interfere with dsync. What I don't know is
whether dsync'ing would trigger indexing. This makes me wonder how one
could actually replicate the Solr instance!?


Good question. What I thought about doing was to install FTS on my 
backup instance and, if things go fine, then install an FTS instance 
on my production server - that is, if one doesn't interfere with the other.


I will give Solr another shot - my main worry is whether Solr is 
supported on ARM (my prod instance is running on ARM) - I know 
Elasticsearch has an ARM build.


I thought about the Xapian engine, but since it requires dovecot 2.3, I 
will have to wait.


Best,

Francis




Philon

On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay 
 wrote:


Hi there,

I successfully managed to replicate my mail server to another dovecot 
install using dsync, mainly for redundancy, and it works great.


I want to try to install fts, as some of the mailboxes have tens of 
thousands of messages, and it takes minutes to get some results when 
searching via IMAP on a Roundcube interface.


I want to experiment with fts-solr first, and initially on my redundant 
server, i.e., not on my main dovecot install. Is it ok to do this? I 
ask because I am afraid of how this whole reindexing on the redundant 
install will affect the production server.


Also, any tips on something other than fts-solr? I tried it once, but 
it was so hard to get it right, so many configurations, java, etc., 
that I'd rather try something else. I could also try fts-elastic or 
something like that, but, again, having to maintain an elasticsearch 
install might use more resources than I think it is worth. Any thoughts 
on that?


Best,

--
Francis



Re: Strategy for fts and Replication

2020-02-04 Thread Christian Kivalo



On February 4, 2020 11:46:31 AM GMT+01:00, Francis Augusto Medeiros-Logeay 
 wrote:
>Hi Philon,
>
>Thanks a lot for your thoughts!
>
>Can I ask you if using Solr improved things for you? I have a mailbox 
>with 15 years of e-mail, and searching takes a long time.

It's a vast improvement, more or less instant results.

>On 04.02.2020 09:39, Philon wrote:
>> Hi Francis,
>> 
>> Next to fts-solr there was fts-lucene, but the Lucene library it uses
>> seems heavily outdated, which is why the Dovecot docs also suggest
>> using Solr. Elasticsearch is probably similar to Solr, but the latter
>> is maintained by the Dovecot team.
>> 
>> I started by downloading the Solr binary distribution to Debian with
>> a JRE preinstalled, and things were running after about 10 minutes.
>> Yes, it's a bit more complicated to find the schema and edit things
>> like header size (in the tips section). It has been running quite
>> nicely since then and needs zero maintenance.
>
>I will try again - I kept getting some weird errors, so I don't know if
>that's why I wasn't seeing much improvement.
>> 
>> As the FTS indexes are kept separately in an external Solr instance,
>> I'd guess that it won't interfere with dsync. What I don't know is
>> whether dsync'ing would trigger indexing. This makes me wonder how
>> one could actually replicate the Solr instance!?
>
>Good question. What I thought about doing was to install FTS on my 
>backup instance and, if things go fine, then install an FTS instance 
>on my production server - that is, if one doesn't interfere with the 
>other.
>
>I will give Solr another shot - my main worry is whether Solr is 
>supported on ARM (my prod instance is running on ARM) - I know 
>Elasticsearch has an ARM build.
>
>I thought about the Xapian engine, but since it requires dovecot 2.3,
>I will have to wait.
>
>Best,
>
>Francis
>
>
>> 
>> Philon
>> 
>>> On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay 
>>>  wrote:
>>> 
>>> Hi there,
>>> 
>>> I successfully managed to replicate my mail server to another dovecot 
>>> install using dsync, mainly for redundancy, and it works great.
>>> 
>>> I want to try to install fts, as some of the mailboxes have tens of 
>>> thousands of messages, and it takes minutes to get some results when
>>> searching via IMAP on a Roundcube interface.
>>> 
>>> I want to experiment with fts-solr first, and initially on my
>>> redundant server, i.e., not on my main dovecot install. Is it ok to
>>> do this? I ask because I am afraid of how this whole reindexing on
>>> the redundant install will affect the production server.
>>> 
>>> Also, any tips on something other than fts-solr? I tried it once, but
>>> it was so hard to get it right, so many configurations, java, etc., 
>>> that I'd rather try something else. I could also try fts-elastic or 
>>> something like that, but, again, having to maintain an elasticsearch
>>> install might use more resources than I think it is worth. Any
>>> thoughts on that?
>>> 
>>> Best,
>>> 
>>> --
>>> Francis
>>> 

-- 
Christian Kivalo


Re: Strategy for fts and Replication

2020-02-04 Thread Francis Augusto Medeiros-Logeay

Hi Philon,

Thanks a lot for your thoughts!

Can I ask you if using Solr improved things for you? I have a mailbox 
with 15 years of e-mail, and searching takes a long time.


On 04.02.2020 09:39, Philon wrote:

Hi Francis,

Next to fts-solr there was fts-lucene, but the Lucene library it uses
seems heavily outdated, which is why the Dovecot docs also suggest
using Solr. Elasticsearch is probably similar to Solr, but the latter
is maintained by the Dovecot team.

I started by downloading the Solr binary distribution to Debian with a
JRE preinstalled, and things were running after about 10 minutes. Yes,
it's a bit more complicated to find the schema and edit things like
header size (in the tips section). It has been running quite nicely
since then and needs zero maintenance.


I will try again - I kept getting some weird errors, so I don't know if 
that's why I wasn't seeing much improvement.




As the FTS indexes are kept separately in an external Solr instance,
I'd guess that it won't interfere with dsync. What I don't know is
whether dsync'ing would trigger indexing. This makes me wonder how one
could actually replicate the Solr instance!?


Good question. What I thought about doing was to install FTS on my 
backup instance and, if things go fine, then install an FTS instance 
on my production server - that is, if one doesn't interfere with the 
other.


I will give Solr another shot - my main worry is whether Solr is 
supported on ARM (my prod instance is running on ARM) - I know 
Elasticsearch has an ARM build.


I thought about the Xapian engine, but since it requires dovecot 2.3, I 
will have to wait.


Best,

Francis




Philon

On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay 
 wrote:


Hi there,

I successfully managed to replicate my mail server to another dovecot 
install using dsync, mainly for redundancy, and it works great.


I want to try to install fts, as some of the mailboxes have tens of 
thousands of messages, and it takes minutes to get some results when 
searching via IMAP on a Roundcube interface.


I want to experiment with fts-solr first, and initially on my redundant 
server, i.e., not on my main dovecot install. Is it ok to do this? I 
ask because I am afraid of how this whole reindexing on the redundant 
install will affect the production server.


Also, any tips on something other than fts-solr? I tried it once, but 
it was so hard to get it right, so many configurations, java, etc., 
that I'd rather try something else. I could also try fts-elastic or 
something like that, but, again, having to maintain an elasticsearch 
install might use more resources than I think it is worth. Any thoughts 
on that?


Best,

--
Francis




Re: Strategy for fts and Replication

2020-02-04 Thread Philon

Hi Francis,

Next to fts-solr there was fts-lucene, but the Lucene library it uses seems 
heavily outdated, which is why the Dovecot docs also suggest using Solr. 
Elasticsearch is probably similar to Solr, but the latter is maintained by the 
Dovecot team.

I started by downloading the Solr binary distribution to Debian with a JRE 
preinstalled, and things were running after about 10 minutes. Yes, it's a bit 
more complicated to find the schema and edit things like header size (in the 
tips section). It has been running quite nicely since then and needs zero 
maintenance.

As the FTS indexes are kept separately in an external Solr instance, I'd guess 
that it won't interfere with dsync. What I don't know is whether dsync'ing would 
trigger indexing. This makes me wonder how one could actually replicate the 
Solr instance!?
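
Whether dsync triggers indexing is left open in this thread. One way to sidestep the question would be to run indexing explicitly on whichever server has FTS enabled. The sketch below is only an illustration: it builds a `doveadm index` command line and, by default, just prints it. The helper names and the user address are hypothetical, and it assumes `doveadm` is on the PATH and FTS is already configured there.

```python
import subprocess

def doveadm_index_cmd(user, mailbox_mask="*", queue=False):
    """Return the argv for `doveadm index` for one user."""
    argv = ["doveadm", "index", "-u", user]
    if queue:
        argv.append("-q")  # queue the work for the indexer process instead of waiting
    argv.append(mailbox_mask)
    return argv

def run(argv, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    print(" ".join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)

# Hypothetical user; dry run only, so nothing is executed here.
run(doveadm_index_cmd("francis@example.org"))
```

Running this on the replica only would keep the indexing load off the production server, which was the concern in the original question.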


Philon

> On 31 Jan 2020, at 17:24, Francis Augusto Medeiros-Logeay  
> wrote:
> 
> Hi there,
> 
> I successfully managed to replicate my mail server to another dovecot install 
> using dsync, mainly for redundancy, and it works great.
> 
> I want to try to install fts, as some of the mailboxes have tens of thousands 
> of messages, and it takes minutes to get some results when searching via IMAP 
> on a Roundcube interface.
> 
> I want to experiment with fts-solr first, and initially on my redundant server, 
> i.e., not on my main dovecot install. Is it ok to do this? I ask because I am 
> afraid of how this whole reindexing on the redundant install will affect the 
> production server.
> 
> Also, any tips on something other than fts-solr? I tried it once, but it was 
> so hard to get it right, so many configurations, java, etc., that I'd rather 
> try something else. I could also try fts-elastic or something like that, but, 
> again, having to maintain an elasticsearch install might use more resources 
> than I think it is worth. Any thoughts on that?
> 
> Best,
> 
> --
> Francis
>