Re: FTS delays

2019-04-21 Thread Joan Moreau via dovecot

For instance, if I do a search from Roundcube, the inbox name is NOT
passed to the backend (which is normal).


The same search from the command line passes the mailbox name IN ADDITION
to the mailbox * pointer.


However, a search from Roundcube queries the backend TWICE (first
with the AND flag, then with the OR flag).


This is obviously a bug in the part calling the backend (even if
the backend may need improvements, that is really not the point here).


Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Get last UID of Sent = 61714
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Get last UID of Sent = 61714
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query: FLAG=AND
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(1/1): add term(wilcard) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(2/1): add term(wilcard) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(3/1): add term(wilcard) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(4/1): add term(wilcard) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(5/1): add term(wilcard) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: SEARCH_OR
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: MATCH NOT : 0
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Testing if wildcard
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query: set GLOBAL (no specified header)
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query : ( bcc:milao OR body:milao OR cc:milao OR from:milao OR message-id:milao OR subject:milao OR to:milao )
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query: 0 results in 0 ms
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query: FLAG=OR
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(1): add term(SUBJECT) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: SEARCH_HEADER
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: MATCH NOT : 0
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(2): add term(TO) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: SEARCH_HEADER
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: MATCH NOT : 0
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(3): add term(FROM) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: SEARCH_HEADER
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: MATCH NOT : 0
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(4): add term(CC) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: SEARCH_HEADER
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: MATCH NOT : 0
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query(5): add term(BCC) : milao
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: SEARCH_HEADER
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: MATCH NOT : 0
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Testing if wildcard
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query : ( bcc:milao ) OR ( cc:milao ) OR ( from:milao ) OR ( subject:milao ) OR ( to:milao )
Apr 21 11:08:39 gjserver dovecot[14251]: imap(j...@grosjo.net)<15709>: Query: 0 results in 0 ms
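
As a reading aid for the log above, here is a small Python model that reproduces the logged "Query :" strings. It is inferred purely from the log output, not taken from the fts-xapian source: the function name build_query, the field list, and the "deduplicate, then sort" step are all assumptions made only so that the output matches what was logged.

```python
# Hypothetical model of how the logged query strings could be composed.
# Nothing here is the real fts-xapian code; it only mimics the log output.
ALL_FIELDS = ["bcc", "body", "cc", "from", "message-id", "subject", "to"]

def build_query(flag, terms):
    """terms is a list of (field, word) pairs.

    A field of "wildcard" means "no specific header", so the word is
    searched in every known field. flag is "AND" or "OR".
    """
    groups = set()  # assumption: identical groups are only emitted once
    for field_name, word in terms:
        if field_name == "wildcard":
            fields = ALL_FIELDS
        else:
            fields = [field_name.lower()]
        inner = " OR ".join(f"{f}:{word.lower()}" for f in fields)
        groups.add(f"( {inner} )")
    # assumption: groups are sorted, which matches the order seen in the log
    return f" {flag} ".join(sorted(groups))

# First request from the log: five identical wildcard terms, FLAG=AND
print(build_query("AND", [("wildcard", "milao")] * 5))

# Second request from the log: one term per header, FLAG=OR
print(build_query("OR", [(h, "milao") for h in ("SUBJECT", "TO", "FROM", "CC", "BCC")]))

# The doveadm case quoted in this thread: the mailbox name ends up as a term
print(build_query("AND", [("wildcard", "inbox"), ("wildcard", "milan")]))
```

The last call illustrates why a mailbox name that is turned into a search term is harmful: the query then only matches mails that contain the word "inbox" in addition to "milan".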

On 2019-04-21 11:56, Joan Moreau via dovecot wrote:

Timo, 

A little logic here:

1 - The mailbox is passed by Dovecot to the backend as a mailbox * pointer, NOT as a search parameter.

-> It works properly when entering a search from Roundcube or Evolution, for instance.

-> Therefore this is a clear bug in the command line.

2 - The loop: actually, the timeout occurs because the Dovecot core is DISCARDING the results of the backend and doing its own search (i.e. in my example, it searches for "milan" in my inbox, which is huge, without even considering the backend results).


-> This is an enormous error.

On 2019-04-21 11:29, Timo Sirainen wrote:

It's because you're misunderstanding how the lookup() function works. It gets ALL the search parameters, including the "mailbox inbox". This is intentional, and not a bug. Two reasons being: 

1) The FTS plugin in theory could support indexing/searching any kinds of searches, not just regular word searches. So I didn't want to limit it unnecessarily. 

2) Especially with "mailbox inbox" this is important when searching from virtual mailboxes. If you configure "All mails in all folders" virtual mailbox, you can do a search in there that restricts which physical mailboxes are 

Re: FTS delays

2019-04-21 Thread Joan Moreau via dovecot
Timo, 

A little logic here:


1 - The mailbox is passed by Dovecot to the backend as a mailbox *
pointer, NOT as a search parameter.


-> It works properly when entering a search from Roundcube or Evolution,
for instance.

-> Therefore this is a clear bug in the command line.


2 - The loop: actually, the timeout occurs because the Dovecot core is
DISCARDING the results of the backend and doing its own search (i.e. in my
example, it searches for "milan" in my inbox, which is huge, without
even considering the backend results).


-> This is an enormous error.

On 2019-04-21 11:29, Timo Sirainen wrote:

It's because you're misunderstanding how the lookup() function works. It gets ALL the search parameters, including the "mailbox inbox". This is intentional, and not a bug. Two reasons being: 

1) The FTS plugin in theory could support indexing/searching any kinds of searches, not just regular word searches. So I didn't want to limit it unnecessarily. 

2) Especially with "mailbox inbox" this is important when searching from virtual mailboxes. If you configure "All mails in all folders" virtual mailbox, you can do a search in there that restricts which physical mailboxes are matched. In this case the FTS backend can optimize this lookup so it can filter only the physical mailboxes that have matches, leaving the others out. And it can do this in a single query if all the mailboxes are in the same FTS index. 

So again: Your lookup() function needs to be changed to only use those search args that it really wants to search, and ignore the others. Use solr_add_definite_query_args() as the template. 

Also I see now the reason for the timeout problem. It's because you're not setting search_arg->match_always=TRUE. These need to be set for the search args that you're actually using to generate the Xapian query. If it's not set, then Dovecot core doesn't think that the arg was part of the FTS search and it processes it itself. Meaning that it opens all the emails and does the search the slow way, practically making the FTS lookup ignored. 

On 21 Apr 2019, at 19.50, Joan Moreau  wrote: 

No, the parsing is done by the Dovecot core; there is nothing the backend can do about it. The backend shall *never* receive this (whether it is buggy or not).

Please have a deeper look.


And the loop is a very big problem, as it times out all the time (and once
again, this is not in any of the backend functions).

On 2019-04-21 10:42, Timo Sirainen via dovecot wrote: 
Inbox appears in the list of arguments, because fts_backend_xapian_lookup() is parsing the search args wrong. Not sure about the other issue. 

On 21 Apr 2019, at 19.31, Joan Moreau  wrote: 

For this first point, the problem is that Dovecot core sends the request TWICE and "Inbox" appears in the list of arguments! (The inbox name should only serve to select the right mailbox; it should never be sent to the backend.)

And even if this were solved, the Dovecot core loops *after* the backend has returned the results.


# doveadm search -u j...@grosjo.net mailbox inbox text milan
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Query: FLAG=AND
doveadm(j...@grosjo.net): Info: Query(1): add term(wilcard) : inbox
doveadm(j...@grosjo.net): Info: Query(2): add term(wilcard) : milan
doveadm(j...@grosjo.net): Info: Testing if wildcard
doveadm(j...@grosjo.net): Info: Query: set GLOBAL (no specified header)
doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( 
bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan )
doveadm(j...@grosjo.net): Info: Query: 2 results in 1 ms // THIS IS WHEN 
BACKEND HAS FOUND RESULTS AND STOPPED
d82b4b0f550d3859364495331209 847
d82b4b0f550d3859364495331209 1569
d82b4b0f550d3859364495331209 2260
d82b4b0f550d3859364495331209 2575
d82b4b0f550d3859364495331209 2811
d82b4b0f550d3859364495331209 2885
d82b4b0f550d3859364495331209 3038
d82b4b0f550d3859364495331209 3121 -> LOOPING FOREVER

On 2019-04-21 09:57, Timo Sirainen via dovecot wrote: 
On 3 Apr 2019, at 20.30, Joan Moreau via dovecot wrote:

doveadm search -u j...@grosjo.net mailbox inbox text milan

output

doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) 
AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan OR uid:milan )

1 - The query is wrong 
That's because fts_backend_xapian_lookup() isn't anywhere close to being correct. Try to copy the logic based on solr_add_definite_query_args().

Re: FTS delays

2019-04-21 Thread Timo Sirainen via dovecot
It's because you're misunderstanding how the lookup() function works. It gets 
ALL the search parameters, including the "mailbox inbox". This is intentional, 
and not a bug. Two reasons being:

1) The FTS plugin in theory could support indexing/searching any kinds of 
searches, not just regular word searches. So I didn't want to limit it 
unnecessarily.

2) Especially with "mailbox inbox" this is important when searching from 
virtual mailboxes. If you configure "All mails in all folders" virtual mailbox, 
you can do a search in there that restricts which physical mailboxes are 
matched. In this case the FTS backend can optimize this lookup so it can filter 
only the physical mailboxes that have matches, leaving the others out. And it 
can do this in a single query if all the mailboxes are in the same FTS index.

So again: Your lookup() function needs to be changed to only use those search 
args that it really wants to search, and ignore the others. Use 
solr_add_definite_query_args() as the template.

Also I see now the reason for the timeout problem. It's because you're not 
setting search_arg->match_always=TRUE. These need to be set for the search args 
that you're actually using to generate the Xapian query. If it's not set, then 
Dovecot core doesn't think that the arg was part of the FTS search and it 
processes it itself. Meaning that it opens all the emails and does the search 
the slow way, practically making the FTS lookup ignored.
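
To make the two points above concrete, here is a minimal sketch of the idea, written in Python rather than C and using a made-up SearchArg class. It is not the Dovecot API and not the actual logic of solr_add_definite_query_args(); the class, its fields, and the INDEXABLE set are assumptions for illustration only. The model shows a lookup that (a) only builds index terms from the args it really wants to search, ignoring things like "mailbox inbox", and (b) flags exactly those args as handled, which is the role match_always plays in the explanation above.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SearchArg:
    """Hypothetical stand-in for one search argument (not the real C struct)."""
    kind: str                      # e.g. "TEXT", "BODY", "HEADER", "MAILBOX"
    value: str = ""
    header: Optional[str] = None   # header name, only used when kind == "HEADER"
    match_always: bool = False     # True: "the FTS backend already answered this"

# Arg kinds this toy backend knows how to turn into index terms (assumption).
INDEXABLE = {"TEXT", "BODY", "HEADER"}

def collect_terms(args: List[SearchArg]) -> List[str]:
    """Build index terms from the args the backend handles, and mark them."""
    terms = []
    for arg in args:
        if arg.kind not in INDEXABLE:
            # MAILBOX, flags, dates, ...: leave untouched so that the core
            # keeps evaluating them on its own.
            continue
        if arg.kind == "HEADER" and arg.header:
            prefix = arg.header.lower()
        else:
            prefix = arg.kind.lower()
        terms.append(f"{prefix}:{arg.value.lower()}")
        # Without this flag, the core would re-check the arg itself by
        # opening every mail, which is the slow path described above.
        arg.match_always = True
    return terms

# Model of: doveadm search mailbox inbox text milan
args = [SearchArg("MAILBOX", "inbox"), SearchArg("TEXT", "milan")]
terms = collect_terms(args)
print(terms)
print([(a.kind, a.match_always) for a in args])
```

In this model the "inbox" argument never becomes a search term, and only the TEXT argument is flagged as answered by the backend.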

> On 21 Apr 2019, at 19.50, Joan Moreau  wrote:
> 
> No, the parsing is done by the Dovecot core; there is nothing the backend can do 
> about it. The backend shall *never* receive this (whether it is buggy or not).
> 
> Please have a deeper look.
> 
> And the loop is a very big problem as it times out all the time (and once 
> again, this is not in any of the backend  functions)
> 
>  
> 
> 
> On 2019-04-21 10:42, Timo Sirainen via dovecot wrote:
> 
>> Inbox appears in the list of arguments, because fts_backend_xapian_lookup() 
>> is parsing the search args wrong. Not sure about the other issue.
>> 
>>> On 21 Apr 2019, at 19.31, Joan Moreau wrote:
>>> 
>>> For this first point, the problem is that Dovecot core sends the request 
>>> TWICE and "Inbox" appears in the list of arguments! (The inbox name should 
>>> only serve to select the right mailbox; it should never be sent to the backend.)
>>> 
>>> And even if this were solved, the Dovecot core loops *after* the 
>>> backend has returned the results.
>>> 
>>> 
>>> 
>>> # doveadm search -u j...@grosjo.net  mailbox inbox 
>>> text milan
>>> doveadm(j...@grosjo.net ): Info: Get last UID of 
>>> INBOX = 315526
>>> doveadm(j...@grosjo.net ): Info: Get last UID of 
>>> INBOX = 315526
>>> doveadm(j...@grosjo.net ): Info: Query: FLAG=AND
>>> doveadm(j...@grosjo.net ): Info: Query(1): add 
>>> term(wilcard) : inbox
>>> doveadm(j...@grosjo.net ): Info: Query(2): add 
>>> term(wilcard) : milan
>>> doveadm(j...@grosjo.net ): Info: Testing if wildcard
>>> doveadm(j...@grosjo.net ): Info: Query: set GLOBAL 
>>> (no specified header)
>>> doveadm(j...@grosjo.net ): Info: Query : ( 
>>> bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR 
>>> subject:inbox OR to:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR 
>>> from:milan OR message-id:milan OR subject:milan OR to:milan )
>>> doveadm(j...@grosjo.net ): Info: Query: 2 results 
>>> in 1 ms // THIS IS WHEN BACKEND HAS FOUND RESULTS AND STOPPED
>>> d82b4b0f550d3859364495331209 847
>>> d82b4b0f550d3859364495331209 1569
>>> d82b4b0f550d3859364495331209 2260
>>> d82b4b0f550d3859364495331209 2575
>>> d82b4b0f550d3859364495331209 2811
>>> d82b4b0f550d3859364495331209 2885
>>> d82b4b0f550d3859364495331209 3038
>>> d82b4b0f550d3859364495331209 3121 -> LOOPING FOREVER
>>> 
>>> 
>>> 
>>>  
>>> 
>>> 
>>> On 2019-04-21 09:57, Timo Sirainen via dovecot wrote:
>>> 
>>> On 3 Apr 2019, at 20.30, Joan Moreau via dovecot wrote:
>>> doveadm search -u j...@grosjo.net  mailbox inbox 
>>> text milan
>>> output
>>> 
>>> doveadm(j...@grosjo.net ): Info: Query : ( 
>>> bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR 
>>> subject:inbox OR to:inbox OR uid:inbox ) AND ( bcc:milan OR body:milan OR 
>>> cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan OR 
>>> uid:milan )
>>> 
>>> 1 - The query is wrong
>>> 
>>> That's because fts_backend_xapian_lookup() isn't anywhere close to being 
>>> correct. Try to copy the logic based on solr_add_definite_query_args().
>>> 
>>> 



Re: FTS delays

2019-04-21 Thread Joan Moreau via dovecot

No, the parsing is done by the Dovecot core; there is nothing the backend
can do about it. The backend shall *never* receive this (whether it is
buggy or not).

Please have a deeper look.


And the loop is a very big problem, as it times out all the time (and
once again, this is not in any of the backend functions).

On 2019-04-21 10:42, Timo Sirainen via dovecot wrote:

Inbox appears in the list of arguments, because fts_backend_xapian_lookup() is parsing the search args wrong. Not sure about the other issue. 

On 21 Apr 2019, at 19.31, Joan Moreau  wrote: 

For this first point, the problem is that Dovecot core sends the request TWICE and "Inbox" appears in the list of arguments! (The inbox name should only serve to select the right mailbox; it should never be sent to the backend.)

And even if this were solved, the Dovecot core loops *after* the backend has returned the results.


# doveadm search -u j...@grosjo.net mailbox inbox text milan
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Query: FLAG=AND
doveadm(j...@grosjo.net): Info: Query(1): add term(wilcard) : inbox
doveadm(j...@grosjo.net): Info: Query(2): add term(wilcard) : milan
doveadm(j...@grosjo.net): Info: Testing if wildcard
doveadm(j...@grosjo.net): Info: Query: set GLOBAL (no specified header)
doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( 
bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan )
doveadm(j...@grosjo.net): Info: Query: 2 results in 1 ms // THIS IS WHEN 
BACKEND HAS FOUND RESULTS AND STOPPED
d82b4b0f550d3859364495331209 847
d82b4b0f550d3859364495331209 1569
d82b4b0f550d3859364495331209 2260
d82b4b0f550d3859364495331209 2575
d82b4b0f550d3859364495331209 2811
d82b4b0f550d3859364495331209 2885
d82b4b0f550d3859364495331209 3038
d82b4b0f550d3859364495331209 3121 -> LOOPING FOREVER

On 2019-04-21 09:57, Timo Sirainen via dovecot wrote: 
On 3 Apr 2019, at 20.30, Joan Moreau via dovecot wrote:

doveadm search -u j...@grosjo.net mailbox inbox text milan

output

doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) 
AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan OR uid:milan )

1 - The query is wrong 
That's because fts_backend_xapian_lookup() isn't anywhere close to being correct. Try to copy the logic based on solr_add_definite_query_args().

Re: FTS delays

2019-04-21 Thread Joan Moreau via dovecot

Another example, so you can understand the bug in Dovecot
core:

# doveadm search -u j...@grosjo.net mailbox SENT text milan 


doveadm(j...@grosjo.net): Info: Get last UID of Sent = 61707 -> CORRECTLY
ASSIGNED THE PROPER MAILBOX TO THE BACK END
doveadm(j...@grosjo.net): Info: Get last UID of Sent = 61707
doveadm(j...@grosjo.net): Info: Query: FLAG=AND
doveadm(j...@grosjo.net): Info: Query(1): add term(wilcard) : Sent -> WHY
IS "SENT" AMONG THE SEARCH PARAMETERS ???
doveadm(j...@grosjo.net): Info: Query(2): add term(wilcard) : milan
doveadm(j...@grosjo.net): Info: Testing if wildcard
doveadm(j...@grosjo.net): Info: Query: set GLOBAL (no specified header)
doveadm(j...@grosjo.net): Info: Query : ( bcc:milan OR body:milan OR
cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan
) AND ( bcc:sent OR body:sent OR cc:sent OR from:sent OR message-id:sent
OR subject:sent OR to:sent )
doveadm(j...@grosjo.net): Info: Query: 7 results in 71 ms 

(AND SAME LOOP) 


In this example, the "Sent" shall *never*  be passed as argument to the
backend (xapian, solr or any other), only the mailbox reference.
However, it appears in the search parameters 


On 2019-04-21 10:31, Joan Moreau via dovecot wrote:

For this first point, the problem is that Dovecot core sends the request TWICE and "Inbox" appears in the list of arguments! (The inbox name should only serve to select the right mailbox; it should never be sent to the backend.)

And even if this were solved, the Dovecot core loops *after* the backend has returned the results.


# doveadm search -u j...@grosjo.net mailbox inbox text milan
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Query: FLAG=AND
doveadm(j...@grosjo.net): Info: Query(1): add term(wilcard) : inbox
doveadm(j...@grosjo.net): Info: Query(2): add term(wilcard) : milan
doveadm(j...@grosjo.net): Info: Testing if wildcard
doveadm(j...@grosjo.net): Info: Query: set GLOBAL (no specified header)
doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( 
bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan )
doveadm(j...@grosjo.net): Info: Query: 2 results in 1 ms // THIS IS WHEN 
BACKEND HAS FOUND RESULTS AND STOPPED
d82b4b0f550d3859364495331209 847
d82b4b0f550d3859364495331209 1569
d82b4b0f550d3859364495331209 2260
d82b4b0f550d3859364495331209 2575
d82b4b0f550d3859364495331209 2811
d82b4b0f550d3859364495331209 2885
d82b4b0f550d3859364495331209 3038
d82b4b0f550d3859364495331209 3121 -> LOOPING FOREVER

On 2019-04-21 09:57, Timo Sirainen via dovecot wrote: 
On 3 Apr 2019, at 20.30, Joan Moreau via dovecot wrote:

doveadm search -u j...@grosjo.net mailbox inbox text milan

output

doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) 
AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan OR uid:milan )

1 - The query is wrong 
That's because fts_backend_xapian_lookup() isn't anywhere close to being correct. Try to copy the logic based on solr_add_definite_query_args().

Re: FTS delays

2019-04-21 Thread Timo Sirainen via dovecot
Inbox appears in the list of arguments, because fts_backend_xapian_lookup() is 
parsing the search args wrong. Not sure about the other issue.

> On 21 Apr 2019, at 19.31, Joan Moreau  wrote:
> 
> For this first point, the problem is that Dovecot core sends the request 
> TWICE and "Inbox" appears in the list of arguments! (The inbox name should 
> only serve to select the right mailbox; it should never be sent to the backend.)
> 
> And even if this were solved, the Dovecot core loops *after* the backend 
> has returned the results.
> 
> 
> 
> # doveadm search -u j...@grosjo.net mailbox inbox text milan
> doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
> doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
> doveadm(j...@grosjo.net): Info: Query: FLAG=AND
> doveadm(j...@grosjo.net): Info: Query(1): add term(wilcard) : inbox
> doveadm(j...@grosjo.net): Info: Query(2): add term(wilcard) : milan
> doveadm(j...@grosjo.net): Info: Testing if wildcard
> doveadm(j...@grosjo.net): Info: Query: set GLOBAL (no specified header)
> doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
> OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox ) AND ( 
> bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
> subject:milan OR to:milan )
> doveadm(j...@grosjo.net): Info: Query: 2 results in 1 ms // THIS IS WHEN 
> BACKEND HAS FOUND RESULTS AND STOPPED
> d82b4b0f550d3859364495331209 847
> d82b4b0f550d3859364495331209 1569
> d82b4b0f550d3859364495331209 2260
> d82b4b0f550d3859364495331209 2575
> d82b4b0f550d3859364495331209 2811
> d82b4b0f550d3859364495331209 2885
> d82b4b0f550d3859364495331209 3038
> d82b4b0f550d3859364495331209 3121 -> LOOPING FOREVER
> 
> 
> 
>  
> 
> 
> On 2019-04-21 09:57, Timo Sirainen via dovecot wrote:
> 
>> On 3 Apr 2019, at 20.30, Joan Moreau via dovecot wrote:
>>> 
>>> doveadm search -u j...@grosjo.net  mailbox inbox 
>>> text milan
>>> output
>>> 
>>> doveadm(j...@grosjo.net ): Info: Query : ( 
>>> bcc:inbox OR body:inbox OR cc:inbox OR from:inbox OR message-id:inbox OR 
>>> subject:inbox OR to:inbox OR uid:inbox ) AND ( bcc:milan OR body:milan OR 
>>> cc:milan OR from:milan OR message-id:milan OR subject:milan OR to:milan OR 
>>> uid:milan )
>>> 
>>> 1 - The query is wrong
>> 
>> That's because fts_backend_xapian_lookup() isn't anywhere close to being 
>> correct. Try to copy the logic based on solr_add_definite_query_args().
>> 
>> 



Re: FTS delays

2019-04-21 Thread Joan Moreau via dovecot

For this first point, the problem is that Dovecot core sends the request
TWICE and "Inbox" appears in the list of arguments! (The inbox name should
only serve to select the right mailbox; it should never be sent to the backend.)


And even if this were solved, the Dovecot core loops *after* the
backend has returned the results.


# doveadm search -u j...@grosjo.net mailbox inbox text milan
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Get last UID of INBOX = 315526
doveadm(j...@grosjo.net): Info: Query: FLAG=AND
doveadm(j...@grosjo.net): Info: Query(1): add term(wilcard) : inbox
doveadm(j...@grosjo.net): Info: Query(2): add term(wilcard) : milan
doveadm(j...@grosjo.net): Info: Testing if wildcard
doveadm(j...@grosjo.net): Info: Query: set GLOBAL (no specified header)
doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR
cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox
) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR
message-id:milan OR subject:milan OR to:milan )
doveadm(j...@grosjo.net): Info: Query: 2 results in 1 ms // THIS IS WHEN
BACKEND HAS FOUND RESULTS AND STOPPED
d82b4b0f550d3859364495331209 847
d82b4b0f550d3859364495331209 1569
d82b4b0f550d3859364495331209 2260
d82b4b0f550d3859364495331209 2575
d82b4b0f550d3859364495331209 2811
d82b4b0f550d3859364495331209 2885
d82b4b0f550d3859364495331209 3038
d82b4b0f550d3859364495331209 3121 -> LOOPING FOREVER


On 2019-04-21 09:57, Timo Sirainen via dovecot wrote:

On 3 Apr 2019, at 20.30, Joan Moreau via dovecot  wrote: 


doveadm search -u j...@grosjo.net mailbox inbox text milan
output

doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) 
AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan OR 
subject:milan OR to:milan OR uid:milan )

1 - The query is wrong


That's because fts_backend_xapian_lookup() isn't anywhere close to being 
correct. Try to copy the logic based on solr_add_definite_query_args().

Re: FTS delays

2019-04-21 Thread Timo Sirainen via dovecot
On 3 Apr 2019, at 20.30, Joan Moreau via dovecot  wrote:
> doveadm search -u j...@grosjo.net mailbox inbox text milan
> output
> 
> doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR cc:inbox 
> OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox OR uid:inbox ) 
> AND ( bcc:milan OR body:milan OR cc:milan OR from:milan OR message-id:milan 
> OR subject:milan OR to:milan OR uid:milan )
> 
> 1 - The query is wrong

That's because fts_backend_xapian_lookup() isn't anywhere close to being 
correct. Try to copy the logic based on solr_add_definite_query_args().



Re: FTS delays

2019-04-20 Thread Joan Moreau via dovecot
I have no idea how to use git-bisect.
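
For reference, git-bisect is essentially a binary search over the commit history between a revision known to be good and one known to be bad. In practice one runs "git bisect start", marks the endpoints with "git bisect good <commit>" and "git bisect bad <commit>", then rebuilds and tests each revision git checks out, answering "git bisect good" or "git bisect bad" until the first bad commit is reported. The Python sketch below only models that search; the commit names and the check function are invented for illustration, and it assumes a linear history in which every commit after the first bad one is also bad.

```python
from typing import Callable, List, Sequence

def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history is linear (oldest first), as in a simple git-bisect session.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2          # the revision "checked out" for testing
        if is_bad(commits[mid]):
            bad = mid                    # equivalent to answering "git bisect bad"
        else:
            good = mid                   # equivalent to answering "git bisect good"
    return commits[bad]

# Invented history: c0 is good, c9 is bad, the problem was introduced in c6.
history = [f"c{i}" for i in range(10)]
tested: List[str] = []

def check(commit: str) -> bool:
    """Stand-in for "build this revision and see whether the search loops"."""
    tested.append(commit)
    return int(commit[1:]) >= 6

culprit = first_bad_commit(history, check)
print(culprit, "found after testing", len(tested), "revisions")
```

The point of the exercise is that the number of builds grows with the logarithm of the number of commits, so even a range of a few hundred commits needs fewer than ten test builds.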


On 2019-04-15 15:31, Josef 'Jeff' Sipek wrote:


On Sun, Apr 14, 2019 at 21:09:54 +0800, Joan Moreau wrote:
... 


The "loop" part seems the most urgent: it breaks everything (search
times out 100% of the time)


Any luck with git-bisect?

Jeff.

On 2019-04-06 09:56, Joan Moreau via dovecot wrote:

For point 1, this is not "suboptimal", it is plain wrong (the results are damn 
wrong! And this is not related to the backend, but to the FTS logic in Dovecot core).

For point 2, this has been discussed numerous times already, but without action. The Dovecot core should be the one re-submitting the emails to scan, rather than the backend trying to figure out which emails need to be re-scanned and where they are.

For point 3, I will do a bit of research in the existing code and get back to you.

For point 4, this is random. The FTS backend (xapian, lucene, solr, whatever) returns X, then Dovecot core chooses to select only Y emails. This is a clear bug.

On 2019-04-05 20:08, Josef 'Jeff' Sipek via dovecot wrote: 
On Fri, Apr 05, 2019 at 19:33:57 +0800, Joan Moreau via dovecot wrote: Hi 

If you plan to fix the FTS part of Dovecot, I will be very grateful.

I'm trying to figure out what is causing the 3rd issue you listed, so we can
decide how severe it is and therefore how quickly it needs to be fixed.  At
the moment we are unable to reproduce it, and therefore we cannot fix it.

Not sure this is related to any specific commit, but rather to the overall
design

Ok.


The list of bugs so far 


1 - Double call to fts plugins with inconsistent parameters (first call
different from second call for the same request)

Understood.  It is my understanding that this is simply suboptimal rather
than causing crashes/etc.

2 - The "rescan" feature for now consists of deleting indexes. It should
instead resend the emails to rescan to the fts plugin

I'm not sure I follow.  The rescan operation is invoked on the fts backend
and it is up to the implementation to somehow ensure that after it is done
the fts index is up to date.  The easiest way to implement it is to simply
delete the fts index and re-index all the mails.  That is what currently
happens in the solr backend.

The lucene fts backend does a more complicated matching of the fts index
with the emails.  Finally, the deprecated squat backend seem to ignore the
rescan requests (its rescan vfunc is NULL).

3 - the loop when body search (just do a "doveadm search -u user@domain
mailbox inbox text whatevertexte") 


Refer to my email to Timo on 2019-04-03 18:30 on the same thread for bug
details 

(especially the loop)

This seems to be the most important of the 4 issues you listed, so I'd like
to focus on this one for now.

As I mentioned, we cannot reproduce this ourselves.  So, we need your help
to narrow things down.  Therefore, can you give us the commit hashes of
revisions that you know are good and which are bad?  You can use git-bisect
to narrow the range down.

4 - Most notably, I notice that header search usually does not use
the fts plugin (even with fts_enforced) and relies on some internal
search, which is total nonsense

You're right, that doesn't seem to make sense.  Can you provide a test case?


Jeff.

Let me know how I can help on those 4 points


On 2019-04-05 18:37, Josef 'Jeff' Sipek wrote:

On Fri, Apr 05, 2019 at 17:45:36 +0800, Joan Moreau wrote: 

I am on master (very latest) 

No clue exactly when this problem appears, but 


1 - the "request twice the fts plugin instead of once" issue has always
been there (since my first RC release of fts-xapian) 
Ok, good to know.


2 - the body/text loop has appeared recently (maybe during the month of
March)

Our testing doesn't seem to be able to reproduce this.  Can you try to
git-bisect this to find which commit broke it?

Thanks,

Jeff.

On 2019-04-05 16:36, Josef 'Jeff' Sipek via dovecot wrote:

On Wed, Apr 03, 2019 at 19:02:52 +0800, Joan Moreau via dovecot wrote: 

issue seems in the Git version : 
Which git revision?


Before you updated to the broken revision, which revision/version were you
running?

Can you try it with 5f6e39c50ec79ba8847b2fdb571a9152c71cd1b6 (the commit
just before the fts_enforced=body introduction)?  That's the only recent fts
change.

Thanks,

Jeff.

On 2019-04-03 18:58, @lbutlr via dovecot wrote:

On 3 Apr 2019, at 04:30, Joan Moreau via dovecot  wrote: 

doveadm search -u j...@grosjo.net mailbox inbox text milan 
Did that search over my list mail and got 83 results, not able to duplicate your issue.


What version of dovecot and have you tried to reindex?

dovecot-2.3.5.1 here.

Re: FTS delays

2019-04-15 Thread Josef 'Jeff' Sipek via dovecot
On Sun, Apr 14, 2019 at 21:09:54 +0800, Joan Moreau wrote:
...
> THe "loop" part seems the most urgent : It breaks everything (search
> timeout 100% of the time) 

Any luck with git-bisect?

Jeff.


Re: FTS delays

2019-04-14 Thread Joan Moreau via dovecot

I have tried to spend some time understanding the logic (if any!) of
the fts part.

Honestly, whoever created this mess should be the one to fix it, or
someone should refactor it entirely.

Basically, the fts "core" should be able to:

- select the backend according to the conf file

- send new emails/mailboxes to the backend

- send the IDs of the emails to be removed

- resend an entire mailbox ('rescan')

- send the search parameters (from the client) to the backend and return
the emails to the front end based on the backend results (and NOTHING more)

Today, the fts part is plain wrong and must be totally reviewed.

I do not have the time, but I can participate in testing if someone is
ready to roll up their sleeves on the matter.

The "loop" part seems the most urgent: it breaks everything (search
timeout 100% of the time).



Re: FTS delays

2019-04-05 Thread Joan Moreau via dovecot

For point 1, this is not "suboptimal", it is plain wrong (the results
are damn wrong! And this is not related to the backend, but to the FTS
logic in Dovecot core).

For point 2, this has already been discussed numerous times, but
without action. Dovecot core should be the one resubmitting the
emails to scan, rather than the backend trying to figure out where and
which emails need to be rescanned.

For point 3, I will do a bit of research in the existing code and
will get back to you.

For point 4, this is random. The FTS backend (xapian, lucene, solr,
whatever) returns X, then dovecot core chooses to select only Y emails.
This is a clear bug.



Re: FTS delays

2019-04-05 Thread Josef 'Jeff' Sipek via dovecot
On Fri, Apr 05, 2019 at 19:33:57 +0800, Joan Moreau via dovecot wrote:
> Hi 
> 
> If you plan to fix the FTS part of Dovecot, I will be very grateful.

I'm trying to figure out what is causing the 3rd issue you listed, so we can
decide how severe it is and therefore how quickly it needs to be fixed.  At
the moment we are unable to reproduce it, and therefore we cannot fix it.

> Not sure this is related to any specific commit, but rather to the overall
> design

Ok.

> The list of bugs so far 
> 
> 1 - Double call to fts plugins with inconsistent parameters (first call
> different from second call for the same request)

Understood.  It is my understanding that this is simply suboptimal rather
than causing crashes/etc.

> 2 - The "rescan" feature for now consists of deleting indexes. It should
> resend the emails to rescan to the fts plugin instead

I'm not sure I follow.  The rescan operation is invoked on the fts backend
and it is up to the implementation to somehow ensure that after it is done
the fts index is up to date.  The easiest way to implement it is to simply
delete the fts index and re-index all the mails.  That is what currently
happens in the solr backend.

The lucene fts backend does a more complicated matching of the fts index
with the emails.  Finally, the deprecated squat backend seems to ignore the
rescan requests (its rescan vfunc is NULL).

> 3 - the loop on body search (just do a "doveadm search -u user@domain
> mailbox inbox text whatevertext")
> 
> Refer to my email to Timo on 2019-04-03 18:30 on the same thread for bug
> details 
> 
> (especially the loop) 

This seems to be the most important of the 4 issues you listed, so I'd like
to focus on this one for now.

As I mentioned, we cannot reproduce this ourselves.  So, we need your help
to narrow things down.  Therefore, can you give us the commit hashes of
revisions that you know are good and which are bad?  You can use git-bisect
to narrow the range down.
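
As a rough illustration of the bisect asked for here, the sketch below maps the outcome of a search to the exit codes `git bisect run` understands. The function name `bisect_verdict`, the 60-second limit and the rebuild step are assumptions made for the example, not anything from the Dovecot tree, and `timeout` is assumed to be the GNU coreutils one.

```shell
#!/bin/sh
# Hypothetical helper, not part of Dovecot.
# Translates the result of a command into git bisect exit codes:
#   0   = good (the command finished within the limit)
#   1   = bad  (the command hit the limit, i.e. the search looped)
#   125 = skip (the command failed for some unrelated reason)
bisect_verdict() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    case $status in
        0)   return 0 ;;
        124) return 1 ;;    # GNU timeout exits with 124 when the limit is hit
        *)   return 125 ;;
    esac
}

# Assumed workflow (adjust to your own build and install steps):
#   git bisect start
#   git bisect bad                     # the revision that shows the loop
#   git bisect good <known-good-rev>   # a revision you know worked
#   ... at each step: rebuild, reinstall, restart dovecot, then run
#   bisect_verdict 60 doveadm search -u user@domain mailbox inbox text whatevertexte
#   git bisect reset                   # when finished
```

A script that rebuilds and then calls this function could be handed to `git bisect run`, which treats 0 as good, 125 as skip and other codes from 1 to 127 as bad.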

> 4 - Most notably, I notice that header search usually does not use
> the fts plugin (even with fts_enforced) and relies on some internal
> search, which makes no sense

You're right, that doesn't seem to make sense.  Can you provide a test case?

Jeff.

> Let me know how I can help on those 4 points
> 

-- 
What is the difference between Mechanical Engineers and Civil Engineers?
Mechanical Engineers build weapons, Civil Engineers build targets.


Re: FTS delays

2019-04-05 Thread Joan Moreau via dovecot
Hi

If you plan to fix the FTS part of Dovecot, I will be very grateful.
Not sure this is related to any specific commit, but rather to the overall
design.

The list of bugs so far:

1 - Double call to fts plugins with inconsistent parameters (first call
different from second call for the same request)

2 - The "rescan" feature for now consists of deleting indexes. It should
resend the emails to rescan to the fts plugin instead

3 - the loop on body search (just do a "doveadm search -u user@domain
mailbox inbox text whatevertext")

Refer to my email to Timo on 2019-04-03 18:30 on the same thread for bug
details (especially the loop)

4 - Most notably, I notice that header search usually does not use
the fts plugin (even with fts_enforced) and relies on some internal
search, which makes no sense

Let me know how I can help on those 4 points



Re: FTS delays

2019-04-05 Thread Josef 'Jeff' Sipek via dovecot
On Fri, Apr 05, 2019 at 17:45:36 +0800, Joan Moreau wrote:
> I am on master (very latest) 
> 
> No clue exactly when this problem appears, but 
> 
> 1 - the "request twice the fts plugin instead of once" issue has always
> been there (since my first RC release of fts-xapian) 

Ok, good to know.

> 2 - the body/text loop has appeared recently (maybe during the month of
> March) 

Our testing doesn't seem to be able to reproduce this.  Can you try to
git-bisect this to find which commit broke it?

Thanks,

Jeff.


-- 
I already backed up the [server] once, I can do it again.
- a sysadmin threatening to do more frequent backups


Re: FTS delays

2019-04-05 Thread Joan Moreau via dovecot
I am on master (very latest) 

No clue exactly when this problem appears, but 


1 - the "request twice the fts plugin instead of once" issue has always
been there (since my first RC release of fts-xapian) 


2 - the body/text loop has appeared recently (maybe during the month of
March) 



Re: FTS delays

2019-04-05 Thread Josef 'Jeff' Sipek via dovecot
On Wed, Apr 03, 2019 at 19:02:52 +0800, Joan Moreau via dovecot wrote:
> issue seems in the Git version : 

Which git revision?

Before you updated to the broken revision, which revision/version were you
running?

Can you try it with 5f6e39c50ec79ba8847b2fdb571a9152c71cd1b6 (the commit
just before the fts_enforced=body introduction)?  That's the only recent fts
change.

Thanks,

Jeff.

> On 2019-04-03 18:58, @lbutlr via dovecot wrote:
> 
> > On 3 Apr 2019, at 04:30, Joan Moreau via dovecot  
> > wrote: 
> > 
> >> doveadm search -u j...@grosjo.net mailbox inbox text milan
> > 
> > Did that search over my list mail and got 83 results, not able to duplicate 
> > your issue.
> > 
> > What version of dovecot and have you tried to reindex?
> > 
> > dovecot-2.3.5.1 here.

-- 
mainframe, n.:
  An obsolete device still used by thousands of obsolete companies serving
  billions of obsolete customers and making huge obsolete profits for their
  obsolete shareholders. And this year's run twice as fast as last year's.


Re: FTS delays

2019-04-03 Thread Joan Moreau via dovecot
The issue seems to be in the Git version:

FTS search in the body ends up looping

Other searches call the FTS plugin twice (for no reason)


On 2019-04-03 18:58, @lbutlr via dovecot wrote:

> On 3 Apr 2019, at 04:30, Joan Moreau via dovecot wrote:
>
>> doveadm search -u j...@grosjo.net mailbox inbox text milan
>
> Did that search over my list mail and got 83 results, not able to duplicate
> your issue.
>
> What version of dovecot and have you tried to reindex?
>
> dovecot-2.3.5.1 here.

Re: FTS delays

2019-04-03 Thread @lbutlr via dovecot
On 3 Apr 2019, at 04:30, Joan Moreau via dovecot  wrote:
> doveadm search -u j...@grosjo.net mailbox inbox text milan

Did that search over my list mail and got 83 results, not able to duplicate 
your issue.

What version of dovecot and have you tried to reindex?

dovecot-2.3.5.1 here.
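
A reindex of the kind asked about above could look roughly like the sketch below. The wrapper `reindex_user` and the `DOVEADM` override are hypothetical and exist only so the commands can be previewed; `doveadm fts rescan` and `doveadm index` are the standard doveadm subcommands. What the rescan actually does depends on the fts backend in use.

```shell
#!/bin/sh
# Sketch only. DOVEADM is a hypothetical override: set DOVEADM=echo to print
# the commands instead of running them.
reindex_user() {
    user=$1
    doveadm_bin=${DOVEADM:-doveadm}
    # First ask the fts backend to rescan its index for this user, then
    # index all of the user's mailboxes again ('*' is the mailbox mask).
    "$doveadm_bin" fts rescan -u "$user" &&
        "$doveadm_bin" index -u "$user" '*'
}

# Example: reindex_user user@domain
```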


-- 
There is a tragic flaw in our precious Constitution, and I don't know
what can be done to fix it. This is it: Only nut cases want to be
president.





Re: FTS delays

2019-04-03 Thread Joan Moreau via dovecot
Example from real life 

From Roundcube, I search for "milan" in the full message (body & headers) 


Logs : 


Apr 3 10:24:01 gjserver dovecot[29778]:
imap(j...@grosjo.net)<30311><4pACp52FfCF/AAAB>: Query : ( bcc:milan OR
body:milan OR cc:milan OR from:milan OR message-id:milan OR
subject:milan OR to:milan OR uid:milan )
Apr 3 10:24:01 gjserver dovecot[29778]:
imap(j...@grosjo.net)<30311><4pACp52FfCF/AAAB>: Query: 81 results in 2 ms


81 results is correct 

but Roundcube times out 

from command line, I do : 

doveadm search -u j...@grosjo.net mailbox inbox text milan 

output 


doveadm(j...@grosjo.net): Info: Query : ( bcc:inbox OR body:inbox OR
cc:inbox OR from:inbox OR message-id:inbox OR subject:inbox OR to:inbox
OR uid:inbox ) AND ( bcc:milan OR body:milan OR cc:milan OR from:milan
OR message-id:milan OR subject:milan OR to:milan OR uid:milan )
doveadm(j...@grosjo.net): Info: Query: 1 results in 1 ms
d82b4b0f550d3859364495331209 847
d82b4b0f550d3859364495331209 1569
d82b4b0f550d3859364495331209 2260
d82b4b0f550d3859364495331209 2575
d82b4b0f550d3859364495331209 2811
d82b4b0f550d3859364495331209 2885
d82b4b0f550d3859364495331209 3038
d82b4b0f550d3859364495331209 3121
d82b4b0f550d3859364495331209 3170 

1 - The query is wrong (the mailbox name "inbox" is passed as a search term) 

2 - the last line "d8...209 3170" gets repeated for ages 



Re: FTS delays

2019-04-02 Thread Timo Sirainen via dovecot
On 2 Apr 2019, at 6.38, Joan Moreau via dovecot  wrote:
> 
> Further on this topic:
> 
> 
> 
> When choosing any headers in the search box, dovecot core calls the plugin 
> TWICE (and returns the results quickly, but not immediately after getting the 
> IDs from the plugin)
> 
> When choosing the BODY search, dovecot core calls the plugin ONCE (and never 
> returns), whereas the plugin properly returns the IDs
> 

If we simplify this, do you mean this calls it once and is fast:

doveadm search -u user@domain mailbox inbox body helloworld

But this calls twice and is slow:

doveadm search -u user@domain mailbox inbox text helloworld

And what about searching e.g. subject? :

doveadm search -u user@domain mailbox inbox subject helloworld

And does the slowness depend on whether there were any matches or not?
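
One way to compare the three searches above is to time each of them with a hard limit, so that a looping search cannot hang the comparison. This is only a sketch: the helper name `elapsed_seconds` and the 120-second limit are assumptions, and `timeout` is assumed to be the GNU coreutils one.

```shell
#!/bin/sh
# Hypothetical helper: prints how many whole seconds a command took,
# giving up after the limit in $1.
elapsed_seconds() {
    limit=$1
    shift
    start=$(date +%s)
    # The exit status is ignored on purpose; only the duration matters here.
    timeout "$limit" "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

# Intended use, with the placeholder user and search term from above:
#   for key in body text subject; do
#       echo "$key: $(elapsed_seconds 120 doveadm search -u user@domain mailbox inbox "$key" helloworld)s"
#   done
```

A result equal to the limit would suggest that the search never returned on its own.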

> This is based on GIT version. (previous versions were working properly)

Previous versions were fast? Do you mean v2.3.5?



Re: FTS delays

2019-04-01 Thread Joan Moreau via dovecot
Further on this topic: 


When choosing any headers in the search box, dovecot core calls the
plugin TWICE (and returns the results quickly, but not immediately after
getting the IDs from the plugin) 


When choosing the BODY search, dovecot core calls the plugin ONCE (and
never returns), whereas the plugin properly returns the IDs 

This is based on the GIT version (previous versions were working properly). 

Looking for feedback 


Thank you


Re: FTS delays

2019-03-30 Thread Joan Moreau via dovecot

it is already on

On March 31, 2019 03:47:52 Aki Tuomi via dovecot wrote:

> On 30 March 2019 21:37 Joan Moreau via dovecot wrote:
>
>> Hi
>>
>> When I do an FTS search (using the Xapian plugin) in the BODY part, the
>> plugin returns the matching IDs within a few milliseconds (as seen in
>> the log).
>>
>> However, roundcube (connected to dovecot) takes ages to show (headers
>> only, via IMAP) the few results (I tested with a request matching 9
>> emails)
>>
>> What could be the root cause?
>>
>> Thank you
>
> does it help if you set
>
> plugin {
>   fts_enforced=yes
> }
> ---
> Aki Tuomi




Re: FTS delays

2019-03-30 Thread Aki Tuomi via dovecot


 
 
  
   
  
  
   
On 30 March 2019 21:37 Joan Moreau via dovecot wrote:

> Hi
>
> When I do an FTS search (using the Xapian plugin) in the BODY part, the
> plugin returns the matching IDs within a few milliseconds (as seen in
> the log).
>
> However, roundcube (connected to dovecot) takes ages to show (headers
> only, via IMAP) the few results (I tested with a request matching 9
> emails)
>
> What could be the root cause?
>
> Thank you

does it help if you set

plugin {
  fts_enforced=yes
}

---
Aki Tuomi