Re: Incorrect value sent to solr

2021-04-07 Thread Aki Tuomi


> On 07/04/2021 22:58 John Fawcett  wrote:
> 
>  
> On 07/04/2021 13:36, Łukasz Szczepański wrote:
> > I'm not that familiar with C, but I don't see any sign in Dovecot's Solr
> > backend of subsequent queries for a single-mailbox lookup (which most
> > mail clients use). There is a hard limit of 10 rows for multi-mailbox
> > lookups.
> 
> 
> This was reported back in December 2020. I submitted this fix to the
> list on 31/12/2020 to give single-mailbox queries the same upper bound
> that is already in place for multi-mailbox queries.
> 
> diff -ur dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c
> --- dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c  2020-08-12 14:20:41.0 +0200
> +++ dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c   2020-12-31 09:05:07.681897716 +0100
> @@ -838,7 +838,7 @@
> 
>     str = t_str_new(256);
>     str_printfa(str,
> "wt=xml&fl=uid,score&rows=%u&sort=uid+asc&q=%%7b!lucene+q.op%%3dAND%%7d",
> -   status.uidnext);
> +   I_MIN(status.uidnext,SOLR_MAX_MULTI_ROWS));
>     prefix_len = str_len(str);
> 
>     if (solr_add_definite_query_args(str, args, and_args)) {

Hi!

Thanks for reminding us, I'll make a ticket about this to avoid forgetting it 
again.

Aki


Re: Incorrect value sent to solr

2021-04-07 Thread John Fawcett
On 07/04/2021 13:36, Łukasz Szczepański wrote:
> I'm not that familiar with C, but I don't see any sign in Dovecot's Solr
> backend of subsequent queries for a single-mailbox lookup (which most
> mail clients use). There is a hard limit of 10 rows for multi-mailbox
> lookups.


This was reported back in December 2020. I submitted this fix to the
list on 31/12/2020 to give single-mailbox queries the same upper bound
that is already in place for multi-mailbox queries.

diff -ur dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c
--- dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c  2020-08-12 14:20:41.0 +0200
+++ dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c   2020-12-31 09:05:07.681897716 +0100
@@ -838,7 +838,7 @@

    str = t_str_new(256);
    str_printfa(str,
"wt=xml&fl=uid,score&rows=%u&sort=uid+asc&q=%%7b!lucene+q.op%%3dAND%%7d",
-   status.uidnext);
+   I_MIN(status.uidnext,SOLR_MAX_MULTI_ROWS));
    prefix_len = str_len(str);

    if (solr_add_definite_query_args(str, args, and_args)) {
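
The effect of the patch can be illustrated with a minimal standalone sketch. It is not Dovecot code: EXAMPLE_MAX_ROWS, EXAMPLE_MIN and both function names are hypothetical stand-ins for SOLR_MAX_MULTI_ROWS, I_MIN and the query-building code, and the cap value used here is an assumption chosen only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: illustrative cap only. The real limit is whatever
   SOLR_MAX_MULTI_ROWS is defined as in fts-backend-solr.c. */
#define EXAMPLE_MAX_ROWS 100000u

/* Local stand-in for Dovecot's I_MIN macro. */
#define EXAMPLE_MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Returns the value to send as rows= for a mailbox whose next UID
   is uidnext: uidnext itself, unless it exceeds the cap. */
static uint32_t example_rows_param(uint32_t uidnext)
{
	return EXAMPLE_MIN(uidnext, EXAMPLE_MAX_ROWS);
}

/* Writes a shortened version of the query prefix into buf and
   returns the snprintf() result. */
static int example_build_prefix(char *buf, size_t size, uint32_t uidnext)
{
	return snprintf(buf, size,
			"wt=xml&fl=uid,score&rows=%u&sort=uid+asc",
			(unsigned int)example_rows_param(uidnext));
}
```

With this clamp a small uidnext is passed through unchanged, while a uidnext such as 2206267083 is sent as the cap instead, so the rows value stays far below the signed 32-bit limit.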



Re: Incorrect value sent to solr

2021-04-07 Thread Łukasz Szczepański
I'm not that familiar with C, but I don't see any sign in Dovecot's Solr
backend of subsequent queries for a single-mailbox lookup (which most
mail clients use). There is a hard limit of 10 rows for multi-mailbox
lookups.


On 2021-04-07 12:48, Shawn Heisey wrote:

On 4/7/2021 4:13 AM, Łukasz Szczepański wrote:

I've prepared a pull request that fixes the rows parameter:
https://github.com/dovecot/core/pull/160
I've tested this fix on the mailbox that caused the problem earlier, and
it works fine.


I admit to not being very familiar with dovecot source, but I did a
little digging and that fix looks good to my untrained eye.

If somebody has 2.2 billion or more messages in one place, the query
will still fail, but it should be pretty rare to ever run into that
use case in the wild.  And I think it might cause performance problems
in dovecot too.

Does the solr backend code ever use pagination on the search results
by sending a fixed rows value and increasing start value on subsequent
queries?

I think that I and my fellow Solr committers need to treat this as a
bug on our end, since it would be perfectly valid to request that many
rows on a distributed index.  A query like that where there really are
that many rows would be embarrassingly slow to execute and require a
LOT of memory, but it's still valid.  I am asking on the solr users
list whether they think I should file a bug report.

Thanks,
Shawn


--
Łukasz Szczepański
+48 58 3509284
www.webd.pl
Globtel Internet


Re: Incorrect value sent to solr

2021-04-07 Thread Shawn Heisey

On 4/7/2021 4:13 AM, Łukasz Szczepański wrote:

I've prepared a pull request that fixes the rows parameter:
https://github.com/dovecot/core/pull/160
I've tested this fix on the mailbox that caused the problem earlier, and
it works fine.


I admit to not being very familiar with dovecot source, but I did a 
little digging and that fix looks good to my untrained eye.


If somebody has 2.2 billion or more messages in one place, the query 
will still fail, but it should be pretty rare to ever run into that use 
case in the wild.  And I think it might cause performance problems in 
dovecot too.


Does the solr backend code ever use pagination on the search results by 
sending a fixed rows value and increasing start value on subsequent queries?
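
The kind of pagination asked about here can be sketched as follows. This is only an illustration of the technique using Solr's standard start and rows query parameters; it does not describe what Dovecot's Solr backend actually does. EXAMPLE_PAGE_ROWS and both function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical fixed page size sent as rows= on every request. */
#define EXAMPLE_PAGE_ROWS 1000u

/* Builds the query parameters for the given zero-based page:
   rows stays fixed while start advances by one page per request.
   Returns the snprintf() result. */
static int example_page_query(char *buf, size_t size, unsigned int page)
{
	unsigned long long start =
		(unsigned long long)page * EXAMPLE_PAGE_ROWS;

	return snprintf(buf, size,
			"wt=xml&fl=uid,score&start=%llu&rows=%u&sort=uid+asc",
			start, EXAMPLE_PAGE_ROWS);
}

/* A full page suggests more results may follow; a short page
   means the result set is exhausted. */
static int example_more_pages(unsigned int rows_returned)
{
	return rows_returned == EXAMPLE_PAGE_ROWS;
}
```

A caller would request page 0, 1, 2 and so on until a response contains fewer than EXAMPLE_PAGE_ROWS results, so no single request ever needs a rows value near the integer limit.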


I think that I and my fellow Solr committers need to treat this as a bug 
on our end, since it would be perfectly valid to request that many rows 
on a distributed index.  A query like that where there really are that 
many rows would be embarrassingly slow to execute and require a LOT of 
memory, but it's still valid.  I am asking on the solr users list 
whether they think I should file a bug report.


Thanks,
Shawn


Re: Incorrect value sent to solr

2021-04-07 Thread Łukasz Szczepański

I've prepared a pull request that fixes the rows parameter:
https://github.com/dovecot/core/pull/160
I've tested this fix on the mailbox that caused the problem earlier, and
it works fine.


On 2021-04-03 23:53, Łukasz Szczepański wrote:

I got the error java.lang.NumberFormatException: For input string:
"2206267083" from the logs section in Solr; it is visible when you
expand the error.
NumberFormatException is raised when the value passed is not a valid
number. Quick tests show that Solr expects a signed int here.
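
A small sketch of why this particular value is rejected: 2206267083 is a valid unsigned 32-bit number but is larger than the signed 32-bit maximum of 2147483647. The function name below is hypothetical, and the check only mimics a signed 32-bit parse; it is not Solr's actual parsing code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns true if s is a decimal number that fits in a signed
   32-bit integer, false if it is malformed or out of range. */
static bool example_parses_as_int32(const char *s)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 10);
	if (errno != 0 || end == s || *end != '\0')
		return false;
	return v >= INT32_MIN && v <= INT32_MAX;
}
```

Under this check "2147483647" is accepted, while "2147483648" and "2206267083" are both rejected as out of range.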

I know that using uid is not the correct way, but Dovecot seems to be
doing just that:
https://github.com/dovecot/core/blob/57069b23e6515875796473bdd4ec4bf90343fb25/src/plugins/fts-solr/fts-backend-solr.c#L841

On 2021-04-03 23:15, Shawn Heisey wrote:

On 4/3/2021 12:12 PM, Łukasz Szczepański wrote:
rows isn't part of the managed-schema for a Solr collection; it is one
of Solr's built-in common query parameters, and its type cannot be
changed. It controls the number of results returned (10 by default).
I still think this is an issue on the Dovecot side, but I was also
wrong: this parameter is important and has to be sent. Dovecot should
send the count of messages in the mailbox (or IMAP folder), not the
highest UID in the folder.


I just tried a query on a stock Solr example with rows set to
30 and got an error similar to the one you reported.

org.apache.solr.common.SolrException: For input string: "30"

No mention of "int" in this error.  I'm wondering if that part was
something you added to the email, not text that was actually in the
error.

I was using Solr 8.5.1, the newest version I have downloaded right 
now.


This error also occurs in cloud mode. With distributed indexes, it
could be perfectly acceptable to use a rows parameter that exceeds
what a signed int can hold.  Performance would probably suck when
using a value that high, but it would be acceptable.  Which smells
like a bug in Solr.

I can tell you that using a value from uid in the rows parameter is
almost certainly not the right thing to do.  That field would have no
connection to rows.

Thanks,
Shawn


--
Łukasz Szczepański
+48 58 3509284
www.webd.pl
Globtel Internet