Re: Incorrect value sent to solr
> On 07/04/2021 22:58 John Fawcett wrote: > > > On 07/04/2021 13:36, Łukasz Szczepański wrote: > > I'm not as familiar with C, but I don't see in solr backed in dovecot > > any clue of subsequent queries for single mailbox lookup (which most > > mail client uses). There is a hard limit of 10 rows for multiple > > mailbox lookup. > > > This was reported back in December 2020. I submitted this fix to the > list on 31/12/2020 to give an upper bound for single mailbox queries as > there is in place for multiple mailbox queries. > > diff -ur dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c > dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c > > --- dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c > 2020-08-12 14:20:41.0 +0200 > +++ dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c > 2020-12-31 09:05:07.681897716 +0100 > @@ -838,7 +838,7 @@ > > str = t_str_new(256); > str_printfa(str, > "wt=xml&fl=uid,score&rows=%u&sort=uid+asc&q=%%7b!lucene+q.op%%3dAND%%7d", > - status.uidnext); > + I_MIN(status.uidnext,SOLR_MAX_MULTI_ROWS)); > prefix_len = str_len(str); > > if (solr_add_definite_query_args(str, args, and_args)) { Hi! Thanks for reminding us, I'll make a ticket about this to avoid forgetting it again. Aki
Re: Incorrect value sent to solr
On 07/04/2021 13:36, Łukasz Szczepański wrote: > I'm not as familiar with C, but I don't see in solr backed in dovecot > any clue of subsequent queries for single mailbox lookup (which most > mail client uses). There is a hard limit of 10 rows for multiple > mailbox lookup. This was reported back in December 2020. I submitted this fix to the list on 31/12/2020 to give an upper bound for single mailbox queries as there is in place for multiple mailbox queries. diff -ur dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c --- dovecot-2.3.11.3-orig/src/plugins/fts-solr/fts-backend-solr.c 2020-08-12 14:20:41.0 +0200 +++ dovecot-2.3.11.3/src/plugins/fts-solr/fts-backend-solr.c 2020-12-31 09:05:07.681897716 +0100 @@ -838,7 +838,7 @@ str = t_str_new(256); str_printfa(str, "wt=xml&fl=uid,score&rows=%u&sort=uid+asc&q=%%7b!lucene+q.op%%3dAND%%7d", - status.uidnext); + I_MIN(status.uidnext,SOLR_MAX_MULTI_ROWS)); prefix_len = str_len(str); if (solr_add_definite_query_args(str, args, and_args)) {
Re: Incorrect value sent to solr
I'm not as familiar with C, but I don't see in solr backed in dovecot any clue of subsequent queries for single mailbox lookup (which most mail client uses). There is a hard limit of 10 rows for multiple mailbox lookup. W dniu 2021-04-07 12:48, Shawn Heisey napisał(a): On 4/7/2021 4:13 AM, Łukasz Szczepański wrote: I've prepared pull request with fixed rows parameter: https://github.com/dovecot/core/pull/160 I've tested this fix on mailbox which caused a problem earlier, and its works fine. I admit to not being very familiar with dovecot source, but I did a little digging and that fix looks good to my untrained eye. If somebody has 2.2 billion or more messages in one place, the query will still fail, but it should be pretty rare to ever run into that use case in the wild. And I think it might cause performance problems in dovecot too. Does the solr backend code ever use pagination on the search results by sending a fixed rows value and increasing start value on subsequent queries? I think that I and my fellow Solr committers need to treat this as a bug on our end, since it would be perfectly valid to request that many rows on a distributed index. A query like that where there really are that many rows would be embarrassingly slow to execute and require a LOT of memory, but it's still valid. I am asking on the solr users list whether they think I should file a bug report. Thanks, Shawn -- Łukasz Szczepański +48 58 3509284 www.webd.pl Globtel Internet
Re: Incorrect value sent to solr
On 4/7/2021 4:13 AM, Łukasz Szczepański wrote: I've prepared pull request with fixed rows parameter: https://github.com/dovecot/core/pull/160 I've tested this fix on mailbox which caused a problem earlier, and its works fine. I admit to not being very familiar with dovecot source, but I did a little digging and that fix looks good to my untrained eye. If somebody has 2.2 billion or more messages in one place, the query will still fail, but it should be pretty rare to ever run into that use case in the wild. And I think it might cause performance problems in dovecot too. Does the solr backend code ever use pagination on the search results by sending a fixed rows value and increasing start value on subsequent queries? I think that I and my fellow Solr committers need to treat this as a bug on our end, since it would be perfectly valid to request that many rows on a distributed index. A query like that where there really are that many rows would be embarrassingly slow to execute and require a LOT of memory, but it's still valid. I am asking on the solr users list whether they think I should file a bug report. Thanks, Shawn
Re: Incorrect value sent to solr
I've prepared pull request with fixed rows parameter: https://github.com/dovecot/core/pull/160 I've tested this fix on mailbox which caused a problem earlier, and its works fine. W dniu 2021-04-03 23:53, Łukasz Szczepański napisał(a): Error java.lang.NumberFormatException: For input string: "2206267083" I got from logs section in solr, it's visible when you expand an error. NumberFormatException is raised when passed value isn't correct number value. Quick tests shows that in this place solr expects signed int. I know that using uid is not the correct way, but Dovecot seems to be doing just that: https://github.com/dovecot/core/blob/57069b23e6515875796473bdd4ec4bf90343fb25/src/plugins/fts-solr/fts-backend-solr.c#L841 W dniu 2021-04-03 23:15, Shawn Heisey napisał(a): On 4/3/2021 12:12 PM, Łukasz Szczepański wrote: Rows isn't part of manage-schema for solr collection its build in common query parameters in solr, and type of this field cannot be changed. It controls the number of returned results (by default 10). I still think that an issue on dovecot side, but I was also wrong, this parameter is important and have to be sent. Dovecot should send a count of messages in mailbox (or imap folder) not the highest uid in folder. I just tried a query on a stock Solr example with rows set to 30 and got an error similar to the one you reported. org.apache.solr.common.SolrException: For input string: "30" No mention of "int" in this error. I'm wondering if that part was something you added to the email, not text that was actually in the error. I was using Solr 8.5.1, the newest version I have downloaded right now. This error also occurs in cloud mode. With distributed indexes, it could be perfectly acceptable to use a rows parameter that exceeds what a signed int can hold. Performance would probably suck when using a value that high, but it would be acceptable. Which smells like a bug in Solr. I can tell you that using a value from uid in the rows parameter is almost certainly not the right thing to do. That field would have no connection to rows. Thanks, Shawn -- Łukasz Szczepański +48 58 3509284 www.webd.pl Globtel Internet