------------------ Original ------------------ From: "dovecot" <dovecot-requ...@dovecot.org>; Date: Tue, Mar 30, 2021 11:55 PM To: "dovecot"<dovecot@dovecot.org>;
Subject: dovecot Digest, Vol 215, Issue 77 Send dovecot mailing list submissions to dovecot@dovecot.org To subscribe or unsubscribe via the World Wide Web, visit https://dovecot.org/mailman/listinfo/dovecot or, via email, send a message with subject or body 'help' to dovecot-requ...@dovecot.org You can reach the person managing the list at dovecot-ow...@dovecot.org When replying, please edit your Subject line so it is more specific than "Re: Contents of dovecot digest..." Today's Topics: 1. special characters in FTS Solr fail (pi...@piumalab.org) 2. json_parse_number broken by compiler optimization (Christian Ehrhardt) 3. Re: json_parse_number broken by compiler optimization (Aki Tuomi) ---------------------------------------------------------------------- Message: 1 Date: Tue, 30 Mar 2021 16:20:56 +0200 From: "pi...@piumalab.org" <pi...@piumalab.org> To: dovecot@dovecot.org Subject: special characters in FTS Solr fail Message-ID: <3c779646-e2fc-9f51-063c-a8816450d...@piumalab.org> Content-Type: text/plain; charset="utf-8" Hi, I want to report a bug about searching. My dovecot version is 2.3.10.1 (a3d0e1171) and use Solr 7.7.2 for full text search. The issue appear when try to search using solr special characters. For example if I have emails with the subject that contain the only character "(" open parenthesis, then try to search it I receive zero results. In the Solr log seems the "(" is correctly excaped as "\(": 2021-03-30 15:14:20.304 INFO (qtp333683827-20) [ x:dovecot] o.a.s.c.S.Request [dovecot] webapp=/solr path=/select params={q={!lucene+q.op%3DAND}subject:\(&fl=uid,score&sort=uid+asc&fq=%2Bbox:6fff5821c5a2365df4060000d0d97f83+%2Buser:titolar...@kendama.it&rows=631&wt=xml} hits=0 status=0 QTime=2 I provide dovecot-sysreport as attachment. Best regards Danilo -------------- next part -------------- A non-text attachment was scrubbed... Name: dovecot-sysreport-pecbox02.par-tec.it-1617116347.tar.gz Type: application/gzip Size: 7395 bytes Desc: not available URL: <https://dovecot.org/pipermail/dovecot/attachments/20210330/08492366/attachment-0001.gz> ------------------------------ Message: 2 Date: Tue, 30 Mar 2021 17:53:27 +0200 From: Christian Ehrhardt <christian.ehrha...@canonical.com> To: dovecot@dovecot.org Subject: json_parse_number broken by compiler optimization Message-ID: <CAATJJ0JB9e4cZrS=hiM0r=yf4-fz693zk3qla41l-ryvfy8...@mail.gmail.com> Content-Type: text/plain; charset="UTF-8" Hi, the recent Ubuntu (re)builds uncovered an issue with dovecot 1:2.3.13+dfsg1-1 build log: https://launchpadlibrarian.net/529849650/buildlog_ubuntu-hirsute-amd64.dovecot_1%3A2.3.13+dfsg1-1build1_BUILDING.txt.gz A coworker tried 2.3.14 but got the same result. What fails is the json_parser build time test like: test-json-parser.c:161: Assert(#25) failed: null_strcmp(json_output[pos].value, value) == 0 I was looking into that a bit more and what I found is that it is dependent on the new toolchain of gcc 10.2.0-1. Not all calls to json_parse_* fail, e.g. the first one looks all good and passes I was iterating the tests using a report function defined like (gdb) define repcon >c >p pos >p json_output[pos].type >p type >p json_output[pos].value >p value >call null_strcmp(json_output[pos].value, value) >end The first one to be bad was: Breakpoint 2, test_json_parser_success (full_size=<optimized out>) at test-json-parser.c:161 161 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos); $84 = 25 $85 = JSON_TYPE_NUMBER $86 = JSON_TYPE_NUMBER $87 = 0x555555633b25 "-12.456" $88 = 0x555555693110 "" $89 = 45 Earlier and later parsing was happy, for example Breakpoint 2, test_json_parser_success (full_size=<optimized out>) at test-json-parser.c:161 161 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos); $90 = 27 $91 = JSON_TYPE_NUMBER $92 = JSON_TYPE_NUMBER $93 = 0x555555633b32 "12.456e9" $94 = 0x555555693110 "12.456e9" $95 = 0 (gdb) We have two things we compare here. 1. json_output[] which is a static define and for this value is 67 ????{ JSON_TYPE_NUMBER, "-12.456" }, 2. the return value that json_parse_next returns. 25 ????" \"sub2\":-12.456,\n" 148 ????????????????ret = json_parse_next(parser, &type, &value); I tried a non negative number and got success which is suspicious Breakpoint 5, test_json_parser_success (full_size=<optimized out>) at test-json-parser.c:164 164 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos); $122 = 25 $123 = JSON_TYPE_NUMBER $124 = JSON_TYPE_NUMBER $125 = 0x555555633b2c "12.456" $126 = 0x555555693110 "12.456" $127 = 0 (gdb) Also the return value otherwise LGTM, it is recognized as a number: 540 } else if ((ret = json_parse_number(parser, value_r)) >= 0) { (gdb) n 541 *type_r = JSON_TYPE_NUMBER; So parsing worked one might think, but the value was not set. (gdb) p *value_r $833 = 0x555555693110 "" It turns out json_parse_number is too optimized. Too optimized for debugging further into json_parse_number. But also optimization is causing the issue. With -O0 or -O1 it works fine, even putting the lower optimization level just around json_parse_number it is enough to avoid the issue. Not a fix but "avoidance" for now: --- a/src/lib/json-parser.c +++ b/src/lib/json-parser.c @@ -386,6 +386,8 @@ static int json_parse_int(struct json_pa return 1; } +#pragma GCC push_options +#pragma GCC optimize ("-O0") static int json_parse_number(struct json_parser *parser, const char **value_r) { int ret; @@ -415,6 +417,7 @@ static int json_parse_number(struct json *value_r = str_c(parser->value); return 1; } +#pragma GCC pop_options static int json_parse_atom(struct json_parser *parser, const char *atom) { I was trying to see what happened and tracked each str_append_c that happened inside json_parse_number. A breakpoint on str_append_c and backtraces showed that we clearly do the "right" parsing. From the backtraces I've seen it does (just from positions): For example (gdb) bt #0 str_append_c (chr=<optimized out>, str=0x5555556930d0) at str.h:44 #1 json_parse_digits (parser=<optimized out>) at json-parser.c:366 => that is a digit It does "- digit digit dot digit digit digit" This perfectly matches the input value, but still the result stays empty: (gdb) p (char*)(str->data) $738 = 0x555555693110 "" In comparison - when parsing the latter 12.456e9 which works I see two odd things: 1. the value isn't cleared despite str_truncate(parser->value, 0); at the beginning of json_parse_number 2. the value (this time) is read exactly as I assumed Breakpoint 17, json_parse_number (value_r=0x7fffffffe1a0, parser=0x555555692ee0) at json-parser.c:394 394 str_truncate(parser->value, 0); Breakpoint 18, str_append_c (chr=<optimized out>, str=0x5555556930d0) at str.h:44 44 buffer_append_c(str, chr); (gdb) p (char*)(str->data) $741 = 0x555555693110 "sub3" ^^ not cleared, still has "sub3" even later on Breakpoint 18, str_append_c (chr=<optimized out>, str=0x5555556930d0) at str.h:44 44 buffer_append_c(str, chr); (gdb) p (char*)(str->data) $742 = 0x555555693110 "1ub3" ^^ reading left to right one char at a time "over the old value", which didn't update the data. So it seems in json_parse_number the ordering of a) str_truncate(parser->value, 0); b) the parsing itself is messed up by optimization. We already know from the above that reducing just this function to -O1 or less avoids the issue. But what is it really - I don't know? Maybe you have a better idea what is going on, the issue should be reproducible when building dovecot with the toolchain present on the latest Ubuntu being 21.04 (Hirsute) right now. -- Christian Ehrhardt Staff Engineer, Ubuntu Server Canonical Ltd ------------------------------ Message: 3 Date: Tue, 30 Mar 2021 18:55:44 +0300 (EEST) From: Aki Tuomi <aki.tu...@open-xchange.com> To: Christian Ehrhardt <christian.ehrha...@canonical.com>, dovecot@dovecot.org Subject: Re: json_parse_number broken by compiler optimization Message-ID: <1219230463.3728.1617119744...@appsuite-dev-gw1.open-xchange.com> Content-Type: text/plain; charset=UTF-8 > On 30/03/2021 18:53 Christian Ehrhardt <christian.ehrha...@canonical.com> wrote: > > > Hi, > the recent Ubuntu (re)builds uncovered an issue with dovecot 1:2.3.13+dfsg1-1 > build log: https://launchpadlibrarian.net/529849650/buildlog_ubuntu-hirsute-amd64.dovecot_1%3A2.3.13+dfsg1-1build1_BUILDING.txt.gz > A coworker tried 2.3.14 but got the same result. > > What fails is the json_parser build time test like: > test-json-parser.c:161: Assert(#25) failed: > null_strcmp(json_output[pos].value, value) == 0 > > I was looking into that a bit more and what I found is that it is > dependent on the new toolchain > of gcc 10.2.0-1. > > Not all calls to json_parse_* fail, e.g. the first one looks all good and passes > I was iterating the tests using a report function defined like > > (gdb) define repcon > >c > >p pos > >p json_output[pos].type > >p type > >p json_output[pos].value > >p value > >call null_strcmp(json_output[pos].value, value) > >end > > The first one to be bad was: > Breakpoint 2, test_json_parser_success (full_size=<optimized out>) at > test-json-parser.c:161 > 161 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos); > $84 = 25 > $85 = JSON_TYPE_NUMBER > $86 = JSON_TYPE_NUMBER > $87 = 0x555555633b25 "-12.456" > $88 = 0x555555693110 "" > $89 = 45 > > Earlier and later parsing was happy, for example > > Breakpoint 2, test_json_parser_success (full_size=<optimized out>) at > test-json-parser.c:161 > 161 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos); > $90 = 27 > $91 = JSON_TYPE_NUMBER > $92 = JSON_TYPE_NUMBER > $93 = 0x555555633b32 "12.456e9" > $94 = 0x555555693110 "12.456e9" > $95 = 0 > (gdb) > > > We have two things we compare here. > 1. json_output[] which is a static define and for this value is > 67 ????{ JSON_TYPE_NUMBER, "-12.456" }, > 2. the return value that json_parse_next returns. > 25 ????" \"sub2\":-12.456,\n" > 148 ????????????????ret = json_parse_next(parser, &type, &value); > > I tried a non negative number and got success which is suspicious > > Breakpoint 5, test_json_parser_success (full_size=<optimized out>) at > test-json-parser.c:164 > 164 test_assert_idx(null_strcmp(json_output[pos].value, value) == 0, pos); > $122 = 25 > $123 = JSON_TYPE_NUMBER > $124 = JSON_TYPE_NUMBER > $125 = 0x555555633b2c "12.456" > $126 = 0x555555693110 "12.456" > $127 = 0 > (gdb) > > > Also the return value otherwise LGTM, it is recognized as a number: > 540 } else if ((ret = json_parse_number(parser, value_r)) >= 0) { > (gdb) n > 541 *type_r = JSON_TYPE_NUMBER; > > So parsing worked one might think, but the value was not set. > > (gdb) p *value_r > $833 = 0x555555693110 "" > > It turns out json_parse_number is too optimized. > Too optimized for debugging further into json_parse_number. > But also optimization is causing the issue. > With -O0 or -O1 it works fine, even putting the lower optimization > level just around json_parse_number it is enough to avoid the issue. > > > Not a fix but "avoidance" for now: > > --- a/src/lib/json-parser.c > +++ b/src/lib/json-parser.c > @@ -386,6 +386,8 @@ static int json_parse_int(struct json_pa > return 1; > } > > +#pragma GCC push_options > +#pragma GCC optimize ("-O0") > static int json_parse_number(struct json_parser *parser, const char **value_r) > { > int ret; > @@ -415,6 +417,7 @@ static int json_parse_number(struct json > *value_r = str_c(parser->value); > return 1; > } > +#pragma GCC pop_options > > static int json_parse_atom(struct json_parser *parser, const char *atom) > { > > > I was trying to see what happened and tracked each str_append_c > that happened inside json_parse_number. > A breakpoint on str_append_c and backtraces showed that we clearly do the > "right" parsing. From the backtraces I've seen it does (just from positions): > > For example > (gdb) bt > #0 str_append_c (chr=<optimized out>, str=0x5555556930d0) at str.h:44 > #1 json_parse_digits (parser=<optimized out>) at json-parser.c:366 > => that is a digit > > It does "- digit digit dot digit digit digit" > > This perfectly matches the input value, but still the result stays empty: > (gdb) p (char*)(str->data) > $738 = 0x555555693110 "" > > In comparison - when parsing the latter 12.456e9 which works I see two > odd things: > 1. the value isn't cleared despite > str_truncate(parser->value, 0); > at the beginning of json_parse_number > 2. the value (this time) is read exactly as I assumed > > Breakpoint 17, json_parse_number (value_r=0x7fffffffe1a0, > parser=0x555555692ee0) at json-parser.c:394 > 394 str_truncate(parser->value, 0); > Breakpoint 18, str_append_c (chr=<optimized out>, str=0x5555556930d0) > at str.h:44 > 44 buffer_append_c(str, chr); > (gdb) p (char*)(str->data) > $741 = 0x555555693110 "sub3" > > ^^ not cleared, still has "sub3" even later on > > Breakpoint 18, str_append_c (chr=<optimized out>, str=0x5555556930d0) > at str.h:44 > 44 buffer_append_c(str, chr); > (gdb) p (char*)(str->data) > $742 = 0x555555693110 "1ub3" > > ^^ reading left to right one char at a time "over the old value", > which didn't update the data. > > So it seems in json_parse_number the ordering of > a) str_truncate(parser->value, 0); > b) the parsing itself > is messed up by optimization. > > We already know from the above that reducing just this function to -O1 or less > avoids the issue. But what is it really - I don't know? > > Maybe you have a better idea what is going on, the issue should be > reproducible when building > dovecot with the toolchain present on the latest Ubuntu being 21.04 > (Hirsute) right now. > > -- > Christian Ehrhardt > Staff Engineer, Ubuntu Server > Canonical Ltd Thanks! We'll look into this. Aki ------------------------------ Subject: Digest Footer _______________________________________________ dovecot mailing list dovecot@dovecot.org https://dovecot.org/mailman/listinfo/dovecot ------------------------------ End of dovecot Digest, Vol 215, Issue 77 ****************************************