Re: CVS commit: src/external/bsd/dhcpcd/dist/src
Date:Sat, 4 Aug 2018 04:24:01 +0200 From:Kamil Rytarowski Message-ID: <568544f4-36d5-853e-cdf8-248f84fad...@gmx.com> | I haven't changed any optimization or similar flags for the builds. Then why did the ssh example start giving "perhaps used uninit" warnings when it had not before?Something changed. | Freshly crashed pmax kernel due to integer overflow is a kernel (or | virtualization) bug, but it's also a definition of UB, that it can crash | the computer. I don't know the circumstances of that one, but sure, overflow can cause all kinds of problems, and if it actually occurs, almost anything is possible, depending upon just what the code is doing. I'm not sure what your point is here, no-one is suggesting that real bugs not get fixed. | The dhcpcd one was fixed first in upstream dhcpcd before landing it to src/. "fixed" is the wrong word, as there neither is, nor ever was, any bug there to "fix". You apparently silenced a bogus sanitiser induced warning. That is not a "fix". kre
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 04.08.2018 03:23, Robert Elz wrote: > Date:Sat, 4 Aug 2018 02:15:15 +0200 > From:Kamil Rytarowski > Message-ID: > > > | In general there shall not be a relation between -O level and > | sanitizers. Sanitizers do not need -O0 or -g for operation. > > That's fine. But are you doing compiles that way? (necessary or not) > I haven't changed any optimization or similar flags for the builds. > | GCC also enables more warnings for UBSan that have to be addressed in > | order to compile the source, as the code would be UB anyway (like > | changing the signedness bit with a shift operation). > > Sure, some of those, even though they're not really problems, are easy > to fix in a totally harmless way. That's fine. The UB is a technical C > issue, not really anything that ever fails in those cases though. > Freshly crashed pmax kernel due to integer overflow is a kernel (or virtualization) bug, but it's also a definition of UB, that it can crash the computer. > | And regarding utility of the Undefined Behavior Sanitizer and coverage > | of new tests.. we have just caught a bug on pmax that an integer > | overflow crashed the kernel: > > Sure, no-one is saying that the extra work is not worth while. Just that > you are sometimes fixing non-problems (and causing code churn to do > it - particularly when the code being changed comes from upstream ... in > the dhcpcd case that is not such an issue, as if needed, Roy will fix that > as well, but why would anyone expect the openbsd people to alter ssh > to fix a non problem ?) > Some Undefined Behavior fixes were pulled by FreeBSD (at least one of them merged McKusick!). Regarding the OpenSSH case, I'm in touch with some of their developers and concluded to push it to one of their mailinglist. One of the UB patches was already merged by the OpenBSD project (in tmux). I will go the same way with other patches and submit them upstream, even if some are cautious. The dhcpcd one was fixed first in upstream dhcpcd before landing it to src/. signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
Date:Sat, 4 Aug 2018 02:15:15 +0200 From:Kamil Rytarowski Message-ID: | In general there shall not be a relation between -O level and | sanitizers. Sanitizers do not need -O0 or -g for operation. That's fine. But are you doing compiles that way? (necessary or not) | GCC is known for reporting uninitialized variables and I wouldn't blame | sanitizers for it. I wasn't. The one (which isn't) in the ssh code has been there for ages, and has compiled successfully, for ages There has to be a reason why it only needed action yesterday. | We just initialize them to tune it down and this is the current practice. Yes, I know - some of them are real potential problems, even if the circumstances that lead to the problem are so unlikely that we never see them in real life - others take knowledge about the environment the compiler does not have, or just require flow analysis more complex than it is reasonable to expect of the compiler, in order to know that there is not actually a problem.The real bugs we fix, obviously. The ones the compiler cannot detect are not bugs we deal with as you said. But when the compiler is able to detect there is no problem, but we are simply preventing it doing so, we fix the way we use the compiler. | GCC also enables more warnings for UBSan that have to be addressed in | order to compile the source, as the code would be UB anyway (like | changing the signedness bit with a shift operation). Sure, some of those, even though they're not really problems, are easy to fix in a totally harmless way. That's fine. The UB is a technical C issue, not really anything that ever fails in those cases though. There has (is currently) an issue with posix, where a current bug resulution requires abs(INT_MIN) to be INT_MIN (ie: abs(n) can end up < 0). In C that's undefined. POSIX requires 2's complement however (unlike C) and can rely upon what happens with 0 - INT_MAX even if C says that is undefined.Some people like it, others do not... (what the end result will be I have no idea.) | I don't agree with strong opinions against cautious warnings/errors from | a compiler. Sure, but when the warning goes off, we need to analyse the issue and see what the actual problem is, not just blindly do whatever makes the compiler stop issuing the warning. |They are there for purpose and dhcpcd could be really broken | with the same code, but with a different context. No, it could not. The only possible issue was if the packet was invalid (too short a len) but that was not what the warning was about (and could not have been, as there was nothing undefined if that happened, just an unwanted result). | And regarding utility of the Undefined Behavior Sanitizer and coverage | of new tests.. we have just caught a bug on pmax that an integer | overflow crashed the kernel: Sure, no-one is saying that the extra work is not worth while. Just that you are sometimes fixing non-problems (and causing code churn to do it - particularly when the code being changed comes from upstream ... in the dhcpcd case that is not such an issue, as if needed, Roy will fix that as well, but why would anyone expect the openbsd people to alter ssh to fix a non problem ?) Once again, please do not change code to fix gcc warnings unless you get the same warning with the code compiled with -O2 (or more). kre
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 04.08.2018 01:31, Robert Elz wrote: > Kamil: assuming you agree that this is a reasonable analysis, I'd suggest > no more code changes based upon gcc warnings issued this way. In general there shall not be a relation between -O level and sanitizers. Sanitizers do not need -O0 or -g for operation. UBSan does not need disabled optimization for reporting issues in exact location in the code. It also does not need debug information (DWARF or similar)... however a runtime might make use of the additional data to print more verbose messages or stacktraces. GCC is known for reporting uninitialized variables and I wouldn't blame sanitizers for it. We just initialize them to tune it down and this is the current practice. GCC also enables more warnings for UBSan that have to be addressed in order to compile the source, as the code would be UB anyway (like changing the signedness bit with a shift operation). I don't agree with strong opinions against cautious warnings/errors from a compiler. They are there for purpose and dhcpcd could be really broken with the same code, but with a different context. And regarding utility of the Undefined Behavior Sanitizer and coverage of new tests.. we have just caught a bug on pmax that an integer overflow crashed the kernel: UB caused to crash pmax.. divrem_overflow_signed_div: pexpect reported EOF - VMM exited unexpectedly signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On Fri, Aug 03, 2018 at 18:08:24 +0300, Valery Ushakov wrote: > On Fri, Aug 03, 2018 at 15:54:24 +0200, Martin Husemann wrote: > > > On Fri, Aug 03, 2018 at 08:28:55PM +0700, Robert Elz wrote: > > > Where is the signed arithmetic that was supposedly a probem? > > > > Ah, stupid C integer promotion rules. uint16_t is promoted to int > > here, not unsigned int or size_t. > > Hmm, i don't think that's true. Nah, you are right. "THEN" is the important word here. But as it transpired in another branch of this thread the problem was something entirely different... > 6.3.1.8 Usual arithmetic conversions > > ... > Otherwise, the integer promotions are performed on both > operands. Then the following rules are applied to the > promoted operands: -uwe
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On Fri, Aug 03, 2018 at 07:46:01PM +0100, Roy Marples wrote: > We could split the term, but merely storing the result of htons in it's own > variable creates a larger binary for no good reason as i see it. > I suspect that the compiler will generate the same code anyway when using a local variable for intermediate results, feel free to write the cost as you find most legible :-)
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
Date:Fri, 3 Aug 2018 19:46:01 +0100 From:Roy Marples Message-ID: | Considering both methods work and the result of htons is a uint16_t (but | is this always guaranteed?) ntohs() (not that it matters) and "always" is a big word, but that is how it is defined by POSIX, so it should be something that we can rely upon. | is this just an over-zealous compiler warning? Not so much over zealous, as just plain lazy... (given Kamil's most recent message): | conversion to 'long unsigned int' from 'int' may change the sign of the | result [-Werror=sign-conversion] | *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); | ^ That's not the overflow on subtract that you said before, and for this one I can see how the cast can help ... but that's just sheer laziness on the part of the compiler. The "int" which is there was created by the compiler, it knows (or should know) that the underlying value is in the rangs [0..65535] and cannot possibly have its sign changed when it is converted to long unsigned int. It would be more understandable if the int appeared somewhere earlier, but here this is all in this one expression, one type promotion on top of another. Get the idiot compiler fixed, and then remove the cast. In the meantime, at least mark it with a comment indicating that the cast should not be needed, and is there purely to appease gcc. kre
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 03.08.2018 20:49, Roy Marples wrote: > On 03/08/2018 15:22, Robert Elz wrote: >> Whether there need to be any attention to the possibility >> of a malformed packet I will leave for Roy to decide (I am >> assuming probably not) but that added cast just looks to be >> a bandaid for a broken compiler (sanitiser). > > The contents are verified further up the stack. > I'm inclined to agree it's a dodgy compiler warning, but I'm not really > an expert here. > > Roy I've repeated the compiler error as I forgot the exact error message (it was fixed in a local branch a while ago): /public/src.git/external/bsd/dhcpcd/dist/src/dhcp.c: In function 'get_udp_data': /public/src.git/external/bsd/dhcpcd/dist/src/dhcp.c:3270:29: error: conversion to 'long unsigned int' from 'int' may change the sign of the result [-Werror=sign-conversion] *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); ^ http://netbsd.org/~kamil/patch-00068-dhcpcd-get_udp_data.txt signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 03/08/2018 15:22, Robert Elz wrote: Whether there need to be any attention to the possibility of a malformed packet I will leave for Roy to decide (I am assuming probably not) but that added cast just looks to be a bandaid for a broken compiler (sanitiser). The contents are verified further up the stack. I'm inclined to agree it's a dodgy compiler warning, but I'm not really an expert here. Roy
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 03/08/2018 14:02, Martin Husemann wrote: On Fri, Aug 03, 2018 at 02:47:53PM +0200, Kamil Rytarowski wrote: Further if there ever was a potential problem from this line ... *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); then *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); It was a build time error generated by GCC. The compiler detected that both sizeof() could be large enough to overflow the result returned from ntohs(3). And overflowing a signed integer is Undefined Behavior. But we do not do this here. This change points to the compiler that the code is safe. What exactly makes the code safe now? If ntohs(p->ip.ip_len) < (sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious trouble. Does splitting the term help? uint16_t hdr_size = sizeof(p->ip) - sizeof(p->udp); uint16_t pkt_size = ntohs(p->ip.ip_len); KASSERT(pkt_size > hdr_size); *len = pkt_size > hdr_size ? pkt_size-hdr_size : 0; or something like that? We could split the term, but merely storing the result of htons in it's own variable creates a larger binary for no good reason as i see it. Considering both methods work and the result of htons is a uint16_t (but is this always guaranteed?) is this just an over-zealous compiler warning? Roy
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On Fri, Aug 03, 2018 at 15:54:24 +0200, Martin Husemann wrote: > On Fri, Aug 03, 2018 at 08:28:55PM +0700, Robert Elz wrote: > > Where is the signed arithmetic that was supposedly a probem? > > Ah, stupid C integer promotion rules. uint16_t is promoted to int > here, not unsigned int or size_t. Hmm, i don't think that's true. 6.3.1.8 Usual arithmetic conversions ... Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands: If both operands have the same type, then no further conversion is needed. Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank. ntohs returns unsigned uint16_t, sizeof returns unsigned size_t, so uint16_t, that has "lesser integer convertion rank" (i.e. it's smaller) should be converted to size_t automagically. -uwe
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
Date:Fri, 3 Aug 2018 15:54:24 +0200 From:Martin Husemann Message-ID: <20180803135424.gc23...@mail.duskware.de> | Ah, stupid C integer promotion rules. uint16_t is promoted to int | here, not unsigned int or size_t. Even with that, there should be no problem, in signed - unsigned the '-' should be an unsigned - and the result should be unsigned. There is no signed arithmetic being done here to cause an undefined result. That's the same rule that makes strlen(s) + 1 be a size_t rather than a ssize_t or whatever. Otherwise we'd need to be adding casts to every operation like that, just in case strlen(s) == MAX_INT and the " +1 " would cause overflow, and undefined operation.No thanks. Whether there need to be any attention to the possibility of a malformed packet I will leave for Roy to decide (I am assuming probably not) but that added cast just looks to be a bandaid for a broken compiler (sanitiser). kre
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On Fri, Aug 03, 2018 at 08:28:55PM +0700, Robert Elz wrote: > Where is the signed arithmetic that was supposedly a probem? Ah, stupid C integer promotion rules. uint16_t is promoted to int here, not unsigned int or size_t. The cast makes all operands the same type and no promotion happens. Martin
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
Date:Fri, 3 Aug 2018 15:02:28 +0200 From:Martin Husemann Message-ID: <20180803130227.ga23...@mail.duskware.de> | What exactly makes the code safe now? If ntohs(p->ip.ip_len) < | (sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious | trouble. Actually, not more serious, the same serious as before. If adding that cast change anything at all, the compiler isn't working as it should. If the values haven't been verieied, they should be. If they have been verified, there is no problem and nothing needs fixing (except possibly the santiizer). In a later message ... | Overflow (underflow) of an unsigned value is defined and GCC stops | deducing whether there might be a problem. But it always was unsigned, ntohs() returns an unsigned result. Further even if it was signed, doesn't combining a signed value and an unsigned one with an arithmetic op result in an unsigned operation? Where is the signed arithmetic that was supposedly a probem? kre
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 03.08.2018 15:20, Martin Husemann wrote: > On Fri, Aug 03, 2018 at 03:18:18PM +0200, Kamil Rytarowski wrote: >> The change was indicating to the compiler that code is safe, without >> changing the algorithm. > > I don't get why. > > Martin > Overflow (underflow) of an unsigned value is defined and GCC stops deducing whether there might be a problem. signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On Fri, Aug 03, 2018 at 03:18:18PM +0200, Kamil Rytarowski wrote: > The change was indicating to the compiler that code is safe, without > changing the algorithm. I don't get why. Martin
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 03.08.2018 15:02, Martin Husemann wrote: > On Fri, Aug 03, 2018 at 02:47:53PM +0200, Kamil Rytarowski wrote: >>> Further if there ever was a potential problem from this line ... >>> >>> *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); >>> then >>> *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); > >> It was a build time error generated by GCC. The compiler detected that >> both sizeof() could be large enough to overflow the result returned from >> ntohs(3). And overflowing a signed integer is Undefined Behavior. > > But we do not do this here. > >> This change points to the compiler that the code is safe. > > What exactly makes the code safe now? If ntohs(p->ip.ip_len) < > (sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious > trouble. > The change was indicating to the compiler that code is safe, without changing the algorithm. > Does splitting the term help? > > uint16_t hdr_size = sizeof(p->ip) - sizeof(p->udp); + > uint16_t pkt_size = ntohs(p->ip.ip_len); > KASSERT(pkt_size > hdr_size); > *len = pkt_size > hdr_size ? pkt_size-hdr_size : 0; > > or something like that? > This looks like a safer approach, but I will defer it to Roy to decide what to do. Previously we have agreed with the (size_t) case. > Martin > signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On Fri, Aug 03, 2018 at 02:47:53PM +0200, Kamil Rytarowski wrote: > > Further if there ever was a potential problem from this line ... > > > > *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); > > then > > *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); > It was a build time error generated by GCC. The compiler detected that > both sizeof() could be large enough to overflow the result returned from > ntohs(3). And overflowing a signed integer is Undefined Behavior. But we do not do this here. > This change points to the compiler that the code is safe. What exactly makes the code safe now? If ntohs(p->ip.ip_len) < (sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious trouble. Does splitting the term help? uint16_t hdr_size = sizeof(p->ip) - sizeof(p->udp); uint16_t pkt_size = ntohs(p->ip.ip_len); KASSERT(pkt_size > hdr_size); *len = pkt_size > hdr_size ? pkt_size-hdr_size : 0; or something like that? Martin
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
On 03.08.2018 12:26, Robert Elz wrote: > Date:Fri, 3 Aug 2018 02:17:33 + > From:"Kamil Rytarowski" > Message-ID: <20180803021733.b2002f...@cvs.netbsd.org> > > | GCC with -fsanitize=undefiend detects a potential overflow in the code. > | Cast the return value of ntohs(3) to size_t. > > I don't understand the point of this change, and I believe it to > be wrong and misguided. > > Further if there ever was a potential problem from this line ... > > *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); > then > *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); > > will not fix it. The possible problem being if ip_len < 28 (the sum of the > two sizeof's).While I didn't check this code, that possibility is usually > verified elsewhere, and even if not, adding the cast changes nothing, the > result is still bogus (that is, at best, the change could be masking a real > bug, though that's unlikely.) > > This looks to be (somehow) determining that the result of ntohs(ip_len) > might somehow be negative (or something, I'm not really sure what is > being believed is happening) but the ntohs() result is (in all cases here) > a uint16_t (either because the result is literally the arg, which is that > type, > or because it is the result of bswap16() which is declared that way.) > > The sizeof() ops make the expression at least a size_t - so unless size_t > and uint16_t are the same (which might happen on a pdp-11 or similar, > though even there it is unlikely I suspect) the narrower one needs to be > promoted - whether the ntohs() result is implicitly promoted to size_t, or > explicitly cast cannot make any difference, can it? > > So what exactly is being fixed here? Which behaviour is supposedly undefined? > It was a build time error generated by GCC. The compiler detected that both sizeof() could be large enough to overflow the result returned from ntohs(3). And overflowing a signed integer is Undefined Behavior. This change points to the compiler that the code is safe. > kre > signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/external/bsd/dhcpcd/dist/src
Date:Fri, 3 Aug 2018 02:17:33 + From:"Kamil Rytarowski" Message-ID: <20180803021733.b2002f...@cvs.netbsd.org> | GCC with -fsanitize=undefiend detects a potential overflow in the code. | Cast the return value of ntohs(3) to size_t. I don't understand the point of this change, and I believe it to be wrong and misguided. Further if there ever was a potential problem from this line ... *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); then *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp); will not fix it. The possible problem being if ip_len < 28 (the sum of the two sizeof's).While I didn't check this code, that possibility is usually verified elsewhere, and even if not, adding the cast changes nothing, the result is still bogus (that is, at best, the change could be masking a real bug, though that's unlikely.) This looks to be (somehow) determining that the result of ntohs(ip_len) might somehow be negative (or something, I'm not really sure what is being believed is happening) but the ntohs() result is (in all cases here) a uint16_t (either because the result is literally the arg, which is that type, or because it is the result of bswap16() which is declared that way.) The sizeof() ops make the expression at least a size_t - so unless size_t and uint16_t are the same (which might happen on a pdp-11 or similar, though even there it is unlikely I suspect) the narrower one needs to be promoted - whether the ntohs() result is implicitly promoted to size_t, or explicitly cast cannot make any difference, can it? So what exactly is being fixed here? Which behaviour is supposedly undefined? kre