Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Robert Elz
Date:Sat, 4 Aug 2018 04:24:01 +0200
From:Kamil Rytarowski 
Message-ID:  <568544f4-36d5-853e-cdf8-248f84fad...@gmx.com>

  | I haven't changed any optimization or similar flags for the builds.

Then why did the ssh example start giving "perhaps used uninit" warnings
when it had not before?Something changed.

  | Freshly crashed pmax kernel due to integer overflow is a kernel (or
  | virtualization) bug, but it's also a definition of UB, that it can crash
  | the computer.

I don't know the circumstances of that one, but sure, overflow can cause
all kinds of problems, and if it actually occurs, almost anything is possible,
depending upon just what the code is doing.   I'm not sure what your point
is here, no-one is suggesting that real bugs not get fixed.

  | The dhcpcd one was fixed first in upstream dhcpcd before landing it to src/.

"fixed" is the wrong word, as there neither is, nor ever was, any bug there
to "fix".   You apparently silenced a bogus sanitiser induced warning.   That
is not a "fix".

kre



Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Kamil Rytarowski
On 04.08.2018 03:23, Robert Elz wrote:
> Date:Sat, 4 Aug 2018 02:15:15 +0200
> From:Kamil Rytarowski 
> Message-ID:  
> 
> 
>   | In general there shall not be a relation between -O level and
>   | sanitizers. Sanitizers do not need -O0  or -g for operation.
> 
> That's fine.   But are you doing compiles that way?  (necessary or not)
> 

I haven't changed any optimization or similar flags for the builds.

>   | GCC also enables more warnings for UBSan that have to be addressed in
>   | order to compile the source, as the code would be UB anyway (like
>   | changing the signedness bit with a shift operation).
> 
> Sure, some of those, even though they're not really problems, are easy
> to fix in a totally harmless way.   That's fine.   The UB is a technical C
> issue, not really anything that ever fails in those cases though.
> 

Freshly crashed pmax kernel due to integer overflow is a kernel (or
virtualization) bug, but it's also a definition of UB, that it can crash
the computer.

>   | And regarding utility of the Undefined Behavior Sanitizer and coverage
>   | of new tests.. we have just caught a bug on pmax that an integer
>   | overflow crashed the kernel:
> 
> Sure, no-one is saying that the extra work is not worth while.   Just that
> you are sometimes fixing non-problems (and causing code churn to do
> it - particularly when the code being changed comes from upstream ... in
> the dhcpcd case that is not such an issue, as if needed, Roy will fix that
> as well, but why would anyone expect the openbsd people to alter ssh
> to fix a non problem ?)
> 

Some Undefined Behavior fixes were pulled by FreeBSD (at least one of
them merged McKusick!). Regarding the OpenSSH case, I'm in touch with
some of their developers and concluded to push it to one of their
mailinglist. One of the UB patches was already merged by the OpenBSD
project (in tmux).

I will go the same way with other patches and submit them upstream, even
if some are cautious. The dhcpcd one was fixed first in upstream dhcpcd
before landing it to src/.



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Robert Elz
Date:Sat, 4 Aug 2018 02:15:15 +0200
From:Kamil Rytarowski 
Message-ID:  


  | In general there shall not be a relation between -O level and
  | sanitizers. Sanitizers do not need -O0  or -g for operation.

That's fine.   But are you doing compiles that way?  (necessary or not)

  | GCC is known for reporting uninitialized variables and I wouldn't blame
  | sanitizers for it.

I wasn't.   The one (which isn't) in the ssh code has been there for
ages, and has compiled successfully, for ages   There has to be a
reason why it only needed action yesterday.

  | We just initialize them to tune it down and this is the current practice.

Yes, I know - some of them are real potential problems, even if the
circumstances that lead to the problem are so unlikely that we never
see them in real life - others take knowledge about the environment
the compiler does not have, or just require flow analysis more
complex than it is reasonable to expect of the compiler, in order to know
that there is not actually a problem.The real bugs we fix, obviously.
The ones the compiler cannot detect are not bugs we deal with as you
said.   But when the compiler is able to detect there is no problem, but
we are simply preventing it doing so, we fix the way we use the compiler.

  | GCC also enables more warnings for UBSan that have to be addressed in
  | order to compile the source, as the code would be UB anyway (like
  | changing the signedness bit with a shift operation).

Sure, some of those, even though they're not really problems, are easy
to fix in a totally harmless way.   That's fine.   The UB is a technical C
issue, not really anything that ever fails in those cases though.

There has (is currently) an issue with posix, where a current bug resulution
requires abs(INT_MIN) to be INT_MIN (ie: abs(n) can end up < 0).   In C
that's undefined.   POSIX requires 2's complement however (unlike C) and
can rely upon what happens with 0 - INT_MAX even if C says that is
undefined.Some people like it, others do not...  (what the end result
will be I have no idea.)

  | I don't agree with strong opinions against cautious warnings/errors from
  | a compiler.

Sure, but when the warning goes off, we need to analyse the issue and
see what the actual problem is, not just blindly do whatever makes the
compiler stop issuing the warning.

  |They are there for purpose and dhcpcd could be really broken
  | with the same code, but with a different context.

No, it could not.   The only possible issue was if the packet was
invalid (too short a len) but that was not what the warning was
about (and could not have been, as there was nothing undefined
if that happened, just an unwanted result).

  | And regarding utility of the Undefined Behavior Sanitizer and coverage
  | of new tests.. we have just caught a bug on pmax that an integer
  | overflow crashed the kernel:

Sure, no-one is saying that the extra work is not worth while.   Just that
you are sometimes fixing non-problems (and causing code churn to do
it - particularly when the code being changed comes from upstream ... in
the dhcpcd case that is not such an issue, as if needed, Roy will fix that
as well, but why would anyone expect the openbsd people to alter ssh
to fix a non problem ?)

Once again, please do not change code to fix gcc warnings unless you
get the same warning with the code compiled with -O2 (or more).

kre



Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Kamil Rytarowski
On 04.08.2018 01:31, Robert Elz wrote:
> Kamil: assuming you agree that this is a reasonable analysis, I'd suggest
> no more code changes based upon gcc warnings issued this way.

In general there shall not be a relation between -O level and
sanitizers. Sanitizers do not need -O0  or -g for operation. UBSan does
not need disabled optimization for reporting issues in exact location in
the code. It also does not need debug information (DWARF or similar)...
however a runtime might make use of the additional data to print more
verbose messages or stacktraces.

GCC is known for reporting uninitialized variables and I wouldn't blame
sanitizers for it. We just initialize them to tune it down and this is
the current practice.

GCC also enables more warnings for UBSan that have to be addressed in
order to compile the source, as the code would be UB anyway (like
changing the signedness bit with a shift operation).

I don't agree with strong opinions against cautious warnings/errors from
a compiler. They are there for purpose and dhcpcd could be really broken
with the same code, but with a different context.

And regarding utility of the Undefined Behavior Sanitizer and coverage
of new tests.. we have just caught a bug on pmax that an integer
overflow crashed the kernel:

UB caused to crash pmax.. divrem_overflow_signed_div: pexpect reported
EOF - VMM exited unexpectedly



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Valery Ushakov
On Fri, Aug 03, 2018 at 18:08:24 +0300, Valery Ushakov wrote:

> On Fri, Aug 03, 2018 at 15:54:24 +0200, Martin Husemann wrote:
> 
> > On Fri, Aug 03, 2018 at 08:28:55PM +0700, Robert Elz wrote:
> > > Where is the signed arithmetic that was supposedly a probem?
> > 
> > Ah, stupid C integer promotion rules.  uint16_t is promoted to int
> > here, not unsigned int or size_t.
> 
> Hmm, i don't think that's true.

Nah, you are right.  "THEN" is the important word here.

But as it transpired in another branch of this thread the problem was
something entirely different...

>   6.3.1.8  Usual arithmetic conversions
> 
>   ...
> Otherwise, the integer promotions are performed on both
> operands.   Then the following rules are applied to the
> promoted operands:

-uwe


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread maya
On Fri, Aug 03, 2018 at 07:46:01PM +0100, Roy Marples wrote:
> We could split the term, but merely storing the result of htons in it's own
> variable creates a larger binary for no good reason as i see it.
> 

I suspect that the compiler will generate the same code anyway when
using a local variable for intermediate results, feel free to write
the cost as you find most legible :-)


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Robert Elz
Date:Fri, 3 Aug 2018 19:46:01 +0100
From:Roy Marples 
Message-ID:  

  | Considering both methods work and the result of htons is a uint16_t (but 
  | is this always guaranteed?)

ntohs() (not that it matters) and "always" is a big word, but that is how
it is defined by POSIX, so it should be something that we can rely upon.

  | is this just an over-zealous compiler warning?

Not so much over zealous, as just plain lazy... (given Kamil's most
recent message):

  | conversion to 'long unsigned int' from 'int' may change the sign of the
  | result [-Werror=sign-conversion]
  |   *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
  |  ^

That's not the overflow on subtract that you said before, and for this one
I can see how the cast can help ... but that's just sheer laziness on the part
of the compiler.

The "int" which is there was created by the compiler, it knows (or should know)
that the underlying value is in the rangs [0..65535] and cannot possibly have
its sign changed when it is converted to long unsigned int.

It would be more understandable if the int appeared somewhere earlier, but
here this is all in this one expression, one type promotion on top of another.

Get the idiot compiler fixed, and then remove the cast.   In the meantime,
at least mark it with a comment indicating that the cast should not be needed,
and is there purely to appease gcc.

kre



Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Kamil Rytarowski
On 03.08.2018 20:49, Roy Marples wrote:
> On 03/08/2018 15:22, Robert Elz wrote:
>> Whether there need to be any attention to the possibility
>> of a malformed packet I will leave for Roy to decide (I am
>> assuming probably not) but that added cast just looks to be
>> a bandaid for a broken compiler (sanitiser).
> 
> The contents are verified further up the stack.
> I'm inclined to agree it's a dodgy compiler warning, but I'm not really
> an expert here.
> 
> Roy

I've repeated the compiler error as I forgot the exact error message (it
was fixed in a local branch a while ago):

/public/src.git/external/bsd/dhcpcd/dist/src/dhcp.c: In function
'get_udp_data':
/public/src.git/external/bsd/dhcpcd/dist/src/dhcp.c:3270:29: error:
conversion to 'long unsigned int' from 'int' may change the sign of the
result [-Werror=sign-conversion]
  *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
 ^


http://netbsd.org/~kamil/patch-00068-dhcpcd-get_udp_data.txt



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Roy Marples

On 03/08/2018 15:22, Robert Elz wrote:

Whether there need to be any attention to the possibility
of a malformed packet I will leave for Roy to decide (I am
assuming probably not) but that added cast just looks to be
a bandaid for a broken compiler (sanitiser).


The contents are verified further up the stack.
I'm inclined to agree it's a dodgy compiler warning, but I'm not really 
an expert here.


Roy


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Roy Marples

On 03/08/2018 14:02, Martin Husemann wrote:

On Fri, Aug 03, 2018 at 02:47:53PM +0200, Kamil Rytarowski wrote:

Further if there ever was a potential problem from this line ...

*len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
then
*len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);



It was a build time error generated by GCC. The compiler detected that
both sizeof() could be large enough to overflow the result returned from
  ntohs(3). And overflowing a signed integer is Undefined Behavior.


But we do not do this here.


This change points to the compiler that the code is safe.


What exactly makes the code safe now? If ntohs(p->ip.ip_len) <
(sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious
trouble.

Does splitting the term help?

uint16_t hdr_size = sizeof(p->ip) - sizeof(p->udp);
uint16_t pkt_size = ntohs(p->ip.ip_len);
KASSERT(pkt_size > hdr_size);
*len = pkt_size > hdr_size ? pkt_size-hdr_size : 0;

or something like that?


We could split the term, but merely storing the result of htons in it's 
own variable creates a larger binary for no good reason as i see it.


Considering both methods work and the result of htons is a uint16_t (but 
is this always guaranteed?) is this just an over-zealous compiler warning?


Roy


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Valery Ushakov
On Fri, Aug 03, 2018 at 15:54:24 +0200, Martin Husemann wrote:

> On Fri, Aug 03, 2018 at 08:28:55PM +0700, Robert Elz wrote:
> > Where is the signed arithmetic that was supposedly a probem?
> 
> Ah, stupid C integer promotion rules.  uint16_t is promoted to int
> here, not unsigned int or size_t.

Hmm, i don't think that's true.

  6.3.1.8  Usual arithmetic conversions

  ...
Otherwise, the integer promotions are performed on both
operands.   Then the following rules are applied to the
promoted operands:

 If both operands  have  the  same  type,  then  no
 further conversion is needed.

 Otherwise,  if  both  operands have signed integer
 types or both have  unsigned  integer  types,  the
 operand with the type of lesser integer conversion
 rank is converted to the type of the operand  with
 greater rank.

ntohs returns unsigned uint16_t, sizeof returns unsigned size_t, so
uint16_t, that has "lesser integer convertion rank" (i.e. it's
smaller) should be converted to size_t automagically.

-uwe


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Robert Elz
Date:Fri, 3 Aug 2018 15:54:24 +0200
From:Martin Husemann 
Message-ID:  <20180803135424.gc23...@mail.duskware.de>

  | Ah, stupid C integer promotion rules. uint16_t is promoted to int
  | here, not unsigned int or size_t.

Even with that, there should be no problem, in

signed - unsigned

the '-' should be an unsigned - and the result should
be unsigned.   There is no signed arithmetic being done
here to cause an undefined result.

That's the same rule that makes

strlen(s) + 1

be a size_t rather than a ssize_t or whatever.   Otherwise we'd
need to be adding casts to every operation like that, just in case
strlen(s) == MAX_INT and the " +1 " would cause overflow, and
undefined operation.No thanks.

Whether there need to be any attention to the possibility
of a malformed packet I will leave for Roy to decide (I am
assuming probably not) but that added cast just looks to be
a bandaid for a broken compiler (sanitiser).

kre



Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Martin Husemann
On Fri, Aug 03, 2018 at 08:28:55PM +0700, Robert Elz wrote:
> Where is the signed arithmetic that was supposedly a probem?

Ah, stupid C integer promotion rules. uint16_t is promoted to int
here, not unsigned int or size_t. The cast makes all operands the same
type and no promotion happens.

Martin


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Robert Elz
Date:Fri, 3 Aug 2018 15:02:28 +0200
From:Martin Husemann 
Message-ID:  <20180803130227.ga23...@mail.duskware.de>

  | What exactly makes the code safe now? If ntohs(p->ip.ip_len) <
  | (sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious
  | trouble.

Actually, not more serious, the same serious as before.   If adding that
cast change anything at all, the compiler isn't working as it should.

If the values haven't been verieied, they should be.   If they have been
verified, there is no problem and nothing needs fixing (except possibly the
santiizer).

In a later message ...

  | Overflow (underflow) of an unsigned value is defined and GCC stops
  | deducing whether there might be a problem.

But it always was unsigned, ntohs() returns an unsigned result.   Further
even if it was signed, doesn't combining a signed value and an unsigned
one with an arithmetic op result in an unsigned operation?

Where is the signed arithmetic that was supposedly a probem?

kre




Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Kamil Rytarowski
On 03.08.2018 15:20, Martin Husemann wrote:
> On Fri, Aug 03, 2018 at 03:18:18PM +0200, Kamil Rytarowski wrote:
>> The change was indicating to the compiler that code is safe, without
>> changing the algorithm.
> 
> I don't get why.
> 
> Martin
> 

Overflow (underflow) of an unsigned value is defined and GCC stops
deducing whether there might be a problem.



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Martin Husemann
On Fri, Aug 03, 2018 at 03:18:18PM +0200, Kamil Rytarowski wrote:
> The change was indicating to the compiler that code is safe, without
> changing the algorithm.

I don't get why.

Martin


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Kamil Rytarowski
On 03.08.2018 15:02, Martin Husemann wrote:
> On Fri, Aug 03, 2018 at 02:47:53PM +0200, Kamil Rytarowski wrote:
>>> Further if there ever was a potential problem from this line ...
>>>
>>> *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
>>> then
>>> *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
> 
>> It was a build time error generated by GCC. The compiler detected that
>> both sizeof() could be large enough to overflow the result returned from
>>  ntohs(3). And overflowing a signed integer is Undefined Behavior.
> 
> But we do not do this here.
> 
>> This change points to the compiler that the code is safe.
> 
> What exactly makes the code safe now? If ntohs(p->ip.ip_len) <
> (sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious
> trouble.
> 

The change was indicating to the compiler that code is safe, without
changing the algorithm.

> Does splitting the term help?
> 
>   uint16_t hdr_size = sizeof(p->ip) - sizeof(p->udp);

+

>   uint16_t pkt_size = ntohs(p->ip.ip_len);
>   KASSERT(pkt_size > hdr_size);
>   *len = pkt_size > hdr_size ? pkt_size-hdr_size : 0;
> 
> or something like that?
> 

This looks like a safer approach, but I will defer it to Roy to decide
what to do.

Previously we have agreed with the (size_t) case.

> Martin
> 




signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Martin Husemann
On Fri, Aug 03, 2018 at 02:47:53PM +0200, Kamil Rytarowski wrote:
> > Further if there ever was a potential problem from this line ...
> > 
> > *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
> > then
> > *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);

> It was a build time error generated by GCC. The compiler detected that
> both sizeof() could be large enough to overflow the result returned from
>  ntohs(3). And overflowing a signed integer is Undefined Behavior.

But we do not do this here.

> This change points to the compiler that the code is safe.

What exactly makes the code safe now? If ntohs(p->ip.ip_len) <
(sizeof(p->ip) + sizeof(p->udp)) then we are now in even more serious
trouble.

Does splitting the term help?

uint16_t hdr_size = sizeof(p->ip) - sizeof(p->udp);
uint16_t pkt_size = ntohs(p->ip.ip_len);
KASSERT(pkt_size > hdr_size);
*len = pkt_size > hdr_size ? pkt_size-hdr_size : 0;

or something like that?

Martin


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Kamil Rytarowski
On 03.08.2018 12:26, Robert Elz wrote:
> Date:Fri, 3 Aug 2018 02:17:33 +
> From:"Kamil Rytarowski" 
> Message-ID:  <20180803021733.b2002f...@cvs.netbsd.org>
> 
>   | GCC with -fsanitize=undefiend detects a potential overflow in the code.
>   | Cast the return value of ntohs(3) to size_t.
> 
> I don't understand the point of this change, and I believe it to
> be wrong and misguided.
> 
> Further if there ever was a potential problem from this line ...
> 
>   *len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
> then
>   *len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
> 
> will not fix it.   The possible problem being if ip_len < 28 (the sum of the
> two sizeof's).While I didn't check this code, that possibility is usually
> verified elsewhere, and even if not, adding the cast changes nothing, the
> result is still bogus (that is, at best, the change could be masking a real
> bug, though that's unlikely.)
> 
> This looks to be (somehow) determining that the result of ntohs(ip_len)
> might somehow be negative (or something, I'm not really sure what is
> being believed is happening) but the ntohs() result is (in all cases here)
> a uint16_t (either because the result is literally the arg, which is that 
> type,
> or because it is the result of bswap16() which is declared that way.)
> 
> The sizeof() ops make the expression at least a size_t - so unless size_t
> and uint16_t are the same (which might happen on a pdp-11 or similar,
> though even there it is unlikely I suspect) the narrower one needs to be
> promoted - whether the ntohs() result is implicitly promoted to size_t, or
> explicitly cast cannot make any difference, can it?
> 
> So what exactly is being fixed here?  Which behaviour is supposedly undefined?
> 

It was a build time error generated by GCC. The compiler detected that
both sizeof() could be large enough to overflow the result returned from
 ntohs(3). And overflowing a signed integer is Undefined Behavior.

This change points to the compiler that the code is safe.

> kre
> 




signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/external/bsd/dhcpcd/dist/src

2018-08-03 Thread Robert Elz
Date:Fri, 3 Aug 2018 02:17:33 +
From:"Kamil Rytarowski" 
Message-ID:  <20180803021733.b2002f...@cvs.netbsd.org>

  | GCC with -fsanitize=undefiend detects a potential overflow in the code.
  | Cast the return value of ntohs(3) to size_t.

I don't understand the point of this change, and I believe it to
be wrong and misguided.

Further if there ever was a potential problem from this line ...

*len = ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);
then
*len = (size_t)ntohs(p->ip.ip_len) - sizeof(p->ip) - sizeof(p->udp);

will not fix it.   The possible problem being if ip_len < 28 (the sum of the
two sizeof's).While I didn't check this code, that possibility is usually
verified elsewhere, and even if not, adding the cast changes nothing, the
result is still bogus (that is, at best, the change could be masking a real
bug, though that's unlikely.)

This looks to be (somehow) determining that the result of ntohs(ip_len)
might somehow be negative (or something, I'm not really sure what is
being believed is happening) but the ntohs() result is (in all cases here)
a uint16_t (either because the result is literally the arg, which is that type,
or because it is the result of bswap16() which is declared that way.)

The sizeof() ops make the expression at least a size_t - so unless size_t
and uint16_t are the same (which might happen on a pdp-11 or similar,
though even there it is unlikely I suspect) the narrower one needs to be
promoted - whether the ntohs() result is implicitly promoted to size_t, or
explicitly cast cannot make any difference, can it?

So what exactly is being fixed here?  Which behaviour is supposedly undefined?

kre