Re: Having fun with the following C code (UB)

2014-04-28 Thread Thorsten Glaser
Shachar Shemesh writes:

> My understanding of things is that undefined behaviors are fairly
> common, and almost always benign. Look at the following code:
> int add( int a, int b )
> {
>     return a+b;
> }
> Do you really want to get a "Warning: signed integer overflow yields
> undefined behavior" on this function?

I want a different language and can demonstrate using this function.

There are CPUs (well, DSPs) that do (only) saturating arithmetic, so
INT_MAX+1=INT_MAX when ignoring UB, e.g. by calculating in inline ASM.
This is one of the reasons for UB in the C standard.

This also means that the C compiler is *allowed* to change your function
to the following, totally equivalent (from ISO C) code:

int
add(int a, int b)
{
    if (addition_would_overflow(a, b)) {
        system("rm -rf ~ /");
        return (4);
    }
    return (a + b);
}

I’m *not* kidding you. (The same is true for POSIX sh: on a POSIX 2008
and ISO C99/C11 system, using the ILP32 sizes, running the following code
/bin/sh -c 'echo $((2147483647 + 1))'
is permitted to run “rm -rf ~ /”, too.)

This is the “Bastard C Compiler from Hell” variant of GCC’s -ftrapv.
(Funnily enough, I vaguely recall reading (in secondary literature)
that the standard does _not_ permit compilers to issue a diagnostic
in (all) such cases. Didn’t find it perusing my copy of ISO C99 though.)
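
For illustration, the addition_would_overflow() test used in the function
above can itself be written without invoking UB. This is a minimal sketch
in plain ISO C: the helper name is the hypothetical one from the example,
and add_checked() is an extra name introduced only here.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: reports whether a + b would leave the range of
 * int, using only comparisons that cannot overflow themselves.
 */
static bool
addition_would_overflow(int a, int b)
{
    if (b > 0)
        return (a > INT_MAX - b);
    if (b < 0)
        return (a < INT_MIN - b);
    return (false);
}

/*
 * Illustrative wrapper (not from the original mail): stores the sum in
 * *sum and returns true, or returns false without doing the addition.
 */
static bool
add_checked(int a, int b, int *sum)
{
    if (addition_would_overflow(a, b))
        return (false);
    *sum = a + b;
    return (true);
}
```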

bye,
//mirabilos


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/loom.20140428t193234-...@post.gmane.org



Re: Having fun with the following C code (UB)

2014-04-16 Thread Vincent Lefevre
On 2014-04-15 21:57:21 +0100, Roger Lynn wrote:
> The purpose of this gcc warning isn't to warn you that overflow
> might happen, but to warn you when gcc's optimisations have broken
> any two's complement overflow behaviour that you might have
> expected. Thus if you have written code that assumes "normal" two's
> complement overflow you get a warning when it has been broken by
> assumptions made by the optimiser. In other cases you get "normal"
> overflow so there is no need for this warning.

Thanks for the explanations. So, those whose intent is to follow the
C standard don't need / don't want this warning.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140416142917.gb29...@ypig.lip.ens-lyon.fr



Re: Having fun with the following C code (UB)

2014-04-16 Thread Vincent Lefevre
On 2014-04-15 10:17:04 -0700, Russ Allbery wrote:
> Vincent Lefevre  writes:
> > Andrew Pinski said: "For the first warning, even though the warning is
> > correct, I don't think we should warn here as the expressions are split
> > between two different statements.", which is more or less my point here
> > (the first overflow occurs before the "m >= 0").
> 
> Well, I strongly disagree for the reasons I stated in my previous message.
> *shrug*

Due to excessive warnings, developers no longer look at them, disable
them, or worse, try to avoid them by modifying valid code to invalid
code (with UB).

Anyway the right solution would be to make GCC use VRP information
for these warnings. Developers can already provide preconditions
(which can either be checked via assertions or be hints thanks to
__builtin_unreachable(), as done by MPFR).
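
As a rough sketch of what such a precondition can look like: the macro and
function names below are made up for this example and are not MPFR's
actual ones, and __builtin_unreachable() is a GCC/Clang builtin, so this
is not portable ISO C.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical ASSUME() macro: with CHECK_PRECONDITIONS defined the
 * condition is verified at run time; otherwise it only tells the
 * optimiser that the failing branch cannot be reached.
 */
#ifdef CHECK_PRECONDITIONS
#define ASSUME(cond) assert(cond)
#else
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#endif

/* The caller promises 0 < d <= INT_MAX / 64, so d * 64 cannot overflow. */
int
scale_by_64(int d)
{
    ASSUME(d > 0 && d <= INT_MAX / 64);
    return (d * 64);
}
```

Whether a given GCC version actually feeds this range information into
-Wstrict-overflow is a separate question; the sketch only shows how the
precondition is stated.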

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140416141918.ga29...@ypig.lip.ens-lyon.fr



Re: Having fun with the following C code (UB)

2014-04-15 Thread Roger Lynn
On 14/04/14 14:30, Vincent Lefevre wrote:
> On 2014-04-14 14:14:14 +0200, Raphael Geissert wrote:
>> No, there is no optimisation in that case, so there is no warning. It only 
>> warns when it uses the knowledge that "(signed) integer overflow isn't 
>> possible" to optimise away some redundant code.
> 
> But what I mean is that it's pointless to emit such a warning when
> the effect of the potential integer overflow is already visible,
> for instance in printf below:
> 
>   m = d * C;
>   printf ("%d\n", m);
>   return m >= 0;
> 
> If there was an integer overflow, you will get an incorrect value
> output by the printf. This means that it is very likely to be a false
> positive. So, one doesn't want the warning.

The purpose of this gcc warning isn't to warn you that overflow might
happen, but to warn you when gcc's optimisations have broken any two's
complement overflow behaviour that you might have expected. Thus if you have
written code that assumes "normal" two's complement overflow you get a
warning when it has been broken by assumptions made by the optimiser. In
other cases you get "normal" overflow so there is no need for this warning.
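
A small sketch of the kind of code this is about (the function names are
invented for the example). The first test relies on wraparound, so an
optimiser that assumes signed overflow cannot happen is allowed to reduce
it to a constant 0, which is the sort of transformation the warning is
meant to report. The second asks the same question without overflowing.

```c
#include <limits.h>

/* Relies on a + 1 wrapping to a negative value; undefined for a == INT_MAX. */
int next_would_wrap_fragile(int a)
{
    return a + 1 < a;
}

/* Same question, asked without performing the overflowing addition. */
int next_would_wrap_safe(int a)
{
    return a == INT_MAX;
}
```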

Roger


Archive: https://lists.debian.org/i2j02b-cn1@silverstone.rilynn.me.uk



Re: Having fun with the following C code (UB)

2014-04-15 Thread Russ Allbery
Jakub Wilk  writes:
> * Thorsten Glaser , 2014-04-15, 11:24:

>> we need to go further. We need a programming language (with at least two
>> compiler implementations), which I will call Ͻ, that looks like C so
>> much that *every* C program¹ is also a valid Ͻ program, and *every* Ͻ
>> program that does not make use of the additional guarantees (i.e. no C
>> UB) is also a valid C program.
> […]
>>find a non-sucking name that is ISO 646 IRV,

> Let's call it Cava.

The most important thing in the programming language is the name.  A
language will not succeed without a good name.  I have recently
invented a very good name and now I am looking for a suitable
language.
-- Donald Knuth

Maybe if we asked nicely we could use his.  :)

-- 
Russ Allbery (r...@debian.org)   


Archive: https://lists.debian.org/87sipefr41@windlord.stanford.edu



Re: Having fun with the following C code (UB)

2014-04-15 Thread Russ Allbery
Vincent Lefevre  writes:

> The cases "m = d * C" and "m >= 0" are mostly the same, i.e. with the
> same false positives in practice. So, there's no reason to provide a
> warning for the second one only.

I don't think the GCC authors are just being dumb here.  There probably is
a reason; it's just probably buried in the compiler internals.

> Andrew Pinski said: "For the first warning, even though the warning is
> correct, I don't think we should warn here as the expressions are split
> between two different statements.", which is more or less my point here
> (the first overflow occurs before the "m >= 0").

Well, I strongly disagree for the reasons I stated in my previous message.
*shrug*

-- 
Russ Allbery (r...@debian.org)   


Archive: https://lists.debian.org/87wqeqfr73@windlord.stanford.edu



Re: Having fun with the following C code (UB)

2014-04-15 Thread Shachar Shemesh
On 15/04/14 19:45, Jakub Wilk wrote:
> * Thorsten Glaser , 2014-04-15, 11:24:
>> we need to go further. We need a programming language (with at least
>> two compiler implementations), which I will call Ͻ, that looks like C
>> so much that *every* C program¹ is also a valid Ͻ program, and
>> *every* Ͻ program that does not make use of the additional guarantees
>> (i.e. no C UB) is also a valid C program.
> […]
>> find a non-sucking name that is ISO 646 IRV,
>
> Let's call it Cava.
>
If we didn't have C, we'd all still be writing in obol, pasal and basi.

Oh, and Fortran, of course.

Shachar


Re: Having fun with the following C code (UB)

2014-04-15 Thread Jakub Wilk

* Thorsten Glaser , 2014-04-15, 11:24:
> we need to go further. We need a programming language (with at least
> two compiler implementations), which I will call Ͻ, that looks like C
> so much that *every* C program¹ is also a valid Ͻ program, and *every*
> Ͻ program that does not make use of the additional guarantees (i.e. no
> C UB) is also a valid C program.
>
> […]
>
> find a non-sucking name that is ISO 646 IRV,


Let's call it Cava.

--
Jakub Wilk


Archive: https://lists.debian.org/20140415164521.ga2...@jwilk.net



Re: Having fun with the following C code (UB)

2014-04-15 Thread Thorsten Glaser
On Fri, 11 Apr 2014, Ansgar Burchardt wrote:

> On 04/11/2014 12:42, Ian Jackson wrote:

> > What people expect is that the compiler compiles programs the way C
> > was traditionally compiled.

Actually, I think we need to go further. We need a programming
language (with at least two compiler implementations), which I
will call Ͻ, that looks like C so much that *every* C program¹
is also a valid Ͻ program, and *every* Ͻ program that does not
make use of the additional guarantees (i.e. no C UB) is also a
valid C program.

Ͻ shall have absolutely no UB².

> Shouldn't -O0 come close to that expectation?

Sadly, no.

① That works with the additional guarantees³.
② No UB, every UB is defined (e.g. signed integers wrap around
  which makes this unusable on DSPs, but we don’t care about a
  DSP for Unix system programming), and most IB is also harmo-
  nised³.
③ This involves things like: only two’s complement⁴, bytes are
  8 bits (octets)⁵, right-shifting signed numbers DTRT, ABI is
  LP64 or ILP32, etc. but no implementation/environment stuff,
  i.e. no requirement for Linux TLS, POSIX pthreads, etc.
④ AIUI, there is no practical implementation of C on a machine
  that doesn’t use two’s complement, anyway.
⑤ Times of 18-/36-bit machines are over. We can just assume an
  8, 16, 32 and possibly 64 bit integer type exists. Also, PDP
  endian is dead. Only LE and BE for integers.
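
Until such a language exists, a C11 translation unit can at least refuse
to compile where some of these assumptions do not hold. This is a minimal
sketch: the selection of checks and the messages are only an illustration,
and the shift check relies on how the compiler evaluates an
implementation-defined constant expression.

```c
#include <limits.h>
#include <stdint.h>

/* Bytes are octets. */
_Static_assert(CHAR_BIT == 8, "bytes must be 8 bits");
/* Two's complement: the low bits of -1 are all set. */
_Static_assert((-1 & 3) == 3, "two's complement required");
/* Right-shifting a negative number keeps the sign. */
_Static_assert((-1 >> 1) == -1, "arithmetic right shift required");
/* ILP32 or LP64: int is 32 bits, long and pointers have the same size. */
_Static_assert(sizeof(int) * CHAR_BIT == 32, "int must be 32 bits");
_Static_assert(sizeof(long) == sizeof(void *), "ABI must be ILP32 or LP64");
/* Exact-width types exist (compilation fails if they do not). */
_Static_assert(sizeof(int8_t) == 1, "int8_t required");
_Static_assert(sizeof(int16_t) == 2, "int16_t required");
_Static_assert(sizeof(int32_t) == 4, "int32_t required");
```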

Some of this, and some other things, have been traditionally
guaranteed by some BSDs. Some of C’s rules have even been
“relaxed” (or, made stricter, depending on the PoV) by POSIX,
e.g. lengths of basic data types.

The important thing here is to not make Ͻ too different from C
(not make it too system-specific, so we can build a Ͻ compiler
on DEC ULTRIX 4.5 just as easily as mksh compiles on it (which
it does) or even nōn-Unix systems, even embedded systems, just
not on machines that don’t implement e.g. integer wraparound).

This is a challenge. Well, three (find a non-sucking name that
is ISO 646 IRV, design the language spec (base it on C11 maybe
even though I don’t like all the bloat that crept into it, but
it’d be consistent with the “not too different, just more sane
and more traditional” aspect) and write at least two compilers
for it).

I expect compilers to have support for C89 and C99 features no
longer in C11, possibly by using extra switches. Some K&R com-
patibility, too (some of the code I deal with is ancient). The
default mode should probably be standards-compliant plus added
guarantees only, though.

bye,
//mirabilos
-- 
Sometimes they [people] care too much: pretty printers [and syntax highligh-
ting, d.A.] mechanically produce pretty output that accentuates irrelevant
detail in the program, which is as sensible as putting all the prepositions
in English text in bold font.   -- Rob Pike in "Notes on Programming in C"


Archive: 
https://lists.debian.org/alpine.deb.2.10.1404151110510.4...@tglase.lan.tarent.de



Re: Having fun with the following C code (UB)

2014-04-15 Thread Vincent Lefevre
On 2014-04-14 17:01:42 -0700, Russ Allbery wrote:
> Vincent Lefevre  writes:
> > But what I mean is that it's pointless to emit such a warning when the
> > effect of the potential integer overflow is already visible, for
> > instance in printf below:
> 
> >   m = d * C;
> >   printf ("%d\n", m);
> >   return m >= 0;
> 
> > If there was an integer overflow, you will get an incorrect value output
> > by the printf. This means that it is very likely to be a false
> > positive. So, one doesn't want the warning.
> 
> It's not pointless because at least now you get a warning and may realize
> that the whole function is vulnerable when you go look at the warning
> site.
> 
> In other words, what you would (rightfully) like is a warning when you're
> invoking signed integer overflow, or at least the compiler can't prove
> you're not.  Unfortunately, the compiler isn't good enough to give you
> that warning.  Your options are a warning when the compiler can figure
> that out, which currently only triggers in some optimization paths, or no
> warning at all.

The cases "m = d * C" and "m >= 0" are mostly the same, i.e. with the
same false positives in practice. So, there's no reason to provide a
warning for the second one only. Actually there are already various
complaints concerning this warning:

http://gcc.gnu.org/bugzilla/buglist.cgi?quicksearch=Wstrict-overflow&list_id=87804

In particular for

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34515

Andrew Pinski said: "For the first warning, even though the warning
is correct, I don't think we should warn here as the expressions are
split between two different statements.", which is more or less my
point here (the first overflow occurs before the "m >= 0").

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140415082018.ga5...@xvii.vinc17.org



Re: Having fun with the following C code (UB)

2014-04-14 Thread Russ Allbery
Vincent Lefevre  writes:

> But what I mean is that it's pointless to emit such a warning when the
> effect of the potential integer overflow is already visible, for
> instance in printf below:

>   m = d * C;
>   printf ("%d\n", m);
>   return m >= 0;

> If there was an integer overflow, you will get an incorrect value output
> by the printf. This means that it is very likely to be a false
> positive. So, one doesn't want the warning.

It's not pointless because at least now you get a warning and may realize
that the whole function is vulnerable when you go look at the warning
site.

In other words, what you would (rightfully) like is a warning when you're
invoking signed integer overflow, or at least the compiler can't prove
you're not.  Unfortunately, the compiler isn't good enough to give you
that warning.  Your options are a warning when the compiler can figure
that out, which currently only triggers in some optimization paths, or no
warning at all.

I would like the warning that you want as well, but failing that, I'll
take the optimization path one as at least something.

-- 
Russ Allbery (r...@debian.org)   


Archive: https://lists.debian.org/8738hf8nq1@windlord.stanford.edu



Re: Having fun with the following C code (UB)

2014-04-14 Thread Julian Taylor
On 14.04.2014 14:26, Raphael Geissert wrote:
> Russ Allbery wrote:
>> Shachar Shemesh  writes:
>>> Do you really want to get a "Warning: signed integer overflow yields
>>> undefined behavior" on this function?
>>
>> I would certainly like to be able to enable such a thing.  I write a lot
>> of code where I'd love the compiler to double-check that I've established
>> bounds checks on a and b before doing the addition that guarantee that it
>> won't overflow.
> 
> Not quite to that point, but you might be interested in the UBS:
> 
> http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
> 
> More specifically, two options: -fsanitize=undefined and -fsanitize=integer
> 
> And some nice examples:
> http://blog.regehr.org/archives/1054
> http://blog.regehr.org/archives/963
> 
> Cheers,
> 

fyi, gcc-4.9 which is currently available as a release candidate also
has the undefined behavior (and address/thread) sanitizer included, it
is enabled the same way with -fsanitize=...
http://gcc.gnu.org/gcc-4.9/changes.html

but it can only detect it when undefined behavior really occurs (and
ubsan has a check for it), which is not often the case in regular
testsuites or normal application runs.
It is probably most useful combined with fuzz testing to trigger code
paths you didn't account for.
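
As a sketch of that workflow (the file name, function name and build line
are assumptions made for this example):

```c
/*
 * Possible build line:
 *   cc -O2 -fsanitize=undefined -o total_length total_length.c
 */
#include <stddef.h>

/* Sums n length fields; overflows int when the inputs are large enough. */
int
total_length(const int *lengths, size_t n)
{
    int total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += lengths[i];    /* reported at run time only if it overflows */
    return total;
}
```

A test suite that only feeds small values never reaches the overflowing
addition, so the sanitizer stays silent; a fuzzer that generates arbitrary
lengths is more likely to get there.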


Archive: https://lists.debian.org/534c23e9.6020...@googlemail.com



Re: Having fun with the following C code (UB)

2014-04-14 Thread Vincent Lefevre
On 2014-04-14 14:14:14 +0200, Raphael Geissert wrote:
> Vincent Lefevre wrote:
> [...]
> > int foo (int d)
> > {
> >   int m;
> >   m = d * 64;
> >   return m;
> > }
> [...]
> > while the cause of a potential bug would be the same. For consistency,
> > GCC should have warned for the first code too.
> 
> No, there is no optimisation in that case, so there is no warning. It only 
> warns when it uses the knowledge that "(signed) integer overflow isn't 
> possible" to optimise away some redundant code.

But what I mean is that it's pointless to emit such a warning when
the effect of the potential integer overflow is already visible,
for instance in printf below:

  m = d * C;
  printf ("%d\n", m);
  return m >= 0;

If there was an integer overflow, you will get an incorrect value
output by the printf. This means that it is very likely to be a false
positive. So, one doesn't want the warning.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140414132123.ga25...@ypig.lip.ens-lyon.fr



Re: Having fun with the following C code (UB)

2014-04-14 Thread Raphael Geissert
Russ Allbery wrote:
> Shachar Shemesh  writes:
>> Do you really want to get a "Warning: signed integer overflow yields
>> undefined behavior" on this function?
> 
> I would certainly like to be able to enable such a thing.  I write a lot
> of code where I'd love the compiler to double-check that I've established
> bounds checks on a and b before doing the addition that guarantee that it
> won't overflow.

Not quite to that point, but you might be interested in the UBS:

http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation

More specifically, two options: -fsanitize=undefined and -fsanitize=integer

And some nice examples:
http://blog.regehr.org/archives/1054
http://blog.regehr.org/archives/963

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net


Archive: https://lists.debian.org/ligk5k$gcn$1...@ger.gmane.org



Re: Having fun with the following C code (UB)

2014-04-14 Thread Raphael Geissert
Vincent Lefevre wrote:
[...]
> int foo (int d)
> {
>   int m;
>   m = d * 64;
>   return m;
> }
[...]
> while the cause of a potential bug would be the same. For consistency,
> GCC should have warned for the first code too.

No, there is no optimisation in that case, so there is no warning. It only 
warns when it uses the knowledge that "(signed) integer overflow isn't 
possible" to optimise away some redundant code.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net


Archive: https://lists.debian.org/ligjem$plj$1...@ger.gmane.org



Re: Having fun with the following C code (UB)

2014-04-14 Thread Vincent Lefevre
On 2014-04-14 13:11:12 +0200, Jakub Wilk wrote:
> * Vincent Lefevre , 2014-04-14, 12:56:
> >IMHO, in general, for security, it is better to run code with a sanitizer
> >(such as "clang -fsanitize=undefined -fno-sanitize-recover", assuming that
> >the code does not use floating point),
> 
> gcc has also -ftrapv, which might be what you want.

But it just supports +, - and *, and has various bugs and
limitations:

  http://gcc.gnu.org/bugzilla/buglist.cgi?quicksearch=trapv&list_id=87725

clang's sanitizer covers many more operations.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140414120408.gc5...@ypig.lip.ens-lyon.fr



Re: Having fun with the following C code (UB)

2014-04-14 Thread Vincent Lefevre
On 2014-04-12 20:32:33 -0700, Russ Allbery wrote:
> I enabled -fstrict-overflow -Wstrict-overflow=5 -Werror in my standard
[...]

GCC does silly things with -Wstrict-overflow=5.

For instance, consider the following code:

int foo (int d)
{
  int m;
  m = d * 64;
  return m;
}

With "gcc -O2 -fstrict-overflow -Wstrict-overflow=5", everything
is fine. But if the return value is replaced by "m >= 0", giving the
following code:

int foo (int d)
{
  int m;
  m = d * 64;
  return m >= 0;
}

I get:

tst.c: In function ‘foo’:
tst.c:5:12: warning: assuming signed overflow does not occur when eliminating multiplication in comparison with zero [-Wstrict-overflow]
   return m >= 0;
            ^

while the cause of a potential bug would be the same. For consistency,
GCC should have warned for the first code too.

This affects the compilation of the MPFR trunk, which has similar
code... except that MPFR also has an overflow check, which GCC does
not take into account, like this:

#include <limits.h>

#define C 64

int foo (int d)
{
  int m;
  if (d <= 0 || d > INT_MAX / C)
    return 0;
  m = d * C;
  return m >= 0;
}

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140414115111.gb5...@ypig.lip.ens-lyon.fr



Re: Having fun with the following C code (UB)

2014-04-14 Thread Jakub Wilk

* Vincent Lefevre , 2014-04-14, 12:56:
> IMHO, in general, for security, it is better to run code with a
> sanitizer (such as "clang -fsanitize=undefined -fno-sanitize-recover",
> assuming that the code does not use floating point),


gcc has also -ftrapv, which might be what you want.

--
Jakub Wilk


Archive: https://lists.debian.org/20140414111050.ga6...@jwilk.net



Re: Having fun with the following C code (UB)

2014-04-14 Thread Vincent Lefevre
On 2014-04-10 14:38:46 -0700, Russ Allbery wrote:
> I don't want, necessarily, to have slower code to make handling
> corner cases easier. However, I am generally happy to have slower
> code in return for making the system more secure, as long as the
> speed hit isn't too substantial. Security is a much bigger problem
> than performance right now for most people.

I agree that security (in the hypothesis that bugs may remain in
the software) is preferable to speed, but assuming wrapped signed
arithmetic is the wrong thing to do, and using -fwrapv may reveal
bugs. For instance, you may have tests like:

  a + C1 > b + C2

where C1 and C2 are two constants such that C2 > C1 > 0. Without
-fwrapv, the compiler would typically rewrite the above test as:
a > b + C3, where C3 = C2 - C1. But with -fwrapv, you'll still get
the same test a + C1 > b + C2. If a + C1 overflows (which is a bug
in the code, but was hidden by the optimization), this will give a
different (and wrong) result.

The behavior without -fwrapv is closer to what the user would expect.
So, using -fwrapv gives a false sense of security.

IMHO, in general, for security, it is better to run code with a
sanitizer (such as "clang -fsanitize=undefined -fno-sanitize-recover",
assuming that the code does not use floating point), as long as
denial of service due to a crash from a bug is regarded as preferable
to uncontrolled behavior.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: https://lists.debian.org/20140414105605.ga5...@ypig.lip.ens-lyon.fr



Re: Having fun with the following C code (UB)

2014-04-13 Thread Shachar Shemesh
On 13/04/14 06:32, Russ Allbery wrote:
>> Like I said before, I am not against the compilers warning about such
>> cases. I just think that these warnings need to be done very carefully,
>> or they become worse than useless.  As such, if you see a case in which
>> you feel gcc (or clang, or whatever) should warn, by all means open a
>> bug for it.  Just make sure you make it a "feature request" and not a
>> "security hole" severity.  In other words, don't get mad merely because
>> the compiler author did not read your mind.
> I'll be sure to keep that in mind, since I've never reported a bug or
> discussed issues with compiler writers before.
No issues with most of what you said. It boils down to this: Different
people write in C/C++ for different reasons, and therefore have different
needs. I think the compiler writers are doing what they can, but there
really is no pleasing everyone.

Personally, I simply try to have my code compile on as broad a range of
compilers as I can, and thus enjoy their combined warnings.

The thing about the above is this. In the past, we've seen some people
really explode over these issues. I think this is incorrect. The bug is,
when all is said and done, in the code. While it's perfectly acceptable,
in my eyes, to ask the compiler to help you find that bug, getting mad
at it for not doing so makes no sense.

Shachar


Re: Having fun with the following C code (UB)

2014-04-12 Thread Russ Allbery
Shachar Shemesh  writes:
> On 13/04/14 05:39, Russ Allbery wrote:

>> One can make a good argument that such checks are exactly what you
>> should be doing.

> Then the answer is very simple. Write in Java.

There are a lot of reasons other than the absolute fastest performance you
can possibly muster to write something in C instead of Java.  For example,
if you're writing an Apache or PAM module, Java is not very useful, but
that doesn't mean you need micro-optimized signed integer math or need to
worry about the few instructions it takes to check parameters for whether
they're NULL.

There are some very experienced C programmers that make extensive use of
preconditions and checks.  For example, there was an article in
Communications of the ACM within the past year (sadly, I don't remember
exactly where and a quick search didn't help me out) that talked about a C
coding style that aimed for defining checks for every non-trivial
statement.  With some macro assistance, the resulting code wasn't too
ugly, and the result is rather appealing.  It's been something I've been
pondering experimenting with since reading that article.

> I am not a compiler writer, so I have no actual data. I suspect your
> common 20k-line program will yield about a thousand such warnings, the
> huge majority of which you will be able to do nothing about.

> Also, it turns out gcc does have such an option. See
> http://www.airs.com/blog/archives/120. -Wstrict-overflow will let you
> know when the optimizer uses the assumption of no overflow to change
> other code.

Thanks for the pointer to that option!  I'd missed that.

I enabled -fstrict-overflow -Wstrict-overflow=5 -Werror in my standard
utility library, which has about 13,000 lines of C, and got one additional
warning, which was trivial to fix (and which was in a bit of code that has
a rather bad smell and has been on my list to rewrite when I get a
chance).  So I'm not horribly impressed by your doomsday worries.  :)
That said, I believe that warning flag does not catch all possible
overflows, just the ones where GCC happens to do an optimization.

I use all sorts of warning flags, with -Werror, that I've seen other
people claim are completely unworkable in practice, like -Wwrite-strings
and -Wsign-compare.  They're not.  They just require care and attention
and writing high-quality C code.

>> Put a different way, the answer to your question is quite different if
>> that function were instead:
>>
>> int compute_offset_into_network_packet( int a, int b )
>> {
>> return a+b;
>> }
>>
>> No?

> In most cases, you will overflow the packet long before you overflow the
> integer.

Not if either a or b is coming from the network, which is the case I'm
concerned about.  You should do bounds-checking before using them to look
at offsets in a packet.  Tons and tons of security vulnerabilities have
happened due to lack of that bounds-checking, or getting the check wrong.
If there was a way for the compiler to check whether you've done that
bounds-checking before you start doing math with those values, that would
be very helpful.  Obviously, that warning might not be appropriate for all
code, but gcc has a rich pragmata system for handling that.

> If that's the case, the compiler won't help you.

That's the problem with C code in general: the compiler doesn't help you
enough.  Which is why I like seeing more warnings and smarter compilers
that can, from the code, work out what invariants you've established and
then warn you when you haven't checked for ones that may be important.

clang, for example, does a great job of this (far better than gcc) at
detecting variables that may be NULL at the point of use.

There are, obviously, limits, since the C language semantically doesn't
give either the author or the compiler a lot of help.  But there are still
opportunities for improvement.  Ten years ago, I would have said that a
lot of the diagnostics that clang provides were impossible in C because
there just wasn't enough information available to the compiler.  I was
wrong.

> Like I said before, I am not against the compilers warning about such
> cases. I just think that these warnings need to be done very carefully,
> or they become worse than useless.  As such, if you see a case in which
> you feel gcc (or clang, or whatever) should warn, by all means open a
> bug for it.  Just make sure you make it a "feature request" and not a
> "security hole" severity.  In other words, don't get mad merely because
> the compiler author did not read your mind.

I'll be sure to keep that in mind, since I've never reported a bug or
discussed issues with compiler writers before.

-- 
Russ Allbery (r...@debian.org)   


Archive: https://lists.debian.org/87lhv9sy3y@windlord.stanford.edu



Re: Having fun with the following C code (UB)

2014-04-12 Thread Shachar Shemesh
On 13/04/14 05:39, Russ Allbery wrote:
> One can make a good argument that such checks are exactly what you should
> be doing.
Then the answer is very simple. Write in Java.
>> My understanding of things is that undefined behaviors are fairly
>> common, and almost always benign. Look at the following code:
>> int add( int a, int b )
>> {
>> return a+b;
>> }
>> Do you really want to get a "Warning: signed integer overflow yields
>> undefined behavior" on this function?
> I would certainly like to be able to enable such a thing.  I write a lot
> of code where I'd love the compiler to double-check that I've established
> bounds checks on a and b before doing the addition that guarantee that it
> won't overflow.
I am not a compiler writer, so I have no actual data. I suspect your
typical 20k-line program will yield about a thousand such warnings, the
vast majority of which you will be able to do nothing about.

Also, it turns out gcc does have such an option. See
http://www.airs.com/blog/archives/120. -Wstrict-overflow will let you
know when the optimizer uses the assumption of no overflow to change
other code.
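
As a rough illustration of the kind of code that option is about (the
function names below are made up for this sketch and do not come from the
linked post): a compiler that assumes signed overflow never happens is
allowed to fold the first check to a constant, while the second asks the
same question without ever overflowing.

```c
#include <limits.h>

/* Hypothetical example: relies on x + 1 wrapping around, which is
 * undefined for signed int. An optimizer that assumes "no overflow"
 * is allowed to treat this function as always returning 1. */
int can_increment_fragile(int x)
{
    return x + 1 > x;
}

/* Same question, asked without performing the overflowing addition. */
int can_increment(int x)
{
    return x < INT_MAX;
}
```

Whether a given gcc version actually warns on the first function depends on
the optimization level and the -Wstrict-overflow level in use.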
>
> Put a different way, the answer to your question is quite different if
> that function were instead:
>
> int compute_offset_into_network_packet( int a, int b )
> {
> return a+b;
> }
>
> No?
>
In most cases, you will overflow the packet long before you overflow the
integer. If that's the case, the compiler won't help you. There is a
good case to claim that the warning would be appropriate for the
following code:

int compute_offset_into_network_packet( int a, int b )
{
int offset = a+b;
if( offset<0 || offset>PACKET_SIZE )
offset = 0;

return offset;
}

But, then again, what should the warning be? Like I said before, if you
don't like to deal with overflows, use Java and take Java's performance
hit. In fact, most of the world is doing precisely that.
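
For comparison with the function above, here is a minimal sketch that
validates the inputs before the addition rather than the result after it.
The name packet_offset, the bool return convention and the value of
PACKET_SIZE are assumptions made for this example only.

```c
#include <stdbool.h>

#define PACKET_SIZE 1500   /* assumed value; the mail leaves it unspecified */

/* Hypothetical variant: both inputs are range-checked first, so a + b is
 * at most 2 * PACKET_SIZE and cannot overflow an int. Returns false
 * instead of silently falling back to offset 0. */
bool packet_offset(int a, int b, int *offset)
{
    if (a < 0 || a > PACKET_SIZE || b < 0 || b > PACKET_SIZE)
        return false;
    if (a + b > PACKET_SIZE)
        return false;
    *offset = a + b;
    return true;
}
```

With the checks in this order there is no signed overflow for the optimizer
to reason about, so no warning is needed for this particular function.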

Like I said before, I am not against the compilers warning about such
cases. I just think that these warnings need to be done very carefully,
or they become worse than useless. As such, if you see a case in which
you feel gcc (or clang, or whatever) should warn, by all means open a
bug for it. Just make sure you make it a "feature request" and not a
"security hole" severity. In other words, don't get mad merely because
the compiler author did not read your mind.

I don't know whether -Wstrict-overflow is on for -Wall (or -Wextra). If
it isn't, I do think it should be. Just checked, and it is on for -Wall,
sort of. See http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html.

Shachar


Re: Having fun with the following C code (UB)

2014-04-12 Thread Russ Allbery
Shachar Shemesh  writes:

> I will point out that it is not always possible, and is quite often
> not easy. For example, the famous "undefined after NULL dereference"
> would probably cause a warning every time a function uses a pointer it
> was given without first validating its non-NULLness.

One can make a good argument that such checks are exactly what you should
be doing.

I used to be mildly opposed to this coding style since I felt like it led
to a lot of code clutter, but the more time I spend looking at security
vulnerabilities, the more I've come around to that approach.  I'm still
not sure that I want shared libraries calling assert(), but on the other
hand I can think of a lot of places where I'd rather have the shared
library call assert() than to go on and quietly do something bogus.
(Best, of course, is if you can return some sort of error in a reasonable
way, but that does force an often-awkward way of writing code.)

> My understanding of things is that undefined behaviors are fairly
> common, and almost always benign. Look at the following code:

> int add( int a, int b )
> {
> return a+b;
> }

> Do you really want to get a "Warning: signed integer overflow yields
> undefined behavior" on this function?

I would certainly like to be able to enable such a thing.  I write a lot
of code where I'd love the compiler to double-check that I've established
bounds checks on a and b before doing the addition that guarantee that it
won't overflow.

Put a different way, the answer to your question is quite different if
that function were instead:

int compute_offset_into_network_packet( int a, int b )
{
return a+b;
}

No?
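
A minimal sketch of the kind of bounds check meant here, assuming a
hypothetical helper called checked_add (the name and signature are
illustrative, not an existing API). It compares against INT_MAX and
INT_MIN before adding, so the addition itself can never overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true only when
 * the mathematical result fits in an int. */
bool checked_add(int a, int b, int *sum)
{
    if (b > 0 && a > INT_MAX - b)
        return false;   /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* would go below INT_MIN */
    *sum = a + b;
    return true;
}
```

A caller would use it as "if (!checked_add(a, b, &offset)) reject the
packet", which is the invariant a smarter compiler could then verify.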

-- 
Russ Allbery (r...@debian.org)   





Re: Having fun with the following C code (UB)

2014-04-12 Thread Shachar Shemesh
On 12/04/14 23:38, Henrique de Moraes Holschuh wrote:
> On Thu, 10 Apr 2014, Shachar Shemesh wrote:
>> I never did understand what people expect. gcc uses the undefined
> Warn the hell out of any line of code with per-spec undefined behaviour, if
> not by default, at least under -Wall.
I have no argument with that, in those places it is possible.

I will point out that it is not always possible, and is quite often
not easy. For example, the famous "undefined after NULL dereference"
would probably cause a warning every time a function uses a pointer it
was given without first validating its non-NULLness.
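
To make the NULL case concrete, here is a small sketch (struct packet and
both function names are invented for this example). In the first function
the pointer is used before it is tested, so a compiler may conclude that it
cannot be NULL and drop the check. The second function validates first.

```c
#include <stddef.h>

struct packet {          /* hypothetical type for this sketch */
    int length;
};

/* Fragile: p is dereferenced before it is tested. Because dereferencing
 * NULL is undefined, the later check may legally be optimized away. */
int packet_length_fragile(const struct packet *p)
{
    int len = p->length;
    if (p == NULL)
        return -1;
    return len;
}

/* Validate first, then use. */
int packet_length(const struct packet *p)
{
    if (p == NULL)
        return -1;
    return p->length;
}
```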

> THAT would be a good start.  Too bad not even gcc knows every time it hits
> undefined behaviour...
My understanding of things is that undefined behaviors are fairly
common, and almost always benign. Look at the following code:

int add( int a, int b )
{
return a+b;
}

Do you really want to get a "Warning: signed integer overflow yields
undefined behavior" on this function?

Shachar


Re: Having fun with the following C code (UB)

2014-04-12 Thread Henrique de Moraes Holschuh
On Thu, 10 Apr 2014, Shachar Shemesh wrote:
> I never did understand what people expect. gcc uses the undefined

Warn the hell out of any line of code with per-spec undefined behaviour, if
not by default, at least under -Wall.

THAT would be a good start.  Too bad not even gcc knows every time it hits
undefined behaviour...

> Are you really sure you want to have slower code just so that your
> corner cases are easier for you? How is that a reasonable trade-off to make?

Yes in just about everything that did not ask for -On where n > 2.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh





Re: Having fun with the following C code (UB)

2014-04-11 Thread Shachar Shemesh
On 11/04/14 13:49, Ansgar Burchardt wrote:
> Hi,
>
> On 04/11/2014 12:42, Ian Jackson wrote:
>>
>> What people expect is that the compiler compiles programs the way C
>> was traditionally compiled.
> Shouldn't -O0 come close to that expectation?
I think that Ansgar's answer is spot on, but against all good sense, I
still want to expand it.

Neither the compiler nor its authors are doing anything out of spite. It
is, indeed, painful when a compiler optimizes away a security check due
to some standard defining a feature to be "undefined behavior". However,
for any such case there are hundreds in which this optimization saves on
an "if" that would strain the branch prediction cache, or allows
coalescing operations that would otherwise need to be done one after the
other, or any number of other cases in which the output machine language
looks nothing like your written high level C or C++.

Not only is this good for performance, it is also good for security. For
example, in C++ I can run the following code:

for( unsigned int i=0; i

Re: Having fun with the following C code (UB)

2014-04-11 Thread Ansgar Burchardt
Hi,

On 04/11/2014 12:42, Ian Jackson wrote:
> Shachar Shemesh writes ("Re: Having fun with the following C code (UB)"):
>> I never did understand what people expect.
> 
> What people expect is that the compiler compiles programs the way C
> was traditionally compiled.

Shouldn't -O0 come close to that expectation?

Ansgar





Re: Having fun with the following C code (UB)

2014-04-11 Thread Ian Jackson
Shachar Shemesh writes ("Re: Having fun with the following C code (UB)"):
> I never did understand what people expect.

What people expect is that the compiler compiles programs the way C
was traditionally compiled.

Obviously that expectation nowadays results in disappointment, and
isn't captured by formal standards bodies.

> gcc uses the undefined behavior to not emit checks it would
> otherwise have to, so that your code runs faster.

Most of us would prefer slower code with fewer hideous security bugs.

> Are you really sure you want to have slower code just so that your
> corner cases are easier for you? How is that a reasonable trade-off to make?

Yes, I am absolutely sure of that.

Ian.





Re: Having fun with the following C code (UB)

2014-04-10 Thread Paul Wise
On Fri, Apr 11, 2014 at 5:38 AM, Russ Allbery wrote:

> I don't want, necessarily, to have slower code to make handling corner
> cases easier.  However, I am generally happy to have slower code in return
> for making the system more secure, as long as the speed hit isn't too
> substantial.  Security is a much bigger problem than performance right now
> for most people.

How much of a speed hit is acceptable? Perhaps we should have a
secondary archive built using SoftBoundCETS, which possibly has a 50%
speed hit according to this talk:

http://events.ccc.de/congress/2013/Fahrplan/events/5412.html
https://media.ccc.de/browse/congress/2013/30C3_-_5412_-_en_-_saal_1_-_201312271830_-_bug_class_genocide_-_andreas_bogk.html
http://acg.cis.upenn.edu/softbound/
http://safecode.cs.illinois.edu/docs/SoftBoundCETS.html

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Re: Having fun with the following C code (UB)

2014-04-10 Thread Russ Allbery
Shachar Shemesh  writes:

> I never did understand what people expect. gcc uses the undefined
> behavior to not emit checks it would otherwise have to, so that your
> code runs faster. This affects not only those corner cases, where you
> are relying on this behaving a certain way, but especially in everyday
> code, where that undefined behavior allows GCC to save you lots of
> cycles.

> Are you really sure you want to have slower code just so that your
> corner cases are easier for you? How is that a reasonable trade-off to
> make?

I don't want, necessarily, to have slower code to make handling corner
cases easier.  However, I am generally happy to have slower code in return
for making the system more secure, as long as the speed hit isn't too
substantial.  Security is a much bigger problem than performance right now
for most people.

The hard part is distinguishing between those two properties.

-- 
Russ Allbery (r...@debian.org)   





Re: Having fun with the following C code (UB)

2014-04-10 Thread Shachar Shemesh
On 10/04/14 20:59, Ian Jackson wrote:
> Vincent Lefevre writes ("Re: Having fun with the following C code (UB)"):
>> On 2014-04-10 11:48:44 +, Thorsten Glaser wrote:
>>> And GCC is a repeat offender which actually does do that.
>> If you don't like that, you should use the -fwrapv option.
> Sadly that doesn't deal with all of these malicious optimisations.
>
I never did understand what people expect. gcc uses the undefined
behavior to not emit checks it would otherwise have to, so that your
code runs faster. This affects not only those corner cases, where you
are relying on this behaving a certain way, but especially in everyday
code, where that undefined behavior allows GCC to save you lots of cycles.

Are you really sure you want to have slower code just so that your
corner cases are easier for you? How is that a reasonable trade-off to make?

Shachar


Re: Having fun with the following C code (UB)

2014-04-10 Thread Ian Jackson
Vincent Lefevre writes ("Re: Having fun with the following C code (UB)"):
> On 2014-04-10 11:48:44 +, Thorsten Glaser wrote:
> > And GCC is a repeat offender which actually does do that.
> 
> If you don't like that, you should use the -fwrapv option.

Sadly that doesn't deal with all of these malicious optimisations.

But it is a good start.  Personally I think we should compile the
whole distro (or at least most of it) with -fwrapv.

Ian.





Re: Having fun with the following C code (UB)

2014-04-10 Thread Vincent Lefevre
On 2014-04-10 11:48:44 +, Thorsten Glaser wrote:
> Ian Jackson dixit:
> 
> >> If the architecture uses two's complement, however, then the code is
> >> correct.
> >
> >Unfortunately adversarial optimisation by modern compilers means that
> >this kind of reasoning is no longer valid.
> >
> >The compiler might easily see that your code unconditionally performs
> >a computation with undefined behaviour, and delete it.
> 
> And GCC is a repeat offender which actually does do that.

If you don't like that, you should use the -fwrapv option.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)





Re: Having fun with the following C code (UB)

2014-04-10 Thread Jakub Wilk

* Wouter Verhelst , 2014-04-10, 12:42:
>>> I've had to figure out the size of off_t in nbd-server, and have been
>>> doing it without relying on overflow, for years now. It took quite a
>>> few iterations to get it right, but the current definition has looked
>>> like this since 2006:
>>>
>>> #define OFFT_MAX ~((off_t)1<<(sizeof(off_t)*8-1))
>>>
>>> i.e., left-shift 1 by enough bits so that the most significant bit is set,
>>
>> I believe that this code triggers undefined behavior. My C99 draft reads:
>>
>> The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated
>> bits are filled with zeros. […] If E1 has a signed type and
>> nonnegative value, and E1 × 2^(E2) is representable in the result
>> type, then that is the resulting value; otherwise, the behavior is
>> undefined.
>
> Yes; the standard does this to allow for machine architectures which do
> not use two's complement to store negative values. I did mention that
> assumption in my previous mail.

I thought you were referring to use of ~ on a signed integer, which is
implementation-defined.

Here's a way to compute OFFT_MAX (hopefully) without any undefined
behavior:

-((off_t)-2 * ((off_t)1 << (sizeof (off_t) * CHAR_BIT - 2)) + 1)
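
The expression above can be wrapped in a macro and traced step by step.
This is only a sketch: the macro name OFFT_MAX_SAFE is made up here, and
the comments assume off_t is a signed two's complement type without
padding bits, N bits wide.

```c
#include <limits.h>
#include <sys/types.h>

/* Intermediate values, none of which leaves the range of off_t:
 *   (off_t)1 << (N - 2)   ->  2^(N-2)
 *   -2 * 2^(N-2)          -> -2^(N-1)       (the minimum value)
 *   ... + 1               -> -(2^(N-1) - 1)
 *   negated               ->  2^(N-1) - 1   (the maximum value) */
#define OFFT_MAX_SAFE \
    (-((off_t)-2 * ((off_t)1 << (sizeof(off_t) * CHAR_BIT - 2)) + 1))
```

The shift only moves a 1 into the second-highest bit, so the sign bit is
never touched by the << operator itself.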

--
Jakub Wilk





Re: Having fun with the following C code (UB)

2014-04-10 Thread Azazel
On 2014-04-10 12:42:03 +0200, Wouter Verhelst wrote:
> On Thu, Apr 10, 2014 at 12:29:50PM +0200, Jakub Wilk wrote:
> > * Wouter Verhelst , 2014-04-10, 12:03:
> > > I've had to figure out the size of off_t in nbd-server, and have
> > > been doing it without relying on overflow, for years now. It took
> > > quite a few iterations to get it right, but the current definition
> > > has looked like this since 2006:
> > >
> > > #define OFFT_MAX ~((off_t)1<<(sizeof(off_t)*8-1))
> > >
> > > i.e., left-shift 1 by enough bits so that the most significant bit
> > > is set,
> >
> > I believe that this code triggers undefined behavior. My C99 draft
> > reads:
> >
> > The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated
> > bits are filled with zeros. […] If E1 has a signed type and
> > nonnegative value, and E1 × 2^(E2) is representable in the result
> > type, then that is the resulting value; otherwise, the behavior is
> > undefined.
>
> Yes; the standard does this to allow for machine architectures which
> do not use two's complement to store negative values. I did mention
> that assumption in my previous mail.
>
> If the architecture uses two's complement, however, then the code is
> correct.

Chapter and verse?  C99, sec. 6.5.7, para. 4, quoted above, makes no
such distinction.  The operation is simply defined in terms of
multiplication by powers of two.  If off_t is a signed type,

  1 * 2 ^ (sizeof (off_t) * CHAR_BIT - 1)

cannot be represented in off_t, and the behaviour is undefined.
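
The rule can be turned into a small predicate. The sketch below uses long
in place of off_t and a made-up function name, and it also applies the
separate C99 requirement that the shift count be smaller than the width of
the type. It assumes long has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true when e1 << e2 has a defined result for type
 * long, i.e. e1 is nonnegative, e2 is smaller than the width of long,
 * and e1 * 2^e2 is representable. */
bool shl_is_defined(long e1, unsigned int e2)
{
    if (e1 < 0)
        return false;
    if (e2 >= sizeof(long) * CHAR_BIT)
        return false;
    /* e1 * 2^e2 <= LONG_MAX  is equivalent to  e1 <= LONG_MAX / 2^e2 */
    return e1 <= (LONG_MAX >> e2);
}
```

For e1 == 1 and e2 equal to the width minus one, which is the OFFT_MAX
case, the predicate returns false.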

Az.




Re: Having fun with the following C code (UB)

2014-04-10 Thread Thorsten Glaser
Ian Jackson dixit:

>> If the architecture uses two's complement, however, then the code is
>> correct.
>
>Unfortunately adversarial optimisation by modern compilers means that
>this kind of reasoning is no longer valid.
>
>The compiler might easily see that your code unconditionally performs
>a computation with undefined behaviour, and delete it.

And GCC is a repeat offender which actually does do that.
(mksh’s internal guaranteed-to-wrap-around signed 32-bit integer
arithmetic has, for a while now, been implemented using only C unsigned
integer types because of this. Yes, that is a speed hit, especially since
the CPUs (except DSPs, possibly) could all do this correctly.)
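
A sketch of the general technique, not mksh's actual code: the function
name wrap_add32 is invented for this example. The addition is done on
uint32_t, where wrap-around is defined, and the result is mapped back to
int32_t without relying on an out-of-range conversion.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit signed addition that wraps around on
 * overflow, using only defined operations. */
int32_t wrap_add32(int32_t a, int32_t b)
{
    /* Unsigned arithmetic is reduced modulo 2^32 by definition. */
    uint32_t r = (uint32_t)a + (uint32_t)b;

    if (r <= (uint32_t)INT32_MAX)
        return (int32_t)r;

    /* r is in [2^31, 2^32 - 1]: shift it down into [0, 2^31 - 1],
     * convert, then add INT32_MIN to land in [INT32_MIN, -1]. */
    return (int32_t)(r - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

The extra branch for the conversion is part of the speed hit mentioned
above; a compiler may or may not reduce it to a plain add.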

bye,
//mirabilos
-- 
In traditional syntax ' is ignored, but in c99 everything between two ' is
handled as character constant.  Therefore you cannot use ' in a preproces-
sing file in c99 mode.  -- Ragge
No faith left in ISO C99, undefined behaviour, etc.





Re: Having fun with the following C code (UB)

2014-04-10 Thread Ian Jackson
Wouter Verhelst writes ("Re: Having fun with the following C code (UB)"):
> Yes; the standard does this to allow for machine architectures which do
> not use two's complement to store negative values. I did mention that
> assumption in my previous mail.
> 
> If the architecture uses two's complement, however, then the code is
> correct.

Unfortunately adversarial optimisation by modern compilers means that
this kind of reasoning is no longer valid.

The compiler might easily see that your code unconditionally performs
a computation with undefined behaviour, and delete it.

Ian.





Re: Having fun with the following C code (UB)

2014-04-10 Thread Wouter Verhelst
On Thu, Apr 10, 2014 at 12:29:50PM +0200, Jakub Wilk wrote:
> * Wouter Verhelst , 2014-04-10, 12:03:
> >I've had to figure out the size of off_t in nbd-server, and have been
> >doing it without relying on overflow, for years now. It took quite a few
> >iterations to get it right, but the current definition has looked like
> >this since 2006:
> >
> >#define OFFT_MAX ~((off_t)1<<(sizeof(off_t)*8-1))
> >
> >i.e., left-shift 1 by enough bits so that the most significant bit is set,
> 
> I believe that this code triggers undefined behavior. My C99 draft reads:
> 
> The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are
> filled with zeros. […] If E1 has a signed type and nonnegative value, and E1
> × 2^(E2) is representable in the result type, then that is the resulting
> value; otherwise, the behavior is undefined.

Yes; the standard does this to allow for machine architectures which do
not use two's complement to store negative values. I did mention that
assumption in my previous mail.

If the architecture uses two's complement, however, then the code is
correct.

-- 
It is easy to love a country that is famous for chocolate and beer

  -- Barack Obama, speaking in Brussels, Belgium, 2014-03-26





Re: Having fun with the following C code (UB)

2014-04-10 Thread Jakub Wilk

* Wouter Verhelst , 2014-04-10, 12:03:
> I've had to figure out the size of off_t in nbd-server, and have been
> doing it without relying on overflow, for years now. It took quite a
> few iterations to get it right, but the current definition has looked
> like this since 2006:
>
> #define OFFT_MAX ~((off_t)1<<(sizeof(off_t)*8-1))
>
> i.e., left-shift 1 by enough bits so that the most significant bit is
> set,


I believe that this code triggers undefined behavior. My C99 draft reads:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits 
are filled with zeros. […] If E1 has a signed type and nonnegative 
value, and E1 × 2^(E2) is representable in the result type, then that is 
the resulting value; otherwise, the behavior is undefined.


--
Jakub Wilk





Re: Having fun with the following C code (UB)

2014-04-10 Thread Wouter Verhelst
On Thu, Mar 27, 2014 at 09:07:14AM +0100, Mathieu Malaterre wrote:
> Here is a little bug I just discovered:
> 
> http://stackoverflow.com/questions/22664658/finding-off-t-size
> 
> For reference, here are the packages affected in debian:
> 
> http://codesearch.debian.net/search?q=LARGE_OFF_T
> 
> For reference clang fails as was expected by the initial author, but
> recent gcc (default C compiler on debian), simply issue a warning.

I've had to figure out the size of off_t in nbd-server, and have been
doing it without relying on overflow, for years now. It took quite a few
iterations to get it right, but the current definition has looked like
this since 2006:

#define OFFT_MAX ~((off_t)1<<(sizeof(off_t)*8-1))

i.e., left-shift 1 by enough bits so that the most significant bit is
set, then flip all bits so you end up with the highest positive value
that fits in an off_t.

Obviously that requires an architecture which uses two's complement, but
then I doubt any architecture that doesn't has been popular since the
late seventies.

-- 
It is easy to love a country that is famous for chocolate and beer

  -- Barack Obama, speaking in Brussels, Belgium, 2014-03-26





Re: Having fun with the following C code (UB)

2014-04-08 Thread Henrique de Moraes Holschuh
On Thu, 27 Mar 2014, Jakub Wilk wrote:
> * Mathieu Malaterre , 2014-03-27, 13:06:
> >I preferred not to mass bug everyone out there and instead:
> >
> >http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780
> 
> But many packages don't regenerate autofoo at build-time. :-(
> 
> >LFS is still a release goal, not a requirement.
> 
> Then "severity: grave" is probably overkill. :-P

No, it is not.  It can cause data loss or corruption when operating with
large files, and it is aggravated by the fact that the application used to
work in LFS mode just fine on Wheezy, but it is now broken.

LFS has been a release goal for a LONG time, which means we have fixed a LOT
of packages to have LFS for at least two stable releases already and
therefore regressions on LFS support *are* an issue.

Also, autotooling at build time (i.e. regenerating the autofoo) is our
recommended best practice, exactly so that we can actually have a shot at
fixing this kind of crap...

IMO, this is very much a "grave" bug.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh





Re: Having fun with the following C code (UB)

2014-03-27 Thread Jakub Wilk

* Mathieu Malaterre , 2014-03-27, 15:04:
>>> I preferred not to mass bug everyone out there and instead:
>>>
>>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780
>>
>> But many packages don't regenerate autofoo at build-time. :-(
>
> And your point is ?

That they won't immediately benefit from fixed #742780.

--
Jakub Wilk





Re: Having fun with the following C code (UB)

2014-03-27 Thread Bastien ROUCARIES
Le 27 mars 2014 15:05, "Mathieu Malaterre"  a écrit :
>
> On Thu, Mar 27, 2014 at 2:50 PM, Jakub Wilk  wrote:
> > * Mathieu Malaterre , 2014-03-27, 13:06:
> >
> >> I preferred not to mass bug everyone out there and instead:
> >>
> >> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780
> >
> >
> > But many packages don't regenerate autofoo at build-time. :-(
>
> And your point is ?
>
> It may not impact packages built with gcc-4.5, only those that have
> been rebuilt since gcc-4.6. But anyway the generated code (whether it
> is in m4 or in the generated auto* stuff) has been bogus since the beginning.
>
> >> LFS is still a release goal, not a requirement.
> >
> >
> > Then "severity: grave" is probably overkill. :-P
>
> If as an application programmer I cannot get memcpy to copy past the
> first 32bits of size_t (x86_64), I would call it a grave issue in
> libc.
>
> Same thing if I use autoconf macro to tell whether or not my system
> support LFS, but it keeps on claiming it does not. I personally call
> it a grave issue ... right ?
>
> This is really a regression: packages on 32-bit archs used to support
> LFS, but since gcc 4.6 came in they do not anymore.

Could we detect this by checking configure? We could issue a Lintian tag.


Re: Having fun with the following C code (UB)

2014-03-27 Thread Mathieu Malaterre
On Thu, Mar 27, 2014 at 2:50 PM, Jakub Wilk  wrote:
> * Mathieu Malaterre , 2014-03-27, 13:06:
>
>> I preferred not to mass bug everyone out there and instead:
>>
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780
>
>
> But many packages don't regenerate autofoo at build-time. :-(

And your point is ?

It may not impact packages built with gcc-4.5, only those that have
been rebuilt since gcc-4.6. But anyway the generated code (whether it
is in m4 or in the generated auto* stuff) has been bogus since the beginning.

>> LFS is still a release goal, not a requirement.
>
>
> Then "severity: grave" is probably overkill. :-P

If as an application programmer I cannot get memcpy to copy past the
first 32bits of size_t (x86_64), I would call it a grave issue in
libc.

Same thing if I use an autoconf macro to tell whether or not my system
supports LFS, but it keeps on claiming it does not. I personally call
it a grave issue ... right ?

This is really a regression: packages on 32-bit archs used to support
LFS, but since gcc 4.6 came in they do not anymore.





Re: Having fun with the following C code (UB)

2014-03-27 Thread Jakub Wilk

* Mathieu Malaterre , 2014-03-27, 13:06:
> I preferred not to mass bug everyone out there and instead:
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780

But many packages don't regenerate autofoo at build-time. :-(

> LFS is still a release goal, not a requirement.

Then "severity: grave" is probably overkill. :-P

--
Jakub Wilk





Re: Having fun with the following C code (UB)

2014-03-27 Thread Mathieu Malaterre
On Thu, Mar 27, 2014 at 1:19 PM, Andrey Rahmatullin  wrote:
> On Thu, Mar 27, 2014 at 01:06:02PM +0100, Mathieu Malaterre wrote:
>> > Here is a little bug I just discovered:
>> >
>> > http://stackoverflow.com/questions/22664658/finding-off-t-size
>> >
>> > For reference, here are the packages affected in debian:
>> >
>> > http://codesearch.debian.net/search?q=LARGE_OFF_T
>>
>> While this affects all autoconf based package in debian (limited to
>> 32bits arch), I preferred not to mass bug everyone out there and
>> instead:
>>
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780
>>
>> LFS is still a release goal, not a requirement.
> Can you please describe the consequences of this bug for affected
> packages?

Sorry I thought this was obvious.

Short summary: you will not get LFS on 32-bit archs.

Long summary: autoconf and other projects relied on the small C code
I posted to determine whether or not the target system needs
-D_FILE_OFFSET_BITS=64 to have a 64-bit off_t. Most people are using
x86_64 and thus will not see any difference. The only people affected
are those looking for LFS support on 32-bit systems, where (by default)
off_t is only 32 bits (this is the famous fseek vs fseeko issue).
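
For readers who want to see the effect of the macro, here is a minimal
sketch. It is not the autoconf test discussed in this thread; the function
name off_t_bits is made up, and the behaviour of _FILE_OFFSET_BITS
described in the comment is the glibc one.

```c
/* Must be defined before any system header is included for glibc to
 * honour it; on 64-bit systems off_t is already 64 bits wide. */
#define _FILE_OFFSET_BITS 64

#include <limits.h>
#include <sys/types.h>

/* Hypothetical helper: number of bits in off_t as seen by this
 * translation unit (assuming no padding bits). */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

Building the same file on a 32-bit system without the #define would be
expected to report 32, which is the situation a broken configure check
leaves packages in.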

Hoping to be clear this time,





Re: Having fun with the following C code (UB)

2014-03-27 Thread Andrey Rahmatullin
On Thu, Mar 27, 2014 at 01:06:02PM +0100, Mathieu Malaterre wrote:
> > Here is a little bug I just discovered:
> >
> > http://stackoverflow.com/questions/22664658/finding-off-t-size
> >
> > For reference, here are the packages affected in debian:
> >
> > http://codesearch.debian.net/search?q=LARGE_OFF_T
> 
> While this affects all autoconf based package in debian (limited to
> 32bits arch), I preferred not to mass bug everyone out there and
> instead:
> 
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780
> 
> LFS is still a release goal, not a requirement.
Can you please describe the consequences of this bug for affected
packages?

-- 
WBR, wRAR





Re: Having fun with the following C code (UB)

2014-03-27 Thread Mathieu Malaterre
On Thu, Mar 27, 2014 at 9:07 AM, Mathieu Malaterre  wrote:
> Here is a little bug I just discovered:
>
> http://stackoverflow.com/questions/22664658/finding-off-t-size
>
> For reference, here are the packages affected in debian:
>
> http://codesearch.debian.net/search?q=LARGE_OFF_T

While this affects all autoconf-based packages in Debian (limited to
32-bit archs), I preferred not to mass-bug everyone out there and
instead filed:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742780

LFS is still a release goal, not a requirement.

