Re: How to avoid some built-in expansions in gcc?

2024-06-05 Thread David Brown via Gcc

On 04/06/2024 19:43, Michael Matz via Gcc wrote:

Hello,

On Tue, 4 Jun 2024, Richard Biener wrote:


A pragmatic solution might be a new target hook, indicating a specified
builtin is not to be folded into an open-coded form.


Well, that's what the mechanism behind -fno-builtin-foobar is supposed to
be IMHO.  Hopefully the newly added mechanism using optabs and
ifns (instead of builtins) heeds it.


-fno-builtin makes GCC not know semantics of the functions called


Hmm, true.  Not expanding inline is orthogonal strictly speaking ...


which is worse for optimization than just not inline expanding it.


... but on AVR expanding inline is probably worse than that lost
knowledge.  So yeah, ideally we would devise a (simple/reasonable) way to
at least disable inline expansion, without making it non-builtin.



The ideal here would be to have some way to tell gcc that a given 
function has the semantics of a different function.  For example, a 
programmer might have several implementations of "memcpy" that are 
optimised for different purposes based on the size or alignment of the 
arguments.  Maybe some of these are written with inline assembly or work 
in a completely different way (I've used DMA on a microcontroller for 
the purpose).  If you could tell the compiler that the semantic 
behaviour and results were the same as standard memcpy(), that could 
lead to optimisations.


Then you could declare your "isinf" function with 
__attribute__((semantics_of(__builtin_isinf))).
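
To make the idea concrete, a purely hypothetical sketch - the 
"semantics_of" attribute does not exist in GCC today, and "fast_memcpy" 
is an invented name:

#include <stddef.h>

/* Hypothetical attribute from the proposal above: tell the compiler
   that fast_memcpy has the semantics of memcpy, so that calls can be
   optimised (folded, reordered, removed if dead) while the custom
   implementation is still used for the calls that remain. */
__attribute__((semantics_of(__builtin_memcpy)))
void *fast_memcpy(void *dest, const void *src, size_t n);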


And the feature could be used in any situation where you can write a 
function in a simple, easy-to-analyse version and a more efficient but 
opaque version.






Re: Is fcommon related with performance optimization logic?

2024-05-30 Thread David Brown via Gcc

On 30/05/2024 04:26, Andrew Pinski via Gcc wrote:

On Wed, May 29, 2024 at 7:13 PM 赵海峰 via Gcc  wrote:


Dear Sir/Madam,


We found that UnixBench compiled with gcc 10.3 performs worse on Intel SPR 
than when compiled with gcc 8.5, for the dhry2reg benchmark.


I found it is related to the -fcommon option, which is disabled by default in 
10.3.  -fcommon makes the addresses of global variables follow a particular 
order in the bss section (observable with nm -n), regardless of how they are 
defined in the source code.


We are wondering whether -fcommon involves some special performance optimization?


(I also posted this subject to gcc-help.  I hope to get some suggestions on this 
mailing list.  Sorry for the bother.)


This was already filed as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114532 . But someone
needs to go in and do more analysis of what is going wrong. The
biggest difference for x86_64 is how the variables are laid out and by
whom (the compiler or the linker). There is some indication that
-fno-common increases the number of L1-dcache-load-misses, which
points to the difference in variable layout causing the performance
difference. But nobody has checked which variables are laid out
differently and why. I suspect that small changes in the
code/variables cause layout differences, which cause the cache
misses, which cause the performance difference - almost all by
accident.
I suspect adding -fdata-sections would cause yet another performance
difference here. And there is not much GCC can do about this, since
it is hard to pick a data layout that gives the best performance in
all cases.



(I am most familiar with embedded systems with static linking, rather 
than dealing with GOT and other aspects of linking on big systems.)


I think -fno-common should allow -fsection-anchors to do a much better 
job.  If symbols are put in the common section, the compiler does not 
know their relative position until link time.  But if they are in bss or 
data sections (with or without -fdata-sections), it can at least use 
anchors to access data in the translation unit that defines the data 
objects.
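
A minimal illustration (plain C; variable names invented):

/* Tentative definitions.  With -fcommon they become "common" symbols
   and are laid out by the linker; with -fno-common they are allocated
   in this translation unit's .bss, so the compiler knows their
   relative positions and -fsection-anchors can reach both from a
   single anchor. */
int counter;
int totals[64];

int sum_both(void)
{
    return counter + totals[0];   /* one base register, two offsets */
}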


David



Thanks,
Andrew Pinski




Best regards.


Clark Zhao






Re: aliasing

2024-03-18 Thread David Brown

On 18/03/2024 16:00, Martin Uecker via Gcc wrote:

On Monday, 18.03.2024 at 14:29 +0100, David Brown wrote:


On 18/03/2024 12:41, Martin Uecker wrote:



Hi David,

On Monday, 18.03.2024 at 10:00 +0100, David Brown wrote:

Hi,

I would be very glad to see this change in the standards.


Should "byte type" include all character types (signed, unsigned and
plain), or should it be restricted to "unsigned char" since that is the
"byte" type ?  (I think allowing all character types makes sense, but
only unsigned char is guaranteed to be suitable for general object
backing store.)


At the moment, the special types that can access all others are
the non-atomic character types.  So for symmetry reasons, it
seems that this is also what we want for backing store.

I am not sure what you mean by "only unsigned char". Are you talking
about C++?  "unsigned char" has no special role in C.



"unsigned char" does have a special role in C - in 6.2.6.1p4 it
describes any object as being able to be copied to an array of unsigned
char to get the "object representation".
  The same is not true for an
array of "signed char".  I think it would be possible to have an
implementation where "signed char" was 8-bit two's complement except
that 0x80 would be a trap representation rather than -128.  I am not
sure of the consequences of such an implementation (assuming I am even
correct in it being allowed).


Yes, but with C23 this is not possible anymore. I think signed
char or char should work equally well now.


I have just noticed that in C23, SCHAR_MIN is -128 (or -2^(N-1) in 
general), eliminating the possibility of having a trap value for signed 
char (or any other integer type without padding bits).  There's always a 
bit of jumping around in the C standards to get the complete picture!


But as I said in another post, I still worry a little about the unsigned 
to signed conversion being implementation-defined, and therefore not 
guaranteed to work in a way that preserves the underlying object 
representation.  I think it should be possible to make a small change to 
the description of unsigned to signed conversions to eliminate that.





I see people making a lot of assumptions in their embedded programming
that are not fully justified in the C standards.  Sometimes the
assumptions are just bad, or it would be easy to write code without the
assumptions.  But at other times it would be very awkward or inefficient
to write code that is completely "safe" (in terms of having fully
defined behaviour from the C standards or from implementation-dependent
behaviour).  Making your own dynamic memory allocation functions is one
such case.  So I have a tendency to jump on any suggestion of changes to
the C (or C++) standards that could let people write such essential code
in a safer or more efficient manner.
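
For concreteness, a minimal sketch of such an allocator (names invented; 
the rounding assumes a power-of-two alignment, and error handling and 
thread safety are omitted) - exactly the kind of code whose aliasing 
correctness is at issue here:

#include <stddef.h>

static _Alignas(max_align_t) unsigned char pool[4096];  /* backing store */
static size_t pool_used;

void *pool_alloc(size_t size)
{
    size_t align = _Alignof(max_align_t);
    size = (size + align - 1) & ~(align - 1);   /* round up */
    if (size > sizeof pool - pool_used)
        return NULL;                            /* out of backing store */
    void *p = &pool[pool_used];
    pool_used += size;
    return p;
}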


That something is undefined does not automatically mean it is
forbidden or unsafe.  It simply means it is not portable.  


That is the case for things that are not defined in the C standards, but 
defined elsewhere.  If the behaviour of a piece of code is not defined 
anywhere for the toolchain you are using, then it is inherently unsafe 
to use.  ("Forbidden" is another matter.  It might be "forbidden" by 
your coding standards, or your boss, but the language and tools don't 
forbid things!)


Something that is not defined in the C standards, but defined in your 
compiler manual or additional standards (such as POSIX) is safe to use 
but limited in portability.


And of course something that is "inherently unsafe" may be considered 
safe in practice, by analysing the generated object code or doing 
exhaustive testing.



I think
in the embedded space it will be difficult to make everything well
defined.  


Yes, that is absolutely true.  (And it is even more difficult if you try 
to restrict yourself to things with full definitions in the C standards 
or explicit implementation-defined behaviour documented by toolchains. 
You almost invariably need some degree of compiler extensions for parts 
of the code.)


But I want to reduce to the smallest practical level the amount of code 
that "works in practice" rather than "known to work by design".



But I fully agree that widely used techniques should
ideally be based on defined behavior and we should  change the
standard accordingly.



Yes, where possible and practical, the standard should provide the guarantees 
that programmers need.  Failing that, compiler extensions are good too - 
I'd be very happy with a GCC variable attribute "backing_store" that 
could be applied to allocator backing stores and provide the aliasing 
guarantees needed.  (It might even be needed anyway, to work well with 
the "malloc" attribute, even with your change to the standard.)



David



Re: aliasing

2024-03-18 Thread David Brown

On 18/03/2024 17:46, David Brown via Gcc wrote:

On 18/03/2024 14:54, Andreas Schwab via Gcc wrote:

On Mar 18 2024, David Brown wrote:


I think it would be possible to have an implementation where "signed
char" was 8-bit two's complement except that 0x80 would be a trap
representation rather than -128.


signed char cannot have padding bits, thus it cannot have a trap
representation.



The premise is correct (no padding bits are allowed in signed char), but 
does it follow that it cannot have a trap representation?


5.2.4.2.1p3 in C23 makes the range of a signed integer type go from
-(2^(N-1)) to (2^(N-1)) - 1, which means all values are valid and 
there can be no trap value if there are no padding bits.


I don't think 
the standards are clear either way here - I think the committee missed a 
chance to tidy up the description a bit more when C23 removed formats 
other than two's complement for signed integer types.


I also feel slightly uneasy using signed char for accessing object 
representations since the object representation is defined in terms of 
an unsigned char array, and conversion from unsigned char to signed char 
is implementation-defined.  (This too could have been tightened in C23, 
as there is unlikely to be any implementation that does not do the 
conversion in the obvious manner.)


But I am perhaps worrying too much here.









Re: aliasing

2024-03-18 Thread David Brown via Gcc

On 18/03/2024 14:54, Andreas Schwab via Gcc wrote:

On Mar 18 2024, David Brown wrote:


I think it would be possible to have an implementation where "signed
char" was 8-bit two's complement except that 0x80 would be a trap
representation rather than -128.


signed char cannot have padding bits, thus it cannot have a trap
representation.



The premise is correct (no padding bits are allowed in signed char), but 
does it follow that it cannot have a trap representation?  I don't think 
the standards are clear either way here - I think the committee missed a 
chance to tidy up the description a bit more when C23 removed formats 
other than two's complement for signed integer types.


I also feel slightly uneasy using signed char for accessing object 
representations since the object representation is defined in terms of 
an unsigned char array, and conversion from unsigned char to signed char 
is implementation-defined.  (This too could have been tightened in C23, 
as there is unlikely to be any implementation that does not do the 
conversion in the obvious manner.)


But I am perhaps worrying too much here.






Re: aliasing

2024-03-18 Thread David Brown




On 18/03/2024 12:41, Martin Uecker wrote:



Hi David,

On Monday, 18.03.2024 at 10:00 +0100, David Brown wrote:

Hi,

I would be very glad to see this change in the standards.


Should "byte type" include all character types (signed, unsigned and
plain), or should it be restricted to "unsigned char" since that is the
"byte" type ?  (I think allowing all character types makes sense, but
only unsigned char is guaranteed to be suitable for general object
backing store.)


At the moment, the special types that can access all others are
the non-atomic character types.  So for symmetry reasons, it
seems that this is also what we want for backing store.

I am not sure what you mean by "only unsigned char". Are you talking
about C++?  "unsigned char" has no special role in C.



"unsigned char" does have a special role in C - in 6.2.6.1p4 it 
describes any object as being able to be copied to an array of unsigned 
char to get the "object representation".  The same is not true for an 
array of "signed char".  I think it would be possible to have an 
implementation where "signed char" was 8-bit two's complement except 
that 0x80 would be a trap representation rather than -128.  I am not 
sure of the consequences of such an implementation (assuming I am even 
correct in it being allowed).




Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an
alias for "unsigned char", but it could be something different, like an
alias for __UINT8_TYPE__, or "unsigned int
__attribute__((mode(QImode)))", which is used in the AVR gcc port.


I think this might be a reason to not include it, as it could
affect aliasing analysis. At least, this would be a different
independent change to consider.



I think it is important that there is a guarantee here, because people 
do use uint8_t as a generic "raw memory" type.  Embedded standards like 
MISRA strongly discourage the use of "unsized" types such as "unsigned 
char", and it is generally assumed that "uint8_t" has the aliasing 
superpowers of a character type.  But it is possible that such a change 
would be better put in the library section on <stdint.h> rather than 
this section.




In my line of work - small-systems embedded development - it is common
to have "home-made" or specialised memory allocation systems rather than
relying on a generic heap.  This is, I think, some of the "existing
practice" that you are considering here - there is a "backing store" of
some sort that can be allocated and used as objects of a type other than
the declared type of the backing store.  While a simple unsigned char
array is a very common kind of backing store, there are others that are
used, and it would be good to be sure of the correctness guarantees for
these.  Possibilities that I have seen include:

unsigned char heap1[N];

uint8_t heap2[N];

union {
    double dummy_for_alignment;
    char heap[N];
} heap3;

struct {
    uint32_t capacity;
    uint8_t *p_next_free;
    uint8_t heap[N];
} heap4;

uint32_t heap5[N];

Apart from this last one, if "uint8_t" is guaranteed to be a "byte
type", then I believe your wording means that these unions and structs
would also work as "byte arrays".  But it might be useful to add a
footnote clarifying that.



I need to think about this.



Thank you.

I see people making a lot of assumptions in their embedded programming 
that are not fully justified in the C standards.  Sometimes the 
assumptions are just bad, or it would be easy to write code without the 
assumptions.  But at other times it would be very awkward or inefficient 
to write code that is completely "safe" (in terms of having fully 
defined behaviour from the C standards or from implementation-dependent 
behaviour).  Making your own dynamic memory allocation functions is one 
such case.  So I have a tendency to jump on any suggestion of changes to 
the C (or C++) standards that could let people write such essential code 
in a safer or more efficient manner.



(It is also not uncommon to have the backing space allocated by the
linker, but then it falls under the existing "no declared type" case.)


Yes, although with the change we would make the "no declared type" also
be byte arrays, so there is then simply no difference anymore.



Fair enough.  (Linker-defined storage does not just have no declared 
type, it has no directly declared size or other properties either.  The 
start and the stop of the storage area is typically declared as "extern 
uint8_t __space_start[], __space_stop[];", or perhaps as single 
characters or uint32_t types.  The space in between is just calculated 
as the difference between pointers to these.)
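
A sketch of that pattern (symbol names as in the example above; they 
must match symbols defined in the linker script):

#include <stddef.h>
#include <stdint.h>

extern uint8_t __space_start[], __space_stop[];  /* from the linker */

static inline size_t space_size(void)
{
    /* Strictly, subtracting pointers into two distinct "objects" is
       itself not defined by the C standard - part of the point being
       made in this thread. */
    return (size_t)(__space_stop - __space_start);
}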





I would not want uint32_t to be considered an "alias anything" type, but 
I have occasionally seen such types used for memory store backings.

Re: aliasing

2024-03-18 Thread David Brown

Hi,

I would be very glad to see this change in the standards.


Should "byte type" include all character types (signed, unsigned and 
plain), or should it be restricted to "unsigned char" since that is the 
"byte" type ?  (I think allowing all character types makes sense, but 
only unsigned char is guaranteed to be suitable for general object 
backing store.)


Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an 
alias for "unsigned char", but it could be something different, like an 
alias for __UINT8_TYPE__, or "unsigned int 
__attribute__((mode(QImode)))", which is used in the AVR gcc port.


In my line of work - small-systems embedded development - it is common 
to have "home-made" or specialised memory allocation systems rather than 
relying on a generic heap.  This is, I think, some of the "existing 
practice" that you are considering here - there is a "backing store" of 
some sort that can be allocated and used as objects of a type other than 
the declared type of the backing store.  While a simple unsigned char 
array is a very common kind of backing store, there are others that are 
used, and it would be good to be sure of the correctness guarantees for 
these.  Possibilities that I have seen include:


unsigned char heap1[N];

uint8_t heap2[N];

union {
    double dummy_for_alignment;
    char heap[N];
} heap3;

struct {
    uint32_t capacity;
    uint8_t *p_next_free;
    uint8_t heap[N];
} heap4;

uint32_t heap5[N];

Apart from this last one, if "uint8_t" is guaranteed to be a "byte 
type", then I believe your wording means that these unions and structs 
would also work as "byte arrays".  But it might be useful to add a 
footnote clarifying that.


(It is also not uncommon to have the backing space allocated by the 
linker, but then it falls under the existing "no declared type" case.)



I would not want uint32_t to be considered an "alias anything" type, but 
I have occasionally seen such types used for memory store backings.  It 
is perhaps worth considering defining "byte type" as "non-atomic 
character type, [u]int8_t (if they exist), or other 
implementation-defined types".


Some other compilers might guarantee not to do type-based alias analysis 
and thus view all types as "byte types" in this way.  For gcc, there 
could be a kind of reverse "may_alias" type attribute to create such types.
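
For reference, GCC's existing attribute works in the other direction - 
a minimal example (alignment still matters on strict-alignment targets):

#include <stdint.h>

/* A 32-bit type that is permitted to alias objects of any type. */
typedef uint32_t word_alias __attribute__((may_alias));

uint32_t first_word(const void *buf)
{
    return *(const word_alias *)buf;   /* no strict-aliasing violation */
}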




There are a number of other features that could make allocation 
functions more efficient and safer in use, and which could be ideally be 
standardised in the C standards or at least added as gcc extensions, but 
I think that's more than you are looking for here!


David



On 18/03/2024 08:03, Martin Uecker via Gcc wrote:


Hi,

can you please take a quick look at this? This is intended to align
the C standard with existing practice with respect to aliasing by
removing the special rules for "objects with no declared type" and
making it fully symmetric and only based on types with non-atomic
character types being able to alias everything.


Unrelated to this change, I have another question:  I wonder if GCC
(or any other compiler) actually exploits the "or is copied as an
array of byte type" rule to make assumptions about the effective
types of the target array?  I know compilers do this for memcpy...
Maybe also if a loop is transformed to memcpy?

Martin


Add the following definition after 3.5, paragraph 2:

byte array
object having either no declared type or an array of objects declared with a 
byte type

byte type
non-atomic character type

Modify 6.5, paragraph 6:
The effective type of an object that is not a byte array, for an access to its
stored value, is the declared type of the object.97) If a value is
stored into a byte array through an lvalue having a byte type, then
the type of the lvalue becomes the effective type of the object for that
access and for subsequent accesses that do not modify the stored value.
If a value is copied into a byte array using memcpy or memmove, or is
copied as an array of byte type, then the effective type of the
modified object for that access and for subsequent accesses that do not
modify the value is the effective type of the object from which the
value is copied, if it has one. For all other accesses to a byte array,
the effective type of the object is simply the type of the lvalue used
for the access.

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3230.pdf







Re: Optimization of bit field assignments

2024-02-12 Thread David Brown

On 12/02/2024 17:47, Hugh Gleaves via Gcc wrote:

I’m interested in whether it would be feasible to add an optimization that 
compacted assignments to multiple bit fields.

Today, if I have a 32 bit long struct composed of, say, four 8 bit fields and 
assign constants to them like this:

 ahb1_ptr->RCC.CFGR.MCO1_PRE = 7;
 ahb1_ptr->RCC.CFGR.I2SSC = 0;
 ahb1_ptr->RCC.CFGR.MCO1 = 3;

This generates code (on Arm) like this:

 ahb1_ptr->RCC.CFGR.MCO1_PRE = 7;
0x08000230  ldr.w r1, [r3, #2056]  @ 0x808
0x08000234  orr.w r1, r1, #117440512 @ 0x7000000
0x08000238  str.w r1, [r3, #2056]  @ 0x808
 ahb1_ptr->RCC.CFGR.I2SSC = 0;
0x0800023c  ldr.w r1, [r3, #2056]  @ 0x808
0x08000240  bfc r1, #23, #1
0x08000244  str.w r1, [r3, #2056]  @ 0x808
 ahb1_ptr->RCC.CFGR.MCO1 = 3;
0x08000248  ldr.w r1, [r3, #2056]  @ 0x808
0x0800024c  orr.w r1, r1, #6291456  @ 0x600000
0x08000250  str.w r1, [r3, #2056]  @ 0x808

It would be an improvement if the compiler analyzed these assignments, 
realized they are all modifications of the same 32 bit datum, generated 
appropriate OR and AND bitmasks, and then applied those to the register and 
did just a single store at the end.

In other words, infer the equivalent of this:

 RCC->CFGR &= ~0x07E00000;
 RCC->CFGR |= 0x07600000;

This strikes me as very feasible, the compiler knows the offset and bit length 
of the sub fields so all of the information needed seems to be present.

Thoughts…



In most such cases, the underlying definition of the structure (or the 
pointer to the structure) is volatile, because it is a hardware 
register.  The compiler cannot combine the register field settings, 
because volatile accesses must not be combined - precisely so that 
programmers can reliably control hardware.  It is normal to want to be 
sure that a particular bitfield is changed, and only after that will the 
next bitfield be changed, and so on.  Sometimes that means the result is 
slower than it would have to be - but this is much better than giving 
wrong results when the programmer needs the changes to be handled 
separately.


It is not uncommon for the bytes underlying a hardware register bitfield 
struct to be available directly as well, letting you do the bit 
manipulation in a local copy which you then write out in a single operation.
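
A sketch of that technique, using the register and field values from the 
example above (the word-access address is illustrative - check the device 
header for the real one):

#include <stdint.h>

#define RCC_CFGR_WORD (*(volatile uint32_t *)0x40023808u)

static void set_clock_outputs(void)
{
    uint32_t cfgr = RCC_CFGR_WORD;   /* one volatile read */
    cfgr &= ~0x07E00000u;            /* clear MCO1_PRE, I2SSC and MCO1 */
    cfgr |=  0x07600000u;            /* MCO1_PRE = 7, I2SSC = 0, MCO1 = 3 */
    RCC_CFGR_WORD = cfgr;            /* one volatile write */
}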





Re: issue: unexpected results in optimizations

2023-12-12 Thread David Brown via Gcc

Hi,

First, please ignore everything Dave Blanchard writes.  I don't know 
why, but he likes to post angry, rude and unhelpful messages to this list.


Secondly, this is the wrong list.  gcc-help would be the correct list, 
as you are asking for help with gcc.  This list is for discussions on 
the development of gcc.


Thirdly, if you want help, you need to provide something that other 
people can comprehend.  There is very little that anyone can do to 
understand lumps of randomly generated code, especially when it cannot 
compile without headers and additional files or libraries that we do not 
have.


So your task is to write a /minimal/ piece of stand-alone code that 
demonstrates the effect that concerns you.  It is fine to use standard 
library headers, but no external headers like this "csmith" 
stuff.  Aim to make it small enough to be included directly in the text 
of the post, not as an attachment.  Include the compiler version(s) you 
tried, the command line flags, what you expect the results to give, and 
what wrong results you got.


Always do development compiles with comprehensive sets of warnings.  I 
managed to take a section of your code (part that was different between 
the "initial.c" and "transformed.c") and compile it - there were lots of 
warnings.  There are a lot of overflows in initialisations, pointless 
calculations on the left of commas, and other indications of badly 
written code.  There were also static warnings about undefined behaviour 
in some calculations - and that, most likely, is key.


When code has undefined behaviour, you cannot expect the compiler to 
give any particular results.  It's all down to luck.  And luck varies 
with the details, such as optimisation levels.  It's "garbage in, 
garbage out", and that is the explanation for differing results.


So compile with "-Wall -Wextra -std=c99 -Wpedantic -O2" and check all 
the warnings.  (Static warnings work better when optimising the code.) 
If you have fixed the immediate problems in the code, add the 
"-fsanitize=undefined" flag before running it.  That will do run-time 
undefined behaviour checks.


If you have a code block that is small enough to comprehend, and that 
you are now confident has no undefined behaviour, and you get different 
results with different optimisations, post it to the gcc-help list. 
Then people can try it and give opinions - maybe there is a gcc bug.


I hope that all helps.

David





On 11/12/2023 18:14, Jingwen Wu via Gcc wrote:

Hello, I'm sorry to bother you.  I have some gcc compiler optimization
questions to ask you.
First of all, I used the csmith tool to generate C files randomly.  The
final running result was a checksum over the global variables in a C file.
For the two C files in the attachment, I performed an equivalent
transformation of a loop from initial.c to transformed.c.  The two
files produced different results (i.e. different checksum values) when
using the -Os optimization level, while the results of both were the same
when using other optimization levels such as -O0, -O1, -O2, -O3,
-Ofast.
Please help me explain why this is, thank you.

command line: gcc file.c -Os -lm -I $CSMITH_HOME/include && ./a.out
version: gcc 12.2.0
os: ubuntu 22.04





Re: Suboptimal warning formatting with `bool` type in C

2023-11-02 Thread David Brown via Gcc

On 02/11/2023 00:28, peter0x44 via Gcc wrote:

On 2023-11-01 23:13, Joseph Myers wrote:


On Wed, 1 Nov 2023, peter0x44 via Gcc wrote:


Why is #define used instead of typedef? I can't imagine how this could
possibly break any existing code.


That's how stdbool.h is specified up to C17.  In C23, bool is a keyword
instead.


I see, I didn't know it was specified that way. It seems quite strange 
that typedef wouldn't be used for this purpose.


I suppose perhaps it matters if you #undef bool and then use it to 
define your own type? Still, it seems very strange to do this.




Yes, that is part of the reason.  The C standards mandate a number of 
things to be macros when it would seem that typedefs, functions, 
enumeration constants or other things would be "nicer" in some sense. 
Macros have two advantages, however - you can "#undef" them, and you can 
use "#ifdef" to test for them.  This makes them useful in several cases 
in the C standards, especially for changes that could break backwards 
compatibility.  Someone writing /new/ code would hopefully never make 
their own "bool" type, but there's plenty of old code around - if you 
ever need to include some pre-C99 headers with their own "bool" type and 
post-C99 headers using <stdbool.h>, within the same C file, then it's 
entirely possible that you'll be glad "bool" is a macro.
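
A sketch of that situation (pre-C23; the legacy typedef is invented):

typedef unsigned char bool;   /* e.g. from an old in-house header */

#include <stdbool.h>          /* #defines bool to _Bool, shadowing the
                                 typedef from here on */
#undef bool                   /* back to the legacy typedef if needed */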


Maybe it's something to offer as a GNU extension? Though, I'm leaning 
towards too trivial to be worth it, just for a (very minor) improvement 
to a diagnostic that can probably be handled in other ways.




Speaking as someone with absolutely zero authority (I'm a GCC user, not 
a GCC developer), I strongly doubt that "bool" will be made a typedef as 
a GCC extension.


But if there are problems with the interaction between pre-processor 
macros and the formatting of diagnostic messages, then that is 
definitely something that you should file as a bug report and which can 
hopefully be fixed.


David




Re: C89 question: Do we need to accept -Wint-conversion warnings

2023-10-11 Thread David Brown via Gcc




On 11/10/2023 12:17, Florian Weimer wrote:

* David Brown:


On 11/10/2023 10:10, Florian Weimer wrote:

* David Brown:


So IMHO (and as I am not a code contributor to GCC, my opinion really
is humble) it is better to be stricter than permissive, even in old
standards.  It is particularly important for "-std=c89", while
"-std=gnu89" is naturally more permissive.  (I have seen more than
enough terrible code in embedded programs - I don't want to make it
easier for them to write even worse code!)

We can probably make (say) -std=gnu89 -fno-permissive work, in a way
that is a bit less picky than -std=gnu89 -pedantic-errors today.



The gcc manual has "-permissive" under "C++ Dialect Options".  Are you
planning to have it for C as well?


Yes, I've got local patches on top of Jason's permerror enhancement:

   [PATCH v2 RFA] diagnostic: add permerror variants with opt
   
<https://inbox.sourceware.org/gcc-patches/20231003210916.1027930-1-ja...@redhat.com/>



That sounds like a good idea (perhaps with some examples in the
documentation?).  Ideally (and I realise I like stricter checking than
many people) some long-obsolescent features like non-prototype
function declarations could be marked as errors unless "-permissive"
were used, even in C89 standards.


For some of such declarations, this falls out of the implicit-int
removal.


Yes.



C23 changes the meaning of extern foo(); to match the C++ interpretation
of extern foo(void);.  I don't think we should warn about that.  If we
warn, it would be at the call site.


I'm not sure I fully agree.  "extern foo();" became invalid when 
implicit int was removed in C99.  But "extern T foo();", where "T" is 
void or any type, has changed meaning between C17 (and before) and C23.


With C23, it means the same as "extern T foo(void);", like in C++ (and 
like all C standards if it is part of the definition of the function). 
However, prior to C23, a declaration of "T foo();" that is not part of 
the definition of the function declares the function and "specifies that 
no information about the number or types of the parameters is supplied". 
This use was obsolescent from C90.


To my mind, this is very different.  I think it is fair to suppose that 
for many cases of pre-C23 declarations with empty parentheses, the 
programmer probably meant "(void)".  But the language standards have 
changed the meaning of the declaration.
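
An illustration (hypothetical function name):

extern int foo();   /* C17 and earlier: no information about the number
                       or types of the parameters, so foo(1, 2.0) is
                       accepted.  C23: identical to
                       "extern int foo(void);", so any call with
                       arguments is a constraint violation. */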


IMHO I think calling "foo" with parameters should definitely be a 
warning, enabled by default, for at least -std=c99 onwards - it is 
almost certainly a mistake.  (Those few people that use it as a feature 
can ignore or disable the warning.)  I would also put warnings on the 
declaration itself at -Wall, or at least -Wextra (i.e., 
"-Wstrict-prototypes").  I think that things that change between 
standards, even subtly, should be highlighted.  Remember, this concerns 
a syntax that was marked obsolescent some 35 years ago, because the 
alternative (prototypes) was considered "superior to the old style on 
every count".


It could be reasonable to consider "extern T foo();" as valid in 
"-std=gnu99" and other "gnu" standards - GCC has an established history 
of "back-porting" useful features of newer standards to older settings. 
But at least for "-std=c99" and other "standard" standards, I think it 
is best to warn about the likely code error.





(As a side note, I wonder if "-fwrapv" and "-fno-strict-aliasing"
should be listed under "C Dialect Options", as they give specific
semantics to normally undefined behaviour.)


They are code generation options, too.


I see them as semantic extensions to the language, and code generation 
differences are a direct result of that (even if they historically arose 
as code generation options and optimisation flags respectively). 
Perhaps they could be mentioned or linked to in the C dialect options 
page?  Maybe it would be clearer to have new specific flags for the 
dialect options, which are implemented by activating these flags? 
Perhaps that would be confusing.





And of course there's still -Werror, that's not going to go away.  So if
you are using -Werror=implicit-function-declaration today (as you
probably should 8-), nothing changes for you in GCC 14.


I have long lists of explicit warnings and flags in my makefiles, so I
am not concerned for my own projects.  But I always worry about the
less vigilant users - the ones who don't know the details of the
language or the features of the compiler, and don't bother finding
out.  I don't want default settings to be less strict for them, as it
means higher risks of bugs escaping out to released code.


We have a tension regarding support for legacy software, and ongoing
development.  


Agreed, a

Re: C89 question: Do we need to accept -Wint-conversion warnings

2023-10-11 Thread David Brown via Gcc




On 11/10/2023 10:10, Florian Weimer wrote:

* David Brown:


So IMHO (and as I am not a code contributor to GCC, my opinion really
is humble) it is better to be stricter than permissive, even in old
standards.  It is particularly important for "-std=c89", while
"-std=gnu89" is naturally more permissive.  (I have seen more than
enough terrible code in embedded programs - I don't want to make it
easier for them to write even worse code!)


We can probably make (say) -std=gnu89 -fno-permissive work, in a way
that is a bit less picky than -std=gnu89 -pedantic-errors today.



The gcc manual has "-permissive" under "C++ Dialect Options".  Are you 
planning to have it for C as well?  That sounds like a good idea 
(perhaps with some examples in the documentation?).  Ideally (and I 
realise I like stricter checking than many people) some long-obsolescent 
features like non-prototype function declarations could be marked as 
errors unless "-permissive" were used, even in C89 standards.


(As a side note, I wonder if "-fwrapv" and "-fno-strict-aliasing" should 
be listed under "C Dialect Options", as they give specific semantics to 
normally undefined behaviour.)




And of course there's still -Werror, that's not going to go away.  So if
you are using -Werror=implicit-function-declaration today (as you
probably should 8-), nothing changes for you in GCC 14.


I have long lists of explicit warnings and flags in my makefiles, so I 
am not concerned for my own projects.  But I always worry about the less 
vigilant users - the ones who don't know the details of the language or 
the features of the compiler, and don't bother finding out.  I don't 
want default settings to be less strict for them, as it means higher 
risks of bugs escaping out to released code.





I suspect (again with numbers taken from thin air) that the proportion
of C programmers or projects that actively choose C11 or C17 modes, as
distinct from using the compiler defaults, will be less than 1%.  C99
(or gnu99) is the most commonly chosen standard for small-systems
embedded programming, combining C90 libraries, stacks, and RTOS's with
user code in C99.  So again, my preference is towards stricter
control, not more permissive tools.


I don't think the estimate is accurate.  Several upstream build systems
I've seen enable -std=gnu11 and similar options once they are supported.
Usually, it's an attempt to upgrade to newer language standards that
hasn't aged well, not a downgrade.  It's probably quite bit more than
1%.



Fair enough.  My experience is mostly within a particular field that is 
probably more conservative than a lot of other areas of programming.


David






Re: C89 question: Do we need to accept -Wint-conversion warnings

2023-10-11 Thread David Brown via Gcc

On 10/10/2023 18:30, Jason Merrill via Gcc wrote:

On Tue, Oct 10, 2023 at 7:30 AM Florian Weimer via Gcc 
wrote:


Are these code fragments valid C89 code?

   int i1 = 1;
   char *p1 = i1;

   char c;
   char *p2 = &c;
   int i2 = p2;

Or can we generate errors for them even with -std=gnu89?

(It will still be possible to override this with -fpermissive or
-Wno-int-conversion.)



Given that C89 code is unlikely to be actively maintained, I think we
should be permissive by default in that mode.  People compiling with an old
-std flag are presumably doing it to keep old code compiling, and it seems
appropriate to respect that.



That is - unfortunately, IMHO - not true.

In particular, in the small-systems embedded development world (and that 
is a /big/ use-case for C programming), there is still a lot done in 
C89/C90.  It is the dominant variety of C for things like RTOS's (such 
as FreeRTOS and ThreadX), network stacks (like LWIP), microcontroller 
manufacturers' SDK's and libraries, and so on.  There are also still 
some microcontrollers for which the main toolchains (not GCC, obviously) 
do not have full C99 support, and there is a significant proportion of 
embedded C programmers who write all their code in C90, even for new 
projects.  There is a "cult" within C coders who think "The C 
Programming Language" is the "Bible", and have never learned anything 
since then.


The biggest target device in this field is the 32-bit ARM Cortex-M 
family, and the most used compiler is gcc.


Taking numbers out of thin air, but not unrealistically I believe, there 
are millions of devices manufactured every day running code compiled by 
gcc -std=gnu89 or -std=c89 (or an equivalent).


Add to that the libraries on "big" systems that are written to C89/C90 
standards.  After all, that is the lowest common denominator of the 
C/C++ world - with a bit of care, the code will be compatible with all 
other C and C++ standards.  It is not just of old code, though a great 
deal of modern library code has roots back to pre-C99 days, but it is 
also cross-platform code.  It is only relatively recently that 
Microsoft's development tools have had reasonable support for C99 - many 
people writing code to work in both the *nix world and the Windows world 
stick to C89/C90 if they want a clear standard (rather than "the subset 
of C99 supported by the MSVC version they happen to have").


Now, pretty much all of that code could also be compiled with -std=c99 
(or -std=gnu99).  And in a great many cases, it /is/ compiled as C99. 
But for those that want to be careful about their coding, and many do, 
the natural choice here is "-std=c90 -pedantic-errors".



So IMHO (and as I am not a code contributor to GCC, my opinion really is 
humble) it is better to be stricter than permissive, even in old 
standards.  It is particularly important for "-std=c89", while 
"-std=gnu89" is naturally more permissive.  (I have seen more than 
enough terrible code in embedded programs - I don't want to make it 
easier for them to write even worse code!)




I'm also (though less strongly) inclined to be permissive in C99 mode, and
only introduce the new strictness by default for C11/C17 modes.



I suspect (again with numbers taken from thin air) that the proportion 
of C programmers or projects that actively choose C11 or C17 modes, as 
distinct from using the compiler defaults, will be less than 1%.  C99 
(or gnu99) is the most commonly chosen standard for small-systems 
embedded programming, combining C90 libraries, stacks, and RTOS's with 
user code in C99.  So again, my preference is towards stricter control, 
not more permissive tools.


I am aware, however, that I personally am a lot fussier than most 
programmers.  I run gcc with lots of additional warnings and 
-Wfatal-errors, and want ever-stricter tools.  I don't think many people 
would be happy with the choices /I/ would prefer for default compiler 
flags!


I am merely a happy GCC user, not a contributor, much less anyone 
involved in decision making.  But I hope it is helpful to you to hear 
other opinions here, especially about small-systems embedded 
programming, at least in my own experience.


David








Re: seek advice about GCC learning

2023-09-27 Thread David Brown

On 26/09/2023 08:48, weizhe wang via Gcc wrote:

Thanks for your reply.  Is there a guide for building an rv32 cross-compiler 
gcc?  I encountered some errors in the build process.




You might find useful information here:








I can recommend google.  It took me perhaps 10 seconds to find these sites.




Re: GCC support addition for Safety compliances

2023-07-12 Thread David Brown via Gcc

On 12/07/2023 14:43, Jonathan Wakely via Gcc wrote:

On Wed, 12 Jul 2023 at 10:25, Vishal B Patil via Gcc  wrote:


Hi Team,

Any updates ?


You're not going to get any useful answers.

You asked "Please share the costs and time as well." Costs for what? From whom?

GCC is an open-source project with a diverse community of hundreds of
contributors. Who are you asking to give you costs? What work are you
expecting them to do?

It is unlikely that you obtained GCC from https://gcc.gnu.org so you
should probably talk to whoever provided you with your GCC binaries.


Most people get their GCC binaries for free, and no such source is going 
to be able to help with safety compliance or any other kind of 
certification.  Certification always costs time, effort and money.  But 
there are suppliers who provide toolchain binaries with commercial 
support contracts, and who could help with certification.  I know Code 
Sourcery certainly used to be able to provide language compliance 
certification - I have no idea if they still can (it seems they are part 
of Siemens these days).  Maybe Red Hat (part of IBM) can do so too, and 
possibly others.  But perhaps that will give the OP a starting point.

David



For safety compliance you will probably need to talk to a third-party
who specializes in that. I don't think you will achieve anything by
asking the GCC project to do that for you.

That's not how open source projects work.





Regards,
Vishal B Patil

vishal.b.pa...@cummins.com

Dahanukar Colony, Kothrud
Pune
Maharashtra
411038
India

-Original Message-
From: Vishal B Patil
Sent: Wednesday, July 5, 2023 4:18 PM
To: Basile Starynkevitch 
Subject: RE: GCC support addition for Safety compliances

Hi Team,

Thanks for the response.

Actually required for UL60730, UL6200. Please share the costs and time as well.

Regards,
Vishal B Patil

vishal.b.pa...@cummins.com

Dahanukar Colony, Kothrud
Pune
Maharashtra
411038
India

-Original Message-
From: Basile Starynkevitch 
Sent: Wednesday, July 5, 2023 4:07 PM
To: Vishal B Patil 
Subject: GCC support addition for Safety compliances

EXTERNAL SENDER: This email originated outside of Cummins. Do not click links 
or open attachments unless you verify the sender and know the content is safe.


Hello


We need support from GNU GCC for some safety compliances.  Can you please 
advise or check which GCC versions support the safety compliances.

For safety compliance GCC is probably not enough.


Consider (if allowed by your authorities) using static analysis tools like 
https://frama-c.com/ or https://www.absint.com/products.htm


Be sure to understand what technically safety compliance means to you.
DO-178C?  ISO 26262?

Be also aware that safety compliance costs a lot of money and a lot of time 
(you'll probably need a budget above 100k€ or 100k US$, and about a 
person-year of developer effort).


--
Basile Starynkevitch  
(only mine opinions / les opinions sont miennes uniquement)
92340 Bourg-la-Reine, France
web page: starynkevitch.net/Basile/







Re: user sets ABI

2023-07-07 Thread David Brown via Gcc

On 07/07/2023 00:27, André Albergaria Coelho via Gcc wrote:

What if the user chooses in own ABI, say specifying a config file like

My abi

" Parameters = pushed in stack"


say

gcc -abi "My abi" some.c -o some

What would be the problems of specifying an ABI?  Would that improve 
usability for the user?  Less complex /

simpler for the user (say a user who is used to coding asm in a particular way).




You can fiddle things a bit, using the -ffixed-reg, -fcall-used-reg and 
-fcall-saved-reg flags:




This is almost certainly a bad idea for most situations - you really 
have to have a special niche case to make it worth doing.  The register 
allocation algorithms in GCC are complex, and I would expect changing 
these settings would give you less efficient results.  And of course it 
will mess up all calls to any code compiled with different settings - 
such as library code.


A far better solution is for the user who is used to coding in assembly, 
to get used to coding in C, C++, or other languages supported by GCC. 
If you really need some assembly, as happens occasionally, then learn 
about GCC's extended syntax inline assembly.  That lets GCC worry about 
details such as register allocation and operands, so that your assembly 
is minimal, and allows the assembly to work well along with the compiler 
optimisation.
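
A minimal sketch of that pattern (ARMv6T2 or later; the instruction is 
just an illustration):

static inline unsigned reverse_bits(unsigned x)
{
    unsigned r;
    /* One instruction - GCC picks the registers and schedules the code. */
    __asm__ ("rbit %0, %1" : "=r" (r) : "r" (x));
    return r;
}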


If you have legacy assembly functions that are written to a non-standard 
calling convention, write a thunk to translate as necessary.





Re: wishlist: support for shorter pointers

2023-07-06 Thread David Brown via Gcc

On 06/07/2023 09:00, Rafał Pietrak via Gcc wrote:

Hi,

On 5.07.2023 at 19:39, David Brown wrote:
[--]
I'm not sure what this means? At compile time, you only have 
literals, so what's missing?


The compiler knows a lot more than just literal values at compile time 
- lots of things are "compile-time constants" without being literals 
that can be used in string literals.  That includes the value of 
static "const" variables, and the results of calculations or "pure" 
function 


const --> created by a literal.


Technically in C, the only "literals" are "string literals".  Something 
like 1234 is an integer constant, not a literal.  But I don't want to 
get too deep into such standardese - especially not for C++ !


Even in C, there are lots of things that are known at compile time 
without being literals (or explicit constants).  In many situations you 
can use "constant expressions", which includes basic arithmetic on 
constants, enumeration constants, etc.  The restrictions on what can be 
used in different circumstances are not always obvious (if you have 
"static const N = 10;", then "static const M = N + 1;" is valid but "int 
xs[N];" is not).


C++ has a very much wider concept of constant expressions at compile 
time - many more ways to make constant expressions, and many more ways 
to use them.  But even there, the compiler will know things at compile 
time that are not syntactically constant in the language.  (If you have 
code in a function "if (x < 0) return; bool b = (x >= 0);" then the 
compiler can optimise in the knowledge that "b" is a compile-time 
constant of "true".)





calls using compile-time constant data.  You can do a great deal more of 


"compile time constant data" -> literal

this in C++ than in C ("static const int N = 10; int arr[N];" is valid 
in C++, but not in C).  Calculated section names might be useful for 
sections that later need to be sorted.


To be fair, you can construct string literals with the preprocessor, which 
would cover many cases.


OK.  We are talking about convenience syntax that allows using any 
"name" in C sources as a "const-literal", as long as it is rooted in 
literals only.  That's useful.


+2. :)



I can also add that generating linker symbols from compile-time 
constructed names could be useful, to use (abuse?) the linker to find 
issues across different source files.  Imagine you have a 


+1

microcontroller with multiple timers, and several sources that all 
need to use timers.  A module that uses timer 1 could define a 

[--]


 __attribute__((section("jit_buffer,\"ax\"\n@")))


I assume, that adding an attribute should split a particular section 
into "an old one" and "the new one with new attribute", right?


You can't have the same section name and multiple flags.  But you 
sometimes want to have unusual flag combinations, such as executable 
ram sections for "run from ram" functions.


section flags reflect the "semantics" of the section (ro vs. rw is a 
difference in semantics at that level).  So, how do you "merge" RAM (a 
section called ".data"), one with the "!x" flag, and the other with the 
"x" flag?


conflicting flags of sections with the same name have to be taken into 
consideration.




It doesn't make sense to merge linker input sections with conflicting 
flags - this is (and should be) an error at link time.  So I am not 
asking for a way to make a piece of ".data" section with different flags 
from the standard ".data" section - I am asking about nicer ways to make 
different sections with different selections of flags.  (Input sections 
with different flags can be merged into one output section, as the 
semantic information is lost there.)






One would need to have linker logic (and linker script definitions) 
altered, to follow that (other features so far wouldn't require any 
changes to linkers, I think).


to add the flags manually, then a newline, then a line comment 
character (@ for ARM, but this varies according to target.)


6. Convenient support for non-initialised non-zeroed data sections 
in a standardised way, without having to specify sections manually 
in the source and linker setup.


What gain, and under which circumstances, do you get with this?  I mean, 
why force keeping an uninitialized memory fragment, when that is just a 
one-shot action at load time?




Very often you have buffers in your programs, which you want to have 
statically allocated in ram (so they have a fixed address, perhaps 
specially aligned, and so you have a full overview of your memory 
usage in your map files), but you don't care about the contents at 
startup. Clearing these to 0 is just a waste of processor time.


At star

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 18:13, Rafał Pietrak via Gcc wrote:

Hi,

On 5.07.2023 at 16:45, David Brown wrote:

On 05/07/2023 15:29, Rafał Pietrak wrote:

[---]
OK. I don't see a problem here, but I admit that mixing semantics 
often lead to problems.




I think it also allows better generalisation and flexibility if they 
are separate.  You might want careful control over where something is 
allocated, but the access would be using normal instructions. 
Conversely, you might not be bothered about where the data is 
allocated, but want control of access (maybe you want interrupts 
disabled around accesses to make it atomic).


that would require the compiler to know the "semantics" of such a section.  I 
don't think you've listed it below; worth adding.  If I understand you 
correctly, that means the generated code varies depending on the target 
section selected.  This is the linker "talking" to the compiler, if I'm not 
mistaken.




No, it's about the access - not the allocation (or section).  Access 
boils down to a "read" function and a "write" function (or possibly 
several, optimised for different sizes - C11 _Generic can make this 
neater, though C++ handles it better).




[--]
Let me try to list some things I think might be useful (there may be 
some overlap).  I am not giving any particular order here.


1. Adding a prefix to section names rather than replacing them.


OK. +1


2. Adding a suffix to section names.


+1

3. Constructing section names at compile time, rather that just using 
a string literal.  (String literals can be constructed using the 
pre-processor, but that has its limitations.)


I'm not sure what this means? At compile time, you only have literals, 
so what's missing?


The compiler knows a lot more than just literal values at compile time - 
lots of things are "compile-time constants" without being literals that 
can be used in string literals.  That includes the value of static 
"const" variables, and the results of calculations or "pure" function 
calls using compile-time constant data.  You can do a great deal more of 
this in C++ than in C ("static const int N = 10; int arr[N];" is valid 
in C++, but not in C).  Calculated section names might be useful for 
sections that later need to be sorted.


To be fair, you can construct string literals with the preprocessor, which 
would cover many cases.


I can also add that generating linker symbols from compile-time 
constructed names could be useful, to use (abuse?) the linker to find 
issues across different source files.  Imagine you have a 
microcontroller with multiple timers, and several sources that all need 
to use timers.  A module that uses timer 1 could define a 
"using_timer_1" symbol for link time (but with no allocation to real 
memory).  Another module might use timer 2 and define "using_timer_2". 
If a third module uses timer 1 again, then you'd get a link-time error 
with two conflicting definitions of "using_timer_1", and you'd know you 
have to change one of the modules.
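
One way to sketch this in plain C (an initialised definition is a strong 
symbol, so the clash is reported even with -fcommon):

char using_timer_1 = 1;   /* this module claims timer 1; a second module
                             defining the same symbol fails at link time */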




4. Pragmas to apply section names (or prefixes or suffixes) to a block 
of definitions, changing the defaults.


+1

5. Control of section flags (such as read-only, executable, etc.).  At 
the moment, flags are added automatically depending on what you put 
into the section (code, data, read-only data).  So if you want to 
override these, such as to make a data section in ram that is 
executable (for your JIT compiler :-) ), you need something like :


 __attribute__((section("jit_buffer,\"ax\"\n@")))


I assume, that adding an attribute should split a particular section 
into "an old one" and "the new one with new attribute", right?


You can't have the same section name and multiple flags.  But you 
sometimes want to have unusual flag combinations, such as executable ram 
sections for "run from ram" functions.




One would need to have linker logic (and linker script definitions) 
altered, to follow that (other features so far wouldn't require any 
changes to linkers, I think).


to add the flags manually, then a newline, then a line comment 
character (@ for ARM, but this varies according to target.)


6. Convenient support for non-initialised non-zeroed data sections in 
a standardised way, without having to specify sections manually in the 
source and linker setup.


What gain, and under which circumstances, do you get with this?  I mean, 
why force keeping an uninitialized memory fragment, when that is just a 
one-shot action at load time?




Very often you have buffers in your programs, which you want to have 
statically allocated in ram (so they have a fixed address, perhaps 
specially aligned, and so you have a full overview of your memory usage 
in your map files), but you don't care about the contents at startup. 
Clearing these to 0 is just a waste of processor time.
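
A sketch of how this is typically approximated today (the ".noinit" 
section name is only a convention and must be backed by a NOLOAD region 
in the linker script; recent GCC also has a dedicated "noinit" variable 
attribute on some targets):

#include <stdint.h>

__attribute__((section(".noinit")))
uint8_t rx_buffer[4096];   /* contents indeterminate at startup */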



7. Convenient support for sections (or variables) placed at specific 
addresses, in a standardised way.

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 15:29, Rafał Pietrak wrote:

Hi,


On 5.07.2023 at 14:57, David Brown wrote:
[]


My objections to named address spaces stem from two points:

1. They are compiler implementations, not user code (or library code), 
which means development is inevitably much slower and less flexible.


2. They mix two concepts that are actually quite separate - how 
objects are allocated, and how they are accessed.


OK. I don't see a problem here, but I admit that mixing semantics often 
lead to problems.




I think it also allows better generalisation and flexibility if they are 
separate.  You might want careful control over where something is 
allocated, but the access would be using normal instructions. 
Conversely, you might not be bothered about where the data is allocated, 
but want control of access (maybe you want interrupts disabled around 
accesses to make it atomic).


Access to different types of object in different sorts of memory can 
be done today.  In C, you can use inline functions or macros.  For 
target-specific stuff you can use inline assembly, and GCC might have 
builtins for some target-specific features.  In C++, you can also wrap 
things in classes if that makes more sense.


Personally, I'd avoid inline assembly whenever possible. It does a very 
good job of obfuscating programmers' intentions. From my experience, I'd 
rather put the entire functions into assembler if compiler makes obstacles.




I'd rather keep the assembly to a minimum, and let the compiler do what 
it is good at - such as register allocation.  That means extended syntax 
inline assembly (but typically wrapped inside a small inline function).



But that's not an issue here.


Agreed.



Allocation is currently controlled by "section" attributes.  This is 
where I believe GCC could do better, and give the user more 
control. (It may be possible to develop a compiler-independent syntax 
here that could become part of future C and C++ standards, but I think 
it will unavoidably be heavily implementation dependent.)


I agree.



All we really need is a way to combine these with types to improve 
user convenience and reduce the risk of mistakes.  And I believe that 
allowing allocation control attributes to be attached to types would 
give us that in GCC.  Then it would all be user code - typedefs, 
macros, functions, classes, whatever suits.


OK. Sounds good.

Naturally I have my "wishlist": the "small pointers" segment/attribute :)

But how (and to what extent) would you do that? I mean, the convenient 
syntax is desirable, but IMHO at this point there is also a question of 
semantics: what exactly is the compiler supposed to tell the linker? I 
think it would be good to list here the use scenarios that we know of - 
scenarios that would benefit from the compiler communicating more to the 
linker than names@sections. (Even if such a list didn't evolve into any 
implementation effort at this point, I think it would nicely conclude 
this thread.)




Let me try to list some things I think might be useful (there may be 
some overlap).  I am not giving any particular order here.


1. Adding a prefix to section names rather than replacing them.

2. Adding a suffix to section names.

3. Constructing section names at compile time, rather than just using a 
string literal.  (String literals can be constructed using the 
pre-processor, but that has its limitations.)


4. Pragmas to apply section names (or prefixes or suffixes) to a block 
of definitions, changing the defaults.


5. Control of section flags (such as read-only, executable, etc.).  At 
the moment, flags are added automatically depending on what you put into 
the section (code, data, read-only data).  So if you want to override 
these, such as to make a data section in ram that is executable (for 
your JIT compiler :-) ), you need something like :


__attribute__((section("jit_buffer,\"ax\"\n@")))

to add the flags manually, then a newline, then a line comment character 
(@ for ARM, but this varies according to target.)


6. Convenient support for non-initialised non-zeroed data sections in a 
standardised way, without having to specify sections manually in the 
source and linker setup.


7. Convenient support for sections (or variables) placed at specific 
addresses, in a standardised way.


8. Convenient support for sections that are not allocated space by the 
linker in the target memory, but where the contents are still included 
in the elf file and map files, where they can be read by other tools. 
(This could be used for external analysis tools.)


9. Support for getting data from the linker to the code, such as section 
sizes and start addresses, without having to manually add the symbols to 
the linker file and declare extern symbols in the C or C++ code.  (The 
manual version is sketched below, after this list.)


10. Support for structs (or C++ classes) where different parts of the 
struct are in different sections.  This would mean the st
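To make point 9 concrete, here is roughly what the manual version looks 
like today - a sketch only, with the section and symbol names invented 
for the example:

/* In the linker script (the manual step point 9 would remove):
     __mysec_start = ADDR(.mysec);
     __mysec_end   = ADDR(.mysec) + SIZEOF(.mysec);              */

#include <stddef.h>

extern const char __mysec_start[];   /* hypothetical linker symbols */
extern const char __mysec_end[];

static inline size_t mysec_size(void)
{
    return (size_t)(__mysec_end - __mysec_start);
}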

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc




On 05/07/2023 14:25, Rafał Pietrak wrote:

Hi,

On 05/07/2023 13:55, David Brown wrote:

On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:

[--]

So your current objections to named spaces ... are in fact in favor 
of them. Isn't it so?




Not really, no - I would rather see better ways to handle allocation 
and section control than more named address spaces.


Doesn't it call for "something" by which a C source (through the 
compiler) can express the programmer's intention to the linker?




Yes, I think that is fair to say.  And that "something" should be more 
advanced and flexible than the limited "section" attribute we have 
today.  But I don't think it should be "named address spaces".


My objection to named address spaces stem from two points:

1. They are compiler implementations, not user code (or library code), 
which means development is inevitably much slower and less flexible.


2. They mix two concepts that are actually quite separate - how objects 
are allocated, and how they are accessed.


Access to different types of object in different sorts of memory can be 
done today.  In C, you can use inline functions or macros.  For 
target-specific stuff you can use inline assembly, and GCC might have 
builtins for some target-specific features.  In C++, you can also wrap 
things in classes if that makes more sense.


Allocation is currently controlled by "section" attributes.  This is 
where I believe GCC could do better, and give the user more control. 
(It may be possible to develop a compiler-independent syntax here that 
could become part of future C and C++ standards, but I think it will 
unavoidably be heavily implementation dependent.)


All we really need is a way to combine these with types to improve user 
convenience and reduce the risk of mistakes.  And I believe that 
allowing allocation control attributes to be attached to types would 
give us that in GCC.  Then it would all be user code - typedefs, macros, 
functions, classes, whatever suits.


David



Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:

Hi,

On 05/07/2023 11:11, David Brown wrote:

On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:

[---]


I am not sure if you are clear about this, but the address space 
definition macros here are for use in the source code for the 
compiler, not in user code.  There is (AFAIK) no way for user code to 
create address spaces - you need to check out the source code for GCC, 
modify it to support your new address space, and build your own 
compiler.  This is perfectly possible (it's all free and open source, 
after all), but it is not a minor undertaking - especially if you 
don't like C++ !


Hmmm.

Wouldn't it be easier and more natural to make the "named spaces" a 
synonym for specific linker sections (like section names, or a section 
name prefix, where instead of ".data.array.*" one gets ".mynamespace.array.*")?


You can, of course, write :

#define __smalldata __attribute__((section(".smalldata")))
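and then apply it to each piece of data that should land there - for 
instance (variable names invented):

__smalldata int counter;     /* allocated in section ".smalldata" */
__smalldata char buf[32];    /* placement only - access is ordinary */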

I'd rather see the "section" attribute extended to allow it to specify a 
prefix or suffix (to make subsections) than more named address spaces.


I'm a big fan of only putting things in the compiler if they have to be 
there - if a feature can be expressed in code (whether it be C, C++, or 
preprocessor macros), then I see that as the best choice.




[--]
I realise that learning at least some C++ is a significant step beyond 
learning C - but /using/ C++ classes or templates is no harder than C 
coding.  And it is far easier, faster and less disruptive to make a 
C++ header library implementing such features than adding new named 
address spaces into the compiler itself.


The one key feature that is missing is that named address spaces can 
affect the allocation details of data, which cannot be done with C++ 
classes.  You could make a "small_data" class template, but variables 
would still need to be marked __attribute__((section(".smalldata"))) 
when used.  I think this could be handled very neatly with one single 
additional feature in GCC - allow arbitrary GCC variable attributes to 
be specified for types, which would then be applied to any variables 
declared for that type.


OK. I see your point.

But let's have a look at it. You say that "named spaces affect allocation 
details, which cannot be done with C++". Please consider:
1. For small embedded devices, C++ is not a particularly big "seller". We 
even turn to assembler occasionally.


I have been writing code for small embedded systems for about 30 years. 
I used to write a lot in assembly, but it is very rare now.  Almost all 
of the assembly I write these days is inline assembly in gcc format - 
and a lot of that actually contains no assembly at all, but is for 
careful control of dependencies or code re-arrangements.  The smallest 
device I have ever used was an AVR Tiny with no ram at all - just 2K 
flash, a 3-level return stack and its 32 8-bit registers.  I programmed 
that in C (with gcc).


C++ /is/ a big "seller" in this market.  It is definitely growing, just 
as the market for commercial toolchains with non-portable extensions is 
dropping and 8-bit CISC devices are being replaced by Cortex-M0 cores. 
There is certainly plenty of C-only coding going on, but C++ is growing.


2. Affecting allocation details is usually the whole point of engineering 
skill when dealing with small embedded devices - the whole point is to 
have tools to do that.




When you are dealing with 8-bit CISC devices like the 8051 or the COP8, 
then allocation strategies are critical, and good tools are essential.


But for current microcontrollers, they are not nearly as important 
because you have a single flat address space - pointers to read-only 
data in flash and pointers to data in ram are fully compatible.  You do 
sometimes need to place particular bits of data in particular places, 
but that is usually for individual large data blocks such as putting 
certain buffers in non-cached memory, or a large array in external 
memory.  Section attributes suffice for that.


Allocation control is certainly important at times, but it's far from 
being as commonly needed as you suggest.


(Dynamic allocation is a different matter, but I don't believe we are 
talking about that here.)


So your current objections to named spaces ... are in fact in favor of 
them. Isn't it so?




Not really, no - I would rather see better ways to handle allocation and 
section control than more named address spaces.


David




Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc




On 05/07/2023 11:25, Martin Uecker wrote:

On Wednesday, 05/07/2023 at 11:11 +0200, David Brown wrote:

On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:


...


In my personal opinion (which you are all free to disregard), named 
address spaces were an interesting idea that failed.  I was enthusiastic 
about a number of the extensions in TR 18307 "C Extensions to support 
embedded processors" when the paper was first published.  As I learned 
more, however, I saw it was a dead-end.  The features are too 
under-specified to be useful or portable, gave very little of use to 
embedded programmers, and fit badly with C.  It was an attempt to 
standardise and generalise some of the mess of different extensions that 
proprietary toolchain developers had for a variety of 8-bit CISC 
microcontrollers that could not use standard C very effectively.  But it 
was all too little, too late - and AFAIK none of these proprietary 
toolchains support it.  GCC supports some of the features to some extent 
- a few named address spaces on a few devices, for "gnuc" only (not 
standard C, and not C++), and has some fixed point support for some 
targets (with inefficient generated code - it appears to be little more 
than an initial "proof of concept" implementation).

I do not think named address spaces have a future - in GCC or anywhere 
else.  The only real use of them at the moment is for the AVR for 
accessing data in flash, and even then it is of limited success since it 
does not work in C++.


Can you explain a little bit why you think it is a dead-end?  It
seems an elegant solution to a range of problems to me.


Named address spaces are not standardised in C, and I do not expect they 
ever will be.  The TR18307 document is not anywhere close to being of a 
quality that could be integrated with the C standards, even as optional 
features, and much of it makes no sense in practice (I have never heard 
of the IO stuff being implemented or used).


The few compilers that implement any of it do so in different ways - the 
"__flash" address space in AVR GCC is slightly different from the same 
extension in IAR's AVR compiler.  For existing compilers, there is a 
strong inconsistency as to whether such things are "named address 
spaces", "extension keywords", "type qualifiers", "attributes", or other 
terms, all with subtly (or not so subtly) different effects on how they 
are used, what restrictions exist, conversions between types, and how 
errors can be diagnosed.  Sometimes these features are considered part 
of the data type, sometimes of pointer types, sometimes they are just 
about data placement.


Since every compiler targeting these small awkward microcontrollers has 
a different idea of what something like "const __flash int x = 123;" 
means, and has been implementing their own ideas for a decade or two 
before TR18307 ever proposed "named address spaces", the TR hasn't a 
hope of being a real standard.


Named address spaces are not implemented at all, anywhere (AFAIK), for 
C++.  (Some embedded toolchains have limited support for C++ on such 
microcontrollers, but these are again not really named address spaces.) 
Since C++ usage is heavily increasing in the small embedded system 
world, this is important.  (GCC has much of the honour for that - as ARM 
took a bigger share of the market and GCC for ARM improved, the 
toolchain market was no longer at the mercy of big commercial vendors 
who charged absurd amounts for their C++ toolchains.)  A feature which 
is only for C, and not supported by C++, is almost guaranteed to be 
dead-end.


And of course the type of processor for which named address spaces or 
other related extensions are essential, are a dying breed.  The AVR is 
probably the only one with a significant future.  Part of the appeal of 
ARM in the embedded world is it frees you from the pains of 
target-specific coding with some of your data in "near" memory, some in 
"extended" memory, some in "flash" address spaces or "IO" address 
spaces.  It all works with standard C or C++.  The same applies to 
challengers like RISC-V, MIPS, PPC, and any other core - you have a 
single flat address space for normal data.




I have no idea how much the GCC features are actually used,
but other compilers for  embedded systems such as SDCC also
support named address spaces.



And the targets supported by SDCC are also dead-end devices - there is 
not a single one of them that I would consider for a new project.  These 
microcontrollers are now used almost exclusively for legacy projects - 
updates to existing hardware or software, and rely on compatibility with 
existing C extensions (whether they are called "named address spaces", 
"extension keywords", or anything else).



Now, there are things that I would like to be able to write in my code 
that could ap

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:

Hi,

On 05/07/2023 09:29, Martin Uecker wrote:

On Wednesday, 05/07/2023 at 07:26 +0200, Rafał Pietrak wrote:

[---]

And if it's so ... there is no mention of how it shows up for a
"simple user" of GCC (instead of the use of that "machinery" by the
creators of a particular GCC port). In other words: how should the sources
look for the compiler to do "the thing"?



Not sure I understand the question.  You would add a name space
to an object as a qualifier and then the object would be allocated
in a special (small) region of memory.  Pointers known to point
into that special region of memory (which is encoded into the
type) would then be smaller.  At least, this is my understanding
of how it could work.


Note that this only applies to pointers declared to be of the address 
space specific type.  If you have "__smalldata int x;" using a 
hypothetical new address space, then "&x" is of type "__smalldata int *" 
and you need to specify the address space specific pointer type to get 
the size advantages.  (Since the __smalldata address space is a subset 
of the generic space, conversions between pointer types are required to 
work correctly.)
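(For comparison, the AVR "__flash" space - the one real, documented 
example today, and GNU C only - is used like this:)

/* AVR target only; see "Named Address Spaces" in the GCC manual. */
const __flash char msg[] = "hello";   /* object allocated in flash      */
const __flash char *p = msg;          /* pointer type carries the space */
char c = p[0];                        /* read uses LPM, not a RAM load  */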




Apparently you do understand my question.

Then again ... apparently you are guessing the answer. Incidentally, 
that would be my guess, too. And while such "syntax" is not really 
desirable (since such attribution at every declaration of every "short 
pointer" variable would significantly obfuscate the sources, and a thing 
like "#pragma" at the top of a file would do a better job), better 
something than nothing. Then again, should you happen to come across 
actual documentation of the syntax for this feature, I'd appreciate 
you sharing it :)




I am not sure if you are clear about this, but the address space 
definition macros here are for use in the source code for the compiler, 
not in user code.  There is (AFAIK) no way for user code to create 
address spaces - you need to check out the source code for GCC, modify 
it to support your new address space, and build your own compiler.  This 
is perfectly possible (it's all free and open source, after all), but it 
is not a minor undertaking - especially if you don't like C++ !


In my personal opinion (which you are all free to disregard), named 
address spaces were an interesting idea that failed.  I was enthusiastic 
about a number of the extensions in TR 18307 "C Extensions to support 
embedded processors" when the paper was first published.  As I learned 
more, however, I saw it was a dead-end.  The features are too 
under-specified to be useful or portable, gave very little of use to 
embedded programmers, and fit badly with C.  It was an attempt to 
standardise and generalise some of the mess of different extensions that 
proprietary toolchain developers had for a variety of 8-bit CISC 
microcontrollers that could not use standard C very effectively.  But it 
was all too little, too late - and AFAIK none of these proprietary 
toolchains support it.  GCC supports some of the features to some extent 
- a few named address spaces on a few devices, for "gnuc" only (not 
standard C, and not C++), and has some fixed point support for some 
targets (with inefficient generated code - it appears to be little more 
than an initial "proof of concept" implementation).


I do not think named address spaces have a future - in GCC or anywhere 
else.  The only real use of them at the moment is for the AVR for 
accessing data in flash, and even then it is of limited success since it 
does not work in C++.



I realise that learning at least some C++ is a significant step beyond 
learning C - but /using/ C++ classes or templates is no harder than C 
coding.  And it is far easier, faster and less disruptive to make a C++ 
header library implementing such features than adding new named address 
spaces into the compiler itself.


The one key feature that is missing is that named address spaces can 
affect the allocation details of data, which cannot be done with C++ 
classes.  You could make a "small_data" class template, but variables 
would still need to be marked __attribute__((section(".smalldata"))) 
when used.  I think this could be handled very neatly with one single 
additional feature in GCC - allow arbitrary GCC variable attributes to 
be specified for types, which would then be applied to any variables 
declared for that type.


David





Re: wishlist: support for shorter pointers

2023-07-04 Thread David Brown via Gcc

On 04/07/2023 16:46, Rafał Pietrak wrote:

Hi,

On 04/07/2023 14:38, David Brown wrote:
[-]
A key difference is that using 32-bit pointers on an x86 is enough 
address space for a large majority of use-cases, while even on the 
smallest small ARM microcontroller, 16-bit is not enough.  (It's not 
even enough to access all memory on larger AVR microcontrollers - the 
only 8-bit device supported by mainline gcc.)  So while 16 bits would 
cover the address space of the RAM on a small ARM microcontroller, it 
would not cover access to code/flash space (including read-only data), 
IO registers, or other areas of memory-mapped memory and peripherals. 
Generic low-level pointers really have to be able to access everything.


Naturally 16-bit is "most of the time" not enough to cover the entire 
workspace on even the smallest MCU (the AVR being the only thing close to 
an exception here), but in my limited experience, that is not really 
necessary.

(Most MSP430 devices, also supported by GCC, are also covered by a 
16-bit address space.)


Meaning "generic low-level pointers really have to...", I 
don't think so. I really don't. Programs often manipulate quite 
"localized" data, and compiler is capable enough to distinguish and keep 
separate pointers of different "domains". What makes it currently 
impossible is tools (semantic constructs like pragma or named sections) 
that would let it happen.




No, generic low-level pointers /do/ have to work with all reasonable 
address spaces on the device.  A generic pointer has to support pointing 
to modifiable ram, to constant data (flash on small microcontrollers), 
to IO registers, etc.  If you want something that can access a specific, 
restricted area, then it is a specialised pointer - not a generic one. 
C has no support for making your own pointer types, but C++ does.




So an equivalent of x32 mode would not work at all.  Really, what you 
want is a 16-bit "small pointer" that is added to 0x20000000 (the base 
address for RAM in small ARM devices, in case anyone following this 
thread is unfamiliar with the details) to get a real data pointer.  
And you'd like these small pointers to have convenient syntax and 
efficient use.


More or less, yes. But "with a twist". A "compiler construct" that would 
be (say) sufficient to get the RAM savings/optimization I'm aiming at 
could be "reduced" to the ability to create a "medium-size" array of "some 
objects" and have them reference each other all WITHIN that "array". 
That array was in my earlier emails referred to as a segment or section. 
So whenever a programmer writes a construct like:


struct test_s __attribute__((small_and_funny)) {
 struct test_s __attribute__((small_and_funny)) *next, *prev, *head;
 struct test_s __attribute__((small_and_funny)) *user, *group;
} repository[1000];
struct test_s __attribute__((small_and_funny)) *master, *trash;

the compiler puts that data into that small array (a dedicated section), 
so no "generic low-level pointers" referring to that data would need to 
exist within the program. And if one appears anyway, an error is thrown 
(or an automatic conversion happens).




GCC attributes for sections already exist.

And again - indices will give you what you need here more efficiently 
than pointers.  All of your pointers can be converted to "repository[i]" 
format.  (And if your repository has no more than 256 entries, 8-bit 
indices will be sufficient.)  It can be efficient to store pointers to 
the entries in local variables if you are using them a lot, though GCC 
will do a fair amount of that automatically.
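A minimal sketch of that index form, assuming the repository never 
exceeds 256 entries (field names borrowed from the earlier example):

#include <stdint.h>

struct test_s {
    uint8_t next, prev, head;   /* indices into repository[]     */
    uint8_t user, group;        /* one byte each instead of four */
};
static struct test_s repository[200];

/* repository[i].next plays the role of p->next in a chain traversal */
static uint8_t nth_next(uint8_t i, unsigned n)
{
    while (n--)
        i = repository[i].next;
    return i;
}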




I think a C++ class (or rather, class template) with inline functions 
is the way to go here.  gcc's optimiser will give good code, and the 
C++ class will let you get nice syntax to hide the messy details.


OK. Thanks for the advice, but going into C++ is a major thing for me, 
and (at least for the time being) I'll stay with ordinary "big" pointers 
in plain C instead.


There is no good way to do this in C.  Named address spaces would be a 
possibility, but require quite a bit of effort and change to the 
compiler to implement, and they don't give you anything that you would 
not get from a C++ class.


Yes. Named address spaces would be great. And for code, too.



It is good to have a wishlist (and you can file a wishlist "bug" in the 
gcc bugzilla, so that it won't be forgotten).  But it is also good to be 
realistic.  Indices will give you what you need in terms of space 
efficiency, but will be messier in the syntax.  A small pointer class 
will give you efficient code and neat syntax, but require C++.  These 
two solutions will, however, work today.  (And they are both target 
independent.)


David


(That's not quite true - named address spaces can, I believe, also 
influence the section name used for allocation of data defined in 
these spaces, which cannot be done by a C++ class.)


OK.

-R




Re: wishlist: support for shorter pointers

2023-07-04 Thread David Brown via Gcc

On 04/07/2023 16:20, Rafał Pietrak wrote:



On 03/07/2023 18:29, Rafał Pietrak wrote:

Hi David,


[--]
4. It is worth taking a step back, and thinking about how you would 
like to use these pointers.  It is likely that you would be better 
thinking in terms of an array, rather than pointers - after all, you 
don't want to be using dynamically allocated memory here if you can 
avoid it, and certainly not generic malloc().  If you can use an 
array, then your index type can be as small as you like - maybe 
uint8_t is enough.


I did that trip ... some time ago. Maybe I discarded the idea 
prematurely, but I dropped it because I was afraid of the cost of 

I remember now what was my main problem with an index implementation: 
the inability to express/write chained "references" with them. The 
table/index semantics of:

 t[a][b][c][d]

is a "multidimensional table", which is completely different from the 
"pointer semantics" of:

 *t->a->b->c->d

It is quite legit to do a full circle around a circular list this way, 
while table semantics doesn't allow that.


Indexes are off the table.

-R


If you have a circular buffer, it is vastly more efficient to have an 
array with no pointers or indices, and use head and tail indices to 
track the current position.  But I'm not sure if that is what you are 
looking for.  And you can use indices in fields for chaining, but the 
syntax will be different.  (For some microcontrollers, the 
multiplications involved in array index calculations can be an issue, 
but not for ARM devices.)
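A minimal sketch of that head/tail scheme, with invented names, using a 
power-of-two size so that the wrap-around is a cheap mask:

#include <stdint.h>

#define RING_SIZE 64u   /* power of two */

struct ring {
    uint8_t buf[RING_SIZE];
    uint8_t head, tail;   /* plain indices - no pointers stored */
};

/* Callers are assumed to check for full/empty before calling. */
static inline void ring_put(struct ring *r, uint8_t v)
{
    r->buf[r->head++ & (RING_SIZE - 1u)] = v;
}

static inline uint8_t ring_get(struct ring *r)
{
    return r->buf[r->tail++ & (RING_SIZE - 1u)];
}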





Re: wishlist: support for shorter pointers

2023-07-04 Thread David Brown via Gcc

On 03/07/2023 18:42, Rafał Pietrak via Gcc wrote:

Hi Ian,

On 03/07/2023 17:07, Ian Lance Taylor wrote:
On Wed, Jun 28, 2023 at 11:21 PM Rafał Pietrak via Gcc 
 wrote:

[]

I was thinking about that, and it doesn't look like it requires that deep
a rewrite. An ABI spec that could accommodate the functionality could be as
little as one additional attribute on linker segments.


If I understand correctly, you are looking for something like the x32
mode that was available for a while on x86_64 processors:
https://en.wikipedia.org/wiki/X32_ABI .  That was a substantial amount
of work including changes to the compiler, assembler, linker, standard
library, and kernel.  And at least to me it's never seemed
particularly popular.


Yes.

And the Wiki reporting up to 40% performance improvement in some corner 
cases is impressive and encouraging. I believe that the reported 
average of 5-8% improvement would be significantly better within the 
tiny resource environment of an MCU. In the MCU world, such an 
improvement could mean the difference between a project fitting or not 
fitting into a particular device.


-R



A key difference is that using 32-bit pointers on an x86 is enough 
address space for a large majority of use-cases, while even on the 
smallest small ARM microcontroller, 16-bit is not enough.  (It's not 
even enough to access all memory on larger AVR microcontrollers - the 
only 8-bit device supported by mainline gcc.)  So while 16 bits would 
cover the address space of the RAM on a small ARM microcontroller, it 
would not cover access to code/flash space (including read-only data), 
IO registers, or other areas of memory-mapped memory and peripherals. 
Generic low-level pointers really have to be able to access everything.


So an equivalent of x32 mode would not work at all.  Really, what you 
want is a 16-bit "small pointer" that is added to 0x20000000 (the base 
address for RAM in small ARM devices, in case anyone following this 
thread is unfamiliar with the details) to get a real data pointer.  And 
you'd like these small pointers to have convenient syntax and efficient use.


I think a C++ class (or rather, class template) with inline functions is 
the way to go here.  gcc's optimiser will give good code, and the C++ 
class will let you get nice syntax to hide the messy details.


There is no good way to do this in C.  Named address spaces would be a 
possibility, but require quite a bit of effort and change to the 
compiler to implement, and they don't give you anything that you would 
not get from a C++ class.


(That's not quite true - named address spaces can, I believe, also 
influence the section name used for allocation of data defined in these 
spaces, which cannot be done by a C++ class.)


David



Re: wishlist: support for shorter pointers

2023-07-03 Thread David Brown via Gcc

On 28/06/2023 10:35, Rafał Pietrak via Gcc wrote:

Hi Jonathan,

On 28/06/2023 09:31, Jonathan Wakely wrote:




If you use a C++ library type for your pointers the syntax above 
doesn't need to change, and the fancy pointer type can be implemented 
portably, with customisation for targets where you could use 16 bits 
for the pointers.


As you can expect from the problem I've stated - I don't know C++, so 
I'll need some more advice there.


But, before I dive into learning C++ (forgive the naive question), 
isn't it so that C++ comes with a heavy runtime? One that will bloat my 
tiny project? Or does the bloat come only when one uses particularly 
elaborate class/inheritance scenarios, so that this particular case ( for 
(...; ...; x = x->next) {} ) will not draw any of that into the project?





Let me make a few points (in no particular order) :

1. For some RISC targets, such as PowerPC, it is common to have a 
section of memory called the "small data section".  One of the registers 
is dedicated as an anchor to this section, and data within it is 
addressed as Rx + 16-bit offset.  But this is primarily for data at 
fixed (statically allocated) addresses, since reads and writes using 
this address mode are smaller and faster than full 32-bit addresses. 
Normal pointers are still 32-bit.  It also requires a dedicated register 
- not a big cost when you have 31 GPRs, but much more costly when you 
have only 13.


2. C++ is only costly if you use costly features.  On small embedded 
systems, you want "-fno-exceptions -fno-rtti", and you will get as good 
(or bad!) results for C++ as for C.  Many standard library features 
will, however, result in a great deal of code - it is usually fairly 
obvious which classes and functions are appropriate.


3. In C, you could make a type such as :

typedef struct {
uint16_t p;
} small_pointer_t;

and conversion functions :

static const uintptr_t ram_base = 0x20000000;

static inline void * sp_to_voidp(small_pointer_t sp) {
return (void *)(ram_base + sp.p);
}

static inline small_pointer_t voidp_to_sp(void * p) {
small_pointer_t sp;
sp.p = (uint16_t)((uintptr_t) p - ram_base);
return sp;
}

Then you would use these access functions to turn your "small pointers" 
into normal pointers.  The source code would become significantly harder 
to read and write, and less type-safe, but could be quite efficient.


In C++, you'd use the same kinds of functions.  But they would now be 
methods in a class template, and tied to overloaded operators and/or 
conversion functions.  The result would be type-safe and let you 
continue to use a normal pointer-like syntax, and with equally efficient 
generated code.  You could also equally conveniently have small pointers 
to ram and to peripheral groups.  This mailing list is not really the 
place to work through an implementation of such class templates - but it 
certainly could be done.
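For illustration only, here is the shape such a class template might 
take - a minimal sketch assuming C++11 and the 0x20000000 SRAM base 
discussed earlier, not a worked-out implementation:

#include <cstdint>

template <typename T, std::uintptr_t Base = 0x20000000u>
class small_ptr {
    std::uint16_t off;   /* the stored two-byte "pointer" */
public:
    small_ptr() : off(0) {}
    explicit small_ptr(T *p)
        : off(static_cast<std::uint16_t>(
              reinterpret_cast<std::uintptr_t>(p) - Base)) {}

    T *get() const { return reinterpret_cast<T *>(Base + off); }

    T &operator*()  const { return *get(); }
    T *operator->() const { return get(); }
};

A list node could then hold "small_ptr<node> next, prev;" at two bytes 
each, while "*p" and "p->field" keep the ordinary pointer syntax; a real 
version would also want comparisons, a null convention, and conversions 
to and from plain "T *".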



4. It is worth taking a step back, and thinking about how you would like 
to use these pointers.  It is likely that you would be better thinking 
in terms of an array, rather than pointers - after all, you don't want 
to be using dynamically allocated memory here if you can avoid it, and 
certainly not generic malloc().  If you can use an array, then your 
index type can be as small as you like - maybe uint8_t is enough.



David





Re: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?

2023-06-06 Thread David Brown via Gcc

On 06/06/2023 14:53, Paul Smith wrote:

On Tue, 2023-06-06 at 16:36 +0800, Julian Waters via Gcc wrote:

Sorry for my outburst, to the rest of this list. I can no longer stay
silent and watch these little shits bully people who are too kind to
fire back with the same kind of venom in their words.


Many of us have had Dave in our killfiles for a long time already.  I
recommend you (and everyone else) do the same.  You won't miss out on
any information of any use to anyone: he apparently just enjoys making
other people angry.

I'm quite serious: it's so not worth the mental energy to even read his
messages, much less to reply to him.  Arguing with "people who are
wrong on the internet" can be cathartic but this is not arguing, it's
just stabbing yourself in the eye with a pencil.  Don't play.



If a poster is causing enough aggravation that a large number of people 
have killfiled him, is there a process for banning him from the list? 
That is surely a better solution than having many people individually 
killfiling him?  I would assume those with the power to blacklist 
addresses from the mailing list do not do so lightly, and that there is 
a procedure for it.


David




Re: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?

2023-06-06 Thread David Brown via Gcc

On 06/06/2023 02:09, Dave Blanchard wrote:


If this guy's threads are such a terrible waste of your time, how
about employing your email client's filters to ignore his posts (and
mine too) and fuck off?



You apparently appreciate Stefan's posts, but burst a blood vessel when 
reading anyone else's.  And Stefan has shown a total disregard for what 
anyone else writes.


Rather than everyone else having to killfile the pair of you, why don't 
you do everyone a favour and have your little rants with each other 
directly, and not on this list?


If either of you are remotely interested in improving gcc's 
optimisation, there are two things you must do:


1. Stop wasting the developers' time and driving them up the wall, so 
that they have more time to work on improving the tools.


2. Make the suggestions and requests for improvements through the proper 
channels - polite, factual and detailed bug reports.


This is not rocket science - it's basic human decency, and should not be 
difficult to understand.


David



Re: Who cares about performance (or Intel's CPU errata)?

2023-05-28 Thread David Brown

On 28/05/2023 01:30, Andrew Pinski via Gcc wrote:

On Sat, May 27, 2023 at 3:54 PM Stefan Kanthak  wrote:





 sete    al
 movzx   eax, al  # superfluous


No it is not superfluous, well ok it is because of the context of eax
(besides the lower 8 bits) are already zero'd but keeping that track
is a hard problem and is turning problem really. And I suspect it
would cause another false dependency later on too.

For -Os -march=skylake (and -Oz instead of -Os) we get:
 popcnt  rdi, rdi
 popcnt  rsi, rsi
 add     esi, edi
 xor     eax, eax
 dec     esi
 sete    al

Which is exactly what you want right?

Thanks,
Andrew

There is also the option of using "bool" as the return type for boolean 
functions, rather than "int".  When returning a "bool", gcc does not add 
the "movzx eax, al" instruction.  (There are some circumstances where 
returning "int" for a boolean value is a better choice, but usually 
"bool" makes more sense, and it can often be a touch more efficient.)


David




Re: Will GCC eventually support correct code compilation?

2023-05-28 Thread David Brown

On 27/05/2023 20:16, Dave Blanchard wrote:

On Fri, 26 May 2023 18:44:41 +0200 David Brown via Gcc
 wrote:


On 26/05/2023 17:49, Stefan Kanthak wrote:


I don't like to argue with idiots: they beat me with experience!

Stefan



Stefan, you are clearly not happy about the /free/ compiler you
are using, and its /free/ documentation (which, despite its flaws,
is better than I have seen for most other compilers).


When the flaws continue to stack up as things get provably worse over
time, at some point you need to stop patting yourself on the back,
riding on the coattails of your past successes, and get to work
making things right.



I think your idea of "proof" might differ from that of everyone else. 
The GCC developers are entirely aware that their tools have bugs and 
scope for improvement, but anyone who has followed the project for any 
length of time can see it has continually progressed in many ways. 
There are regularly minor regressions, and occasionally serious issues - 
but the serious issues get fixed.


This is open source software.  If newer versions were "getting provably 
worse over time", then people would simply fork earlier versions and use 
them.  That's what happens in projects where a significant number of 
users or developers feel the project is moving in the wrong direction.



At the very least, GCC documentation is HORRIBLE, as this previous
thread proves.


Now I am sure that you don't know what "proof" is.  In regard to 
documentation, this thread proves that GCC's documentation is not 
perfect, that the GCC developers know this, that they ask people for 
suggestions for improvement, and that they keep track of suggestions or 
complaints so that they can be fixed when time and resources allow.




If the branch is rotten and splintered then maybe it's time to get
off that branch and climb onto another one.


Feel free to do so.




Remember, these are people with /no/ obligation to help you.


... and it often shows!


My experience, like that of most people (judging from the mailing lists 
and the bugzilla discussions I have read), is different - those who 
treat the GCC developers politely and with the respect due any fellow 
human, get a great deal of help.  They might not always agree on what 
should be changed, but even then you can generally come out of the 
discussion with an understanding of /why/ they cannot or will not change 
GCC as you'd like.


But - like everyone else - the GCC developers can quickly lose interest 
in helping those who come across as rude, demanding, unhelpful and 
wilfully ignorant.





Some do gcc development as voluntary contributions, others are paid
to work on it - but they are not paid by /you/.  And none are paid
to sit and listen to your tantrums.


So is this proof of the technical and intellectually bankruptcy of
the open source development model, or...?


No, it is not.



If nobody wants to have detailed discussions about the technical
workings of a very serious tool that millions are relying on day in
and day out, what is this mailing list FOR, exactly?



It /is/ for such discussions.  This thread has not been a discussion - 
it has been driven by someone who preferred to yell and whine rather 
than discuss, and insisted on continuing here rather than filing bug 
reports in the right places.  The GCC developers prefer to work /with/ 
the users in finding out how to make the toolchain better - /that/ is 
what the mailing lists are for.





Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread David Brown via Gcc

On 26/05/2023 17:49, Stefan Kanthak wrote:


I don't like to argue with idiots: they beat me with experience!

Stefan



Stefan, you are clearly not happy about the /free/ compiler you are 
using, and its /free/ documentation (which, despite its flaws, is better 
than I have seen for most other compilers).


Instead of filing a bug report, as you have been asked to do, or reading 
the documentation, or thinking, or posting to an appropriate mailing 
list, you have chosen to rant, yell, shout at and insult the very people 
who could make the changes and improvements you want.


So who, exactly, do you think is acting like an idiot?  I'd say it is 
the rude and arrogant fool that is sawing off the branch he is sitting on.


Remember, these are people with /no/ obligation to help you.  Some do 
gcc development as voluntary contributions, others are paid to work on 
it - but they are not paid by /you/.  And none are paid to sit and 
listen to your tantrums.



So if you want to shout and rant and blow off steam, go make a tweet or 
something.  If you actually hope to see gcc change its optimisation, 
flag details or documentation to your liking, then your current 
behaviour is the worst possible tactic.  So let your final post to this 
thread be an apology, then register bug reports with what you see as 
bugs or scope for improvement in the project.  Please - for the sanity 
of the gcc developers and for the benefit of gcc users everywhere - stop 
your aggravating posts here, so that Jonathan and the others can get back 
to what they do best - improving gcc for everyone.


David




Re: More C type errors by default for GCC 14

2023-05-14 Thread David Brown

On 14/05/2023 07:28, Po Lu via Gcc wrote:

Eli Schwartz  writes:


Quoting my previous reply on the topic.

Until everyone is on the same page as you about whether these are GNUC
extensions, this conversation will go nowhere.

You are of the opinion that "GCC currently behaves a certain way when
compiling the code" means that the behavior is documented and specified
as a "C Extension" for the GNUC language dialect.


Yes, by the definition of ``extension'' used by everyone except you.


I can't speak for "everyone", and I don't think you can either.  But I 
believe I am safe in saying that everyone who has expressed an opinion 
in this thread, agrees with Eli's definition - GCC C extensions are what 
the GCC C manual documents as extensions.


The behaviour of a particular compiler does not define the extensions. 
At best, these are "undocumented features" - if you find them consistent 
enough and useful enough on a particular version of a particular 
compiler, then you may be willing to rely on them despite a lack of any 
kind of guarantees or documentation.


(You might note that there are several compilers, not just GCC, that 
implement many of the GNU C extensions.  I don't believe any of them 
makes any guarantees about your imagined undocumented extensions either.)





Undefined and undocumented behavior is not a language extension. It is
undefined and undocumented behavior.


But when it becomes defined by the translator, in a precise way, it
becomes an extension to the Standard.


No, it becomes an artefact of the particular implementation.  And if you 
have studied that implementation closely enough to see that it fulfils 
the specifications you want, not just its actual specifications, then 
feel free to keep that implementation and use it.  But you have 
absolutely no basis for expecting that any other implementation (such as 
future gcc versions) implement the same undocumented specifications.





You may feel free to take an exact GCC release (source or binary),
analyze it, reverse-engineer it, or verify that it does what you want
it to do, and then trust that those undefined and undocumented
behaviors are ***benevolent***, but that doesn't cause it to be
defined and documented, and your trust will, if you are wise, depend
on you pinning an exact source code commit of the compiler. Do not
depend on bugfix releases of GCC to preserve your undocumented
semantics. They may or they may not, but if they don't, it's not a GCC
bug, because it is ***undocumented***.


GCC does not implement its documentation.  The documentation is supposed
to describe (_not_ specify) how GCC behaves, and when the documentation
is wrong or contains omissions, the documentation will have to be fixed.
Not the compiler itself.

And it's not just GCC.  Almost all programs work this way.



It is a sad fact that many programs /are/ written that way - rushed 
coding under a manager's whip until the code works well enough to pass 
some basic tests, then it is shipped to end users.  Documentation, if 
any, is written afterwards with barely a nod towards a specification.


The /correct/ way to write code is to specify first (as formally and 
thoroughly as appropriate for the project), and have everyone agree that 
the specification fulfils the requirements for the program and that it 
is feasible to implement in practice.  /Then/ start writing code.  If 
the specification is user readable, then it forms the basis for the user 
documentation - if not, then user documentation can be written before, 
during or after code development.


For a living and evolving project like GCC, this all works in lots of 
small steps and in practice changes to the code and changes to the 
documented specifications get committed together, to keep them 
synchronised.  But it is always logically a documented feature and code 
that implements that specification.  (The specification can be a bug 
report, not just the user documentation.)



Attempting to do what you describe - look at the behaviour of gcc in 
practice, document it and call it the specification - would be insane. 
I've already tried to explain on this thread how the logical consequence 
of such ideas is total stagnation of the gcc development.  Look at the 
bugzilla database for GCC - it is /huge/.  Every (valid) bug there is a 
case where GCC does not do what it was supposed to do - what it is 
/documented/ to do.  You would have the GCC folks spend their time 
updating the documentation to redefine these bugs as previously 
undocumented features, rather than fixing the bugs in the code, and 
requiring all future versions of gcc to maintain bug-for-bug 
compatibility with the older versions.


Or could it be that you think this only applies to the features that 
/you/, personally, want to keep?  Sod the rest of the world, as long as 
/you/ don't need to fix your code?



I've come across a fair number of C programmers with your attitude 
before.  Generally they don't realise the 

Re: More C type errors by default for GCC 14

2023-05-14 Thread David Brown

On 14/05/2023 07:38, Po Lu via Gcc wrote:


No, all you have to do is to tell GNU CC to compile Standard C.  But
what's being debated here is the behavior of GNU CC when translating
both Standard C and GNU C, so your demonstration is almost completely
pointless.


You keep using the term "Standard C", but you are using it incorrectly.

"Standard C" means "The language C as recognised by standardisation 
bodies".  It is, currently, ISO/IEC 9899:2018 - also known as C17/C18. 
(The standard was completed in C17, but not actually published until C18.)


If you want to refer to older standards (or unpublished future 
standards), you should do so explicitly - C90, C99, C11, C23.


Then there are "extended standards" - specific documented extensions on 
top of an official standard.  "gnu17" would be an example of that.


Any language from before the first ANSI C is /not/ standard, since it is 
not based on any standards document.  The nearest would be the de-facto 
"standard" (anything "de-facto" is, by definition, not an actual 
standard) language described in the first edition of "The C Programming 
Language", and known as "K&R C".  Many of the C compilers of that time 
followed the book and implemented - modulo bugs, extensions, 
inconsistencies, and missing features - the language "K&R C".  Many also 
had their own variants, as "C" was already popular before the book was 
published.



So - if you are referring to "K&R C", then use that term.  It is quite 
well defined, and accurately describes a popular pre-standard C language.


And you may note that if you look at the GCC manual, it supports all 
standards of C (at least as closely as possible).  All /standards/. 
"K&R C" is not a standard, and not officially supported by GCC - there 
has never, to my knowledge, been any kind of guarantee that GCC would 
support pre-standard syntax.  There has only been a guarantee that 
appropriate choices of flags would cause pre-standard syntax to be 
detected and rejected or diagnosed.  Ironically, the discussion here, 
with its suggestions of explicit flags to allow "K&R C" constructs, has 
come closer to guaranteeing support for that pre-standard dialect than 
GCC has ever had before.








(When people refer to "ANSI C", they almost invariably mean the ANSI 
standard from 1989 that formed, after a bit of section renumbering, ISO 
C90.  But it is worth remembering that ANSI actually delegates the C 
standardisation to ISO - so, strictly speaking, "ANSI C" refers to the 
same language as C17/C18.)





Re: [wish] Flexible array members in unions

2023-05-12 Thread David Brown via Gcc

On 12/05/2023 08:16, Richard Biener via Gcc wrote:

On Thu, May 11, 2023 at 11:14 PM Kees Cook via Gcc  wrote:


On Thu, May 11, 2023 at 08:53:52PM +, Joseph Myers wrote:

On Thu, 11 May 2023, Kees Cook via Gcc wrote:


On Thu, May 11, 2023 at 06:29:10PM +0200, Alejandro Colomar wrote:

On 5/11/23 18:07, Alejandro Colomar wrote:
[...]

Would you allow flexible array members in unions?  Is there any
strong reason to disallow them?


Yes please!! And alone in a struct, too.

AFAICT, there is no mechanical/architectural reason to disallow them
(especially since they _can_ be constructed with some fancy tricks,
and they behave as expected.) My understanding is that it's disallowed
due to an overly strict reading of the very terse language that created
flexible arrays in C99.


Standard C has no such thing as a zero-size object or type, which would
lead to problems with a struct or union that only contains a flexible
array member there.


Ah-ha, okay. That root cause makes sense now.


Hmm. but then the workaround

struct X {
   int n;
   union u {
   char at_least_size_one;
   int iarr[];
   short sarr[];
   };
};

doesn't work either.  We could make that a GNU extension without
adverse effects?

Richard.



I would like and use an extension like that (for C and C++) - the 
flexible arrays would act as though they were the same size as the 
size-specific part of the union, rounding up in this case to make the 
alignments correct.


I regularly want something like :

union ProtocolBuffer {
    struct {
        header ...
        data fields ...
    };
    uint8_t raw8[];
    uint32_t raw32[];
};

The "raw" arrays would be used to move data around, or access it from 
communication drivers.  As C (and C++) is defined, I have to split this 
up so that the "raw" arrays can use "sizeof(ProtocolTelegram) / 4" or 
similar expressions for their size.  If flexible arrays in unions were 
allowed here, it could make my code a little neater and use more 
anonymous unions and structs to reduce unhelpful verbosity.
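The split-up version that standard C does force today looks roughly like 
this (type and member names invented for the sketch):

#include <stdint.h>

struct ProtocolData {
    uint32_t header;
    uint32_t fields[4];   /* stand-in for the real layout */
};

union ProtocolBuffer {
    struct ProtocolData data;
    uint8_t  raw8 [sizeof(struct ProtocolData)];
    uint32_t raw32[sizeof(struct ProtocolData) / 4];
};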






Why are zero-sized objects missing in Standard C? Or, perhaps, the better
question is: what's needed to support the idea of a zero-sized object?

--
Kees Cook







Re: More C type errors by default for GCC 14

2023-05-12 Thread David Brown via Gcc

On 12/05/2023 04:08, Po Lu via Gcc wrote:

Eli Schwartz  writes:





Because that's exactly what is going on here. Features that were valid
C89 code are being used in a GNU99 or GNU11 code file, despite that
***not*** being valid GNU99 or GNU11 code.


How GCC currently behaves defines what is valid GNU C.



What GCC /documents/ defines what is valid GNU C.  (Much of that is, of 
course, imported by reference from the ISO C standards, along with 
target-specific details such as ABI's.)


Anything you write that relies on undocumented behaviour may work by 
luck, not design, and you have no basis for expecting future versions of 
gcc, or any other compiler, to give the same lucky results.


Each version of a compiler is, in fact, a different compiler - that is 
how you should be viewing your tools.  The move between different 
versions of the same compiler is usually much smaller than moving 
between different compiler vendors, but you still look at the release 
notes, change notices, porting information, etc., before changing.  You 
still make considered decisions, and appropriate testing.  You still 
check your build systems and modify flags if needed.  And you do that 
even if you are confident that your code is solid with fully defined 
behaviour and conforming to modern C standards - you might have a bug 
somewhere, and the new compiler version might have a bug.




I am not dictating anything to you or anyone else in this paragraph,
though? All I said was that if one writes a c89 program and tells the
compiler that, then they will not even notice this entire discussion to
begin with.

What, precisely, have I dictated?


That people who are writing GNU C code should be forced to rewrite their
code in ANSI C, in order to make use of GNU C extensions to the 1999
Standard.



You are joking, right?  Surely no one can /still/ be under the 
misapprehension that anyone is proposing GCC stop accepting the old 
code?  All that is changing is the default behaviour, which will mean 
some people might have to use an extra flag or two in their build setup.





However, it does appear that we are still stuck in confusion here,
because you think that GCC is no longer able to compile such code, when
in fact it is able to.


It won't, not by default.



That's pretty much irrelevant.  People don't use gcc without flags.  The 
only thing that will change is which flags you need to use.


If you are not in a position to change the source code, and not in a 
position to change the build flags, then you are not in a position to 
change the compiler version.  (That's fine, of course - in my line of 
work, I almost never change compiler version for existing projects.  I 
have old code where the makefile specifies gcc 2.95.)


David




Re: More C type errors by default for GCC 14

2023-05-11 Thread David Brown via Gcc

On 11/05/2023 04:09, Po Lu via Gcc wrote:

jwakely@gmail.com (Jonathan Wakely) writes:


So let's do it. Let's write a statement saying that the GCC developers
consider software security to be of increasing importance, and that we
consider it irresponsible to default to accepting invalid constructs in the
name of backwards compatibility. State that we will make some changes which
were a break from GCC's traditional stance, for the good of the ecosystem.


I'm sorry you think that way.


Given recent pushes to discourage or outright ban the use of memory-safe
languages in some domains, I think it would be good to make a strong
statement about taking the topic seriously. And not just make a statement,
but take action too.

If we don't do this, I believe it will harm GCC in the long run. The vocal
minority who want to preserve the C they're used to, like some kind of
historical reenactment society, would get their wish: it would become a
historical dead end and go nowhere.


Vocal minority? Do you have any evidence to back this claim?

What I see is that some reasonable organizations have already chosen
other C compilers which are capable of supporting their existing large
bodies of C code that have seen significant investment over many years,
while others have chosen to revise their C code with each major change
to the language.

The organizations which did not wish to change their code did not
vocally demand changes to GCC after GCC became unsuitable, but quietly
arranged to license other compilers.

Those that continue to write traditional C code know what they are doing,
and the limitations of traditional C do not affect the quality of their
code.  For example, on the Unix systems at my organization, the SGS is
modified so that it will not link functions called through a declaration
with no parameter specification with a different set of parameters than
it was defined with.

Naturally, the modified linker is not used to run configure scripts.



Let's be absolutely clear here - gcc has been, and will continue to be, 
able to compile code according to old and new standards.  It can handle 
K&R C, right through to the cutting edge of the newest C and C++ standards. 
It can handle semantic requirements such as two's complement wrapping 
and "anything goes" pointer type conversions - features that a lot of 
old code relies on but which are not documented or guaranteed behaviour 
for the vast majority of other compilers.  It can handle all these 
things - /if/ you pick the correct flags.


With the proposed changes, you can still compile old K&R code with gcc - 
if you give it the right flags.  No features are being removed - only 
the default flags are being changed.  If anyone is changing from gcc to 
other compilers because they think newer gcc does not support older 
code, then they are perhaps doing so from ignorance.


If some users are willing to change to different compilers, but 
unwilling to learn or use new flags in order to continue using their 
existing compiler after it changes its defaults, then perhaps gcc could 
pick different defaults depending on the name used for the executable? 
If it is invoked with the name "gcc-kr", then it could accept K&R code 
by default and have "-std=gnu90" (I believe that's the oldest standard 
option).  If it is invoked as "gcc", then it would reject missing 
function declarations, implicit int, etc., as hard errors.


Then these users could continue to use gcc, and their "new" compiler to 
handle their old code would be nothing more than a symbolic link.


David







Re: More C type errors by default for GCC 14

2023-05-11 Thread David Brown

On 10/05/2023 18:28, Eli Zaretskii via Gcc wrote:

Date: Wed, 10 May 2023 17:58:16 +0200
From: David Brown via Gcc 


In any case, I was not not talking about bug-compatibility, I was
talking about being able to compile code which GCC was able to compile
in past versions.  Being able to compile that code is not a bug, it's
a feature.


No, being able to compile /incorrect/ code by default is a bug.  It is
not helpful.


I actually agree; we just have different definitions of "incorrect".



Fair enough.


I've seen this kind of argument many times - "The compiler used to
accept my code and give the results I wanted, and now newer compiler
versions make a mess of it".


But we are not talking about some random code that just happened to
slip through cracks as a side effect of the particular implementation.
We are talking about code that was perfectly valid, had well-defined
semantics, and produced a working program.  I don't care about the
former; I do care about the latter.


How would you know that the code had perfectly valid, well-defined 
semantics?


I've had the dubious pleasure of trying to maintain and update code 
where the previous developer had a total disregard for things like 
function declaration - he really did dismiss compiler complaints about 
implicit function declarations as "only a warning".  The program worked 
as he expected, for the most part.  But that was despite many functions 
being defined in one part of the code with one set of parameters (number 
and type), and called elsewhere with a different set - sometimes more 
than one selection of parameter types in the same C file.
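
A minimal sketch of the kind of mismatch I mean, with made-up names - 
the implicit declaration hides it completely:

    /* file1.c */
    double scale(double x) { return 2.0 * x; }

    /* file2.c - no prototype in scope, so pre-C99 rules silently
       invent "int scale()"; the int argument and the int return type
       both disagree with the definition, and the call is undefined
       behaviour. */
    int use(void)
    {
        return scale(3);
    }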


Insisting on proper function declarations and removing implicit int does 
not guarantee that such messes won't happen - but it /does/ reduce some 
of the opportunities to do so accidentally.





If the gcc developers really were required to continue to compile /all/
programs that compiled before, with the same results, then the whole gcc
project can be stopped.


You will have to explain this to me.  Just stating this is not enough.
How will accepting K&R stop GCC development?



People write incorrect code all the time.  Studies have shown that 
pretty much any sizeable C or C++ program - including gcc itself - 
contain undefined behaviour but rely on particular results from that. 
Lax interpretation and long outdated syntaxes do not, in themselves, 
imply undefined behaviour or incorrect code - but they make it far 
easier to accidentally have such errors, and to cover up such errors. 
(That's why they were removed in the first place.)


If a compiler is required to continue to compile every program that a 
previous version compiled, where the developer was satisfied that the 
program worked as expected, then the only way to guarantee that is to 
stop changing and improving gcc.


I agree that accepting fully correct programs written in K&R C would not 
limit the future development of gcc.  But how many programs written in 
K&R C, big enough and important enough to be relevant today, are fully 
correct?  I'd be surprised if you needed more than one hand to count 
them.  I would expect subtle errors and assumptions to flourish - with 
typical examples being signed integer arithmetic overflows and abuses of 
pointer casts and invalid mixing of pointer and integer types.


Continuing to give developers what they expect, rather than what the 
standards (and gcc extensions) guarantee, is always an issue for 
backwards compatibility.  Each new version of gcc can, and sometimes 
does, "break" old code - code that people relied on before, but was 
actually incorrect.  This is unavoidable if gcc is to progress.


That is why I suggested that a flag such as "-fold-code" that enables 
long outdated syntaxes should also disable the kind of optimisations 
that are most likely to cause issues with old code, and should enable 
semantics changes to match likely assumptions in such code.  I don't 
believe in the existence of correct K&R C code - but I /do/ believe in 
the importance of some K&R C code despite its errors.




As for the two's complement wrapping example: I'm okay with having
this broken because some useful feature requires to modify the basic
arithmetics and instructions emitted by GCC in a way that two's
complement wrapping can no longer be supported.  _That_ is exactly an
example of a "good reason" for backward incompatibility: GCC must do
something to compile valid programs, and that something is
incompatible with old programs which depended on some de-facto
standard that is nowadays considered UB.  


The problem for backwards compatibility and continuing to compile old 
code is that these things were /always/ UB - but they were, as you say, 
viewed as de-facto.



But the case in point is not
like that, AFAIU: in this case, GCC will deliberately break a program
although it could compile it without adversely affecting its output
for any othe

Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 16:39, Eli Zaretskii via Gcc wrote:

Date: Wed, 10 May 2023 15:30:02 +0200
From: David Brown via Gcc 


If some developers want to ignore warnings, it is not the business of
GCC to improve them, even if you are right in assuming that they will
not work around errors like they work around warnings (and I'm not at
all sure you are right in that assumption).  But by _forcing_ these
errors on _everyone_, GCC will in effect punish those developers who
have good reasons for not changing the code.


What would those "good reasons" be, in your opinion?


For example, something that adversely affects GCC itself and its
ability to compile valid programs.


If gcc itself contains code that relies on outdated features, these 
should be fixed in the gcc source code.  It is one thing to suggest that 
a project that has been "maintenance only" for several decades cannot 
reasonably be updated, but that does not apply to current programs like gcc.





On the other hand, continuing to accept old, outdated code by lax
defaults is punishing /current/ developers and users.  Why should 99.99%
of current developers have to enable extra errors to catch mistakes (and
we all make occasional mistakes in our coding - so they /should/ be
enabling these error flags)?


Adding a flag to a Makefile is infinitely easier than fixing old
sources in a way that they produce the same machine code.



The suggestion has been - always - that support for old syntaxes be 
retained.  But that flag should be added to the makefiles of the 0.01% 
of projects that need it because they have old code - not the 99.99% of 
projects that are written (or updated) this century.



I do agree that backwards compatibility breaks should only be done for
good reasons.  But I think the reasons are good.


Not good enough, not for such a radical shift in the balance between
the two groups.



Do you have any reason to believe that the old code group is of relevant 
size?  I think it is quite obvious that I have been pulling percentages 
out of thin air, but can you justify claiming anything different?


I mean, if gcc simply added a default "-Werror=implicit" flag in the 
release candidate for gcc-14, how many people do you think would 
actually complain?  I'd guess that there would be far fewer complaints 
than there are posts in this thread discussing whether or not it's a 
good idea.




And no,
educating/forcing GCC users to use more modern dialect of C is not a
good reason.



Yes, it /is/ a good reason.


Not for a compiler.  A compiler is a tool, it is none of its business
to teach me what is and what isn't a good dialect in each particular
case.  Hinting on that, via warnings, is sufficient and perfectly
okay, but _forcing_ me is not.


Again - did you miss the point about people who really want to work with 
old code can do so, by picking the right flag(s) ?





Consider why Rust has become the modern fad in programming.  People
claim it is because it is inherently safer than C and C++.  It is not.
There are really two reasons for it appearing to be safer.  One is that
the /defaults/ for the tools, and the language idioms, are safer than
the /defaults/ for C and C++ tools.  That makes it harder to make
mistakes.  The other is that it has no legacy of decades of old code and
old habits, and no newbie programmers copying those old styles.


Exactly.  We cannot reasonably expect that a compiler which needs to
support 50 years of legacy code to be as safe as a compiler for a
language invented yesterday afternoon.  People who want a safe
programming environment should not choose C as their first choice.



We cannot expect a /language/ with a 50 year history to be as safe as a 
modern one.  But we can expect a /compiler/ released /today/ to be as 
safe as it can be made /today/.


I agree that C is not the best choice of language for many people. 
Actually, I'd say that most people who program in C would be better off 
programming in something else.  And most programs that are written in C 
could be better in a different language.  But when C /is/ the right 
choice - or even when it is the choice made despite being the wrong 
choice, I want it to be /good/ C, and I want tools to help out there as 
best they possibly can.  That includes good default flags, because not 
all gcc users are experts on gcc flags.


My ideal, actually, would be that gcc has "-Wall -Wextra" by default, 
trying to help developers from the get-go.  It should also have an flag 
"-sep" that disables all warnings and uses lax modes, for people using 
it to build software provided by others and they want nothing to do with 
the source code.  But of course that is not the ideal situation for 
everyone else!


(See <https://en.wikipedia.org/wiki/Somebody_else%27s_problem> for an 
explanation behind the "-sep" flag.)




So yes, anything that pushes C programmers into being better C
programmers is worth consi

Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 16:14, Eli Zaretskii via Gcc wrote:

Date: Wed, 10 May 2023 14:41:27 +0200
Cc: jwakely@gmail.com, fwei...@redhat.com, gcc@gcc.gnu.org,
  ar...@aarsen.me
From: Gabriel Ravier 


Because GCC is capable of compiling it.

That is not a good argument.  GCC is capable of compiling any code in all
the reported accepts-invalid bugs on which it doesn't ICE.  That doesn't
mean those bugs shouldn't be fixed.

Fixing those bugs, if they are bugs, is not the job of the compiler.
It's the job of the programmer, who is the one that knows what the
code was supposed to do.  If there's a significant risk that the code
is a mistake or might behave in problematic ways, a warning to that
effect is more than enough.


Are you seriously saying that no accepts-invalid bug should ever be
fixed under any circumstances on the basis that some programmers might
rely on code exploiting that bug ??


Sorry, I'm afraid I don't understand the question.  What are
"accepts-invalid bugs"?



They are cases where the C standards (plus documented gcc extensions) 
have syntax or constraint requirements, but code which breaks these 
requirements is accepted by the compiler.  For example, if the compiler 
accepted "long long long int" as a type, that would be an 
"accepts-invalid" bug.  They are important for two reasons.  One is that 
they mean the compiler fails to help the developer catch the mistake in 
their code.  The other is that the code might have an inconsistent 
interpretation, and that might change in the future.  In the 
hypothetical example of a three-long int, a current compiler might treat 
it as a "long long", while a future standard might add support for it as 
a new type with minimum 128-bit size.



In any case, I was not not talking about bug-compatibility, I was
talking about being able to compile code which GCC was able to compile
in past versions.  Being able to compile that code is not a bug, it's
a feature.



No, being able to compile /incorrect/ code by default is a bug.  It is 
not helpful.


(The compiler cannot, of course, spot /all/ mistakes - the gcc 
developers are a smart group, but I think asking them to solve the 
Halting Problem is a bit much!)


I've seen this kind of argument many times - "The compiler used to 
accept my code and give the results I wanted, and now newer compiler 
versions make a mess of it".  The cause is almost invariably undefined 
behaviour, but it can occasionally be through changes to the standards 
such as removal of old behaviour or other differences in the 
interpretation of code (there were a number of incompatibilities between 
K&R and C90, and between C90 and C99).


The compiler is under /no/ obligation to compile undefined behaviour in 
the same way as it might have done for a particular piece of code.  It 
is under /no/ obligation to continue to accept incorrect or invalid 
code, just because it used to accept it.  It /is/ - IMHO - under an 
obligation to do what it can to help spot problems in code and help 
developers get good quality correct code in the end.  If it fails to do 
that, people will, and should, move to using different tools.


New compiler versions are not required to do two's complement wrapping 
of signed integer overflow, even though old broken code might have been 
written under the assumption that it did and even though older, less 
powerful versions of the compiler might have compiled that code into 
something the developer wanted.  In the same way, new compiler versions 
are not required to support syntax that has been dead for decades - at 
least not by default.  (Unlike most other compilers, gcc developers go 
far out of their way to support such outdated and incorrect code - all 
they ask is that people use non-default flags to get such non-standard 
syntax and semantics.)


If the gcc developers really were required to continue to compile /all/ 
programs that compiled before, with the same results, then the whole gcc 
project can be stopped.  The only way to ensure perfect backwards 
compatibility would be to stop development, and no longer release any 
new versions of the compiler.  That is the logical consequence of "it 
used to compile (with defaults or a given set of flags), so it should 
continue to compile (with these same flags)" - assuming "compile" here 
means "giving the same resulting behaviour in the executable" rather 
than just "giving an executable that may or may not work".


Clearly, you don't mean gcc development should stop.  That means a line 
must be drawn, and some code that compiled with older gcc will not 
compile with newer gcc.  The only question is where the line should be.







Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 15:10, Basile Starynkevitch wrote:

Hello all,

After a suggestion by Eric Gallager

Idea for a compromise: What if, instead of flipping the switch on all
3 of these at once, we staggered them so that each one becomes a
default in a separate release? i.e., something like:

- GCC 14: -Werror=implicit-function-declaration gets added to the 
defaults

- GCC 15: -Werror=implicit-int gets added to the defaults
- GCC 16: -Werror=int-conversion gets added to the defaults

That would give people more time to catch up on a particular warning,
rather than overwhelming them with a whole bunch all at once. Just an
idea.


Eli Zaretskii  wrote on 10 may 2023, at 14:00


And that is just one example of perfectly valid reasons for not
wanting or not being able to make changes to pacify GCC.

Once again, my bother is not about "villains" who don't want to get
their act together, my bother is about cases such as the one above,
where the developers simply have no practical choice.

And please don't tell me they should use an older GCC, because as
systems go forward and are upgraded, older GCC will not work anymore.



My experience is that for safety critical software (per DO-178C, 
embedded in aircrafts, or for the French covid breathing machine on 
https://github.com/Recovid/Controller ) the regulations, funders, and 
authorities requires a very specific version of GCC with very specific 
compilation flags.



Changing either the compiler (even from gcc-12.1 to gcc-12.2) or the 
compilation flags (even changing -O1 by -O2) requires written (on paper) 
approval by a large number of human persons, and formal certifications 
(eg ISO9001, ISO27001 procedures) and lots of checks and headaches.



I do know several persons making their living of these constraints.

I do know several corporations making a living from them (and keeping 
decade older GCC compiler binaries on many disks).


So I really think that for safety critical software (whose failure may 
impact lives) people are using an older (and well specified) GCC.



Of course, to compile an ordinary business web service (e-shop for 
clothes) with e.g. libonion (from https://github.com/davidmoreno/onion 
...) or to compile a zsh.org from source code (for or on a developer's 
laptop) the constraints are a lot lighter.


Regards!



In my line of work (small-systems embedded programming), the source for 
a program does not just include the C source code.  It includes the 
build system, compiler version, the flags used, and the library used - 
everything that can affect the resulting binary.  I realise I am far 
more paranoid about that kind of thing than the majority of developers, 
but it is also noteworthy that there is a trend towards reproducible 
builds in more mainstream development.


The oldest gcc I have on my machine is 2.95.3 for the 68k, from 1998.  I 
have some older compilers, but they are not gcc.


I wouldn't say I made a living out of this, but I have had a customer 
who was very happy that I could make a fix in a program I wrote 20 years 
previously, and could compile it with exactly the same tools as I used then.


One of the reasons I use gcc (in a world where companies are willing to 
pay $5000 for tools from the likes of Green Hills) is that I can keep 
the old versions around, and copy and use them at will.


And for those that are more demanding than me, they can of course 
archive the sources for gcc (and other parts of the toolchain).


Those that /really/ need old versions of the toolchain, can use old 
versions of the toolchain.  And if gcc 14 changes in such a way that 
distro maintainers can't use it to build ancient packages, then they 
should make gcc-13 a part of their base packages as well as current gcc, 
and ship gcc version 13 for as long as they ship "ed", "rn" and other 
software from the middle ages.






Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 14:22, Eli Zaretskii via Gcc wrote:

From: Jonathan Wakely 
Date: Wed, 10 May 2023 12:49:52 +0100
Cc: David Brown , gcc@gcc.gnu.org


If some developers want to ignore warnings, it is not the business of
GCC to improve them, even if you are right in assuming that they will
not work around errors like they work around warnings (and I'm not at
all sure you are right in that assumption).  But by _forcing_ these
errors on _everyone_, GCC will in effect punish those developers who
have good reasons for not changing the code.


What would those "good reasons" be, in your opinion?  (I realise I am 
asking you to be speculative and generalise.  This discussion is an 
exchange of opinions, thoughts, experiences and impressions.)


Frankly, the most common "good" reason for a developer not changing 
their code from pre-C99 is that they retired long ago.  And people 
should definitely question whether the code should be kept.


As I noted in another post, it is entirely reasonable to suspect that 
such old code has errors - unwarranted assumptions that were considered 
appropriate back in the days when such code techniques were considered 
appropriate.  It has always been the unfortunate case with C programming 
that getting optimal results for some compilers has sometimes involved 
"cheating" a bit, such as assuming wrapping signed arithmetic or 
converting pointer types and breaking the "strict aliasing" rules.


Changing the gcc defaults and requiring old code to use flags that allow 
old constructs but limiting optimisations is not /punishing/ the old 
code or its developers or maintainers.  It is /supporting/ it - allowing 
it to be used more safely with modern tools.



On the other hand, continuing to accept old, outdated code by lax 
defaults is punishing /current/ developers and users.  Why should 99.99% 
of current developers have to enable extra errors to catch mistakes (and 
we all make occasional mistakes in our coding - so they /should/ be 
enabling these error flags)?  Why should they have to deal with other 
people's code that was badly written 30 years ago?  Is it really worth 
it, just so that a half-dozen maintainers at Linux distributions can 
recompile the 40-year old source for "ed" without adding a flag to the 
makefile?



Ultimately, /someone/ is going to suffer - a compiler can't have good 
defaults for current developers and simultaneously good defaults for 
ancient relics.  The question to consider is not whether we "punish" 
someone, but /whom/ do we punish, and what is the best balance overall 
going forward.





There will be options you can use to continue compiling the code
without changing it. You haven't given a good reason why it's OK for
one group of developers to have to use options to get their desired
behaviour from GCC, but completely unacceptable for a different group
to have to use options to get their desired behaviour.

This is just a change in defaults.


A change in defaults that is not backward-compatible should only be
done for very good reasons, because it breaks something that was
working for years.  No such good reasons were provided.  


I'm sorry, but I believe I /did/ provide good reasons.  Granted, they 
were in more than one post.  And many others here have also given many 
good reasons.  At the very least, making a safer and more useful 
compiler that helps developers make better code is a good reason, as is 
making a C compiler that is closer to standards compatibility by default.


I do agree that backwards compatibility breaks should only be done for 
good reasons.  But I think the reasons are good.




And no,
educating/forcing GCC users to use more modern dialect of C is not a
good reason.



Yes, it /is/ a good reason.  But I suppose that one is a matter of opinion.

I encourage you to look at CERT/CC, or other lists of code errors 
leading to security issues or functional failures.  When someone writes 
poor code, lots of people suffer.  Any initiative that reduces the 
likelihood of such errors getting into the wild is not just good for gcc 
and its users, it's good for the whole society.


Consider why Rust has become the modern fad in programming.  People 
claim it is because it is inherently safer than C and C++.  It is not. 
There are really two reasons for it appearing to be safer.  One is that 
the /defaults/ for the tools, and the language idioms, are safer than 
the /defaults/ for C and C++ tools.  That makes it harder to make 
mistakes.  The other is that it has no legacy of decades of old code and 
old habits, and no newbie programmers copying those old styles.  Rust 
code is written in modern development styles, with a care for 
correctness rather than getting maximum efficiency from limited 
old-fashioned tools or macho programming.  The only reason there is any 
sense in re-writing old programs in Rust is because re-writing them in 
good, clear, modern C (or C+

Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 09/05/2023 22:13, David Edelsohn via Gcc wrote:

On Tue, May 9, 2023 at 3:22 PM Eli Zaretskii via Gcc 
wrote:


Date: Tue, 9 May 2023 21:07:07 +0200
From: Jakub Jelinek 
Cc: Jonathan Wakely , ar...@aarsen.me,

gcc@gcc.gnu.org


On Tue, May 09, 2023 at 10:04:06PM +0300, Eli Zaretskii via Gcc wrote:

From: Jonathan Wakely 
Date: Tue, 9 May 2023 18:15:59 +0100
Cc: Arsen Arsenović , gcc@gcc.gnu.org

On Tue, 9 May 2023 at 17:56, Eli Zaretskii wrote:


No one has yet explained why a warning about this is not enough, and
why it must be made an error.  Florian's initial post doesn't explain
that, and none of the followups did, although questions about whether
a warning is not already sufficient were asked.

That's a simple question, and unless answered with valid arguments,
the proposal cannot make sense to me, at least.


People ignore warnings. That's why the problems have gone unfixed for
so many years, and will continue to go unfixed if invalid code keeps
compiling.


People who ignore warnings will use options that disable these new
errors, exactly as they disable warnings.  So we will end up not


Some subset of them will surely do that.  But I think most people will
just fix the code when they see hard errors, rather than trying to work
around them.


The same logic should work for warnings.  That's why we have warnings,
no?



This seems to be the core tension.  If developers cared about these issues,
they would enable appropriate warnings and -Werror.



-Werror is a /big/ stick.  An unused parameter message might just be an 
indication that the programmer isn't finished with that bit of code, and 
a warning is fine.  An implicit function declaration message shows a 
clear problem in the code - a typo in the function call, a missing 
#include, or a major flaw in the design and organisation of the code.


The C language takes backwards compatibility more seriously than any 
other programming language.  When the C standards mark previously 
acceptable features as deprecated, obsolescent, or constrain errors, it 
is done for very good reasons.  People should not be writing code with 
implicit int, or non-prototype function declarations.  Such mis-features 
of the language were outdated 30 years ago.



The code using these idioms is not safe and does create security
vulnerabilities.  And software security is increasingly important.

The concern is using the good will of the GNU Toolchain brand as the tip of
the spear or battering ram to motivate software packages to fix their
problems. It's using GCC as leverage in a manner that is difficult for
package maintainers to avoid.  Maybe that's a necessary approach, but we
should be clear about the reasoning.  Again, I'm not objecting, but let's
clarify why we are choosing this approach.



There are two problems I see with the current state of affairs, where 
deeply flawed code can be accepted (possibly with warnings) by gcc by 
default.


1. Modern developers who are not particularly well versed in the 
language write code with these same risky features.  It is depressing 
how many people think "The C Programming Language" (often a battered 
first edition) is all you need for learning C programming.  Turning more 
outdated syntax and more obvious mistakes into hard errors will help 
such developers - and help everyone who has to use the code they make.



2. Old code gets compiled with with modern tools that do not fulfil the 
assumptions made by the developer decades ago.  Compiling such code with 
modern gcc risks all sorts of problems due to the simpler compilation 
models of older tools.  For example, the code might assume two's 
complement wrapping arithmetic, or that function calls always act as a 
memory barrier.
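
A small sketch of the first kind of assumption (the pattern, not any 
particular program):

    /* Reliable only with -fwrapv, or on old compilers that happened
       to wrap; with modern optimisation the compiler may assume
       signed overflow cannot happen and fold the test to 0. */
    int will_overflow(int a)
    {
        return a + 1 < a;
    }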



My suggestion would be to have a flag "-fold-code" that would do the 
following (at a minimum) :


* Disallow higher optimisation flags.
* Force -fwrapv, -fno-strict-aliasing, -fno-inline.
* Require an explicit "-std=" selection.
* Allow old-style syntax, such as implicit int, with just a warning

If the "-fold-code" is /not/ included, then old, deprecated or 
obsolescent syntax would be a hard error that cannot be turned off or 
downgraded to a warning by flags.  A substantial subset of -Wall 
warnings would be enabled automatically.  (I think the "unused" warnings 
should not be included, for example.)



Distributions and upstream code maintainers should be pushed towards 
either fixing and updating their code, or marking it as "-fold-code" if 
it is too outdated to modernise without a major re-write.  This might be 
painful during the transition, but waiting longer just makes the 
situation worse.


(I'm a long-term gcc user, but not a gcc developer.  I'm fully aware 
that I am asking others to do a lot of work here, but I think something 
of this sort is important going forward.)



David







Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 09/05/2023 21:04, Eli Zaretskii via Gcc wrote:

From: Jonathan Wakely 
Date: Tue, 9 May 2023 18:15:59 +0100
Cc: Arsen Arsenović , gcc@gcc.gnu.org

On Tue, 9 May 2023 at 17:56, Eli Zaretskii wrote:


No one has yet explained why a warning about this is not enough, and
why it must be made an error.  Florian's initial post doesn't explain
that, and none of the followups did, although questions about whether
a warning is not already sufficient were asked.

That's a simple question, and unless answered with valid arguments,
the proposal cannot make sense to me, at least.


People ignore warnings. That's why the problems have gone unfixed for
so many years, and will continue to go unfixed if invalid code keeps
compiling.


People who ignore warnings will use options that disable these new
errors, exactly as they disable warnings.  So we will end up not
reaching the goal, but instead harming those who are well aware of the
warnings.



My experience is that many of the people who ignore warnings are not 
particularly good developers, and not particularly good at 
self-improvement.  They know how to ignore warnings - the attitude is 
"if it really was a problem, the compiler would have given an error 
message, not a mere warning".  They don't know how to disable error 
messages, and won't bother to find out.  So they will, in fact, be a lot 
more likely to fix their code.




IOW, if we are targeting people for whom warnings are not enough, then
we have already lost the battle.  Discipline cannot be forced by
technological means, because people will always work around.



Agreed.  But if we can make it harder for them to release bad code, 
that's good overall.


Ideally, I'd like the compiler to email such people's managers with a 
request that they be sent on programming courses!






Re: Is it possible to enable data sections and function sections without explicitly giving flags "-fdata-sections" and "-ffunction-sections"?

2023-03-19 Thread David Brown

On 19/03/2023 13:38, 3119369616.qq via Gcc wrote:

To divide functions into sections and then remove unused sections, I
must provide flags "-fdata-sections" and "-ffunction-sections" in GCC
and a flag "--gc-sections" in LD. Most of the build systems don't
support these flags so GCC will generate bigger binaries. Is it
possible to enable this feature without giving any command line
flags manually?


Just to be clear here - removing unused sections is only beneficial if 
you have a significant amount of unused code and data compiled in the 
build.  That can sometimes be the case, if you are making re-usable 
library code.  But for other types of code, it is better to be clear in 
the source code about what is and is not part of the program - i.e., if 
the function is not used by the program, it should not be in the source 
code.  I don't know what kind of code you are working on, but it's worth 
considering.


Re: Please, really, make `-masm=intel` the default for x86

2022-11-25 Thread David Brown

On 25/11/2022 07:39, LIU Hao via Gcc wrote:
I am a Windows developer and I have been writing x86 and amd64 assembly 
for more than ten years. One annoying thing about GCC is that, for x86 
if I need to write a piece of inline assembly then I have to do it 
twice: once in AT&T syntax and once in Intel syntax.



The AT&T syntax is an awkward foreign dialect, designed originally for 
PDP-11 and spoken by bumpkins that knew little about x86 or ARM. No 
official Intel or AMD documentation ever adopts it. The syntax is 
terrible. Consider:


    movl $1, %eax  ; moves $1 into EAX
                   ; but in high-level languages we expect '%eax = $1',
                   ; so it goes awkwardly backwards.

If this looks fine to you, please re-consider:

   cmpl $1, %eax
   jg .L1  ; does this mean 'jump if $1 is greater than %eax'
   ; or something stupidly reversed?

If CMP still looks fine to you, please consider how to write VFMADD231PD 
in AT&T syntax, really.



I have been tired of such inconsistency. For God's sake, please 
deprecate it.





You can have all the personal preferences or prejudices you want, but 
that won't change the fact that AT&T syntax was the standard x86 
assembly from long before Intel thought of making their own syntax, and 
it is here to stay.  No one is going to deprecate it, remove it, or 
change any defaults.



#include <stdio.h>

int main(void)
{
    int temp = 0;

    /* Note: newline separators are needed between the directives and
       the instruction, and the register is fixed with the "a"
       constraint because gcc would print "%0" in AT&T form here. */
    asm(".intel_syntax noprefix\n\t"
        "mov eax, 1\n\t"
        ".att_syntax prefix"
        : "=a"(temp)
        : /* no input */
    );
    printf("temp=%d\n", temp);
}


A useful feature that could be added to gcc, perhaps, would be a way to let 
the user specify the assembler dialect as part of the "asm" statement:


asm __attribute__((masm = "intel")) ( ... )

The idea with this is that it would issue the requested ".intel_syntax 
noprefix" or ".att_syntax" at the start of the assembly, and the 
appropriate directive to return to normal syntax at the end - adjusting 
according to the "-masm" setting for the compilation.  This would, I 
think, let people write the assembly once in the syntax they choose, and 
have it work smoothly regardless of which syntax is chosen for compilation.
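
Incidentally, if I remember the extended asm documentation correctly, 
gcc already has a related mechanism - "asm dialect alternatives" - 
where one template carries both syntaxes in {att|intel} pairs and the 
-masm option picks one.  A minimal sketch (treat the details as 
unchecked):

    int temp;
    asm("{movl $1, %0|mov %0, 1}" : "=r"(temp));

The drawback is that you still write the assembly twice, which is 
exactly what the attribute idea above would avoid.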




Re: [BUG] -Wuninitialized: initialize variable with itself

2022-11-14 Thread David Brown via Gcc




On 14/11/2022 16:10, NightStrike wrote:



On Mon, Nov 14, 2022, 04:42 David Brown via Gcc 



Warnings are not perfect - there is always the risk of false positives
and false negatives.  And different people will have different ideas
about what code is perfectly reasonable, and what code is risky and
should trigger a warning.  Thus gcc has warning flag groups (-Wall,
-Wextra) that try to match common consensus, and individual flags for
personal fine-tuning.

Sometimes it is useful to have a simple way to override a warning in
code, without going through "#pragma GCC diagnostic" lines (which are
powerful, but not pretty).

So if you have :

         int i;
         if (a == 1) i = 1;
         if (b == 1) i = 2;
         if (c == 1) i = 3;
         return i;

the compiler will warn that "i" may not be initialised.  But if you
/know/ that one of the three conditions will match (or you don't care
what "i" is if it does not match), then you know your code is fine and
don't want the warning.  Writing "int i = i;" is a way of telling the
compiler "I know what I am doing, even though this code looks dodgy,
because I know more than you do".

It's just like writing "while ((*p++ = *q++));", or using a cast to
void
to turn off an "unused parameter" warning.


Wouldn't it be easier, faster, and more obvious to the reader to just 
use "int i = 0"? I'm curious what a real world use case is where you 
can't do the more common thing if =0.




You can write "int i = 0;" if you prefer.  I would not, because IMHO 
doing so would be wrong, unclear to the reader, less efficient, and 
harder to debug.


In the code above, the value returned should never be 0.  So why should 
"i" be set to 0 at any point?  That's just an extra instruction the 
compiler must generate (in my line of work, my code often needs to be 
efficient).  More importantly, perhaps, it means that if you use 
diagnostic tools such as sanitizers you are hiding bugs from them 
instead of catching them - a sanitizer could catch the case of "return 
i;" when "i" is not set.


(I don't know if current sanitizers will do that or not, and haven't 
tested it, but they /could/.)


But I'm quite happy with :

int i = i;  // Self-initialise to silence warning

I don't think there is a "perfect" solution to cases like this, and 
opinions will always differ, but self-initialisation seems a good choice 
to me.  Regardless of the pros and cons in this particular example, the 
handling of self-initialisation warnings in gcc is, AFAIUI, to allow 
such code for those that want to use it.





Re: [BUG] -Wuninitialized: initialize variable with itself

2022-11-14 Thread David Brown via Gcc

On 13/11/2022 19:43, Alejandro Colomar via Gcc wrote:

Hi Andrew!

On 11/13/22 19:41, Andrew Pinski wrote:

On Sun, Nov 13, 2022 at 10:40 AM Andrew Pinski  wrote:


On Sun, Nov 13, 2022 at 10:36 AM Alejandro Colomar via Gcc
 wrote:


Hi,

While discussing some idea for a new feature, I tested the following 
example

program:


  int main(void)
  {
  int i = i;
  return i;
  }


This is NOT a bug but a documented way of having the warning not 
being there.
See 
https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Warning-Options.html#index-Winit-self 

https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Warning-Options.html#index-Wuninitialized 


"If you want to warn about code that uses the uninitialized value of
the variable in its own initializer, use the -Winit-self option."


I should note the main reason why I Know about this is because I fixed
this feature years ago (at least for C front-end)
and added the option to disable the feature.


I'm curious: what are the reasons why one would want to disable such a 
warning?

Why is it not in -Wall or -Wextra?

Thanks,

Alex



Warnings are not perfect - there is always the risk of false positives 
and false negatives.  And different people will have different ideas 
about what code is perfectly reasonable, and what code is risky and 
should trigger a warning.  Thus gcc has warning flag groups (-Wall, 
-Wextra) that try to match common consensus, and individual flags for 
personal fine-tuning.


Sometimes it is useful to have a simple way to override a warning in 
code, without going through "#pragma GCC diagnostic" lines (which are 
powerful, but not pretty).


So if you have :

int i;
if (a == 1) i = 1;
if (b == 1) i = 2;
if (c == 1) i = 3;
return i;

the compiler will warn that "i" may not be initialised.  But if you 
/know/ that one of the three conditions will match (or you don't care 
what "i" is if it does not match), then you know your code is fine and 
don't want the warning.  Writing "int i = i;" is a way of telling the 
compiler "I know what I am doing, even though this code looks dodgy, 
because I know more than you do".


It's just like writing "while ((*p++ = *q++));", or using a cast to void 
to turn off an "unused parameter" warning.
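
For anyone who wants the warning back, here is a small sketch of how I 
understand the flags to interact (the -Winit-self documentation quoted 
earlier is the authority, not me):

    /* gcc -O2 -Wall -c example.c              -> no warning for "i"
       gcc -O2 -Wall -Winit-self -c example.c  -> warns about "i"   */
    int f(int a, int b, int c)
    {
        int i = i;          /* deliberate: silences -Wuninitialized */
        if (a == 1) i = 1;
        if (b == 1) i = 2;
        if (c == 1) i = 3;
        return i;
    }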


Re: -Wint-conversion, -Wincompatible-pointer-types, -Wpointer-sign: Are they hiding constraint C violations?

2022-11-11 Thread David Brown via Gcc

On 10/11/2022 20:16, Florian Weimer via Gcc wrote:

* Marek Polacek:


On Thu, Nov 10, 2022 at 07:25:21PM +0100, Florian Weimer via Gcc wrote:

GCC accepts various conversions between pointers and ints and different
types of pointers by default, issuing a warning.

I've been reading the (hopefully) relevant partso f the C99 standard,
and it seems to me that C implementations are actually required to
diagnose errors in these cases because they are constraint violations:
the types are not compatible.


It doesn't need to be a hard error, a warning is a diagnostic message,
which is enough to diagnose a violation of any syntax rule or
constraint.

IIRC, the only case where the compiler _must_ emit a hard error is for
#error.


Hmm, you could be right.

The standard says that constraint violations are not undefiend behavior,
but of course it does not define what happens in the presence of a
constraint violation.  So the behavior is undefined by omission.  This
seems to be a contradiction.



Section 5.1.1.3p1 of the C standard covers diagnostics.  (I'm looking at 
the C11 version at the moment, but numbering is mostly consistent 
between C standards.)  If there is at least one constraint violation or 
syntax error in the translation unit, then the compiler must emit at 
least one diagnostic message.  That is all that is required.


The C standard does not (as far as I know) distinguish between "error 
messages" and "warnings", or require that diagnostics stop compilation 
or the production of output files.


So that means a conforming compiler can sum up all warnings and errors 
with a single "You did something wrong" message - and it can still 
produce an object file.  It is even allowed to generate the same message 
when /nothing/ is wrong.  The minimum behaviour to be conforming here is 
not particularly helpful!
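
A short sketch of the one genuinely hard requirement mentioned above, 
as I read C11 clause 4:

    /* A #error directive that is not skipped by conditional inclusion
       must prevent successful translation; everything else merely
       obliges the implementation to issue "at least one diagnostic
       message". */
    #ifdef BROKEN_CONFIG
    #error "refusing to build with BROKEN_CONFIG"
    #endif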


Also note that gcc, with default flags, is not a conforming compiler - 
it does not conform to any language standards.  You need at least 
"-std=c99" (or whatever) and "-Wpedantic".  Even then, I think gcc falls 
foul of the rule in 5.1.1.3p1 that says at least one diagnostic must be 
issued for a syntax or constraint violation "even if the behaviour is 
explicitly specified as undefined or implementation-defined".  I am not 
entirely sure, but I think some of the extensions that are enabled even 
in non-gnu standards modes could contradict that.


I personally think the key question for warnings on things like pointer 
compatibility depends on whether the compiler will do what the 
programmer expects.  If you have a target where "int" and "long" are the 
same size, a programmer might use "pointer-to-int" to access a "long", 
and vice-versa.  (This can easily be done accidentally on something like 
32-bit ARM, where "int32_t" is "long" rather than "int".)  If the 
compiler may use this incompatibility for type-based alias analysis and 
optimise on the assumption that the "pointer-to-int" never affects a 
"long", then such mixups should by default be at least a warning, if not 
a hard error.  The primary goal for warnings and error messages must be 
to stop the programmer writing code that is wrong and does not do what 
they expect (as best the compiler can guess what the programmer expects).
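
A sketch of the mix-up I have in mind (function name made up):

    /* On 32-bit ARM, int32_t is long, so "int *" here is an easy
       mistake; int and long are incompatible types even when they
       have identical size and representation. */
    extern long g;

    int read_g(void)
    {
        int *p = (int *)&g;   /* the cast itself is legal */
        return *p;            /* undefined: violates the effective
                                 type (strict aliasing) rules */
    }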


The secondary goal is to help the programmer write good quality code, 
and avoid potentially risky constructs - things that might work now, but 
could fail with other compiler versions, flags, targets, etc.  It is not 
unreasonable to have warnings in this category need "-Wall" or explicit 
flags.  (I'd like to see more warnings in gcc by default, and more of 
them as errors, but compatibility with existing build scripts is important.)




I assumed that there was a rule similar to the the rule for #error for
any kind of diagnostic, which would mean that GCC errors are diagnostic
messages in the sense of the standard, but GCC warnings are not.


I believe that both "error" and "warning" messages are "diagnostics" in 
the terms of the standard.


As I said above, the minimum requirements of the standard provide a very 
low bar here.  A useful compiler must do far better (and gcc /does/ do 
far better).




I wonder how C++ handles this.

Thanks,
Florian






Re: Local type inference with auto is in C2X

2022-11-04 Thread David Brown via Gcc

On 03/11/2022 16:19, Michael Matz via Gcc wrote:

Hello,

On Thu, 3 Nov 2022, Florian Weimer via Gcc wrote:


will not have propagated widely once GCC 13 releases, so rejecting
implicit ints in GCC 13 might be too early.  GCC 14 might want to switch
to C23/C24 mode by default, activating auto support, if the standard
comes out in 2023 (which apparently is the plan).

Then we would go from
warning to changed semantics in a single release.

Comments?


I would argue that changing the default C mode to c23 in the year that
comes out (or even a year later) is too aggressive and early.  Existing
sources are often compiled with defaults, and hence would change
semantics, which seems unattractive.  New code can instead easily use
-std=c23 for a time.

E.g. c99/gnu99 (a largish deviation from gnu90) was never default and
gnu11 was made default only in 2014.



That's true - and the software world still has not recovered from the 
endless mass of drivel that gcc (and other compilers) accepted in lieu 
of decent C as a result of not changing to C99 as the standard.


Good C programmers put the standards flag explicitly in their makefile 
(or other build system).  Bad ones use whatever the compiler gives them 
by default and believe "the compiler accepted it, it must be good code".


My vote would be to make "-std=c17 -Wall -Wextra -Wpedantic -Werror -O2" 
the default flags.  Force those who don't really know what they are 
doing, to learn - it's not /that/ hard, and the effort pays off quickly. 
 (Or they can give up and move to Python.)  Those who understand how to 
use their tools can happily change the standards and warnings to suit 
their needs.


And the person who first decided "implicit declaration of function" 
should merely be a /warning/ should be sentenced to 10 years Cobol 
programming.


It's probably a good thing that it is not I who decides the default 
flags for gcc !


Re: rust non-free-compatible trademark

2022-07-18 Thread David Brown

On 17/07/2022 18:31, Mark Wielaard wrote:

Hi Luke,

On Sun, Jul 17, 2022 at 04:28:10PM +0100, lkcl via Gcc wrote:

with the recent announcement that rust is supported by gcc


There is just a discussion about whether and how to integrate
(portions) of the gccrs frontend into the main gcc repository. Nobody
claims that means the rust programming language is supported by gcc
yet. There is a lot of work to be done to be able to claim that.


has it been taken into consideration that the draconian (non-free-compatible)
requirements of the rust Trademark make the distribution of the gcc
compiler Unlawful?

 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1013920


That looks to me as an overreaching interpretation of how to interpret
a trademark. I notice you are the bug reporter. It would only apply if
a product based on gcc with the gccrs frontend integrated would claim
to be endorsed by the Rust Foundation by using the Rust wordmark. Just
using the word rust doesn't trigger confusion about that. And
trademarks don't apply when using common words to implement an
interface or command line tool for compatibility with a programming
language.

If you are afraid your usage of gcc with the gccrs frontend integrated
does cause confusion around the Rust word mark then I would suggest
contacting the Rust Foundation to discuss how you can remove such
confusion. Probably adding a README explicitly saying "this product
isn't endorsed by and doesn't claim to be endorsed by the Rust
Foundation" will be enough.

Good luck,

Mark



Speaking as someone who is neither a lawyer, nor a GCC developer, nor 
even (as yet) a Rust user, it seems to me that step 1 would be to hear 
what the Rust Foundation has to say on the matter:




As far as I can tell, if they have been happy with the current gccrs 
project, they should in principle be happy with its integration in gcc 
mainline.  And they are also happy to talk to people, happy to promote 
rust, and happy to work with all kinds of free and open source projects. 
 The key thing they want to avoid would be for GCC to produce a 
compiler that is mostly like rust, but different - leading to 
fragmentation, incompatibilities, confusion, bugs in user code.  /No 
one/ wants that.


I am sure that if the Rust Foundation foresaw a big problem here, they'd 
already have contacted the gccrs and/or GCC folks - the project is not a 
secret.


I would think that the long term aim here is that the gcc implementation 
of rust (may I suggest "grust" as a name, rather than "gust"?) be 
considered "official" by the Rust Foundation - with links and 
information on their website, their logo on the GCC website, and 
coordination between GCC and the Rust Foundation on future changes. 
That may be ambitious, or far off, but it should be the goal.


In the meantime, as far as I can see it is just a matter of writing 
"rust" without capital letters, and a documentation disclaimer that 
grust is not (yet) endorsed by the Rust Foundation.


David


Re: Gcc Digest, Vol 29, Issue 7

2022-07-06 Thread David Brown via Gcc

On 05/07/2022 09:19, Yair Lenga via Gcc wrote:

Hi,

Wanted to get some feedback on an idea that I have - trying to address the
age long issue with type check on VA list function - like 'scanf' and
friends. In my specific case, I'm trying to build code that will parse a
list of values from SELECT statement into list of C variables. The type of
the values is known (by inspecting the result set meta-data). My ideal
solution will be to implement something like:

int result_set_read(struct result_set *p_result_set, ...);

Which can be called with

int int_var ; float float_var ; char c[20] ;
result_set_read(rs1, &int_var, &float_var, c);

The tricky part is to verify argument types - making sure each pointer
matches the expected type.  One possible path
I thought was - why not leverage the ability to describe scanf like
functions:

int result_set_read(struct result_set *rs, const char *format, ...)
__attribute__((format(scanf, 2, 3)));

And then the above call will be
result_set_read(rs1, "%d %f %s", &int_var, &float_var, c);

With the added benefit that GCC will flag as error, if there is mismatch
between the variable and the type. My function parses the scanf format to
decide on conversions (just the basic formatting '%f', '%d', '%*s', ...).
So far big improvement, and the only missing item is the ability to enforce
check on string sizes - to support better checks against buffer overflow
(side note: wish there was ability to force inclusion of the max string
size, similar to the sscanf_s).

My question: does anyone know how much effort it will be to add a new GCC
built-in (or extension), that will automatically generate a descriptive
format string, consistent with scanf formatting, avoiding the need to
manually enter the formatting string. This can be thought of as "poor man
introspection". Simple macro can then be used to generate it

#define RESULT_SET_READ(rs, ...) result_set_read(rs,
__builtin_format(__VA_ARGS__),  __VA_ARGS__)

Practically, making the function "safe" (with respect to buffer overflow,
type conversions) for most use cases.

Any feedback, pointers, ... to how to implement will be appreciated

Yair



I haven't worked through all the details, but I wonder if this could be 
turned around a bit.  Rather than your function taking a variable number 
of arguments of different types, which as you know can be a risky 
business, have it take an array of (type, void*) pairs (where "type" is 
an enumeration).  Use some variadic macro magic to turn the 
"RESULT_SET_READ" into the creation of a local array that is then passed 
on to the function.
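
Something along these lines - every name below is hypothetical, just 
to show the shape of the idea:

    #include <stddef.h>

    struct result_set;                  /* opaque, defined elsewhere */

    enum rs_type { RS_INT, RS_FLOAT, RS_STRING };

    struct rs_arg {
        enum rs_type type;              /* run-time type tag */
        void *ptr;                      /* destination for the value */
    };

    int result_set_read_v(struct result_set *rs,
                          const struct rs_arg *args, size_t n);

    #define RS_ARG(t, p)  { (t), (p) }
    #define RESULT_SET_READ(rs, ...)                               \
        result_set_read_v((rs),                                    \
            (const struct rs_arg[]){ __VA_ARGS__ },                \
            sizeof((const struct rs_arg[]){ __VA_ARGS__ })         \
                / sizeof(struct rs_arg))

used as:

    int i; float f; char s[20];
    RESULT_SET_READ(rs1, RS_ARG(RS_INT, &i),
                         RS_ARG(RS_FLOAT, &f),
                         RS_ARG(RS_STRING, s));

The callee can then check each tag against the result-set meta-data at 
run time, and there is no va_list in sight.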




Re: reordering of trapping operations and volatile

2022-01-11 Thread David Brown
On 11/01/2022 08:11, Richard Biener via Gcc wrote:
> On Mon, Jan 10, 2022 at 6:36 PM Martin Uecker  wrote:
>>



> 
> I realize that UB in a + b isn't (usually) observable but
> UB resulting in traps are.
> 
> So I'm still wondering why you think that 'volatile' makes
> a critical difference we ought to honor?  I don't remember
> 'volatile' being special in the definition of the abstract
> machine with regarding to observability (as opposed to
> sequence points).
> 
> 

Actually, volatile accesses /are/ critical to observable behaviour -
observable behaviour is program start and termination (normal
termination flushing file buffers, not crashes which are UB), input and
output via "interactive devices" (these are not defined by the
standard), and volatile accesses.  (See 5.1.2.3p6 in the standards if
you want the details.  Note that in C18, "volatile access" was expanded
to include all accesses through volatile-qualified lvalues.)


However, undefined behaviour is /not/ observable behaviour.  It can also
be viewed as not affecting anything else, and so moving it does not
affect volatile accesses.

So you can't re-order two volatile accesses with respect to each other.
 But you /can/ re-order UB with respect to anything else, including
volatile accesses.  (IMHO)

"Performing a trap" - such as some systems will do when dividing by 0,
for example - is not listed as observable behaviour.
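
A small sketch of that distinction - my reading of the standard, not a 
claim about what gcc currently does:

    volatile int v1, v2;

    int f(int a, int b)
    {
        v1 = a;        /* volatile store: observable, keeps its order */
        v2 = b;        /* must not be reordered with the store to v1  */
        return a / b;  /* UB if b == 0; any trap from it may be moved
                          freely past the volatile accesses           */
    }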





Re: reordering of trapping operations and volatile

2022-01-08 Thread David Brown
On 08/01/2022 09:32, Martin Uecker via Gcc wrote:
> 
> Hi Richard,
> 
> I have a question regarding reodering of volatile
> accesses and trapping operations. My initial
> assumption (and  hope) was that compilers take
> care to avoid creating traps that are incorrectly
> ordered relative to observable behavior.
> 
> I had trouble finding examples, and my cursory
> glace at the code seemed to confirm that GCC
> carefully avoids this.  But then someone showed
> me this example, where this can happen in GCC:
> 
> 
> volatile int x;
> 
> int foo(int a, int b, _Bool store_to_x)
> {
>   if (!store_to_x)
> return a / b;
>   x = b;
>   return a / b;
> }
> 
> 
> https://godbolt.org/z/vq3r8vjxr
> 
> In this example a division is hoisted 
> before the volatile store. (the division
> by zero which could trap is UB, of course).
> 

Doesn't this depend on whether the trap is considered "observable
behaviour", or "undefined behaviour" ?

If (on the given target cpu and OS, and with any relevant compiler
flags) dividing by zero is guaranteed to give a trap with specific known
behaviour, then it is observable behaviour and thus should be ordered
carefully with respect to the volatile accesses.

On the other hand, if division by 0 is considered undefined behaviour
(the C and C++ standards explicitly mark it as undefined, but a compiler
can of course define its behaviour) then the compiler can assume it does
not happen, or you don't care about the result of the program if it
happens.  Undefined behaviour can be freely re-ordered around volatile
accesses, as far as I understand it - though that can come as a surprise
to some people.


I don't know which of these views gcc takes - I think both are valid.
But it might be worth noting in the reference manual.

David



> As Martin Sebor pointed out this is done
> as part of redundancy elimination 
> in tree-ssa-pre.c and that this might
> simply be an oversight (and could then be
> fixed with a small change).
> 
> Could you clarify whether such reordering
> is intentional and could be exploited in
> general also in other optimizations or
> confirm that this is an oversight that
> affects only this specific case?
> 
> If this is intentional, are there examples
> where this is important for optimization?
> 
> 
> Martin
> 
> 
> 
> 
> 
> 
> 



Re: Can gcc.dg/torture/pr67828.c be an infinite loop?

2021-09-24 Thread David Brown
On 24/09/2021 10:03, Aldy Hernandez via Gcc wrote:
> Hi folks.
> 
> My upcoming threading improvements turn the test below into an infinite
> runtime loop:
> 
> int a, b;
> short c;
> 
> int
> main ()
> {
>   int j, d = 1;
>   for (; c >= 0; c++)
>     {
> BODY:
>   a = d;
>   d = 0;
>   if (b)
> {
>   xprintf (0);
>   if (j)
>     xprintf (0);
> }
>     }
>   xprintf (d);
>   exit (0);
> }
> 
> On the false edge out of if(b) we thread directly to BODY, eliding the
> loop conditional, because we know that c>=0 because it could never
> overflow.
> 
> Since B is globally initialized to 0, this has the effect of turning the
> test into an infinite loop.
> 
> Is this correct, or did I miss something?
> Aldy
> 
> 

I am wondering about the claim that you can use "b" being 0 to optimise
the conditional.  If the compiler knows that this is the complete
program, that is fair enough.  But since "b" is not static, another
compilation unit could modify "b" before "main" is called.  (In C, it is
legal for another part of the code to call main() - perhaps the
implementation of xprintf does so.  And in C++, a global constructor
could change "b" before main() starts.)
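
For instance, a hypothetical second translation unit (here using gcc's 
constructor attribute) invalidates the assumption before main() runs:

    extern int b;

    __attribute__((constructor))
    static void set_b(void)
    {
        b = 1;   /* "b is still 0 at loop entry" no longer holds */
    }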

mvh.,

David


Re: Can gcc.dg/torture/pr67828.c be an infinite loop?

2021-09-24 Thread David Brown
On 24/09/2021 11:38, Andrew Pinski via Gcc wrote:
> On Fri, Sep 24, 2021 at 2:35 AM Aldy Hernandez  wrote:
>>
>>
>>
>> On 9/24/21 11:29 AM, Andrew Pinski wrote:
>>> On Fri, Sep 24, 2021 at 1:05 AM Aldy Hernandez via Gcc  
>>> wrote:


>>> Huh about c>=0 being always true? the expression, "c++" is really c=
>>> (short)(((int)c)+1).  So it will definitely wrap over when c is
>>> SHRT_MAX.
>>
>> I see.
>>
>> Is this only for C++ or does it affect C as well?
> 
> This is standard C code; promotion rules; that is if a type is less
> than int, it will be promoted to int if all of the values fit into
> int; otherwise it will be promoted to unsigned int.
> 

But remember that for some gcc targets (msp430, AVR, and others), int is
16-bit and the same size as short.  The short still gets promoted to
int, but it will no longer wrap as SHORT_MAX + 1 is an int overflow.

(I've no idea if this is relevant to the code in question, or if that
code is only used on specific targets where short is smaller than int.)



Re: Can gcc.dg/torture/pr67828.c be an infinite loop?

2021-09-24 Thread David Brown
On 24/09/2021 10:59, Aldy Hernandez via Gcc wrote:
> 
> 
> On 9/24/21 10:08 AM, Richard Biener wrote:
>> On Fri, Sep 24, 2021 at 10:04 AM Aldy Hernandez via Gcc
>>  wrote:
>>>
>>> Is this correct, or did I miss something?
>>
>> Yes, 'c' will wrap to negative SHORT_MIN and terminate the loop via
>> the c>=0 test.
> 
> Huh, so SHORT_MAX + 1 = SHORT_MIN?  I thought that was an overflow, and
> therefore undefined.
> 

C and C++ don't do arithmetic on "short" (or "char").  They are
immediately promoted to "int" (or "unsigned int", as appropriate).  So
if short is smaller than int, the code behaviour is well defined (as
Richard described below).  If short is the same size as int (such as on
the 16-bit mspgcc port of gcc), however, then SHORT_MAX + 1 /would/ be
an overflow and the compiler can assume it does not happen - thus giving
you an infinite loop.

With more common 32-bit int and 16-bit short, the loop should execute
32768 times.

(At least, that is my understanding.)
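
(A tiny sketch of the promotion written out explicitly - my own
illustration:)

short c;

void step (void)
{
    /* "c++" behaves as if written: */
    c = (short) ((int) c + 1);
    /* With 32-bit int the addition never overflows, and converting the
       out-of-range value back to short is implementation-defined.
       With 16-bit int, the addition itself can overflow, which is
       undefined behaviour.  */
}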

> 
>>
>> Mind c++ is really (short)(((int)c)++) and signed integer truncation
>> is implementation
>> defined.
>>
>> Richard.
>>
>>> Aldy
>>>
>>
> 
> 



Re: unexpected result with -O2 solved via "volatile"

2021-09-20 Thread David Brown
On 19/09/2021 20:17, Allin Cottrell via Gcc wrote:
> Should this perhaps be considered a bug? Below is a minimal test case
> for a type of calculation that occurs in my real code. It works as
> expected when compiled without optimization, but produces what seems
> like a wrong result when compiled with -O2, using both gcc 10.3.1
> 20210422 on Fedora and gcc 11.1.0-1 on Arch Linux. I realize there's a
> newer gcc release but it's not yet available for Arch, and looking at
> https://gcc.gnu.org/gcc-11/changes.html I didn't see anything to suggest
> that something relevant has changed.
> 
> 
> #include 
> #include 
> 
> int test (int *pk, int n)
> {
>     int err = 0;
> 
>     if (*pk > n) {
>     err = 1;
>     if (*pk > 2e9) {
>     int k = *pk + n - INT_MAX;
> 
>     *pk = k;
>     if (k > 0) {
>     printf("Got positive revised k = %d\n", k);
>     err = 0;
>     } else {
>     printf("k = %d tests as non-positive?!\n", k);
>     }
>     }
>     }
> 
>     return err;
> }
> 
> int main (void)
> {
>     int k = INT_MAX - 10;
>     int err;
> 
>     err = test(&k, 20);
>     printf("main: err = %d\n", err);
> 
>     return 0;
> }
> 
> 
> What strikes me as "seems wrong" is that the "(k > 0)" branch in test()
> is not taken, although in the alternative branch it turns out that k =
> 10. This can be fixed by using the "volatile" keyword in front of the
> statement "int k = *pk + n - INT_MAX;" or by parenthesizing (n -
> INT_MAX) in that statement.
> 
> I can see the case for assuming that k can't be positive if one thinks
> of the expression as (*pk + n) - INT_MAX, since (*pk + n) can't be
> greater than INT_MAX in context, being the sum of two ints. All the
> same, since gcc does in fact end up assigning the value 10 to k the
> optimization seems a risky one.
> 

Your code is broken.  Signed integer overflow is undefined behaviour in
C - it has no meaning, and the compiler can assume it does not happen or
that you don't care what results you get if it /does/ happen.

This gives the compiler a lot of small but useful optimisation
opportunities - it can assume a number of basic mathematical identities
apply, and can use these to simplify code.

In this particular case, it knows that "*pk + n" will result in a valid
int value between INT_MIN and INT_MAX inclusive (or that you don't care
what happens if you try to overflow), with that int value being the
mathematically correct result as if the types had unlimited sizes.  It
also knows that when it takes an int "x" and subtracts INT_MAX, it again
has an int result between INT_MIN and INT_MAX that has the correct
mathematical value.  It is thus a simple reasoning that, regardless of
the value of "x", the result of "k = x - INT_MAX;" must lie between
INT_MIN and 0 inclusive.  The "k > 0" branch cannot be taken, and code
generation for the conditional and the branch can be skipped.
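
(If you want the check with fully defined behaviour, here is a sketch
using GCC's checked-arithmetic built-ins - my own illustration, not
part of the original exchange; the function name is made up:)

#include <limits.h>

/* Returns 1 and stores pk + n - INT_MAX in *out when the exact result
   fits in an int; returns 0 on overflow.  */
int revised_k (int pk, int n, int *out)
{
    int tmp;
    if (__builtin_add_overflow (pk, n, &tmp))
        return 0;
    if (__builtin_sub_overflow (tmp, INT_MAX, out))
        return 0;
    return 1;
}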

You appear to be assuming that signed integer arithmetic has two's
complement wrapping, which is not the case in C.  (It's a common
misunderstanding.  Another common misunderstanding is that it /should/
be wrapping, and C is a silly language for not doing so - more careful
thought and research will show that it is a /better/ language because it
makes this undefined behaviour.)

Now that you know the problem in your code (and that it is not a bug in
gcc), you should be able to find plenty of information about signed
integer arithmetic in C in whatever format you prefer (C standards,
blogs, stackoverflow, youtube, whatever suits you).  There is also the
Usenet group comp.lang.c.  This list is not appropriate, however, now
that you know it is nothing specific to gcc.



mvh.,

David


Re: a feature to the wishlist

2021-09-15 Thread David Brown



On 14/09/2021 20:48, Rafał Pietrak wrote:



On 13.09.2021 at 13:41, David Brown wrote:

On 13/09/2021 11:51, Rafał Pietrak via Gcc wrote:

Hi,

Thank you for the very prompt reply.



(I'm not sure how much this is a "gcc development list" matter, rather
than perhaps a "gcc help list" or perhaps comp.arch.embedded Usenet
group discussion, but I'll continue here until Jonathan or other gcc
developers say stop.)


Thank you. I appreciate it.

And yet, I do think, that it's about the "core" gcc - it's about whether a
programmer is able to "talk to gcc" without much "misunderstanding".

But I'm not going to push it much more. I accept, that "talking to gcc"
with d->cr1 &= ~ { ... } syntax is not what majority of programmers
would like to be able to do.


The gcc developers are always interested in new ideas.  But they are not 
keen on new syntaxes or extensions - there has to be a /very/ good 
reason for them.  The days of gcc being a maverick that makes up its own 
improved C are long, long gone - they'd much rather work with the C and 
C++ committees towards improving the languages in general, instead of 
having more of their own non-standard additions.  So an extension like 
this should ideally be a proposal for adding to C23, though gcc could 
always implement it earlier.  And while I appreciate what you are trying 
to do here, it is simply not general enough or important enough to 
justify such changes.  To get a new feature implemented, it has to do 
something you could not do before, or do it /far/ more simply, clearly, 
safely or efficiently.








On 13.09.2021 at 10:44, Jonathan Wakely wrote:

On Mon, 13 Sept 2021 at 07:55, Rafał Pietrak via Gcc  wrote:

[-]

#elif VARIANT_WORKING
 struct cr_s a = (struct cr_s) {
 .re = 1,
 .te = 1,
 .rxneie = 1,
 .txeie = 1,
 .ue = 1 };
 int *b = (int *) &a;
 d->cr1 &= ~(*b);


This is a strict aliasing violation. You should either use a union or
memcpy to get the value as an int.


Yes, I know. I know, this is a "trick" I use (I had to use it to mislead
gcc).



Don't think of it as a "trick" - think of it as a mistake.  A completely
unnecessary mistake, that will likely give you unexpected results at times :

union {
    struct cr_s s;
    uint32_t raw;
} a = { (struct cr_s) {
    .re = 1,
    .te = 1,
    .rxneie = 1,
    .txeie = 1,
    .ue = 1 }
};
d->cr1 &= ~(a.raw);


Ha! This is very nice.

But pls note, that if contrary to my VARIANT_WORKING this actually is
kosher (and not an error, like you've said about my "WORKING"), and
it actually looks very similar to the VARIANT_THE_BEST ... maybe there
is a way to implement VARIANT_THE_BEST as "syntactic trick" leading
compiler into this semantics you've outlined above?



The issue I noticed with your "WORKING" is the bit order - that is 
orthogonal to the code structure, and easy to fix by re-arranging the 
fields in the initial bit-field.  It is independent from the structure 
of the code.



I'm raising these questions, since CR1 as int (or better as uint32_t) is
already declared in my code. The compiler shouldn't have too hard a time
weeding out struct cr_s from the union embedding it.


The code to convert between a 32-bit bit-field struct and a uint32_t costs 
nothing at all, and the compiler can handle that fine :-)  But you have 
to write the source code in a way that the conversion is well defined 
behaviour.  When you write something that is not defined behaviour, the 
compiler can generate code in ways you don't expect because technically 
you haven't told it what you actually want.






I hope you also realise that the "VARIANT_TRADITIONAL" and
"VARIANT_WORKING" versions of your code do different things.  The ARM
Cortex-M devices (like the STM-32) are little-endian, and the bitfields
are ordered from the LSB.  So either your list of #define's is wrong, or
your struct cr_s bitfiled is wrong.  (I haven't used that particular
device myself, so I don't know which is wrong.)


I'm sorry, this is my mistake. I've taken a shortcut and quickly written
ad hoc '#defines' for the email. In my code I have:
enum usart_cr1_e {
USART_SBK, USART_RWU, USART_RE, USART_TE, USART_IDLEIE,
USART_RXNEIE, USART_TCIE, USART_TXEIE, USART_PEIE, USART_PS,
USART_PCE, USART_WAKE, USART_M, USART_UE,
};

And gcc produces exactly the same code in both variants:
  8000c56:  68d3        ldr    r3, [r2, #12]
  8000c58:  f423 5302   bic.w  r3, r3, #8320   ; 0x2080
  8000c5c:  f023 032c   bic.w  r3, r3, #44     ; 0x2c
  8000c60:  60d3        str    r3, [r2, #12]




Great.



Also - perhaps beside the point, but good advice anyway - for

Re: a feature to the wishlist

2021-09-13 Thread David Brown
On 13/09/2021 11:51, Rafał Pietrak via Gcc wrote:
> Hi,
> 
> Thenk you for very prompt reply.


(I'm not sure how much this is a "gcc development list" matter, rather
than perhaps a "gcc help list" or perhaps comp.arch.embedded Usenet
group discussion, but I'll continue here until Jonathan or other gcc
developers say stop.)

> 
> On 13.09.2021 at 10:44, Jonathan Wakely wrote:
>> On Mon, 13 Sept 2021 at 07:55, Rafał Pietrak via Gcc  wrote:
> [-]
>>> #elif VARIANT_WORKING
>>> struct cr_s a = (struct cr_s) {
>>> .re = 1,
>>> .te = 1,
>>> .rxneie = 1,
>>> .txeie = 1,
>>> .ue = 1 };
>>> int *b = (int *) &a;
>>> d->cr1 &= ~(*b);
>>
>> This is a strict aliasing violation. You should either use a union or
>> memcpy to get the value as an int.
> 
> Yes, I know. I know, this is a "trick" I use (I had to use to missleed
> gcc).
> 

Don't think of it as a "trick" - think of it as a mistake.  A completely
unnecessary mistake, that will likely give you unexpected results at times :

union {
    struct cr_s s;
    uint32_t raw;
} a = { (struct cr_s) {
    .re = 1,
    .te = 1,
    .rxneie = 1,
    .txeie = 1,
    .ue = 1 }
};
d->cr1 &= ~(a.raw);


I hope you also realise that the "VARIANT_TRADITIONAL" and
"VARIANT_WORKING" versions of your code do different things.  The ARM
Cortex-M devices (like the STM-32) are little-endian, and the bitfields
are ordered from the LSB.  So either your list of #define's is wrong, or
your struct cr_s bitfiled is wrong.  (I haven't used that particular
device myself, so I don't know which is wrong.)

Also - perhaps beside the point, but good advice anyway - for this kind
of work you should always use the fixed-size  types such as
"uint32_t", not home-made types like "uint".  And you always want
unsigned types when doing bit operations such as bitwise complement.

Using a union of the bitfield struct with a "raw" combined field is a
common idiom, and gives you exactly what you need here.  It is
guaranteed to work in C, unlike your code (which has undefined
behaviour).  If it is important to you, you should note that it is not
defined behaviour in C++ (though maybe gcc guarantees it will work - the
documentation of "-fstrict-aliasing" is not clear on the matter).

As Jonathan says, a small inline function or macro (using gcc's
extension of declarations and statements inside an expression) can be
used for a wrapper that will work simply and efficiently while being
safe for both C and C++ :

#define raw32(x) \
({ uint32_t raw; memcpy(&raw, &(x), sizeof(uint32_t)); raw;})


struct cr_s a = (struct cr_s) {
.re = 1,
.te = 1,
.rxneie = 1,
.txeie = 1,
.ue = 1 };
d->cr1 &= ~raw32(a);
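
(One extra safety net - my own sketch, with a stand-in struct since the
real cr_s layout is defined elsewhere: make the size assumption behind
raw32() explicit at compile time.)

#include <stdint.h>

struct cr_s_standin {   /* stand-in; only the total size matters here */
    uint32_t re : 1, te : 1, rxneie : 1, txeie : 1, ue : 1, pad : 27;
};

/* Compile-time check of the assumption behind raw32(). */
_Static_assert (sizeof (struct cr_s_standin) == sizeof (uint32_t),
                "raw32() assumes the bit-field struct is exactly 32 bits");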


mvh.,

David


Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-14 Thread David Brown
On 14/07/2021 09:49, Matthias Kretz wrote:
> On Wednesday, 14 July 2021 09:39:42 CEST Richard Biener wrote:
>> -ffast-math decomposes to quite some flag_* and those generally are not
>> reflected into the IL but can be different per function (and then
>> prevent inlining).
> 
> Is there any chance the "and then prevent inlining" can be eliminated? 
> Because 
> then I could write my own fast class in C++, marking all operators 
> with 
> __attribute__((optimize("-Ofast")))...
> 

You can add your voice to the existing enhancement request in GCC's bugzilla.

I think there is a lot of scope for improved code in general if function
attributes did not disable inlining in so many cases.  And it would open
the possibility of C++ template classes to let programmers mix and match
"fast" floating point vs. strict IEEE, or different overflow behaviour,
or different trapping behaviour - without the overhead of function calls
ruining the whole idea.

But it is a tough challenge for the gcc developers, and would (as far as
my limited understanding goes) involve a lot of changes the to the way
the compiler works.  I expect it is something that would need to be done
as a project rather than just a few patches.



Re: Using source-level annotations to help GCC detect buffer overflows

2021-07-01 Thread David Brown



Thanks for the reply here.  I've snipped a bit to save space.

On 30/06/2021 19:12, Martin Sebor wrote:

On 6/29/21 12:31 PM, David Brown wrote:

On 29/06/2021 17:50, Martin Sebor wrote:

On 6/29/21 6:27 AM, David Brown wrote:

On 28/06/2021 21:06, Martin Sebor via Gcc wrote:

I wrote an article for the Red Hat Developer blog about how
to annotate code to get the most out of GCC's access checking
warnings like -Warray-bounds, -Wformat-overflow, and
-Wstringop-overflow.  The article published last week:

https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows 



Could these attributes not be attached to the arguments when the
function is called, or the parameters when the function is expanded?
After all, in cases such as the "access" attribute it is not the
function as such that has the access hints, it is the parameters of the
function.

(I'm talking here based on absolutely no knowledge of how this is
implemented, but it's always possible that a different view, unbiased by
knowing the facts, can inspire new ideas.)


Attaching these attributes to function parameters is an interesting
idea that might be worth exploring.  We've talked about letting
attribute access apply to variables for other reasons (detecting
attempts to modify immutable objects, as I mention in the article).
so your suggestion would be in line with that.  Associating two
variables that aren't parameters might be tricky.



It has always seemed to me that some of these attributes are about the 
parameters, rather than the functions.  It would make more sense when 
using them if the attributes worked as qualifiers for the parameter 
(vaguely like "restrict") and were written before the parameter itself, 
rather than as a function attribute with a parameter index.  Of course, 
that gets messy when you have an attribute that ties two parameters 
together (readonly with a size, for example).




Certainly since first reading about the "access" attributes, I have been
considering adding them to my current project.  I have also been mulling
around in my head possibilities of making variadic templates in C++ that
add access attributes in the right places for some kinds of pointers -
but now that I know the attributes will get dropped for inline
functions, and such templates would involve inline functions, there is
little point.  (Maybe I will still figure a neat way to do this for
external functions - it just won't be useful in as many places.)


Unfortunately, with extensive inlining and templates, C++ support
for these attributes is less robust than it ideally would be.
Improving it is on my to do list.



I think with modern coding styles and compiler usage, you have to assume 
that /everything/ is likely to get inlined somewhere.  Lots of C++ 
coding is done with header-only libraries, full of templates and inline 
functions.  Even variables can be templates or inline variables with 
current standards.  And no matter how the code is written, once you use 
LTO then pretty much anything can be inlined back and forth, or 
partially inlined, or partially outlined (is that the right term?), or 
cloned.  The concept of "function" in C or C++ that corresponds to an 
externally visible assembly label, a sequence of assembly instructions 
following a strict ABI and a "return", is fast disappearing.


I don't foresee you or the other gcc developers getting bored anytime soon!


Whether an attribute has an effect depends on the compilation stage
where it's handled.  warn_unused_result is handled very early (well
before inlining) so it always has the expected effect.  Attribute
nonnull is handled both early (to catch the simple cases) and also
later, after inlining, to benefit from some flow analysis, so its
effect is lost if the function it attaches to is inlined.  Attribute
access is handled very late and so it suffers from this problem
even more.



I suppose some attributes are not needed for inline functions, since the
compiler has the full function definition and can figure some things out
itself.  That would apply to "pure" and "const" functions, I expect.


I was going to agree, but then I tested it and found out that const
(and most likely pure) do actually make a difference on inline
functions.  For example in the test case below the inequality is
folded to false.

int f (int);

__attribute__ ((const))
int g (int i) { return f (i); }

void h (int i)
{
   if (g (i) != g (i))
     __builtin_abort ();
}

On the other hand, the equality below is not folded unless f() is
also declared with attribute malloc:

void* f (void);

static int a[1];

__attribute__ ((malloc))
void* g (void) { return f (); }

void h (void)
{
   if (g () == a)
     __builtin_abort ();
}

With heavy inlining (e.g., with LTO) whether a function attribute
will have an effect or not in a given caller is anyone's guess :(


And if you want a par

Re: Using source-level annotations to help GCC detect buffer overflows

2021-06-29 Thread David Brown
On 29/06/2021 17:50, Martin Sebor wrote:
> On 6/29/21 6:27 AM, David Brown wrote:
>> On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
>>> I wrote an article for the Red Hat Developer blog about how
>>> to annotate code to get the most out of GCC's access checking
>>> warnings like -Warray-bounds, -Wformat-overflow, and
>>> -Wstringop-overflow.  The article published last week:
>>>
>>> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows
>>>
>>>
>>
>> Thanks for that write-up - and of course thank you to whoever
>> implemented these attributes!
>>
>> The caveat that the access attributes are lost when a function is
>> inlined is an important one.  As a user who appreciates all the checks I
>> can get, it is disappointing - but I assume there are good reasons for
>> that limitation.  I can merely hope that will change in future gcc
>> versions.
> 
> There's nothing the attribute could obviously attach to after a call
> has been inlined.  An extreme example is a function whose argument
> isn't used:
> 
>   __attribute__ ((access (write_only, 1, 2))) void
>   f (char *p, int n) { }
> 
> (The function might have a body in the original source that could
> be eliminated from the IL based on the values of other arguments.)

Could these attributes not be attached to the arguments when the
function is called, or the parameters when the function is expanded?
After all, in cases such as the "access" attribute it is not the
function as such that has the access hints, it is the parameters of the
function.

(I'm talking here based on absolutely no knowledge of how this is
implemented, but it's always possible that a different view, unbiased by
knowing the facts, can inspire new ideas.)

> 
> Calls to it that are not inlined will be checked but those that are
> won't be.  This could be improved by doing the checking also before
> inlining but at a cost of some false positives for code that's later
> determined to be unreachable.  I don't have a sense of how bad it
> might be so it's something to try.  This class of false positives
> could also be dealt with by queuing up the warnings (e.g., by
> emitting them into the IL via __builtin_warning) and issuing them
> only if they survive dead code elimination.  This is something I'd
> like to try to tackle for GCC 12.
> 

I fully appreciate that some checks can be easier earlier in the
process, others later.  It might even be helpful to do similar checks at
more than one stage, and combine the results.

>>
>> I believe it would make sense to add this information to the gcc manual
>> page for common function attributes.  There are quite a number of
>> attributes that are useful for static checking, such as
>> "warn_unused_result" and "nonnull".  Are these also dropped if the
>> function is inlined?
> 
> I agree the documentation could stand to be made clearer on this
> point.  In general, I think it would be helpful to give users
> more guidance about what to expect from attributes as well as
> warnings: which ones are purely lexical and which ones flow-
> sensitive and so inherently susceptible to false positives and
> negatives, and to what extent.

It could be difficult to quantify that kind of thing, but sometimes
guidance could be useful.  (There is already such information for some
warning flags, especially those that support multiple levels.)

Certainly since first reading about the "access" attributes, I have been
considering adding them to my current project.  I have also been mulling
around in my head possibilities of making variadic templates in C++ that
add access attributes in the right places for some kinds of pointers -
but now that I know the attributes will get dropped for inline
functions, and such templates would involve inline functions, there is
little point.  (Maybe I will still figure a neat way to do this for
external functions - it just won't be useful in as many places.)
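
(For reference, a small sketch of what such an annotation looks like in
practice - the function names are my own, the attribute syntax is as
documented:)

/* buf points to writable storage of at least n chars; GCC can use this
   to check calls against -Wstringop-overflow and friends.  */
__attribute__ ((access (write_only, 1, 2)))
void fill_buffer (char *buf, int n);

void caller (void)
{
    char small[4];
    fill_buffer (small, 8);   /* may be diagnosed as too small */
}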

> 
> Whether an attribute has an effect depends on the compilation stage
> where it's handled.  warn_unused_result is handled very early (well
> before inlining) so it always has the expected effect.  Attribute
> nonnull is handled both early (to catch the simple cases) and also
> later, after inlining, to benefit from some flow analysis, so its
> effect is lost if the function it attaches to is inlined.  Attribute
> access is handled very late and so it suffers from this problem
> even more.
> 

I suppose some attributes are not needed for inline functions, since the
compiler has the full function definition and can figure some things out
itself.  That would apply to "pure" and "const" functions, I expect.

Re: Using source-level annotations to help GCC detect buffer overflows

2021-06-29 Thread David Brown
On 28/06/2021 21:06, Martin Sebor via Gcc wrote:
> I wrote an article for the Red Hat Developer blog about how
> to annotate code to get the most out of GCC's access checking
> warnings like -Warray-bounds, -Wformat-overflow, and
> -Wstringop-overflow.  The article published last week:
> 
> https://developers.redhat.com/articles/2021/06/25/use-source-level-annotations-help-gcc-detect-buffer-overflows
> 

Thanks for that write-up - and of course thank you to whoever
implemented these attributes!

The caveat that the access attributes are lost when a function is
inlined is an important one.  As a user who appreciates all the checks I
can get, it is disappointing - but I assume there are good reasons for
that limitation.  I can merely hope that will change in future gcc versions.

I believe it would make sense to add this information to the gcc manual
page for common function attributes.  There are quite a number of
attributes that are useful for static checking, such as
"warn_unused_result" and "nonnull".  Are these also dropped if the
function is inlined?




Re: [RFC] Implementing detection of saturation and rounding arithmetic

2021-05-12 Thread David Brown
On 12/05/2021 10:00, Tamar Christina wrote:
> Hi David, 
> 
>> -Original Message-----
>> From: David Brown 
>> Sent: Tuesday, May 11, 2021 11:04 AM
>> To: Tamar Christina ; gcc@gcc.gnu.org
>> Cc: Richard Sandiford ; Richard Biener
>> 
>> Subject: Re: [RFC] Implementing detection of saturation and rounding
>> arithmetic
>>
>> On 11/05/2021 07:37, Tamar Christina via Gcc wrote:
>>> Hi All,
>>>
>>> We are looking to implement saturation support in the compiler.  The
>>> aim is to recognize both Scalar and Vector variant of typical saturating
>> expressions.
>>>
>>> As an example:
>>>
>>> 1. Saturating addition:
>>>char sat (char a, char b)
>>>{
>>>   int tmp = a + b;
>>>   return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
>>>}
>>>
>>> 2. Saturating abs:
>>>char sat (char a)
>>>{
>>>   int tmp = abs (a);
>>>   return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
>>>}
>>>
>>> 3. Rounding shifts
>>>char rndshift (char dc)
>>>{
>>>   int round_const = 1 << (shift - 1);
>>>   return (dc + round_const) >> shift;
>>>}
>>>
>>> etc.
>>>
>>
>> I can't comment on the implementation part - I don't know anything about it.
>>
>> However, in your examples above I see a few points.
>>
>> One is your use of "char".  "char" is a type that varies in signedness from
>> target to target (and also depending on compiler flags), and is slightly
>> different in C and C++ ('a' has char type in C++, int type in C).  If you 
>> must use
>> "char" in arithmetic contexts, I recommend using "signed char" or "unsigned
>> char" explicitly.
>>
>> I would rather recommend you use the size-specific  types - int8_t,
>> etc., - as being more appropriate for this kind of thing.
>> (AFAIK all gcc targets have 8-bit CHAR.)  This also makes it easier to see 
>> the
>> sizes you need for the "tmp" value as you make functions for bigger sizes -
>> remember that on some gcc targets, "int" is 16-bit.
> 
> Indeed, but unfortunately we're targeting existing code that's quite old, so 
> the
> C99 fixed sized types may not have been used.

I was thinking of your examples (for testing) and in your implementation
- and you have C99 in those cases.  If you make gcc optimisations that
you confirm work correctly for int8_t, int16_t, int32_t, int64_t, (and
possibly __int128), and the unsigned types, then they will automatically
work with fundamental types (signed char, short, etc.) and any other
typedefs users have in their code.


> 
>>
>> It is also worth noting that gcc already has support for saturating types on
>> some targets:
>>
>> <https://gcc.gnu.org/onlinedocs/gcc/Fixed-Point.html>
>>
>> My testing of these (quite a long time ago) left me with a feeling that it 
>> was
>> not a feature anyone had worked hard to optimise - certainly it did not make
>> use of saturating arithmetic instructions available on some of the targets I
>> tested (ARM Cortex M4, for example).  But it is possible that there are 
>> things
>> here that would be of use to you.  (I am not convinced that it is worth
>> spending time optimising the implementation of these - I don't think the
>> N1169 types are much used by
>> anyone.)
>>
> 
> I did notice these and was wondering if it makes sense to use them, but I'd 
> need
> to check how well supported they are.  For instance would need to check if 
> they
> don't block vectorization and stuff like that.
> 

I get the impression that they are likely to fail with vectorisation or
other more advanced optimisations (though I am no expert here).  I
mentioned them in case there is any inspiration or ideas you can copy,
rather than because I think they are useful.  I have not seen any use of
these types in real code, and I think the whole N1169 / TR 18037 was a
case of too little, too inconvenient to use, too late.  Few compilers
support it, fewer programmers use it.  (The syntax for named address
space is used by many embedded compilers, including gcc, but it is a
simple and obvious syntax extension that was used long before that TR.)

>>
>> While it is always good that the compiler can spot patterns in generic C code
>> and generate optimal instruction sequences, another possibility here would
>> be a set of built-in functions for saturated and rounding arithmetic.  T

Re: [RFC] Implementing detection of saturation and rounding arithmetic

2021-05-12 Thread David Brown
On 11/05/2021 19:00, Joseph Myers wrote:
> On Tue, 11 May 2021, David Brown wrote:
> 
>> It is also worth noting that gcc already has support for saturating
>> types on some targets:
>>
>> <https://gcc.gnu.org/onlinedocs/gcc/Fixed-Point.html>
>>
>> My testing of these (quite a long time ago) left me with a feeling that
>> it was not a feature anyone had worked hard to optimise - certainly it
> 
> The implementation isn't well-integrated with any optimizations for 
> arithmetic on ordinary integer types / modes, because it has its own 
> completely separate machine modes and operations on those.  I still think 
> it would be better to have a GIMPLE pass that lowers from fixed-point 
> types to saturating etc. operations on ordinary integer types, as I said 
> in <https://gcc.gnu.org/legacy-ml/gcc-patches/2011-05/msg00846.html>.

That would make sense (to me anyway, with my limited knowledge),
especially if this work on ordinary integer types pays off.  That would
surely let you simplify the sat/accum/fract type handling while
simultaneously making it work on a wider variety of targets.

> 
> Note however that such lowering should be more or less independent of 
> what's being discussed in this thread - this thread is about better 
> optimization of such operations on ordinary types (with or without 
> built-in functions of some kind in addition to recognition of such 
> operations written in generic C), which you can do independently of 
> what's done with fixed-point types.
> 

Yes, indeed.  I mentioned them for comparison, and in case there were
ideas which could be copied.  I don't think the N1169/TR 18037 types are
much (if ever) used in real code - with Tamar's planned optimisations,
the use-cases for them will be even fewer.


Re: [RFC] Implementing detection of saturation and rounding arithmetic

2021-05-11 Thread David Brown
On 11/05/2021 07:37, Tamar Christina via Gcc wrote:
> Hi All,
> 
> We are looking to implement saturation support in the compiler.  The aim is to
> recognize both Scalar and Vector variant of typical saturating expressions.
> 
> As an example:
> 
> 1. Saturating addition:
>char sat (char a, char b)
>{
>   int tmp = a + b;
>   return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
>}
> 
> 2. Saturating abs:
>char sat (char a)
>{
>   int tmp = abs (a);
>   return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
>}
> 
> 3. Rounding shifts
>char rndshift (char dc)
>{
>   int round_const = 1 << (shift - 1);
>   return (dc + round_const) >> shift;
>}
> 
> etc.
> 

I can't comment on the implementation part - I don't know anything about it.

However, in your examples above I see a few points.

One is your use of "char".  "char" is a type that varies in signedness
from target to target (and also depending on compiler flags), and is
slightly different in C and C++ ('a' has char type in C++, int type in
C).  If you must use "char" in arithmetic contexts, I recommend using
"signed char" or "unsigned char" explicitly.

I would rather recommend you use the size-specific  types -
int8_t, etc., - as being more appropriate for this kind of thing.
(AFAIK all gcc targets have 8-bit CHAR.)  This also makes it easier to
see the sizes you need for the "tmp" value as you make functions for
bigger sizes - remember that on some gcc targets, "int" is 16-bit.

It is also worth noting that gcc already has support for saturating
types on some targets:

<https://gcc.gnu.org/onlinedocs/gcc/Fixed-Point.html>

My testing of these (quite a long time ago) left me with a feeling that
it was not a feature anyone had worked hard to optimise - certainly it
did not make use of saturating arithmetic instructions available on some
of the targets I tested (ARM Cortex M4, for example).  But it is
possible that there are things here that would be of use to you.  (I am
not convinced that it is worth spending time optimising the
implementation of these - I don't think the N1169 types are much used by
anyone.)


While it is always good that the compiler can spot patterns in generic C
code and generate optimal instruction sequences, another possibility
here would be a set of built-in functions for saturated and rounding
arithmetic.  That would take the guesswork out of it for users - if
their code requires efficient saturated addition, they can use
__builtin_add_sat and get the best their target can offer (just like
__builtin_add_overflow, and that kind of thing).  And it might be easier
to implement in the compiler.
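
(A sketch of what such a built-in could reduce to on targets without a
native instruction - my own illustration on top of the existing
__builtin_add_overflow, not a proposal for the actual name:)

#include <stdint.h>

static inline int32_t add_sat32 (int32_t a, int32_t b)
{
    int32_t res;
    if (!__builtin_add_overflow (a, b, &res))
        return res;
    /* Overflow is only possible when a and b have the same sign, so
       saturate in that direction.  */
    return (a < 0) ? INT32_MIN : INT32_MAX;
}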


I hope these comments give you a few ideas or useful thoughts.

David



Re: -flto and -Werror

2021-05-04 Thread David Brown
On 04/05/2021 14:39, Matthias Klose wrote:
> Using -flto exposes some new warnings in code, as seen in the both build logs
> below, for upstream elfutils and systemd.  I have seen others.  These 
> upstreams
> enable -Werror by default, but probably don't see these warnings turning to
> errors themself, because the LTO flags are usually injected by the packaging 
> tools.
> 
> e.g.
> https://launchpadlibrarian.net/536740411/buildlog_ubuntu-hirsute-ppc64el.systemd_248.666.gd4d7127d94+21.04.20210503043716_BUILDING.txt.gz
> e.g.
> https://launchpadlibrarian.net/536683989/buildlog_ubuntu-hirsute-amd64.elfutils_0.183.43.g92980edc+21.04.20210502190301_BUILDING.txt.gz
> 
> showing:
> 
> ../src/shared/efi-loader.c: In function ‘efi_get_reboot_to_firmware’:
> ../src/shared/efi-loader.c:168:16: error: ‘b’ may be used uninitialized in 
> this
> function [-Werror=maybe-uninitialized]
> 
> i386_lex.c: In function ‘i386_restart’:
> i386_lex.c:1816:25: error: potential null pointer dereference
> [-Werror=null-dereference]
>  1816 | b->yy_bs_column = 0;
> 
> A coworker worked out by review that these warnings are false positives.  Now
> the first option already has the *maybe* in it's name, the second option gives
> this hint in the message (*potentially*).  Now getting the complaint that
> -Werror isn't usable with -flto anymore.
> 
> Would it make sense to mark warnings with a high potential of false positives,
> which are not turned into errors with -Werror? And only turn these into errors
> with a new option, e.g. -Wall-errors?
> 
> Matthias
> 

I don't think that would make sense.  Compiling with -Werror is only
appropriate if you have a specific compiler version and a specific set
of warning flags - otherwise new warnings (either from different flags,
or a different compiler version) may cause your build to fail.  That's
the price you pay for the benefits of static error analysis and for
using -Werror to ensure that your code is checked against your set of
warnings.

Personally, I use -Werror in my own builds - but like most
warning-related flags, it is a flag you use during development (and thus
the solution in this case is probably to fix the possible issue in the
code), not for a bundle of known-good code for distribution and builds.

(That's my opinion on such flags.  Other opinions are available
elsewhere :-) )

David


Re: removing toxic emailers

2021-04-20 Thread David Brown
On 20/04/2021 16:15, Richard Kenner via Gcc wrote:
>> Just for the record, Google has no problem with the GPLv3.  Google stopped
>> working on GCC because they made a company decision to use clang instead.
>> That decision was made for technical reasons, not licensing reasons.
> 
> But note that some cellphone manufacturers (e.g, Samsung) have taken
> steps to prevent non-signed binaries from being loaded in their phones.
> This would have been problematic ("Tivoisation") if GPLv3 code was
> included in Android.
> 

That would surely only be relevant if people wanted to use their
telephones to compile code?  The license of the compiler does not matter
(except for libraries or code snippets that are copied directly by the
compiler to the binaries, and gcc has permissive license exceptions for
these.)




Re: On US corporate influence over Free Software and the GCC Steering Committee

2021-04-20 Thread David Brown
On 20/04/2021 08:54, Giacomo Tesio wrote:
> Hi GCC developers,
> 
> just to further clarify why I think the current Steering Committee is highly 
> problematic,
> I'd like you to give a look at this commit
> message over Linux MAINTAINERS
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=4acd47644ef1e1c8f8f5bc40b7cf1c5b9bcbbc4e
> 
> Here the relevant excerpt (but please go chech the quotation):
> 
> "As an IBM employee, you are not allowed to use your gmail account to work in 
> any way 
> on VNIC. You are not allowed to use your personal email account as a "hobby". 
> You 
> are an IBM employee 100% of the time. 
> Please remove yourself completely from the maintainers file. I grant you a 1 
> time 
> exception on contributions to VNIC to make this change." 
> 
> 
> This is happened yesterday (literally).

I know nothing of this case other than the link you sent.  But it seems
to me that the complaint from IBM is that the developer used his private
gmail address here rather than his IBM address.

It is normal practice in most countries that if you are employed full
time to do a certain type of job, then you can't do the same kind of
work outside of the job without prior arrangement with the employer.
That applies whether it is extra paid work, or unpaid (hobby) work.
This is partly because it can quickly become a conflict of interests,
and partly because you are supposed to be refreshed and ready for work
each day and not tired out from an all-night debugging session on a
different project.

Usually employers are quite flexible about these things unless there is
a clear conflict of interests (like working on DB2 during the day, and
Postgresql in the evening).  Some employers prefer to keep things
standardised and rigid.

A company like IBM that is heavily involved in Linux kernel coding will
want to keep their copyrights and attributions clear.  So if they have
an employee that is working on this code - whether it is part of their
day job or not - it makes sense to insist that attributions, maintainer
contact information and copyrights all make it clear that the work is
done by an IBM employee.  It is not only IBM's right to insist on this,
it might also be a legal obligation.

(It is quite possible that this guy's manager could have expressed
things a bit better - we are not privy to the rest of the email or any
other communication involved.)


This is precisely why copyright assignment for the FSF can involve
complicated forms and agreements from contributors' employers.


> 
> And while this is IBM, the other US corporations with affiliations in
> the Steering Committee are no better: 
> https://gcc.gnu.org/pipermail/gcc/2021-April/235777.html
> 

I can't see any relevance in that post other than your "big corporations
are completely evil because there are examples of them being bad" comments.

> I can understand that some of you consider working for such corporations "a 
> joy".
> But for the rest of us, and to most people outside the US, their influence
> over the leadership of GCC is a threat.

Please stop claiming to speak for anyone but yourself.  You certainly do
not speak for /me/.  I don't work for "such corporations", I am outside
the US, but I do not see IBM or others having noticeable influence over
gcc and thus there is no threat.

David


Re: removing toxic emailers

2021-04-17 Thread David Brown
On 17/04/2021 13:56, Giacomo Tesio wrote:
> Hi Gerald,,
> 
> On April 17, 2021 9:09:19 AM UTC, Gerald Pfeifer  wrote:
>> On Fri, 16 Apr 2021, Frosku wrote:
>>> In my view, if people employed by a small number of American
>> companies
>>> succeed in disassociating GCC from GNU/FSF, which is representative
>>> of the free software grassroots community
>>
>> I find this insistant focus by some on "American companies" 
>> interesting - and quite pointless. And my passport is burgundy.
> 
> 
> So much that in fact, we are talking about some of the most controversial
> corporation in the whole world.
> 
> And while we are talking about "toxic emailers", it's not lost to me
> the irony that all this divisive debate about inclusive and righteous 
> behaviour started with an email of a Facebook employee that defines
> working in Facebook "a joy".
> https://gcc.gnu.org/pipermail/gcc/2021-March/235091.html
> 
> Yeah the same Facebook that still does what Cambridge Analytica used to.
> 
>> It also is a completely unwarranted attack on the integrity of the
>> maintainers, contributors, and other leaders of GCC. Regardless of
>> the color of their passports.
> 
> This is a strawman.
> 
> People are just concerned about the undue influence that these
> controversial corporations can have on GCC through the influence they
> have on their employees.
> 

Do you have any justification for thinking that the number of such
"concerned people" is significant?  It is clearly at least one - you -
and arguably a couple of the others who have posted here.  But do you
think it is many, and do you think they have any reason or justification
for this concern?  (Repeating it multiple times in these mailing list
threads is not reasoning or justification - "proof by repeated
assertion" arguments can be dismissed off-hand.)

I am not a Facebook fan myself.  I have an account that I use almost
exclusively just for keeping up with a couple of sports clubs of which I
am a member, and which use Facebook to publish information.  I don't
like the way it tracks so much information about me and other people,
and I don't see how it benefits me.  (Google tracks information too, but
I see more benefit in it.)  However, that is /my/ choice and /my/
opinion, and the way /I/ like to use (or avoid) social media.  Other
people have very different opinions, and find a lot to like about
Facebook.  That's /their/ choice.

Big companies like Facebook and Google are powerful tools.  They usually
try to be "good" most of the time - after all, they are staffed by real
people with real consciences who are, as most people are the world over,
basically good people.  They will make mistakes sometimes, and powerful
tools get abused on occasion.  But on the whole they are trying to
provide a service people want and can make use of, while also making a
living in the process.  Anything else is paranoia - and like most
conspiracy theories, it falls flat when you realise it would involve
huge numbers of people keeping quiet about doing evil.


I do believe that Facebook, Google, IBM, etc., will have /some/
influence on gcc and all the other free and open source projects that
they support.  That is because they are big users of such software - it
makes sense for them to support them and help and encourage them.  And
sometimes they will be contributing towards specific features that they
want for their systems.  (This does not seem to be common for gcc, as
far as I understand it from the key developers here.)  For example,
Facebook want improvements to filesystems in Linux so they have employed
people specifically to work on btrfs.

IMHO, this is /fine/.  There is nothing wrong with that.  It is
companies "scratching their own itches", just as individual developers
often do.  We all benefit.  It may be /influence/, but it is minor and
it is certainly not /undue/ influence.


The way you go on about "controversial American companies" and "undue
influence" suggests you think these companies are forcing their
employees on the gcc steering committee to add backdoors to gcc to tell
Facebook what projects you are compiling, or make gcc only work well on
Red Hat.  That would be utter nonsense.


So what is it that you think these companies are doing wrong for gcc?
How do you think they are influencing it?  Who are all these "concerned
people" ?

If you have justification, evidence, or even a rational argument for
your concerns, please share them.  If not, please stop repeating
baseless paranoia.  You have made your point, such as it is - please
move along now.  (That is not censorship - it's just a polite request to
stop wasting people's time.)

David Brown



Re: GCC association with the FSF

2021-04-11 Thread David Brown
On 11/04/2021 17:06, Jonathan Wakely via Gcc wrote:
> On Sun, 11 Apr 2021, 15:26 Richard Sandiford via Gcc, 

>>
>> FWIW, again speaking personally, I would be in favour of joining a fork.[*]
>>
> 
> Glad to hear it :-)
> 
> I will be forking, alone if necessary, but I've already been told by a few
> people I won't be alone.
> 

The big problem with a fork, rather than an amiable split (where FSF/GNU
accepts that gcc wants to be a separate project) is the name.  If the
FSF keep their own "gcc" project, then calling the new fork "gcc" as
well would cause confusion.  And calling it something else would also
confuse people - many would use the FSF gcc because of its name, not
realising that there is a better fork.  You can see that in the
OpenOffice / LibreOffice split - I think a large proportion of people
downloading OpenOffice do so without realising that LibreOffice exists
and is way ahead of it on features.

A fork may be unavoidable in the end, but a more diplomatic change of
structure would have many advantages if it can be achieved.


Re: GCC association with the FSF

2021-04-11 Thread David Brown
On 11/04/2021 16:37, Richard Kenner via Gcc wrote:
>> I guess my point is that the direction in which a project *does* go is not
>> always the direction in which it *should* go.  
> 
> I agree.  And depending on people's "political" views, that can either be
> an advantage or disadvantage of the free software development model.
> 
>> To give just one small practical example, I'm told (by people who are more
>> familiar with GCC internals than I) that it is not feasible with today's
>> GCC to port to backends which have a small number of registers.
> 
> [Finally, a technical discussion in this thread!]
> 
> It never really has been.  Maybe it's not even possible now (I don't
> know), but if you tried it in the past the results would never have
> been very good.  Almost all multi-backend systems operate by having
> very large numbers of expressions at all levels, which you gradually
> lower to actual registers.  This works quite well if you have enough
> registers to hold the high-usage expressions in them, but when you
> have high register pressure, the model breaks down completely.
> Although the situation may well have gotten worse in recent versions
> that I'm not familiar with, I'd say that GCC was probably doing a
> *better* job with a small number of registers in more recent versions
> than in older ones: "reload" was particularly bad when there was high
> register pressure.
> 
> When your main constraint is register pressure, in order to get
> high-quality results, I think you almost have to change the entire
> philosophy of compilation, to the point I think where you have an
> almost entirely different compilation chain for such machines.
> 

Low register count cpu designs have been out of fashion for quite some
time now (perhaps precisely because they are not a good fit for common
compiler strategies).  They are mostly found in older families, such as
the 8-bit CISC designs in older microcontrollers (8051, PIC, COP8, 6502,
etc.).  And you are absolutely right that you need a different way of
thinking in order to get the best out of such chips - low register count
is only one aspect.  Other issues are few or no flexible pointer
registers, no "SP + offset" addressing modes for efficient parameters or
stack frames, banked ram and code blocks, and multiple separate address
spaces.  Good toolchains for such devices need to work in a very
different way, and typically encompass compilation, assembling and
linking in one "omniscient" build so that variables, parameters, etc.,
can be placed statically in ways that minimise banking and maximise
reuse, based on lifetime analysis of the whole program.

This would be a massively different way of working from how gcc does
things now, and given that such devices are very much on the decline
(when 32-bit ARM microcontrollers can be bought for 30 cents, smaller
and cheaper cpu cores are rarely the right choice for a new design), it
would not make sense to spend the effort supporting them in gcc.  There
is, after all, quite a solid GPL'ed compiler toolchain for such devices
in SDCC.


Re: GCC association with the FSF

2021-04-11 Thread David Brown



On 11/04/2021 15:39, Alfred M. Szmidt wrote:
>It should remain an acronym, but it should now stand for "GCC Compiler
>Collection".  That allows the project to be disassociated from the GNU
>name while still subtly acknowledging its heritage.
> 
> Then it would not longer be GCC.  It would be something different.
> The whole point of GCC is to provide a free software compiler for the
> GNU system and systems based on GNU, and not to be pragmatic at the
> cost of software freedom.  Commercial interessts are often at odds
> with software freedom as well.  This is one of the many reasons why
> the GNU project is entierly volunteer based.
> 

It is decades since gcc has been /just/ a free compiler for the GNU
system.  That is still an important role, of course, but the compiler's
use has vastly outgrown that area.  The same applies to most of the GNU
projects.

And while I agree that commercial interests are /sometimes/ at odds with
free software, they are also essential for it - GNU would never have
existed without commercial software, and most or all of its projects
would have languished without commercial interest.  (Look, for example,
at the Hurd project - it is absolutely critical to the concept of having
a complete software system using only free software, but it is of almost
no commercial interest to anyone.  And thus it has had negligible
progress.)

Like it or not, money is essential to the way the world works, and
commercial interests are unavoidable.  You can make them work for you
while keeping the values and ideals you hold dear (such as by having
volunteers for development, with contributions and leadership
appointments being personal, while letting a commercial organisation pay
your wages).  Commercial interests are generally only a problem if you
let them be a problem.

> But I'd hope that we can avoid words like "fanaticism", "childish",
> "cultish" simply because of disagreement in philosophies or continuing
> to spread obvious misunderstandings of what someone wrote, it is not
> constructive and only causes unnsesescary agitation.
> 


Re: GCC association with the FSF

2021-04-10 Thread David Brown
On 10/04/2021 14:58, Pankaj Jangid wrote:
> 
> I have never said that the project will survive without maintainers. I
> just asked you to count me as well. Success of the project also depends
> on how widely it is used. And you need to look at the reasons why people
> are using it.
> 

I think it is useful to consider why people use gcc - I agree that
without users, there would be no project.

So why /do/ people use it?  I suspect that one of the biggest reasons is
"it's the only compiler that will do the job".  For a lot of important
software, such as Linux kernel, it is gcc or nothing.  Another big
reason is that gcc comes with their system, which is commonly the case
for Linux systems.  In the embedded development world (where I work),
the normal practice for getting a toolchain for a microcontroller is to
download an IDE and toolchain from the manufacturer - and these days it
is more often gcc than not.  You use gcc because that is the standard,
not from choice.

For those that actively /choose/ gcc, why do they do so?  I'd guess
being convenient, well-known and free (as in beer) come a lot higher
than the details of the licence, or the difference between "free
software" and "open source software".  (For me, a major reason is that
the same compiler supports a wide range of targets.  That, and that gcc
is technically a better compiler for my needs than any alternatives.)

I suspect that only a very small (but not zero) proportion of gcc users
care that the project is part of GNU and under the FSF.  I suspect that
a larger proportion would start caring if they felt (rightly or wrongly)
that at the top of the hierarchy was a misogynist who patronises and
sexually harasses women.

(As always, this is just my opinion.)


Re: GCC association with the FSF

2021-04-10 Thread David Brown



On 09/04/2021 20:36, John Darrington wrote:
> On Fri, Apr 09, 2021 at 07:01:07PM +0200, David Brown wrote:
>  
>  Different opinions are fine.  Bringing national or international
>  politics into the discussion (presumably meant to be as an insult) is
>  not fine.  This is not a political discussion - please stop trying to
>  make it one.
> 
> For the record it was David who first brought up the political allegory so
> this comment should be directed in his direction.

Fair enough.

> 
> As for your second point, I find it disappointing but not suprising that
> you "presumed" this comment to be an insult.   This is precisely the
> thing which has caused so much poisonous discourse in recent years.  Some
> people take any opinion they disagree with and look for ways to interpret
> it as an insult.   This gives them a lever to claim that anyone who holds
> that opinion is a chauvanist, a bigot or worse.   This must stop.
> 

I did not take the comment as an insult - I merely presumed that when
Christopher says someone is acting like the Russian or Chinese
government, he does not mean it in a good way.  (His later posts make
that entirely clear.)  I simply don't want to see this turn into a
political discussion.

I agree with you entirely that it is not helpful to perceive insults,
prejudice or bigotry - in general, it is important to keep the
discussion polite and try to remain focused.  That is what I wanted to
do by asking Christopher to avoid politics.


Re: GCC association with the FSF

2021-04-10 Thread David Brown
On 09/04/2021 20:02, Christopher Dimech wrote:
> 
>> Sent: Saturday, April 10, 2021 at 5:01 AM
>> From: "David Brown" 

>>
>> Different opinions are fine.  Bringing national or international
>> politics into the discussion (presumably meant to be as an insult) is
>> not fine.  This is not a political discussion - please stop trying to
>> make it one.
> 
> It is an assessment of what you propose.  The removal of people from all
> positions is a political statements.  I have no problem with political
> discussions and certainly don't take instructions from you, to say the 
> least!  What you talk about is exactly what drives Chinese and Russian
> officials to suppress anybody who does not conform with their demands.
> The consequences will be the same should you and others get your way
> of doing things.

There is a big difference between suppression or censorship, and wanting
people in leadership positions to be representative of the values of the
group they lead.  RMS can have all the opinions he wants, and act as he
will (until he ends up arrested for it), but if he is to remain a
representative for others (FSF, GNU and/or GCC), then he has a duty to
act appropriately according to the values those organisations think are
important.

I think that you mix up freedom and free rein. Freedom is not anarchy.
 Being free from sexism, prejudice, bullying, and harassment is as
important as freedom of speech or politics.

>>
>> We (the free software world) does not need a person with the qualities
>> of RMS any more - that is the point.  There should not be such a
>> position as "Chief GNUsance".
>  
> Secondly,  I cannot clearly see what status you have for making statements
> that imply a representation for the free software world!!!
> 

I have said very clearly that I am a user of gcc - not a developer, and
the opinions I express are very much my own.  That does not hinder me
from saying what I think the free software world (developers and users)
want or need.  I have not made any claims or suggestions that I am privy
to the minds of others, or that my opinions and ideas are in any way
more weighty than those of others.



Re: GCC association with the FSF

2021-04-09 Thread David Brown
On 09/04/2021 16:40, Christopher Dimech wrote:
>> Sent: Friday, April 09, 2021 at 10:37 PM
>> From: "David Brown" 
>> To: "John Darrington" , "David Malcolm" 
>> 
>> Cc: g...@gnu.org
>> Subject: Re: GCC association with the FSF
>>
>> On 09/04/2021 08:37, John Darrington wrote:
>>
>>>
>>> Nobody is suggesting that RMS should be regarded by everyone or indeed
>>> anyone as "mein Führer".  I think he would be very much concerned if anyone
>>> tried to confer a cult hero status on him.
>>>
>>> Sooner or later, if for no reason other than his age, RMS will have to step
>>> down as leader of GNU.   Rather than calling for his head on a block it
>>> would be more constructive to think to the future.  Unfortunately to date,
>>> I have not seen anyone who in my opinion would have the qualities necessary
>>> to take over the role.
>>>
>>
>> And I don't think people (at least, not many) are "calling for his
>> head".  My thought is that he should be encouraged to step down from all
>> his positions within GNU, FSF, gcc, and any other projects he is
>> involved with.  Retire now, while he can do so with dignity and without
>> harm to the free and open source software worlds.
> 
> David, I oppose your thought that he should be made to step down from ALL
> his positions.  That's the fundamental philosophy of China and Russia.
>  

Different opinions are fine.  Bringing national or international
politics into the discussion (presumably meant to be as an insult) is
not fine.  This is not a political discussion - please stop trying to
make it one.

>> It is only if it is left too late that people will be /forced/ to call
>> for his head.  You can be very sure that complaints about his behaviour
>> and attitudes will not diminish - they will grow, and the result will
>> not be good for RMS, GNU, gcc, users, developers, or anyone else except
>> the sellers of tabloid newspapers.  I would rather see him leave quietly
>> now with respect, than be hounded out later and his statues pulled down
>> - along with the careers and reputations of many who work with him.  (I
>> am not saying that such a destruction would be correct or appropriate -
>> I am saying it will happen in the end if the free software community is
>> not careful.)
>  
>> (I agree that there are few, if any, people who had the qualities of RMS
>> to do the job he did.  But IMHO that role is over - we don't need
>> someone to fill his shoes.)
>  
> I do not see that a person with the qualities of RMS would ask permission for
> the job.  I certainly don't! 
>  

We (the free software world) do not need a person with the qualities
of RMS any more - that is the point.  There should not be such a
position as "Chief GNUsance".



Re: GCC association with the FSF

2021-04-09 Thread David Brown
On 09/04/2021 08:37, John Darrington wrote:

> 
> Nobody is suggesting that RMS should be regarded by everyone or indeed
> anyone as "mein Führer".  I think he would be very much concerned if anyone
> tried to confer a cult hero status on him.
> 
> Sooner or later, if for no reason other than his age, RMS will have to step
> down as leader of GNU.   Rather than calling for his head on a block it
> would be more constructive to think to the future.  Unfortunately to date,
> I have not seen anyone who in my opinion would have the qualities necessary
> to take over the role.
> 

And I don't think people (at least, not many) are "calling for his
head".  My thought is that he should be encouraged to step down from all
his positions within GNU, FSF, gcc, and any other projects he is
involved with.  Retire now, while he can do so with dignity and without
harm to the free and open source software worlds.

It is only if it is left too late that people will be /forced/ to call
for his head.  You can be very sure that complaints about his behaviour
and attitudes will not diminish - they will grow, and the result will
not be good for RMS, GNU, gcc, users, developers, or anyone else except
the sellers of tabloid newspapers.  I would rather see him leave quietly
now with respect, than be hounded out later and his statues pulled down
- along with the careers and reputations of many who work with him.  (I
am not saying that such a destruction would be correct or appropriate -
I am saying it will happen in the end if the free software community is
not careful.)


(I agree that there are few, if any, people who had the qualities of RMS
to do the job he did.  But IMHO that role is over - we don't need
someone to fill his shoes.)


David Brown


Re: GCC association with the FSF

2021-04-08 Thread David Brown



On 08/04/2021 19:22, Giacomo Tesio wrote:
> No, David, 
> 
> On April 8, 2021 3:00:57 PM UTC, David Brown  wrote:
> 
>>  (And yes, I mean FOSS here, not just free software.)
> 
> you are not talking about Free Software, but Open Source.
> 
> FOSS, as a term, has been very successful to spread confusion.
> 

You have snipped the context.  Let me repeat it:

"""
... no one can
be in doubt that [RMS's] attitudes and behaviour are not acceptable by
modern standards and are discouraging to developers and users in the
FOSS community.  (And yes, I mean FOSS here, not just free software.)
"""

Most people who have enough interest in software to be aware of the
concepts of free and/or open source software lump them together.  That
applies to users and developers.  The majority of gcc users do not
care whether the project refers to itself as "free software" or
"open source software".  They often care that it is easily available at
zero cost (though some pay for it - I and my company have, at times,
bought gcc packages), and they like the fact that all the source code is
available even if they don't look at the source themselves.

But whoever you blame for spreading confusion, or for artificially
creating distinctions that rarely matter (this viewpoint has its
supporters too), the fact remains that the mix-up is real.  In almost
all circumstances, to almost all people, it is all "FOSS".  And the GNU
project, along with Linux, LibreOffice (or still OpenOffice, in most
people's minds), Firefox, and a few other big projects are viewed
together as a group and the opposite of "big company" software such as
MS Windows and Office, Apple software, and Adobe Photoshop (to take some
well-known examples).  The attitudes of GNU leaders have an influence on
all of this, as do other public leader figures such as Linus Torvalds.
Their influence (for good or bad) extends well outside the direct
hierarchy of their official positions within their projects.

> 
>> his attitudes and behaviour are not acceptable by
>> modern standards and are discouraging to developers and users in the
>> FOSS community.
> 
> In fact, I'm actively looking for alternatives to GCC (and LLVM) because I
> cannot trust a GCC anymore and I cannot review each and every change.
> 

That is your choice, obviously.  I don't agree with your points
expressed in this list so far, but you make your own decisions here.
Call me naïve, but I trust the maintainers of gcc to make good technical
decisions and make changes that improve the compiler suite.

I do think it is entirely possible that - for example - Facebook will
pay an employee to add features to gcc with the specific aim of
improving the efficiency of the code Facebook uses.  I think that would
be entirely reasonable, and I would be quite happy with it - either the
changes will coincidentally improve things that are useful to me, or
they will do no harm.  I think it is /implausible/ that any company would exert an
influence over gcc in order to make it worse for competitors or other
users.  This is an open source project (in addition to being free
software) - it is hard to make hidden changes when all changes are
reviewed and visible to many people.  I don't believe in conspiracy
theories - they require the cooperation of too many people who would
disagree and make a noise.

(Mistakes happen, and attacks from outside occasionally happen in open
source projects, but that's another matter.)

> I won't contribute my port and in general will suggest people to look for 
> alternatives.
> 
> 
> But that's not a problem for you, because you do not actually care about
> real developers and users, just about the US corporations you effectively
> mentioned and now control several GNU projects:

No, I have no particular interest in any companies (other than loyalty
to my own company).  I am not an American, nor do I live in America - I
am Scottish and live in Norway.  Not that that matters here.

And yes, I care about the gcc developers and their ability and freedom
to work as they want on the project.  I care about potential new
developers too - and I do not want to see them reject the idea of
working for gcc (or any other project) because they perceive a foul
atmosphere of bullying, sexual harassment or misogyny.  Nor would I want
anyone to avoid contributing to gcc because of perceived bias for or
against any particular country, culture, religion, or any other aspect
of life that has no relevance for code development.

And yes, I care about users - I am one, having used gcc for some 25
years on perhaps a dozen different targets.

I don't think any corporations control any GNU projects (with which I am
familiar) in the sense of deciding what goes into them, who works on
them, what direction they should take, or anything of that sort.

Re: GCC association with the FSF

2021-04-08 Thread David Brown
On 08/04/2021 18:43, Christopher Dimech wrote:
> 
>> Sent: Friday, April 09, 2021 at 3:00 AM
>> From: "David Brown" 
>> To: "Jonathan Wakely" , "David Malcolm" 
>> 
>> Cc: "GCC Development" , "Mark Wielaard" 
>> Subject: Re: GCC association with the FSF
>>

>> From a practical viewpoint, I am concerned that opinions about him will
>> spread.  If the gcc project is not disassociated from anything involving
>> RMS, I fear the project will suffer from that association, no matter how
>> unfair it may be.  At some point, someone in the public relations
>> department at IBM, Google, Facebook, ARM, or other big supporters of the
>> project will get the impression that the FSF and GNU are led by a
>> misogynist who thinks child abuse is fine if the child consents, and
>> will cut off all support from the top down.  The other companies will
>> immediately follow.  The gcc lead developers like Ian, Jonathan, Joseph
>> and Nathan will be given the choice of leaving gcc or leaving the job
>> that puts food on their tables.  gcc is not a hobby project run by
>> amateurs in their free time - it is a serious project that needs
>> commercial backing as well as the massive personal dedication it receives.
> 
> If RMS is not indispensable, Ian, Jonathan, Joseph and Nathan are likewise
> not indispensable.  Someone could take that over and make their own project and
> lead it how they wish.  There are many projects where the original author
> knows best where to lead.  Classic examples include medical project Gnu
> Health and my project.  Although one can also mess a project up, mistakes are
> allowed.  Einstein did not get his ideas from committees, neither did 
> Stallman.
> At work, I have never encountered any committee that did me any good.
> 

RMS was key to getting GNU and the whole concept of Free Software off
the ground.  He was key to the initial development of several important
pieces of software.  He is no longer key to the development of any
software in a technical sense, nor is he key to the philosophical or
ideological parts of the process.

I don't think that any of Ian, Jonathan, and the others are
indispensable.  But I think all of them together are.  If any one or two
of the key gcc developers left the project, life would go on.  If my
feared scenario occurred and many or all of the current gcc developers
who are employed by major IT and hardware companies had to leave, the
project would be dead.

> A good book to read is Maskell's "The New Idea of a University".
> If some think serious maintainers care about some public relations
> group at IBM, Google, or Facebook, they are highly mistaken.  I
> don't care.

As I said, I am a user.  I don't speak for the main developers of gcc,
or the maintainers of subprojects.  I expect that they do care about the
attitudes of the companies that employ them, at the very least.

> 
> Stallman can think whatever he likes.  There exist many valid opinions
> on questions like exactly how young people can be to get married or be
> depicted in pornography.  New Hampshire law allows 13 year olds to get
> married.  The only problem is that many western people are too far
> freaked out in relation to children, sex, and colonial guilt.
> 

Stallman can indeed think whatever he likes, in that no one else can
decide his opinions for him.  He cannot /do/ whatever he likes - I
believe (but do not claim to be able to prove) that some of his past
actions would fall foul of laws against sexual harassment.

However, those of us who think differently on such matters - and that
is, I think, the solid majority of people (not just westerners) - will not
want anything to do with a person who holds such opinions and encourages
such attitudes.



Re: GCC association with the FSF

2021-04-08 Thread David Brown
On 07/04/2021 19:17, Jonathan Wakely via Gcc wrote:
> On Wed, 7 Apr 2021 at 15:04, David Malcolm wrote:
>> For myself, I'm interested in copyleft low-level tools being used to
>> build a Free Software operating system, but the "GNU" name may be
>> permanently tarnished for me; I have no wish to be associated with a
>> self-appointed "chief GNUisance".  I hope the FSF can be saved, since
>> it would be extremely inconvenient to have to move.
> 
> This matches my feelings. If the FSF can be saved, fine, but I don't
> think GCC needs to remain associated with it.
> 
> If the GNU name is a problem, rename the projects to be simply "GCC",
> "Glibc", "GDB" etc without being an initialism.
> 

It should remain an acronym, but it should now stand for "GCC Compiler
Collection".  That allows the project to be disassociated from the GNU
name while still subtly acknowledging its heritage.

I am a gcc user, but not a developer or contributor.  I think it is
important to appreciate the good RMS has done for the software world,
and to accept history as it has happened rather than how we wish it had
been.  But going forward I don't think any project or organisation has
anything to gain by association with RMS, but will have much to lose.
To a large extent, he has done his job - the free and open source worlds
are now far too big and well-established to fail easily.  The time for
fanaticism, ideology and childish (ref. "Chief GNUisance") and
anti-social leadership is over - pragmatism, practicality and
cooperation are the way of the future.  It is time for the FSF to say to
RMS, "Thank you for all you have done.  Now move over for the next
generation, have a happy retirement, and please don't spoil the future
for the rest of us".  (We still need a few ideologists involved, to
remind us of important principles if anyone strays too far.  It's like a
healthy democratic parliament requiring a few representatives from the
greens, communists and other niche parties - you just don't want them
running the show.)

For me as a person, I cannot condone certain aspects of RMS' behaviour.
I strongly disapprove of "proof by accusation and rumour" or "trial by
public opinion", but there is enough documented evidence in his own
publications and clearly established personal accounts that no one can
be in doubt that his attitudes and behaviour are not acceptable by
modern standards and are discouraging to developers and users in the
FOSS community.  (And yes, I mean FOSS here, not just free software.)

From a practical viewpoint, I am concerned that opinions about him will
spread.  If the gcc project is not disassociated from anything involving
RMS, I fear the project will suffer from that association, no matter how
unfair it may be.  At some point, someone in the public relations
department at IBM, Google, Facebook, ARM, or other big supporters of the
project will get the impression that the FSF and GNU are led by a
misogynist who thinks child abuse is fine if the child consents, and
will cut off all support from the top down.  The other companies will
immediately follow.  The gcc lead developers like Ian, Jonathan, Joseph
and Nathan will be given the choice of leaving gcc or leaving the job
that puts food on their tables.  gcc is not a hobby project run by
amateurs in their free time - it is a serious project that needs
commercial backing as well as the massive personal dedication it receives.


It is my opinion - entirely personal, and as a long and happy user
rather than a developer, and not speaking for my company or anyone else
- that gcc would be a stronger project if it were to separate from the
FSF and GNU.  It should have a "board of directors", or steering
committee, or something similar - but these should be selected
democratically and openly in some manner, perhaps by votes from major
contributors and/or subproject maintainers.  This board or committee
could have representatives from the gcc developers, from major
commercial contributors, from major users (Linux kernel people, Debian
folk, etc.), from target manufacturers (Intel, ARM, etc.), from ordinary
users - in short, it should represent the people who have most interest
in the future success of the project.

It might also make sense to gang together with other important toolchain
projects, such as the binutils folk.


David Brown
(A mostly happy embedded gcc user.)



Re: using undeclared function returning bool results in wrong return value

2021-02-20 Thread David Brown



On 20/02/2021 16:46, David Malcolm wrote:
> On Sat, 2021-02-20 at 15:25 +0100, David Brown wrote:


> 
> I think we need to think about both of these use-cases e.g. as we
> implement our diagnostics, and that we should mention this distinction
> in our UX guidelines...
> 
>> Is it possible to distinguish these uses, and then have different
>> default flags?  Perhaps something as simple as looking at the name
>> used
>> to call the compiler - "cc" or "gcc" ?
>>
> 
> ...but I'm wary of having an actual distinction between them in the
> code; it seems like a way to complicate things and lead to "weird"
> build failures.
> 

Fair enough.

> Thought experiment: what might a "--this-is-my-code" option do?
> 

It should read the programmer's mind and tell them of any discrepancies
between what they wrote and what they meant :-)

I'd say it should make "-Wall" the default, and complain if "-std" is
not specified explicitly, and if there is no "-O" flag (or a #pragma GCC
optimize early in the code - even if it is an explicit -O0).  That would
cover things that I see people getting wrong regularly.

(I am a big fan of explicit rather than implicit, and having default
behaviour be a complaint that you are relying on default behaviour.  But
I may not be a typical user.)

> Hope this is constructive
> Dave
> 


Re: using undeclared function returning bool results in wrong return value

2021-02-20 Thread David Brown
On 19/02/2021 12:18, Jonathan Wakely via Gcc wrote:
> On Fri, 19 Feb 2021 at 09:42, David Brown wrote:
>> Just to be clear - I am not in any way suggesting that this situation is
>> the fault of any gcc developers.  If configure scripts are failing
>> because they rely on poor C code or inappropriate use of gcc (code that
>> requires a particular C standard should specify it - gcc has the "-std="
>> flags for that purpose), then the maintainers of those scripts should
>> fix them.  If Fedora won't build just because the C compiler insists C
>> code is written in C, then the Fedora folk need to fix their build system.
> 
> It's not Fedora's build system, it's the packages in Fedora's build
> systems. Lots of them. And those same packages are in every other
> Linux distro, so everybody needs to fix them.
> 

It seems to me that there are two very different uses of gcc going on
here.  (I'm just throwing up some ideas here - if people think they are
daft, wrong or impractical, feel free to throw them out again!  I am
trying to think of ways to make it easier for people to see that there
are problems with their C or C++ code, without requiring impractical
changes on large numbers of configuration files and build setups.)

gcc can be used as a development tool - it is an aid when writing code,
and helps you write better code.  Here warnings of all sorts are useful,
as it is better to find potential or real problems as early as possible
in the development process.  Even warnings about style are important
because they improve the long-term maintainability of the code.

gcc can also be used to build existing code - for putting together
distributions, installing on your own machine, etc.  Here flags such as
"-march=native" can be useful but non-critical warnings are not, because
the person (or program) running the compiler is not a developer of the
code.  This use is as a "system C compiler".

Is it possible to distinguish these uses, and then have different
default flags?  Perhaps something as simple as looking at the name used
to call the compiler - "cc" or "gcc" ?


Re: using undeclared function returning bool results in wrong return value

2021-02-19 Thread David Brown




On 19/02/2021 09:45, Florian Weimer wrote:

* David Brown:


On 18/02/2021 13:31, Florian Weimer via Gcc wrote:

* Jonathan Wakely via Gcc:


Declare your functions. Don't ignore warnings.


It's actually a GCC bug that this isn't an error.  However, too many
configure scripts would still break if we changed the default.



People have had 22 years to fix them.  Implicit function declarations
were a terrible idea from day 1, and banned outright in C99.  It was
reasonable for them to be accepted when the default C standard for gcc
was "gnu90" - they should have never been acceptable in any later
standards without needing an explicit flag.  Since gcc 5 they have given
a warning by default - surely it is time for them to be a hard error?




Just to be clear - I am not in any way suggesting that this situation is 
the fault of any gcc developers.  If configure scripts are failing 
because they rely on poor C code or inappropriate use of gcc (code that 
requires a particular C standard should specify it - gcc has the "-std=" 
flags for that purpose), then the maintainers of those scripts should 
fix them.  If Fedora won't build just because the C compiler insists C 
code is written in C, then the Fedora folk need to fix their build system.


I appreciate that consistency and compatibility with existing, old and 
unmaintained code bases, configures and build systems is important.  But 
the cost is that people continue to make the same mistakes they did 
before, they continue to write buggy code, and they continue to cause 
crashes, security holes, and other trouble.  At least some problems 
could be stopped entirely by checks from tools like gcc.  There is never 
going to be a catch-all "-Wbug-in-the-program" warning, but I really 
don't think it is unreasonable for a compiler to give an error on code 
that is considered so bad it is no longer supported by the language.


The big problem I see is that as long as tools turn a blind eye (or at 
least a tolerant eye) to code faults in order to retain compatibility 
with older and poorer code bases, they let the same mistakes through in 
/new/ code.



Have you actually tried to make the change and seen what happens?



No - my comments are entirely wishful thinking.  I realise the decisions 
for this kind of thing have to be made by people who /have/ tried it, 
and see the effects on a wide range of software - not by someone like me 
who uses gcc primarily for his own code.



I fixed one bug in GCC less than two years ago because apparently, I was
the first person trying to change the GCC default for real.  This was
actually my second attempt, this time using Jeff's testing
infrastructure.  The first attempt totally broke Fedora, so we gave up
immediately and never even got as far as encountering the GCC bug.  The
second attempt got a little bit further, fixing bash, gawk, gettext,
gnulib, make.  Maybe that's all of GNU that needed fixing, but that
seems unlikely (I didn't get through the full list of failing
components).  There were also many failures from other sources.  Some
looked rather hard to fix, for example unzip
<https://bugzilla.redhat.com/show_bug.cgi?id=1750694>.  In many cases
key system components were affected where the upstream status is a bit
dubious, so there is no good place for distributions to coordinate their
fixes and share the effort.

This is just another thing that is unfixable in (GNU) C.  Personally, I
have stopped caring as long as the problem is not present in C++.



I do understand that.  I am not really expecting gcc to change its 
defaults here - the practicalities involve too much work.  But it does 
not stop me /wanting/ a change, or believing the software world would be 
better for having such a change.


David



Re: using undeclared function returning bool results in wrong return value

2021-02-18 Thread David Brown
On 18/02/2021 13:31, Florian Weimer via Gcc wrote:
> * Jonathan Wakely via Gcc:
> 
>> Declare your functions. Don't ignore warnings.
> 
> It's actually a GCC bug that this isn't an error.  However, too many
> configure scripts would still break if we changed the default.
> 

People have had 22 years to fix them.  Implicit function declarations
were a terrible idea from day 1, and banned outright in C99.  It was
reasonable for them to be accepted when the default C standard for gcc
was "gnu90" - they should have never been acceptable in any later
standards without needing an explicit flag.  Since gcc 5 they have given
a warning by default - surely it is time for them to be a hard error?
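To spell out the failure mode behind the thread title, here is a minimal
sketch (the file and function names are mine, purely illustrative):

/* lib.c - the function genuinely returns bool, so on a typical ABI
   only the low byte of the return register is meaningful. */
#include <stdbool.h>

bool is_ready(void) { return true; }

/* main.c - no declaration in scope, so the pre-C99 rules invent
   "extern int is_ready()".  The caller then reads a full int whose
   upper bits may be garbage, and the comparison can fail even though
   the function returned true. */
int main(void)
{
    if (is_ready() != 1)
        return 1;   /* can be taken "impossibly" on some targets */
    return 0;
}

Compiled as two translation units, gcc warns about the implicit
declaration in main.c; promoting that warning to an error (as Florian
suggests below) is exactly what catches it.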

> So either use -Werror=implicit-function-declaration or C++ for the time
> being.
> 
> Thanks,
> Florian
> 
> 



Re: Comma Operator - Left to Right Associativity

2021-02-04 Thread David Brown



On 04/02/2021 22:21, Andreas Schwab wrote:
> On Feb 04 2021, David Brown wrote:
> 
>> For the built-in comma operator, you get guaranteed order of evaluation
>> (or more precisely, guaranteed order of visible side-effects).  But for
>> a user-defined comma operator, you do not - until C++17, which has
>> guaranteed evaluation ordering in some circumstances.
> 
> But not the evaluation order of function arguments.  See
> <https://en.cppreference.com/w/cpp/language/eval_order> Sequenced-before
> rules, rule 15.

Correct.

> 
>> Try your test again with "-std=c++17" or "-std=gnu++17" - if the order is
>> still reversed, it's a gcc bug (AFAICS).
> 
> I don't think so.
> 

Unless I am missing something, in the OP's program it is a user-defined
comma operator that is called.  There is only one argument to the
"test_comma_operator" function, the result of that user-defined comma
operator.  So rule 15 above does not apply - rule 16 applies.

At least that is /my/ reading of the cppreference page and the OP's program.

David



Re: Comma Operator - Left to Right Associativity

2021-02-04 Thread David Brown
On 04/02/2021 21:08, AJ D via Gcc wrote:
> Isn't comma operator suppose to honor left-to-right associativity?
> 
> When I try it on this test case, it exhibits right-to-left associativity.

You are not talking about associativity - you are talking about
evaluation order.  (The two things are often mixed up.)

For the built-in comma operator, you get guaranteed order of evaluation
(or more precisely, guaranteed order of visible side-effects).  But for
a user-defined comma operator, you do not - until C++17, which has
guaranteed evaluation ordering in some circumstances.

See <https://en.cppreference.com/w/cpp/language/eval_order> (it's
easier to read than the C++ standards).

Try your test again with "-std=c++17" or "-std=gnu++17" - if the order is
still reversed, it's a gcc bug (AFAICS).  But for standards prior to
C++17, the ordering is unspecified for user-defined comma operators.
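A small test case makes the change visible (the Tracer type and make()
helper are mine, not from the original post):

#include <iostream>

struct Tracer { int id; };

// User-defined comma operator - really just a function call in disguise.
Tracer operator,(const Tracer &, const Tracer &b) { return b; }

Tracer make(int id)
{
    std::cout << "evaluated " << id << '\n';
    return Tracer{id};
}

int main()
{
    // Before C++17, "evaluated 2" may legitimately print first.
    // With -std=c++17 the left operand is sequenced before the right.
    Tracer t = (make(1), make(2));
    std::cout << "result: " << t.id << '\n';
}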

This is not unlike the difference between the built-in logic operators
&& and ||, which are guaranteed short-circuiting, while user-defined
overloads are not.  And for arithmetic operators, you don't get integer
promotion, automatic conversion to a common type, etc.

Basically, the user-defined operators are just syntactic sugar for a
function call - they don't have the "magic" features of the real operators.

David



Re: Static analysis updates in GCC 11

2021-01-29 Thread David Brown

On 29/01/2021 01:03, Martin Sebor wrote:

On 1/28/21 2:27 PM, David Malcolm via Gcc wrote:

On Thu, 2021-01-28 at 22:06 +0100, David Brown wrote:




I wrote a feature request for gcc a while back, involving adding tag
attributes to functions in order to ensure that certain classes of
functions are only used from specific allowed functions.  The feature
request attracted only a little interest at the time.  But I suspect it
could work far better along with the kind of analysis you are doing with
-fanalyzer than with the normal syntactical analyser in gcc.

<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88391>


Interesting.  The attribute ideas seem designed to work with the
callgraph: partitioning the callgraph into families of functions for
which certain kinds of inter-partition edges are disallowed.  Can a
function change its tag internally, or is it assumed that a function
has a single tag throughout its whole body?  I see that you have a case
in example 3 where a compound statement is marked with an attribute
(which may be an extension of our syntax).


Florian suggested a similar approach (tags) as an enhancement to
the malloc attribute extension we've just added, to avoid having
to exhaustively associate every allocator with every deallocator.



That could be nice - and it could be useful for all sorts of other 
resource management, not just memory pools and allocators.


One thing that always concerns me about the "malloc" attribute and 
memory pools is that the source of the pool has to come from somewhere 
(such as an OS memory allocation, or perhaps statically allocated memory blocks) 
and your allocator will generally have pointers to keep track of it. 
That means the pointer given out by the malloc-type function /is/ 
aliased to existing memory theoretically accessible via other methods. 
I've never felt entirely comfortable that home-made allocators are 
actually completely safe and correct for all possible alias analysis. 
(And I suspect the move towards provenance based alias tracking will not 
make this easier.)


Perhaps if there are tags for malloc-like function attributes, there 
could be attributes that use the same tags to mark data blocks or 
pointers as being the source for the allocator pools.





Re: Static analysis updates in GCC 11

2021-01-29 Thread David Brown




On 28/01/2021 22:27, David Malcolm wrote:

On Thu, 2021-01-28 at 22:06 +0100, David Brown wrote:

On 28/01/2021 21:23, David Malcolm via Gcc wrote:

I wrote a blog post covering what I've been working on in the analyzer
in this release:
https://developers.redhat.com/blog/2021/01/28/static-analysis-updates-in-gcc-11/



As a gcc user, I am always glad to hear of more static analysis and
static warning work.  My own work is mostly on small embedded systems,
where "malloc" and friends are severely frowned upon in any case and
there is no file system, so most of the gcc 10 -fanalyzer warnings are
of no direct use to me.  (I still think they are great ideas - even if
/I/ don't write much PC code, everyone benefits if there are fewer bugs
in programs.)  I will get more use for the new warnings you've added for
gcc 11.


I wrote a feature request for gcc a while back, involving adding tag
attributes to functions in order to ensure that certain classes of
functions are only used from specific allowed functions.  The feature
request attracted only a little interest at the time.  But I suspect it
could work far better along with the kind of analysis you are doing with
-fanalyzer than with the normal syntactical analyser in gcc.

<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88391>


Interesting.  The attribute ideas seem designed to work with the
callgraph: partitioning the callgraph into families of functions for
which certain kinds of inter-partition edges are disallowed.  Can a
function change its tag internally, or is it assumed that a function
has a single tag throughout its whole body?  I see that you have a case
in example 3 where a compound statement is marked with an attribute
(which may be an extension of our syntax).

One thing I forgot to mention in the blog post is that the analyzer now
supports plugins; there's an example of a mutex-checking plugin here:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=66dde7bc64b75d4a338266333c9c490b12d49825
which is similar to your examples 1 and 3.  Your example 2 is also
reminiscent of the async-signal-unsafe checking that the analyzer has
(where it detects code paths that are called within a signal handler
and complains about bad calls within them).  Many of the existing
checks in the analyzer are modelled as state machines (either global
state for things like "are we in a signal handler", or per-value state
for things like "has this pointer been freed"), and your examples could
be modelled that way too (e.g. "what sections are in RAM" could be a
global state) - so maybe it could all be done as analyzer plugins, in
lieu of implementing the RFE you posted.

Hope this is constructive
Dave



Thanks for the feedback.

Just to be clear, I am not particularly tied to the syntax I suggested 
in the bugzilla entry - it is merely a starting point for ideas. 
"caller_tag" and "callee_tag", for example, are probably too similar and 
easily mixed up by people using the attributes.  "needs_tag" and 
"limit_to_tag" might be a better second suggestion.  Or perhaps someone 
else will think of a completely different arrangement to get a similar 
end result.
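To make that concrete, here is one possible reading of the second
suggestion - purely illustrative syntax, since no such attribute exists
in gcc today:

/* Hypothetical - a sketch only.  A function declares the tag its
   callers must hold; another function declares that it holds it. */
__attribute__((needs_tag("irq")))
void buffer_push_from_irq(int value);

__attribute__((limit_to_tag("irq")))   /* this function holds "irq" */
void timer_isr(void)
{
    buffer_push_from_irq(42);   /* OK - the caller holds the tag */
}

void main_loop(void)
{
    buffer_push_from_irq(7);    /* diagnosed - no tag held here */
}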


Yes, I believe tying an attribute to a compound statement might be a new 
idea, but that could just be because the only "statement attribute" 
currently in gcc (AFAIK, from reading the manual) is "fallthrough" - and 
it is generally applied to a null statement.


I suggested the possibility of attaching attributes to statements within 
a function, so that the developer can make it clear when the tag is 
acquired and released.  But it is not essential - it would be fine to 
simply say that the whole function is tagged, if that makes an 
implementation simpler.


At least some of the potential uses here could be handled by C++ strong 
typing and tag structs (carrying no data, but with restrictions on how 
they can be created and copied around).  But tag attributes, or an 
equivalent mechanism, would let you do this in C as well, and it would 
be less strict and structured - and thus easier to add to existing code.
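For reference, a minimal sketch of that C++ technique (all names here
are mine):

#include <iostream>

class IrqContext {
    IrqContext() = default;                  // private: only friends construct
    friend void isr_entry();                 // the one place tokens are minted
public:
    IrqContext(const IrqContext &) = delete; // tokens cannot be copied around
};

// Callable only by code that has been handed a token, i.e. code
// reachable from isr_entry().
void push_from_irq(const IrqContext &, int value)
{
    std::cout << "pushed " << value << " from IRQ context\n";
}

void isr_entry()
{
    IrqContext tag;             // OK here: isr_entry() is a friend
    push_from_irq(tag, 42);
}

int main()
{
    isr_entry();
    // push_from_irq(IrqContext{}, 1);   // error: constructor is private
}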





Re: Static analysis updates in GCC 11

2021-01-28 Thread David Brown
On 28/01/2021 21:23, David Malcolm via Gcc wrote:
> I wrote a blog post covering what I've been working on in the analyzer
> in this release:
>  
> https://developers.redhat.com/blog/2021/01/28/static-analysis-updates-in-gcc-11/
> 

As a gcc user, I am always glad to hear of more static analysis and
static warning work.  My own work is mostly on small embedded systems,
where "malloc" and friends are severely frowned upon in any case and
there is no file system, so most of the gcc 10 -fanalyzer warnings are
of no direct use to me.  (I still think they are great ideas - even if
/I/ don't write much PC code, everyone benefits if there are fewer bugs
in programs.)  I will get more use for the new warnings you've added for
gcc 11.


I wrote a feature request for gcc a while back, involving adding tag
attributes to functions in order to ensure that certain classes of
functions are only used from specific allowed functions.  The feature
request attracted only a little interest at the time.  But I suspect it
could work far better along with the kind of analysis you are doing with
-fanalyzer than with the normal syntactical analyser in gcc.



David


Re: No warning for module global variable which is set but never used

2020-12-10 Thread David Brown
On 10/12/2020 16:10, webmaster wrote:

(As a general rule, you'll get more useful responses if you use your
name in your posts.  It's common courtesy.)


> Is it possible to request such feature?
> 

Of course you can file a request for it.  Go to the gcc bugzilla site:



First, search thoroughly to see if it is already requested - obvious
duplicate requests just waste developers' time.  If you find a
duplicate, add a comment and put yourself on the cc list.  If you don't
find a duplicate, file it as a new bug.

Given the replies on this list from gcc developers, I would not hold my
breath waiting for this feature.  It is unlikely to be implemented
unless the relevant compiler passes are re-organised in some way, or
extra information is tracked.  So I don't think it will be a priority.

However, it's always good to track these things - and if many people
want a particular feature, it can't harm its chances of getting done
eventually.

mvh.,

David


> On 09.12.2020 at 16:45, webmaster wrote:
>> I have the following C/C++ code:
>>
>> static int foo = 0;
>>
>> static void bar(void)
>> {
>>     foo = 1;
>> }
>>
>> Here it is clear to the compiler that the variable foo can only be
>> accessed from the same module and not from other modules.  From the
>> explanations before I understand that the variable is removed due to
>> optimization.  But I do not understand why GCC does not throw a warning.
>>
>> From my point of view it is the responsibility of the developer to remove
>> the unused variable.
>>
> 
> 



Re: No warning for module global variable which is set but never used

2020-12-09 Thread David Brown
On 09/12/2020 11:00, Jakub Jelinek wrote:
> On Wed, Dec 09, 2020 at 10:50:22AM +0100, David Brown wrote:
>> I'd say that it makes sense to have such a warning as a natural
>> enhancement to the existing "-Wunused-but-set-variable" warning.  But I
> 
> That is not really possible.
> The -Wunused-but-set-* warning works by having two bits for the DECL,
> TREE_USED and DECL_READ_P, where any uses mark the var TREE_USED and
> (conservatively) what can read the value marks it DECL_READ_P
> and -Wunused-but-set-* is then variables that are TREE_USED and
> !DECL_READ_P.  All this needs to be done early in the FE.
> For the static vars, the optimization to remove them altogether is done
> much later, and at that point the compiler doesn't know if it isn't used
> because all the reads in the program have been optimized away vs. there were
> none.
> 
>   Jakub
> 

That's what I thought might be the case.  I've seen this before in
situations where it might seem to the layman that it is "obvious" that
there should be a warning here, or that "the compiler can optimise here,
surely it can also issue a warning".  If it were easy to implement a
warning in a situation like this, I guess the gcc developers would have
implemented it already!

I hope this gives the OP the information he is looking for.

David



Re: No warning for module global variable which is set but never used

2020-12-09 Thread David Brown
On 09/12/2020 10:25, webmaster wrote:
> Hello, I'm wondering why GCC does not throw any warning when a module global
> variable is set (written) but never used (read).  Is this behavior wanted?
> Does it make sense to add such a warning?  Greets
> 

How do you expect the compiler to know if the variable is never read?

If it has file linkage (it is declared "static") and its address is not
taken and exported out of the translation unit, then the compiler knows
all about it - and could warn about it.  If it has external linkage
(declared at file scope without "static", or with "extern") or its
address is passed out of the translation unit, then the compiler has no
way to tell how it might be used in other translation units.


If you write:

static int xx;

void foo(void) {
    xx = 2;
}

then gcc will eliminate the variable "xx" entirely, as it is never used.
The function "foo" is compiled to a single "return".  But no
"unused-but-set-variable" warning is emitted - and clearly the compiler
knows that the variable is set but not used.  (You get the warning if
the static variable is local to the function.)
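Side by side, as a sketch (compile with "gcc -O2 -Wall -c"):

static int file_scope;          /* written below, never read: no warning,
                                   and the variable is optimised away    */

void set_file_scope(void)
{
    file_scope = 2;
}

void local_case(void)
{
    static int func_scope;      /* here gcc does warn:
                                   -Wunused-but-set-variable             */
    func_scope = 2;
}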

I'd say that it makes sense to have such a warning as a natural
enhancement to the existing "-Wunused-but-set-variable" warning.  But I
can't say if it is a simple matter or not - sometimes these things are
surprisingly difficult to implement depending on the order of passes in
the compiler.  Then it becomes a trade-off of the utility of such a
warning against the effort needed to implement it.




Re: LTO Dead Field Elimination

2020-07-27 Thread David Brown
On 24/07/2020 17:43, Erick Ochoa wrote:
> This patchset brings back struct reorg to GCC.
> 
> We’ve been working on improving cache utilization recently and would
> like to share our current implementation to receive some feedback on it.
> 
> Essentially, we’ve implemented the following components:
> 
>     Type-based escape analysis to determine if we can reorganize a type
> at link-time
> 
>     Dead-field elimination to remove unused fields of a struct at link-time
> 
> The type-based escape analysis provides a list of types, that are not
> visible outside of the current linking unit (e.g. parameter types of
> external functions).
> 
> The dead-field elimination pass analyses non-escaping structs for fields
> that are not used in the linking unit and thus can be removed. The
> resulting struct has a smaller memory footprint, which allows for a
> higher cache utilization.
> 

I am very much a lurker on this list, and not a gcc developer - I am
just an enthusiastic user.  So my opinion here might not even be worth
the apocryphal two cents, and I won't feel insulted if someone says I
don't know what I am talking about here.

With that disclaimer out of the way, my immediate reaction to this idea
is "Why?".

What is the aim of this feature?  I can see it making many things a lot
more complicated, but I can't see it being of correspondingly
significant benefit.

Do you have reason to suppose that there really are many structs in use
in real code, for which there are fields that aren't used, and for which
removing those fields makes a /significant/ improvement to the
efficiency of the code?

If I have the history correct, gcc used to have an optimisation that
would let it re-order fields in a struct to reduce padding.  This got
removed in later versions because it made things so complicated in other
parts of compilation and linking, especially if some parts of the code
are compiled with different options.  Why would these new features not
suffer from exactly the same kinds of issues?

I worry that this kind of global optimisation has so many knock-on
effects that it simply is not worth the cost.  Remember that it is not
just in writing these patches that work must be done - people are going
to have to maintain the code for ever after, and it could mean changes
or limitations in other parts of gcc.

As I see it, there is a simpler path with much lower cost.  When I am
writing code, if I have poor struct layout or unnecessary fields, I
don't want the compiler to re-organise things behind my back.  I want to
be /told/.  Either I want the field there (perhaps I haven't written the
code that uses it yet), or I've made a mistake in my code.

My recommendation here would be to keep the analysis part - that could
be useful for other features too.  But drop the optimisation part -
replace it with a warning.  Tell the programmer, and let /them/ decide
whether or not the field is unnecessary.  Let the programmer make the
decision and the change - then you get all the benefits of the
optimisation with none of the risks or complications.

David



Re: Two new proposals to the upcoming C2X standard

2020-05-31 Thread David Brown

Hi,

On 31/05/2020 22:24, Xavier Del Campo Romero wrote:

Hi David,


-Wsizeof-pointer-div isn't required by the standard, so any compiler 
other than gcc or clang is not required to emit anything to the user. In 
such compilers, the security risk would still be there and would be up 
to the maintainers' willingness to implement such feature (or 
developers' to provide their own safe ARRAY_SIZE macro, if possible). 
IMHO, the standard should find a way to detect this situation at 
compile-time, which is the reason why I though a new keyword (_Lengthof) 
would provide better semantics than status quo and a way to emit a 
meaningful diagnostic message if being incorrectly used.


The suggested _Lengthof does not add any semantics or changes to the 
language - a program that uses _Lengthof correctly (i.e., with a 
parameter that does not give an error) could equally have used a plain 
ARRAY_SIZE macro to get identical semantics and identical code.  For 
code that is designed for "release" - for people other than the authors 
or maintainers to use - it makes no difference.  It is a feature that is 
only useful as an aid to developers who are interested in using tools 
with good warning messages, and interested in using those tools as much 
as possible (otherwise they wouldn't bother with a C20 specific 
feature).  Such users will already get warnings from gcc or clang, or 
use macros like Linux's safe array size macro (or equivalents for 
whatever compiler and tools they use for development).  These things 
don't need to be supported by a range of compilers.


So as far as I see it, the only people that would be interested in using 
_Lengthof already have access to the safety it gives, and the people who 
probably /should/ use it, would not.


The idea in itself is not bad, but I feel that the practice is not going 
to be useful to anyone.





However, I would like _Typeof/typeof (plus a standard header macro) to be 
added to the standard too, although I am not sure whether this has already 
been suggested to the committee (if so, with no success). Could you please 
help me bring up reasons why it should be proposed to the committee?




It is very useful, and it has existed for many years in at least two 
compilers.  (I guess I could give a number of examples of where I have 
found it useful, if you need it.)




I would prefer _Typeof over C++'s decltype to avoid confusion among 
users. C and C++ compatibility is already too complex to provide yet 
another keyword that has slightly different behavior depending on the 
language.




Agreed.  "typeof" predates "decltype" and I don't think "decltype" would 
add anything useful to C that you don't get from "typeof".  Adding 
"typedef" to C2x would just be documenting and standardising a 
long-standing existing feature.




OTOH, I see no suggestions regarding the second proposal (static 
compound literals inside body functions). Do you find this proposal 
acceptable?


I thought it seems perfectly reasonable, but had nothing to add beyond 
that - so I left it for others to comment.





Thank you very much for your feedback.


Thank /you/ for trying to improve the language standard.  It is not an 
easy task!


mvh.,

David



--
Xavier Del Campo Romero



May 29, 2020, 08:06 by david.br...@hesbynett.no:

On 28/05/2020 23:01, Xavier Del Campo Romero via Gcc wrote:

Hello gcc team,

I have sent the following proposals to the committee, but they
require them to be implemented at least into two major
compilers, so I am proposing them to be implemented into gcc.
This is going to be a rather lengthy e-mail, so TL;DR:

Proposal 1: New pointer-proof keyword _Lengthof to determine
array length
Motivation: solve silent bugs when a pointer is accidentally
used instead of an array


In gcc, the simple "#define ARRAY_SIZE(a) (sizeof (a) / sizeof *(a))"
gives a compile-time warning if passed a pointer, since the
introduction of "-Wsizeof-pointer-div" in -Wall in gcc 8.

I am not convinced that anyone who would be careful enough in their
coding to use such a new "_Lengthof" feature would not already be
using "-Wall" at least. So for gcc, the new keyword would, I think,
be useless. And for any compiler without an equivalent warning
(clang has it from version 7), it would likely be easier to add
the warning rather than the keyword.

Many people have figured out alternative "ARRAY_SIZE" macros that
work with existing compilers and give compile-time warnings on naked
pointers. But like your suggestion below, they rely on compiler
extensions such as gcc's "typeof".


So my counter-proposal for you would be to recommend gcc's "typeof"
as a new keyword (spelt "_Typeof", with "typeof" as a macro in a
standard header, in solid C tradition).

Then you have a feature that has a long-established history in two
major compilers (gcc and clang, at least), has been massively used

Re: Two new proposals to the upcoming C2X standard

2020-05-29 Thread David Brown

On 28/05/2020 23:01, Xavier Del Campo Romero via Gcc wrote:

Hello gcc team,

I have sent the following proposals to the committee, but they require them to 
be implemented at least into two major compilers, so I am proposing them to be 
implemented into gcc. This is going to be a rather lengthy e-mail, so TL;DR:

Proposal 1: New pointer-proof keyword _Lengthof to determine array length
Motivation: solve silent bugs when a pointer is accidentally used instead of an 
array


In gcc, the simple "#define ARRAY_SIZE(a) (sizeof (a) / sizeof *(a))" 
gives a compile-time warning if passed a pointer, since the introduction 
of "-Wsizeof-pointer-div" in -Wall in gcc 8.


I am not convinced that anyone who would be careful enough in their 
coding to use such a new "_Lengthof" feature would not already be using 
"-Wall" at least.  So for gcc, the new keyword would, I think, be 
useless.  And for any compiler without an equivalent warning (clang has 
it from version 7), it would likely be easier to add the warning 
rather than the keyword.


Many people have figured out alternative "ARRAY_SIZE" macros that work 
with existing compilers and give compile-time warnings on naked 
pointers.  But like your suggestion below, they rely on compiler 
extensions such as gcc's "typeof".



So my counter-proposal for you would be to recommend gcc's "typeof" as a 
new keyword (spelt "_Typeof", with "typeof" as a macro in a standard header, 
in solid C tradition).


Then you have a feature that has a long-established history in two major 
compilers (gcc and clang, at least), has been massively used in 
real-world code for decades, and has a huge range of useful use-cases. 
Get "typeof" into the C standards and many people will thank you!  And 
then your _Lengthof becomes a simple macro that can be put in a standard 
header file, and needs no established implementation to be acceptable. 
(And you don't need any help from gcc or clang, except perhaps for 
describing the details of the semantics of "typeof".)
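For completeness, a sketch of the kind of pointer-rejecting macro
alluded to above, modelled on the Linux kernel's version; it relies on
the gcc/clang extensions typeof and __builtin_types_compatible_p:

#include <stddef.h>

#define SAME_TYPE(a, b) \
    __builtin_types_compatible_p(typeof(a), typeof(b))

/* Evaluates to 0 normally; forces a compile-time error (negative
   bitfield width) if e is non-zero. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : (-!!(e)); }))

/* An array is never type-compatible with a pointer to its first
   element, so passing a plain pointer refuses to compile. */
#define ARRAY_SIZE(arr) \
    (sizeof(arr) / sizeof((arr)[0]) \
     + BUILD_BUG_ON_ZERO(SAME_TYPE((arr), &(arr)[0])))

int main(void)
{
    int a[10];
    int *p = a;
    size_t n = ARRAY_SIZE(a);   /* fine: n == 10 */
    /* size_t m = ARRAY_SIZE(p);   compile-time error */
    (void)p; (void)n;
    return 0;
}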




Re: size of exception handling (Was: performance of exception handling)

2020-05-13 Thread David Brown

On 13/05/2020 00:48, Jonathan Wakely via Gcc wrote:

On Tue, 12 May 2020 at 23:39, Jonathan Wakely wrote:

On Tue, 12 May 2020, 21:57 Freddie Chopin,  wrote:

Anyway... If you have to recompile the toolchain, the problem is still
there. Most of the people (like 99,666%) will not do that for various
reasons. Some don't know how, some use only Windows, some don't have
time to deal with the compilation (the whole toolchain takes around an
hour here, but this excludes the time to prepare the script that builds
it), some other consider the toolchain provided by MCU vendor (or by
ARM) as "tested to work correctly" so they don't want to replace that
with their custom built solution, and so on, and so on...


There is no one-size-fits-all solution that gives everybody their
ideal set of defaults, so we provide configuration options to tune
things for your needs. Complaining that you have to rebuild things to
get different defaults seems silly. Would you prefer we don't offer
the options at all?


And I also never said that every user should rebuild the toolchain.
The options can be used by vendors providing a toolchain for their
hardware, if the verbose handler (or exceptions in general!) are not
appropriate for their users. Just because something isn't the default,
doesn't mean every user needs to change it themselves.


I think complaining about extra unnecessary code (such as string 
handling for std::terminate) is justified - but the complaints should 
not be directed at the gcc or libstdc++ folks.  As you say, /you/ 
provide the options - if the vendors make poor choices of options, then 
it is /they/ who should get the bug reports and complaints.


One option that would be nice (I don't know if it is realistic), would 
be to say that the code should never stop normally.  On many embedded 
systems, main() never exits.  std::terminate() doesn't need any code 
except perhaps to reset the processor (that will be target-specific, of 
course).  exit() can never be called - there is no need for atexit 
functions, terminate handlers, global destructors, or any of the other 
machinery used for controlled shutdown and ending of a program.
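As a rough illustration of how little machinery such a target needs - a
sketch of my own, not a libstdc++ recipe:

#include <exception>

// No strings, no I/O, no unwinding support: just stop, or reset.
[[noreturn]] void minimal_terminate()
{
    // A real MCU port would trigger a reset here (watchdog, reset
    // register); spinning forever stands in for that.
    for (;;) { }
}

int main()
{
    std::set_terminate(minimal_terminate);  // install before any throw

    for (;;) {
        // main loop of the embedded application - never returns, so
        // atexit handlers and global destructors are never needed
    }
}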





And if writing a script and waiting an hour is too much effort to
reduce unwanted overhead, then I guess that overhead isn't such a big
deal anyway.



There are, as Freddie mentions, many other reasons for end-users not 
building their own toolchains.  I have built many cross-gcc toolchains 
over the years (starting with a gcc 2.95 m68k toolchain over 20 years 
ago, IIRC).  But for most professional embedded development, pre-built 
toolchains from vendors are a requirement - home-built is simply not an 
acceptable option.  Time and effort don't come into it.  (This is a good 
thing for gcc - a fair number of major gcc developers work for companies 
that earn money selling pre-built toolchains.)


Re: Usage of C11 Annex K Bounds-checking interfaces on GCC

2019-12-15 Thread David Brown

On 15/12/2019 02:57, Jeffrey Walton wrote:

On Sat, Dec 14, 2019 at 12:36 PM Martin Sebor  wrote:


On 12/9/19 8:15 PM, li zi wrote:

Hi All,
We are using gcc in our projects and we found some of the C standard functions 
(like memcpy, strcpy) used in gcc may induce security vulnerabilities like 
buffer overflow. Currently we have not found any instances which causes such 
issues.


(This post is all "In My Humble Opinion".)

The correct use of memcpy, strcpy, etc., does not introduce any security 
vulnerabilities.  /Incorrect/ use does - just as incorrect use of any 
function can risk bugs, and therefore security vulnerabilities.



But we feel better to change these calls to C11 Annex K Bounds-checking 
interfaces like memcpy_s, strcpy_s etc. By defining a secure calls method (list 
of func pointers) and allowing application to register the method. I understand 
that this affects performance because of return value check added for _s 
calls, but this will relieve overflow kind of issues from code. And also 
currently using bounds-checking interfaces is a general industry practice.
Please share your opinion on it, and if any discussion happened in community to 
do some changes in future.


Thoughtless "change all standard  functions to Annex K 
functions" is management arse-covering as an alternative to proper 
quality development techniques.  It adds nothing to the software 
quality, it has no reduction in the risk of errors or their 
consequences.  It is nothing more than a way to try to blame other 
people when something has gone wrong.


If you find that using these functions really does give you better 
software with lower risks of problems, then by all means use them.  But 
don't use them blindly just because someone says they are "safer".




GCC's Object Size Checking is a non-intrusive solution to
the problem.  It avoids the considerable risk of introducing
bugs while replacing existing calls with those to the _s
functions.  The implementation is restricted to constant
sizes so its effectiveness is a limited, but we have been
discussing enhancing it to non-constant sizes as well, as
Clang already does.  With that, it should provide protection
with an effectiveness comparable to the _s functions but
without any of the downsides.  (Note that GCC's buffer
overflow warnings are not subject to the same limitation.)
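
To make that concrete - this is only a sketch of the usual glibc + GCC 
behaviour, and the details vary by version:

#include <string.h>

void f(const char *p)
{
    char buf[4];
    /* Compiled with something like -O2 -D_FORTIFY_SOURCE=2, GCC
       expands this into __builtin___memcpy_chk(buf, p, 8,
       __builtin_object_size(buf, 0)) - giving a compile-time warning
       here, and a run-time abort in cases the compiler cannot see
       statically. */
    memcpy(buf, p, 8);
}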

Besides Object Size Checking, I would suggest making use of
the new attribute access.  It lets GCC detect (though not
prevent) out-of-bounds accesses by calls to user-defined
functions decorated with the attribute.
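
A small sketch of that attribute (fill() is a hypothetical user 
function; GCC 10 or later is assumed):

#include <stddef.h>

/* Tell GCC that fill() writes through argument 1, with the size
   given by argument 3. */
__attribute__((access(write_only, 1, 3)))
void fill(char *dst, char value, size_t n);

void demo(void)
{
    char buf[8];
    fill(buf, 'x', 16);     /* GCC can now warn: 16 > sizeof buf */
}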


The safer functions have three or four security goals. The workarounds
don't meet the goals.



Let's call them "Annex K" functions, rather than "safer functions", 
until it is demonstrated that they actually /are/ safer in some way. 
And let's talk about other tools in the toolbox, to use in addition or 
as alternatives, rather than calling them "workarounds".



The safer functions require the destination size to ensure a buffer
overflow does not occur.



That is useless unless you know the destination size.  And if you know 
the destination size, you can use that to make sure your calls to 
memcpy, strcpy, etc., are safe and correct.
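
A sketch of the comparison (assuming a libc that actually provides 
Annex K, which few do):

#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K, if available */
#include <string.h>

void copy_checked(char *dst, size_t dstsize, const char *src, size_t n)
{
#ifdef __STDC_LIB_EXT1__
    (void)memcpy_s(dst, dstsize, src, n);   /* needs dstsize... */
#else
    if (n <= dstsize)                       /* ...the very check you
                                               could write yourself */
        memcpy(dst, src, n);
#endif
}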


The use of the Annex K functions therefore gives you no technical 
benefits in this aspect.  They may, however, give you the non-technical 
benefit of forcing the programmer to think about destination sizes - 
they can't be lazy about it.  However, if you are dealing with code that 
needs to be of good quality then you should already have development 
practices that cover this - such as code reviews, programmer training, 
code standards, etc.



The safe functions always terminate the destination buffer.


So do the standard functions when used correctly.

I'm quite happy to agree that strncpy has a silly specification, and it 
can surprise people both in its inefficiency and in the fact that it 
does not always null-terminate the destination.  Creating a "strncpy_s" 
with redundant parameters and confusingly different semantics is /not/ 
the answer.  The right solution would be to deprecate strncpy, and make 
a replacement with a different name and better semantics, such as:


char * strscpy(char * restrict s1, const char * restrict s2, size_t n)
{
    if (n > 0) {
        *s1 = '\0';              /* start with an empty string...      */
        strncat(s1, s2, n - 1);  /* ...append at most n - 1 characters;
                                    strncat always null-terminates     */
    }
    return s1;
}

/That/ would have been a useful, clear, and consistent function that 
copies at most "n" characters from the string, and always terminates the 
copy.
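
A quick usage sketch of the strscpy() above:

#include <stdio.h>

int main(void)
{
    char buf[8];
    strscpy(buf, "a long source string", sizeof buf);
    puts(buf);   /* prints "a long " - truncated, but always terminated */
    return 0;
}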




The safe functions provide a consistent return value. There is only
one success code.


There is only one success code from the Annex K functions - but no 
guarantees about failures.  It will call the constraint handler - which 
might return, or might not.  Frankly, you have no idea what it might do 
in the general case, since the constraint handler is a global state 
variable and not even thread-safe.
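
A sketch of why that matters (again assuming an implementation that 
provides Annex K at all):

#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>   /* strcpy_s */
#include <stdlib.h>   /* set_constraint_handler_s, ignore_handler_s */

#ifdef __STDC_LIB_EXT1__
void risky(char *dst, rsize_t dstsize, const char *src)
{
    /* The handler is a single process-global setting.  Another thread
       can replace it between these two lines, so a failing strcpy_s
       might abort the program, quietly return an error code, or do
       anything else the handler of the moment does - the caller
       cannot know which. */
    set_constraint_handler_s(ignore_handler_s);
    (void)strcpy_s(dst, dstsize, src);
}
#endif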


The only way to write good, safe and reliable 

Re: programming language that does not inhibit further optimization by gcc

2019-03-30 Thread David Brown

On 30/03/2019 08:13, Albert Abramson wrote:

Now I'm on a totally unrelated project, writing code in C, but still using
the GCC compiler under the hood.  The previous developers used raw pointers
quite a bit.  However, as I expand the code, I'd like to use some of the
features in C++, but Atmel Studio doesn't REALLY support C++.
  Code::Blocks is letting me turn warning flags on and off, but I don't see
the same for individual C++ features.  (Lots of subsets of C++, but I
really need a true superset of C.)

Is there some way to manually turn on individual C++17 features, one at a
time?  I read somewhere that you can, but I don't see the post online
anywhere.  And if I create a list of individual C++ features, can this made
into a kind of standard, shared with other programmers?  In other words,
I'd like to make use of lambdas, namespaces, smart pointers, range-based
for loops, and a few others that would save me a lot of time and maybe even
reduce the size of the binary (since we're very RAM limited in the embedded
world).

I'm writing about this on the Facebook page "C+ Project," which is open to
all programmers.



Your post here is new - it is not helpful to make it a follow-up on a 
6-year-old conversation.  And it is not helpful to write as though we are 
all familiar with what you are doing and the projects you work on.


To get to the crux of the matter, it seems that you want to write mainly C, 
but use some C++17 features.  The way to do that is with C++17.  You 
don't "turn on individual C++17 features" - you enable C++17 support 
(with the "-std=gnu++17" flag) and just use the features you 
want (limited by the lack of C++ libraries for the AVR).
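
For instance (an illustrative sketch only, built with something like 
"avr-g++ -std=gnu++17 -Os"), the kind of features you mention cost 
nothing on an AVR - no exceptions, RTTI, heap or libstdc++ needed:

#include <stdint.h>

namespace pins {                        // namespaces: zero overhead
    constexpr uint8_t led0 = 5, led1 = 6;
}

int main()
{
    const uint8_t outputs[] = { pins::led0, pins::led1 };

    auto set_high = [](uint8_t pin) {   // lambda: a plain function call
        (void)pin;                      // real code would poke a PORT register
    };

    for (uint8_t pin : outputs)         // range-based for: zero overhead
        set_high(pin);

    for (;;) { }                        // a typical AVR main() never returns
}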



If you choose to use an IDE and use it to control compiler flags, there 
is always a way to manually enter flags yourself.

