Re: Apply function attributes (e.g., [[gnu::access()]]) to pointees too

2024-07-11 Thread David Brown via Gcc

On 11/07/2024 11:58, Martin Uecker via Gcc wrote:


Am Donnerstag, dem 11.07.2024 um 11:35 +0200 schrieb Alejandro Colomar via Gcc:

Hi,

I was wondering how we could extend attributes such as gnu::access() to
apply it to pointees too.  Currently, there's no way to specify the
access mode of a pointee.

Let's take for example strsep(3):

With current syntax, this is what we can specify:

[[gnu::access(read_write, 1)]]
[[gnu::access(read_only, 2)]]
[[gnu::nonnull(1, 2)]]
[[gnu::null_terminated_string_arg(2)]]
char *
strsep(char **restrict sp, const char *delim);


The main problem from a user perspective is that
these are attributes on the function declaration
and not on the argument (type).



I was thinking that with floating-point numbers, one could specify the
number of dereferences with a number after the decimal point.  It's a
bit weird, since the floating-point number is interpreted as two
separate integers separated by a '.', but it could work.  In this case:

[[gnu::access(read_write, 1)]]
[[gnu::access(read_write, 1.1)]]
[[gnu::access(read_only, 2)]]
[[gnu::nonnull(1, 2)]]
[[gnu::null_terminated_string_arg(1.1)]]
[[gnu::null_terminated_string_arg(2)]]
char *
strsep(char **restrict sp, const char *delim);

Which would mark the pointer *sp as read_write and a string.  What do
you think about it?


If the attributes could be applied to the type, then
one could attach them directly at an intermediate
pointer level, which would be more intuitive and
less fragile.



That would be a huge improvement (IMHO).  Then you could write:

#define RW [[gnu::access(read_write)]]
#define RO [[gnu::access(read_only)]]
#define NONNULL [[gnu::nonnull]]
#define CSTRING [[gnu::null_terminated_string_arg]]

char * strsep(char * RW * RW NONNULL CSTRING restrict sp,
const char * RO NONNULL CSTRING delim);

It would be even better if the characteristics could be tied into a typedef.

typedef const char * [[gnu::access(read_only)]] [[gnu::nonnull]] 
[[gnu::null_terminated_string_arg]] const_cstring;


David



Re: How to avoid some built-in expansions in gcc?

2024-06-05 Thread David Brown via Gcc

On 04/06/2024 19:43, Michael Matz via Gcc wrote:

Hello,

On Tue, 4 Jun 2024, Richard Biener wrote:


A pragmatic solution might be a new target hook, indicating a specified
builtin is not to be folded into an open-coded form.


Well, that's what the mechanism behind -fno-builtin-foobar is supposed to
be IMHO.  Hopefully the newly added additional mechanism using optabs and
ifns (instead of builtins) heeds it.


-fno-builtin makes GCC not know semantics of the functions called


Hmm, true.  Not expanding inline is orthogonal strictly speaking ...


which is worse for optimization than just not inline expanding it.


... but on AVR expanding inline is probably worse than that lost
knowledge.  So yeah, ideally we would devise a (simple/reasonable) way to
at least disable inline expansion, without making it non-builtin.



The ideal here would be to have some way to tell gcc that a given 
function has the semantics of a different function.  For example, a 
programmer might have several implementations of "memcpy" that are 
optimised for different purposes based on the size or alignment of the 
arguments.  Maybe some of these are written with inline assembly or work 
in a completely different way (I've used DMA on a microcontroller for 
the purpose).  If you could tell the compiler that the semantic 
behaviour and results were the same as standard memcpy(), that could 
lead to optimisations.


Then you could declare your "isinf" function with 
__attribute__((semantics_of(__builtin_isinf))).


And the feature could be used in any situation where you can write a 
function in a simple, easy-to-analyse version and a more efficient but 
opaque version.






Re: Is fcommon related with performance optimization logic?

2024-05-30 Thread David Brown via Gcc

On 30/05/2024 04:26, Andrew Pinski via Gcc wrote:

On Wed, May 29, 2024 at 7:13 PM 赵海峰 via Gcc  wrote:


Dear Sir/Madam,


We found that UnixBench compiled with gcc 10.3 performs worse on Intel
SPR than when compiled with gcc 8.5, for the dhry2reg benchmark.


I found it is related to the -fcommon option, which is disabled by
default in 10.3. -fcommon places global variable addresses in a
particular order in the bss section (visible with nm -n), regardless of
where they are defined in the source code.


We are wondering whether -fcommon involves some special performance
optimization.


(I also posted this to gcc-help. I hope to get some suggestions on this
mailing list. Sorry for the bother.)


This was already filed as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114532 . But someone
needs to go in and do more analysis of what is going wrong. The
biggest difference for x86_64 is how the variables are laid out and by
who (the compiler or the linker).  There is some notion that
-fno-common increases the number of L1-dcache-load-misses and that
points to the layout of the variable differences causing the
difference. But nobody has gone and seen which variables are laid out
differently and why. I suspect that small changes in the
code/variables cause layout differences, which cause the cache misses,
which cause the performance difference; it is almost all by accident.
I suspect adding -fdata-sections will cause another performance
difference here too. And there is not much GCC can do about this since
data layout is "hard" to do to get the best performance always.



(I am most familiar with embedded systems with static linking, rather 
than dealing with GOT and other aspects of linking on big systems.)


I think -fno-common should allow -fsection-anchors to do a much better 
job.  If symbols are put in the common section, the compiler does not 
know their relative position until link time.  But if they are in bss or 
data sections (with or without -fdata-sections), it can at least use 
anchors to access data in the translation unit that defines the data 
objects.


David



Thanks,
Andrew Pinski




Best regards.


Clark Zhao






Re: aliasing

2024-03-18 Thread David Brown via Gcc

On 18/03/2024 14:54, Andreas Schwab via Gcc wrote:

On Mär 18 2024, David Brown wrote:


I think it would be possible to have an implementation where "signed
char" was 8-bit two's complement except that 0x80 would be a trap
representation rather than -128.


signed char cannot have padding bits, thus it cannot have a trap
representation.



The premise is correct (no padding bits are allowed in signed char), but 
does it follow that it cannot have a trap representation?  I don't think 
the standards are clear either way here - I think the committee missed a 
chance to tidy up the description a bit more when C23 removed formats 
other than two's complement for signed integer types.


I also feel slightly uneasy using signed char for accessing object 
representations since the object representation is defined in terms of 
an unsigned char array, and conversion from unsigned char to signed char 
is implementation-defined.  (This too could have been tightened in C23, 
as there is unlikely to be any implementation that does not do the 
conversion in the obvious manner.)


But I am perhaps worrying too much here.






Re: issue: unexpected results in optimizations

2023-12-12 Thread David Brown via Gcc

Hi,

First, please ignore everything Dave Blanchard writes.  I don't know 
why, but he likes to post angry, rude and unhelpful messages to this list.


Secondly, this is the wrong list.  gcc-help would be the correct list, 
as you are asking for help with gcc.  This list is for discussions on 
the development of gcc.


Thirdly, if you want help, you need to provide something that other 
people can comprehend.  There is very little that anyone can do to 
understand lumps of randomly generated code, especially when it cannot 
compile without headers and additional files or libraries that we do not 
have.


So your task is to write a /minimal/ piece of stand-alone code that 
demonstrates the effect that concerns you.  It is fine to use standard 
library headers, but no external headers like this "csmith" 
stuff.  Aim to make it small enough to be included directly in the text 
of the post, not as an attachment.  Include the compiler version(s) you 
tried, the command line flags, what you expect the results to give, and 
what wrong results you got.


Always do development compiles with comprehensive sets of warnings.  I 
managed to take a section of your code (part that was different between 
the "initial.c" and "transformed.c") and compile it - there were lots of 
warnings.  There are a lot of overflows in initialisations, pointless 
calculations on the left of commas, and other indications of badly 
written code.  There were also static warnings about undefined behaviour 
in some calculations - and that, most likely, is key.


When code has undefined behaviour, you cannot expect the compiler to 
give any particular results.  It's all down to luck.  And luck varies 
with the details, such as optimisation levels.  It's "garbage in, 
garbage out", and that is the explanation for differing results.


So compile with "-Wall -Wextra -std=c99 -Wpedantic -O2" and check all 
the warnings.  (Static warnings work better when optimising the code.) 
If you have fixed the immediate problems in the code, add the 
"-fsanitize=undefined" flag before running it.  That will do run-time 
undefined behaviour checks.


If you have a code block that is small enough to comprehend, and that 
you are now confident has no undefined behaviour, and you get different 
results with different optimisations, post it to the gcc-help list. 
Then people can try it and give opinions - maybe there is a gcc bug.


I hope that all helps.

David





On 11/12/2023 18:14, Jingwen Wu via Gcc wrote:

Hello, I'm sorry to bother you. I have some gcc compiler optimization
questions to ask you.
First of all, I used the csmith tool to generate C files randomly. The
final running result is a checksum over the global variables in a C
file. For the two C files in the attachment, I performed an equivalent
transformation of a loop from initial.c to transformed.c. The two files
produced different results (i.e. different checksum values) at the -Os
optimization level, while the results of both were the same at other
optimization levels such as -O0, -O1, -O2, -O3, and -Ofast.
Please help me understand why this is, thank you.

command line: gcc file.c -Os -lm -I $CSMITH_HOME/include && ./a.out
version: gcc 12.2.0
os: ubuntu 22.04





Re: Suboptimal warning formatting with `bool` type in C

2023-11-02 Thread David Brown via Gcc

On 02/11/2023 00:28, peter0x44 via Gcc wrote:

On 2023-11-01 23:13, Joseph Myers wrote:


On Wed, 1 Nov 2023, peter0x44 via Gcc wrote:


Why is #define used instead of typedef? I can't imagine how this could
possibly break any existing code.


That's how stdbool.h is specified up to C17.  In C23, bool is a keyword
instead.


I see, I didn't know it was specified that way. It seems quite strange 
that typedef wouldn't be used for this purpose.


I suppose perhaps it matters if you #undef bool and then use it to 
define your own type? Still, it seems very strange to do this.




Yes, that is part of the reason.  The C standards mandate a number of 
things to be macros when it would seem that typedef's, functions, 
enumeration constants or other things would be "nicer" in some sense. 
Macros have two advantages, however - you can "#undef" them, and you can 
use "#ifdef" to test for them.  This makes them useful in several cases 
in the C standards, especially for changes that could break backwards 
compatibility.  Someone writing /new/ code would hopefully never make 
their own "bool" type, but there's plenty of old code around - if you 
ever need to include some pre-C99 headers with their own "bool" type and 
post-C99 headers using <stdbool.h>, within the same C file, then it's 
entirely possible that you'll be glad "bool" is a macro.


Maybe it's something to offer as a GNU extension? Though, I'm leaning 
towards too trivial to be worth it, just for a (very minor) improvement 
to a diagnostic that can probably be handled in other ways.




Speaking as someone with absolutely zero authority (I'm a GCC user, not 
a GCC developer), I strongly doubt that "bool" will be made a typedef as 
a GCC extension.


But if there are problems with the interaction between pre-processor 
macros and the formatting of diagnostic messages, then that is 
definitely something that you should file as a bug report and which can 
hopefully be fixed.


David




Re: C89 question: Do we need to accept -Wint-conversion warnings

2023-10-11 Thread David Brown via Gcc




On 11/10/2023 12:17, Florian Weimer wrote:

* David Brown:


On 11/10/2023 10:10, Florian Weimer wrote:

* David Brown:


So IMHO (and as I am not a code contributor to GCC, my opinion really
is humble) it is better to be stricter than permissive, even in old
standards.  It is particularly important for "-std=c89", while
"-std=gnu89" is naturally more permissive.  (I have seen more than
enough terrible code in embedded programs - I don't want to make it
easier for them to write even worse code!)

We can probably make (say) -std=gnu89 -fno-permissive work, in a way
that is a bit less picky than -std=gnu89 -pedantic-errors today.



The gcc manual has "-permissive" under "C++ Dialect Options".  Are you
planning to have it for C as well?


Yes, I've got local patches on top of Jason's permerror enhancement:

   [PATCH v2 RFA] diagnostic: add permerror variants with opt
   




That sounds like a good idea (perhaps with some examples in the
documentation?).  Ideally (and I realise I like stricter checking than
many people) some long-obsolescent features like non-prototype
function declarations could be marked as errors unless "-permissive"
were used, even in C89 standards.


For some of such declarations, this falls out of the implicit-int
removal.


Yes.



C23 changes the meaning of extern foo(); to match the C++ interpretation 
of extern foo(void);.  I don't think we should warn about that.  If we
warn, it would be at the call site.


I'm not sure I fully agree.  "extern foo();" became invalid when 
implicit int was removed in C99.  But "extern T foo();", where "T" is 
void or any type, has changed meaning between C17 (and before) and C23.


With C23, it means the same as "extern T foo(void);", like in C++ (and 
like all C standards if it is part of the definition of the function). 
However, prior to C23, a declaration of "T foo();" that is not part of 
the definition of the function declares the function and "specifies that 
no information about the number or types of the parameters is supplied". 
 This use was obsolescent from C90.


To my mind, this is very different.  I think it is fair to suppose that 
for many cases of pre-C23 declarations with empty parentheses, the 
programmer probably meant "(void)".  But the language standards have 
changed the meaning of the declaration.


IMHO I think calling "foo" with parameters should definitely be a 
warning, enabled by default, for at least -std=c99 onwards - it is 
almost certainly a mistake.  (Those few people that use it as a feature 
can ignore or disable the warning.)  I would also put warnings on the 
declaration itself at -Wall, or at least -Wextra (i.e., 
"-Wstrict-prototypes").  I think that things that change between 
standards, even subtly, should be highlighted.  Remember, this concerns 
a syntax that was marked obsolescent some 35 years ago, because the 
alternative (prototypes) was considered "superior to the old style on 
every count".


It could be reasonable to consider "extern T foo();" as valid in 
"-std=gnu99" and other "gnu" standards - GCC has an established history 
of "back-porting" useful features of newer standards to older settings. 
But at least for "-std=c99" and other "standard" standards, I think it 
is best to warn about the likely code error.





(As a side note, I wonder if "-fwrapv" and "-fno-strict-aliasing"
should be listed under "C Dialect Options", as they give specific
semantics to normally undefined behaviour.)


They are code generation options, too.


I see them as semantic extensions to the language, and code generation 
differences are a direct result of that (even if they historically arose 
as code generation options and optimisation flags respectively). 
Perhaps they could be mentioned or linked to in the C dialect options 
page?  Maybe it would be clearer to have new specific flags for the 
dialect options, which are implemented by activating these flags? 
Perhaps that would be confusing.





And of course there's still -Werror, that's not going to go away.  So if
you are using -Werror=implicit-function-declaration today (as you
probably should 8-), nothing changes for you in GCC 14.


I have long lists of explicit warnings and flags in my makefiles, so I
am not concerned for my own projects.  But I always worry about the
less vigilant users - the ones who don't know the details of the
language or the features of the compiler, and don't bother finding
out.  I don't want default settings to be less strict for them, as it
means higher risks of bugs escaping out to released code.


We have a tension regarding support for legacy software, and ongoing
development.  


Agreed, and I fully understand that there is no easy answer here.  On 
the one hand, you don't want to break existing code bases or build 
setups, and on the other hand you want to help developers write good 
code (and avoid bad code) going forwards.



I 

Re: C89 question: Do we need to accept -Wint-conversion warnings

2023-10-11 Thread David Brown via Gcc




On 11/10/2023 10:10, Florian Weimer wrote:

* David Brown:


So IMHO (and as I am not a code contributor to GCC, my opinion really
is humble) it is better to be stricter than permissive, even in old
standards.  It is particularly important for "-std=c89", while
"-std=gnu89" is naturally more permissive.  (I have seen more than
enough terrible code in embedded programs - I don't want to make it
easier for them to write even worse code!)


We can probably make (say) -std=gnu89 -fno-permissive work, in a way
that is a bit less picky than -std=gnu89 -pedantic-errors today.



The gcc manual has "-permissive" under "C++ Dialect Options".  Are you 
planning to have it for C as well?  That sounds like a good idea 
(perhaps with some examples in the documentation?).  Ideally (and I 
realise I like stricter checking than many people) some long-obsolescent 
features like non-prototype function declarations could be marked as 
errors unless "-permissive" were used, even in C89 standards.


(As a side note, I wonder if "-fwrapv" and "-fno-strict-aliasing" should 
be listed under "C Dialect Options", as they give specific semantics to 
normally undefined behaviour.)




And of course there's still -Werror, that's not going to go away.  So if
you are using -Werror=implicit-function-declaration today (as you
probably should 8-), nothing changes for you in GCC 14.


I have long lists of explicit warnings and flags in my makefiles, so I 
am not concerned for my own projects.  But I always worry about the less 
vigilant users - the ones who don't know the details of the language or 
the features of the compiler, and don't bother finding out.  I don't 
want default settings to be less strict for them, as it means higher 
risks of bugs escaping out to released code.





I suspect (again with numbers taken from thin air) that the proportion
of C programmers or projects that actively choose C11 or C17 modes, as
distinct from using the compiler defaults, will be less than 1%.  C99
(or gnu99) is the most commonly chosen standard for small-systems
embedded programming, combining C90 libraries, stacks, and RTOS's with
user code in C99.  So again, my preference is towards stricter
control, not more permissive tools.


I don't think the estimate is accurate.  Several upstream build systems
I've seen enable -std=gnu11 and similar options once they are supported.
Usually, it's an attempt to upgrade to newer language standards that
hasn't aged well, not a downgrade.  It's probably quite bit more than
1%.



Fair enough.  My experience is mostly within a particular field that is 
probably more conservative than a lot of other areas of programming.


David






Re: C89 question: Do we need to accept -Wint-conversion warnings

2023-10-11 Thread David Brown via Gcc

On 10/10/2023 18:30, Jason Merrill via Gcc wrote:

On Tue, Oct 10, 2023 at 7:30 AM Florian Weimer via Gcc 
wrote:


Are these code fragments valid C89 code?

   int i1 = 1;
   char *p1 = i1;

   char c;
   char *p2 = &c;
   int i2 = p2;

Or can we generate errors for them even with -std=gnu89?

(It will still be possible to override this with -fpermissive or
-Wno-int-conversion.)



Given that C89 code is unlikely to be actively maintained, I think we
should be permissive by default in that mode.  People compiling with an old
-std flag are presumably doing it to keep old code compiling, and it seems
appropriate to respect that.



That is - unfortunately, IMHO - not true.

In particular, in the small-systems embedded development world (and that 
is a /big/ use-case for C programming), there is still a lot done in 
C89/C90.  It is the dominant variety of C for things like RTOS's (such 
as FreeRTOS and ThreadX), network stacks (like LWIP), microcontroller 
manufacturers' SDK's and libraries, and so on.  There are also still 
some microcontrollers for which the main toolchains (not GCC, obviously) 
do not have full C99 support, and there is a significant proportion of 
embedded C programmers who write all their code in C90, even for new 
projects.  There is a "cult" within C coders who think "The C 
Programming Language" is the "Bible", and have never learned anything 
since then.


The biggest target device in this field is the 32-bit ARM Cortex-M 
family, and the most used compiler is gcc.


Taking numbers out of thin air, but not unrealistically I believe, there 
are millions of devices manufactured every day running code compiled by 
gcc -std=gnu89 or -std=c89 (or an equivalent).


Add to that the libraries on "big" systems that are written to C89/C90 
standards.  After all, that is the lowest common denominator of the 
C/C++ world - with a bit of care, the code will be compatible with all 
other C and C++ standards.  It is not just of old code, though a great 
deal of modern library code has roots back to pre-C99 days, but it is 
also cross-platform code.  It is only relatively recently that 
Microsoft's development tools have had reasonable support for C99 - many 
people writing code to work in both the *nix world and the Windows world 
stick to C89/C90 if they want a clear standard (rather than "the subset 
of C99 supported by the MSVC version they happen to have").


Now, pretty much all of that code could also be compiled with -std=c99 
(or -std=gnu99).  And in a great many cases, it /is/ compiled as C99. 
But for those that want to be careful about their coding, and many do, 
the natural choice here is "-std=c90 -pedantic-errors".



So IMHO (and as I am not a code contributor to GCC, my opinion really is 
humble) it is better to be stricter than permissive, even in old 
standards.  It is particularly important for "-std=c89", while 
"-std=gnu89" is naturally more permissive.  (I have seen more than 
enough terrible code in embedded programs - I don't want to make it 
easier for them to write even worse code!)




I'm also (though less strongly) inclined to be permissive in C99 mode, and
only introduce the new strictness by default for C11/C17 modes.



I suspect (again with numbers taken from thin air) that the proportion 
of C programmers or projects that actively choose C11 or C17 modes, as 
distinct from using the compiler defaults, will be less than 1%.  C99 
(or gnu99) is the most commonly chosen standard for small-systems 
embedded programming, combining C90 libraries, stacks, and RTOS's with 
user code in C99.  So again, my preference is towards stricter control, 
not more permissive tools.


I am aware, however, that I personally am a lot fussier than most 
programmers.  I run gcc with lots of additional warnings and 
-Wfatal-errors, and want ever-stricter tools.  I don't think many people 
would be happy with the choices /I/ would prefer for default compiler 
flags!


I am merely a happy GCC user, not a contributor, much less anyone 
involved in decision making.  But I hope it is helpful to you to hear 
other opinions here, especially about small-systems embedded 
programming, at least in my own experience.


David








Re: GCC support addition for Safety compliances

2023-07-12 Thread David Brown via Gcc

On 12/07/2023 14:43, Jonathan Wakely via Gcc wrote:

On Wed, 12 Jul 2023 at 10:25, Vishal B Patil via Gcc  wrote:


Hi Team,

Any updates ?


You're not going to get any useful answers.

You asked "Please share the costs and time as well." Costs for what? From whom?

GCC is an open-source project with a diverse community of hundreds of
contributors. Who are you asking to give you costs? What work are you
expecting them to do?

It is unlikely that you obtained GCC from https://gcc.gnu.org so you
should probably talk to whoever provided you with your GCC binaries.


Most people get their GCC binaries for free, and no such source is going 
to be able to help for safety compliance or any other kind of 
certification.  Certification always costs time, effort and money.  But 
there are suppliers who provide toolchain binaries with commercial 
support contracts, and who could help with certification.  I know Code 
Sourcery certainly used to be able to provide language compliance 
certification - I have no idea if they still can (it seems they are part 
of Siemens these days).  Maybe Red Hat (part of IBM) can do so too, and 
possibly others.  But perhaps that will give the OP a starting point.


David



For safety compliance you will probably need to talk to a third-party
who specializes in that. I don't think you will achieve anything by
asking the GCC project to do that for you.

That's not how open source projects work.





Regards,
Vishal B Patil

vishal.b.pa...@cummins.com

Dahanukar Colony, Kothrud
Pune
Maharashtra
411038
India

-Original Message-
From: Vishal B Patil
Sent: Wednesday, July 5, 2023 4:18 PM
To: Basile Starynkevitch 
Subject: RE: GCC support addition for Safety compliances

Hi Team,

Thanks for the response.

Actually required for UL60730, UL6200. Please share the costs and time as well.

Regards,
Vishal B Patil

vishal.b.pa...@cummins.com

Dahanukar Colony, Kothrud
Pune
Maharashtra
411038
India

-Original Message-
From: Basile Starynkevitch 
Sent: Wednesday, July 5, 2023 4:07 PM
To: Vishal B Patil 
Subject: GCC support addition for Safety compliances

EXTERNAL SENDER: This email originated outside of Cummins. Do not click links 
or open attachments unless you verify the sender and know the content is safe.


Hello


We need support from GNU GCC for some safety compliances. Can you please 
advise or check which GCC versions support the safety compliances.

For safety compliance GCC is probably not enough.


Consider (if allowed by your authorities) using static analysis tools like 
https://frama-c.com/ or https://www.absint.com/products.htm


Be sure to understand what technically safety compliance means to you.
DO-178C? ISO 26262?

Be also aware that safety compliance costs a lot of money and a lot of time. 
(you'll probably need a budget above 100k€ or 100kUS$, and about a 
person-year of developer effort)


--
Basile Starynkevitch  
(only mine opinions / les opinions sont miennes uniquement)
92340 Bourg-la-Reine, France
web page: starynkevitch.net/Basile/







Re: user sets ABI

2023-07-07 Thread David Brown via Gcc

On 07/07/2023 00:27, André Albergaria Coelho via Gcc wrote:

What if the user chooses their own ABI, say by specifying a config file like

My abi

" Parameters = pushed in stack"


say

gcc -abi "My abi" some.c -o some

what would be the problems of specifying an ABI? Would that improve
usability for the user? Would it be less complex / simpler for the user
(say the user is used to coding asm in a particular way)?




You can fiddle things a bit, using the -ffixed-reg, -fcall-used-reg and 
-fcall-saved-reg flags:




This is almost certainly a bad idea for most situations - you really 
have to have a special niche case to make it worth doing.  The register 
allocation algorithms in GCC are complex, and I would expect changing 
these settings would give you less efficient results.  And of course it 
will mess up all calls to any code compiled with different settings - 
such as library code.


A far better solution is for the user who is used to coding in assembly, 
to get used to coding in C, C++, or other languages supported by GCC. 
If you really need some assembly, as happens occasionally, then learn 
about GCC's extended syntax inline assembly.  That lets GCC worry about 
details such as register allocation and operands, so that your assembly 
is minimal, and allows the assembly to work well along with the compiler 
optimisation.


If you have legacy assembly functions that are written to a non-standard 
calling convention, write a thunk to translate as necessary.





Re: wishlist: support for shorter pointers

2023-07-06 Thread David Brown via Gcc

On 06/07/2023 09:00, Rafał Pietrak via Gcc wrote:

Hi,

W dniu 5.07.2023 o 19:39, David Brown pisze:
[--]
I'm not sure what this means? At compile time, you only have 
literals, so what's missing?


The compiler knows a lot more than just literal values at compile time 
- lots of things are "compile-time constants" without being literals 
that can be used in string literals.  That includes the value of 
static "const" variables, and the results of calculations or "pure" 
function 


const --> created by a literal.


Technically in C, the only "literals" are "string literals".  Something 
like 1234 is an integer constant, not a literal.  But I don't want to 
get too deep into such standardese - especially not for C++ !


Even in C, there are lots of things that are known at compile time 
without being literals (or explicit constants).  In many situations you 
can use "constant expressions", which includes basic arithmetic on 
constants, enumeration constants, etc.  The restrictions on what can be 
used in different circumstances is not always obvious (if you have 
"static const N = 10;", then "static const M = N + 1;" is valid but "int 
xs[N];" is not).


C++ has a very much wider concept of constant expressions at compile 
time - many more ways to make constant expressions, and many more ways 
to use them.  But even there, the compiler will know things at compile 
time that are not syntactically constant in the language.  (If you have 
code in a function "if (x < 0) return; bool b = (x >= 0);" then the 
compiler can optimise in the knowledge that "b" is a compile-time 
constant of "true".)





calls using compile-time constant data.  You can do a great deal more of 


"compile time constant data" -> literal

this in C++ than in C ("static const int N = 10; int arr[N];" is valid 
in C++, but not in C).  Calculated section names might be useful for 
sections that later need to be sorted.


To be fair, you can construct string literals by the preprocessor that 
would cover many cases.


OK. We are talking about a convenience syntax that allows using any 
"name" in C sources as a "const-literal", as long as it is rooted only 
in literals. That's useful.


+2. :)



I can also add that generating linker symbols from compile-time 
constructed names could be useful, to use (abuse?) the linker to find 
issues across different source files.  Imagine you have a 


+1

microcontroller with multiple timers, and several sources that all 
need to use timers.  A module that uses timer 1 could define a 

[--]


 __attribute__((section("jit_buffer,\"ax\"\n@")))


I assume that adding an attribute should split a particular section 
into "an old one" and "the new one with the new attribute", right?


You can't have the same section name and multiple flags.  But you 
sometimes want to have unusual flag combinations, such as executable 
ram sections for "run from ram" functions.


section flags reflect the "semantics" of the section (ro vs. rw is a 
difference in semantics at that level). So, how do you "merge" RAM (a 
section called ".data") when one part has the "!x" flag and the other 
the "x" flag?


conflicting flags of sections with the same name have to be taken into 
consideration.




It doesn't make sense to merge linker input sections with conflicting 
flags - this is (and should be) an error at link time.  So I am not 
asking for a way to make a piece of ".data" section with different flags 
from the standard ".data" section - I am asking about nicer ways to make 
different sections with different selections of flags.  (Input sections 
with different flags can be merged into one output section, as the 
semantic information is lost there.)






One would need to have linker logic (and linker script definitions) 
altered, to follow that (other features so far wouldn't require any 
changes to linkers, I think).


to add the flags manually, then a newline, then a line comment 
character (@ for ARM, but this varies according to target.)


6. Convenient support for non-initialised non-zeroed data sections 
in a standardised way, without having to specify sections manually 
in the source and linker setup.


What gain do you get with this, and under which circumstances? I mean, 
why insist on keeping a memory fragment uninitialized, when clearing it 
is just a one-shot action at load time?




Very often you have buffers in your programs, which you want to have 
statically allocated in ram (so they have a fixed address, perhaps 
specially aligned, and so you have a full overview of your memory 
usage in your map files), but you don't care about the contents at 
startup. Clearing these to 0 is just a waste of processor time.


At startup? Really? Personally I wouldn't care if I waste those cycles.



Usually it is not an issue, but it can be for some systems.  I've seen 
systems where a hardware watchdog has timed out while the startup code 
is clearing large buffers unnecessarily.  There are also some low-power 
systems that 

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 18:13, Rafał Pietrak via Gcc wrote:

Hi,

W dniu 5.07.2023 o 16:45, David Brown pisze:

On 05/07/2023 15:29, Rafał Pietrak wrote:

[---]
OK. I don't see a problem here, but I admit that mixing semantics 
often leads to problems.




I think it also allows better generalisation and flexibility if they 
are separate.  You might want careful control over where something is 
allocated, but the access would be using normal instructions. 
Conversely, you might not be bothered about where the data is 
allocated, but want control of access (maybe you want interrupts 
disabled around accesses to make it atomic).


that would require the compiler to know the "semantics" of such a 
section. I don't think you've listed it below; it is worth adding. If I 
understand you correctly, that means the generated code varies depending 
on the target section selected. This is the linker "talking" to the 
compiler, if I'm not mistaken.




No, it's about the access - not the allocation (or section).  Access 
boils down to a "read" function and a "write" function (or possibly 
several, optimised for different sizes - C11 _Generic can make this 
neater, though C++ handles it better).
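
To make that concrete, here is a minimal sketch of such a dispatching 
read macro, where plain volatile loads stand in for whatever 
target-specific access a real port would need (all names are invented 
for illustration):

```c
#include <stdint.h>

/* Stand-ins for target-specific access routines (plain volatile loads). */
static inline uint8_t  read8 (const volatile void *p) { return *(const volatile uint8_t  *)p; }
static inline uint16_t read16(const volatile void *p) { return *(const volatile uint16_t *)p; }
static inline uint32_t read32(const volatile void *p) { return *(const volatile uint32_t *)p; }

/* C11 _Generic selects the accessor from the pointee type, so one macro
   covers all widths without the caller naming a size. */
#define special_read(ptr) _Generic(*(ptr), \
    uint8_t:  read8,                       \
    uint16_t: read16,                      \
    uint32_t: read32)(ptr)
```

In C++ the same dispatch falls out of ordinary overloading or a class 
template, which is what "handles it better" refers to.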




[--]
Let me try to list some things I think might be useful (there may be 
some overlap).  I am not giving any particular order here.


1. Adding a prefix to section names rather than replacing them.


OK. +1


2. Adding a suffix to section names.


+1

3. Constructing section names at compile time, rather than just using 
a string literal.  (String literals can be constructed using the 
pre-processor, but that has its limitations.)


I'm not sure what this means? At compile time, you only have literals, 
so what's missing?


The compiler knows a lot more than just literal values at compile time - 
lots of things are "compile-time constants" without being literals that 
can be used in string literals.  That includes the value of static 
"const" variables, and the results of calculations or "pure" function 
calls using compile-time constant data.  You can do a great deal more of 
this in C++ than in C ("static const int N = 10; int arr[N];" is valid 
in C++, but not in C).  Calculated section names might be useful for 
sections that later need to be sorted.


To be fair, you can construct string literals by the preprocessor that 
would cover many cases.
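
For reference, the preprocessor route mentioned here looks roughly like 
this; the two-step stringize is the standard idiom, and the ".mydata." 
prefix and MODULE macro are invented for illustration:

```c
/* Two-step stringize so MODULE is macro-expanded before # is applied. */
#define STR(s)  #s
#define XSTR(s) STR(s)

#define MODULE timer1   /* assumed per-file setting */
#define MODULE_SECTION __attribute__((section(".mydata." XSTR(MODULE))))

MODULE_SECTION int counter;   /* placed in section ".mydata.timer1" */
```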


I can also add that generating linker symbols from compile-time 
constructed names could be useful, to use (abuse?) the linker to find 
issues across different source files.  Imagine you have a 
microcontroller with multiple timers, and several sources that all need 
to use timers.  A module that uses timer 1 could define a 
"using_timer_1" symbol for link time (but with no allocation to real 
memory).  Another module might use timer 2 and define "using_timer_2". 
If a third module uses timer 1 again, then you'd get a link-time error 
with two conflicting definitions of "using_timer_1" and you'd know you 
have to change one of the modules.
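
A hedged sketch of how that can be arranged with today's tools (the 
symbol name follows the example above; a production version might place 
the byte in a non-loaded section so it costs no target memory):

```c
/* Module A claims timer 1 by providing a strong definition of this
   symbol.  If another module also defines "using_timer_1", linking
   fails with a multiple-definition error - the diagnostic we want. */
char using_timer_1 __attribute__((used)) = 0;
```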




4. Pragmas to apply section names (or prefixes or suffixes) to a block 
of definitions, changing the defaults.


+1

5. Control of section flags (such as read-only, executable, etc.).  At 
the moment, flags are added automatically depending on what you put 
into the section (code, data, read-only data).  So if you want to 
override these, such as to make a data section in ram that is 
executable (for your JIT compiler :-) ), you need something like :


 __attribute__((section("jit_buffer,\"ax\"\n@")))


I assume that adding an attribute should split a particular section 
into "an old one" and "the new one with the new attribute", right?


You can't have the same section name and multiple flags.  But you 
sometimes want to have unusual flag combinations, such as executable ram 
sections for "run from ram" functions.




One would need to have linker logic (and linker script definitions) 
altered, to follow that (other features so far wouldn't require any 
changes to linkers, I think).


to add the flags manually, then a newline, then a line comment 
character (@ for ARM, but this varies according to target.)


6. Convenient support for non-initialised non-zeroed data sections in 
a standardised way, without having to specify sections manually in the 
source and linker setup.


What gain do you get with this, and under which circumstances? I mean, 
why insist on keeping a memory fragment uninitialized, when clearing it 
is just a one-shot action at load time?




Very often you have buffers in your programs, which you want to have 
statically allocated in ram (so they have a fixed address, perhaps 
specially aligned, and so you have a full overview of your memory usage 
in your map files), but you don't care about the contents at startup. 
Clearing these to 0 is just a waste of processor time.
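
What this looks like with today's manual setup, as a sketch: the 
".noinit" section name is a common embedded convention and needs a 
matching linker-script entry that neither loads nor zeroes it (recent 
GCC also has a dedicated "noinit" variable attribute on some targets). 
On a hosted build the buffer simply behaves like ordinary data.

```c
/* On an embedded target, the startup code must skip this section when
   zeroing .bss, and the linker script must not load it. */
__attribute__((section(".noinit")))
unsigned char rx_buffer[512];
```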



7. Convenient support for sections (or variables) placed at specific 
addresses, in a standardised way.
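
For contrast, the unstandardised way this is done today is a cast of a 
literal address; the STM32-style register name and address in the 
comment are assumptions, for illustration only:

```c
#include <stdint.h>

/* Generic "32-bit register at address" accessor. */
#define REG32_AT(addr) (*(volatile uint32_t *)(uintptr_t)(addr))

/* On real hardware one would write e.g.
       #define GPIOA_ODR  REG32_AT(0x48000014u)
   Here a plain variable stands in for the hardware register. */
static volatile uint32_t fake_reg;
```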


Hmm... Frankly, I'm quite comfortable with current features of 

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 15:29, Rafał Pietrak wrote:

Hi,


W dniu 5.07.2023 o 14:57, David Brown pisze:
[]


My objection to named address spaces stem from two points:

1. They are compiler implementations, not user code (or library code), 
which means development is inevitably much slower and less flexible.


2. They mix two concepts that are actually quite separate - how 
objects are allocated, and how they are accessed.


OK. I don't see a problem here, but I admit that mixing semantics often 
leads to problems.




I think it also allows better generalisation and flexibility if they are 
separate.  You might want careful control over where something is 
allocated, but the access would be using normal instructions. 
Conversely, you might not be bothered about where the data is allocated, 
but want control of access (maybe you want interrupts disabled around 
accesses to make it atomic).


Access to different types of object in different sorts of memory can 
be done today.  In C, you can use inline functions or macros.  For 
target-specific stuff you can use inline assembly, and GCC might have 
builtins for some target-specific features.  In C++, you can also wrap 
things in classes if that makes more sense.


Personally, I'd avoid inline assembly whenever possible. It does a very 
good job of obfuscating programmers' intentions. From my experience, I'd 
rather put entire functions into assembler if the compiler gets in the way.




I'd rather keep the assembly to a minimum, and let the compiler do what 
it is good at - such as register allocation.  That means extended syntax 
inline assembly (but typically wrapped inside a small inline function).
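
A small, target-independent illustration of that style; as noted above, 
such wrappers often contain no instructions at all, only constraints. 
On a real target the template string would hold, say, an 
interrupt-disable instruction:

```c
/* The empty asm with a "+r" constraint is opaque to the optimiser: the
   compiler must materialise x in a register and cannot fold or reorder
   across it, yet it still does all the register allocation itself. */
static inline unsigned long opaque(unsigned long x)
{
    __asm__ volatile ("" : "+r"(x));
    return x;
}
```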



But that's not an issue here.


Agreed.



Allocation is currently controlled by "section" attributes.  This is 
where I believe GCC could do better, and give the user more 
control. (It may be possible to develop a compiler-independent syntax 
here that could become part of future C and C++ standards, but I think 
it will unavoidably be heavily implementation dependent.)


I agree.



All we really need is a way to combine these with types to improve 
user convenience and reduce the risk of mistakes.  And I believe that 
allowing allocation control attributes to be attached to types would 
give us that in GCC.  Then it would all be user code - typedefs, 
macros, functions, classes, whatever suits.


OK. Sounds good.

Naturally I have my "wishlist": the "small pointers" segment/attribute :)

But how (and to what extent) would you do that? I mean, the convenient 
syntax is desirable, but IMHO at this point there is also a question of 
semantics: what exactly is the compiler supposed to tell the linker? I 
think it would be good to list here the use scenarios that we know of, 
scenarios that would benefit from the compiler communicating more to the 
linker than just name@section. (Even if such a list wouldn't evolve into 
any implementation effort at this point, I think it would nicely 
conclude this thread.)




Let me try to list some things I think might be useful (there may be 
some overlap).  I am not giving any particular order here.


1. Adding a prefix to section names rather than replacing them.

2. Adding a suffix to section names.

3. Constructing section names at compile time, rather than just using a 
string literal.  (String literals can be constructed using the 
pre-processor, but that has its limitations.)


4. Pragmas to apply section names (or prefixes or suffixes) to a block 
of definitions, changing the defaults.


5. Control of section flags (such as read-only, executable, etc.).  At 
the moment, flags are added automatically depending on what you put into 
the section (code, data, read-only data).  So if you want to override 
these, such as to make a data section in ram that is executable (for 
your JIT compiler :-) ), you need something like :


__attribute__((section("jit_buffer,\"ax\"\n@")))

to add the flags manually, then a newline, then a line comment character 
(@ for ARM, but this varies according to target.)


6. Convenient support for non-initialised non-zeroed data sections in a 
standardised way, without having to specify sections manually in the 
source and linker setup.


7. Convenient support for sections (or variables) placed at specific 
addresses, in a standardised way.


8. Convenient support for sections that are not allocated space by the 
linker in the target memory, but where the contents are still included 
in the elf file and map files, where they can be read by other tools. 
(This could be used for external analysis tools.)


9. Support for getting data from the linker to the code, such as section 
sizes and start addresses, without having to manually add the symbols to 
the linker file and declare extern symbols in the C or C++ code.
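
Today's manual idiom that item 9 would automate looks like this on a 
typical GNU/ELF toolchain; "__bss_start" and "_end" are the default 
linker-script names and vary with the linker script in use:

```c
/* Defined by the linker, not by any C file; declared as arrays to
   stress that only their addresses are meaningful. */
extern char __bss_start[];
extern char _end[];

static char scratch[64];   /* zero-initialised, so it lives in .bss */

static unsigned long bss_and_heap_span(void)
{
    scratch[0] = 1;   /* keep the array referenced */
    return (unsigned long)(_end - __bss_start);
}
```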


10. Support for structs (or C++ classes) where different parts of the 
struct are in different sections.  This would mean the struct could only 
be statically allocated (no stack 

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc




On 05/07/2023 14:25, Rafał Pietrak wrote:

Hi,

W dniu 5.07.2023 o 13:55, David Brown pisze:

On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:

[--]

So your current objections to named spaces ... are in fact in favor 
of them. Isn't it so?




Not really, no - I would rather see better ways to handle allocation 
and section control than more named address spaces.


Doesn't it call for "something" through which a C source (via the 
compiler) can express the programmer's intention to the linker?




Yes, I think that is fair to say.  And that "something" should be more 
advanced and flexible than the limited "section" attribute we have 
today.  But I don't think it should be "named address spaces".


My objection to named address spaces stem from two points:

1. They are compiler implementations, not user code (or library code), 
which means development is inevitably much slower and less flexible.


2. They mix two concepts that are actually quite separate - how objects 
are allocated, and how they are accessed.


Access to different types of object in different sorts of memory can be 
done today.  In C, you can use inline functions or macros.  For 
target-specific stuff you can use inline assembly, and GCC might have 
builtins for some target-specific features.  In C++, you can also wrap 
things in classes if that makes more sense.


Allocation is currently controlled by "section" attributes.  This is 
where I believe GCC could do better, and give the user more control. 
(It may be possible to develop a compiler-independent syntax here that 
could become part of future C and C++ standards, but I think it will 
unavoidably be heavily implementation dependent.)


All we really need is a way to combine these with types to improve user 
convenience and reduce the risk of mistakes.  And I believe that 
allowing allocation control attributes to be attached to types would 
give us that in GCC.  Then it would all be user code - typedefs, macros, 
functions, classes, whatever suits.


David



Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 11:42, Rafał Pietrak via Gcc wrote:

Hi,

W dniu 5.07.2023 o 11:11, David Brown pisze:

On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:

[---]


I am not sure if you are clear about this, but the address space 
definition macros here are for use in the source code for the 
compiler, not in user code.  There is (AFAIK) no way for user code to 
create address spaces - you need to check out the source code for GCC, 
modify it to support your new address space, and build your own 
compiler.  This is perfectly possible (it's all free and open source, 
after all), but it is not a minor undertaking - especially if you 
don't like C++ !


Hmmm.

Wouldn't it be easier and more natural to make the "named spaces" a 
synonym for specific linker sections (like section names, or a section 
name prefix, so that instead of ".data.array.*" one gets 
".mynamespace.array.*")?


You can, of course, write :

#define __smalldata __attribute__((section(".smalldata")))

I'd rather see the "section" attribute extended to allow it to specify a 
prefix or suffix (to make subsections) than more named address spaces.


I'm a big fan of only putting things in the compiler if they have to be 
there - if a feature can be expressed in code (whether it be C, C++, or 
preprocessor macros), then I see that as the best choice.




[--]
I realise that learning at least some C++ is a significant step beyond 
learning C - but /using/ C++ classes or templates is no harder than C 
coding.  And it is far easier, faster and less disruptive to make a 
C++ header library implementing such features than adding new named 
address spaces into the compiler itself.


The one key feature that is missing is that named address spaces can 
affect the allocation details of data, which cannot be done with C++ 
classes.  You could make a "small_data" class template, but variables 
would still need to be marked __attribute__((section(".smalldata"))) 
when used.  I think this could be handled very neatly with one single 
additional feature in GCC - allow arbitrary GCC variable attributes to 
be specified for types, which would then be applied to any variables 
declared for that type.
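
GCC does already honour some attributes attached via typedef - 
alignment, notably - which is the precedent for this proposal; the 
missing piece is exactly the allocation-controlling attributes such as 
"section". A sketch of the part that works today (the type name is 
invented):

```c
/* Every object declared with this typedef gets 16-byte alignment,
   without repeating the attribute at each definition site. */
typedef unsigned int dma_word __attribute__((aligned(16)));

dma_word buf_a;
dma_word buf_b;
```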


OK. I see your point.

But let's have look at it. You say, that "names spaces affect allocation 
details, which cannot be done with C++". Pls consider:
1. for small embedded devices C++ is not a particularly "seller". We 
even turn to assembler occasionally.


I have been writing code for small embedded systems for about 30 years. 
I used to write a lot in assembly, but it is very rare now.  Almost all 
of the assembly I write these days is inline assembly in gcc format - 
and a lot of that actually contains no assembly at all, but is for 
careful control of dependencies or code re-arrangements.  The smallest 
device I have ever used was an AVR Tiny with no ram at all - just 2K 
flash, a 3-level return stack and its 32 8-bit registers.  I programmed 
that in C (with gcc).


C++ /is/ a big "seller" in this market.  It is definitely growing, just 
as the market for commercial toolchains with non-portable extensions is 
dropping and 8-bit CISC devices are being replaced by Cortex-M0 cores. 
There is certainly plenty of C-only coding going on, but C++ is growing.


2. affecting allocation details is usually the whole point of 
engineering skill when dealing with small embedded devices - the whole 
point is to have tools to do that.




When you are dealing with 8-bit CISC devices like the 8051 or the COP8, 
then allocation strategies are critical, and good tools are essential.


But for current microcontrollers, they are not nearly as important 
because you have a single flat address space - pointers to read-only 
data in flash and pointers to data in ram are fully compatible.  You do 
sometimes need to place particular bits of data in particular places, 
but that is usually for individual large data blocks such as putting 
certain buffers in non-cached memory, or a large array in external 
memory.  Section attributes suffice for that.


Allocation control is certainly important at times, but it's far from 
being as commonly needed as you suggest.


(Dynamic allocation is a different matter, but I don't believe we are 
talking about that here.)


So your current objections to named spaces ... are in fact in favor of 
them. Isn't it so?




Not really, no - I would rather see better ways to handle allocation and 
section control than more named address spaces.


David




Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc




On 05/07/2023 11:25, Martin Uecker wrote:

Am Mittwoch, dem 05.07.2023 um 11:11 +0200 schrieb David Brown:

On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:


...


In my personal opinion (which you are all free to disregard), named
address spaces were an interesting idea that failed.  I was
enthusiastic
about a number of the extensions in TR 18307 "C Extensions to support
embedded processors" when the paper was first published.  As I
learned
more, however, I saw it was a dead-end.  The features are too
under-specified to be useful or portable, gave very little of use to
embedded programmers, and fit badly with C.  It was an attempt to
standardise and generalise some of the mess of different extensions
that
proprietary toolchain developers had for a variety of 8-bit CISC
microcontrollers that could not use standard C very effectively.  But
it
was all too little, too late - and AFAIK none of these proprietary
toolchains support it.  GCC supports some of the features to some
extent
- a few named address spaces on a few devices, for "gnuc" only (not
standard C, and not C++), and has some fixed point support for some
targets (with inefficient generated code - it appears to be little
more
than an initial "proof of concept" implementation).

I do not think named address spaces have a future - in GCC or
anywhere
else.  The only real use of them at the moment is for the AVR for
accessing data in flash, and even then it is of limited success since
it
does not work in C++.


Can you explain a little bit why you think it is a dead-end?  It
seems an elegant solution to a range of problems to me.


Named address spaces are not standardised in C, and I do not expect they 
ever will be.  The TR18307 document is not anywhere close to being of a 
quality that could be integrated with the C standards, even as optional 
features, and much of it makes no sense in practice (I have never heard 
of the IO stuff being implemented or used).


The few compilers that implement any of it do so in different ways - the 
"__flash" address space in AVR GCC is slightly different from the same 
extension in IAR's AVR compiler.  For existing compilers, there is a 
strong inconsistency as to whether such things are "named address 
spaces", "extension keywords", "type qualifiers", "attributes", or other 
terms, all with subtly (or not so subtly) different effects on how they 
are used, what restrictions exist, conversions between types, and how 
errors can be diagnosed.  Sometimes these features are considered part 
of the data type, sometimes of pointer types, sometimes they are just 
about data placement.


Since every compiler targeting these small awkward microcontrollers has 
a different idea of what something like "const __flash int x = 123;" 
means, and has been implementing their own ideas for a decade or two 
before TR18307 ever proposed "named address spaces", the TR hasn't a 
hope of being a real standard.


Named address spaces are not implemented at all, anywhere (AFAIK), for 
C++.  (Some embedded toolchains have limited support for C++ on such 
microcontrollers, but these are again not really named address spaces.) 
Since C++ usage is heavily increasing in the small embedded system 
world, this is important.  (GCC has much of the honour for that - as ARM 
took a bigger share of the market and GCC for ARM improved, the 
toolchain market was no longer at the mercy of big commercial vendors 
who charged absurd amounts for their C++ toolchains.)  A feature which 
is only for C, and not supported by C++, is almost guaranteed to be 
dead-end.


And of course the type of processor for which named address spaces or 
other related extensions are essential, are a dying breed.  The AVR is 
probably the only one with a significant future.  Part of the appeal of 
ARM in the embedded world is it frees you from the pains of 
target-specific coding with some of your data in "near" memory, some in 
"extended" memory, some in "flash" address spaces or "IO" address 
spaces.  It all works with standard C or C++.  The same applies to 
challengers like RISC-V, MIPS, PPC, and any other core - you have a 
single flat address space for normal data.




I have no idea how much the GCC features are actually used,
but other compilers for  embedded systems such as SDCC also
support named address spaces.



And the targets supported by SDCC are also dead-end devices - there is 
not a single one of them that I would consider for a new project.  These 
microcontrollers are now used almost exclusively for legacy projects - 
updates to existing hardware or software, and rely on compatibility with 
existing C extensions (whether they are called "named address spaces", 
"extension keywords", or anything else).



Now, there are things that I would like to be able to write in my code 
that could appear to be candidates for some kind of named address space. 
 For example, I might want data that is placed in an external eeprom - 
it could be nice to be able to 

Re: wishlist: support for shorter pointers

2023-07-05 Thread David Brown via Gcc

On 05/07/2023 10:05, Rafał Pietrak via Gcc wrote:

Hi,

W dniu 5.07.2023 o 09:29, Martin Uecker pisze:

Am Mittwoch, dem 05.07.2023 um 07:26 +0200 schrieb Rafał Pietrak:

[---]

And if it's so ... there is no mention of how it shows up for a "simple 
user" of GCC (as opposed to the use of that "machinery" by creators of a 
particular GCC port). In other words: what should the sources look like 
for the compiler to do "the thing"?



Not sure I understand the question.  You would add a name space
to an object as a qualifier and then the object would be allocated
in a special (small) region of memory.  Pointers known to point
into that special region of memory (which is encoded into the
type) would then be smaller.  At least, this is my understanding
of how it could work.


Note that this only applies to pointers declared to be of the address 
space specific type.  If you have "__smalldata int x;" using a 
hypothetical new address space, then "&x" is of type "__smalldata int *" 
and you need to specify the address space specific pointer type to get 
the size advantages.  (Since the __smalldata address space is a subset 
of the generic space, conversions between pointer types are required to 
work correctly.)




Apparently you do understand my question.

Then again ... apparently you are guessing the answer. Incidentally, 
that would be my guess, too. And while such "syntax" is not really 
desirable (such attribution at every declaration of every "short 
pointer" variable would significantly obfuscate the sources, and 
something like a "#pragma" at the top of a file would do a better job), 
better something than nothing. Then again, should you happen to come 
across actual documentation of the syntax for this feature, I'd 
appreciate you sharing it :)




I am not sure if you are clear about this, but the address space 
definition macros here are for use in the source code for the compiler, 
not in user code.  There is (AFAIK) no way for user code to create 
address spaces - you need to check out the source code for GCC, modify 
it to support your new address space, and build your own compiler.  This 
is perfectly possible (it's all free and open source, after all), but it 
is not a minor undertaking - especially if you don't like C++ !


In my personal opinion (which you are all free to disregard), named 
address spaces were an interesting idea that failed.  I was enthusiastic 
about a number of the extensions in TR 18307 "C Extensions to support 
embedded processors" when the paper was first published.  As I learned 
more, however, I saw it was a dead-end.  The features are too 
under-specified to be useful or portable, gave very little of use to 
embedded programmers, and fit badly with C.  It was an attempt to 
standardise and generalise some of the mess of different extensions that 
proprietary toolchain developers had for a variety of 8-bit CISC 
microcontrollers that could not use standard C very effectively.  But it 
was all too little, too late - and AFAIK none of these proprietary 
toolchains support it.  GCC supports some of the features to some extent 
- a few named address spaces on a few devices, for "gnuc" only (not 
standard C, and not C++), and has some fixed point support for some 
targets (with inefficient generated code - it appears to be little more 
than an initial "proof of concept" implementation).


I do not think named address spaces have a future - in GCC or anywhere 
else.  The only real use of them at the moment is for the AVR for 
accessing data in flash, and even then it is of limited success since it 
does not work in C++.



I realise that learning at least some C++ is a significant step beyond 
learning C - but /using/ C++ classes or templates is no harder than C 
coding.  And it is far easier, faster and less disruptive to make a C++ 
header library implementing such features than adding new named address 
spaces into the compiler itself.


The one key feature that is missing is that named address spaces can 
affect the allocation details of data, which cannot be done with C++ 
classes.  You could make a "small_data" class template, but variables 
would still need to be marked __attribute__((section(".smalldata"))) 
when used.  I think this could be handled very neatly with one single 
additional feature in GCC - allow arbitrary GCC variable attributes to 
be specified for types, which would then be applied to any variables 
declared for that type.


David





Re: wishlist: support for shorter pointers

2023-07-04 Thread David Brown via Gcc

On 04/07/2023 16:46, Rafał Pietrak wrote:

Hi,

W dniu 4.07.2023 o 14:38, David Brown pisze:
[-]
A key difference is that using 32-bit pointers on an x86 is enough 
address space for a large majority of use-cases, while even on the 
smallest small ARM microcontroller, 16-bit is not enough.  (It's not 
even enough to access all memory on larger AVR microcontrollers - the 
only 8-bit device supported by mainline gcc.)  So while 16 bits would 
cover the address space of the RAM on a small ARM microcontroller, it 
would not cover access to code/flash space (including read-only data), 
IO registers, or other areas of memory-mapped memory and peripherals. 
Generic low-level pointers really have to be able to access everything.


Naturally, 16 bits is "most of the time" not enough to cover the entire 
workspace of even the smallest MCU (AVR being the only thing close to an 
exception here), but in my little experience, that is not really 
necessary.


(Most MSP430 devices, also supported by GCC, are also covered by a 
16-bit address space.)


As for "generic low-level pointers really have to...", I don't think 
so. I really don't. Programs often manipulate quite "localized" data, 
and the compiler is capable enough to distinguish and keep separate 
pointers of different "domains". What currently makes this impossible is 
the lack of tools (semantic constructs like pragmas or named sections) 
that would let it happen.




No, generic low-level pointers /do/ have to work with all reasonable 
address spaces on the device.  A generic pointer has to support pointing 
to modifiable ram, to constant data (flash on small microcontrollers), 
to IO registers, etc.  If you want something that can access a specific, 
restricted area, then it is a specialised pointer - not a generic one. 
C has no support for making your own pointer types, but C++ does.




So an equivalent of x32 mode would not work at all.  Really, what you 
want is a 16-bit "small pointer" that is added to 0x2000 (the base 
address for RAM in small ARM devices, in case anyone following this 
thread is unfamiliar with the details) to get a real data pointer.  
And you'd like these small pointers to have convenient syntax and 
efficient use.


more or less yes. But "with a twist". A "compiler construct" that would 
be (say) sufficient to get the RAM savings/optimization I'm aiming at 
could be "reduced" to the ability to create a "medium-size" array of 
"some objects" and have them reference each other, all WITHIN that 
"array". That array was referred to in my earlier emails as a segment or 
section. So whenever a programmer writes a construct like:


struct test_s __attribute__((small-and-funny)) {
 struct test_s __attribute__((small-and-funny)) *next, *prev, *head;
 struct test_s __attribute__((small-and-funny)) *user, *group;
} repository[1000];
struct test_s __attribute__((small-and-funny)) *master, *trash;

the compiler puts that data into that small array (dedicated section), 
so no "generic low-level pointers" referring to that data would need to 
exist within the program. And if one shows up, an error is thrown (or an 
autoconversion happens).




GCC attributes for sections already exist.

And again - indices will give you what you need here more efficiently 
than pointers.  All of your pointers can be converted to "repository[i]" 
format.  (And if your repository has no more than 256 entries, 8-bit 
indices will be sufficient.)  It can be efficient to store pointers to 
the entries in local variables if you are using them a lot, though GCC 
will do a fair amount of that automatically.




I think a C++ class (or rather, class template) with inline functions 
is the way to go here.  gcc's optimiser will give good code, and the 
C++ class will let you get nice syntax to hide the messy details.


OK. Thanks for the advice, but going into C++ is a major thing for me and 
(at least for the time being) I'll stay with ordinary "big" pointers in 
plain C instead.


There is no good way to do this in C.  Named address spaces would be a 
possibility, but require quite a bit of effort and change to the 
compiler to implement, and they don't give you anything that you would 
not get from a C++ class.


Yes, named address spaces would be great. And for code, too.



It is good to have a wishlist (and you can file a wishlist "bug" in the 
gcc bugzilla, so that it won't be forgotten).  But it is also good to be 
realistic.  Indices will give you what you need in terms of space 
efficiency, but will be messier in the syntax.  A small pointer class 
will give you efficient code and neat syntax, but require C++.  These 
two solutions will, however, work today.  (And they are both target 
independent.)


David


(That's not quite true - named address spaces can, I believe, also 
influence the section name used for allocation of data defined in 
these spaces, which cannot be done by a C++ class.)


OK.

-R




Re: wishlist: support for shorter pointers

2023-07-04 Thread David Brown via Gcc

On 04/07/2023 16:20, Rafał Pietrak wrote:



W dniu 3.07.2023 o 18:29, Rafał Pietrak pisze:

Hi David,


[--]
4. It is worth taking a step back, and thinking about how you would 
like to use these pointers.  It is likely that you would be better 
thinking in terms of an array, rather than pointers - after all, you 
don't want to be using dynamically allocated memory here if you can 
avoid it, and certainly not generic malloc().  If you can use an 
array, then your index type can be as small as you like - maybe 
uint8_t is enough.


I did that trip ... some time ago. Maybe I discarded the idea 
prematurely, but I dropped it because I was afraid of the cost of 

I remember now what my main problem with an index implementation was: 
the inability to express/write chained "references" with them. The 
table/index semantics of:

 t[a][b][c][d]

is a "multidimensional table", which is completely different from the 
"pointer semantics" of:

 t->a->b->c->d

It is quite legit to do a full circle around a circular list this way, 
while table semantics don't allow that.


Indexes are off the table.

-R


If you have a circular buffer, it is vastly more efficient to have an 
array with no pointers or indices, and use head and tail indices to 
track the current position.  But I'm not sure if that is what you are 
looking for.  And you can use indices in fields for chaining, but the 
syntax will be different.  (For some microcontrollers, the 
multiplications involved in array index calculations can be an issue, 
but not for ARM devices.)





Re: wishlist: support for shorter pointers

2023-07-04 Thread David Brown via Gcc

On 03/07/2023 18:42, Rafał Pietrak via Gcc wrote:

Hi Ian,

W dniu 3.07.2023 o 17:07, Ian Lance Taylor pisze:
On Wed, Jun 28, 2023 at 11:21 PM Rafał Pietrak via Gcc 
 wrote:

[]

I was thinking about that, and it doesn't look like it requires such deep
rewrites. An ABI spec that could accommodate the functionality could be as
little as one additional attribute on linker segments.


If I understand correctly, you are looking for something like the x32
mode that was available for a while on x86_64 processors:
https://en.wikipedia.org/wiki/X32_ABI .  That was a substantial amount
of work including changes to the compiler, assembler, linker, standard
library, and kernel.  And at least to me it's never seemed
particularly popular.


Yes.

And the Wiki reporting up to 40% performance improvement in some corner 
cases is impressive and encouraging. I believe that the reported 
average of 5-8% improvement would be significantly better within the 
tiny resource environment of an MCU. In the MCU world, such an 
improvement could mean the difference between a project fitting into a 
particular device or not.


-R



A key difference is that using 32-bit pointers on an x86 is enough 
address space for a large majority of use-cases, while even on the 
smallest small ARM microcontroller, 16-bit is not enough.  (It's not 
even enough to access all memory on larger AVR microcontrollers - the 
only 8-bit device supported by mainline gcc.)  So while 16 bits would 
cover the address space of the RAM on a small ARM microcontroller, it 
would not cover access to code/flash space (including read-only data), 
IO registers, or other areas of memory-mapped memory and peripherals. 
Generic low-level pointers really have to be able to access everything.


So an equivalent of x32 mode would not work at all.  Really, what you 
want is a 16-bit "small pointer" that is added to 0x2000 (the base 
address for RAM in small ARM devices, in case anyone following this 
thread is unfamiliar with the details) to get a real data pointer.  And 
you'd like these small pointers to have convenient syntax and efficient use.


I think a C++ class (or rather, class template) with inline functions is 
the way to go here.  gcc's optimiser will give good code, and the C++ 
class will let you get nice syntax to hide the messy details.


There is no good way to do this in C.  Named address spaces would be a 
possibility, but require quite a bit of effort and change to the 
compiler to implement, and they don't give you anything that you would 
not get from a C++ class.


(That's not quite true - named address spaces can, I believe, also 
influence the section name used for allocation of data defined in these 
spaces, which cannot be done by a C++ class.)


David



Re: wishlist: support for shorter pointers

2023-07-03 Thread David Brown via Gcc

On 28/06/2023 10:35, Rafał Pietrak via Gcc wrote:

Hi Jonathan,

W dniu 28.06.2023 o 09:31, Jonathan Wakely pisze:




If you use a C++ library type for your pointers the syntax above 
doesn't need to change, and the fancy pointer type can be implemented 
portable, with customisation for targets where you could use 16 bits 
for the pointers.


As you can expect from the problem I've stated - I don't know C++, so 
I'll need some more advice there.


But, before I dive into learning C++ (forgive the naive question): 
isn't it so that C++ comes with a heavy runtime? One that will bloat my 
tiny project? Or does the bloat come only when one uses particular 
elaborate class/inheritance scenarios, so that this particular case ( for 
(...; ...; x = x->next) {} ) will not draw any of that into the project?





Let me make a few points (in no particular order) :

1. For some RISC targets, such as PowerPC, it is common to have a 
section of memory called the "small data section".  One of the registers 
is dedicated as an anchor to this section, and data within it is 
addressed as Rx + 16-bit offset.  But this is primarily for data at 
fixed (statically allocated) addresses, since reads and writes using 
this address mode are smaller and faster than full 32-bit addresses. 
Normal pointers are still 32-bit.  It also requires a dedicated register 
- not a big cost when you have 31 GPRs, but much more costly when you 
have only 13.


2. C++ is only costly if you use costly features.  On small embedded 
systems, you want "-fno-exceptions -fno-rtti", and you will get as good 
(or bad!) results for C++ as for C.  Many standard library features 
will, however, result in a great deal of code - it is usually fairly 
obvious which classes and functions are appropriate.


3. In C, you could make a type such as :

#include <stdint.h>

typedef struct {
    uint16_t p;
} small_pointer_t;

and conversion functions :

static const uintptr_t ram_base = 0x2000;

static inline void * sp_to_voidp(small_pointer_t sp) {
    return (void *)(ram_base + sp.p);
}

static inline small_pointer_t voidp_to_sp(void * p) {
    small_pointer_t sp;
    sp.p = (uintptr_t) p - ram_base;
    return sp;
}

Then you would use these access functions to turn your "small pointers" 
into normal pointers.  The source code would become significantly harder 
to read and write, and less type-safe, but could be quite efficient.


In C++, you'd use the same kinds of functions.  But they would now be 
methods in a class template, and tied to overloaded operators and/or 
conversion functions.  The result would be type-safe and let you 
continue to use a normal pointer-like syntax, and with equally efficient 
generated code.  You could also equally conveniently have small pointers 
to ram and to peripheral groups.  This mailing list is not really the 
place to work through an implementation of such class templates - but it 
certainly could be done.



4. It is worth taking a step back, and thinking about how you would like 
to use these pointers.  It is likely that you would be better thinking 
in terms of an array, rather than pointers - after all, you don't want 
to be using dynamically allocated memory here if you can avoid it, and 
certainly not generic malloc().  If you can use an array, then your 
index type can be as small as you like - maybe uint8_t is enough.



David





Re: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?

2023-06-06 Thread David Brown via Gcc

On 06/06/2023 14:53, Paul Smith wrote:

On Tue, 2023-06-06 at 16:36 +0800, Julian Waters via Gcc wrote:

Sorry for my outburst, to the rest of this list. I can no longer stay
silent and watch these little shits bully people who are too kind to
fire back with the same kind of venom in their words.


Many of us have had Dave in our killfiles for a long time already.  I
recommend you (and everyone else) do the same.  You won't miss out on
any information of any use to anyone: he apparently just enjoys making
other people angry.

I'm quite serious: it's so not worth the mental energy to even read his
messages, much less to reply to him.  Arguing with "people who are
wrong on the internet" can be cathartic but this is not arguing, it's
just stabbing yourself in the eye with a pencil.  Don't play.



If a poster is causing enough aggravation that a large number of people 
have killfiled him, is there a process for banning him from the list? 
That is surely a better solution than having many people individually 
killfiling him?  I would assume those with the power to blacklist 
addresses from the mailing list do not do so lightly, and that there is 
a procedure for it.


David




Re: Will GCC eventually learn to use BSR or even TZCNT on AMD/Intel processors?

2023-06-06 Thread David Brown via Gcc

On 06/06/2023 02:09, Dave Blanchard wrote:


If this guy's threads are such a terrible waste of your time, how
about employing your email client's filters to ignore his posts (and
mine too) and fuck off?



You apparently appreciate Stefan's posts, but burst a blood vessel when 
reading anyone else's.  And Stefan has shown a total disregard for what 
anyone else writes.


Rather than everyone else having to killfile the pair of you, why don't 
you do everyone a favour and have your little rants with each other 
directly, and not on this list?


If either of you is remotely interested in improving gcc's 
optimisation, there are two things you must do:


1. Stop wasting the developers' time and driving them up the wall, so 
that they have more time to work on improving the tools.


2. Make the suggestions and requests for improvements through the proper 
channels - polite, factual and detailed bug reports.


This is not rocket science - it's basic human decency, and should not be 
difficult to understand.


David



Re: Will GCC eventually support SSE2 or SSE4.1?

2023-05-26 Thread David Brown via Gcc

On 26/05/2023 17:49, Stefan Kanthak wrote:


I don't like to argue with idiots: they beat me with experience!

Stefan



Stefan, you are clearly not happy about the /free/ compiler you are 
using, and its /free/ documentation (which, despite its flaws, is better 
than I have seen for most other compilers).


Instead of filing a bug report, as you have been asked to do, or reading 
the documentation, or thinking, or posting to an appropriate mailing 
list, you have chosen to rant, yell, shout at and insult the very people 
who could make the changes and improvements you want.


So who, exactly, do you think is acting like an idiot?  I'd say it is 
the rude and arrogant fool that is sawing off the branch he is sitting on.


Remember, these are people with /no/ obligation to help you.  Some do 
gcc development as voluntary contributions, others are paid to work on 
it - but they are not paid by /you/.  And none are paid to sit and 
listen to your tantrums.



So if you want to shout and rant and blow off steam, go make a tweet or 
something.  If you actually hope to see gcc change its optimisation, 
flag details or documentation to your liking, then your current 
behaviour is the worst possible tactic.  So let your final post to this 
thread be an apology, then register bug reports with what you see as 
bugs or scope for improvement in the project.  Please - for the sanity 
of the gcc developers and for the benefit of gcc users everywhere - stop 
your aggravating posts here, so that Jonathan and the others can get 
back to what they do best - improving gcc for everyone.


David




Re: [wish] Flexible array members in unions

2023-05-12 Thread David Brown via Gcc

On 12/05/2023 08:16, Richard Biener via Gcc wrote:

On Thu, May 11, 2023 at 11:14 PM Kees Cook via Gcc  wrote:


On Thu, May 11, 2023 at 08:53:52PM +, Joseph Myers wrote:

On Thu, 11 May 2023, Kees Cook via Gcc wrote:


On Thu, May 11, 2023 at 06:29:10PM +0200, Alejandro Colomar wrote:

On 5/11/23 18:07, Alejandro Colomar wrote:
[...]

Would you allow flexible array members in unions?  Is there any
strong reason to disallow them?


Yes please!! And alone in a struct, too.

AFAICT, there is no mechanical/architectural reason to disallow them
(especially since they _can_ be constructed with some fancy tricks,
and they behave as expected.) My understanding is that it's disallowed
due to an overly strict reading of the very terse language that created
flexible arrays in C99.


Standard C has no such thing as a zero-size object or type, which would
lead to problems with a struct or union that only contains a flexible
array member there.


Ah-ha, okay. That root cause makes sense now.


Hmm. but then the workaround

struct X {
   int n;
   union u {
      char at_least_size_one;
      int iarr[];
      short sarr[];
   };
};

doesn't work either.  We could make that a GNU extension without
adverse effects?

Richard.



I would like and use an extension like that (for C and C++) - the 
flexible arrays would act as though they were the same size as the 
size-specific part of the union, rounding up in this case to make the 
alignments correct.


I regularly want something like :

union ProtocolBuffer {
    struct {
        header ...
        data fields ...
    };
    uint8_t raw8[];
    uint32_t raw32[];
};

The "raw" arrays would be used to move data around, or access it from 
communication drivers.  As C (and C++) is defined, I have to split this 
up so that the "raw" arrays can use "sizeof(ProtocolTelegram) / 4" or 
similar expressions for their size.  If flexible arrays in unions were 
allowed here, it could make my code a little neater and use more 
anonymous unions and structs to reduce unhelpful verbosity.






Why are zero-sized objects missing in Standard C? Or, perhaps, the better
question is: what's needed to support the idea of a zero-sized object?

--
Kees Cook







Re: More C type errors by default for GCC 14

2023-05-12 Thread David Brown via Gcc

On 12/05/2023 04:08, Po Lu via Gcc wrote:

Eli Schwartz  writes:





Because that's exactly what is going on here. Features that were valid
C89 code are being used in a GNU99 or GNU11 code file, despite that
***not*** being valid GNU99 or GNU11 code.


How GCC currently behaves defines what is valid GNU C.



What GCC /documents/ defines what is valid GNU C.  (Much of that is, of 
course, imported by reference from the ISO C standards, along with 
target-specific details such as ABI's.)


Anything you write that relies on undocumented behaviour may work by 
luck, not design, and you have no basis for expecting future versions of 
gcc, or any other compiler, to give the same lucky results.


Each version of a compiler is, in fact, a different compiler - that is 
how you should be viewing your tools.  The move between different 
versions of the same compiler is usually much smaller than moving 
between different compiler vendors, but you still look at the release 
notes, change notices, porting information, etc., before changing.  You 
still make considered decisions, and appropriate testing.  You still 
check your build systems and modify flags if needed.  And you do that 
even if you are confident that your code is solid with fully defined 
behaviour and conforming to modern C standards - you might have a bug 
somewhere, and the new compiler version might have a bug.




I am not dictating anything to you or anyone else in this paragraph,
though? All I said was that if one writes a c89 program and tells the
compiler that, then they will not even notice this entire discussion to
begin with.

What, precisely, have I dictated?


That people who are writing GNU C code should be forced to rewrite their
code in ANSI C, in order to make use of GNU C extensions to the 1999
Standard.



You are joking, right?  Surely no one can /still/ be under the 
misapprehension that anyone is proposing GCC stop accepting the old 
code?  All that is changing is the default behaviour, which will mean 
some people might have to use an extra flag or two in their build setup.





However, it does appear that we are still stuck in confusion here,
because you think that GCC is no longer able to compile such code, when
in fact it is able to.


It won't, not by default.



That's pretty much irrelevant.  People don't use gcc without flags.  The 
only thing that will change is which flags you need to use.


If you are not in a position to change the source code, and not in a 
position to change the build flags, then you are not in a position to 
change the compiler version.  (That's fine, of course - in my line of 
work, I almost never change compiler version for existing projects.  I 
have old code where the makefile specifies gcc 2.95.)


David




Re: More C type errors by default for GCC 14

2023-05-11 Thread David Brown via Gcc

On 11/05/2023 04:09, Po Lu via Gcc wrote:

jwakely@gmail.com (Jonathan Wakely) writes:


So let's do it. Let's write a statement saying that the GCC developers
consider software security to be of increasing importance, and that we
consider it irresponsible to default to accepting invalid constructs in the
name of backwards compatibility. State that we will make some changes which
were a break from GCC's traditional stance, for the good of the ecosystem.


I'm sorry you think that way.


Given recent pushes to discourage or outright ban the use of memory-safe
languages in some domains, I think it would be good to make a strong
statement about taking the topic seriously. And not just make a statement,
but take action too.

If we don't do this, I believe it will harm GCC in the long run. The vocal
minority who want to preserve the C they're used to, like some kind of
historical reenactment society, would get their wish: it would become a
historical dead end and go nowhere.


Vocal minority? Do you have any evidence to back this claim?

What I see is that some reasonable organizations have already chosen
other C compilers which are capable of supporting their existing large
bodies of C code that have seen significant investment over many years,
while others have chosen to revise their C code with each major change
to the language.

The organizations which did not wish to change their code did not
vocally demand changes to GCC after GCC became unsuitable, but quietly
arranged to license other compilers.

Those that continue write traditional C code know what they are doing,
and the limitations of traditional C do not affect the quality of their
code.  For example, on the Unix systems at my organization, the SGS is
modified so that it will not link functions called through a declaration
with no parameter specification with a different set of parameters than
it was defined with.

Naturally, the modified linker is not used to run configure scripts.



Let's be absolutely clear here - gcc has been, and will continue to be, 
able to compile code according to old and new standards.  It can handle 
K C, right through to the cutting edge of newest C and C++ standards. 
It can handle semantic requirements such as two's complement wrapping 
and "anything goes" pointer type conversions - features that a lot of 
old code relies on but which are not documented or guaranteed behaviour 
for the vast majority of other compilers.  It can handle all these 
things - /if/ you pick the correct flags.


With the proposed changes, you can still compile old K&R code with gcc - 
if you give it the right flags.  No features are being removed - only 
the default flags are being changed.  If anyone is changing from gcc to 
other compilers because they think newer gcc does not support older 
code, then they are perhaps doing so from ignorance.


If some users are willing to change to different compilers, but 
unwilling to learn or use new flags in order to continue using their 
existing compiler after it changes its defaults, then perhaps gcc could 
pick different defaults depending on the name used for the executable? 
If it is invoked with the name "gcc-kr", then it could accept K&R code 
by default and have "-std=gnu90" (I believe that's the oldest standard 
option).  If it is invoked as "gcc", then it would reject missing 
function declarations, implicit int, etc., as hard errors.


Then these users could continue to use gcc, and their "new" compiler to 
handle their old code would be nothing more than a symbolic link.


David







Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 16:39, Eli Zaretskii via Gcc wrote:

Date: Wed, 10 May 2023 15:30:02 +0200
From: David Brown via Gcc 


If some developers want to ignore warnings, it is not the business of
GCC to improve them, even if you are right in assuming that they will
not work around errors like they work around warnings (and I'm not at
all sure you are right in that assumption).  But by _forcing_ these
errors on _everyone_, GCC will in effect punish those developers who
have good reasons for not changing the code.


What would those "good reasons" be, in your opinion?


For example, something that adversely affects GCC itself and its
ability to compile valid programs.


If gcc itself contains code that relies on outdated features, these 
should be fixed in the gcc source code.  It is one thing to suggest that 
a project that has been "maintenance only" for several decades cannot 
reasonably be updated, but that does not apply to current programs like gcc.





On the other hand, continuing to accept old, outdated code by lax
defaults is punishing /current/ developers and users.  Why should 99.99%
of current developers have to enable extra errors to catch mistakes (and
we all make occasional mistakes in our coding - so they /should/ be
enabling these error flags)?


Adding a flag to a Makefile is infinitely easier than fixing old
sources in a way that they produce the same machine code.



The suggestion has been - always - that support for old syntaxes be 
retained.  But that flag should be added to the makefiles of the 0.01% 
of projects that need it because they have old code - not the 99.99% of 
projects that are written (or updated) this century.



I do agree that backwards compatibility breaks should only be done for
good reasons.  But I think the reasons are good.


Not good enough, not for such a radical shift in the balance between
the two groups.



Do you have any reason to believe that the old code group is of relevant 
size?  I think it is quite obvious that I have been pulling percentages 
out of thin air, but can you justify claiming anything different?


I mean, if gcc simply added a default "-Werror=implicit" flag in the 
release candidate for gcc-14, how many people do you think would 
actually complain?  I'd guess that there would be far fewer complaints 
than there are posts in this thread discussing whether or not it's a 
good idea.




And no,
educating/forcing GCC users to use more modern dialect of C is not a
good reason.



Yes, it /is/ a good reason.


Not for a compiler.  A compiler is a tool, it is none of its business
to teach me what is and what isn't a good dialect in each particular
case.  Hinting on that, via warnings, is sufficient and perfectly
okay, but _forcing_ me is not.


Again - did you miss the point about people who really want to work with 
old code can do so, by picking the right flag(s) ?





Consider why Rust has become the modern fad in programming.  People
claim it is because it is inherently safer than C and C++.  It is not.
There are really two reasons for it appearing to be safer.  One is that
the /defaults/ for the tools, and the language idioms, are safer than
the /defaults/ for C and C++ tools.  That makes it harder to make
mistakes.  The other is that it has no legacy of decades of old code and
old habits, and no newbie programmers copying those old styles.


Exactly.  We cannot reasonably expect that a compiler which needs to
support 50 years of legacy code to be as safe as a compiler for a
language invented yesterday afternoon.  People who want a safe
programming environment should not choose C as their first choice.



We cannot expect a /language/ with a 50 year history to be as safe as a 
modern one.  But we can expect a /compiler/ released /today/ to be as 
safe as it can be made /today/.


I agree that C is not the best choice of language for many people. 
Actually, I'd say that most people who program in C would be better off 
programming in something else.  And most programs that are written in C 
could be better in a different language.  But when C /is/ the right 
choice - or even when it is the choice made despite being the wrong 
choice, I want it to be /good/ C, and I want tools to help out there as 
best they possibly can.  That includes good default flags, because not 
all gcc users are experts on gcc flags.


My ideal, actually, would be that gcc has "-Wall -Wextra" by default, 
trying to help developers from the get-go.  It should also have an flag 
"-sep" that disables all warnings and uses lax modes, for people using 
it to build software provided by others and they want nothing to do with 
the source code.  But of course that is not the ideal situation for 
everyone else!


(See <https://en.wikipedia.org/wiki/Somebody_else%27s_problem> for an 
explanation behind the "-sep" flag.)




So yes, anything that pushes C programmers into being better C
programmers is worth considering.

Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 16:14, Eli Zaretskii via Gcc wrote:

Date: Wed, 10 May 2023 14:41:27 +0200
Cc: jwakely@gmail.com, fwei...@redhat.com, gcc@gcc.gnu.org,
  ar...@aarsen.me
From: Gabriel Ravier 


Because GCC is capable of compiling it.

That is not a good argument.  GCC is capable of compiling any code in all
the reported accepts-invalid bugs on which it doesn't ICE.  That doesn't
mean those bugs shouldn't be fixed.

Fixing those bugs, if they are bugs, is not the job of the compiler.
It's the job of the programmer, who is the one that knows what the
code was supposed to do.  If there's a significant risk that the code
is a mistake or might behave in problematic ways, a warning to that
effect is more than enough.


Are you seriously saying that no accepts-invalid bug should ever be
fixed under any circumstances on the basis that some programmers might
rely on code exploiting that bug ??


Sorry, I'm afraid I don't understand the question.  What are
"accepts-invalid bugs"?



They are cases where the C standards (plus documented gcc extensions) 
have syntax or constraint requirements, but code which breaks these 
requirements is accepted by the compiler.  For example, if the compiler 
accepted "long long long int" as a type, that would be an 
"accepts-invalid" bug.  They are important for two reasons.  One is that 
they mean the compiler fails to help the developer catch the mistake in 
their code.  The other is that the code might have an inconsistent 
interpretation, and that might change in the future.  In the 
hypothetical example of a three-long int, a current compiler might treat 
it as a "long long", while a future standard might add support for it as 
a new type with minimum 128-bit size.



In any case, I was not not talking about bug-compatibility, I was
talking about being able to compile code which GCC was able to compile
in past versions.  Being able to compile that code is not a bug, it's
a feature.



No, being able to compile /incorrect/ code by default is a bug.  It is 
not helpful.


(The compiler cannot, of course, spot /all/ mistakes - the gcc 
developers are a smart group, but I think asking them to solve the 
Halting Problem is a bit much!)


I've seen this kind of argument many times - "The compiler used to 
accept my code and give the results I wanted, and now newer compiler 
versions make a mess of it".  The cause is almost invariably undefined 
behaviour, but it can occasionally be through changes to the standards 
such as removal of old behaviour or other differences in the 
interpretation of code (there were a number of incompatibilities between 
K&R C and C90, and between C90 and C99).


The compiler is under /no/ obligation to compile undefined behaviour in 
the same way as it might have done for a particular piece of code.  It 
is under /no/ obligation to continue to accept incorrect or invalid 
code, just because it used to accept it.  It /is/ - IMHO - under an 
obligation to do what it can to help spot problems in code and help 
developers get good quality correct code in the end.  If it fails to do 
that, people will, and should, move to using different tools.


New compiler versions are not required to do two's complement wrapping 
of signed integer overflow, even though old broken code might have been 
written under the assumption that it did and even though older, less 
powerful versions of the compiler might have compiled that code into 
something the developer wanted.  In the same way, new compiler versions 
are not required to support syntax that has been dead for decades - at 
least not by default.  (Unlike most other compilers, gcc developers go 
far out of their way to support such outdated and incorrect code - all 
they ask is that people use non-default flags to get such non-standard 
syntax and semantics.)


If the gcc developers really were required to continue to compile /all/ 
programs that compiled before, with the same results, then the whole gcc 
project can be stopped.  The only way to ensure perfect backwards 
compatibility would be to stop development, and no longer release any 
new versions of the compiler.  That is the logical consequence of "it 
used to compile (with defaults or a given set of flags), so it should 
continue to compile (with these same flags)" - assuming "compile" here 
means "giving the same resulting behaviour in the executable" rather 
than just "giving an executable that may or may not work".


Clearly, you don't mean gcc development should stop.  That means a line 
must be drawn, and some code that compiled with older gcc will not 
compile with newer gcc.  The only question is where the line should be.







Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 15:10, Basile Starynkevitch wrote:

Hello all,

After a suggestion by Eric Gallager

Idea for a compromise: What if, instead of flipping the switch on all
3 of these at once, we staggered them so that each one becomes a
default in a separate release? i.e., something like:

- GCC 14: -Werror=implicit-function-declaration gets added to the 
defaults

- GCC 15: -Werror=implicit-int gets added to the defaults
- GCC 16: -Werror=int-conversion gets added to the defaults

That would give people more time to catch up on a particular warning,
rather than overwhelming them with a whole bunch all at once. Just an
idea.


Eli Zaretskii  wrote on 10 May 2023, at 14:00


And that is just one example of perfectly valid reasons for not
wanting or not being able to make changes to pacify GCC.

Once again, my bother is not about "villains" who don't want to get
their act together, my bother is about cases such as the one above,
where the developers simply have no practical choice.

And please don't tell me they should use an older GCC, because as
systems go forward and are upgraded, older GCC will not work anymore.



My experience is that for safety-critical software (per DO-178C, 
embedded in aircraft, or for the French covid breathing machine on 
https://github.com/Recovid/Controller ) the regulations, funders, and 
authorities require a very specific version of GCC with very specific 
compilation flags.



Changing either the compiler (even from gcc-12.1 to gcc-12.2) or the 
compilation flags (even changing -O1 to -O2) requires written (on paper) 
approval by a large number of people, and formal certifications 
(e.g. ISO 9001, ISO 27001 procedures) and lots of checks and headaches.



I do know several persons making their living off these constraints.

I do know several corporations making a living from them (and keeping 
decades-old GCC compiler binaries on many disks).


So I really think that for safety critical software (whose failure may 
impact lives) people are using an older (and well specified) GCC.



Of course, to compile an ordinary business web service (e-shop for 
clothes) with e.g. libonion (from https://github.com/davidmoreno/onion 
...) or to compile a zsh.org from source code (for or on a developer's 
laptop) the constraints are a lot lighter.


Regards!



In my line of work (small-systems embedded programming), the source for 
a program does not just include the C source code.  It includes the 
build system, compiler version, the flags used, and the library used - 
everything that can affect the resulting binary.  I realise I am far 
more paranoid about that kind of thing than the majority of developers, 
but it is also noteworthy that there is a trend towards reproducible 
builds in more mainstream development.


The oldest gcc I have on my machine is 2.95.3 for the 68k, from 1998.  I 
have some older compilers, but they are not gcc.


I wouldn't say I made a living out of this, but I have had a customer 
who was very happy that I could make a fix in a program I wrote 20 years 
previously, and could compile it with exactly the same tools as I used then.


One of the reasons I use gcc (in a world where companies are willing to 
pay $5000 for tools from the likes of Green Hills) is that I can keep 
the old versions around, and copy and use them at will.


And for those that are more demanding than me, they can of course 
archive the sources for gcc (and other parts of the toolchain).


Those that /really/ need old versions of the toolchain, can use old 
versions of the toolchain.  And if gcc 14 changes in such a way that 
distro maintainers can't use it to build ancient packages, then they 
should make gcc-13 a part of their base packages as well as current gcc, 
and ship gcc version 13 for as long as they ship "ed", "rn" and other 
software from the middle ages.






Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 10/05/2023 14:22, Eli Zaretskii via Gcc wrote:

From: Jonathan Wakely 
Date: Wed, 10 May 2023 12:49:52 +0100
Cc: David Brown , gcc@gcc.gnu.org


If some developers want to ignore warnings, it is not the business of
GCC to improve them, even if you are right in assuming that they will
not work around errors like they work around warnings (and I'm not at
all sure you are right in that assumption).  But by _forcing_ these
errors on _everyone_, GCC will in effect punish those developers who
have good reasons for not changing the code.


What would those "good reasons" be, in your opinion?  (I realise I am 
asking you to be speculative and generalise.  This discussion is an 
exchange of opinions, thoughts, experiences and impressions.)


Frankly, the most common "good" reason for a developer not changing 
their code from pre-C99 is that they retired long ago.  And people 
should definitely question whether the code should be kept.


As I noted in another post, it is entirely reasonable to suspect that 
such old code has errors - unwarranted assumptions that were considered 
appropriate back in the days when such code techniques were considered 
appropriate.  It has always been the unfortunate case with C programming 
that getting optimal results for some compilers has sometimes involved 
"cheating" a bit, such as assuming wrapping signed arithmetic or 
converting pointer types and breaking the "strict aliasing" rules.


Changing the gcc defaults and requiring old code to use flags that allow 
old constructs but limiting optimisations is not /punishing/ the old 
code or its developers or maintainers.  It is /supporting/ it - allowing 
it to be used more safely with modern tools.



On the other hand, continuing to accept old, outdated code by lax 
defaults is punishing /current/ developers and users.  Why should 99.99% 
of current developers have to enable extra errors to catch mistakes (and 
we all make occasional mistakes in our coding - so they /should/ be 
enabling these error flags)?  Why should they have to deal with other 
people's code that was badly written 30 years ago?  Is it really worth 
it, just so that a half-dozen maintainers at Linux distributions can 
recompile the 40-year old source for "ed" without adding a flag to the 
makefile?



Ultimately, /someone/ is going to suffer - a compiler can't have good 
defaults for current developers and simultaneously good defaults for 
ancient relics.  The question to consider is not whether we "punish" 
someone, but /whom/ do we punish, and what is the best balance overall 
going forward.





There will be options you can use to continue compiling the code
without changing it. You haven't given a good reason why it's OK for
one group of developers to have to use options to get their desired
behaviour from GCC, but completely unacceptable for a different group
to have to use options to get their desired behaviour.

This is just a change in defaults.


A change in defaults that is not backward-compatible should only be
done for very good reasons, because it breaks something that was
working for years.  No such good reasons were provided.  


I'm sorry, but I believe I /did/ provide good reasons.  Granted, they 
were in more than one post.  And many others here have also given many 
good reasons.  At the very least, making a safer and more useful 
compiler that helps developers make better code is a good reason, as is 
making a C compiler that is closer to standards compatibility by default.


I do agree that backwards compatibility breaks should only be done for 
good reasons.  But I think the reasons are good.




And no,
educating/forcing GCC users to use more modern dialect of C is not a
good reason.



Yes, it /is/ a good reason.  But I suppose that one is a matter of opinion.

I encourage you to look at CERT/CC, or other lists of code errors 
leading to security issues or functional failures.  When someone writes 
poor code, lots of people suffer.  Any initiative that reduces the 
likelihood of such errors getting into the wild is not just good for gcc 
and its users, it's good for the whole society.


Consider why Rust has become the modern fad in programming.  People 
claim it is because it is inherently safer than C and C++.  It is not. 
There are really two reasons for it appearing to be safer.  One is that 
the /defaults/ for the tools, and the language idioms, are safer than 
the /defaults/ for C and C++ tools.  That makes it harder to make 
mistakes.  The other is that it has no legacy of decades of old code and 
old habits, and no newbie programmers copying those old styles.  Rust 
code is written in modern development styles, with a care for 
correctness rather than getting maximum efficiency from limited 
old-fashioned tools or macho programming.  The only reason there is any 
sense in re-writing old programs in Rust is because re-writing them in 
good, clear, modern C (or C++) is never going to happen - even though 
the results would be 

Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 09/05/2023 22:13, David Edelsohn via Gcc wrote:

On Tue, May 9, 2023 at 3:22 PM Eli Zaretskii via Gcc 
wrote:


Date: Tue, 9 May 2023 21:07:07 +0200
From: Jakub Jelinek 
Cc: Jonathan Wakely , ar...@aarsen.me,

gcc@gcc.gnu.org


On Tue, May 09, 2023 at 10:04:06PM +0300, Eli Zaretskii via Gcc wrote:

From: Jonathan Wakely 
Date: Tue, 9 May 2023 18:15:59 +0100
Cc: Arsen Arsenović , gcc@gcc.gnu.org

On Tue, 9 May 2023 at 17:56, Eli Zaretskii wrote:


No one has yet explained why a warning about this is not enough, and
why it must be made an error.  Florian's initial post doesn't explain
that, and none of the followups did, although questions about whether
a warning is not already sufficient were asked.

That's a simple question, and unless answered with valid arguments,
the proposal cannot make sense to me, at least.


People ignore warnings. That's why the problems have gone unfixed for
so many years, and will continue to go unfixed if invalid code keeps
compiling.


People who ignore warnings will use options that disable these new
errors, exactly as they disable warnings.  So we will end up not


Some subset of them will surely do that.  But I think most people will just
fix the code when they see hard errors, rather than trying to work around
them.


The same logic should work for warnings.  That's why we have warnings,
no?



This seems to be the core tension.  If developers cared about these issues,
they would enable appropriate warnings and -Werror.



-Werror is a /big/ stick.  An unused parameter message might just be an 
indication that the programmer isn't finished with that bit of code, and 
a warning is fine.  An implicit function declaration message shows a 
clear problem in the code - a typo in the function call, a missing 
#include, or a major flaw in the design and organisation of the code.


The C language takes backwards compatibility more seriously than any 
other programming language.  When the C standards mark previously 
acceptable features as deprecated, obsolescent, or constrain errors, it 
is done for very good reasons.  People should not be writing code with 
implicit int, or non-prototype function declarations.  Such mis-features 
of the language were outdated 30 years ago.



The code using these idioms is not safe and does create security
vulnerabilities.  And software security is increasingly important.

The concern is using the good will of the GNU Toolchain brand as the tip of
the spear or battering ram to motivate software packages to fix their
problems. It's using GCC as leverage in a manner that is difficult for
package maintainers to avoid.  Maybe that's a necessary approach, but we
should be clear about the reasoning.  Again, I'm not objecting, but let's
clarify why we are choosing this approach.



There are two problems I see with the current state of affairs, where 
deeply flawed code can be accepted (possibly with warnings) by gcc by 
default.


1. Modern developers who are not particularly well versed in the 
language write code with these same risky features.  It is depressing 
how many people think "The C Programming Language" (often a battered 
first edition) is all you need for learning C programming.  Turning more 
outdated syntax and more obvious mistakes into hard errors will help 
such developers - and help everyone who has to use the code they make.



2. Old code gets compiled with modern tools that do not fulfil the 
assumptions made by the developer decades ago.  Compiling such code with 
modern gcc risks all sorts of problems due to the simpler compilation 
models of older tools.  For example, the code might assume two's 
complement wrapping arithmetic, or that function calls always act as a 
memory barrier.



My suggestion would be to have a flag "-fold-code" that would do the 
following (at a minimum) :


* Disallow higher optimisation flags.
* Force -fwrapv, -fno-strict-aliasing, -fno-inline.
* Require an explicit "-std=" selection.
* Allow old-style syntax, such as implicit int, with just a warning

If the "-fold-code" is /not/ included, then old, deprecated or 
obsolescent syntax would be a hard error that cannot be turned off or 
downgraded to a warning by flags.  A substantial subset of -Wall 
warnings would be enabled automatically.  (I think the "unused" warnings 
should not be included, for example.)



Distributions and upstream code maintainers should be pushed towards 
either fixing and updating their code, or marking it as "-fold-code" if 
it is too outdated to modernise without a major re-write.  This might be 
painful during the transition, but waiting longer just makes the 
situation worse.


(I'm a long-term gcc user, but not a gcc developer.  I'm fully aware 
that I am asking others to do a lot of work here, but I think something 
of this sort is important going forward.)



David







Re: More C type errors by default for GCC 14

2023-05-10 Thread David Brown via Gcc

On 09/05/2023 21:04, Eli Zaretskii via Gcc wrote:

From: Jonathan Wakely 
Date: Tue, 9 May 2023 18:15:59 +0100
Cc: Arsen Arsenović , gcc@gcc.gnu.org

On Tue, 9 May 2023 at 17:56, Eli Zaretskii wrote:


No one has yet explained why a warning about this is not enough, and
why it must be made an error.  Florian's initial post doesn't explain
that, and none of the followups did, although questions about whether
a warning is not already sufficient were asked.

That's a simple question, and unless answered with valid arguments,
the proposal cannot make sense to me, at least.


People ignore warnings. That's why the problems have gone unfixed for
so many years, and will continue to go unfixed if invalid code keeps
compiling.


People who ignore warnings will use options that disable these new
errors, exactly as they disable warnings.  So we will end up not
reaching the goal, but instead harming those who are well aware of the
warnings.



My experience is that many of the people who ignore warnings are not 
particularly good developers, and not particularly good at 
self-improvement.  They know how to ignore warnings - the attitude is 
"if it really was a problem, the compiler would have given an error 
message, not a mere warning".  They don't know how to disable error 
messages, and won't bother to find out.  So they will, in fact, be a lot 
more likely to fix their code.




IOW, if we are targeting people for whom warnings are not enough, then
we have already lost the battle.  Discipline cannot be forced by
technological means, because people will always work around.



Agreed.  But if we can make it harder for them to release bad code, 
that's good overall.


Ideally, I'd like the compiler to email such people's managers with a 
request that they be sent on programming courses!






Re: [BUG] -Wuninitialized: initialize variable with itself

2022-11-14 Thread David Brown via Gcc




On 14/11/2022 16:10, NightStrike wrote:



On Mon, Nov 14, 2022, 04:42 David Brown via Gcc 



Warnings are not perfect - there is always the risk of false positives
and false negatives.  And different people will have different ideas
about what code is perfectly reasonable, and what code is risky and
should trigger a warning.  Thus gcc has warning flag groups (-Wall,
-Wextra) that try to match common consensus, and individual flags for
personal fine-tuning.

Sometimes it is useful to have a simple way to override a warning in
code, without going through "#pragma GCC diagnostic" lines (which are
powerful, but not pretty).

So if you have :

         int i;
         if (a == 1) i = 1;
         if (b == 1) i = 2;
         if (c == 1) i = 3;
         return i;

the compiler will warn that "i" may not be initialised.  But if you
/know/ that one of the three conditions will match (or you don't care
what "i" is if it does not match), then you know your code is fine and
don't want the warning.  Writing "int i = i;" is a way of telling the
compiler "I know what I am doing, even though this code looks dodgy,
because I know more than you do".

It's just like writing "while ((*p++ = *q++));", or using a cast to
void
to turn off an "unused parameter" warning.


Wouldn't it be easier, faster, and more obvious to the reader to just 
use "int i = 0"? I'm curious what a real world use case is where you 
can't do the more common thing if =0.




You can write "int i = 0;" if you prefer.  I would not, because IMHO 
doing so would be wrong, unclear to the reader, less efficient, and 
harder to debug.


In the code above, the value returned should never be 0.  So why should 
"i" be set to 0 at any point?  That's just an extra instruction the 
compiler must generate (in my line of work, my code often needs to be 
efficient).  More importantly, perhaps, it means that if you use 
diagnostic tools such as sanitizers you are hiding bugs from them 
instead of catching them - a sanitizer could catch the case of "return 
i;" when "i" is not set.


(I don't know if current sanitizers will do that or not, and haven't 
tested it, but they /could/.)


But I'm quite happy with :

int i = i;  // Self-initialise to silence warning

I don't think there is a "perfect" solution to cases like this, and 
opinions will always differ, but self-initialisation seems a good choice 
to me.  Regardless of the pros and cons in this particular example, the 
handling of self-initialisation warnings in gcc is, AFAIUI, to allow 
such code for those that want to use it.





Re: [BUG] -Wuninitialized: initialize variable with itself

2022-11-14 Thread David Brown via Gcc

On 13/11/2022 19:43, Alejandro Colomar via Gcc wrote:

Hi Andrew!

On 11/13/22 19:41, Andrew Pinski wrote:

On Sun, Nov 13, 2022 at 10:40 AM Andrew Pinski  wrote:


On Sun, Nov 13, 2022 at 10:36 AM Alejandro Colomar via Gcc
 wrote:


Hi,

While discussing some idea for a new feature, I tested the following example program:
program:


  int main(void)
  {
  int i = i;
  return i;
  }


This is NOT a bug but a documented way of having the warning not 
being there.
See 
https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Warning-Options.html#index-Winit-self 

https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Warning-Options.html#index-Wuninitialized 


"If you want to warn about code that uses the uninitialized value of
the variable in its own initializer, use the -Winit-self option."


I should note the main reason why I Know about this is because I fixed
this feature years ago (at least for C front-end)
and added the option to disable the feature.


I'm curious: what are the reasons why one would want to disable such a 
warning?

Why is it not in -Wall or -Wextra?

Thanks,

Alex



Warnings are not perfect - there is always the risk of false positives 
and false negatives.  And different people will have different ideas 
about what code is perfectly reasonable, and what code is risky and 
should trigger a warning.  Thus gcc has warning flag groups (-Wall, 
-Wextra) that try to match common consensus, and individual flags for 
personal fine-tuning.


Sometimes it is useful to have a simple way to override a warning in 
code, without going through "#pragma GCC diagnostic" lines (which are 
powerful, but not pretty).


So if you have :

int i;
if (a == 1) i = 1;
if (b == 1) i = 2;
if (c == 1) i = 3;
return i;

the compiler will warn that "i" may not be initialised.  But if you 
/know/ that one of the three conditions will match (or you don't care 
what "i" is if it does not match), then you know your code is fine and 
don't want the warning.  Writing "int i = i;" is a way of telling the 
compiler "I know what I am doing, even though this code looks dodgy, 
because I know more than you do".


It's just like writing "while ((*p++ = *q++));", or using a cast to void 
to turn off an "unused parameter" warning.


Re: -Wint-conversion, -Wincompatible-pointer-types, -Wpointer-sign: Are they hiding constraint C violations?

2022-11-11 Thread David Brown via Gcc

On 10/11/2022 20:16, Florian Weimer via Gcc wrote:

* Marek Polacek:


On Thu, Nov 10, 2022 at 07:25:21PM +0100, Florian Weimer via Gcc wrote:

GCC accepts various conversions between pointers and ints and different
types of pointers by default, issuing a warning.

I've been reading the (hopefully) relevant parts of the C99 standard,
and it seems to me that C implementations are actually required to
diagnose errors in these cases because they are constraint violations:
the types are not compatible.


It doesn't need to be a hard error, a warning is a diagnostic message,
which is enough to diagnose a violation of any syntax rule or
constraint.

IIRC, the only case where the compiler _must_ emit a hard error is for
#error.


Hmm, you could be right.

The standard says that constraint violations are not undefined behavior,
but of course it does not define what happens in the presence of a
constraint violation.  So the behavior is undefined by omission.  This
seems to be a contradiction.



Section 5.1.1.3p1 of the C standard covers diagnostics.  (I'm looking at 
the C11 version at the moment, but numbering is mostly consistent 
between C standards.)  If there is at least one constraint violation or 
syntax error in the translation unit, then the compiler must emit at 
least one diagnostic message.  That is all that is required.


The C standard does not (as far as I know) distinguish between "error 
messages" and "warnings", or require that diagnostics stop compilation 
or the production of output files.


So that means a conforming compiler can sum up all warnings and errors 
with a single "You did something wrong" message - and it can still 
produce an object file.  It is even allowed to generate the same message 
when /nothing/ is wrong.  The minimum behaviour to be conforming here is 
not particularly helpful!


Also note that gcc, with default flags, is not a conforming compiler - 
it does not conform to any language standards.  You need at least 
"-std=c99" (or whatever) and "-Wpedantic".  Even then, I think gcc falls 
foul of the rule in 5.1.1.3p1 that says at least one diagnostic must be 
issued for a syntax or constraint violation "even if the behaviour is 
explicitly specified as undefined or implementation-defined".  I am not 
entirely sure, but I think some of the extensions that are enabled even 
in non-gnu standards modes could contradict that.


I personally think the key question for warnings on things like pointer 
compatibility depends on whether the compiler will do what the 
programmer expects.  If you have a target where "int" and "long" are the 
same size, a programmer might use "pointer-to-int" to access a "long", 
and vice-versa.  (This can easily be done accidentally on something like 
32-bit ARM, where "int32_t" is "long" rather than "int".)  If the 
compiler may use this incompatibility for type-based alias analysis and 
optimise on the assumption that the "pointer-to-int" never affects a 
"long", then such mixups should by default be at least a warning, if not 
a hard error.  The primary goal for warnings and error messages must be 
to stop the programmer writing code that is wrong and does not do what 
they expect (as best the compiler can guess what the programmer expects).


The secondary goal is to help the programmer write good quality code, 
and avoid potentially risky constructs - things that might work now, but 
could fail with other compiler versions, flags, targets, etc.  It is not 
unreasonable to have warnings in this category need "-Wall" or explicit 
flags.  (I'd like to see more warnings in gcc by default, and more of 
them as errors, but compatibility with existing build scripts is important.)




I assumed that there was a rule similar to the the rule for #error for
any kind of diagnostic, which would mean that GCC errors are diagnostic
messages in the sense of the standard, but GCC warnings are not.


I believe that both "error" and "warning" messages are "diagnostics" in 
the terms of the standard.


As I said above, the minimum requirements of the standard provide a very 
low bar here.  A useful compiler must do far better (and gcc /does/ do 
far better).




I wonder how C++ handles this.

Thanks,
Florian






Re: Local type inference with auto is in C2X

2022-11-04 Thread David Brown via Gcc

On 03/11/2022 16:19, Michael Matz via Gcc wrote:

Hello,

On Thu, 3 Nov 2022, Florian Weimer via Gcc wrote:


will not have propagated widely once GCC 13 releases, so rejecting
implicit ints in GCC 13 might be too early.  GCC 14 might want to switch
to C23/C24 mode by default, activating auto support, if the standard
comes out in 2023 (which apparently is the plan).

Then we would go from
warning to changed semantics in a single release.

Comments?


I would argue that changing the default C mode to c23 in the year that
comes out (or even a year later) is too aggressive and early.  Existing
sources are often compiled with defaults, and hence would change
semantics, which seems unattractive.  New code can instead easily use
-std=c23 for a time.

E.g. c99/gnu99 (a largish deviation from gnu90) was never default and
gnu11 was made default only in 2014.



That's true - and the software world still has not recovered from the 
endless mass of drivel that gcc (and other compilers) accepted in lieu 
of decent C as a result of not changing to C99 as the standard.


Good C programmers put the standards flag explicitly in their makefile 
(or other build system).  Bad ones use whatever the compiler gives them 
by default and believe "the compiler accepted it, it must be good code".


My vote would be to make "-std=c17 -Wall -Wextra -Wpedantic -Werror -O2" 
the default flags.  Force those who don't really know what they are 
doing, to learn - it's not /that/ hard, and the effort pays off quickly. 
 (Or they can give up and move to Python.)  Those who understand how to 
use their tools can happily change the standards and warnings to suit 
their needs.


And the person who first decided "implicit declaration of function" 
should merely be a /warning/ should be sentenced to 10 years Cobol 
programming.


It's probably a good thing that it is not I who decides the default 
flags for gcc !


Re: Gcc Digest, Vol 29, Issue 7

2022-07-06 Thread David Brown via Gcc

On 05/07/2022 09:19, Yair Lenga via Gcc wrote:

Hi,

Wanted to get some feedback on an idea that I have - trying to address the
age long issue with type check on VA list function - like 'scanf' and
friends. In my specific case, I'm trying to build code that will parse a
list of values from SELECT statement into list of C variables. The type of
the values is known (by inspecting the result set meta-data). My ideal
solution will be to implement something like:

int result_set_read(struct result_set *p_result_set, ...);

Which can be called with

int int_var ; float float_var ; char c[20] ;
result_set_read(rs1, &int_var, &float_var, c) ;

The tricky part is to verify argument type - make sure . One possible path
I thought was - why not leverage the ability to describe scanf like
functions (
result_set_read(rs1, const char *format, ...) __attribute__((format(scanf,
2, 3))) ;

And then the above call will be
result_set_read(rs1, "%d %f %s", &int_var, &float_var, c) ;

With the added benefit that GCC will flag as error, if there is mismatch
between the variable and the type. My function parses the scanf format to
decide on conversions (just the basic formatting '%f', '%d', '%*s', ...).
So far big improvement, and the only missing item is the ability to enforce
check on string sizes - to support better checks against buffer overflow
(side note: wish there was ability to force inclusion of the max string
size, similar to the sscanf_s).

My question: does anyone know how much effort it will be to add a new GCC
built-in (or extension), that will automatically generate a descriptive
format string, consistent with scanf formatting, avoiding the need to
manually enter the formatting string. This can be thought of as "poor man
introspection". Simple macro can then be used to generate it

#define RESULT_SET_READ(rs, ...) result_set_read(rs,
__builtin_format(__VA_ARGS__),  __VA_ARGS__)

Practically, making the function "safe" (with respect to buffer overflow,
type conversions) for most use cases.

Any feedback, pointers, ... to how to implement will be appreciated

Yair



I haven't worked through all the details, but I wonder if this could be 
turned around a bit.  Rather than your function taking a variable number 
of arguments of different types, which as you know can be a risky 
business, have it take an array of (type, void*) pairs (where "type" is 
an enumeration).  Use some variadic macro magic to turn the 
"RESULT_SET_READ" into the creation of a local array that is then passed 
on to the function.