Re: Disassemble the .lib file compiled with gcc-arm-8.3-2019.03-x86_64-arm-eabi compilation tool chain, and found that malloc is optimized to calloc.

2020-11-04 Thread Richard Earnshaw via Gcc
On 30/10/2020 08:53, YaRu Wei(魏亚茹) wrote:
> Dear gcc:
> I disassembled a .lib file compiled with the
> gcc-arm-8.3-2019.03-x86_64-arm-eabi toolchain and found that
> malloc is optimized to calloc. I want to know under what circumstances malloc 
> will be optimized to calloc?
> Thanks
> 

I would presume this would happen if the call to malloc were immediately
followed by a call to memset to clear the memory (or an equivalent
software coded implementation of memset were detected).  Generally,
calloc has a better idea of the memory being allocated and can optimize
the clearing step if it knows the memory is already zeroed.
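
As a minimal sketch of the pattern described above (the function name
alloc_zeroed is made up for illustration), an allocation that is
immediately cleared is the kind of code where the compiler may, depending
on version and optimization level, emit a single calloc call instead:

```c
#include <stdlib.h>
#include <string.h>

/* Allocate n bytes and clear them.  With optimization enabled, a compiler
   such as GCC may merge this malloc + memset(0) pair into one calloc call;
   whether it does so depends on the compiler version and options. */
void *alloc_zeroed(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        memset(p, 0, n);
    return p;
}
```

Either way the caller sees the same result: a zero-filled buffer, or NULL
on failure.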

R.


Re: [RFC] Increase libstdc++ line length to 100(?) columns

2020-12-03 Thread Richard Earnshaw via Gcc
On 29/11/2020 17:38, Florian Weimer wrote:
> * Allan Sandfeld Jensen:
> 
>> If you _do_ change it. I would suggest changing it to 120, which is next 
>> common step for a lot of C++ projects.
> 
> 120 can be problematic for a full HD screen in portrait mode.  Nine
> pixels per character is not a lot (it's what VGA used), and you can't
> have any window decoration.  With a good font and screen, it's doable.
> But if the screen isn't quite sharp, then I think you wouldn't be able
> to use portrait mode anymore.
> 

Please remember that not everyone has 20:20 vision.  Requiring a
terminal width that's so large that the text is wrapped (or, worse, you
get horizontal scroll bars) is not acceptable, IMO.
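
For reference, the nine-pixel figure quoted above follows from the
1080-pixel width of a full HD (1920x1080) screen in portrait mode.  A
small sketch of the arithmetic (the helper name is invented here):

```c
/* Whole pixels available per character cell when `columns` characters
   must fit across a window that is `width_px` pixels wide. */
int pixels_per_column(int width_px, int columns)
{
    return width_px / columns;
}
```

pixels_per_column(1080, 120) gives 9, while a 100-column limit leaves 10
whole pixels per character.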

R.


Re: why aarch64 doesn't support V4QI.

2020-12-18 Thread Richard Earnshaw via Gcc
On 15/12/2020 14:26, 172060...@hdu.edu.cn wrote:
> Hi,
> 
> I have some problems in gcc development about aarch64. I saw it doesn't
> support V4QI machine mode in aarch64-modes.def, but it has V4QI in
> arm-modes.def.
> 
> I want to know why it doesn't?
> 
> I am looking forward to your replies. Thanks for your help.
> 
> Best regards,
> yancheng
> 

v4qi was used on arm for supporting the Intel XScale wireless-mmx
extensions.  But that doesn't exist in AArch64.

R.


Re: "musttail" statement attribute for GCC?

2021-04-26 Thread Richard Earnshaw via Gcc




On 26/04/2021 14:49, Iain Sandoe via Gcc wrote:

Alexander Monakov  wrote:


On Fri, 23 Apr 2021, Josh Haberman via Gcc wrote:

On Fri, Apr 23, 2021 at 1:10 PM Iain Sandoe wrote:
I did try to use it this ^ for GCC coroutines (where such a guarantee is
pretty important)

However, the issue there is that not all targets support indirect
tailcalls.


I'm not familiar with this limitation. What targets do not support
indirect tail calls?


On Power64 tailcalls to anything that is not a static function are tricky, and
current Clang ICEs for musttail indirect calls (or direct calls to a function
which is not static) on that target. GCC also doesn't do indirect tailcalls on
Power64.

Also, NVPTX (being an intermediate instruction set) does not define a "tailcall
instruction", so Clang also ICEs there. Just mentioning it for completeness.


Also aarch64-linux if I remember correctly.


Tailcalling is possible on aarch64 using the normal branch instructions
(B & BR).

There may, however, be reasons why those can't be used in a specific
compilation context.
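
To make the discussion concrete, here is a rough sketch of an indirect
call in tail position.  The names MUSTTAIL, handler_fn, current_handler
and dispatch are invented for this example; the attribute is only used
where the compiler reports support for it, and otherwise the call is an
ordinary return that the optimizer may or may not turn into a tail call.

```c
/* Use the musttail statement attribute where the compiler provides it;
   otherwise fall back to an ordinary return. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

typedef long (*handler_fn)(long);

static long twice(long x)
{
    return 2 * x;
}

/* Hypothetical dispatch slot; a real program would update it at run time. */
static handler_fn current_handler = twice;

/* The call is indirect and in tail position, and caller and callee share
   the signature long(long). */
long dispatch(long x)
{
    MUSTTAIL return current_handler(x);
}
```

On a target that cannot tail call through a pointer, a compiler that
honours the attribute would be expected to reject dispatch rather than
silently emit a normal call.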




One could use a trampoline to deal with the loading of additional state, but I do
not want to go down that route for coroutines, if possible - since we already have
a state frame, it seems better to me to extend the ABI for that to include space
for target-specific data at a known offset from the frame pointer.

Iain




What about tailcalls between DSOs?


What issues are there around DSOs? Clang/LLVM don't treat these
specially AFAIK, and it seems that tail calls through a PLT should
work okay?


No, some targets have additional requirements for calls that potentially go via
a (PIC) PLT. The most well-known example is probably 32-bit x86, which
requires the %ebx register to hold the address of GOT when entering a PLT
trampoline. Since %ebx is callee-saved, this makes tailcalling impossible.

LLVM solves this by transforming such calls to "no-plt" GOT-indirect calls under
'musttail', so evidently it does treat them specially.

Another example is mips64, where even non-PIC PLT is special (but looks like
LLVM does not do any tailcalls on mips64 at all).

Alexander





Re: Build failure in fixincludes on x86_64

2021-05-26 Thread Richard Earnshaw via Gcc




On 26/05/2021 13:22, Uros Bizjak via Gcc wrote:

The build currently fails to build for me on x86_64 in fixincludes:

/home/uros/gcc-build/./gcc/xgcc -B/home/uros/gcc-build/./gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/
-B/usr/local/x86_64-pc-linux-gnu/lib/ -isystem
/usr/local/x86_64-pc-linux-gnu/include -isystem
/usr/local/x86_64-pc-linux-gnu/sys-include   -c -g -O2 -W -Wall
-Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -Wmissing-format-attribute
-Wno-overlength-strings -pedantic -Wno-long-long   -DHAVE_CONFIG_H -I.
-I../../git/gcc/fixincludes -I../include
-I../../git/gcc/fixincludes/../include
../../git/gcc/fixincludes/fixtests.c
during GIMPLE pass: evrp
../../git/gcc/fixincludes/fixtests.c: In function ‘run_test’:
../../git/gcc/fixincludes/fixtests.c:155:1: internal compiler error:
in operator[], at vec.h:890
  155 | }
  | ^


Same failure on arm.

R.


0x7e4ed0 vec::operator[](unsigned int)
../../git/gcc/gcc/vec.h:890
0x7e509e vec::operator[](unsigned int)
../../git/gcc/gcc/tree.h:3366
0x7e509e vec::operator[](unsigned int)
../../git/gcc/gcc/vec.h:1461
0x7e509e range_def_chain::register_dependency(tree_node*, tree_node*,
basic_block_def*)
../../git/gcc/gcc/gimple-range-gori.cc:179
0x1825ffc fold_using_range::range_of_range_op(irange&, gimple*, fur_source&)
../../git/gcc/gcc/gimple-range.cc:439
0x18292c5 fold_using_range::fold_stmt(irange&, gimple*, fur_source&, tree_node*)
../../git/gcc/gcc/gimple-range.cc:376
0x18295e2 gimple_ranger::fold_range_internal(irange&, gimple*, tree_node*)
../../git/gcc/gcc/gimple-range.cc:1067
0x18295e2 gimple_ranger::range_of_stmt(irange&, gimple*, tree_node*)
../../git/gcc/gcc/gimple-range.cc:1097
0x18256ca gimple_ranger::range_of_expr(irange&, tree_node*, gimple*)
../../git/gcc/gcc/gimple-range.cc:980
0x1825e07 fold_using_range::range_of_range_op(irange&, gimple*, fur_source&)
../../git/gcc/gcc/gimple-range.cc:431
0x18292c5 fold_using_range::fold_stmt(irange&, gimple*, fur_source&, tree_node*)
../../git/gcc/gcc/gimple-range.cc:376
0x18295e2 gimple_ranger::fold_range_internal(irange&, gimple*, tree_node*)
../../git/gcc/gcc/gimple-range.cc:1067
0x18295e2 gimple_ranger::range_of_stmt(irange&, gimple*, tree_node*)
../../git/gcc/gcc/gimple-range.cc:1097
0x18256ca gimple_ranger::range_of_expr(irange&, tree_node*, gimple*)
../../git/gcc/gcc/gimple-range.cc:980
0x110a121 range_query::value_of_expr(tree_node*, gimple*)
../../git/gcc/gcc/value-query.cc:86
0x1834491 hybrid_folder::value_of_expr(tree_node*, gimple*)
../../git/gcc/gcc/gimple-ssa-evrp.c:235
0xfb9d33 substitute_and_fold_engine::replace_uses_in(gimple*)
../../git/gcc/gcc/tree-ssa-propagate.c:575
0xfba04c substitute_and_fold_dom_walker::before_dom_children(basic_block_def*)
../../git/gcc/gcc/tree-ssa-propagate.c:845
0x17fb85f dom_walker::walk(basic_block_def*)
../../git/gcc/gcc/domwalk.c:309
0xfb94d5 substitute_and_fold_engine::substitute_and_fold(basic_block_def*)
../../git/gcc/gcc/tree-ssa-propagate.c:987
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make: *** [Makefile:76: fixtests.o] Error 1

Is this a known failure?

Uros.



Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog

2021-06-17 Thread Richard Earnshaw via Gcc




On 17/06/2021 01:40, Jason Merrill via Gcc wrote:

On 6/16/21 8:17 PM, Martin Sebor wrote:

On 6/16/21 5:45 PM, Jason Merrill wrote:
On Wed, Jun 16, 2021 at 5:46 PM Martin Sebor wrote:


    On 6/16/21 2:49 PM, Jason Merrill wrote:
 > On 6/15/21 11:42 PM, Jason Merrill wrote:
 >> On Tue, Jun 15, 2021 at 10:04 PM Martin Sebor via Gcc
 >> <gcc@gcc.gnu.org> wrote:
 >>
 >>     On 6/15/21 6:56 PM, Hans-Peter Nilsson wrote:
 >>  > On Fri, 11 Jun 2021, Martin Sebor via Gcc wrote:
 >>  >
 >>  >> On 6/11/21 11:32 AM, Jonathan Wakely wrote:
 >>  >>> On Fri, 11 Jun 2021 at 18:02, Martin Sebor wrote:
 >>  >>>> My objection is to making our policies and tools more
 >>  >>>> restrictive than they need to be.  We shouldn't expect
 >>  >>>> everyone to study whole manuals just to figure out how to
 >>  >>>> successfully commit a change (or learn how to format it just
 >>  >>>> the right way).  It should be easy.
 >>  >>>
 >>  >>> I agree, to some extent. But consistency is also good. The
 >>  >>> conventions for GNU ChangeLog formatting exist for a reason,
 >>  >>> and so do the conventions for good Git commit messages.
 >>  >>>
 >>  >>>> Setting this discussion aside for a moment and using a
 >>  >>>> different example, the commit hook rejects commit messages
 >>  >>>> that don't start ChangeLog entries with tabs.  It also rejects
 >>  >>>> commit messages that don't list all the same test files as
 >>  >>>> those changed by the commit (and probably some others as
 >>  >>>> well).  That's in my view unnecessary when the hook could just
 >>  >>>> replace the leading spaces with tabs and automatically mention
 >>  >>>> all the tests.
 >>  >>>>
 >>  >>>> I see this proposal as heading in the same direction.  Rather
 >>  >>>> than making the script fix things up if we get them wrong it
 >>  >>>> would reject the commit, requiring the user to massage the
 >>  >>>> ChangeLog by hand into an unnecessarily rigid format.
 >>  >>>
 >>  >>> You cannot "fix things up" in a server-side receive hook,
 >>  >>> because changing the commit message would alter the commit
 >>  >>> hash, which would require the committer to do a rebase to
 >>  >>> proceed. That breaks the expected behaviour and workflow of a
 >>  >>> git repo.
 >>  >>>
 >>  >>> You can use the scripts on the client side to verify your
 >>  >>> commit message before pushing, so you don't have to be
 >>  >>> surprised when the server rejects it.
 >>  >>
 >>  >> That sounds like a killer argument.  Do we have shared
 >>  >> client-side scripts that could fix things up for us, or are we
 >>  >> each on our own to write them?
 >>  >
 >>  > I hope I got your view wrong.  If not: the "scripts fixing
 >>  > things up for us" direction is flawed (compared to the "scripts
 >>  > rejecting bad formats"), unless offered as a non-default option;
 >>  > please don't proceed.
 >>  >
 >>  > Why?  For one, there'll always be bugs in the scripting.
 >>  > Mitigate those situations: while wrongly rejecting a commit is
 >>  > bad, wrongly "fixing things up" is worse, as a general rule.
 >>  > Better avoid that.  (There's probably a popular "pattern name"
 >>  > for what I try to describe.)
 >>
 >>     The word that comes to mind is Technophobia.  Is it wise to
 >>     trust compilers to transform programs from their source form
 >>     into executables?  What if there are bugs in either?  What about
 >>     the OS?  The whole computer, or the Internet?  Our cars?
 >>     Fortunately, there's more to gain than to lose by trusting
 >>     automation.  If there weren't human progress would be stuck
 >>     sometime in the 1700's.
 >>
 >>     But we're not talking about anything anywhere that sophisticated
 >>     here: a sed script to copy and paste a piece of text in
 >>     the description of a change from one place to another.  It's
 >>     been done a few times before with more important data than
 >>     ChangeLogs.
 >>
 >>
 >> git gcc-commit-mklog already automates most of the process.  It
 >> could also automate adding [PRx] to the first line.  Is that what
 >> you're asking for?
 >
 > Like, say:

    I don't think this solves the problem Xionghu Luo was asking about:
    https://gcc.gnu.org/pipermail/gc

Re: git gcc-commit-mklog doesn't extract PR number to ChangeLog

2021-06-18 Thread Richard Earnshaw via Gcc
On 17/06/2021 18:21, Jakub Jelinek wrote:
> On Thu, Jun 17, 2021 at 05:12:52PM +, Joseph Myers wrote:
>> On Thu, 17 Jun 2021, Richard Earnshaw via Gcc wrote:
>>
>>> It seems a bit dangerous to me to rely on just extracting PR numbers from
>>> tests.  What if the patch is just adjusting a test to make it compatible
>>> with the remainder of the change?
>>
>> Also, that a test is added for a PR, or a commit is relevant to a PR, is a 
>> weaker property than the commit *resolving* the PR.  The fact that a 
>> commit *resolves* a PR (allows it to be marked as resolved, or the 
>> regression markers to be updated if it's resolved in master but the fix 
>> still needs to be backported) needs to be explicitly affirmed by the 
>> committer (possibly based on a question asked by a script) rather than 
>> assumed by default based on the PR being mentioned somewhere.
> 
> mklog as is doesn't fill in the details (descriptions of the changes
> to each function etc.), nor is reliable in many cases, and with Jason's
> recent change just fills in the first and last part of the first line
> but not the important middle part.
> So, the developer has to hand edit it anyway, and I'd consider that also
> to be the right time to verify whether the PR being mentioned
> is the right one etc.  So no need to add a question asked by the script
> at another point.
> 
>   Jakub
> 


That misses my point.  If we use a tool to help doing this we can make
the tool also scrape the entry out of bugzilla and print the summary
line as a stronger visual check that the number has been typed
correctly.  We get many bug attributions wrong simply because two digits
have been transposed: visually checking the summary line is a far
stronger check that the correct number has been entered.

R.


Re: [PATCH] Port GCC documentation to Sphinx

2021-06-29 Thread Richard Earnshaw via Gcc




On 29/06/2021 11:09, Martin Liška wrote:

On 6/28/21 5:33 PM, Joseph Myers wrote:

Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
version also available for review?


I've just uploaded them here:
https://splichal.eu/gccsphinx-final/

Martin



In the HTML version of the gcc manual the sidebar has an "Option index"
link but no link to the general index.  When you follow that link the
page content is just a link to the "index" where everything is all
lumped together.


If we can't have separate indexes for options and general entries, I 
think it would make more sense for the Option index link to be removed 
entirely.


R.


Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Richard Earnshaw via Gcc




On 30/06/2021 05:47, Martin Liška wrote:

On 6/29/21 12:50 PM, Richard Earnshaw wrote:



On 29/06/2021 11:09, Martin Liška wrote:

On 6/28/21 5:33 PM, Joseph Myers wrote:
Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
version also available for review?


I've just uploaded them here:
https://splichal.eu/gccsphinx-final/

Martin



In the HTML version of the gcc manual the sidebar has an "Option
index" link but no link to the general index.  When you follow that
link the page content is just a link to the "index" where everything
is all lumped together.


If we can't have separate indexes for options and general entries, I 
think it would make more sense for the Option index link to be removed 
entirely.


Fully agree with you. Thanks for the feedback and I've changed that to 
the standard Sphinx section,
see e.g. 
https://splichal.eu/gccsphinx-final/html/gcc/indices-and-tables.html


Martin



R.





Thanks.  Given that the manual is nominally in American English, it 
might be better to use the term "indexes" rather than "indices".


https://grammarist.com/usage/indexes-indices/

R.


Re: [RFC] Adding a new attribute to function param to mark it as constant

2021-08-04 Thread Richard Earnshaw via Gcc




On 03/08/2021 18:44, Martin Sebor wrote:

On 8/3/21 4:11 AM, Prathamesh Kulkarni via Gcc wrote:
On Tue, 27 Jul 2021 at 13:49, Richard Biener wrote:


On Mon, Jul 26, 2021 at 11:06 AM Prathamesh Kulkarni via Gcc wrote:


On Fri, 23 Jul 2021 at 23:29, Andrew Pinski wrote:


On Fri, Jul 23, 2021 at 3:55 AM Prathamesh Kulkarni via Gcc wrote:


Hi,
Continuing from this thread,
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575920.html
The proposal is to provide a mechanism to mark a parameter in a
function as a literal constant.

Motivation:
Consider the following intrinsic vshl_n_s32 from arm/arm_neon.h:

__extension__ extern __inline int32x2_t
__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
vshl_n_s32 (int32x2_t __a, const int __b)
{
   return (int32x2_t)__builtin_neon_vshl_nv2si (__a, __b);
}

and its caller:

int32x2_t f (int32x2_t x)
{
    return vshl_n_s32 (x, 1);
}


Can't you do similar to what is done already in the aarch64 back-end:
#define __AARCH64_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0]))
#define __AARCH64_LANE_CHECK(__vec, __idx)  \
 __builtin_aarch64_im_lane_boundsi (sizeof(__vec), sizeof(__vec[0]), __idx)

?
Yes this is about lanes but you could even add one for min/max which
is generic and such; add an argument to say the intrinsics name even.
You could do this as a non-target builtin if you want and reuse it
also for the aarch64 backend.

Hi Andrew,
Thanks for the suggestions. IIUC, we could use this approach to check
if the argument falls within a certain range (min / max), but I am not
sure how it will help to determine if the arg is a constant immediate?
AFAIK, vshl_n intrinsics require that the 2nd arg is immediate?

Even the current RTL builtin checking is not consistent across
optimization levels:
For eg:
int32x2_t f(int32_t *restrict a)
{
   int32x2_t v = vld1_s32 (a);
   int b = 2;
   return vshl_n_s32 (v, b);
}

With pristine trunk, compiling with -O2 results in no errors because
constant propagation replaces 'b' with 2, and during expansion,
expand_builtin_args is happy. But at -O0, it results in the error -
"argument 2 must be a constant immediate".

So I guess we need some mechanism to mark a parameter as a constant ?


I guess you want to mark it in a way that the frontend should force
constant evaluation and error if that's not possible?   C++ doesn't
allow to declare a parameter as 'constexpr' but something like

void foo (consteval int i);

since I guess you do want to allow passing constexpr arguments
in C++ or in C extended forms of constants like

static const int a[4];

foo (a[1]);

?  But yes, this looks useful to me.

Hi Richard,
Thanks for the suggestions and sorry for the late response.
I have attached a prototype patch that implements consteval attribute.
As implemented, the attribute takes one or more arguments, each of which
refers to a parameter position, and the corresponding parameter must be
const-qualified; otherwise the attribute is ignored.


I'm curious why the argument must be const-qualified.  If it's
to keep it from being changed in ways that would prevent it from
being evaluated at compile-time in the body of the function then
to be effective, the enforcement of the constraint should be on
the definition of the function.  Otherwise, the const qualifier
could be used in a declaration of a function but left out from
a subsequent definition of it, letting it modify it, like so:

   __attribute__ ((consteval (1))) void f (const int);

   inline __attribute__ ((always_inline)) void f (int i) { ++i; }


In this particular case it's because the inline function is implementing 
an intrinsic operation in the architecture and the instruction only 
supports a literal constant value.  At present we catch this while 
trying to expand the intrinsic, but that can lead to poor diagnostics 
because we really want to report against the line of code calling the 
intrinsic.


R.


That said, if compile-time function evaluation is the goal then
a fully general solution is an attribute that applies to the whole
function, not just a subset of its arguments.  That way arguments
can also be assigned to local variables within the function that
can then be modified while still evaluated at compile time and
used where constant expressions are expected.  I.e., the design
goal is [a subset of] C++ constexpr.  (Obviously a much bigger
project.)

A few notes on the prototype patch: conventionally GCC warnings
about attributes do not mention when an attribute is ignored.
It may be a nice touch to add to all of them but I'd recommend
against doing that in individual handlers.  Since the attribute
allows pointer constants the warning issued when an argument is
not one should be generalized (i.e., not refer to just integer
constants).

(Other than that, C/C++ warnings should start in lowercase and
not end in a period).

Martin



The patch does type-checking for arguments in
check_function_consteval_attr, which
simply does a linear search to

Re: [RFC] Adding a new attribute to function param to mark it as constant

2021-08-04 Thread Richard Earnshaw via Gcc
On 04/08/2021 13:46, Segher Boessenkool wrote:
> On Wed, Aug 04, 2021 at 05:20:58PM +0530, Prathamesh Kulkarni wrote:
>> On Wed, 4 Aug 2021 at 15:49, Segher Boessenkool wrote:
>>> Both __builtin_constant_p and __is_constexpr will not work in your use
>>> case (since a function argument is not a constant, let alone an ICE).
>>> It only becomes a constant value later on.  The manual (for the former)
>>> says:
>>>   You may use this built-in function in either a macro or an inline
>>>   function. However, if you use it in an inlined function and pass an
>>>   argument of the function as the argument to the built-in, GCC never
>>>   returns 1 when you call the inline function with a string constant or
>>>   compound literal (see Compound Literals) and does not return 1 when you
>>>   pass a constant numeric value to the inline function unless you specify
>>>   the -O option.
>> Indeed, that's why I was thinking if we should use an attribute to mark
>> param as a constant, so during type-checking the function call, the
>> compiler can emit a diagnostic if the passed arg is not a constant.
> 
> That will depend on the vagaries of what optimisations the compiler
> managed to do :-(
> 
>> Alternatively -- as you suggest, we could define a new builtin, say
>> __builtin_ice(x) that returns true if 'x' is an ICE.
> 
> (That is a terrible name, it's not clear at all to the reader, just
> write it out?  It is fun if you know what it means, but infuriating
> otherwise.)
> 
>> And wrap the intrinsic inside a macro that would check if the arg is an ICE ?
> 
> That will work yeah.  Maybe not as elegant as you'd like, but not all
> that bad, and it *works*.  Well, hopefully it does :-)
> 
>> For eg:
>>
>> __extension__ extern __inline int32x2_t
>> __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
>> vshl_n_s32_1 (int32x2_t __a, const int __b)
>> {
>>   return __builtin_neon_vshl_nv2si (__a, __b);
>> }
>>
>> #define vshl_n_s32(__a, __b) \
>> ({ typeof (__a) a = (__a); \
>>    _Static_assert (__builtin_constant_p ((__b)), \
>>                    #__b " is not an integer constant"); \
>>    vshl_n_s32_1 (a, (__b)); })
>>
>> void f(int32x2_t x, const int y)
>> {
>>   vshl_n_s32 (x, 2);
>>   vshl_n_s32 (x, y);
>>
>>   int z = 1;
>>   vshl_n_s32 (x, z);
>> }
>>
>> With this, the compiler rejects vshl_n_s32 (x, y) and vshl_n_s32 (x,
>> z) at all optimization levels since neither 'y' nor 'z' is an ICE.
> 
> You used __builtin_constant_p though, which works differently, so the
> test is not conclusive, might not show what you want to show.
> 
>> Instead of __builtin_constant_p, we could use __builtin_ice.
>> Would that be a reasonable approach ?
> 
> I think it will work, yes.
> 
>> But this changes the semantics of intrinsic from being an inline
>> function to a macro, and I am not sure if that's a good idea.
> 
> Well, what happens if you call the actual builtin directly, with some
> non-constant parameter?  That just fails with a more cryptic error,
> right?  So you can view this as some syntactic sugar to make these
> intrinsics easier to use.
> 
> Hrm I now remember a place I could have used this:
> 
> #define mtspr(n, x) do { asm("mtspr %1,%0" : : "r"(x), "n"(n)); } while (0)
> #define mfspr(n) ({ \
>   u32 x; asm volatile("mfspr %0,%1" : "=r"(x) : "n"(n)); x; \
> })
> 
> It is quite similar to your builtin code really, and I did resort to
> macros there, for similar reasons :-)
> 
> 
> Segher
> 

We don't want to have to resort to macros.  Not least because at some
point we want to replace the content of arm_neon.h with a single #pragma
directive to remove all the parsing of the header that's needed.  What's
more, if we had a suitable pragma we'd stand a fighting chance of being
able to extend support to other languages as well that don't use the
pre-processor, such as Fortran or Ada (not that that is on the cards
right now).
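
For readers following the quoted discussion, the macro-based check being
debated can be sketched roughly as below.  This is an illustration only,
not what arm_neon.h does: IS_ICE, shl_n_impl and shl_n are invented
names, a plain shift stands in for the real intrinsic so the sketch
builds on any target, and the test for an integer constant expression is
modelled on the Linux kernel's __is_constexpr trick, which relies on GNU
C extensions (sizeof (void) and statement expressions).

```c
/* IS_ICE(x) evaluates to 1 if x is an integer constant expression and 0
   otherwise, without evaluating x.  If x is an ICE, (void *)((long)(x) * 0l)
   is a null pointer constant, so the conditional expression has type
   'int *'; otherwise it has type 'void *', and GNU C defines
   sizeof (void) as 1.  IS_ICE is a hypothetical name. */
#define IS_ICE(x) \
  (sizeof (int) == sizeof (*(8 ? ((void *) ((long) (x) * 0l)) : (int *) 8)))

/* Hypothetical stand-in for an intrinsic whose second operand must be an
   immediate; a plain shift is used so the sketch builds on any target. */
static inline int shl_n_impl (int a, const int n)
{
  return a << n;
}

/* Macro wrapper: rejects a non-constant shift count at compile time,
   independently of the optimization level. */
#define shl_n(a, n) \
  (__extension__ ({ _Static_assert (IS_ICE (n), #n " is not an integer constant"); \
                    shl_n_impl ((a), (n)); }))
```

With this wrapper shl_n (x, 2) compiles, while shl_n (x, z) for an
ordinary variable z fails the static assertion.  It also shows the cost
discussed in this thread: the user-visible entry point is now a macro
rather than a function.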

R.


Re: [RFC] Adding a new attribute to function param to mark it as constant

2021-08-04 Thread Richard Earnshaw via Gcc




On 04/08/2021 14:40, Segher Boessenkool wrote:

On Wed, Aug 04, 2021 at 02:00:42PM +0100, Richard Earnshaw wrote:

We don't want to have to resort to macros.  Not least because at some
point we want to replace the content of arm_neon.h with a single #pragma
directive to remove all the parsing of the header that's needed.  What's
more, if we had a suitable pragma we'd stand a fighting chance of being
able to extend support to other languages as well that don't use the
pre-processor, such as Fortran or Ada (not that that is on the cards
right now).


So how do you want to handle constants-that-are-not-yet-constant, say
before inlining?  And how do you want to deal with those possibly not
ever becoming constant, perhaps because you used a too low "n" in -On
(but there are very many random other causes)?  And, what *is* a
constant, anyway?  This is even more fuzzy if you consider those
other languages as well.

(Does skipping parsing of some trivial header save so much time?  Huh!)



Trivial?  arm_neon.h is currently 20k lines of source.  What's more, it 
has to support inline functions that might not be available when the 
header is parsed, but might become available if the user subsequently 
compiles a function with different attributes enabled.  It is very 
definitely *NOT* trivial.


R.



Segher



Re: [RFC] Adding a new attribute to function param to mark it as constant

2021-08-05 Thread Richard Earnshaw via Gcc
On 04/08/2021 18:59, Segher Boessenkool wrote:
> On Wed, Aug 04, 2021 at 07:08:08PM +0200, Florian Weimer wrote:
>> * Segher Boessenkool:
>>
>>> On Wed, Aug 04, 2021 at 03:27:00PM +0100, Richard Earnshaw wrote:
>>>> On 04/08/2021 14:40, Segher Boessenkool wrote:
>>>>> On Wed, Aug 04, 2021 at 02:00:42PM +0100, Richard Earnshaw wrote:
>>>>>> We don't want to have to resort to macros.  Not least because at some
>>>>>> point we want to replace the content of arm_neon.h with a single #pragma
>>>>>> directive to remove all the parsing of the header that's needed.  What's
>>>>>> more, if we had a suitable pragma we'd stand a fighting chance of being
>>>>>> able to extend support to other languages as well that don't use the
>>>>>> pre-processor, such as Fortran or Ada (not that that is on the cards
>>>>>> right now).
>>>>>
>>>>> So how do you want to handle constants-that-are-not-yet-constant, say
>>>>> before inlining?  And how do you want to deal with those possibly not
>>>>> ever becoming constant, perhaps because you used a too low "n" in -On
>>>>> (but there are very many random other causes)?  And, what *is* a
>>>>> constant, anyway?  This is even more fuzzy if you consider those
>>>>> other languages as well.
>>>>>
>>>>> (Does skipping parsing of some trivial header save so much time?  Huh!)
>>>>
>>>> Trivial?  arm_neon.h is currently 20k lines of source.  What's more, it
>>>> has to support inline functions that might not be available when the
>>>> header is parsed, but might become available if the user subsequently
>>>> compiles a function with different attributes enabled.  It is very
>>>> definitely *NOT* trivial.
>>>
>>> Ha yes :-)  I just assumed without looking that it would be like other
>>> architectures' intrinsics headers.  Whoops.
>>
>> But isn't it?
>>
>> $ echo '#include ' | gcc -E - | wc -l
>> 41045
> 
> $ echo '#include ' | gcc -E - -maltivec | wc -l
> 9
> 
> Most of this file (774 lines) is #define's, which take essentially no
> time at all.  And none of the other archs I have looked at have big
> headers either!
> 
> 
> Segher
> 

arm_sve.h isn't large either, but that's because all it contains (other
than a couple of typedefs) is

#pragma GCC aarch64 "arm_sve.h"

:)

R.


Re: [RFC] Adding a new attribute to function param to mark it as constant

2021-08-06 Thread Richard Earnshaw via Gcc




On 06/08/2021 01:06, Martin Sebor via Gcc wrote:

On 8/4/21 3:46 AM, Richard Earnshaw wrote:



On 03/08/2021 18:44, Martin Sebor wrote:

On 8/3/21 4:11 AM, Prathamesh Kulkarni via Gcc wrote:
On Tue, 27 Jul 2021 at 13:49, Richard Biener wrote:


On Mon, Jul 26, 2021 at 11:06 AM Prathamesh Kulkarni via Gcc wrote:


On Fri, 23 Jul 2021 at 23:29, Andrew Pinski wrote:


On Fri, Jul 23, 2021 at 3:55 AM Prathamesh Kulkarni via Gcc wrote:


Hi,
Continuing from this thread,
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575920.html
The proposal is to provide a mechanism to mark a parameter in a
function as a literal constant.

Motivation:
Consider the following intrinsic vshl_n_s32 from arm/arm_neon.h:

__extension__ extern __inline int32x2_t
__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
vshl_n_s32 (int32x2_t __a, const int __b)
{
   return (int32x2_t)__builtin_neon_vshl_nv2si (__a, __b);
}

and its caller:

int32x2_t f (int32x2_t x)
{
    return vshl_n_s32 (x, 1);
}


Can't you do similar to what is done already in the aarch64 back-end:
#define __AARCH64_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0]))
#define __AARCH64_LANE_CHECK(__vec, __idx)  \
 __builtin_aarch64_im_lane_boundsi (sizeof(__vec), sizeof(__vec[0]), __idx)

?
Yes this is about lanes but you could even add one for min/max which
is generic and such; add an argument to say the intrinsics name even.
You could do this as a non-target builtin if you want and reuse it
also for the aarch64 backend.

Hi Andrew,
Thanks for the suggestions. IIUC, we could use this approach to check
if the argument falls within a certain range (min / max), but I am not
sure how it will help to determine if the arg is a constant immediate?
AFAIK, vshl_n intrinsics require that the 2nd arg is immediate?

Even the current RTL builtin checking is not consistent across
optimization levels:
For eg:
int32x2_t f(int32_t *restrict a)
{
   int32x2_t v = vld1_s32 (a);
   int b = 2;
   return vshl_n_s32 (v, b);
}

With pristine trunk, compiling with -O2 results in no errors because
constant propagation replaces 'b' with 2, and during expansion,
expand_builtin_args is happy. But at -O0, it results in the error -
"argument 2 must be a constant immediate".

So I guess we need some mechanism to mark a parameter as a constant ?


I guess you want to mark it in a way that the frontend should force
constant evaluation and error if that's not possible?   C++ doesn't
allow to declare a parameter as 'constexpr' but something like

void foo (consteval int i);

since I guess you do want to allow passing constexpr arguments
in C++ or in C extended forms of constants like

static const int a[4];

foo (a[1]);

?  But yes, this looks useful to me.

Hi Richard,
Thanks for the suggestions and sorry for the late response.
I have attached a prototype patch that implements consteval attribute.
As implemented, the attribute takes one or more arguments, each of which
refers to a parameter position, and the corresponding parameter must be
const-qualified; otherwise the attribute is ignored.


I'm curious why the argument must be const-qualified.  If it's
to keep it from being changed in ways that would prevent it from
being evaluated at compile-time in the body of the function then
to be effective, the enforcement of the constraint should be on
the definition of the function.  Otherwise, the const qualifier
could be used in a declaration of a function but left out from
a subsequent definition of it, letting it modify it, like so:

   __attribute__ ((consteval (1))) void f (const int);

   inline __attribute__ ((always_inline)) void f (int i) { ++i; }
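
To make the point concrete: in C a top-level const on a parameter is not part of the function's type, so the two declarations below are compatible and the definition is free to modify its own copy. This is a minimal sketch with a made-up function name (bump); the proposed attribute is left out so that it compiles with current compilers.

```c
/* Declaration with a const-qualified parameter.  */
int bump (const int);

/* Compatible definition without the qualifier: top-level qualifiers on
   parameters are ignored when function types are compared, so the body
   may modify its local copy of the argument.  */
int bump (int i)
{
  ++i;
  return i;
}
```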


In this particular case it's because the inline function is 
implementing an intrinsic operation in the architecture and the 
instruction only supports a literal constant value.  At present we 
catch this while trying to expand the intrinsic, but that can lead to 
poor diagnostics because we really want to report against the line of 
code calling the intrinsic.


Presumably the intrinsics can accept (or can be made to accept) any
constant integer expressions, not just literals.  E.g., the aarch64
builtin below accepts them.  For example, this is accepted in C++:

   __Int64x2_t f (__Int32x2_t a)
   {
     constexpr int n = 2;
     return __builtin_aarch64_vshll_nv2si (a, n + 1);
   }

Making the intrinsics accept constant arguments in constexpr-like
functions and introducing a constexpr-lite attribute (for C code)
was what I was suggesting by the constexpr comment below.  I'd find
that a much more general and more powerful design.



Yes, it would be unfortunate if the rule made it impossible to avoid 
idiomatic const-exprs like those you would write in C++, or even those 
you'd write naturally in C:


#define foo (1u << 5)



But my comment above was to highlight that if requiring the function
argument referenced by the proposed consteval attribute to be const
is necessary to prevent it from being modified 

Re: ARM32 configury changes, with no FPU as a default

2021-09-17 Thread Richard Earnshaw via Gcc




On 17/09/2021 11:23, Florian Weimer via Gcc wrote:

* Matthias Klose:


Starting with GCC 8, the configury allows to encode extra features into the
architecture string. Debian and Ubuntu's armhf (hard float) architecture is
configured with

   --with-arch=armv7-a --with-fpu=vfpv3-d16

and now should be configured with

   --with-arch=armv7-a+fp

The --with-fpu configure option is deprecated.  The problem with this approach
is that there is no default for the fpu setting, while old compilers silently
pick up the -mfpu from the configured compiler.


FWIW, Fedora uses:

 --with-tune=generic-armv7-a --with-arch=armv7-a \
--with-float=hard --with-fpu=vfpv3-d16 --with-abi=aapcs-linux \

Not sure how it is impacted by this change.


This breaks software which explicitly configures things like
-march=armv7-a, or where the architecture string is embedded in the
source as an attribute.  So going from one place in the compiler about
configuring the ABI for a distro arch, this config now moves to some
dozen places in different packages.  Not the thing I would expect.


I don't know if we have seen such problems in Fedora.  I don't remember
any reports.



That's still using the now-deprecated --with-fpu option.  I want to 
remove that from GCC eventually in favour of the new way of adding the 
FP configuration as part of the architecture.  Recent versions of the 
Arm architecture do not document separate FPU versions, but just add 
features to the main architecture, so we need to move away from the old 
approach.


R.


Thanks,
Florian



Re: [FYI] bugzilla cleanup

2021-09-17 Thread Richard Earnshaw via Gcc




On 16/09/2021 16:44, Martin Sebor via Gcc wrote:

On 9/14/21 2:10 AM, Andrew Pinski via Gcc wrote:

Hi all,
   I am doing some bugzilla cleanup.  This includes disabling some
components and some versions for new bugs.
So far I have disabled versions before GCC 4 because we have not had a
report from someone for those versions in over 7 years.  I disabled
some versions which are about developmental branches which are
inactive too.
I also disabled the java, libgcj, fastjar, libmudflap, treelang and
libf2c components.

I am in the process of moving away from having an inline-asm component
to an inline-asm keyword instead; this was suggested on IRC and I
agree.  After the current open bugs have moved away from the
inline-asm component, I will disable it also.

If anyone else has any other suggestions that should be done, please
let me know and I will look into doing it.


Re: Keywords: I find it useful to differentiate between two kinds of
diagnostic bugs: false positives and false negatives (the latter for
existing warnings that don't trigger when intended, as opposed to
requests to enhance existing warnings or add new ones). I've been
using Personal Tags for this but it might be useful to others as
well.  If you agree and add the corresponding new keywords
(false-positive and false-negative) I'll set them based on my Tags.

One other suggestion: every once in a while someone asks if
ice-on-invalid-code bugs apply to syntactically well-formed code that
has undefined behavior (I don't believe it does).  It would help to
clarify the Description for this Keyword (and, correspondingly, for
ice-on-valid).  E.g., something like

ice-on-invalid-code: ICE on code that is not syntactically valid.
ice-on-valid-code: ICE on code that is syntactically valid.



What about syntactically valid but semantically invalid code?  I'd call 
that ICE-on-invalid as well.


R.


Martin


Re: Can gcc.dg/torture/pr67828.c be an infinite loop?

2021-09-24 Thread Richard Earnshaw via Gcc




On 24/09/2021 10:29, Andrew Pinski via Gcc wrote:

On Fri, Sep 24, 2021 at 1:05 AM Aldy Hernandez via Gcc  wrote:


Hi folks.

My upcoming threading improvements turn the test below into an infinite
runtime loop:

int a, b;
short c;

int
main ()
{
int j, d = 1;
for (; c >= 0; c++)
  {
BODY:
a = d;
d = 0;
if (b)
 {
   xprintf (0);
   if (j)
 xprintf (0);
 }
  }
xprintf (d);
exit (0);
}

On the false edge out of if(b) we thread directly to BODY, eliding the
loop conditional, because we know that c>=0 because it could never overflow.


Huh? Why would c>=0 always be true? The expression "c++" is really c =
(short)(((int)c)+1), so it will definitely wrap around when c is
SHRT_MAX.


Except when sizeof(short) == sizeof (int), at which point the 'int' 
expression will overflow and we get UB again.
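
A small sketch of the promotion described above, assuming the common case where int is wider than short. The helper names are made up for illustration. The addition is done in int, so it cannot overflow; converting the out-of-range result back to short is implementation-defined rather than undefined, and GCC documents that it reduces the value modulo 2^N.

```c
#include <limits.h>

/* 'c++' on a short behaves like c = (short) ((int) c + 1).  */
short next_short (short c)
{
  c++;
  return c;
}

/* The same step written out explicitly.  */
short next_short_explicit (short c)
{
  return (short) ((int) c + 1);
}
```

With GCC on a target where short is 16 bits and int is 32 bits, next_short (SHRT_MAX) yields SHRT_MIN, so a loop conditioned on c >= 0 does terminate.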


R.



Thanks,
Andrew Pinski



Since B is globally initialized to 0, this has the effect of turning the
test into an infinite loop.

Is this correct, or did I miss something?
Aldy



Re: How to describe ‘earlyclobber’ explicitly for specific source operand ?

2021-11-23 Thread Richard Earnshaw via Gcc




On 22/11/2021 06:40, Jojo R via Gcc wrote:


— Jojo
On 20 Nov 2021 at 6:11 AM +0800, Peter Bergner wrote:

On 11/19/21 1:28 AM, Jojo R via Gcc wrote:

We know gcc supply earlyclobber function to avoid register overlap,

but it can not describe explicitly for specific source operand, is it right ?


You add the early clobber to the OUTPUT operand(s) that can clobber any of the
input source operands. You don't mark the source operands that could be 
clobbered.

Yes, so we need to enhance the early clobber to cover this scene ?


Peter


You can write alternatives that explicitly tie a source to the 
destination, provided that the source and destination are the same size. 
 See the Arm backend for examples.


R.


Re: Labelling of regressions in Bugzilla

2021-12-15 Thread Richard Earnshaw via Gcc




On 15/12/2021 11:39, Jonathan Wakely via Gcc wrote:

On IRC we've been discussing some changes to Bugzilla that would give
a bit more structure to how we label and process regressions.

Currently we add something like "[9/10/11/12 Regression]" to the start
of the summary, and then edit that when it's fixed on a branch, when
forking a new release branch from trunk, and when the oldest branch is
closed and becomes unmaintained. Finding active regressions means a
free-text search in the summary field.

On IRC we discussed adding a new custom field to record which branches
a regression affects. This could be like the
known-to-work/known-to-fail fields, so only allow predefined values,
and allow searching by "all of" and by "any of". The possible values
for the field would only be the major releases, not all the minor
releases and snapshots and vendor branches that are in the
known-to-work field. So just 4.9, 5, 6, 7 etc. not 4.9.4, 5.0, 5.1.0,
5.1.1 etc.

When a new branch is forked from trunk before a release all bugs that
have "trunk" in the regression field would automatically get "12"
added (or if we already used "12" instead of "trunk" they'd get "13"
added, either way would work).

Unlike the current system, we wouldn't need to remove closed branches
from the regressions field. We do that today so the Summary field
doesn't get cluttered with old branch info, but if it's in a separate
field keeping the old data present is valuable. We would only remove a
branch from that field when the regression is fixed on the branch. We
would still be able to search for regressions in active branches
easily, but it would also be possible to see at a glance that a given
regression was present on the gcc-8 branch and never fixed there. This
would also help vendors who maintain older branches, as the
information that a regression affected an old branch would not be
wiped out of the summary when the branch reaches EOL.

Jakub also suggested it would be nice to be able to enter a revision
into a "regressed at" field and have it auto-populate the regressions
list with all branches that the commit is present in. (Ideally any of
SVN r numbers, or git revisions, or gcc-descr rNN-NNN strings
could be used there). That would be useful when we bisect the
regression and find where it started.

Iain pointed out a drawback of not having the regression info in the
Summary. Currently it does draw your attention when looking at the
results of a bugzilla search. Andrew noted that bug aliases are
automatically added to the summary, e.g. https://gcc.gnu.org/PR94404
shows its alias "(c++core-issues)". Maybe we could do that for
regressions (for the active branches only, so the result would be
similar to what we have today).

Thoughts? Objections? Better ideas?



My immediate thought (since I tend to dislike deleting history) is why 
not have two fields?  One listing all the release branches where this 
has occurred and another for where it has now been fixed.  That way you 
can see quickly whether the regression has ever affected some versions 
of a release.  Something we lack today with the single fixed in field is 
the ability to track exactly which dot releases of each branch contained 
the fix for a regression.


Other than that, I have no other concerns at the moment.


Re: Help with an ABI peculiarity

2022-01-11 Thread Richard Earnshaw via Gcc




On 10/01/2022 08:38, Florian Weimer via Gcc wrote:

* Jeff Law via Gcc:


Most targets these days use registers for parameter passing and
obviously we can run out of registers on all of them.  The key
property is the size/alignment of the argument differs depending on if
it's passed in a register (gets promoted) or passed in memory (not
promoted).  I'm not immediately aware of another ABI with that
feature.  Though I haven't really gone looking.


I think what AArch64 Darwin does is not compatible with a GCC extension
that allows calling functions defined with a prototype without it (for
pre-ISO-C compatibility).  Given that, anyone defining an ABI in
parallel with a GCC implementation probably has paused, reconsidered
what they were doing, and adjusted the ABI for K&R compatibility.

Thanks,
Florian



Not having a prototype was deprecated in C89.  One would hope that after 
33 years we could move on from that.


R.


Re: [ANNOUNCEMENT] Mass rename of C++ .c files to .cc suffix is going to happen on Jan 17 evening UTC TZ

2022-01-18 Thread Richard Earnshaw via Gcc




On 17/01/2022 21:41, Martin Liška wrote:

On 1/13/22 12:01, Martin Liška wrote:

Hello.

Based on the discussion with release managers, the change is going to 
happen

after stage4 begins.

Martin


Hi.

The renaming patches have been just installed and I've built a few 
target compilers so far.
I'll be online in ~10 hours from now so I can address potential issues 
caused by the patch.


One note: I would recommend using:
git config log.follow true



Is that worth adding to contrib/gcc-git-customization.sh ?

That causes git log following changes of a file before it was renamed so 
that

one can get a complete history.

Cheers,
Martin


R.


Re: Benchmark recommendations needed

2022-02-22 Thread Richard Earnshaw via Gcc
Dhrystone is (and probably always was) a bogus benchmark.  It's a 
well-known truism that MIPS stands for Meaningless Indication of 
Processor Speed, and dhrystone scores are equally meaningless. 
Dhrystone fell out of common usage over 20 years ago.


It's not GCC that is being peculiar, it's just Dhrystone is pointless.

R.

On 22/02/2022 05:22, Andras Tantos wrote:

That's true, I did notice GCC being rather ... peculiar about
dhrystone. Is there a way to make it less clever about the benchmark?

Or is there some alteration to the benchmark I can make to not trigger
the special behavior in GCC?

Andras

On Mon, 2022-02-21 at 03:19 +0000, Gary Oblock via Gcc wrote:

Trying to use the dhrystone isn't going to be very useful. It has
many downsides not the least is that gcc's optimizer can run rings
about it.

Gary


From: Gcc  on
behalf of gcc-requ...@gcc.gnu.org 
Sent: Tuesday, February 15, 2022 6:25 AM
To: gcc@gcc.gnu.org 
Subject: Re:




Re: ARM Cortex-R5F Support

2022-03-02 Thread Richard Earnshaw via Gcc




On 01/03/2022 16:23, Kinsey Moore wrote:

Hi,

I'm looking at working on Cortex-R5F support for RTEMS, but it seems as 
if latest GCC supports the Cortex-R5. This R5 has implicit FPU support 
which would make it really R5F. The ARM reference page on this core 
(https://developer.arm.com/Processors/Cortex-R5) specifies that the FPU 
is optional. I see that the FPU support can probably be disabled using 
the nofp option to achieve Cortex-R5 support, but I was wondering why 
this is handled differently from the Cortex-R4[F] support since that is 
broken out into two different CPU entries in gcc/config/arm/arm-cpus.in. 
It appears that R7 and R8 are handled the same way as R5.


Is the R4/R4F just the legacy way of handling this and R5/7/8 are the 
new way?




Arm no longer gives distinct product names for products that come in 
multiple guises.  Another example of this is that many armv8-a products 
have an optional crypto unit but have the same product name.


So to answer your question more directly, the -mcpu=cortex-r5 will by 
default be considered to have an FPU, provided that the compiler was 
built with --with-fpu=auto (the default).  If you specify 
-mfloat-abi=soft, then even if the product has an FPU, or in some 
cases a SIMD unit, these will never be used.  So I'd recommend:


For FP support: -mcpu=cortex-r5 -mfloat-abi=hard
For no FP support: -mcpu=cortex-r5 -mfloat-abi=soft

There's also a mid-way variant of -mcpu=cortex-r5 -mfloat-abi=softfp, 
which would use the FP hardware but use the soft-float calling 
conventions; this code is abi-compatible with the no-fp variant above.


HTH,
R.



Thanks,

Kinsey



Re: what is the difference with and without crc extension support

2022-03-04 Thread Richard Earnshaw via Gcc

On 03/03/2022 13:41, Dongjiu Geng via Gcc wrote:

Hi,
  My program does not use CRC instructions, but I find the compiled
binary differs considerably between "-march=armv8-a+crc" and
"-march=armv8-a". Even stranger, when I use
"-march=armv8-a+crc", my compiled binary cannot run, but when
I change -O2 to -O0 with "-march=armv8-a+crc", it can run. I do
not know the reason. Can you explain it? Thanks.


The most common reason for a program failing to run when the optimizer 
is turned on is that it contains undefined behaviour.  Make sure that 
you turn on GCC's warning options and fix anything that these report.


It's very unlikely that the change of -march=armv8-a to 
-march=armv8-a+crc is the reason for the differences in compiled code 
that you are seeing.  Something else to try here is to identify an 
object file that is different between the two options and then rebuild 
it both ways, but add "-g".  You can then use


  readelf --debug-dump=str <object-file>

to print out the real options used by the compiler (there may be some 
additional strings reported in the dump, but the options are the bit 
that is interesting).  The only difference should be the -march option 
you mentioned above.


R.


Re: Urgent GCC ABI backend maintainer ping re zero width bitfield passing (PR102024)

2022-03-22 Thread Richard Earnshaw via Gcc




On 21/03/2022 16:28, Jakub Jelinek via Gcc wrote:

Hi!

I'd like to ping port maintainers about
https://gcc.gnu.org/PR102024

As I wrote, the int : 0 bitfields are present early in the TYPE_FIELDS
during structure layout and intentionally affect the layout.
We had some code to remove those from TYPE_FIELDS chains in the C and C++
FEs, but for C that removal never worked correctly (never removed any)
and the non-working removal was eventually removed.  For C++ it
didn't initially work either, but for GCC 4.5 that was fixed in PR42217,
so on various backends where TYPE_FIELDS are analyzed for how to pass or
return certain aggregates starting with GCC 4.5 the C++ and C ABI diverged.
In August, I have removed that zero width bitfield removal from C++ FE
as the FE needs to take those bitfields into account later on as well.

The x86_64 backend was changed in r12-6418-g3159da6c46 to match recently
approved clarification of the x86-64 psABI and the zero width bitfields
are now ignored for both C and C++ (so an ABI change for C from 11.x and
earlier to 12.x and for C++ from GCC 4.4 and earlier to 4.5 and later)
with a -Wpsabi diagnostics about it.

The rs6000 backend was changed in r12-3843-g16e3d6b8b2 to never ignore
those bitfields (so no ABI change for C, for C++ ...-4.4 and 12+ are
ABI incompatible with 4.5 through 11.x; note, it affects I think just
ppc64le ABI, which didn't really exist before 4.8 I think) and diagnostics
has been added about the ABI change.

As I wrote in the PR, I believe most of the GCC backends are unaffected,
x86_64 and rs6000 are handled, riscv was changed already in GCC 10 to
ignore those bitfields and emit a -Wpsabi diagnostics.

I can see code-generation differences certainly on armv7hl and aarch64.
ia64, iq2000, mips, s390 and sparc are maybe affected, haven't checked.

Simple testcase could be e.g.:
struct S { float a; int : 0; float b; };

__attribute__((noipa)) struct S
foo (struct S x)
{
   return x;
}

void
bar (void)
{
   struct S s = { 0.0f, 0.0f };
   foo (s);
}
where one should look at the argument and return value passing
in GCC 11 C, GCC 11 C++, GCC trunk C, GCC trunk C++.

The FE now sets bits on the bitfields that make it possible to
differentiate between the different cases, so each port may decide to do
one of the 3 things:
1) keep ABI exactly compatible between GCC 11 and 12, which means
C and C++ will continue to be incompatible
2) keep the G++ 4.5 through 11 ABI of ignoring zero width bitfields and
change C ABI
3) keep the GCC < 11 C ABI of not ignoring zero width bitfields and
change the C++ ABI (which means restoring ABI compatibility in
this regard between G++ 4.4 and earlier with G++ 12 and later)
Furthermore, it would be very nice to emit -Wpsabi diagnostics for the
changed ABI unless 1) is decided.
One should take into account psABI as well as what other compilers do.

The current state of GCC trunk is that 3) is done except that x86_64
did 2) and riscv did 2 already for GCC 10 and all of x86_64, riscv and
rs6000 emit -Wpsabi diagnostics (though I think rs6000 doesn't guard
it with -Wpsabi).

I can help with the backend implementations if needed, but I can't
decide which possibility you want to choose for each backend.
It would be really nice to decide about this soon, because changing
the ABI in GCC 12 only to change it again in GCC 13 doesn't look much
desirable and even if 3) is the choice, it is really nice to have
some diagnostics about ABI changes.

Thanks

Jakub



Unless I've missed something subtle here, the layout of

  struct S { float a; int : 0; float b;};

is going to be identical to

  struct T { float a; float b;};

on pretty much every architecture I can think of, so this is purely 
about parameter passing rules for the former and whether the two cases 
above should behave the same way.
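
The layout claim is easy to check directly. A minimal sketch, assuming a target where float and int have the same size and alignment (the usual case); the helper function names are made up for illustration.

```c
#include <stddef.h>

struct S { float a; int : 0; float b; };
struct T { float a; float b; };

/* Helpers so the two layouts can be compared at run time.  */
size_t size_of_S (void) { return sizeof (struct S); }
size_t size_of_T (void) { return sizeof (struct T); }
size_t offset_of_S_b (void) { return offsetof (struct S, b); }
size_t offset_of_T_b (void) { return offsetof (struct T, b); }
```

On such targets the sizes and the offsets of b should come out equal, which leaves parameter passing as the only place the zero-width bit-field can make a difference.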


The AAPCS and AAPCS64 both contain the same statement as part of the 
definition of an HFA:


| The test for homogeneity is applied after data layout is
| completed and without regard to access control or other source
| language restrictions.

The access control and source language restrictions were intended to 
cover C++-style features such as public/private, so aren't really 
relevant to this discussion (though you might plausibly read 'source 
language restriction' to cover this).  However, because the test 
is applied after layout has been done, and because a zero-sized bit-field 
neither

- adds an accessible member, nor
- changes the layout in any case I can think of that would potentially 
be an HFA,

my preliminary conclusion is that for Arm and AArch64 we still have a 
duck here (if it walks like one and quacks like one...).


I'm still awaiting final confirmation of this from our internal ABI 
group, but I'm pretty confident that this will be our final position.


R.

PS.  It looks like llvm and llvm++ are inconsistent on this one as well.


Re: Urgent GCC ABI backend maintainer ping re zero width bitfield passing (PR102024)

2022-03-22 Thread Richard Earnshaw via Gcc




On 22/03/2022 16:51, Jakub Jelinek via Gcc wrote:

On Tue, Mar 22, 2022 at 04:28:08PM +0000, Richard Earnshaw wrote:

Unless I've missed something subtle here, the layout of

   struct S { float a; int : 0; float b;};

is going to be identical to

   struct T { float a; float b;};

on pretty much every architecture I can think of, so this is purely about
parameter passing rules for the former and whether the two cases above
should behave the same way.


Layout is always done with the int : 0; bitfields in TYPE_FIELDS and
only after that is done C++ FE used to remove them.
So yes, it only can affect the passing of parameters and return values
in registers (or partially in registers, partially in memory).


The AAPCS and AAPCS64 both contain the same statement as part of the
definition of an HFA:

| The test for homogeneity is applied after data layout is
| completed and without regard to access control or other source
| language restrictions.

The access control and source language restrictions was intended to cover
c++-style features such as public/private, so aren't really relevant to this
discussion (though you might plausibly read 'source language restriction' to
cover this).  However, the fact that the test is applied after layout has
been done and because a zero-sized bit-field neither
- adds an accessible member
- changes the layout in any case I can think of that would potentially be an
HFA.
my preliminary conclusion is that for Arm and AArch64 we still have a duck
here (if it walks like one and quacks like one...).

I'm still awaiting final confirmation of this from our internal ABI group,
but I'm pretty confident that this will be our final position.

PS.  It looks like llvm and llvm++ are inconsistent on this one as well.


At least on x86_64 clang and clang++ consistently honored the zero width
bitfields during structure layout and ignored them during parameter passing
decisions (i.e. what the x86_64 psABI chose to clarify).


I was looking at aarch32 (arm).
Compiling

struct S { float a; int : 0; float b; };

struct S
foo (struct S x)
{
  x.b += 1.0f;
  return x;
}

with clang-10 I get

foo:
        .fnstart
        vmov.f32        s0, #1.000000e+00
        str     r1, [r0]
        vmov    s2, r2
        vadd.f32        s0, s2, s0
        vstr    s0, [r0, #4]
        bx      lr

while clang++-10 gives

_Z3foo1S:
        .fnstart
        vmov.f32        s2, #1.000000e+00
        vadd.f32        s1, s1, s2
        bx      lr

Both with the options
-S -O2 abi-bf.c -o - --target=arm-none-eabi -march=armv8-a -mfpu=neon 
-mfloat-abi=hard


So for C it has passed the object in r1/r2 and returned it in memory, 
while for C++ it has passed it as an HFA.


But for AArch64 it doesn't look right either:

clang-10

foo:                                    // @foo
// %bb.0:
        lsr     x8, x0, #32
        fmov    s0, #1.00000000
        fmov    s1, w8
        fadd    s0, s1, s0
        fmov    w8, s0
        bfi     x0, x8, #32, #32
        ret

clang++-10:


_Z3foo1S:                               // @_Z3foo1S
// %bb.0:
        fmov    s2, #1.00000000
        fadd    s1, s1, s2
        ret


I guess it would be nice to include the testcases we are talking about,
like { float x; int : 0; float y; } and { float x; int : 0; } and
{ int : 0; float x; } into compat.exp testsuite so that we see ABI
differences in compat testing.

Jakub



Yes, we might also add some specific Arm ABI tests as well.

R.


Re: Urgent GCC ABI backend maintainer ping re zero width bitfield passing (PR102024)

2022-03-25 Thread Richard Earnshaw via Gcc




On 22/03/2022 16:28, Richard Earnshaw via Gcc wrote:



On 21/03/2022 16:28, Jakub Jelinek via Gcc wrote:

Hi!

I'd like to ping port maintainers about
https://gcc.gnu.org/PR102024

As I wrote, the int : 0 bitfields are present early in the TYPE_FIELDS
during structure layout and intentionally affect the layout.
We had some code to remove those from TYPE_FIELDS chains in the C and C++
FEs, but for C that removal never worked correctly (never removed any)
and the non-working removal was eventually removed.  For C++ it
didn't initially work either, but for GCC 4.5 that was fixed in PR42217,
so on various backends where TYPE_FIELDS are analyzed for how to pass or
return certain aggregates starting with GCC 4.5 the C++ and C ABI 
diverged.

In August, I have removed that zero width bitfield removal from C++ FE
as the FE needs to take those bitfields into account later on as well.

The x86_64 backend was changed in r12-6418-g3159da6c46 to match recently
approved clarification of the x86-64 psABI and the zero width bitfields
are now ignored for both C and C++ (so an ABI change for C from 11.x and
earlier to 12.x and for C++ from GCC 4.4 and earlier to 4.5 and later)
with a -Wpsabi diagnostics about it.

The rs6000 backend was changed in r12-3843-g16e3d6b8b2 to never ignore
those bitfields (so no ABI change for C, for C++ ...-4.4 and 12+ are
ABI incompatible with 4.5 through 11.x; note, it affects I think just
ppc64le ABI, which didn't really exist before 4.8 I think) and 
diagnostics

has been added about the ABI change.

As I wrote in the PR, I believe most of the GCC backends are unaffected,
x86_64 and rs6000 are handled, riscv was changed already in GCC 10 to
ignore those bitfields and emit a -Wpsabi diagnostics.

I can see code-generation differences certainly on armv7hl and aarch64.
ia64, iq2000, mips, s390 and sparc are maybe affected, haven't checked.

Simple testcase could be e.g.:
struct S { float a; int : 0; float b; };

__attribute__((noipa)) struct S
foo (struct S x)
{
   return x;
}

void
bar (void)
{
   struct S s = { 0.0f, 0.0f };
   foo (s);
}
where one should look at the argument and return value passing
in GCC 11 C, GCC 11 C++, GCC trunk C, GCC trunk C++.

The FE now sets bits on the bitfields that make it possible to
differentiate between the different cases, so each port may decide to do
one of the 3 things:
1) keep ABI exactly compatible between GCC 11 and 12, which means
    C and C++ will continue to be incompatible
2) keep the G++ 4.5 through 11 ABI of ignoring zero width bitfields and
    change C ABI
3) keep the GCC < 11 C ABI of not ignoring zero width bitfields and
    change the C++ ABI (which means restoring ABI compatibility in
    this regard between G++ 4.4 and earlier with G++ 12 and later)
Furthermore, it would be very nice to emit -Wpsabi diagnostics for the
changed ABI unless 1) is decided.
One should take into account psABI as well as what other compilers do.

The current state of GCC trunk is that 3) is done except that x86_64
did 2) and riscv did 2 already for GCC 10 and all of x86_64, riscv and
rs6000 emit -Wpsabi diagnostics (though I think rs6000 doesn't guard
it with -Wpsabi).

I can help with the backend implementations if needed, but I can't
decide which possibility you want to choose for each backend.
It would be really nice to decide about this soon, because changing
the ABI in GCC 12 only to change it again in GCC 13 doesn't look much
desirable and even if 3) is the choice, it is really nice to have
some diagnostics about ABI changes.

Thanks

Jakub



Unless I've missed something subtle here, the layout of

   struct S { float a; int : 0; float b;};

is going to be identical to

   struct T { float a; float b;};

on pretty much every architecture I can think of, so this is purely 
about parameter passing rules for the former and whether the two cases 
above should behave the same way.


The AAPCS and AAPCS64 both contain the same statement as part of the 
definition of an HFA:


| The test for homogeneity is applied after data layout is
| completed and without regard to access control or other source
| language restrictions.

The access control and source language restrictions was intended to 
cover c++-style features such as public/private, so aren't really 
relevant to this discussion (though you might plausibly read 'source 
language restriction' to cover this).  However, the fact that the test 
is applied after layout has been done and because a zero-sized bit-field 
neither

- adds an accessible member
- changes the layout in any case I can think of that would potentially 
be an HFA.
my preliminary conclusion is that for Arm and AArch64 we still have a 
duck here (if it walks like one and quacks like one...).


I'm still awaiting final confirmation of this from our internal ABI 
group, but I'm pretty confident that this will be our final position.


Just to confirm that this is our f

Re: Urgent GCC ABI backend maintainer ping re zero width bitfield passing (PR102024)

2022-03-25 Thread Richard Earnshaw via Gcc




On 25/03/2022 14:47, Jakub Jelinek via Gcc wrote:

On Fri, Mar 25, 2022 at 02:26:56PM +0000, Richard Earnshaw wrote:

Just to confirm that this is our final position.  The 'int : 0' field should be
ignored for the purposes of determining the parameter passing as it has no
effect on the layout of the type.

We do not feel that an update to the AAPCS or AAPCS64 is needed as the
wording already covers this.


Ok.  So on the GCC side you need for both arm and aarch64 something similar
to the r12-6418-g3159da6c46568a7c change (of course on the ARM/AArch64 side
it will be in different spots etc.).
But generally, if you see during TYPE_FIELDS walk for argument/return value
passing decisions (both test whether something could be passed in registers
or say alignment decisions for those) rather than layout
   DECL_BIT_FIELD (field) && integer_zerop (DECL_SIZE (field))
ignore it - if DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD (field) then always,
otherwise arrange for 2 invocations in which one ignores them and one
doesn't and warns if the overall decisions change.

Jakub



Do we really need two passes?  Surely, if we find a zero-width bitfield 
in the type we just set a marker to note that it was found.  If, at the 
end of walking the type, it's still a candidate for passing in FP regs, 
then we've identified an ABI change because previously we would not have 
behaved this way.
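A rough sketch of the single-walk idea, using a toy field model rather than GCC's tree nodes. The `toy_field`, `walk_result` and `classify` names are hypothetical, and the real HFA rules (same base type, at most four members, nested aggregates) are deliberately left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a struct member; the real walk uses TYPE_FIELDS. */
struct toy_field {
    bool is_float;        /* member is a floating-point scalar */
    bool is_bit_field;    /* member is a bit-field */
    unsigned bit_width;   /* width in bits, meaningful for bit-fields only */
};

struct walk_result {
    bool fp_candidate;    /* still eligible for FP registers after the walk */
    bool abi_changed;     /* eligible only because a zero-width bit-field was ignored */
};

static struct walk_result classify(const struct toy_field *fields, size_t n)
{
    assert(fields != NULL || n == 0);

    bool saw_zero_width = false;
    bool all_float = true;
    size_t nfloat = 0;

    for (size_t i = 0; i < n; i++) {
        if (fields[i].is_bit_field && fields[i].bit_width == 0) {
            saw_zero_width = true;   /* set the marker, then ignore the field */
            continue;
        }
        if (fields[i].is_float)
            nfloat++;
        else
            all_float = false;
    }

    bool candidate = all_float && nfloat > 0;
    struct walk_result r = { candidate, candidate && saw_zero_width };
    return r;
}
```

If `abi_changed` comes back true, that is the point at which a warning about the changed parameter passing could be issued, without walking the type a second time.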


R.


Re: gnatlink vs. -mthumb -march=armv7-a+simd -mfloat-abi=hard

2022-04-28 Thread Richard Earnshaw via Gcc




On 28/04/2022 09:16, Sebastian Huber wrote:
/opt/rtems/7/lib/gcc/arm-rtems7/12.0.1/thumb/armv7-a+simd/hard/adainclude/s-secsta.ads:288:9: 
sorry, unimplemented: Thumb-1 'hard-float' VFP ABI


Does that source file somehow attempt to change the architecture on that 
line?  This looks like something equivalent to a pragma changing things.


R.


Re: .eh_frame augmentation character for MTE stack tagging

2022-06-06 Thread Richard Earnshaw via Gcc




On 04/06/2022 00:52, Florian Mayer via Gcc wrote:

Hey!

We are in the process of implementing MTE (Memory Tagging Extension)
stack tagging in LLVM. To support stack tagging in combination with
exceptions, we need to make sure that the unwinder will untag stack
frames, to avoid leaving behind stale tags. As such, we need some way
to communicate to the unwinder to do that.

In a discussion on llvm-dev [1], it was decided the best way to go
forward with this would be to add a new character ('G' for taG, as the
MTE instructions stg etc.) to the eh_frame augmentation string, and
then handle that in libunwind by clearing the tags of the respective
frame.

How does that sound? Would that be a good course of action for GCC as well?

Thanks,
Florian

[1]: https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html


Hi Florian,

This is something that needs to be specified in the ABI, not just agreed 
between a couple of compilers. So while the community input is helpful, 
it isn't enough.


The correct place to do this is in the ABI project here: 
https://github.com/ARM-software/abi-aa


R.


Re: Setting up editors for the GNU/GCC coding style?

2022-07-29 Thread Richard Earnshaw via Gcc




On 28/07/2022 22:43, Iannetta Paul wrote:

About configuring recent editors to follow the GNU coding style, I don't 
really know but it should always be possible to register a hook that 
will run `indent` when the file is saved.


I don't think that's a good idea.  It will result in quite a lot of 
minor changes that will just make diffs confused.  We do have a style, 
but not everything is exactly as GNU indent would lay it out (plus GNU 
indent says quite clearly that it can't handle C++).


R.


Re: Wanted: original ConceptGCC downloads / branch, concepts-lite branch

2022-08-17 Thread Richard Earnshaw via Gcc




On 17/08/2022 12:42, Aaron Gray via Gcc wrote:

Hi,

I am looking for the original ConceptGCC source code, the
https://www.generic-programming.org/software/ConceptGCC/download.html has
all broken links and the SVN is gone.

Is this available on GCC git or SVN ?

Also I am wondering if the original concepts-lite code is available too
anywhere please ?

Also any pointers to the documentation for the current implementation ?

Regards,

Aaron


Notwithstanding what others have already said, the various concepts 
branches are still in the git repository, but aren't in the standard 
pull set.  You can use git fetch to explicitly pull them:


d743a72b52bcfaa1effd7fabe542c05a30609614 refs/dead/heads/c++-concepts
780065c813a72664bd46a354a2d26087464c74fc refs/dead/heads/conceptgcc-branch
ce85971fd96e12d0d6675ecbc46c9a1884df766c refs/dead/heads/cxx0x-concepts-branch
14d4dad929a01ff7179350f0251af752c3125d74 refs/deleted/r131428/heads/cxx0x-concepts-branch


I haven't looked as to which is most likely to be relevant.

R.

PS, note that these branches may not appear in some mirrors if they only 
mirror the default refs/heads set.


Re: Forward GCC '-v' command-line option to binutils assembler, linker (was: [PING] nvptx: forward '-v' command-line option to assembler, linker)

2022-09-22 Thread Richard Earnshaw via Gcc




On 22/09/2022 12:32, Nick Clifton via Gcc wrote:

Hi Thomas,



+/* Linker supports '-v' option.  */
+#define LINK_SPEC "%{v}"


..., Tom rightfully asked:


[...] I wonder, normally we don't pass -v to ld, and need -Wl,-v for
that.


So, on my quest for making things uniform/simple, I now wonder: should we
also forward GCC '-v' to binutils linker, or is there a reason to not do
that?


Not really no.  Historically of course this has not been done, so changing
it now might surprise a few users.  But it should not be that big of an
issue.



So, any particular reason why we would do things differently for
nvptx?


Nope, none at all.

Harmonizing the effect of the -v option sounds like a good idea to me.

Cheers
   Nick



What's wrong with users passing -Wa,-v or -Wl,-v to pass the option 
through to the assembler and linker respectively?  The more flags like 
this we force pass to the additional tools the more likely we are to 
have problems when that tool is not from the GNU toolchain.


R.



GNU Tools Cauldron 2023

2023-06-05 Thread Richard Earnshaw via Gcc

We are pleased to invite you all to the next GNU Tools Cauldron,
taking place in Cambridge, UK, on September 22-24, 2023.

As for the previous instances, we have set up a wiki page for
details:

  https://gcc.gnu.org/wiki/cauldron2023


Like last year, we are having to charge for attendance.  We are still
working out what we will need to charge, but it will be no more than £250.

Attendance will remain free for community volunteers and others who do
not have a commercial backer and we will be providing a small number of
travel bursaries for students to attend.

For all details of how to register, and how to submit a proposal for a 
track session, please see the wiki page.


The Cauldron is organized by a group of volunteers. We are keen to add
some more people so others can stand down. If you'd like to be part of
that organizing committee, please email the same address.

This announcement is being sent to the main mailing list of the
following groups: GCC, GDB, binutils, CGEN, DejaGnu, newlib and glibc.

Please feel free to share with other groups as appropriate.

Richard (on behalf of the GNU Tools Cauldron organizing committee).


Re: GNU Tools Cauldron 2023

2023-07-25 Thread Richard Earnshaw via Gcc
It is now just under 2 months until the GNU Tools Cauldron. 
Registration is still open, but we would really appreciate it if you 
could register as soon as possible so that we have a clear idea of the 
numbers.


Richard.

On 05/06/2023 14:59, Richard Earnshaw wrote:

We are pleased to invite you all to the next GNU Tools Cauldron,
taking place in Cambridge, UK, on September 22-24, 2023.

As for the previous instances, we have set up a wiki page for
details:

https://gcc.gnu.org/wiki/cauldron2023 




Like last year, we are having to charge for attendance.  We are still
working out what we will need to charge, but it will be no more than £250.

Attendance will remain free for community volunteers and others who do
not have a commercial backer and we will be providing a small number of
travel bursaries for students to attend.

For all details of how to register, and how to submit a proposal for a
track session, please see the wiki page.

The Cauldron is organized by a group of volunteers. We are keen to add
some more people so others can stand down. If you'd like to be part of
that organizing committee, please email the same address.

This announcement is being sent to the main mailing list of the
following groups: GCC, GDB, binutils, CGEN, DejaGnu, newlib and glibc.

Please feel free to share with other groups as appropriate.

Richard (on behalf of the GNU Tools Cauldron organizing committee).


Re: gcc 13.2 is missing warnings?

2023-10-19 Thread Richard Earnshaw via Gcc




On 19/10/2023 12:39, Eric Sokolowsky via Gcc wrote:

I am using gcc 13.2 on Fedora 38. Consider the following program.

#include <stdio.h>
int main(int argc, char **argv)
{
    printf("Enter a number: ");
    int num = 0;
    scanf("%d", &num);

    switch (num)
    {
    case 1:
        int a = num + 3;
        printf("The new number is %d.\n", a);
        break;
    case 2:
        int b = num - 4;
        printf("The new number is %d.\n", b);
        break;
    default:
        int c = num * 3;
        printf("The new number is %d.\n", c);
        break;
    }
}

I would expect that gcc would complain about the declaration of
variables (a, b, and c) within the case statements. When I run "gcc
-Wall t.c" I get no warnings. When I run "g++ -Wall t.c" I get
warnings and errors as expected. I do get warnings when using MinGW on
Windows (gcc version 6.3 specifically). Did something change in 13.2?

Eric


The analysis needed to generate useful warnings is often not run unless 
the optimizers are enabled.  Try adding -O, or even higher.  -O0 is 
generally only recommended for syntax checking.
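Separately from the question of which warnings are emitted, one way to write the switch from the program above so that it is accepted as both C and C++ is to give each case body its own block, so that no declaration directly follows a label or is jumped over. A minimal sketch, with the arithmetic moved into a hypothetical helper `adjust` so it can be checked without reading input:

```c
#include <assert.h>

/* Hypothetical helper: same arithmetic as the three cases in the program above. */
static int adjust(int num)
{
    switch (num) {
    case 1: {
        int a = num + 3;   /* 'a' is scoped to this block only */
        return a;
    }
    case 2: {
        int b = num - 4;
        return b;
    }
    default: {
        int c = num * 3;
        return c;
    }
    }
}

/* Usage: a few expected results. */
static void check_adjust(void)
{
    assert(adjust(1) == 4);
    assert(adjust(2) == -2);
    assert(adjust(5) == 15);
}
```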


R.


Re: Discussion about arm testcase failures seen with patch for PR111673

2023-11-24 Thread Richard Earnshaw via Gcc




On 24/11/2023 08:09, Surya Kumari Jangala via Gcc wrote:

Hi Richard,
Ping. Please let me know if the test failure that I mentioned in the mail below 
can be handled by changing the expected generated code. I am not conversant 
with arm, and hence would appreciate your help.

Regards,
Surya

On 03/11/23 4:58 pm, Surya Kumari Jangala wrote:

Hi Richard,
I had submitted a patch for review 
(https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631849.html)
regarding scaling save/restore costs of callee save registers with block
frequency in the IRA pass (PR111673).

This patch has been approved by VMakarov
(https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632089.html).

With this patch, we are seeing performance improvements with spec on x86
(exchange: 5%, xalancbmk: 2.5%) and on Power (perlbench: 5.57%).

I received a mail from Linaro about some failures seen in the CI pipeline with
this patch. I have analyzed the failures and I wish to discuss the analysis 
with you.

One failure reported by the Linaro CI is:

FAIL: gcc.target/arm/pr111235.c scan-assembler-times ldrexd\tr[0-9]+, r[0-9]+, 
\\[r[0-9]+\\] 2

The diff in the assembly between trunk and patch is:

93c93
<   push    {r4, r5}
---
>   push    {fp}
95c95
<   ldrexd  r4, r5, [r0]
---
>   ldrexd  fp, ip, [r0]
99c99
<   pop     {r4, r5}
---
>   ldr     fp, [sp], #4



The test fails with patch because the ldrexd insn uses fp & ip registers instead
of r[0-9]+

But the code produced by patch is better because it is pushing and restoring
only one register (fp) instead of two registers (r4, r5). Hence, this test can be
modified to allow it to pass on arm. Please let me know what you think.

If you need more information, please let me know. I will be sending separate
mails for the other test failures.



Thanks for looking at this.


The key part of this test is that the compiler generates LDREXD.  The 
registers used for that are pretty much irrelevant as we don't match 
them to any other operations within the test.  So I'd recommend just 
testing for the mnemonic and not for any of the operands (ie just match 
"ldrexd\t").


R.
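To illustrate the difference between matching the operands and matching only the mnemonic, here is a small sketch using POSIX extended regular expressions. This is an assumption made for illustration only: the testsuite itself evaluates its patterns with Tcl regexps, and `line_matches`, `strict_pat` and `loose_pat` are hypothetical names.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the POSIX extended regex 'pattern' matches anywhere in 'line'. */
static bool line_matches(const char *pattern, const char *line)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool ok = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}

/* Pattern that also constrains the operands, roughly as in the failing directive. */
static const char strict_pat[] = "ldrexd\tr[0-9]+, r[0-9]+, \\[r[0-9]+]";
/* Mnemonic-only pattern. */
static const char loose_pat[] = "ldrexd\t";

/* Usage: the strict pattern rejects the fp/ip form, the loose one accepts both. */
static void check_patterns(void)
{
    assert(line_matches(strict_pat, "\tldrexd\tr4, r5, [r0]"));
    assert(!line_matches(strict_pat, "\tldrexd\tfp, ip, [r0]"));
    assert(line_matches(loose_pat, "\tldrexd\tfp, ip, [r0]"));
}
```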


Regards,
Surya





Re: Discussion about arm/aarch64 testcase failures seen with patch for PR111673

2023-11-28 Thread Richard Earnshaw via Gcc




On 28/11/2023 12:52, Surya Kumari Jangala wrote:

Hi Richard,
Thanks a lot for your response!

Another failure reported by the Linaro CI is as follows :
(Note: I am planning to send a separate mail for each failure, as this will make
the discussion easy to track)

FAIL: gcc.target/aarch64/sve/acle/general/cpy_1.c -march=armv8.2-a+sve 
-moverride=tune=none  check-function-bodies dup_x0_m

Expected code:

   ...
   add (x[0-9]+), x0, #?1
   mov (p[0-7])\.b, p15\.b
   mov z0\.d, \2/m, \1
   ...
   ret


Code obtained w/o patch:
 addvl   sp, sp, #-1
 str p15, [sp]
 add x0, x0, 1
 mov p3.b, p15.b
 mov z0.d, p3/m, x0
 ldr p15, [sp]
 addvl   sp, sp, #1
 ret

Code obtained w/ patch:
 addvl   sp, sp, #-1
 str p15, [sp]
 mov p3.b, p15.b
 add x0, x0, 1
 mov z0.d, p3/m, x0
 ldr p15, [sp]
 addvl   sp, sp, #1
 ret

As we can see, with the patch, the following two instructions are interchanged:
 add x0, x0, 1
 mov p3.b, p15.b


Indeed, both look acceptable results to me, especially given that we 
don't schedule results at -O1.


There are two ways of fixing this:
1) Simply swap the order to what the compiler currently generates (which 
is a little fragile, since it might flip back someday).

2) Write the test as


** (
**   add (x[0-9]+), x0, #?1
**   mov (p[0-7])\.b, p15\.b
**   mov z0\.d, \2/m, \1
** |
**   mov (p[0-7])\.b, p15\.b
**   add (x[0-9]+), x0, #?1
**   mov z0\.d, \1/m, \2
** )

Note, we need to swap the match names in the third insn to account for 
the different order of the earlier instructions.


Neither is ideal, but the second is perhaps a little more bomb proof.

I don't really have a strong feeling either way, but perhaps the second 
is slightly preferable.


Richard S: thoughts?

R.


I believe that this is fine and the test can be modified to allow it to pass on
aarch64. Please let me know what you think.

Regards,
Surya


On 24/11/23 4:18 pm, Richard Earnshaw wrote:



On 24/11/2023 08:09, Surya Kumari Jangala via Gcc wrote:

Hi Richard,
Ping. Please let me know if the test failure that I mentioned in the mail below 
can be handled by changing the expected generated code. I am not conversant 
with arm, and hence would appreciate your help.

Regards,
Surya

On 03/11/23 4:58 pm, Surya Kumari Jangala wrote:

Hi Richard,
I had submitted a patch for review 
(https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631849.html)
regarding scaling save/restore costs of callee save registers with block
frequency in the IRA pass (PR111673).

This patch has been approved by VMakarov
(https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632089.html).

With this patch, we are seeing performance improvements with spec on x86
(exchange: 5%, xalancbmk: 2.5%) and on Power (perlbench: 5.57%).

I received a mail from Linaro about some failures seen in the CI pipeline with
this patch. I have analyzed the failures and I wish to discuss the analysis 
with you.

One failure reported by the Linaro CI is:

FAIL: gcc.target/arm/pr111235.c scan-assembler-times ldrexd\tr[0-9]+, r[0-9]+, 
\\[r[0-9]+\\] 2

The diff in the assembly between trunk and patch is:

93c93
<   push    {r4, r5}
---
>   push    {fp}
95c95
<   ldrexd  r4, r5, [r0]
---
>   ldrexd  fp, ip, [r0]
99c99
<   pop     {r4, r5}
---
>   ldr     fp, [sp], #4



The test fails with patch because the ldrexd insn uses fp & ip registers instead
of r[0-9]+

But the code produced by patch is better because it is pushing and restoring
only one register (fp) instead of two registers (r4, r5). Hence, this test can be
modified to allow it to pass on arm. Please let me know what you think.

If you need more information, please let me know. I will be sending separate
mails for the other test failures.



Thanks for looking at this.


The key part of this test is that the compiler generates LDREXD.  The registers used for 
that are pretty much irrelevant as we don't match them to any other operations within the 
test.  So I'd recommend just testing for the mnemonic and not for any of the operands (ie 
just match "ldrexd\t").

R.


Regards,
Surya





Re: Help needed with maintainer-mode

2024-03-04 Thread Richard Earnshaw via Gcc



On 04/03/2024 15:36, Richard Earnshaw (lists) wrote:
> On 04/03/2024 14:46, Christophe Lyon via Gcc wrote:
>> On Mon, 4 Mar 2024 at 12:25, Jonathan Wakely  wrote:
>>>
>>> On Mon, 4 Mar 2024 at 10:44, Christophe Lyon via Gcc  
>>> wrote:
>>>>
>>>> Hi!
>>>>
>>>> On Mon, 4 Mar 2024 at 10:36, Thomas Schwinge
>>>> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> On 2024-03-04T00:30:05+, Sam James wrote:
>>>>>> Mark Wielaard writes:
>>>>>>> On Fri, Mar 01, 2024 at 05:32:15PM +0100, Christophe Lyon wrote:
>>>>>>>> [...], I read
>>>>>>>> https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration
>>>>>>>> which basically says "run autoreconf in every dir where there is a
>>>>>>>> configure script"
>>>>>>>> but this is not exactly what autoregen.py is doing. IIRC it is based
>>>>>>>> on a script from Martin Liska, do you know/remember why it follows a
>>>>>>>> different process?
>>>>>>>
>>>>>>> CCing Sam and Arsen who helped refine the autoregen.py script, who
>>>>>>> might remember more details. We wanted a script that worked for both
>>>>>>> gcc and binutils-gdb. And as far as I know autoreconf simply didn't
>>>>>>> work in all directories. We also needed to skip some directories that
>>>>>>> did contain a configure script, but that were imported (gotools,
>>>>>>> readline, minizip).
>>>>>>
>>>>>> What we really need to do is, for a start, land tschwinge/aoliva's
>>>>>> patches [0]
>>>>>> for AC_CONFIG_SUBDIRS.
>>>>>
>>>>> Let me allocate some time this week to get that one completed.
>>>>>
>>>>>> Right now, the current situation is janky and it's nowhere near
>>>>>> idiomatic autotools usage. It is not a comfortable experience
>>>>>> interacting with it even as someone who is familiar and happy with using
>>>>>> autotools otherwise.
>>>>>>
>>>>>> I didn't yet play with maintainer-mode myself but I also didn't see much
>>>>>> point given I knew of more fundamental problems like this.
>>>>>>
>>>>>> [0]
>>>>>> https://inbox.sourceware.org/gcc-patches/oril72c4yh@lxoliva.fsfla.org/
>>>>>
>>>>
>>>> Thanks for the background. I didn't follow that discussion at that time :-)
>>>>
>>>> So... I was confused because I noticed many warnings when doing a simple
>>>> find . -name configure |while read f; do echo $f;d=$(dirname $f) &&
>>>> autoreconf -f $d && echo $d; done
>>>> as suggested by https://gcc.gnu.org/wiki/Regenerating_GCC_Configuration
>>>>
>>>> Then I tried with autoregen.py, and saw the same and now just
>>>> checked Sourceware's bot logs and saw the same numerous warnings at
>>>> least in GCC (didn't check binutils yet). Looks like this is
>>>> "expected"
>>>>
>>>> I started looking at auto-regenerating these files in our CI a couple
>>>> of weeks ago, after we received several "complaints" from contributors
>>>> saying that our precommit CI was useless / bothering since it didn't
>>>> regenerate files, leading to false alarms.
>>>> But now I'm wondering how such contributors regenerate the files
>>>> impacted by their patches before committing, they probably just
>>>> regenerate things in their subdir of interest, not noticing the whole
>>>> picture :-(
>>>>
>>>> As a first step, we can probably use autoregen.py too, and declare
>>>> maintainer-mode broken. However, I do notice that besides the rules
>>>> about regenerating configure/Makefile.in/..., maintainer-mode is also
>>>> used to update some files.
>>>> In gcc:
>>>> fixincludes: fixincl.x
>>>> libffi: doc/version.texi
>>>> libgfortran: some stuff :-)
>>>> libiberty: functions.texi
>>>
>>> My recently proposed patch adds the first of those to gcc_update, the
>>> other should be done too.
>>> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647027.html 
>>> 
>>>
>> 
>> This script touches files such that they appear more recent than their
>> dependencies,
>> so IIUC even if one uses --enable-maintainer-mode, it will have no effect.
>> For auto* files, this is "fine" as we can run autoreconf or
>> autoregen.py before starting configure+build, but what about other
>> files?
>> For instance, if we have to test a patch which implies changes to
>> fixincludes/fixincl.x, how should we proceed?
>> 1- git checkout (with possibly "wrong" timestamps)
>> 2- apply patch-to-test
>> 3- contrib/gcc_update -t
>> 4- configure --enable-maintainer-mode
> 
> If you ran
> 
> git reset --hard master // restore state to 'master'
> contrib/gcc_update // pull latest code
> 
> then anything coming from upstream will be touched automatically.  You really 
> don't want to re-touch the files after patching unless you're sure they've 
> all been patched correctly, it will break if there's anything regenerated 
> that's missing.
> 
> R.

Alternatively, if you did 

git reset --hard master

Re: Help needed with maintainer-mode

2024-03-05 Thread Richard Earnshaw via Gcc



On 05/03/2024 14:26, Richard Earnshaw (lists) wrote:
> On 04/03/2024 20:04, Jonathan Wakely wrote:
>> On Mon, 4 Mar 2024 at 19:27, Vladimir Mezentsev
>>  wrote:
>>>
>>>
>>>
>>> On 3/4/24 09:38, Richard Earnshaw (lists) wrote:
>>>> Tools like git (and svn before it) don't try to maintain time-stamps on 
>>>> patches, the tool just modifies the file and the timestamp comes from the 
>>>> time of the modification.  That's fine if there is nothing regenerated 
>>>> within the repository (it's pure original source), but will cause problems 
>>>> if there are generated files as their time stamps aren't necessarily 
>>>> correct.  `gcc_update --touch` addresses that by ensuring all the 
>>>> generated files are retouched when needed.
>>>
>>> Why do we save generated files in the source tree?
>>> What will be the problem if we remove Makefile.in and configure from
>>> source tree and will run `autoreconf -i -f` before building ?
>> 
>> Having the exact correct versions of autoconf and automake increases
>> the barrier for new contributors to start work. And to regenerate
>> everything, they also need autogen, mkinfo, etc.
> 
> It's worse than that.  They might need multiple versions of those tools 
> because different subtrees are built with different, subtly incompatible, 
> versions of those tools.
> 
> R.
> 

And I've just remembered another reason as well, which is that some people want 
to store their sources in a read-only environment; having the tools write to 
the source area during a build can cause problems (eg if building multiple 
configurations of the compiler in parallel).

R.