[PATCH (pushed)] analyzer: fix Clang warnings

2022-11-23 Thread Martin Liška
Fixes the following warnings:
gcc/analyzer/varargs.cc:655:8: warning: 'matches_call_types_p' overrides a 
member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/analyzer/varargs.cc:707:50: warning: unused parameter 'cd' 
[-Wunused-parameter]
gcc/analyzer/varargs.cc:707:8: warning: 'matches_call_types_p' overrides a 
member function but is not marked 'override' [-Winconsistent-missing-override]
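
For readers unfamiliar with these two diagnostics, the pattern being fixed can be sketched in isolation. The names `known_function_sketch`, `kf_example` and the empty `call_details` struct below are hypothetical stand-ins, not the analyzer's real classes; the sketch only shows that marking the overrider `final override` and leaving the unused parameter unnamed addresses both warnings.

```cpp
#include <cassert>

// Hypothetical stand-in for the analyzer's call_details class.
struct call_details {};

// Hypothetical base class with a virtual hook, loosely modelled on
// known_function.
class known_function_sketch
{
public:
  virtual ~known_function_sketch () = default;
  virtual bool matches_call_types_p (const call_details &cd) const = 0;
};

class kf_example : public known_function_sketch
{
public:
  // Unnamed parameter: nothing for -Wunused-parameter to flag.
  // 'final override': what -Winconsistent-missing-override asks for.
  bool matches_call_types_p (const call_details &) const final override
  {
    return true;
  }
};

// Call the hook through the base class to show the override is used.
inline bool example_matches ()
{
  kf_example kf;
  const known_function_sketch &base = kf;
  return base.matches_call_types_p (call_details ());
}
```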

gcc/analyzer/ChangeLog:

* varargs.cc: Fix Clang warnings.
---
 gcc/analyzer/varargs.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/analyzer/varargs.cc b/gcc/analyzer/varargs.cc
index 1da5a46f677..daa937d9c65 100644
--- a/gcc/analyzer/varargs.cc
+++ b/gcc/analyzer/varargs.cc
@@ -652,7 +652,7 @@ make_va_list_state_machine (logger *logger)
 class kf_va_start : public known_function
 {
 public:
-  bool matches_call_types_p (const call_details &) const
+  bool matches_call_types_p (const call_details &) const final override
   {
 return true;
   }
@@ -704,7 +704,7 @@ kf_va_start::impl_call_pre (const call_details &cd) const
 class kf_va_copy : public known_function
 {
 public:
-  bool matches_call_types_p (const call_details &cd) const
+  bool matches_call_types_p (const call_details &) const final override
   {
 return true;
   }
-- 
2.38.1



Re: [PATCH V2] Use subscalar mode to move struct block for parameter

2022-11-23 Thread Richard Biener via Gcc-patches
On Wed, 23 Nov 2022, Jiufu Guo wrote:

> Hi Jeff,
> 
> Thanks a lot for your comments!

Sorry for the late response ...

> Jeff Law  writes:
> 
> > On 11/20/22 20:07, Jiufu Guo wrote:
> >> Jiufu Guo  writes:
> >>
> >>> Hi,
> >>>
> >>> As mentioned in the previous version patch:
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html
> >>> The suboptimal code is generated for "assigning from parameter" or
> >>> "assigning to return value".
> >>> This patch enhances the assignment from parameters like the below
> >>> cases:
> >>> case1.c
> >>> typedef struct SA {double a[3];long l; } A;
> >>> A ret_arg (A a) {return a;}
> >>> void st_arg (A a, A *p) {*p = a;}
> >>>
> >>> case2.c
> >>> typedef struct SA {double a[3];} A;
> >>> A ret_arg (A a) {return a;}
> >>> void st_arg (A a, A *p) {*p = a;}
> >>>
> >>> For this patch, bootstrap and regtest pass on ppc64{,le}
> >>> and x86_64.
> >>> * Besides asking for help reviewing this patch, I would like to
> >>> consult comments about enhancing for "assigning to returns".
> >> I updated the patch to fix the issue for returns.  This patch
> >> adds a flag DECL_USEDBY_RETURN_P to indicate if a var is used
> >> by a return stmt.  This patch fixes the issue in the expand pass only,
> >> so we will try to update the patch to avoid this flag.
> >>
> >> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> >> index dd29c03..09b8ec64cea 100644
> >> --- a/gcc/cfgexpand.cc
> >> +++ b/gcc/cfgexpand.cc
> >> @@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
> >>   frame_phase = off ? align - off : 0;
> >> }
> >>   +  /* Collect VARs on returns.  */
> >> +  if (DECL_RESULT (current_function_decl))
> >> +{
> >> +  edge_iterator ei;
> >> +  edge e;
> >> +  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
> >> +  if (greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)))
> >> +{
> >> +  tree val = gimple_return_retval (ret);
> >> +  if (val && VAR_P (val))
> >> +DECL_USEDBY_RETURN_P (val) = 1;

You probably want to check && auto_var_in_fn_p (val, ...) since val
might be global?

> >> +}
> >> +}
> >> +
> >> /* Set TREE_USED on all variables in the local_decls.  */
> >> FOR_EACH_LOCAL_DECL (cfun, i, var)
> >>   TREE_USED (var) = 1;
> >> diff --git a/gcc/expr.cc b/gcc/expr.cc
> >> index d9407432ea5..20973649963 100644
> >> --- a/gcc/expr.cc
> >> +++ b/gcc/expr.cc
> >> @@ -6045,6 +6045,52 @@ expand_assignment (tree to, tree from, bool 
> >> nontemporal)
> >> return;
> >>   }

I'm missing an explanatory comment here saying that the following is a
heuristic, and giving its reasoning.

> >>   +  if ((TREE_CODE (from) == PARM_DECL && DECL_INCOMING_RTL (from)
> >> +   && TYPE_MODE (TREE_TYPE (from)) == BLKmode

Why check TYPE_MODE here?  Do you want AGGREGATE_TYPE_P on the type
instead?

> >> +   && (GET_CODE (DECL_INCOMING_RTL (from)) == PARALLEL
> >> + || REG_P (DECL_INCOMING_RTL (from))))
> >> +  || (VAR_P (to) && DECL_USEDBY_RETURN_P (to)
> >> +&& TYPE_MODE (TREE_TYPE (to)) == BLKmode

Likewise.

> >> +&& GET_CODE (DECL_RTL (DECL_RESULT (current_function_decl)))
> >> + == PARALLEL))

Not REG_P here?

> >> +{
> >> +  push_temp_slots ();
> >> +  rtx par_ret;
> >> +  machine_mode mode;
> >> +  par_ret = TREE_CODE (from) == PARM_DECL
> >> +? DECL_INCOMING_RTL (from)
> >> +: DECL_RTL (DECL_RESULT (current_function_decl));
> >> +  mode = GET_CODE (par_ret) == PARALLEL
> >> + ? GET_MODE (XEXP (XVECEXP (par_ret, 0, 0), 0))
> >> + : word_mode;
> >> +  int mode_size = GET_MODE_SIZE (mode).to_constant ();
> >> +  int size = INTVAL (expr_size (from));
> >> +
> >> +  /* Whether and how the parameter uses a submode depends on the size
> >> +   and position of the parameter.  Use a heuristic number here.  */
> >> +  int hurstc_num = 8;
> >
> > Where did this come from and what does it mean?
> Sorry for not making this clear.  We know that an aggregate argument may be
> passed in registers partially or totally, as in assign_parm_adjust_entry_rtl.
> For example, if a parameter has 12 words and the target/ABI only
> allows 8 GPRs for arguments, then the parameter could use at most 8
> registers, with the remaining part on the stack.
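
The register/stack split in that 12-word example can be worked through numerically. This is only an arithmetic sketch: `split_aggregate` and `arg_split` are hypothetical names, and the sketch ignores ABI details such as alignment or registers already consumed by earlier arguments.

```cpp
#include <algorithm>
#include <cassert>

// Result of splitting an aggregate argument between registers and stack.
struct arg_split
{
  int words_in_regs;
  int words_on_stack;
};

// Hypothetical helper: an argument of ARG_WORDS words takes as many of the
// FREE_GPRS argument registers as it can; the remainder goes on the stack.
inline arg_split split_aggregate (int arg_words, int free_gprs)
{
  int in_regs = std::min (arg_words, free_gprs);
  return {in_regs, arg_words - in_regs};
}
```

With 12 words and 8 argument GPRs this gives 8 words in registers and 4 on the stack, which is why the size and position of the parameter matter for any submode heuristic.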

I also wonder about the exact semantics of the parallels we get here.

+  int size = INTVAL (expr_size (from));

especially when you use something as simple as this.  Shouldn't you instead look
at to_rtx since that's already expanded?  For returns that should
be the desired layout to match 'from' to, no?  Maybe it's better
to not try sharing the code for both incoming and return copies
for clarity?

Also, what happens if there's a copy from a PARM_DECL to a
DECL_USEDBY_RETURN_P decl?  Which heuristic takes precedence?

I think that at least the place of the copy improvement and the
way you compute DECL_USEDBY_RETURN_P is reasonable.

Thanks,
Richard.

> >
> >
> > Note that BLKmode subword values passed in 

Re: [PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-11-23 Thread HAO CHEN GUI via Gcc-patches
Hi Richard,


On 2022/11/24 4:06, Richard Biener wrote:
> Wouldn't we usually either add an optab or try to recog a canonical
> RTL form instead of adding a new target hook for things like this?

Thanks so much for your comments. Please let me make it clear.

Do you mean we should create an optab for "setb" pattern (the nested
if-then-else insn) and detect candidate insns in ifcvt pass? Then
generate the insn with the new optab?

My concern is that some candidate insns are target specific. For
example, different modes cause additional zero_extend or subreg insns
to be generated on different targets. So I put the detection process
into a target hook.

Looking forward to your advice.

Thanks again
Gui Haochen


Re: [PATCH] AArch64: Add fma_reassoc_width [PR107413]

2022-11-23 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra  writes:
> Hi Richard,
>
>>> A smart reassociation pass could form more FMAs while also increasing
>>> parallelism, but the way it currently works always results in fewer FMAs.
>>
>> Yeah, as Richard said, that seems the right long-term fix.
>> It would also avoid the hack of treating PLUS_EXPR as a signal
>> of an FMA, which has the drawback of assuming (for 2-FMA cores)
>> that plain addition never benefits from reassociation in its own right.
>
> True but it's hard to separate them. You will have a mix of FADD and FMAs
> to reassociate (since FMA still counts as an add), and the ratio between
> them as well as the number of operations may affect the best reassociation
> width.
>
>> Still, I guess the hackiness is pre-existing and the patch is removing
>> the hackiness for some cores, so from that point of view it's a strict
>> improvement over the status quo.  And it's too late in the GCC 13
>> cycle to do FMA reassociation properly.  So I'm OK with the patch
>> in principle, but could you post an update with more commentary?
>
> Sure, here is an update with a longer comment in aarch64_reassociation_width:
>
>
> Add a reassociation width for FMAs in per-CPU tuning structures. Keep the
> existing setting for cores with 2 FMA pipes, and use 4 for cores with 4
> FMA pipes.  This improves SPECFP2017 on Neoverse V1 by ~1.5%.
>
> Passes regress/bootstrap, OK for commit?
>
> gcc/ChangeLog:
> PR 107413
> * config/aarch64/aarch64.cc (struct tune_params): Add
> fma_reassoc_width to all CPU tuning structures.
> (aarch64_reassociation_width): Use fma_reassoc_width.
> * config/aarch64/aarch64-protos.h (struct tune_params): Add
> fma_reassoc_width.

OK, thanks.

Richard
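
As a rough illustration of what a separate FMA width means, the following sketch models a reassociation-width query over a tuning structure. It is a simplified assumption based on this thread (in particular the remark about treating PLUS_EXPR as a signal of an FMA), not the committed aarch64 code: `op_kind`, `mode_kind`, `tune_params_sketch` and `reassociation_width` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for GCC's tree codes and machine modes.
enum class op_kind { plus, mult, other };
enum class mode_kind { integer, scalar_float, vector };

// Only the four width fields discussed in the thread.
struct tune_params_sketch
{
  int int_reassoc_width;
  int fp_reassoc_width;
  int fma_reassoc_width;
  int vec_reassoc_width;
};

// Pick a reassociation width.  A floating-point addition is assumed to be a
// potential FMA, so it uses the FMA-specific width.
inline int reassociation_width (op_kind op, mode_kind mode,
                                const tune_params_sketch &tune)
{
  switch (mode)
    {
    case mode_kind::vector:
      return tune.vec_reassoc_width;
    case mode_kind::integer:
      return tune.int_reassoc_width;
    case mode_kind::scalar_float:
      return op == op_kind::plus ? tune.fma_reassoc_width
                                 : tune.fp_reassoc_width;
    }
  return 1;
}
```

Under this model, a core tuned with an FMA width of 1 never splits a chain of floating-point additions, so the chain stays available for FMA formation, while a core with four FMA pipes can use a width of 4.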

> ---
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 
> 238820581c5ee7617f8eed1df2cf5418b1127e19..4be93c93c26e091f878bc8e4cf06e90888405fb2
>  100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -540,6 +540,7 @@ struct tune_params
>const char *loop_align;
>int int_reassoc_width;
>int fp_reassoc_width;
> +  int fma_reassoc_width;
>int vec_reassoc_width;
>int min_div_recip_mul_sf;
>int min_div_recip_mul_df;
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> c91df6f5006c257690aafb75398933d628a970e1..15d478c77ceb2d6c52a70b6ffd8fdadcfa8deba0
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -1346,6 +1346,7 @@ static const struct tune_params generic_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1382,6 +1383,7 @@ static const struct tune_params cortexa35_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1415,6 +1417,7 @@ static const struct tune_params cortexa53_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1448,6 +1451,7 @@ static const struct tune_params cortexa57_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1481,6 +1485,7 @@ static const struct tune_params cortexa72_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1514,6 +1519,7 @@ static const struct tune_params cortexa73_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1548,6 +1554,7 @@ static const struct tune_params exynosm1_tunings =
>"4", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> +  1,   /* fma_reassoc_width.  */
>1,   /* vec_reassoc_width.  */
>2,   /* min_div_recip_mul_sf.  */
>2,   /* min_div_recip_mul_df.  */
> @@ -1580,6 +1587,7 @@ static const struct tune_params thunderxt88_tunings =
>"8", /* loop_align.  */
>2,   /* int_reassoc_width.  */
>4,   /* fp_reassoc_width.  */
> 

Re: [PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-11-23 Thread Richard Biener via Gcc-patches
On Tue, 22 Nov 2022, Kees Cook wrote:

> On Tue, Nov 22, 2022 at 03:02:04PM +, Qing Zhao wrote:
> > 
> > 
> > > On Nov 22, 2022, at 9:10 AM, Qing Zhao via Gcc-patches 
> > >  wrote:
> > > 
> > > 
> > > 
> > >> On Nov 22, 2022, at 3:16 AM, Richard Biener  wrote:
> > >> 
> > >> On Mon, 21 Nov 2022, Qing Zhao wrote:
> > >> 
> > >>> 
> > >>> 
> >  On Nov 18, 2022, at 11:31 AM, Kees Cook  wrote:
> >  
> >  On Fri, Nov 18, 2022 at 03:19:07PM +, Qing Zhao wrote:
> > > Hi, Richard,
> > > 
> > > Honestly, it's very hard for me to decide what's the best way to
> > > handle the interaction
> > > between -fstrict-flex-arrays=M and -Warray-bounds=N.
> > >
> > > Ideally, -fstrict-flex-arrays=M should completely control the
> > > behavior of -Warray-bounds.
> > > If possible, I prefer this solution.
> > >
> > > However, -Warray-bounds is included in -Wall, and has been used
> > > extensively for a long time.
> > > It's not safe to change its default behavior.
> >  
> >  I prefer that -fstrict-flex-arrays controls -Warray-bounds. That
> >  it is in -Wall is _good_ for this reason. :) No one is going to add
> >  -fstrict-flex-arrays (at any level) without understanding what it does
> >  and wanting those effects on -Warray-bounds.
> > >>> 
> > >>> 
> > >>> The major difficulties in letting -fstrict-flex-arrays control
> > >>> -Warray-bounds were discussed in the following thread:
> > >>> 
> > >>> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604133.html
> > >>> 
> > >>> Please take a look at the discussion and let me know your opinion.
> > >> 
> > >> My opinion is now, after re-considering and with seeing your new 
> > >> patch, that -Warray-bounds=2 should be changed to only add
> > >> "the intermediate results of pointer arithmetic that may yield out of 
> > >> bounds values" and that what it considers a flex array should now
> > >> be controlled by -fstrict-flex-arrays only.
> > >> 
> > >> That is, I think, the only thing that's not confusing to users even
> > >> if that implies a change from previous behavior that we should
> > >> document by clarifying the -Warray-bounds documentation as well as
> > >> by adding an entry to the Caveats section of gcc-13/changes.html
> > >> 
> > >> That also means that =2 will get _fewer_ warnings with GCC 13 when
> > >> the user doesn't use -fstrict-flex-arrays as well.
> > > 
> > > Okay.  So, this is for -Warray-bounds=2.
> > > 
> > > For -Warray-bounds=1 -fstrict-flex-arrays=N, if N > 1, should
> > > -fstrict-flex-arrays=N control -Warray-bounds=1?
> > 
> > More thinking on this. (I may have misunderstood a little in the previous
> > email.)
> > 
> > If I understand correctly now, what you proposed was:
> > 
> > 1. The level of -Warray-bounds will NOT control how a trailing array is 
> > considered as a flex array member anymore. 
> > 2. Only the level of -fstrict-flex-arrays will control this;
> > 3. Keep the current default  behavior of -Warray-bounds on treating 
> > trailing arrays as flex array member (treating all [0],[1], and [] as 
> > flexible array members). 
> > 4. Update the documentation for -Warray-bounds to clarify this change,
> > and also add an entry to the Caveats section about this change to
> > -Warray-bounds.
> > 
> > If the above is correct, Yes, I like this change. Both the user interface 
> > and the internal implementation will be simplified and cleaner. 
> > 
> > Let me know if you see any issue with my above understanding.
> > 
> > Thanks a lot.
> 
> FWIW, this matches what I think makes the most sense too.

Yes, I think that makes the most sense.  As said, for -Warray-bounds=2 this
will change behavior, but since that's not the default that should be
fine if documented.

Thanks,
Richard.


[committed] analyzer: fix nondeterminism in logs

2022-11-23 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4275-ge0f18b87bfaf0b.

gcc/analyzer/ChangeLog:
* checker-path.cc (checker_path::inject_any_inlined_call_events):
Don't dump the address of the block when -fdump-noaddr.
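
The idea behind the fix can be shown with a small stand-alone sketch: pointer values differ from run to run, so a log line that embeds one is not reproducible unless the address is made optional. `format_block_entry` and its `dump_noaddr` parameter are hypothetical names standing in for the logger call and the -fdump-noaddr flag, and plain quotes approximate the quoting of the real format.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build one log entry for a block.  When DUMP_NOADDR is true the pointer is
// omitted, so the text is identical across runs.
inline std::string format_block_entry (const char *name, const void *addr,
                                       bool dump_noaddr)
{
  std::string line = "  '";
  line += name;
  line += "'";
  if (!dump_noaddr)
    {
      char buf[32];
      std::snprintf (buf, sizeof buf, " (%p)", const_cast<void *> (addr));
      line += buf;
    }
  return line;
}
```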

Signed-off-by: David Malcolm 
---
 gcc/analyzer/checker-path.cc | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/analyzer/checker-path.cc b/gcc/analyzer/checker-path.cc
index cbe24a2058a..221042ee010 100644
--- a/gcc/analyzer/checker-path.cc
+++ b/gcc/analyzer/checker-path.cc
@@ -273,8 +273,10 @@ checker_path::inject_any_inlined_call_events (logger 
*logger)
   !iter.done_p (); iter.next ())
{
  logger->start_log_line ();
- logger->log_partial ("  %qE (%p), fndecl: %qE, callsite: 0x%x",
-  iter.get_block (), iter.get_block (),
+ logger->log_partial ("  %qE", iter.get_block ());
+ if (!flag_dump_noaddr)
+   logger->log_partial (" (%p)", iter.get_block ());
+ logger->log_partial (", fndecl: %qE, callsite: 0x%x",
   iter.get_fndecl (), iter.get_callsite ());
  if (iter.get_callsite ())
dump_location (logger->get_printer (), iter.get_callsite ());
-- 
2.26.3



[committed] analyzer: revamp of heap-allocated regions [PR106473]

2022-11-23 Thread David Malcolm via Gcc-patches
PR analyzer/106473 reports a false positive from -Wanalyzer-malloc-leak
on:

  void foo(char **args[], int *argc) {
  *argc = 1;
  (*args)[0] = __builtin_malloc(42);
  }

The issue is that at the write to *argc we don't know if argc could
point within *args, and so we conservatively set *args to be unknown.
At the write "(*args)[0] = __builtin_malloc(42)" we have the result of
the allocation written through an unknown pointer, so we mark the
heap_allocated_region as having escaped.
Unfortunately, within store::canonicalize we overzealously purge the
heap allocated region, losing the information that it has escaped, and
thus erroneously report a leak.

The first part of the fix is to update store::canonicalize so that it
doesn't purge heap_allocated_regions that are marked as escaping.

Doing so fixes the leak false positive, but leads to various state
explosions relating to anywhere we have a malloc/free pair in a loop,
where the analysis of the iteration appears to only have been reaching
a fixed state due to a bug in the state merger code that was erroneously
merging state about the region allocated in one iteration with that
of another.  On touching that, the analyzer fails to reach a fixed state
on any loops containing a malloc/free pair, since each analysis of a
malloc was creating a new heap_allocated_region instance.

Hence the second part of the fix is to revamp how heap_allocated_regions
are managed within the analyzer.  Rather than create a new one at each
analysis of a malloc call, instead we reuse them across the analysis,
only creating a new one if the current path's state is referencing all
of the existing ones.  Hence the heap_allocated_region instances get
used in a fixed order along every analysis path, so e.g. at:

  if (flag)
p = malloc (4096);
  else
p = malloc (1024);

both paths now use the same heap_allocated_region for their malloc
calls - but we still end up with two enodes after the CFG merger, by
rejecting merger of states with non-equal dynamic extents.
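
The reuse policy described above can be sketched with a toy pool. This is a deliberately simplified model under stated assumptions: regions are reduced to integer ids, the caller passes in the set of ids its current state references, and `heap_region_pool` and `get_or_create` are hypothetical names rather than the analyzer's API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Toy model of a manager that hands out heap regions in a fixed order.
class heap_region_pool
{
public:
  // Return the first existing region that the current state does not
  // reference; create a new one only if every existing region is in use.
  std::size_t get_or_create (const std::set<std::size_t> &referenced)
  {
    for (std::size_t id = 0; id < m_num_regions; ++id)
      if (referenced.count (id) == 0)
        return id;
    return m_num_regions++;
  }

  std::size_t num_regions () const { return m_num_regions; }

private:
  std::size_t m_num_regions = 0;
};
```

In this model, two branches that each start from a state referencing no regions both receive region 0, mirroring the `if (flag)` example; a path that already holds region 0 gets a fresh region 1.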

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4276-gce917b0422c145.

gcc/analyzer/ChangeLog:
PR analyzer/106473
* call-summary.cc
(call_summary_replay::convert_region_from_summary_1): Update for
change to creation of heap-allocated regions.
* program-state.cc (test_program_state_1): Likewise.
(test_program_state_merging): Likewise.
* region-model-impl-calls.cc (kf_calloc::impl_call_pre): Likewise.
(kf_malloc::impl_call_pre): Likewise.
(kf_operator_new::impl_call_pre): Likewise.
(kf_realloc::impl_call_postsuccess_with_move::update_model): Likewise.
* region-model-manager.cc
(region_model_manager::create_region_for_heap_alloc): Convert
to...
(region_model_manager::get_or_create_region_for_heap_alloc):
...this, reusing an existing region if it's unreferenced in the
client state.
* region-model-manager.h (region_model_manager::get_num_regions): New.
 (region_model_manager::create_region_for_heap_alloc): Convert to...
 (region_model_manager::get_or_create_region_for_heap_alloc): ...this.
* region-model.cc (region_to_value_map::can_merge_with_p): Reject
merger when the values are different.
(region_model::create_region_for_heap_alloc): Convert to...
(region_model::get_or_create_region_for_heap_alloc): ...this.
(region_model::get_referenced_base_regions): New.
(selftest::test_state_merging):  Update for change to creation of
heap-allocated regions.
(selftest::test_malloc_constraints): Likewise.
(selftest::test_malloc): Likewise.
* region-model.h: Include "sbitmap.h".
(region_model::create_region_for_heap_alloc): Convert to...
(region_model::get_or_create_region_for_heap_alloc): ...this.
(region_model::get_referenced_base_regions): New decl.
* store.cc (store::canonicalize): Don't purge a heap-allocated region
that's been marked as escaping.

gcc/testsuite/ChangeLog:
PR analyzer/106473
* gcc.dg/analyzer/aliasing-pr106473.c: New test.
* gcc.dg/analyzer/allocation-size-2.c: Add
"-fanalyzer-fine-grained".
* gcc.dg/analyzer/allocation-size-3.c: Likewise.
* gcc.dg/analyzer/explode-1.c: Mark leak with XFAIL.
* gcc.dg/analyzer/explode-3.c: New test.
* gcc.dg/analyzer/malloc-reuse.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/call-summary.cc  |  4 +-
 gcc/analyzer/program-state.cc |  4 +-
 gcc/analyzer/region-model-impl-calls.cc   |  8 +--
 gcc/analyzer/region-model-manager.cc  | 15 -
 gcc/analyzer/region-model-manager.h   |  4 +-
 gcc/analyzer/region-model.cc  | 66 +++
 gcc/analyzer/region-model.h   |  7 +-
 gcc/analyzer/store.cc  

[committed 1/2] analyzer: move known funs for fds to sm-fd.cc

2022-11-23 Thread David Malcolm via Gcc-patches
This mostly mechanical change enables a simplification in the
followup patch.  No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4273-g50d5b240424d2b.

gcc/analyzer/ChangeLog:
* analyzer.h (register_known_fd_functions): New decl.
* region-model-impl-calls.cc (class kf_accept): Move to sm-fd.cc.
(class kf_bind): Likewise.
(class kf_connect): Likewise.
(class kf_listen): Likewise.
(class kf_pipe): Likewise.
(class kf_socket): Likewise.
(register_known_functions): Remove registration of the above
functions, instead calling register_known_fd_functions.
* sm-fd.cc: Include "analyzer/call-info.h".
(class kf_socket): Move here from region-model-impl-calls.cc.
(class kf_bind): Likewise.
(class kf_listen): Likewise.
(class kf_accept): Likewise.
(class kf_connect): Likewise.
(class kf_pipe): Likewise.
(register_known_fd_functions): New.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/analyzer.h |   1 +
 gcc/analyzer/region-model-impl-calls.cc | 286 +--
 gcc/analyzer/sm-fd.cc   | 293 
 3 files changed, 296 insertions(+), 284 deletions(-)

diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index d424b43f2de..4fbe092199f 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -258,6 +258,7 @@ public:
 };
 
 extern void register_known_functions (known_function_manager );
+extern void register_known_fd_functions (known_function_manager );
 extern void register_varargs_builtins (known_function_manager );
 
 /* Passed by pointer to PLUGIN_ANALYZER_INIT callbacks.  */
diff --git a/gcc/analyzer/region-model-impl-calls.cc 
b/gcc/analyzer/region-model-impl-calls.cc
index 23a21d752cf..d3f2bf8240b 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -595,83 +595,6 @@ public:
   }
 };
 
-/* Handle calls to "accept".
-   See e.g. https://man7.org/linux/man-pages/man3/accept.3p.html  */
-
-class kf_accept : public known_function
-{
-  class outcome_of_accept : public succeed_or_fail_call_info
-  {
-  public:
-outcome_of_accept (const call_details &cd, bool success)
-: succeed_or_fail_call_info (cd, success)
-{}
-
-bool update_model (region_model *model,
-  const exploded_edge *,
-  region_model_context *ctxt) const final override
-{
-  const call_details cd (get_call_details (model, ctxt));
-  return cd.get_model ()->on_accept (cd, m_success);
-}
-  };
-
-  bool matches_call_types_p (const call_details &cd) const final override
-  {
-return (cd.num_args () == 3
-   && cd.arg_is_pointer_p (1)
-   && cd.arg_is_pointer_p (2));
-  }
-
-  void impl_call_post (const call_details &cd) const final override
-  {
-if (cd.get_ctxt ())
-  {
-   cd.get_ctxt ()->bifurcate (make_unique<outcome_of_accept> (cd, false));
-   cd.get_ctxt ()->bifurcate (make_unique<outcome_of_accept> (cd, true));
-   cd.get_ctxt ()->terminate_path ();
-  }
-  }
-};
-
-/* Handle calls to "bind".
-   See e.g. https://man7.org/linux/man-pages/man3/bind.3p.html  */
-
-class kf_bind : public known_function
-{
-public:
-  class outcome_of_bind : public succeed_or_fail_call_info
-  {
-  public:
-outcome_of_bind (const call_details &cd, bool success)
-: succeed_or_fail_call_info (cd, success)
-{}
-
-bool update_model (region_model *model,
-  const exploded_edge *,
-  region_model_context *ctxt) const final override
-{
-  const call_details cd (get_call_details (model, ctxt));
-  return cd.get_model ()->on_bind (cd, m_success);
-}
-  };
-
-  bool matches_call_types_p (const call_details &cd) const final override
-  {
-return (cd.num_args () == 3 && cd.arg_is_pointer_p (1));
-  }
-
-  void impl_call_post (const call_details &cd) const final override
-  {
-if (cd.get_ctxt ())
-  {
-   cd.get_ctxt ()->bifurcate (make_unique<outcome_of_bind> (cd, false));
-   cd.get_ctxt ()->bifurcate (make_unique<outcome_of_bind> (cd, true));
-   cd.get_ctxt ()->terminate_path ();
-  }
-  }
-};
-
 /* Handler for "__builtin_expect" etc.  */
 
 class kf_expect : public internal_known_function
@@ -723,45 +646,6 @@ kf_calloc::impl_call_pre (const call_details &cd) const
 }
 }
 
-/* Handle calls to "connect".
-   See e.g. https://man7.org/linux/man-pages/man3/connect.3p.html  */
-
-class kf_connect : public known_function
-{
-public:
-  class outcome_of_connect : public succeed_or_fail_call_info
-  {
-  public:
-outcome_of_connect (const call_details &cd, bool success)
-: succeed_or_fail_call_info (cd, success)
-{}
-
-bool update_model (region_model *model,
-  const exploded_edge *,
-  region_model_context *ctxt) const final override
-{
-  const call_details cd 

[committed 2/2] analyzer: eliminate region_model::on_ fns for sockets

2022-11-23 Thread David Malcolm via Gcc-patches
This mostly mechanical patch eliminates a confusing extra layer of
redundant calls in the handling of socket-related functions.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4274-g5d2908b7bf9305.

gcc/analyzer/ChangeLog:
* region-model.h (region_model::on_socket): Delete decl.
(region_model::on_bind): Likewise.
(region_model::on_listen): Likewise.
(region_model::on_accept): Likewise.
(region_model::on_connect): Likewise.
* sm-fd.cc (kf_socket::outcome_of_socket::update_model): Move body
of region_model::on_socket into here, ...
(region_model::on_socket): ...eliminating this function.
(kf_bind::outcome_of_bind::update_model): Likewise for on_bind...
(region_model::on_bind): ...eliminating this function.
(kf_listen::outcome_of_listen::update_model): Likewise for
on_listen...
(region_model::on_listen): ...eliminating this function.
(kf_accept::outcome_of_accept::update_model): Likewise for
on_accept...
(region_model::on_accept): ...eliminating this function.
(kf_connect::outcome_of_connect::update_model): Likewise for
on_connect...
(region_model::on_connect): ...eliminating this function.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model.h |   5 --
 gcc/analyzer/sm-fd.cc   | 144 
 2 files changed, 49 insertions(+), 100 deletions(-)

diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 8e4616c28de..4413f5542d9 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -515,11 +515,6 @@ class region_model
 
   /* Implemented in sm-fd.cc  */
   void mark_as_valid_fd (const svalue *sval, region_model_context *ctxt);
-  bool on_socket (const call_details &cd, bool successful);
-  bool on_bind (const call_details &cd, bool successful);
-  bool on_listen (const call_details &cd, bool successful);
-  bool on_accept (const call_details &cd, bool successful);
-  bool on_connect (const call_details &cd, bool successful);
 
   /* Implemented in sm-malloc.cc  */
   void on_realloc_with_move (const call_details ,
diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
index af59aef401d..8f8ec851bab 100644
--- a/gcc/analyzer/sm-fd.cc
+++ b/gcc/analyzer/sm-fd.cc
@@ -2270,7 +2270,16 @@ public:
   region_model_context *ctxt) const final override
 {
   const call_details cd (get_call_details (model, ctxt));
-  return cd.get_model ()->on_socket (cd, m_success);
+  sm_state_map *smap;
+  const fd_state_machine *fd_sm;
+  std::unique_ptr<sm_context> sm_ctxt;
+  if (!get_fd_state (ctxt, &smap, &fd_sm, NULL, &sm_ctxt))
+   return true;
+  const extrinsic_state *ext_state = ctxt->get_ext_state ();
+  if (!ext_state)
+   return true;
+
+  return fd_sm->on_socket (cd, m_success, sm_ctxt.get (), *ext_state);
 }
   };
 
@@ -2290,24 +2299,6 @@ public:
   }
 };
 
-/* Specialcase hook for handling "socket", for use by
-   kf_socket::outcome_of_socket::update_model.  */
-
-bool
-region_model::on_socket (const call_details &cd, bool successful)
-{
-  sm_state_map *smap;
-  const fd_state_machine *fd_sm;
-  std::unique_ptr<sm_context> sm_ctxt;
-  if (!get_fd_state (cd.get_ctxt (), &smap, &fd_sm, NULL, &sm_ctxt))
-return true;
-  const extrinsic_state *ext_state = cd.get_ctxt ()->get_ext_state ();
-  if (!ext_state)
-return true;
-
-  return fd_sm->on_socket (cd, successful, sm_ctxt.get (), *ext_state);
-}
-
 /* Handle calls to "bind".
See e.g. https://man7.org/linux/man-pages/man3/bind.3p.html  */
 
@@ -2326,7 +2317,15 @@ public:
   region_model_context *ctxt) const final override
 {
   const call_details cd (get_call_details (model, ctxt));
-  return cd.get_model ()->on_bind (cd, m_success);
+  sm_state_map *smap;
+  const fd_state_machine *fd_sm;
+  std::unique_ptr<sm_context> sm_ctxt;
+  if (!get_fd_state (ctxt, &smap, &fd_sm, NULL, &sm_ctxt))
+   return true;
+  const extrinsic_state *ext_state = ctxt->get_ext_state ();
+  if (!ext_state)
+   return true;
+  return fd_sm->on_bind (cd, m_success, sm_ctxt.get (), *ext_state);
 }
   };
 
@@ -2346,24 +2345,6 @@ public:
   }
 };
 
-/* Specialcase hook for handling "bind", for use by
-   kf_bind::outcome_of_bind::update_model.  */
-
-bool
-region_model::on_bind (const call_details &cd, bool successful)
-{
-  sm_state_map *smap;
-  const fd_state_machine *fd_sm;
-  std::unique_ptr<sm_context> sm_ctxt;
-  if (!get_fd_state (cd.get_ctxt (), &smap, &fd_sm, NULL, &sm_ctxt))
-return true;
-  const extrinsic_state *ext_state = cd.get_ctxt ()->get_ext_state ();
-  if (!ext_state)
-return true;
-
-  return fd_sm->on_bind (cd, successful, sm_ctxt.get (), *ext_state);
-}
-
 /* Handle calls to "listen".
See e.g. https://man7.org/linux/man-pages/man3/listen.3p.html  */
 
@@ -2381,7 +2362,16 @@ class kf_listen : public known_function
   

[r13-4268 Regression] FAIL: gcc.dg/pr107127.c (test for excess errors) on Linux/x86_64

2022-11-23 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

8a0fce6a51915c29584427fd376b40073c328090 is the first bad commit
commit 8a0fce6a51915c29584427fd376b40073c328090
Author: Jakub Jelinek 
Date:   Wed Nov 23 19:09:31 2022 +0100

c: Fix compile time hog in c_genericize [PR107127]

caused

FAIL: gcc.dg/pr107127.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-4268/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr107127.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr107127.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr107127.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr107127.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at haochen dot jiang at intel.com)


[PATCH v2] [x86] Fix incorrect _mm_cvtsbh_ss.

2022-11-23 Thread liuhongt via Gcc-patches
After supporting real __bf16, the implementation of _mm_cvtsbh_ss went
wrong.

The patch adds a builtin to generate pslld for the intrinsic; extendbfsf2
is also supported with pslld when !flag_signaling_nans &&
!HONOR_NANS (BFmode).

truncsfbf2 is supported with vcvtneps2bf16 when !flag_signaling_nans &&
!HONOR_NANS (BFmode) && flag_unsafe_math_optimizations.

Here's the updated patch.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
Ok for trunk?
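
For readers following along, the conversion that pslld performs here can be sketched in portable C++. This is only an illustration under stated assumptions, not the code from the patch: bf16_bits_to_float is a hypothetical helper that takes the raw 16-bit pattern, whereas the intrinsic takes a real __bf16. Presumably the old union-based body broke because casting a real __bf16 to unsigned int converts the value rather than reinterpreting its bits.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of GCC: widen a bfloat16 bit pattern to float.
// A bfloat16 holds the upper 16 bits of an IEEE binary32 value, so widening
// is a 16-bit left shift of the raw bits followed by a bit-for-bit copy.
float bf16_bits_to_float(std::uint16_t bits) {
    std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
    float result;
    std::memcpy(&result, &wide, sizeof result);  // reinterpret, do not convert
    return result;
}
```

The sketch copies bits with std::memcpy so that no value conversion can sneak in, which is the property the new builtin is meant to guarantee.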

gcc/ChangeLog:

PR target/107748
* config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Refined.
* config/i386/i386-builtin-types.def (FLOAT_FTYPE_BFLOAT16):
New function type.
* config/i386/i386-builtin.def (BDESC): New builtin.
* config/i386/i386-expand.cc (ix86_expand_args_builtin):
Handle the builtin.
* config/i386/i386.md (extendbfsf2): New expander.
(extendbfsf2_1): New define_insn.
(truncsfbf2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Scan pslld.
* gcc.target/i386/extendbfsf.c: New test.
---
 gcc/config/i386/avx512bf16intrin.h|  4 +-
 gcc/config/i386/i386-builtin-types.def|  1 +
 gcc/config/i386/i386-builtin.def  |  2 +
 gcc/config/i386/i386-expand.cc|  1 +
 gcc/config/i386/i386.md   | 41 ++-
 .../gcc.target/i386/avx512bf16-cvtsbh2ss-1.c  |  3 +-
 gcc/testsuite/gcc.target/i386/extendbfsf.c| 16 
 7 files changed, 62 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/extendbfsf.c

diff --git a/gcc/config/i386/avx512bf16intrin.h 
b/gcc/config/i386/avx512bf16intrin.h
index ea1d0125b3f..75378af5584 100644
--- a/gcc/config/i386/avx512bf16intrin.h
+++ b/gcc/config/i386/avx512bf16intrin.h
@@ -46,9 +46,7 @@ extern __inline float
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cvtsbh_ss (__bf16 __A)
 {
-  union{ float a; unsigned int b;} __tmp;
-  __tmp.b = ((unsigned int)(__A)) << 16;
-  return __tmp.a;
+  return __builtin_ia32_cvtbf2sf (__A);
 }
 
 /* vcvtne2ps2bf16 */
diff --git a/gcc/config/i386/i386-builtin-types.def 
b/gcc/config/i386/i386-builtin-types.def
index d10de32643f..65fe070e37f 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -1281,6 +1281,7 @@ DEF_FUNCTION_TYPE (V4SI, V4SI, V4SI, UHI)
 DEF_FUNCTION_TYPE (V8SI, V8SI, V8SI, UHI)
 
 # BF16 builtins
+DEF_FUNCTION_TYPE (FLOAT, BFLOAT16)
 DEF_FUNCTION_TYPE (V32BF, V16SF, V16SF)
 DEF_FUNCTION_TYPE (V32BF, V16SF, V16SF, V32BF, USI)
 DEF_FUNCTION_TYPE (V32BF, V16SF, V16SF, USI)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 5e0461acc00..d85b1753039 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2838,6 +2838,8 @@ BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_dpbf16ps_v8sf_maskz, "__
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_dpbf16ps_v4sf, 
"__builtin_ia32_dpbf16ps_v4sf", IX86_BUILTIN_DPBF16PS_V4SF, UNKNOWN, (int) 
V4SF_FTYPE_V4SF_V8BF_V8BF)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_dpbf16ps_v4sf_mask, 
"__builtin_ia32_dpbf16ps_v4sf_mask", IX86_BUILTIN_DPBF16PS_V4SF_MASK, UNKNOWN, 
(int) V4SF_FTYPE_V4SF_V8BF_V8BF_UQI)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_dpbf16ps_v4sf_maskz, 
"__builtin_ia32_dpbf16ps_v4sf_maskz", IX86_BUILTIN_DPBF16PS_V4SF_MASKZ, 
UNKNOWN, (int) V4SF_FTYPE_V4SF_V8BF_V8BF_UQI)
+BDESC (OPTION_MASK_ISA_SSE2, 0, CODE_FOR_extendbfsf2_1, 
"__builtin_ia32_cvtbf2sf", IX86_BUILTIN_CVTBF2SF, UNKNOWN, (int) 
FLOAT_FTYPE_BFLOAT16)
+
 
 /* AVX512FP16.  */
 BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, 
CODE_FOR_addv8hf3_mask, "__builtin_ia32_addph128_mask", 
IX86_BUILTIN_ADDPH128_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI)
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 0373c3614a4..d26e7e41445 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -10423,6 +10423,7 @@ ix86_expand_args_builtin (const struct 
builtin_description *d,
   return ix86_expand_sse_ptest (d, exp, target);
 case FLOAT128_FTYPE_FLOAT128:
 case FLOAT_FTYPE_FLOAT:
+case FLOAT_FTYPE_BFLOAT16:
 case INT_FTYPE_INT:
 case UINT_FTYPE_UINT:
 case UINT16_FTYPE_UINT16:
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 01faa911b77..62d70330c5c 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -130,6 +130,7 @@ (define_c_enum "unspec" [
   ;; For AVX/AVX512F support
   UNSPEC_SCALEF
   UNSPEC_PCMP
+  UNSPEC_CVTBFSF
 
   ;; Generic math support
   UNSPEC_IEEE_MIN  ; not commutative
@@ -4961,6 +4962,31 @@ (define_insn "*extendhf2"
(set_attr "prefix" "evex")
(set_attr "mode" "")])
 
+(define_expand "extendbfsf2"
+  [(set (match_operand:SF 0 "register_operand")
+   (unspec:SF
+ [(match_operand:BF 1 

Re: [PATCH] Remove use_equiv_p in vr-values.cc

2022-11-23 Thread Richard Biener via Gcc-patches
On Tue, Nov 22, 2022 at 2:58 PM Aldy Hernandez  wrote:
>
> With no equivalences, the use_equiv_p argument in various methods in
> simplify_using_ranges is always false.  This means we can remove all
> calls to compare_names, along with the function.
>
> OK pending tests?

OK

> gcc/ChangeLog:
>
> * vr-values.cc (simplify_using_ranges::compare_names): Remove.
> (vrp_evaluate_conditional_warnv_with_ops): Remove call to
> compare_names.
> (simplify_using_ranges::vrp_visit_cond_stmt): Remove use_equiv_p
> argument to vrp_evaluate_conditional_warnv_with_ops.
> * vr-values.h (class simplify_using_ranges): Remove
> compare_names.
> Remove use_equiv_p to vrp_evaluate_conditional_warnv_with_ops.
> ---
>  gcc/vr-values.cc | 127 +--
>  gcc/vr-values.h  |   4 +-
>  2 files changed, 3 insertions(+), 128 deletions(-)
>
> diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> index b0dd30260ae..1dbd9e47085 100644
> --- a/gcc/vr-values.cc
> +++ b/gcc/vr-values.cc
> @@ -667,124 +667,6 @@ simplify_using_ranges::compare_name_with_value
>return retval;
>  }
>
> -/* Given a comparison code COMP and names N1 and N2, compare all the
> -   ranges equivalent to N1 against all the ranges equivalent to N2
> -   to determine the value of N1 COMP N2.  Return the same value
> -   returned by compare_ranges.  Set *STRICT_OVERFLOW_P to indicate
> -   whether we relied on undefined signed overflow in the comparison.  */
> -
> -
> -tree
> -simplify_using_ranges::compare_names (enum tree_code comp, tree n1, tree n2,
> - bool *strict_overflow_p, gimple *s)
> -{
> -  /* ?? These bitmaps are NULL as there are no longer any equivalences
> - available in the value_range*.  */
> -  bitmap e1 = NULL;
> -  bitmap e2 = NULL;
> -
> -  /* Use the fake bitmaps if e1 or e2 are not available.  */
> -  static bitmap s_e1 = NULL, s_e2 = NULL;
> -  static bitmap_obstack *s_obstack = NULL;
> -  if (s_obstack == NULL)
> -{
> -  s_obstack = XNEW (bitmap_obstack);
> -  bitmap_obstack_initialize (s_obstack);
> -  s_e1 = BITMAP_ALLOC (s_obstack);
> -  s_e2 = BITMAP_ALLOC (s_obstack);
> -}
> -  if (e1 == NULL)
> -e1 = s_e1;
> -  if (e2 == NULL)
> -e2 = s_e2;
> -
> -  /* Add N1 and N2 to their own set of equivalences to avoid
> - duplicating the body of the loop just to check N1 and N2
> - ranges.  */
> -  bitmap_set_bit (e1, SSA_NAME_VERSION (n1));
> -  bitmap_set_bit (e2, SSA_NAME_VERSION (n2));
> -
> -  /* If the equivalence sets have a common intersection, then the two
> - names can be compared without checking their ranges.  */
> -  if (bitmap_intersect_p (e1, e2))
> -{
> -  bitmap_clear_bit (e1, SSA_NAME_VERSION (n1));
> -  bitmap_clear_bit (e2, SSA_NAME_VERSION (n2));
> -
> -  return (comp == EQ_EXPR || comp == GE_EXPR || comp == LE_EXPR)
> -? boolean_true_node
> -: boolean_false_node;
> -}
> -
> -  /* Start at -1.  Set it to 0 if we do a comparison without relying
> - on overflow, or 1 if all comparisons rely on overflow.  */
> -  int used_strict_overflow = -1;
> -
> -  /* Otherwise, compare all the equivalent ranges.  First, add N1 and
> - N2 to their own set of equivalences to avoid duplicating the body
> - of the loop just to check N1 and N2 ranges.  */
> -  bitmap_iterator bi1;
> -  unsigned i1;
> -  EXECUTE_IF_SET_IN_BITMAP (e1, 0, i1, bi1)
> -{
> -  if (!ssa_name (i1))
> -   continue;
> -
> -  value_range tem_vr1;
> -  const value_range *vr1 = get_vr_for_comparison (i1, &tem_vr1, s);
> -
> -  tree t = NULL_TREE, retval = NULL_TREE;
> -  bitmap_iterator bi2;
> -  unsigned i2;
> -  EXECUTE_IF_SET_IN_BITMAP (e2, 0, i2, bi2)
> -   {
> - if (!ssa_name (i2))
> -   continue;
> -
> - bool sop = false;
> -
> - value_range tem_vr2;
> - const value_range *vr2 = get_vr_for_comparison (i2, &tem_vr2, s);
> -
> - t = compare_ranges (comp, vr1, vr2, &sop);
> - if (t)
> -   {
> - /* If we get different answers from different members
> -of the equivalence set this check must be in a dead
> -code region.  Folding it to a trap representation
> -would be correct here.  For now just return don't-know.  */
> - if (retval != NULL && t != retval)
> -   {
> - bitmap_clear_bit (e1, SSA_NAME_VERSION (n1));
> - bitmap_clear_bit (e2, SSA_NAME_VERSION (n2));
> - return NULL_TREE;
> -   }
> - retval = t;
> -
> - if (!sop)
> -   used_strict_overflow = 0;
> - else if (used_strict_overflow < 0)
> -   used_strict_overflow = 1;
> -   }
> -   }
> -
> -  if (retval)
> -   {
> - bitmap_clear_bit (e1, SSA_NAME_VERSION (n1));
> -  

Re: [PATCH] Remove follow_assert_exprs from overflow_comparison.

2022-11-23 Thread Richard Biener via Gcc-patches
On Tue, Nov 22, 2022 at 2:58 PM Aldy Hernandez  wrote:
>
> OK pending tests?

OK

> gcc/ChangeLog:
>
> * tree-vrp.cc (overflow_comparison_p_1): Remove follow_assert_exprs.
> (overflow_comparison_p): Remove use_equiv_p.
> * tree-vrp.h (overflow_comparison_p): Same.
> * vr-values.cc (vrp_evaluate_conditional_warnv_with_ops): Remove
> use_equiv_p argument to overflow_comparison_p.
> ---
>  gcc/tree-vrp.cc  | 40 
>  gcc/tree-vrp.h   |  2 +-
>  gcc/vr-values.cc |  2 +-
>  3 files changed, 6 insertions(+), 38 deletions(-)
>
> diff --git a/gcc/tree-vrp.cc b/gcc/tree-vrp.cc
> index d29941d0f2d..3846dc1d849 100644
> --- a/gcc/tree-vrp.cc
> +++ b/gcc/tree-vrp.cc
> @@ -679,7 +679,7 @@ range_fold_unary_expr (value_range *vr,
>
>  static bool
>  overflow_comparison_p_1 (enum tree_code code, tree op0, tree op1,
> -bool follow_assert_exprs, bool reversed, tree 
> *new_cst)
> +bool reversed, tree *new_cst)
>  {
>/* See if this is a relational operation between two SSA_NAMES with
>   unsigned, overflow wrapping values.  If so, check it more deeply.  */
> @@ -693,19 +693,6 @@ overflow_comparison_p_1 (enum tree_code code, tree op0, 
> tree op1,
>  {
>gimple *op1_def = SSA_NAME_DEF_STMT (op1);
>
> -  /* If requested, follow any ASSERT_EXPRs backwards for OP1.  */
> -  if (follow_assert_exprs)
> -   {
> - while (gimple_assign_single_p (op1_def)
> -&& TREE_CODE (gimple_assign_rhs1 (op1_def)) == ASSERT_EXPR)
> -   {
> - op1 = TREE_OPERAND (gimple_assign_rhs1 (op1_def), 0);
> - if (TREE_CODE (op1) != SSA_NAME)
> -   break;
> - op1_def = SSA_NAME_DEF_STMT (op1);
> -   }
> -   }
> -
>/* Now look at the defining statement of OP1 to see if it adds
>  or subtracts a nonzero constant from another operand.  */
>if (op1_def
> @@ -716,24 +703,6 @@ overflow_comparison_p_1 (enum tree_code code, tree op0, 
> tree op1,
> {
>   tree target = gimple_assign_rhs1 (op1_def);
>
> - /* If requested, follow ASSERT_EXPRs backwards for op0 looking
> -for one where TARGET appears on the RHS.  */
> - if (follow_assert_exprs)
> -   {
> - /* Now see if that "other operand" is op0, following the chain
> -of ASSERT_EXPRs if necessary.  */
> - gimple *op0_def = SSA_NAME_DEF_STMT (op0);
> - while (op0 != target
> -&& gimple_assign_single_p (op0_def)
> -&& TREE_CODE (gimple_assign_rhs1 (op0_def)) == 
> ASSERT_EXPR)
> -   {
> - op0 = TREE_OPERAND (gimple_assign_rhs1 (op0_def), 0);
> - if (TREE_CODE (op0) != SSA_NAME)
> -   break;
> - op0_def = SSA_NAME_DEF_STMT (op0);
> -   }
> -   }
> -
>   /* If we did not find our target SSA_NAME, then this is not
>  an overflow test.  */
>   if (op0 != target)
> @@ -764,13 +733,12 @@ overflow_comparison_p_1 (enum tree_code code, tree op0, 
> tree op1,
> the alternate range representation is often useful within VRP.  */
>
>  bool
> -overflow_comparison_p (tree_code code, tree name, tree val,
> -  bool use_equiv_p, tree *new_cst)
> +overflow_comparison_p (tree_code code, tree name, tree val, tree *new_cst)
>  {
> -  if (overflow_comparison_p_1 (code, name, val, use_equiv_p, false, new_cst))
> +  if (overflow_comparison_p_1 (code, name, val, false, new_cst))
>  return true;
>return overflow_comparison_p_1 (swap_tree_comparison (code), val, name,
> - use_equiv_p, true, new_cst);
> + true, new_cst);
>  }
>
>  /* Handle
> diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h
> index 07630b5b1ca..127909604f0 100644
> --- a/gcc/tree-vrp.h
> +++ b/gcc/tree-vrp.h
> @@ -39,7 +39,7 @@ extern enum value_range_kind 
> intersect_range_with_nonzero_bits
>  extern bool find_case_label_range (gswitch *, tree, tree, size_t *, size_t 
> *);
>  extern tree find_case_label_range (gswitch *, const irange *vr);
>  extern bool find_case_label_index (gswitch *, size_t, tree, size_t *);
> -extern bool overflow_comparison_p (tree_code, tree, tree, bool, tree *);
> +extern bool overflow_comparison_p (tree_code, tree, tree, tree *);
>  extern void maybe_set_nonzero_bits (edge, tree);
>
>  #endif /* GCC_TREE_VRP_H */
> diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> index 0347c29b216..b0dd30260ae 100644
> --- a/gcc/vr-values.cc
> +++ b/gcc/vr-values.cc
> @@ -837,7 +837,7 @@ 
> simplify_using_ranges::vrp_evaluate_conditional_warnv_with_ops
>   occurs when the chosen argument is zero and does not occur if the
>   chosen argument is not zero.  */
>tree x;
> -  if (overflow_comparison_p (code, op0, op1, use_equiv_p, &x))
> +  

Re: [PATCH] Remove ASSERT_EXPR.

2022-11-23 Thread Richard Biener via Gcc-patches
On Tue, Nov 22, 2022 at 2:58 PM Aldy Hernandez  wrote:
>
> This removes all uses of ASSERT_EXPR except the internal one in ipa-*.
>
> OK pending tests?

OK.

> gcc/ChangeLog:
>
> * doc/gimple.texi: Remove ASSERT_EXPR references.
> * fold-const.cc (tree_expr_nonzero_warnv_p): Same.
> (fold_binary_loc): Same.
> (tree_expr_nonnegative_warnv_p): Same.
> * gimple-array-bounds.cc (get_base_decl): Same.
> * gimple-pretty-print.cc (dump_unary_rhs): Same.
> * gimple.cc (get_gimple_rhs_num_ops): Same.
> * pointer-query.cc (handle_ssa_name): Same.
> * tree-cfg.cc (verify_gimple_assign_single): Same.
> * tree-pretty-print.cc (dump_generic_node): Same.
> * tree-scalar-evolution.cc (scev_dfs::follow_ssa_edge_expr): Same.
> (interpret_rhs_expr): Same.
> * tree-ssa-operands.cc (operands_scanner::get_expr_operands): Same.
> * tree-ssa-propagate.cc
> (substitute_and_fold_dom_walker::before_dom_children): Same.
> * tree-ssa-threadedge.cc: Same.
> * tree-vrp.cc (overflow_comparison_p): Same.
> * tree.def (ASSERT_EXPR): Add note.
> * tree.h (ASSERT_EXPR_VAR): Remove.
> (ASSERT_EXPR_COND): Remove.
> * vr-values.cc (simplify_using_ranges::vrp_visit_cond_stmt):
> Remove comment.
> ---
>  gcc/doc/gimple.texi  |  3 +--
>  gcc/fold-const.cc|  6 -
>  gcc/gimple-array-bounds.cc   |  9 +---
>  gcc/gimple-pretty-print.cc   |  1 -
>  gcc/gimple.cc|  1 -
>  gcc/pointer-query.cc |  6 -
>  gcc/tree-cfg.cc  | 11 -
>  gcc/tree-pretty-print.cc |  8 ---
>  gcc/tree-scalar-evolution.cc | 15 -
>  gcc/tree-ssa-operands.cc |  1 -
>  gcc/tree-ssa-propagate.cc|  5 +
>  gcc/tree-ssa-threadedge.cc   |  6 ++---
>  gcc/tree-vrp.cc  |  7 +++---
>  gcc/tree.def |  5 -
>  gcc/tree.h   |  4 
>  gcc/vr-values.cc | 43 
>  16 files changed, 13 insertions(+), 118 deletions(-)
>
> diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi
> index 7832fa6ff90..a4263922887 100644
> --- a/gcc/doc/gimple.texi
> +++ b/gcc/doc/gimple.texi
> @@ -682,8 +682,7 @@ more than two slots on the RHS.  For instance, a 
> @code{COND_EXPR}
>  expression of the form @code{(a op b) ? x : y} could be flattened
>  out on the operand vector using 4 slots, but it would also
>  require additional processing to distinguish @code{c = a op b}
> -from @code{c = a op b ? x : y}.  Something similar occurs with
> -@code{ASSERT_EXPR}.   In time, these special case tree
> +from @code{c = a op b ? x : y}.  In time, these special case tree
>  expressions should be flattened into the operand vector.
>  @end itemize
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index b89cac91cae..114258fa182 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -10751,7 +10751,6 @@ tree_expr_nonzero_warnv_p (tree t, bool 
> *strict_overflow_p)
>  case COND_EXPR:
>  case CONSTRUCTOR:
>  case OBJ_TYPE_REF:
> -case ASSERT_EXPR:
>  case ADDR_EXPR:
>  case WITH_SIZE_EXPR:
>  case SSA_NAME:
> @@ -12618,10 +12617,6 @@ fold_binary_loc (location_t loc, enum tree_code 
> code, tree type,
>  : fold_convert_loc (loc, type, arg1);
>return tem;
>
> -case ASSERT_EXPR:
> -  /* An ASSERT_EXPR should never be passed to fold_binary.  */
> -  gcc_unreachable ();
> -
>  default:
>return NULL_TREE;
>  } /* switch (code) */
> @@ -15117,7 +15112,6 @@ tree_expr_nonnegative_warnv_p (tree t, bool 
> *strict_overflow_p, int depth)
>  case COND_EXPR:
>  case CONSTRUCTOR:
>  case OBJ_TYPE_REF:
> -case ASSERT_EXPR:
>  case ADDR_EXPR:
>  case WITH_SIZE_EXPR:
>  case SSA_NAME:
> diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
> index 1eafd3fd3e1..eae49ab3910 100644
> --- a/gcc/gimple-array-bounds.cc
> +++ b/gcc/gimple-array-bounds.cc
> @@ -75,14 +75,7 @@ get_base_decl (tree ref)
>if (gimple_assign_single_p (def))
> {
>   base = gimple_assign_rhs1 (def);
> - if (TREE_CODE (base) != ASSERT_EXPR)
> -   return base;
> -
> - base = TREE_OPERAND (base, 0);
> - if (TREE_CODE (base) != SSA_NAME)
> -   return base;
> -
> - continue;
> + return base;
> }
>
>if (!gimple_nop_p (def))
> diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
> index 7ec079f15c6..af704257633 100644
> --- a/gcc/gimple-pretty-print.cc
> +++ b/gcc/gimple-pretty-print.cc
> @@ -339,7 +339,6 @@ dump_unary_rhs (pretty_printer *buffer, const gassign 
> *gs, int spc,
>switch (rhs_code)
>  {
>  case VIEW_CONVERT_EXPR:
> -case ASSERT_EXPR:
>dump_generic_node (buffer, rhs, spc, flags, false);
>break;
>
> diff --git 

Re: [PATCH] Make Warray-bounds alias to Warray-bounds= [PR107787]

2022-11-23 Thread Richard Biener via Gcc-patches
On Wed, Nov 23, 2022 at 3:08 PM Iskander Shakirzyanov via Gcc-patches
 wrote:
>
> Hi!
> Sorry for the initially missing description.
> The following patch changes the definition of -Warray-bounds to an Alias to 
> -Warray-bounds=1. This is necessary for the correct use of 
> -Werror=array-bounds=X, for more information see bug report 107787 
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107787)
> As I understand, this happens because -Warray-bounds and -Warray-bounds= are 
> declared as 2 different options in common.opt, so when 
> diagnostic_classify_diagnostic() (opts-common.cc:1880) is called, DK_ERROR is 
> set for the -Warray-bounds= option, but in diagnostic_report_diagnostic() 
> (diagnostic.cc:1446) through warning_at() passes opt_index of -Warray-bounds, 
> so information about DK_ERROR is lost.
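
The mismatch described in the quoted report can be modelled roughly as follows. This is a toy sketch, not GCC's diagnostic machinery: Classifier, Kind and the string keys are all hypothetical stand-ins for the per-option classification table and the OPT_* indices.

```cpp
#include <map>
#include <string>

// Toy model: each option name has its own classification slot.
enum class Kind { Warning, Error };

struct Classifier {
    std::map<std::string, Kind> overrides;

    // Models -Werror=<opt>: upgrade exactly one option slot to an error.
    void classify(const std::string& opt, Kind kind) { overrides[opt] = kind; }

    // Models the lookup done when a warning is emitted for a given option.
    Kind kind_for(const std::string& opt) const {
        auto it = overrides.find(opt);
        return it == overrides.end() ? Kind::Warning : it->second;
    }
};
```

If the upgrade is recorded under "Warray-bounds=" while the warning is emitted under "Warray-bounds", the lookup misses and the diagnostic stays a warning. Making one option an alias of the other amounts to using a single key for both.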

How did you test the patch?  If you bootstrapped it and ran the
testsuite then it's OK.

Thanks,
Richard.

>
> From 51559e862d191a1f51cc9af11f0d9be5fbc0b43c Mon Sep 17 00:00:00 2001
> From: Iskander Shakirzyanov 
> Date: Wed, 23 Nov 2022 12:26:47 +
> Subject: [PATCH] Make Warray-bounds alias to Warray-bounds= [PR107787]
>
> PR driver/107787
>
> gcc/ChangeLog:
>
> * common.opt (Warray-bounds): Turn into alias to
> -Warray-bounds=1.
> * builtins.cc (warn_array_bounds): Use OPT_Warray_bounds_
> instead of OPT_Warray_bounds.
> * diagnostic-spec.cc: Likewise.
> * gimple-array-bounds.cc: Likewise.
> * gimple-ssa-warn-restrict.cc: Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * pr107787.c: New test.
>
> gcc/c-family/ChangeLog:
>
> * c-common.cc (warn_array_bounds): Use OPT_Warray_bounds_
> instead of OPT_Warray_bounds.
> ---
>  gcc/builtins.cc |  6 +++---
>  gcc/c-family/c-common.cc|  4 ++--
>  gcc/common.opt  |  2 +-
>  gcc/diagnostic-spec.cc  |  1 -
>  gcc/gimple-array-bounds.cc  | 38 -
>  gcc/gimple-ssa-warn-restrict.cc |  2 +-
>  gcc/testsuite/gcc.dg/pr107787.c | 13 +++
>  7 files changed, 39 insertions(+), 27 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr107787.c
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 4dc1ca672b2..02c4fefa86f 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -696,14 +696,14 @@ c_strlen (tree arg, int only_value, c_strlen_data 
> *data, unsigned eltsize)
>  {
>/* Suppress multiple warnings for propagated constant strings.  */
>if (only_value != 2
> - && !warning_suppressed_p (arg, OPT_Warray_bounds)
> - && warning_at (loc, OPT_Warray_bounds,
> + && !warning_suppressed_p (arg, OPT_Warray_bounds_)
> + && warning_at (loc, OPT_Warray_bounds_,
>  "offset %qwi outside bounds of constant string",
>  eltoff))
> {
>   if (decl)
> inform (DECL_SOURCE_LOCATION (decl), "%qE declared here", decl);
> - suppress_warning (arg, OPT_Warray_bounds);
> + suppress_warning (arg, OPT_Warray_bounds_);
> }
>return NULL_TREE;
>  }
> diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> index 6f1f21bc4c1..b0da6886ccf 100644
> --- a/gcc/c-family/c-common.cc
> +++ b/gcc/c-family/c-common.cc
> @@ -6811,7 +6811,7 @@ fold_offsetof (tree expr, tree type, enum tree_code ctx)
>  definition thereof.  */
>   if (TREE_CODE (v) == ARRAY_REF
>   || TREE_CODE (v) == COMPONENT_REF)
> -   warning (OPT_Warray_bounds,
> +   warning (OPT_Warray_bounds_,
>  "index %E denotes an offset "
>  "greater than size of %qT",
>  t, TREE_TYPE (TREE_OPERAND (expr, 0)));
> @@ -8534,7 +8534,7 @@ convert_vector_to_array_for_subscript (location_t loc,
>if (TREE_CODE (index) == INTEGER_CST)
>  if (!tree_fits_uhwi_p (index)
> || maybe_ge (tree_to_uhwi (index), TYPE_VECTOR_SUBPARTS (type)))
> -  warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
> +  warning_at (loc, OPT_Warray_bounds_, "index value is out of 
> bound");
>
>/* We are building an ARRAY_REF so mark the vector as addressable
>   to not run into the gimplifiers premature setting of 
> DECL_GIMPLE_REG_P
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 26e9d1cc4e7..e475d6e56eb 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -539,7 +539,7 @@ Common Var(warn_aggressive_loop_optimizations) Init(1) 
> Warning
>  Warn if a loop with constant number of iterations triggers undefined 
> behavior.
>
>  Warray-bounds
> -Common Var(warn_array_bounds) Warning
> +Common Alias(Warray-bounds=, 1, 0) Warning
>  Warn if an array is accessed out of bounds.
>
>  Warray-bounds=
> diff --git a/gcc/diagnostic-spec.cc b/gcc/diagnostic-spec.cc
> index 

Re: [PATCH] Remove legacy VRP (maybe?)

2022-11-23 Thread Richard Biener via Gcc-patches
On Tue, Nov 22, 2022 at 2:40 PM Aldy Hernandez  wrote:
>
>
>
> On 11/22/22 10:22, Richard Biener wrote:
> > On Tue, Nov 22, 2022 at 10:04 AM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 11/22/22 09:25, Richard Biener wrote:
> >>> On Tue, Nov 22, 2022 at 9:24 AM Richard Biener
> >>>  wrote:
> 
>  On Mon, Nov 21, 2022 at 5:49 PM Jeff Law  wrote:
> >
> >
> > On 11/21/22 09:35, Aldy Hernandez via Gcc-patches wrote:
> >> I've been playing around with removing the legacy VRP code for the
> >> next release.  It's a layered onion to get this right, but the first
> >> bit is pretty straightforward and may be useful for this release.
> >> Basically, it entails removing the old VRP pass itself, along with
> >> value_range_equiv which have no producers left.  The current users of
> >> value_range_equiv don't put anything in the equivalence bitmaps, so
> >> they're basically behaving like plain value_range.
> >>
> >> I removed as much as possible without having to change any behavior,
> >> and this is what I came up with.  Is this something that would be
> >> useful for this release?  Would it help release managers have less
> >> unused cruft in the tree?
> >>
> >> Neither Andrew nor I have any strong feelings here.  We don't foresee
> >> the legacy code changing at all in the offseason, so we can just
> >> accumulate these patches in local trees.
> >
> > I'd lean towards removal after gcc-13 releases.
> 
>  I think removing the ability to switch to the old implementation eases
>  maintenance so I'd prefer to have this before the gcc-13 release.
> 
>  So please go ahead.
> >>>
> >>> Btw, ASSERT_EXPR should also go away with this, no?
> >>
> >> Ah yes, for everything except ipa-*.* which uses it internally (and sets
> >> it in its internal structures):
> >>
> >>  - ASSERT_EXPR means that only the value in operand is allowed to
> >> pass
> >>through (without any change), for all other values the result is
> >>unknown.
> >
> > Ick.  But yeah, I can see how 'ASSERT_EXPR' looked nice to use here
> > (but it's only a distinct value, so TARGET_OPTION_NODE would have
> > worked here as well)
> >
> >> I can remove all other uses, including any externally visible 
> >> documentation.
> >
> > Works for me.
>
> Documented and added change log entries.  Retested on x86-64 Linux.
>
> There are three follow-up patches removing ASSERT_EXPR which I'll post
> shortly.
>
> OK for trunk?

OK.

Thanks,
Richard.

> Aldy


Re: [PATCH] Add a new conversion for conditional ternary set into ifcvt [PR106536]

2022-11-23 Thread Richard Biener via Gcc-patches
On Wed, Nov 23, 2022 at 8:09 AM HAO CHEN GUI via Gcc-patches
 wrote:
>
> Hi,
>   There is a new insn on my target, which has a nested if_then_else and
> sets -1, 0 or 1 according to a comparison.
>
>[(set (match_operand:SI 0 "gpc_reg_operand" "=r")
>  (if_then_else:SI (lt (match_operand:CC 1 "cc_reg_operand" "y")
>   (const_int 0))
>   (const_int -1)
>   (if_then_else (gt (match_dup 1)
> (const_int 0))
> (const_int 1)
> (const_int 0))))]
>
>   In ifcvt pass, it probably contains a comparison, a branch, a setcc
> and a constant set.
>
> 8: r122:CC=cmp(r120:DI#0,r121:DI#0)
> 9: pc={(r122:CC<0)?L29:pc}
>
>14: r118:SI=r122:CC>0
>
>29: L29:
> 5: r118:SI=0x
>
>   This patch adds the new conversion into ifcvt and converts this kind of
> branch into a nested if-then-else insn if the target supports such
> pattern.
>
>   HAVE_ternary_conditional_set indicates if the target has such nested
> if-then-else insn. It's set in genconfig. noce_try_ternary_cset will be
> executed to detect suitable pattern and convert it to the nested
> if-then-else insn if HAVE_ternary_conditional_set is set. The hook
> TARGET_NOCE_TERNARY_CSET_P detects target specific pattern and output
> conditions and setting integers for the nested if-then-else.
>
>   Bootstrapped and tested on powerpc64-linux BE/LE and x86 with no
> regressions. Is this okay for trunk? Any recommendations? Thanks a lot.
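
At the source level, the nested if_then_else quoted above corresponds to a three-way comparison. A minimal sketch, assuming signed operands; ternary_cset is a hypothetical name used only for illustration and does not appear in the patch:

```cpp
// Hypothetical illustration: yields -1, 1 or 0 for less-than, greater-than
// and equal, mirroring the nested if_then_else on a single comparison result.
int ternary_cset(long a, long b) {
    return a < b ? -1 : (a > b ? 1 : 0);
}
```

Whether a given function is actually turned into the single nested insn would depend on the target pattern and on the ifcvt conversion being discussed.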

Wouldn't we usually either add an optab or try to recog a canonical
RTL form instead of adding a new target hook for things like this?

> ChangeLog
> 2022-11-23  Haochen Gui 
>
> gcc/
> * doc/tm.texi: Regenerate.
> * doc/tm.texi.in (TARGET_NOCE_TERNARY_CSET_P): Document new hook.
> * genconfig.cc (have_ternary_cset_flag): New.
> (walk_insn_part): Detect nested if-then-else with const_int setting
> and set have_ternary_cset_flag.
> (HAVE_ternary_conditional_set): Define.
> * ifcvt.cc (noce_emit_ternary_cset): New function to emit nested
> if-then-else insns.
> (noce_try_ternary_cset): Detect ternary conditional set and emit the
> insn.
> (noce_process_if_block): Try to do ternary condition set conversion
> when a target supports ternary conditional set insn.
> * target.def (noce_ternary_cset_p): New hook.
> * targhooks.cc (default_noce_ternary_cset_p): New function.
> * targhooks.h (default_noce_ternary_cset_p): New declare.
>
>
> patch.diff
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 92bda1a7e14..9823eccbe68 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -7094,6 +7094,15 @@ the @code{POLY_VALUE_MIN}, @code{POLY_VALUE_MAX} and
>  implementation returns the lowest possible value of @var{val}.
>  @end deftypefn
>
> +@deftypefn {Target Hook} bool TARGET_NOCE_TERNARY_CSET_P (struct 
> noce_if_info *@var{if_info}, rtx *@var{outer_cond}, rtx *@var{inner_cond}, 
> int *@var{int1}, int *@var{int2}, int *@var{int3})
> +This hook returns true if the if-then-else-join blocks described in
> +@code{if_info} can be converted to a ternary conditional set implemented by
> +a nested if-then-else insn.  The @code{int1}, @code{int2} and @code{int3}
> +are three possible results of the nested if-then-else insn.
> +@code{outer_cond} and @code{inner_cond} are the conditions for the outer and
> +inner if-then-else.
> +@end deftypefn
> +
>  @node Scheduling
>  @section Adjusting the Instruction Scheduler
>
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 112462310b1..1d6f28cc50a 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4631,6 +4631,8 @@ Define this macro if a non-short-circuit operation 
> produced by
>
>  @hook TARGET_ESTIMATED_POLY_VALUE
>
> +@hook TARGET_NOCE_TERNARY_CSET_P
> +
>  @node Scheduling
>  @section Adjusting the Instruction Scheduler
>
> diff --git a/gcc/genconfig.cc b/gcc/genconfig.cc
> index b7c6b48eec6..902c832cf5a 100644
> --- a/gcc/genconfig.cc
> +++ b/gcc/genconfig.cc
> @@ -33,6 +33,7 @@ static int max_recog_operands;  /* Largest operand number 
> seen.  */
>  static int max_dup_operands;/* Largest number of match_dup in any insn.  
> */
>  static int max_clobbers_per_insn;
>  static int have_cmove_flag;
> +static int have_ternary_cset_flag;
>  static int have_cond_exec_flag;
>  static int have_lo_sum_flag;
>  static int have_rotate_flag;
> @@ -136,6 +137,12 @@ walk_insn_part (rtx part, int recog_p, int 
> non_pc_set_src)
>   && GET_CODE (XEXP (part, 1)) == MATCH_OPERAND
>   && GET_CODE (XEXP (part, 2)) == MATCH_OPERAND)
> have_cmove_flag = 1;
> +  else if (recog_p && non_pc_set_src
> +  && GET_CODE (XEXP (part, 1)) == CONST_INT
> +  && GET_CODE (XEXP (part, 2)) == IF_THEN_ELSE
> +  

Re: [PATCH] rs6000: Adjust loop_unroll_adjust to match middle-end change [PR 107692]

2022-11-23 Thread Richard Biener via Gcc-patches
On Wed, Nov 23, 2022 at 2:53 AM Hongyu Wang  wrote:
>
> Hi, Segher and Richard
>
> > > Something in your patch was wrong, please fix that (or revert the
> > > patch).  You should not have to touch config/rs6000/ at all.
> >
> > Sure something is wrong, but I think there's the opportunity to
> > simplify rs6000/ and s390x/, the only other two implementors of
> > the hook used.
>
> If I understand correctly, the wrong part is we should not break the
> logic of -funroll-loops and check OPTION_SET_P in
> targetm.loop_unroll_adjust to pretend the loop-unrolling is disabled
> with -fno-unroll-loops.
> I don't have a good idea to resolve this, perhaps add another hook and
> check OPTION_SET_P (flag_unroll_loops) && munroll_only_small_loops
> there and use that hook in rtl_loop_unroll::gate (), but still it
> doesn't work if we want to strictly follow the logic that
> -munroll-only-small-loops should not enable loop unrolling.
> IMHO the middle-end part with target hook looks quite tricky (and of
> course the OPTION_SET_P in the target hook). So Richard if you agree,
> I'd like to install the reversion patch posted in
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606774.html
> and move all them to the backend first.

Fine by me.

Richard.

> --
> Regards,
>
> Hongyu, Wang


Re: [PATCH] AArch64: Add fma_reassoc_width [PR107413]

2022-11-23 Thread Wilco Dijkstra via Gcc-patches
Hi Richard,

>> A smart reassociation pass could form more FMAs while also increasing
>> parallelism, but the way it currently works always results in fewer FMAs.
>
> Yeah, as Richard said, that seems the right long-term fix.
> It would also avoid the hack of treating PLUS_EXPR as a signal
> of an FMA, which has the drawback of assuming (for 2-FMA cores)
> that plain addition never benefits from reassociation in its own right.

True but it's hard to separate them. You will have a mix of FADD and FMAs
to reassociate (since FMA still counts as an add), and the ratio between
them as well as the number of operations may affect the best reassociation
width.
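
To make the trade-off concrete, here is a minimal sketch (not GCC code; the function names are invented for illustration). It writes the same dot product once as a single dependent FMA chain and once as four independent chains, which is roughly what a reassociation width of 1 versus 4 means for this kind of loop. The test inputs are assumed to be small integers so both orders give identical results; in general, reassociating floating-point sums changes rounding.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Width 1: each FMA depends on the previous one, so the loop is bound by
// FMA latency, but every multiply-add pair can be fused.
double dot_width1(const std::vector<double> &a, const std::vector<double> &b)
{
  double acc = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    acc = std::fma(a[i], b[i], acc);
  return acc;
}

// Width 4: four independent accumulators can keep several FMA pipes busy,
// at the cost of three extra plain additions at the end.
double dot_width4(const std::vector<double> &a, const std::vector<double> &b)
{
  double acc[4] = {0.0, 0.0, 0.0, 0.0};
  std::size_t i = 0;
  for (; i + 4 <= a.size(); i += 4)
    for (std::size_t k = 0; k < 4; ++k)
      acc[k] = std::fma(a[i + k], b[i + k], acc[k]);
  // Leftover elements go into the first accumulator.
  for (; i < a.size(); ++i)
    acc[0] = std::fma(a[i], b[i], acc[0]);
  return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

Which width wins depends on the number of FMA pipes and on how many of the operations are FMAs rather than plain adds, which is the ratio mentioned above.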

> Still, I guess the hackiness is pre-existing and the patch is removing
> the hackiness for some cores, so from that point of view it's a strict
> improvement over the status quo.  And it's too late in the GCC 13
> cycle to do FMA reassociation properly.  So I'm OK with the patch
> in principle, but could you post an update with more commentary?

Sure, here is an update with longer comment in aarch64_reassociation_width:


Add a reassociation width for FMAs in per-CPU tuning structures. Keep the
existing setting for cores with 2 FMA pipes, and use 4 for cores with 4
FMA pipes.  This improves SPECFP2017 on Neoverse V1 by ~1.5%.

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog:
PR 107413
* config/aarch64/aarch64.cc (struct tune_params): Add
fma_reassoc_width to all CPU tuning structures.
(aarch64_reassociation_width): Use fma_reassoc_width.
* config/aarch64/aarch64-protos.h (struct tune_params): Add
fma_reassoc_width.

---
diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
238820581c5ee7617f8eed1df2cf5418b1127e19..4be93c93c26e091f878bc8e4cf06e90888405fb2
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -540,6 +540,7 @@ struct tune_params
   const char *loop_align;
   int int_reassoc_width;
   int fp_reassoc_width;
+  int fma_reassoc_width;
   int vec_reassoc_width;
   int min_div_recip_mul_sf;
   int min_div_recip_mul_df;
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
c91df6f5006c257690aafb75398933d628a970e1..15d478c77ceb2d6c52a70b6ffd8fdadcfa8deba0
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -1346,6 +1346,7 @@ static const struct tune_params generic_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1382,6 +1383,7 @@ static const struct tune_params cortexa35_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1415,6 +1417,7 @@ static const struct tune_params cortexa53_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1448,6 +1451,7 @@ static const struct tune_params cortexa57_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1481,6 +1485,7 @@ static const struct tune_params cortexa72_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1514,6 +1519,7 @@ static const struct tune_params cortexa73_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1548,6 +1554,7 @@ static const struct tune_params exynosm1_tunings =
   "4", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1580,6 +1587,7 @@ static const struct tune_params thunderxt88_tunings =
   "8", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */
+  1,   /* fma_reassoc_width.  */
   1,   /* vec_reassoc_width.  */
   2,   /* min_div_recip_mul_sf.  */
   2,   /* min_div_recip_mul_df.  */
@@ -1612,6 +1620,7 @@ static const struct tune_params thunderx_tunings =
   "8", /* loop_align.  */
   2,   /* 

Re: [PATCH] c: Fix compile time hog in c_genericize [PR107127]

2022-11-23 Thread Joseph Myers
On Wed, 23 Nov 2022, Jakub Jelinek via Gcc-patches wrote:

> Hi!
> 
> The complex multiplications result in deeply nested set of many SAVE_EXPRs,
> which takes even on fast machines over 5 minutes to walk.
> This patch fixes that by using walk_tree_without_duplicates where it is
> instant.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-11-23  Andrew Pinski  
>   Jakub Jelinek  
> 
>   PR c/107127
>   * c-gimplify.cc (c_genericize): Use walk_tree_without_duplicates
>   instead of walk_tree for c_genericize_control_r.
> 
>   * gcc.dg/pr107127.c: New test.

OK.
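
For readers wondering why the plain walk was so slow: when subtrees are shared (as with nested SAVE_EXPRs), a naive recursive walk revisits the shared node once per path, which grows exponentially with depth. The sketch below is a toy model, not GCC's walk_tree; `Node`, `walk_all_paths` and `walk_once` are hypothetical names. It only illustrates the effect of remembering visited nodes.

```cpp
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for a tree node; GCC's real `tree` is far richer.
struct Node
{
  Node *left = nullptr;
  Node *right = nullptr;
};

// Visits a node once per path that reaches it, like a plain recursive walk.
std::size_t walk_all_paths(const Node *n)
{
  if (!n)
    return 0;
  return 1 + walk_all_paths(n->left) + walk_all_paths(n->right);
}

// Visits each distinct node once by remembering what was already seen.
std::size_t walk_once(const Node *n, std::unordered_set<const Node *> &seen)
{
  if (!n || !seen.insert(n).second)
    return 0;
  return 1 + walk_once(n->left, seen) + walk_once(n->right, seen);
}
```

With a chain of 11 nodes in which both children of each node point at the same successor, the first function counts 2047 visits and the second counts 11.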

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH]AArch64 sve2: Fix expansion of division [PR107830]

2022-11-23 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
> Hi All,
>
> SVE has an actual division optab, and when using -Os we don't
> optimize the division away.  This means that we need to distinguish
> between a div which we can optimize and one we cannot even during
> expansion.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/107830
>   * config/aarch64/aarch64.cc
>   (aarch64_vectorize_can_special_div_by_constant): Check validity during
>   codegen phase as well.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/107830
>   * gcc.target/aarch64/sve2/pr107830.c: New test.
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 4176d7b046a126664360596b6db79a43e77ff76a..bee23625807af95d5ec15ad45702961b2d7ab55d
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24322,12 +24322,15 @@ aarch64_vectorize_can_special_div_by_constant (enum 
> tree_code code,
>if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
>  return false;
>  
> +  wide_int val = wi::add (cst, 1);
> +  int pow = wi::exact_log2 (val);
> +  bool valid_p = pow == (int)(element_precision (vectype) / 2);
> +  /* SVE actually has a div operator, we we may have gotten here through
> + that route.  */
>if (in0 == NULL_RTX && in1 == NULL_RTX)
> -{
> -  wide_int val = wi::add (cst, 1);
> -  int pow = wi::exact_log2 (val);
> -  return pow == (int)(element_precision (vectype) / 2);
> -}
> +return valid_p;
> +  else if (!valid_p)
> +return false;

Is this equivalent to:

  int pow = wi::exact_log2 (cst + 1);
  if (pow != (int) (element_precision (vectype) / 2))
return false;

  /* We can use the optimized pattern.  */
  if (in0 == NULL_RTX && in1 == NULL_RTX)
return true;

?  If so, I'd find that slightly easier to follow, but I realise it's
personal taste.  OK with that change if it works and you agree.
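
As a standalone illustration of the condition being tested, the sketch below (plain C++, not GCC internals; all three function names are hypothetical) checks whether a divisor has the form 2^(precision/2) - 1. It also shows, for 16-bit elements, the add-and-shift identity that can replace a division by 0xff. The identity is evaluated in 32-bit arithmetic here so that overflow is not a concern; how the real ADDHNB-based pattern handles that is outside this sketch.

```cpp
#include <cstdint>

// Returns log2(v) if v is an exact power of two, otherwise -1.
// Hypothetical helper playing the role wi::exact_log2 has in the patch.
int exact_log2_u64(std::uint64_t v)
{
  if (v == 0 || (v & (v - 1)) != 0)
    return -1;
  int n = 0;
  while ((v >>= 1) != 0)
    ++n;
  return n;
}

// The divisor qualifies when cst + 1 == 2^(precision / 2).
bool special_divisor_p(std::uint64_t cst, unsigned precision)
{
  return exact_log2_u64(cst + 1) == static_cast<int>(precision / 2);
}

// x / 0xff for a 16-bit x using only adds and shifts, done in 32 bits.
std::uint16_t div_by_0xff(std::uint16_t x)
{
  std::uint32_t wide = x;
  std::uint32_t t = (wide + 257) >> 8;
  return static_cast<std::uint16_t>((wide + t) >> 8);
}
```

For example, 0xff with 16-bit elements qualifies (0xff + 1 is 2^8), while 3 with 32-bit elements does not.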

While looking at this, I noticed that we ICE for:

  void f(unsigned short *restrict p1, unsigned int *restrict p2)
  {
for (int i = 0; i < 16; ++i)
  {
p1[i] /= 0xff;
p2[i] += 1;
  }
  }

for -march=armv8-a+sve2 -msve-vector-bits=512.  I guess we need to filter
out partial modes or (better) add support for them.  Adding support for
them probably requires changes to the underlying ADDHNB pattern.

Thanks,
Richard

>if (!VECTOR_TYPE_P (vectype))
> return false;
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr107830.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/pr107830.c
> new file mode 100644
> index 
> ..6d8ee3615fdb0083dbde1e45a2826fb681726139
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr107830.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target fopenmp } */
> +/* { dg-additional-options "-Os -fopenmp" } */
> +
> +void
> +f2 (int *a)
> +{
> +  unsigned int i;
> +
> +#pragma omp simd
> +  for (i = 0; i < 4; ++i)
> +a[i / 3] -= 4;
> +}


Re: [PATCH] doc: -Wdelete-non-virtual-dtor supersedes -Wnon-virtual-dtor

2022-11-23 Thread Jason Merrill via Gcc-patches

On 11/23/22 05:10, Jonathan Wakely wrote:

The existence of this option makes users think they need it (even though
it's in neither -Wall nor -Wextra). Document that there's a better
option (since 2011).

OK for trunk?


OK.


-- >8 --

The newer -Wdelete-non-virtual-dtor has no false positives and fewer
bugs. There is very little reason to use -Wnon-virtual-dtor instead.
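
A small sketch of the kind of code where the two options differ (the class names are made up for illustration). The base class is polymorphic with a public non-virtual destructor, which is what -Wnon-virtual-dtor is documented to complain about, yet no object is ever deleted through a base pointer, so -Wdelete-non-virtual-dtor should have nothing to report.

```cpp
// A polymorphic class with an implicit, public, non-virtual destructor.
struct Shape
{
  virtual int sides() const = 0;
};

struct Triangle final : Shape
{
  int sides() const override { return 3; }
};

struct Square final : Shape
{
  int sides() const override { return 4; }
};

// The objects live on the stack and are never deleted through a Shape *,
// so the missing virtual destructor is harmless in this function.
int total_sides()
{
  Triangle t;
  Square s;
  const Shape *shapes[] = {&t, &s};
  int total = 0;
  for (const Shape *p : shapes)
    total += p->sides();
  return total;
}

// The genuinely unsafe case, left as a comment on purpose:
//   Shape *p = new Triangle;
//   delete p;   // undefined behaviour: ~Shape is not virtual
```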

gcc/ChangeLog:

* doc/invoke.texi (C++ Dialect Options): Recommend using
-Wdelete-non-virtual-dtor instead of -Wnon-virtual-dtor.
---
  gcc/doc/invoke.texi | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 330da6eb5d4..4899bd1ea4c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3986,6 +3986,9 @@ destructor itself or in an accessible polymorphic base 
class, in which
  case it is possible but unsafe to delete an instance of a derived
  class through a pointer to the class itself or base class.  This
  warning is automatically enabled if @option{-Weffc++} is specified.
+The @option{-Wdelete-non-virtual-dtor} option (enabled by @option{-Wall})
+should be preferred because it warns about the unsafe cases without false
+positives.
  
  @item -Wregister @r{(C++ and Objective-C++ only)}

  @opindex Wregister




Re: [PATCH 2/5] c++: Set the locus of the function result decl

2022-11-23 Thread Jason Merrill via Gcc-patches

On 11/22/22 15:25, Jason Merrill wrote:

On 11/20/22 12:06, Bernhard Reutner-Fischer wrote:

Hi Jason!

The "meh" of result-decl-plugin-test-2.C should likely be omitted,
grokdeclarator would need some changes to add richloc hints and we would not
be able to make a reliable guess what to remove precisely.
C.f. /* Check all other uses of type modifiers.  */
Furthermore it is unrelated to DECL_RESULT so not of direct interest
here. The other tests in test-2.C, f() and huh() should work though.

I don't know if it's acceptable to change ipa-pure-const to make the
missing noreturn warning more precise and emit a fixit-hint. At least it
would be a real test for the DECL_RESULT and would spare us the plugin.


The main problem I see with that change is that the syntax of the fixit 
might be wrong for non-C-family front-ends.


Here's a version of the patch that fixes template/method handling, and 
adjusts -Waggregate-return as well:


Actually, that broke some of the spaceship tests, fixed by this version:
From 3c8106c95ec07d17ff5ade173126067c540ab7cc Mon Sep 17 00:00:00 2001
From: Bernhard Reutner-Fischer 
Date: Sun, 20 Nov 2022 18:06:04 +0100
Subject: [PATCH] c++: Set the locus of the function result decl
To: gcc-patches@gcc.gnu.org

gcc/cp/ChangeLog:

	* decl.cc (grokdeclarator): Build RESULT_DECL.
	(start_preparsed_function): Copy location from template.
	* semantics.cc (apply_deduced_return_type): Handle
	arg != current_function_decl.
	* method.cc (implicitly_declare_fn): Use it.

gcc/ChangeLog:

	* function.cc (init_function_start): Use DECL_RESULT location
	for -Waggregate-return warning.
	* ipa-pure-const.cc (suggest_attribute): Add fixit-hint for the
	noreturn attribute.

gcc/testsuite/ChangeLog:

	* c-c++-common/pr68833-1.c: Adjust noreturn warning line number.
	* gcc.dg/noreturn-1.c: Likewise.
	* g++.dg/diagnostic/return-type-loc1.C: New test.
	* g++.dg/other/resultdecl-1.C: New test.

Co-authored-by: Jason Merrill 
---
 gcc/cp/decl.cc| 26 +--
 gcc/cp/method.cc  |  2 +-
 gcc/cp/semantics.cc   | 15 -
 gcc/function.cc   |  3 +-
 gcc/ipa-pure-const.cc | 14 +++-
 gcc/testsuite/c-c++-common/pr68833-1.c|  2 +-
 .../g++.dg/diagnostic/return-type-loc1.C  | 20 
 gcc/testsuite/g++.dg/other/resultdecl-1.C | 32 +++
 gcc/testsuite/gcc.dg/noreturn-1.c |  2 +-
 9 files changed, 101 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/return-type-loc1.C
 create mode 100644 gcc/testsuite/g++.dg/other/resultdecl-1.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 544efdc9914..2c5cd930e0a 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -14774,6 +14774,18 @@ grokdeclarator (const cp_declarator *declarator,
 	else if (constinit_p)
 	  DECL_DECLARED_CONSTINIT_P (decl) = true;
   }
+else if (TREE_CODE (decl) == FUNCTION_DECL)
+  {
+	location_t loc = smallest_type_location (declspecs);
+	if (loc != UNKNOWN_LOCATION)
+	  {
+	tree restype = TREE_TYPE (TREE_TYPE (decl));
+	tree resdecl = build_decl (loc, RESULT_DECL, 0, restype);
+	DECL_ARTIFICIAL (resdecl) = 1;
+	DECL_IGNORED_P (resdecl) = 1;
+	DECL_RESULT (decl) = resdecl;
+	  }
+  }
 
 /* Record constancy and volatility on the DECL itself .  There's
no need to do this when processing a template; we'll do this
@@ -17328,9 +17340,19 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
 
   if (DECL_RESULT (decl1) == NULL_TREE)
 {
-  tree resdecl;
+  /* In a template instantiation, copy the return type location.  When
+	 parsing, the location will be set in grokdeclarator.  */
+  location_t loc = input_location;
+  if (DECL_TEMPLATE_INSTANTIATION (decl1)
+	  && !DECL_CXX_CONSTRUCTOR_P (decl1)
+	  && !DECL_CXX_DESTRUCTOR_P (decl1))
+	{
+	  tree tmpl = template_for_substitution (decl1);
+	  tree res = DECL_RESULT (DECL_TEMPLATE_RESULT (tmpl));
+	  loc = DECL_SOURCE_LOCATION (res);
+	}
 
-  resdecl = build_decl (input_location, RESULT_DECL, 0, restype);
+  tree resdecl = build_decl (loc, RESULT_DECL, 0, restype);
   DECL_ARTIFICIAL (resdecl) = 1;
   DECL_IGNORED_P (resdecl) = 1;
   DECL_RESULT (decl1) = resdecl;
diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 1e962b6e3b1..7b4d5a59823 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -3079,7 +3079,7 @@ implicitly_declare_fn (special_function_kind kind, tree type,
 {
   fn = copy_operator_fn (pattern_fn, EQ_EXPR);
   DECL_ARTIFICIAL (fn) = 1;
-  TREE_TYPE (fn) = change_return_type (boolean_type_node, TREE_TYPE (fn));
+  apply_deduced_return_type (fn, boolean_type_node);
   return fn;
 }
 
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 9401b35a789..ab52e56d6c1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ 

Re: [PATCH] analyzer: Use __builtin_alloca in gcc.dg/analyzer/call-summaries-2.c

2022-11-23 Thread David Malcolm via Gcc-patches
On Wed, 2022-11-23 at 14:27 +0100, Rainer Orth wrote:
> gcc.dg/analyzer/call-summaries-2.c currently FAILs on Solaris:
> 
> FAIL: gcc.dg/analyzer/call-summaries-2.c (test for excess errors)
> 
> Excess errors:
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/call-
> summaries-2.c:468:12: warning: implicit declaration of function
> 'alloca' [-Wimplicit-function-declaration]
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/call-
> summaries-2.c:468:12: warning: incompatible implicit decl
> 
> alloca is only declared in <alloca.h>, which isn't included
> indirectly
> anywhere.  To avoid this, I switched the test to use __builtin_alloca
> instead, following the vast majority of analyzer tests that use
> alloca.
> 
> Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
> x86_64-pc-linux-gnu.
> 
> Ok for trunk?

Yes

Thanks
Dave

> 
> There are a handful of tests that explicitly include <alloca.h>
> instead,
> which is of course an alternative if preferred.
> 
> Rainer
> 



[PATCH]AArch64 sve2: Fix expansion of division [PR107830]

2022-11-23 Thread Tamar Christina via Gcc-patches
Hi All,

SVE has an actual division optab, and when using -Os we don't
optimize the division away.  This means that we need to distinguish
between a div which we can optimize and one we cannot even during
expansion.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/107830
* config/aarch64/aarch64.cc
(aarch64_vectorize_can_special_div_by_constant): Check validity during
codegen phase as well.

gcc/testsuite/ChangeLog:

PR target/107830
* gcc.target/aarch64/sve2/pr107830.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
4176d7b046a126664360596b6db79a43e77ff76a..bee23625807af95d5ec15ad45702961b2d7ab55d
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24322,12 +24322,15 @@ aarch64_vectorize_can_special_div_by_constant (enum 
tree_code code,
   if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
 return false;
 
+  wide_int val = wi::add (cst, 1);
+  int pow = wi::exact_log2 (val);
+  bool valid_p = pow == (int)(element_precision (vectype) / 2);
+  /* SVE actually has a div operator, we we may have gotten here through
+ that route.  */
   if (in0 == NULL_RTX && in1 == NULL_RTX)
-{
-  wide_int val = wi::add (cst, 1);
-  int pow = wi::exact_log2 (val);
-  return pow == (int)(element_precision (vectype) / 2);
-}
+return valid_p;
+  else if (!valid_p)
+return false;
 
   if (!VECTOR_TYPE_P (vectype))
return false;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr107830.c 
b/gcc/testsuite/gcc.target/aarch64/sve2/pr107830.c
new file mode 100644
index 
..6d8ee3615fdb0083dbde1e45a2826fb681726139
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr107830.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fopenmp } */
+/* { dg-additional-options "-Os -fopenmp" } */
+
+void
+f2 (int *a)
+{
+  unsigned int i;
+
+#pragma omp simd
+  for (i = 0; i < 4; ++i)
+a[i / 3] -= 4;
+}









Re: [PATCH] Make Warray-bounds alias to Warray-bounds= [PR107787]

2022-11-23 Thread Iskander Shakirzyanov via Gcc-patches
Hi!
Sorry for the initially missing description.
The following patch turns -Warray-bounds into an alias for
-Warray-bounds=1.  This is necessary for -Werror=array-bounds=X to work
correctly; for more information see bug report 107787
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107787).
As I understand it, the problem is that -Warray-bounds and -Warray-bounds=
are declared as two different options in common.opt.  When
diagnostic_classify_diagnostic() (opts-common.cc:1880) is called, DK_ERROR
is set for the -Warray-bounds= option, but warning_at() passes the opt_index
of -Warray-bounds to diagnostic_report_diagnostic() (diagnostic.cc:1446),
so the DK_ERROR classification is lost.
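
The mismatch can be modelled in a few lines. This is a deliberately simplified sketch with made-up names (`Classifier`, `OPT_W_PLAIN`, `OPT_W_LEVEL`); it is not how GCC's diagnostic machinery is actually structured. It only shows why classifying one option index and reporting under another loses the error classification.

```cpp
#include <map>

// Hypothetical, heavily simplified model; these names are not GCC's.
enum Option { OPT_W_PLAIN, OPT_W_LEVEL };   // stand-ins for -Wfoo and -Wfoo=
enum Kind { KIND_WARNING, KIND_ERROR };

struct Classifier
{
  std::map<Option, Kind> overrides;

  // Models -Werror=<opt>: remember that this option index is an error.
  void classify(Option opt, Kind kind) { overrides[opt] = kind; }

  // Models reporting: look up the kind for the index the caller passed.
  Kind kind_for(Option opt) const
  {
    auto it = overrides.find(opt);
    return it == overrides.end() ? KIND_WARNING : it->second;
  }
};
```

If the error classification is recorded for `OPT_W_LEVEL` but the warning is emitted under `OPT_W_PLAIN`, the lookup falls back to a plain warning. Making one option an alias of the other means only a single index is used everywhere.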


From 51559e862d191a1f51cc9af11f0d9be5fbc0b43c Mon Sep 17 00:00:00 2001
From: Iskander Shakirzyanov 
Date: Wed, 23 Nov 2022 12:26:47 +
Subject: [PATCH] Make Warray-bounds alias to Warray-bounds= [PR107787]

PR driver/107787

gcc/ChangeLog:

* common.opt (Warray-bounds): Turn into alias to
-Warray-bounds=1.
* builtins.cc (warn_array_bounds): Use OPT_Warray_bounds_
instead of OPT_Warray_bounds.
* diagnostic-spec.cc: Likewise.
* gimple-array-bounds.cc: Likewise.
* gimple-ssa-warn-restrict.cc: Likewise.

gcc/testsuite/ChangeLog:

* pr107787.c: New test.

gcc/c-family/ChangeLog:

* c-common.cc (warn_array_bounds): Use OPT_Warray_bounds_
instead of OPT_Warray_bounds.
---
 gcc/builtins.cc |  6 +++---
 gcc/c-family/c-common.cc|  4 ++--
 gcc/common.opt  |  2 +-
 gcc/diagnostic-spec.cc  |  1 -
 gcc/gimple-array-bounds.cc  | 38 -
 gcc/gimple-ssa-warn-restrict.cc |  2 +-
 gcc/testsuite/gcc.dg/pr107787.c | 13 +++
 7 files changed, 39 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr107787.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 4dc1ca672b2..02c4fefa86f 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -696,14 +696,14 @@ c_strlen (tree arg, int only_value, c_strlen_data *data, 
unsigned eltsize)
 {
   /* Suppress multiple warnings for propagated constant strings.  */
   if (only_value != 2
- && !warning_suppressed_p (arg, OPT_Warray_bounds)
- && warning_at (loc, OPT_Warray_bounds,
+ && !warning_suppressed_p (arg, OPT_Warray_bounds_)
+ && warning_at (loc, OPT_Warray_bounds_,
 "offset %qwi outside bounds of constant string",
 eltoff))
{
  if (decl)
inform (DECL_SOURCE_LOCATION (decl), "%qE declared here", decl);
- suppress_warning (arg, OPT_Warray_bounds);
+ suppress_warning (arg, OPT_Warray_bounds_);
}
   return NULL_TREE;
 }
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 6f1f21bc4c1..b0da6886ccf 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -6811,7 +6811,7 @@ fold_offsetof (tree expr, tree type, enum tree_code ctx)
 definition thereof.  */
  if (TREE_CODE (v) == ARRAY_REF
  || TREE_CODE (v) == COMPONENT_REF)
-   warning (OPT_Warray_bounds,
+   warning (OPT_Warray_bounds_,
 "index %E denotes an offset "
 "greater than size of %qT",
 t, TREE_TYPE (TREE_OPERAND (expr, 0)));
@@ -8534,7 +8534,7 @@ convert_vector_to_array_for_subscript (location_t loc,
   if (TREE_CODE (index) == INTEGER_CST)
 if (!tree_fits_uhwi_p (index)
|| maybe_ge (tree_to_uhwi (index), TYPE_VECTOR_SUBPARTS (type)))
-  warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
+  warning_at (loc, OPT_Warray_bounds_, "index value is out of bound");
 
   /* We are building an ARRAY_REF so mark the vector as addressable
  to not run into the gimplifiers premature setting of DECL_GIMPLE_REG_P
diff --git a/gcc/common.opt b/gcc/common.opt
index 26e9d1cc4e7..e475d6e56eb 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -539,7 +539,7 @@ Common Var(warn_aggressive_loop_optimizations) Init(1) 
Warning
 Warn if a loop with constant number of iterations triggers undefined behavior.
 
 Warray-bounds
-Common Var(warn_array_bounds) Warning
+Common Alias(Warray-bounds=, 1, 0) Warning
 Warn if an array is accessed out of bounds.
 
 Warray-bounds=
diff --git a/gcc/diagnostic-spec.cc b/gcc/diagnostic-spec.cc
index aece89619e7..7a03fc493e6 100644
--- a/gcc/diagnostic-spec.cc
+++ b/gcc/diagnostic-spec.cc
@@ -79,7 +79,6 @@ nowarn_spec_t::nowarn_spec_t (opt_code opt)
   break;
 
   /* Access warning group.  */
-case OPT_Warray_bounds:
 case OPT_Warray_bounds_:
 case OPT_Wformat_overflow_:
 case OPT_Wformat_truncation_:
diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 

[PATCH] Make Warray-bounds alias to Warray-bounds= [PR107787]

2022-11-23 Thread Искандер Шакирзянов via Gcc-patches
From 51559e862d191a1f51cc9af11f0d9be5fbc0b43c Mon Sep 17 00:00:00 2001
From: Iskander Shakirzyanov 
Date: Wed, 23 Nov 2022 12:26:47 +
Subject: [PATCH] Make Warray-bounds alias to Warray-bounds= [PR107787]

PR driver/107787

gcc/ChangeLog:

* common.opt (Warray-bounds): Turn into alias to
-Warray-bounds=1.
* builtins.cc (warn_array_bounds): Use OPT_Warray_bounds_
instead of OPT_Warray_bounds.
* diagnostic-spec.cc: Likewise.
* gimple-array-bounds.cc: Likewise.
* gimple-ssa-warn-restrict.cc: Likewise.

gcc/testsuite/ChangeLog:

* pr107787.c: New test.

gcc/c-family/ChangeLog:

* c-common.cc (warn_array_bounds): Use OPT_Warray_bounds_
instead of OPT_Warray_bounds.
---
 gcc/builtins.cc |  6 +++---
 gcc/c-family/c-common.cc|  4 ++--
 gcc/common.opt  |  2 +-
 gcc/diagnostic-spec.cc  |  1 -
 gcc/gimple-array-bounds.cc  | 38 -
 gcc/gimple-ssa-warn-restrict.cc |  2 +-
 gcc/testsuite/gcc.dg/pr107787.c | 13 +++
 7 files changed, 39 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr107787.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 4dc1ca672b2..02c4fefa86f 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -696,14 +696,14 @@ c_strlen (tree arg, int only_value, c_strlen_data *data, 
unsigned eltsize)
 {
   /* Suppress multiple warnings for propagated constant strings.  */
   if (only_value != 2
- && !warning_suppressed_p (arg, OPT_Warray_bounds)
- && warning_at (loc, OPT_Warray_bounds,
+ && !warning_suppressed_p (arg, OPT_Warray_bounds_)
+ && warning_at (loc, OPT_Warray_bounds_,
 "offset %qwi outside bounds of constant string",
 eltoff))
{
  if (decl)
inform (DECL_SOURCE_LOCATION (decl), "%qE declared here", decl);
- suppress_warning (arg, OPT_Warray_bounds);
+ suppress_warning (arg, OPT_Warray_bounds_);
}
   return NULL_TREE;
 }
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 6f1f21bc4c1..b0da6886ccf 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -6811,7 +6811,7 @@ fold_offsetof (tree expr, tree type, enum tree_code ctx)
 definition thereof.  */
  if (TREE_CODE (v) == ARRAY_REF
  || TREE_CODE (v) == COMPONENT_REF)
-   warning (OPT_Warray_bounds,
+   warning (OPT_Warray_bounds_,
 "index %E denotes an offset "
 "greater than size of %qT",
 t, TREE_TYPE (TREE_OPERAND (expr, 0)));
@@ -8534,7 +8534,7 @@ convert_vector_to_array_for_subscript (location_t loc,
   if (TREE_CODE (index) == INTEGER_CST)
 if (!tree_fits_uhwi_p (index)
|| maybe_ge (tree_to_uhwi (index), TYPE_VECTOR_SUBPARTS (type)))
-  warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
+  warning_at (loc, OPT_Warray_bounds_, "index value is out of bound");
 
   /* We are building an ARRAY_REF so mark the vector as addressable
  to not run into the gimplifiers premature setting of DECL_GIMPLE_REG_P
diff --git a/gcc/common.opt b/gcc/common.opt
index 26e9d1cc4e7..e475d6e56eb 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -539,7 +539,7 @@ Common Var(warn_aggressive_loop_optimizations) Init(1) 
Warning
 Warn if a loop with constant number of iterations triggers undefined behavior.
 
 Warray-bounds
-Common Var(warn_array_bounds) Warning
+Common Alias(Warray-bounds=, 1, 0) Warning
 Warn if an array is accessed out of bounds.
 
 Warray-bounds=
diff --git a/gcc/diagnostic-spec.cc b/gcc/diagnostic-spec.cc
index aece89619e7..7a03fc493e6 100644
--- a/gcc/diagnostic-spec.cc
+++ b/gcc/diagnostic-spec.cc
@@ -79,7 +79,6 @@ nowarn_spec_t::nowarn_spec_t (opt_code opt)
   break;
 
   /* Access warning group.  */
-case OPT_Warray_bounds:
 case OPT_Warray_bounds_:
 case OPT_Wformat_overflow_:
 case OPT_Wformat_truncation_:
diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index fbf448e045d..7af85b86f75 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -182,7 +182,7 @@ bool
 array_bounds_checker::check_array_ref (location_t location, tree ref,
   gimple *stmt, bool ignore_off_by_one)
 {
-  if (warning_suppressed_p (ref, OPT_Warray_bounds))
+  if (warning_suppressed_p (ref, OPT_Warray_bounds_))
 /* Return true to have the caller prevent warnings for enclosing
refs.  */
 return true;
@@ -287,7 +287,7 @@ array_bounds_checker::check_array_ref (location_t location, 
tree ref,
 
   /* Empty array.  */
   if (up_bound && tree_int_cst_equal (low_bound, up_bound_p1))
-warned = warning_at (location, 

[PATCH] analyzer: Use __builtin_alloca in gcc.dg/analyzer/call-summaries-2.c

2022-11-23 Thread Rainer Orth
gcc.dg/analyzer/call-summaries-2.c currently FAILs on Solaris:

FAIL: gcc.dg/analyzer/call-summaries-2.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c:468:12:
 warning: implicit declaration of function 'alloca' 
[-Wimplicit-function-declaration]
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c:468:12:
 warning: incompatible implicit decl

alloca is only declared in <alloca.h>, which isn't included indirectly
anywhere.  To avoid this, I switched the test to use __builtin_alloca
instead, following the vast majority of analyzer tests that use alloca.
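
For illustration, a minimal standalone version of the pattern (the function name is made up, and the cast is only there because this sketch is written as C++ rather than C):

```cpp
// No header is needed: __builtin_alloca is provided by the compiler itself
// (GCC and Clang), so no declaration has to be visible at the call site.
int uses_builtin_alloca(int i)
{
  // The memory comes from the current stack frame and is released
  // automatically when the function returns.
  int *p = static_cast<int *>(__builtin_alloca(sizeof(int)));
  *p = i;
  return *p;
}
```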

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.

Ok for trunk?

There are a handful of tests that explicitly include <alloca.h> instead,
which is of course an alternative if preferred.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2022-11-23  Rainer Orth  

gcc/testsuite:
* gcc.dg/analyzer/call-summaries-2.c (uses_alloca): Use
__builtin_alloca instead of alloca.

diff --git a/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c b/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c
--- a/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c
+++ b/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c
@@ -465,7 +465,7 @@ int test_returns_external_result (void)
 
 int uses_alloca (int i)
 {
-  int *p = alloca (sizeof (int));
+  int *p = __builtin_alloca (sizeof (int));
   *p = i;
   return *p;
 }


Re: [PATCH] d: respect --enable-link-mutex configure option

2022-11-23 Thread Martin Liška
On 11/22/22 13:59, Iain Buclaw wrote:
> Excerpts from Martin Liška's message of November 22, 2022 10:41 am:
>> I noticed the option is ignored because @DO_LINK_MUTEX@
>> is not defined in d/Make-lang.in.
>>
>> Tested locally before and after the patch.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
> 
> Fine on my end.  Thanks!

Done, pushed as r13-4264-g52a0ef696e1d78.

Martin

> 
> Iain.



Re: [PATCH] diagnostics: Fix selftest ICE in certain locales [PR107722]

2022-11-23 Thread David Malcolm via Gcc-patches
On Wed, 2022-11-23 at 09:51 +0100, Jakub Jelinek wrote:
> Hi!
> 
> As reported in the PR, since special_fname_builtin () call has been
> introduced, the diagnostics code compares filename against _("<built-in>")
> rather than "<built-in>", which means that if self tests are
> performed
> with the string being translated, one self-test fails.
> The following patch fixes that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux (with normal C
> locale)
> and by the reporter in German, where it fixes the problem.  Ok for
> trunk?

OK

Thanks
Dave



Re: [PATCH] [x86] Fix incorrect implementation for mm_cvtsbh_ss.

2022-11-23 Thread Hongtao Liu via Gcc-patches
On Wed, Nov 23, 2022 at 8:40 PM Jakub Jelinek  wrote:
>
> On Wed, Nov 23, 2022 at 08:28:20PM +0800, liuhongt via Gcc-patches wrote:
> > After supporting real __bf16 type, implementation of mm_cvtsbh_ss went 
> > wrong.
> > The patch supports extendbfsf2/truncsfbf2 with pslld/psrld,
> > and then refined the intrinsic with implicit conversion.
>
> This is not correct.
> While using such code for _mm_cvtsbh_ss is fine if it is documented not to
> raise exceptions and turn a sNaN into a qNaN, it is not fine for HONOR_NANS
> (i.e. when -ffast-math is not on), because a __bf16 -> float conversion
> on sNaN should raise invalid exception and turn it into a qNaN.
> We could have extendbfsf2 expander that would FAIL; if HONOR_NANS and
> emit extendbfsf2_1 otherwise.
I see, I'll use target specific builtin and generate psrld just for
the intrinsic, and drop the expander part.
>
> And the truncsfbf2 case isn't correct IMHO even for -ffast-math.
> float -> __bf16 conversion should be properly rounding depending on the
> current rounding mode, while {,v}psrld will always round toward zero.
>
> Jakub
>


-- 
BR,
Hongtao


Re: [PATCH] [x86] Fix incorrect implementation for mm_cvtsbh_ss.

2022-11-23 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 23, 2022 at 08:28:20PM +0800, liuhongt via Gcc-patches wrote:
> After supporting real __bf16 type, implementation of mm_cvtsbh_ss went wrong.
> The patch supports extendbfsf2/truncsfbf2 with pslld/psrld,
> and then refined the intrinsic with implicit conversion.

This is not correct.
While using such code for _mm_cvtsbh_ss is fine if it is documented not to
raise exceptions and turn a sNaN into a qNaN, it is not fine for HONOR_NANS
(i.e. when -ffast-math is not on), because a __bf16 -> float conversion
on sNaN should raise invalid exception and turn it into a qNaN.
We could have extendbfsf2 expander that would FAIL; if HONOR_NANS and
emit extendbfsf2_1 otherwise.

And the truncsfbf2 case isn't correct IMHO even for -ffast-math.
float -> __bf16 conversion should be properly rounding depending on the
current rounding mode, while {,v}psrld will always round toward zero.

Jakub
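
To make the rounding point above concrete, here is a minimal sketch, not the
GCC implementation. The helper names (float_bits, bf16_by_shift,
bf16_nearest_even) are invented for illustration, and the second conversion
models only round-to-nearest-even, i.e. the default rounding mode, not
whichever mode is current. It assumes float is IEEE single.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its IEEE single bit pattern.
std::uint32_t float_bits(float f)
{
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// What a bare 16-bit logical right shift computes: the low 16 bits are
// dropped, so the magnitude is rounded toward zero, and a NaN whose payload
// lives only in the low bits collapses to infinity.
std::uint16_t bf16_by_shift(std::uint32_t bits)
{
  return static_cast<std::uint16_t>(bits >> 16);
}

// Round to nearest, ties to even (default rounding mode only).
std::uint16_t bf16_nearest_even(std::uint32_t bits)
{
  // NaN: keep the top bits and force the quiet bit so it stays a NaN.
  if ((bits & 0x7fffffffu) > 0x7f800000u)
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  // Add 0x7fff plus the lowest kept bit, then drop the low half.
  std::uint32_t bias = 0x7fffu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>((bits + bias) >> 16);
}
```

For 1.005859375f (bit pattern 0x3f80c000) the shift yields 0x3f80 while
round-to-nearest-even yields 0x3f81, which is the kind of difference the
review comment is about.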



[PATCH] [x86] Fix incorrect implementation for mm_cvtsbh_ss.

2022-11-23 Thread liuhongt via Gcc-patches
After supporting real __bf16 type, implementation of mm_cvtsbh_ss went wrong.
The patch supports extendbfsf2/truncsfbf2 with pslld/psrld,
and then refined the intrinsic with implicit conversion.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/107748
* config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Refined.
* config/i386/i386.md (extendbfsf2): New define_insn.
(truncsfbf2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/extendbfsf.c: New test.
* gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Adjust testcase.
---
 gcc/config/i386/avx512bf16intrin.h|  4 +--
 gcc/config/i386/i386.md   | 33 ++-
 .../gcc.target/i386/avx512bf16-cvtsbh2ss-1.c  |  3 +-
 gcc/testsuite/gcc.target/i386/extendbfsf.c| 16 +
 4 files changed, 50 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/extendbfsf.c

diff --git a/gcc/config/i386/avx512bf16intrin.h 
b/gcc/config/i386/avx512bf16intrin.h
index ea1d0125b3f..4a071bcd75a 100644
--- a/gcc/config/i386/avx512bf16intrin.h
+++ b/gcc/config/i386/avx512bf16intrin.h
@@ -46,9 +46,7 @@ extern __inline float
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cvtsbh_ss (__bf16 __A)
 {
-  union{ float a; unsigned int b;} __tmp;
-  __tmp.b = ((unsigned int)(__A)) << 16;
-  return __tmp.a;
+  return __A;
 }
 
 /* vcvtne2ps2bf16 */
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 01faa911b77..f5215596d44 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4961,6 +4961,21 @@ (define_insn "*extendhf<mode>2"
(set_attr "prefix" "evex")
(set_attr "mode" "<MODE>")])
 
+(define_insn "extendbfsf2"
+  [(set (match_operand:SF 0 "register_operand"   "=x,Yw")
+   (float_extend:SF
+ (match_operand:BF 1 "register_operand" " 0,Yw")))]
+ "TARGET_SSE2"
+ "@
+  pslld\t{$16, %0|%0, 16}
+  vpslld\t{$16, %1, %0|%0, %1, 16}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix" "orig,vex")
+   (set_attr "mode" "TI")
+   (set_attr "memory" "none")])
 
 (define_expand "extend<mode>xf2"
   [(set (match_operand:XF 0 "nonimmediate_operand")
@@ -5177,7 +5192,23 @@ (define_insn "*trunc<mode>hf2"
   [(set_attr "type" "ssecvt")
(set_attr "prefix" "evex")
(set_attr "mode" "HF")])
-
+
+(define_insn "truncsfbf2"
+  [(set (match_operand:BF 0 "register_operand"   "=x,Yw")
+   (float_truncate:BF
+ (match_operand:SF 1 "register_operand" " 0,Yw")))]
+ "TARGET_SSE2"
+ "@
+  psrld\t{$16, %0|%0, 16}
+  vpsrld\t{$16, %1, %0|%0, %1, 16}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix" "orig,vex")
+   (set_attr "mode" "TI")
+   (set_attr "memory" "none")])
+
 ;; Signed conversion to DImode.
 
 (define_expand "fix_truncxfdi2"
diff --git a/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c 
b/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
index 8e929e6f159..edf30b583b9 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
@@ -1,8 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512bf16 -O2" } */
 /* { dg-additional-options "-fno-PIE -mfpmath=sse" { target ia32 } } */
-/* { dg-final { scan-assembler-times "sall\[ \\t\]+\[^\{\n\]*16" 1 } } */
-/* { dg-final { scan-assembler-times "movl" 1 } } */
+/* { dg-final { scan-assembler-times "pslld" 1 } } */
 
 #include <immintrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/extendbfsf.c 
b/gcc/testsuite/gcc.target/i386/extendbfsf.c
new file mode 100644
index 000..f1b4c218742
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/extendbfsf.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-msse2 -O2" } */
+/* { dg-final { scan-assembler-times "pslld" 1 } } */
+/* { dg-final { scan-assembler-times "psrld" 1 } } */
+
+float
+extendsfbf (__bf16 a)
+{
+  return a;
+}
+
+__bf16
+truncsfbf (float a)
+{
+  return a;
+}
-- 
2.27.0



Re: *PING* - [wwwdocs] projects/gomp: TR11 + GCC13 update

2022-11-23 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 23, 2022 at 10:34:39AM +0100, Tobias Burnus wrote:
> On 11.11.22 16:13, Tobias Burnus wrote:
> > This patch adds TR11 to the history of OpenMP releases – and it does
> > an update of the implementation status.
> > 
> > OK?

LGTM, thanks.

Jakub



[PATCH] c-family: Incremental fix for -Wsign-compare BIT_NOT_EXPR handling [PR107465]

2022-11-23 Thread Jakub Jelinek via Gcc-patches
Hi!

There can be too many extensions and it seems I didn't get everything right in
the previously posted patch.

The following incremental patch ought to fix that.
The code can deal with quite a few sign/zero extensions at various spots
and it is important to deal with all of them right.
On the argument that contains BIT_NOT_EXPR we have:
MSB bits#4 bits#3 BIT_NOT_EXPR bits#2 bits#1 LSB
where bits#1 is one or more bits (TYPE_PRECISION (TREE_TYPE (arg0))
at the end of the function) we don't know anything about, for the purposes
of this warning it is VARYING that is inverted with BIT_NOT_EXPR to some other
VARYING bits;
bits#2 is one or more bits (TYPE_PRECISION (TREE_TYPE (op0)) -
TYPE_PRECISION (TREE_TYPE (arg0)) at the end of the function)
which are known to be 0 before the BIT_NOT_EXPR and 1 after it.
bits#3 is zero or more bits from the TYPE_PRECISION (TREE_TYPE (op0))
at the end of the function to TYPE_PRECISION (TREE_TYPE (op0)) at the start
of the function, which are either zero extension or sign extension.
And bits#4 is zero or more bits from the TYPE_PRECISION (TREE_TYPE (op0))
at the start of the function to TYPE_PRECISION (result_type), which
again can be zero or sign extension.

Now, vanilla trunk as well as the previously posted patch mishandles the
case where bits#3 are sign extended (as bits#2 are known to be all set,
that means bits#3 are all set too) but bits#4 are zero extended and are
thus all 0.

The patch fixes it by tracking the lowest bit which is known to be clear
above the known to be set bits (if any, otherwise it is precision of
result_type).
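
As a rough sketch of the mask computation described above (assumptions: the
function and parameter names are invented here, a 64-bit integer stands in
for HOST_WIDE_INT, and the real code in warn_for_sign_compare derives the two
bit positions from the tree types):

```cpp
#include <cstdint>

// narrow_prec: precision of the zero-extended operand of ~ (bits#1).
// bits0: index of the lowest bit known to be clear above the set bits.
// Returns the bits that are guaranteed to be set after the BIT_NOT_EXPR.
std::uint64_t known_set_mask(unsigned narrow_prec, unsigned bits0)
{
  std::uint64_t all_ones = ~std::uint64_t{0};
  std::uint64_t mask = narrow_prec < 64 ? all_ones << narrow_prec : 0;
  if (bits0 < 64)
    mask &= ~(all_ones << bits0);
  return mask;
}

// A comparison against CONSTANT is suspicious when the constant lacks
// some bit that the other operand is guaranteed to have set.
bool comparison_is_suspicious(std::uint64_t constant,
                              unsigned narrow_prec, unsigned bits0)
{
  std::uint64_t mask = known_set_mask(narrow_prec, bits0);
  return (mask & constant) != mask;
}
```

With an 8-bit operand inverted and the innermost zero extension at bit 16,
only bits 8..15 are known to be set, so a constant such as 0xff12 is not
flagged while 0x00ff is.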

Ok for trunk if it passes bootstrap/regtest?

2022-11-23  Jakub Jelinek  

PR c/107465
* c-warn.cc (warn_for_sign_compare): Don't warn for unset bits
above innermost zero extension of BIT_NOT_EXPR result.

* c-c++-common/Wsign-compare-2.c (f18): New test.

--- gcc/c-family/c-warn.cc.jj   2022-11-23 10:04:53.0 +0100
+++ gcc/c-family/c-warn.cc  2022-11-23 11:19:38.928113842 +0100
@@ -2344,13 +2344,33 @@ warn_for_sign_compare (location_t locati
  have all bits set that are set in the ~ operand when it is
  extended.  */
 
+  /* bits0 is the bit index of op0 extended to result_type, which will
+ be always 0 and so all bits above it.  If there is a BIT_NOT_EXPR
+ in that operand possibly sign or zero extended to op0 and then
+ possibly further sign or zero extended to result_type, bits0 will
+ be the precision of result type if all the extensions involved
+ if any are sign extensions, and will be the place of the innermost
+ zero extension otherwise.  We warn only if BIT_NOT_EXPR's operand is
+ zero extended from some even smaller precision, in that case after
+ BIT_NOT_EXPR some bits below bits0 will be guaranteed to be set.
+ Similarly for bits1.  */
+  int bits0 = TYPE_PRECISION (result_type);
+  if (TYPE_UNSIGNED (TREE_TYPE (op0)))
+bits0 = TYPE_PRECISION (TREE_TYPE (op0));
   tree arg0 = c_common_get_narrower (op0, &unsignedp0);
   if (TYPE_PRECISION (TREE_TYPE (arg0)) == TYPE_PRECISION (TREE_TYPE (op0)))
 unsignedp0 = TYPE_UNSIGNED (TREE_TYPE (op0));
+  else if (unsignedp0)
+bits0 = TYPE_PRECISION (TREE_TYPE (arg0));
   op0 = arg0;
+  int bits1 = TYPE_PRECISION (result_type);
+  if (TYPE_UNSIGNED (TREE_TYPE (op1)))
+bits1 = TYPE_PRECISION (TREE_TYPE (op1));
   tree arg1 = c_common_get_narrower (op1, &unsignedp1);
   if (TYPE_PRECISION (TREE_TYPE (arg1)) == TYPE_PRECISION (TREE_TYPE (op1)))
 unsignedp1 = TYPE_UNSIGNED (TREE_TYPE (op1));
+  else if (unsignedp1)
+bits1 = TYPE_PRECISION (TREE_TYPE (arg1));
   op1 = arg1;
 
   if ((TREE_CODE (op0) == BIT_NOT_EXPR)
@@ -2360,6 +2380,7 @@ warn_for_sign_compare (location_t locati
{
  std::swap (op0, op1);
  std::swap (unsignedp0, unsignedp1);
+ std::swap (bits0, bits1);
}
 
   int unsignedp;
@@ -2378,16 +2399,8 @@ warn_for_sign_compare (location_t locati
  && bits < HOST_BITS_PER_WIDE_INT)
{
  HOST_WIDE_INT mask = HOST_WIDE_INT_M1U << bits;
- if (unsignedp0)
-   {
- bits = TYPE_PRECISION (TREE_TYPE (op0));
- if (bits < TYPE_PRECISION (result_type)
- && bits < HOST_BITS_PER_WIDE_INT)
-   mask &= ~(HOST_WIDE_INT_M1U << bits);
-   }
- bits = TYPE_PRECISION (result_type);
- if (bits < HOST_BITS_PER_WIDE_INT)
-   mask &= ~(HOST_WIDE_INT_M1U << bits);
+ if (bits0 < HOST_BITS_PER_WIDE_INT)
+   mask &= ~(HOST_WIDE_INT_M1U << bits0);
  if ((mask & constant) != mask)
{
  if (constant == 0)
@@ -2405,24 +2418,7 @@ warn_for_sign_compare (location_t locati
< TYPE_PRECISION (TREE_TYPE (op0)))
   && unsignedp
   && unsignedp1
-  /* 

[committed] libstdc++: Fix unsafe use of dirent::d_name [PR107814]

2022-11-23 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and sparc-sun-solaris2.11. Pushed to trunk.
I'll backport to the branches too.

-- >8 --

Copy the fix for PR 104731 to the equivalent experimental::filesystem
test.

libstdc++-v3/ChangeLog:

PR libstdc++/107814
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Use a static buffer with space after it.
---
 .../filesystem/iterators/error_reporting.cc   | 35 ---
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/iterators/error_reporting.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/iterators/error_reporting.cc
index f005b7d5293..aabed14679c 100644
--- 
a/libstdc++-v3/testsuite/experimental/filesystem/iterators/error_reporting.cc
+++ 
b/libstdc++-v3/testsuite/experimental/filesystem/iterators/error_reporting.cc
@@ -29,35 +29,44 @@
 
 int choice;
 
-struct dirent global_dirent;
-
 extern "C" struct dirent* readdir(DIR*)
 {
+  // On some targets dirent::d_name is very small, but the OS allocates
+  // a trailing char array after the dirent struct. Emulate that here.
+  union State
+  {
+struct dirent d;
+char buf[sizeof(struct dirent) + 16] = {};
+  };
+
+  static State state;
+  char* d_name = state.buf + offsetof(struct dirent, d_name);
+
   switch (choice)
   {
   case 1:
-global_dirent.d_ino = 999;
+state.d.d_ino = 999;
 #if defined _GLIBCXX_HAVE_STRUCT_DIRENT_D_TYPE && defined DT_REG
-global_dirent.d_type = DT_REG;
+state.d.d_type = DT_REG;
 #endif
-global_dirent.d_reclen = 0;
-std::char_traits<char>::copy(global_dirent.d_name, "file", 5);
+state.d.d_reclen = 0;
+std::char_traits<char>::copy(d_name, "file", 5);
 choice = 0;
-return &global_dirent;
+return &state.d;
   case 2:
-global_dirent.d_ino = 111;
+state.d.d_ino = 111;
 #if defined _GLIBCXX_HAVE_STRUCT_DIRENT_D_TYPE && defined DT_DIR
-global_dirent.d_type = DT_DIR;
+state.d.d_type = DT_DIR;
 #endif
-global_dirent.d_reclen = 60;
-std::char_traits<char>::copy(global_dirent.d_name, "subdir", 7);
+state.d.d_reclen = 60;
+std::char_traits<char>::copy(d_name, "subdir", 7);
 choice = 1;
-return &global_dirent;
+return &state.d;
   default:
 errno = EIO;
 return nullptr;
   }
-  return &global_dirent;
+  return &state.d;
 }
 
 void
-- 
2.38.1



[PATCH] doc: -Wdelete-non-virtual-dtor supersedes -Wnon-virtual-dtor

2022-11-23 Thread Jonathan Wakely via Gcc-patches
The existence of this option makes users think they need it (even though
it's in neither -Wall nor -Wextra). Document that there's a better
option (since 2011).

OK for trunk?

-- >8 --

The newer -Wdelete-non-virtual-dtor has no false positives and fewer
bugs. There is very little reason to use -Wnon-virtual-dtor instead.
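
A small illustration of the difference, as a sketch with invented class
names. Per the documentation, -Wnon-virtual-dtor warns at the definition of
any class with virtual functions and an accessible non-virtual destructor,
even when no object is ever deleted through a base pointer, as here;
-Wdelete-non-virtual-dtor is about actual delete expressions, and this code
contains none.

```cpp
// Polymorphic base with an implicit, public, non-virtual destructor.
// This is what -Wnon-virtual-dtor is documented to complain about.
struct Shape
{
  virtual double area() const = 0;
};

struct Square : Shape
{
  explicit Square(double s) : side(s) {}
  double area() const override { return side * side; }
  double side;
};

// Objects are only used through references with automatic storage, so
// there is no unsafe "delete base_pointer" anywhere.
double area_of(const Shape& s) { return s.area(); }

double demo()
{
  Square sq(3.0);
  return area_of(sq);
}
```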

gcc/ChangeLog:

* doc/invoke.texi (C++ Dialect Options): Recommend using
-Wdelete-non-virtual-dtor instead of -Wnon-virtual-dtor.
---
 gcc/doc/invoke.texi | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 330da6eb5d4..4899bd1ea4c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3986,6 +3986,9 @@ destructor itself or in an accessible polymorphic base 
class, in which
 case it is possible but unsafe to delete an instance of a derived
 class through a pointer to the class itself or base class.  This
 warning is automatically enabled if @option{-Weffc++} is specified.
+The @option{-Wdelete-non-virtual-dtor} option (enabled by @option{-Wall})
+should be preferred because it warns about the unsafe cases without false
+positives.
 
 @item -Wregister @r{(C++ and Objective-C++ only)}
 @opindex Wregister
-- 
2.38.1



[PATCH] lto: fix usage of timer in materialize_cgraph

2022-11-23 Thread Martin Liška
Pretty obvious change.

Ready to be installed?
Thanks,
Martin

PR lto/107829

gcc/lto/ChangeLog:

* lto.cc (materialize_cgraph): Call timevar_push before
materialization starts.
---
 gcc/lto/lto.cc | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/gcc/lto/lto.cc b/gcc/lto/lto.cc
index 3a9147b01b5..3265a1d07bc 100644
--- a/gcc/lto/lto.cc
+++ b/gcc/lto/lto.cc
@@ -137,6 +137,12 @@ materialize_cgraph (void)
 fprintf (stderr,
 flag_wpa ? "Materializing decls:" : "Reading function bodies:");
 
+  /* Start the appropriate timer depending on the mode that we are
+ operating in.  */
+  lto_timer = (flag_wpa) ? TV_WHOPR_WPA
+ : (flag_ltrans) ? TV_WHOPR_LTRANS
+ : TV_LTO;
+  timevar_push (lto_timer);
 
   FOR_EACH_FUNCTION (node)
 {
@@ -147,14 +153,6 @@ materialize_cgraph (void)
}
 }
 
-
-  /* Start the appropriate timer depending on the mode that we are
- operating in.  */
-  lto_timer = (flag_wpa) ? TV_WHOPR_WPA
- : (flag_ltrans) ? TV_WHOPR_LTRANS
- : TV_LTO;
-  timevar_push (lto_timer);
-
   current_function_decl = NULL;
   set_cfun (NULL);
 
-- 
2.38.1
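
The ordering issue the patch addresses can be sketched with a toy timer
stack. This is an assumption-laden model, not GCC's timevar code: TimerStack
and materialize are hypothetical names, and the model simply labels each
piece of work with whichever timer is on top of the stack when it runs.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for timevar_push/timevar_pop.
struct TimerStack
{
  std::vector<std::string> active;
  std::vector<std::string> log;   // entries look like "<timer>:<work>"

  void push(const std::string& name) { active.push_back(name); }
  void pop() { active.pop_back(); }
  void work(const std::string& what)
  {
    const std::string owner = active.empty() ? "untimed" : active.back();
    log.push_back(owner + ":" + what);
  }
};

// push_first == true mirrors the patched order (timer started before the
// function bodies are read); false mirrors the old order.
std::vector<std::string> materialize(bool push_first)
{
  TimerStack t;
  if (push_first)
    t.push("TV_LTO");
  t.work("read function bodies");
  if (!push_first)
    t.push("TV_LTO");
  t.work("finish");
  t.pop();
  return t.log;
}
```

In this model, the work done before the push is not attributed to the LTO
timer, which is why the push has to come first.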



*PING* - [wwwdocs] projects/gomp: TR11 + GCC13 update

2022-11-23 Thread Tobias Burnus

On 11.11.22 16:13, Tobias Burnus wrote:

This patch adds TR11 to the history of OpenMP releases – and it does
an update of the implementation status.

OK?

Tobias

PS: The implementation-status changes were lying around in that file
for a while. I think both the GCC 13 release notes and this file need
some updates for more recent changes. Nonetheless, while incomplete,
the changes themselves should be fine.



Re: [PATCH v1] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-11-23 Thread chenglulu



On 2022/11/23 17:25, Xi Ruoyao wrote:
> On Wed, 2022-11-23 at 17:14 +0800, chenglulu wrote:
> > On 2022/11/23 16:59, Xi Ruoyao wrote:
> > > On Wed, 2022-11-23 at 14:49 +0800, Lulu Cheng wrote:
> > > >  'A' Print a _DB suffix if the memory model requires a
> > > > release.
> > > >  'b' Print the address of a memory operand, without offset.
> > > > +   'c'  print an integer.
> > > Nit:
> > >    'c' Print an integer.
> > >
> > > to match the format of other entries.
> > >
> > > >  'C' Print the integer branch condition for comparison OP.
> > > >  'd' Print CONST_INT OP in decimal.
> > > >  'F' Print the FPU branch condition for comparison OP.
> > > And I'd consider this a new feature and delay it to GCC 14: we never
> > > claimed we supported 'c' and it has not worked since the day one we
> > > merged LoongArch port.  Is there any emergency reason to support 'c'
> > > in
> > > GCC 13?
> > >
> > I don't think this is a new feature.
> >
> > There is a description of '%c' in section 17.5 of gccint.pdf, which I
> > understand is a public descriptor,
> >
> > but right now loongarch doesn't support it.
>
> I'm not sure if gccint is designed for normal users to read, but since
> we lack a documentation about those descriptors in GCC user manual, I
> guess many users will indeed use gccint as a reference ...
>
> Ok to me.  But regarding the test case I suggest to keep those "%a"
> tests there (so we won't inadvertently cause a regression in case some
> user code already uses it for printing an integer).  Unless we
> deliberately want to stop people from using "%a" for this purpose.


Ok thanks! I'll be making changes to the patch.

I think I'll have to go and supplement the documentation.



Re: [PATCH v1] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-11-23 Thread Xi Ruoyao via Gcc-patches
On Wed, 2022-11-23 at 17:14 +0800, chenglulu wrote:
> 
> On 2022/11/23 16:59, Xi Ruoyao wrote:
> > On Wed, 2022-11-23 at 14:49 +0800, Lulu Cheng wrote:
> > >  'A' Print a _DB suffix if the memory model requires a
> > > release.
> > >  'b' Print the address of a memory operand, without offset.
> > > +   'c'  print an integer.
> > Nit:
> >    'c' Print an integer.
> > 
> > to match the format of other entries.
> > 
> > >  'C' Print the integer branch condition for comparison OP.
> > >  'd' Print CONST_INT OP in decimal.
> > >  'F' Print the FPU branch condition for comparison OP.
> > And I'd consider this a new feature and delay it to GCC 14: we never
> > claimed we supported 'c' and it has not worked since the day one we
> > merged LoongArch port.  Is there any emergency reason to support 'c'
> > in
> > GCC 13?
> > 
> I don't think this is a new feature.
> 
> There is a description of '%c' in section 17.5 of gccint.pdf, which I 
> understand is a public descriptor,
> 
> but right now loongarch doesn't support it.

I'm not sure if gccint is designed for normal users to read, but since
we lack a documentation about those descriptors in GCC user manual, I
guess many users will indeed use gccint as a reference ...

Ok to me.  But regarding the test case I suggest to keep those "%a"
tests there (so we won't inadvertently cause a regression in case some
user code already uses it for printing an integer).  Unless we
deliberately want to stop people from using "%a" for this purpose.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] libstdc++: Fix libstdc++ build on some targets [PR107811]

2022-11-23 Thread Jonathan Wakely via Gcc-patches
On Wed, 23 Nov 2022 at 08:55, Jakub Jelinek wrote:
>
> Hi!
>
> fast_float library relies on size_t being 32-bit or larger and float/double
> being IEEE single/double.  Otherwise we only use strtod/strtof.
> In 3 spots I've used fast_float namespace stuff unconditionally in one
> function, which breaks the build if fast_float is disabled.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK, thanks.



Re: [PATCH v1] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-11-23 Thread chenglulu



On 2022/11/23 16:59, Xi Ruoyao wrote:
> On Wed, 2022-11-23 at 14:49 +0800, Lulu Cheng wrote:
> >     'A' Print a _DB suffix if the memory model requires a release.
> >     'b' Print the address of a memory operand, without offset.
> > +   'c'  print an integer.
>
> Nit:
>   'c' Print an integer.
>
> to match the format of other entries.
>
> >     'C' Print the integer branch condition for comparison OP.
> >     'd' Print CONST_INT OP in decimal.
> >     'F' Print the FPU branch condition for comparison OP.
>
> And I'd consider this a new feature and delay it to GCC 14: we never
> claimed we supported 'c' and it has not worked since the day one we
> merged LoongArch port.  Is there any emergency reason to support 'c' in
> GCC 13?

I don't think this is a new feature.

There is a description of '%c' in section 17.5 of gccint.pdf, which I
understand is a public descriptor,

but right now loongarch doesn't support it.




[PATCH] c: Fix compile time hog in c_genericize [PR107127]

2022-11-23 Thread Jakub Jelinek via Gcc-patches
Hi!

The complex multiplications result in deeply nested set of many SAVE_EXPRs,
which takes even on fast machines over 5 minutes to walk.
This patch fixes that by using walk_tree_without_duplicates where it is
instant.
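
The effect can be sketched outside of GCC with a toy graph in which every
node refers to the next one twice, much like a SAVE_EXPR that is used in two
places. The types and function names below are hypothetical; they are not
the tree walker itself, only the idea of remembering visited nodes.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Node
{
  const Node* left = nullptr;
  const Node* right = nullptr;
};

// Chain of DEPTH nodes where each node points at its successor twice.
std::vector<Node> make_shared_chain(std::size_t depth)
{
  std::vector<Node> nodes(depth);
  for (std::size_t i = 0; i + 1 < depth; ++i)
    nodes[i].left = nodes[i].right = &nodes[i + 1];
  return nodes;
}

// Plain walk: shared subtrees are revisited, so the number of visits
// doubles with every level.
std::size_t visit_all(const Node* n)
{
  if (!n)
    return 0;
  return 1 + visit_all(n->left) + visit_all(n->right);
}

// Walk that remembers visited nodes: each node is processed once.
std::size_t visit_once(const Node* n, std::unordered_set<const Node*>& seen)
{
  if (!n || !seen.insert(n).second)
    return 0;
  return 1 + visit_once(n->left, seen) + visit_once(n->right, seen);
}
```

For a chain of 20 nodes the plain walk makes 2^20 - 1 visits, while the
deduplicating walk makes 20.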

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-11-23  Andrew Pinski  
Jakub Jelinek  

PR c/107127
* c-gimplify.cc (c_genericize): Use walk_tree_without_duplicates
instead of walk_tree for c_genericize_control_r.

* gcc.dg/pr107127.c: New test.

--- gcc/c-family/c-gimplify.cc.jj   2022-08-12 13:39:59.229218070 +0200
+++ gcc/c-family/c-gimplify.cc  2022-11-22 11:48:39.977263700 +0100
@@ -572,8 +572,8 @@ c_genericize (tree fndecl)
   bc_state_t save_state;
   push_cfun (DECL_STRUCT_FUNCTION (fndecl));
   save_bc_state (&save_state);
-  walk_tree (&DECL_SAVED_TREE (fndecl), c_genericize_control_r,
-NULL, NULL);
+  walk_tree_without_duplicates (&DECL_SAVED_TREE (fndecl),
+   c_genericize_control_r, NULL);
   restore_bc_state (&save_state);
   pop_cfun ();
 }
--- gcc/testsuite/gcc.dg/pr107127.c.jj  2022-11-22 11:56:20.798454526 +0100
+++ gcc/testsuite/gcc.dg/pr107127.c 2022-11-22 11:56:06.224669767 +0100
@@ -0,0 +1,12 @@
+/* PR c/107127 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+int *v;
+
+_Complex double
+foo (_Complex double a, double b, double c)
+{
+  return v[0] / c * (0 - 0 / a + 699.0 + 7.05 - 286.0 - +-4.65 + 1.57 + 0) 
* 0.1 - 3.28 + 4.22 + 0.1)) * b + 5.06)
+* 1.23 * 8.0 * 12.0 * 16.0 * 2.0 * 2.0 * 0.25 * 0.125 * 18.2 * 
-15.25 * 0.0001
+* 42.0 * 0.012 - 8.45 + 0 + 88.0 + 6.96 + 867.0 + 9.10 - 7.04 
* -1.0);
+}

Jakub



Re: [PATCH v1] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-11-23 Thread Xi Ruoyao via Gcc-patches
On Wed, 2022-11-23 at 14:49 +0800, Lulu Cheng wrote:
>     'A' Print a _DB suffix if the memory model requires a release.
>     'b' Print the address of a memory operand, without offset.
> +   'c'  print an integer.

Nit:
  'c' Print an integer.

to match the format of other entries.

>     'C' Print the integer branch condition for comparison OP.
>     'd' Print CONST_INT OP in decimal.
>     'F' Print the FPU branch condition for comparison OP.

And I'd consider this a new feature and delay it to GCC 14: we never
claimed we supported 'c' and it has not worked since the day one we
merged LoongArch port.  Is there any emergency reason to support 'c' in
GCC 13?

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] libstdc++: Fix libstdc++ build on some targets [PR107811]

2022-11-23 Thread Jakub Jelinek via Gcc-patches
Hi!

fast_float library relies on size_t being 32-bit or larger and float/double
being IEEE single/double.  Otherwise we only use strtod/strtof.
In 3 spots I've used fast_float namespace stuff unconditionally in one
function, which breaks the build if fast_float is disabled.
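
For reference, the fallback constants used when fast_float is unavailable
can be cross-checked against std::numeric_limits. This is a standalone
sketch with variable templates invented for the purpose; it assumes float
and double are IEEE single and double.

```cpp
#include <limits>
#include <type_traits>

// Explicitly stored mantissa bits and exponent field width.
template<typename T>
constexpr int mantissa_bits = std::is_same_v<T, float> ? 23 : 52;

template<typename T>
constexpr int exponent_bits = std::is_same_v<T, float> ? 8 : 11;

// Same formula as in the patched function.
template<typename T>
constexpr int exponent_bias = (1 << (exponent_bits<T> - 1)) - 1;

// numeric_limits::digits counts the implicit leading bit, hence the - 1.
static_assert(mantissa_bits<float> == std::numeric_limits<float>::digits - 1);
static_assert(mantissa_bits<double> == std::numeric_limits<double>::digits - 1);
static_assert(exponent_bias<float> == std::numeric_limits<float>::max_exponent - 1);
static_assert(exponent_bias<double> == std::numeric_limits<double>::max_exponent - 1);
```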

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2022-11-23  Jakub Jelinek  

PR libstdc++/107811
* src/c++17/floating_from_chars.cc (__floating_from_chars_hex): Guard
fast_float uses with #if USE_LIB_FAST_FLOAT and for mantissa_bits and
exponent_bits provide a fallback.

--- libstdc++-v3/src/c++17/floating_from_chars.cc.jj2022-11-08 
09:54:37.533397224 +0100
+++ libstdc++-v3/src/c++17/floating_from_chars.cc   2022-11-22 
14:03:10.365474110 +0100
@@ -783,11 +783,16 @@ namespace
 using uint_t = conditional_t<is_same_v<T, float>, uint32_t,
 conditional_t<is_same_v<T, double>, uint64_t,
   uint16_t>>;
+#if USE_LIB_FAST_FLOAT
 constexpr int mantissa_bits
   = fast_float::binary_format<T>::mantissa_explicit_bits();
 constexpr int exponent_bits
   = is_same_v<T, double> ? 11
: is_same_v<T, fast_float::floating_type_float16_t> ? 5 : 8;
+#else
+constexpr int mantissa_bits = is_same_v<T, float> ? 23 : 52;
+constexpr int exponent_bits = is_same_v<T, float> ? 8 : 11;
+#endif
 constexpr int exponent_bias = (1 << (exponent_bits - 1)) - 1;
 
 __glibcxx_requires_valid_range(first, last);
@@ -945,8 +950,11 @@ namespace
else if (mantissa_idx >= -4)
  {
if constexpr (is_same_v<T, float>
+#if USE_LIB_FAST_FLOAT
  || is_same_v<T,
-  fast_float::floating_type_bfloat16_t>)
+  fast_float::floating_type_bfloat16_t>
+#endif
+)
  {
__glibcxx_assert(mantissa_idx == -1);
mantissa |= hexit >> 1;
@@ -1130,6 +1138,7 @@ namespace
   }
 if constexpr (is_same_v<T, float> || is_same_v<T, double>)
   memcpy(&value, &result, sizeof(result));
+#if USE_LIB_FAST_FLOAT
 else if constexpr (is_same_v<T, fast_float::floating_type_bfloat16_t>)
   {
uint32_t res = uint32_t{result} << 16;
@@ -1156,6 +1165,7 @@ namespace
 | ((uint32_t{result} & 0x8000) << 16));
memcpy(value.x, &res, sizeof(res));
   }
+#endif
 
 return {first, errc{}};
   }

Jakub



[PATCH] diagnostics: Fix selftest ICE in certain locales [PR107722]

2022-11-23 Thread Jakub Jelinek via Gcc-patches
Hi!

As reported in the PR, since special_fname_builtin () call has been
introduced, the diagnostics code compares filename against _("<built-in>")
rather than "<built-in>", which means that if self tests are performed
with the string being translated, one self-test fails.
The following patch fixes that.
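
The failure mode can be reproduced in miniature as follows. This is only a
sketch: translate_de is a made-up stand-in for a translated _() lookup (the
"translation" string is invented), and location_text is a simplified model
rather than the real diagnostic code.

```cpp
#include <string>

// Hypothetical translators: identity for the C locale, and an invented
// mapping standing in for a real message catalog.
std::string translate_c(const std::string& s) { return s; }
std::string translate_de(const std::string& s)
{
  return s == "<built-in>" ? "<eingebaut>" : s;
}

using Translator = std::string (*)(const std::string&);

// Model of special_fname_builtin (): the name depends on the locale.
std::string builtin_fname(Translator tr) { return tr("<built-in>"); }

// Model of the location prefix: the special file name gets no line/column.
std::string location_text(const std::string& file, int line, int col,
                          Translator tr)
{
  if (file == builtin_fname(tr))
    return file + ":";
  return file + ":" + std::to_string(line) + ":" + std::to_string(col) + ":";
}
```

A test that hard-codes "<built-in>" passes with translate_c but gets
"<built-in>:42:10:" with translate_de, whereas a test that builds both the
input and the expected text from builtin_fname passes under either one.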

Bootstrapped/regtested on x86_64-linux and i686-linux (with normal C locale)
and by the reporter in German, where it fixes the problem.  Ok for trunk?

2022-11-22  Jakub Jelinek  

PR bootstrap/107722
* diagnostic.cc (test_diagnostic_get_location_text): Test
special_fname_builtin () rather than "<built-in>" and expect
special_fname_builtin () concatenated with ":" for it.

--- gcc/diagnostic.cc.jj2022-11-15 22:57:18.215211107 +0100
+++ gcc/diagnostic.cc   2022-11-22 12:36:37.197764164 +0100
@@ -2593,7 +2593,10 @@ test_diagnostic_get_location_text ()
   const char *old_progname = progname;
   progname = "PROGNAME";
   assert_location_text ("PROGNAME:", NULL, 0, 0, true);
-  assert_location_text ("<built-in>:", "<built-in>", 42, 10, true);
+  char *built_in_colon = concat (special_fname_builtin (), ":", (char *) 0);
+  assert_location_text (built_in_colon, special_fname_builtin (),
+   42, 10, true);
+  free (built_in_colon);
   assert_location_text ("foo.c:42:10:", "foo.c", 42, 10, true);
   assert_location_text ("foo.c:42:9:", "foo.c", 42, 10, true, 0);
   assert_location_text ("foo.c:42:1010:", "foo.c", 42, 10, true, 1001);

Jakub