I'm going to commit the attached two patches. Removed the redundant
changes in test cases and added constructor initialization of
fold_all_stmts.
Regards
Robin
--
gcc/ChangeLog:
2019-08-21 Robin Dapp
* gimple-loop-versioning.cc (loop_versioning::record_address_fragment
> So - which case is it? IIRC we want to handle small signed
> constants but the code can end up unsigned. For the
> above we could write (unsigned long)((int)a + 1 - 1) and thus
> sign-extend? Or even avoid this if we know the range.
> That is, it becomes the first case again (operation
> I'm still a bit worried about the overlap between the expanded
> noce_convert_multiple_sets and cond_move_process_if_block (5/9).
> It seems like we're making noce_convert_multiple_set handle most of
> the conditional move cases that cond_move_process_if_block can handle.
> But like you say,
> Looks like a nice optimisation, but could we just test whether the
> destination of a set isn't live on exit from the then block? I think
> we could do that on the fly during the main noce_convert_multiple_sets
> loop.
I included this locally along with the rest of the remarks. Any comments
on
> So - what are you really after? (sorry if I don't remember, testcase(s)
> are missing
> from this patch)
>
> To me it seems that 1) loses information if A + CST was done in a signed type
> and we know that overflow doesn't happen because of that. For the reverse
> transformation we don't. Btw,
> +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST */
> Do you want to handle MINUS? What about POINTER_PLUS_EXPR?
When I last attempted this patch I had the MINUS still in it but got
confused easily by needing to think of too many cases at once leading to
lots of stupid mistakes. Hence, I left it
> I have become rather wary of INTEGRAL_TYPE_P recently because it
> includes enum types, which with -fstrict-enum can have a surprising
> behavior. If I have
> enum E { A, B, C };
> and e has type enum E, with -fstrict-enum, do your tests manage to
> prevent (long)e+1 from becoming (long)(e+1)
> May I suggest to add a parameter to the substitute-and-fold engine
> so we can do the folding on all stmts only when enabled and enable
> it just for VRP? That also avoids the testsuite noise.
Would something along these lines do?
diff --git a/gcc/tree-ssa-propagate.c
We would like to simplify code like
(larger_type)(var + const1) + const2
to
(larger_type)(var + combined_const1_const2)
when we know that no overflow happens.
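As a minimal sketch of why the no-overflow condition matters, the two C functions below model both forms with uint32_t as the inner type and uint64_t as the larger type. The function names and the constants 1 and 2 are chosen here for illustration only; they are not taken from the patch.

```c
#include <stdint.h>

/* Original form: add const1 in the narrow type, widen, then add const2. */
static uint64_t widen_then_add(uint32_t var)
{
    return (uint64_t)(uint32_t)(var + 1u) + 2u;
}

/* Simplified form: add the combined constant in the narrow type, then widen. */
static uint64_t add_then_widen(uint32_t var)
{
    return (uint64_t)(uint32_t)(var + 3u);
}
```

Both functions agree for small inputs such as 5. For var = UINT32_MAX - 1 the simplified form wraps in the narrow type while the original does not, so the rewrite is only valid when range information rules that case out.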
---
gcc/match.pd | 101 +++
1 file changed, 101 insertions(+)
diff --git a/gcc/match.pd
---
.../gcc.dg/tree-ssa/copy-headers-5.c | 2 +-
.../gcc.dg/tree-ssa/copy-headers-7.c | 2 +-
.../gcc.dg/wrapped-binop-simplify-run.c | 52
.../gcc.dg/wrapped-binop-simplify-signed-1.c | 60 +++
.../wrapped-binop-simplify-unsigned-1.c
) and manifests similarly to
add r1,-1
extend r1,r1
add r1,1
where the adds could be avoided entirely.
This is the tree part of the fix, it will still be necessary to
correct rtl code generation in doloop later.
Bootstrapped and regtested on s390x, x86 running.
Regards
Robin
--
Robin Dapp (3
This patch performs more aggressive folding in order for the
match.pd changes to kick in later.
Some test cases rely on VRP doing something which now already
happens during CCP so adjust them accordingly.
Also, the loop versioning pass was missing one case when
deconstructing addresses that
> It seems like this is making noce_convert_multiple_sets overlap
> a lot with cond_move_process_if_block (although that uses CONSTANT_P
> instead of CONST_INT_P). How do they fit together after this patch,
> i.e. which cases is each one meant to handle that the other doesn't?
IMHO all of icvt
Hi Richard,
> Is the separate need_temps scan required for correctness? It looked
> like we could test:
>
> if (reg_overlap_mentioned_p (dest, cond))
> ...
>
> on-the-fly during the main noce_convert_multiple_sets loop.
right, I didn't re-check it but after changes during internal
When then and else are reversed, we would swap new_val and old_val.
The same has to be done for our new code paths.
Also, emit_conditional_move may perform swapping. In case we need to
swap, the cc comparison also needs to be swapped and for this we pass
the reversed cc comparison directly. An
This patch allows immediate then/else operands for cmovs.
We rely on emit_conditional_move returning NULL if something unsupported
was generated.
Also, minor refactoring is performed.
--
gcc/ChangeLog:
2018-11-14 Robin Dapp
* ifcvt.c (have_const_cmov): New function
A swap-style idiom like
tmp = a
a = b
b = tmp
would be transformed like
tmp_tmp = cond ? a : tmp
tmp_a = cond ? b : a
tmp_b = cond ? tmp_tmp : b
[...]
including rewiring the first source operand to previous writes (e.g. tmp ->
tmp_tmp).
The code would recognize this, though, and
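The transformation of the swap idiom described above can be modelled in plain C. This is only a sketch: struct regs and both function names are hypothetical, and the selects stand in for the conditional moves that ifcvt would emit.

```c
/* The three values touched by the swap idiom. */
struct regs {
    int tmp;
    int a;
    int b;
};

/* Branchy original: if (cond) { tmp = a; a = b; b = tmp; } */
static struct regs swap_branch(int cond, struct regs v)
{
    if (cond) {
        v.tmp = v.a;
        v.a = v.b;
        v.b = v.tmp;
    }
    return v;
}

/* If-converted form: every set becomes a select into a fresh temporary,
   and the read of tmp in the last set is rewired to tmp_tmp. */
static struct regs swap_select(int cond, struct regs v)
{
    int tmp_tmp = cond ? v.a : v.tmp;
    int tmp_a = cond ? v.b : v.a;
    int tmp_b = cond ? tmp_tmp : v.b;
    struct regs r = { tmp_tmp, tmp_a, tmp_b };
    return r;
}
```

Without the rewiring, the last select would read the stale value of tmp and the swap would be lost when cond is true.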
noce_convert_multiple_sets creates temporaries for the destination of
every emitted cmov and expects subsequent passes to get rid of them. This
does not happen every time and even if the temporaries are removed, code
generation can be affected adversely. In this patch, temporaries are
only
This patch extends bb_ok_for_noce_convert_multiple_sets by a temporary
cost estimation that can be used by noce_convert_multiple_sets.
---
gcc/ifcvt.c | 17 +++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 253b8a96c1a..55205cac153
This patch duplicates the previous noce_emit_cmove logic. First it
passes the canonical comparison, emits the sequence and costs it.
Then, a second, separate sequence is created by passing the cc compare
we extracted before. The costs of both sequences are compared and the
cheaper one is emitted.
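The "build two candidates, cost both, keep the cheaper" step can be sketched as follows. struct candidate and pick_cheaper are hypothetical stand-ins, not GCC interfaces. Handling a candidate that could not be generated, and preferring the canonical one on a tie, are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for an emitted insn sequence and its cost. */
struct candidate {
    int valid; /* 0 if the sequence could not be generated */
    int cost;  /* estimated cost, lower is better */
};

/* Return the cheaper valid candidate, or NULL if neither is valid.
   On equal cost the canonical-comparison candidate is kept. */
static const struct candidate *
pick_cheaper(const struct candidate *canonical,
             const struct candidate *cc_based)
{
    if (!canonical->valid && !cc_based->valid)
        return NULL;
    if (!canonical->valid)
        return cc_based;
    if (!cc_based->valid)
        return canonical;
    return cc_based->cost < canonical->cost ? cc_based : canonical;
}
```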
This patch extracts a cc comparison from the initial compare/jump
insn and allows it to be passed to noce_emit_cmove and
emit_conditional_move.
---
gcc/ifcvt.c | 68
gcc/optabs.c | 7 --
gcc/optabs.h | 2 +-
3 files changed, 69
This patch saves the number of created conditional moves by
noce_convert_multiple_sets in the IF_INFO struct. This may be used by
the backend to decide more easily whether to accept a generated sequence or
not.
---
gcc/ifcvt.c | 10 --
gcc/ifcvt.h | 4
2 files changed, 12
Robin
Robin Dapp (9):
ifcvt: Store the number of created cmovs.
ifcvt: Use enum instead of transform_name string.
  ifcvt: Only create temporaries as needed.
ifcvt: Estimate original costs before convert_multiple.
  ifcvt: Allow constant operands in noce_convert_multiple_sets.
ifcvt
This patch introduces an enum for ifcvt's various noce transformations.
As the transformation might be queried by the backend, I find it nicer
to allow checking for a proper type instead of a string comparison.
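To illustrate the difference for a backend query, here is a small sketch. The enumerator names are hypothetical and the string literal is only an example of a transform name; the actual patch may use different ones.

```c
#include <string.h>

/* Hypothetical enumerators; the names in the actual patch may differ. */
enum transform_kind {
    TRANSFORM_NONE,
    TRANSFORM_MOVE,
    TRANSFORM_CMOVE,
    TRANSFORM_CONVERT_MULTIPLE
};

/* String-based query: a typo in the literal is only noticed at run time. */
static int is_convert_multiple_by_name(const char *transform_name)
{
    return transform_name != NULL
           && strcmp(transform_name, "noce_convert_multiple_sets") == 0;
}

/* Enum-based query: a misspelled enumerator fails to compile. */
static int is_convert_multiple_by_kind(enum transform_kind kind)
{
    return kind == TRANSFORM_CONVERT_MULTIPLE;
}
```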
---
gcc/ifcvt.c | 46 ++--
gcc/ifcvt.h | 67
Add s390_valid_shift_count to determine the validity of a
shift-count operand. This is used to replace increasingly
complex substitutions that should have allowed address-style
shift-count handling, an AND mask as well as no-op subregs
on the operand.
--
gcc/ChangeLog:
2019-07-05 Robin Dapp
Define s390_shift_truncation_mask to allow the optabs optimization
sh = (64 - sh)
-> sh = -sh
for a rotation operation.
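The identity behind this rewrite can be checked in plain C. The sketch below assumes a rotate whose count is reduced modulo 64, which models a truncation mask of 63. rotl64 and the two wrappers are illustrative names and not part of the s390 backend.

```c
#include <stdint.h>

/* Rotate left with the count masked to 6 bits. */
static uint64_t rotl64(uint64_t x, unsigned int sh)
{
    sh &= 63u;
    return (x << sh) | (x >> ((64u - sh) & 63u));
}

/* Count written as 64 - sh. */
static uint64_t rot_by_64_minus(uint64_t x, unsigned int sh)
{
    return rotl64(x, 64u - sh);
}

/* Count written as -sh; identical once only the low 6 bits matter. */
static uint64_t rot_by_negated(uint64_t x, unsigned int sh)
{
    return rotl64(x, 0u - sh);
}
```

Since 64 is congruent to 0 modulo 64, the counts 64 - sh and -sh have the same low six bits, so both wrappers rotate by the same amount.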
--
gcc/ChangeLog:
2019-07-05 Robin Dapp
* config/s390/s390.c (s390_shift_truncation_mask): Define.
(TARGET_SHIFT_TRUNCATION_MASK): Define.
Tests to check for the changed shift-count handling.
--
gcc/testsuite/ChangeLog:
2019-07-05 Robin Dapp
* gcc.target/s390/combine-rotate-modulo.c: New test.
* gcc.target/s390/combine-shift-rotate-add-mod.c: New test.
* gcc.target/s390/vector/combine-shift-vec.c: New
).
The second patch adds some tests.
The third patch defines the shift_truncation_mask and adds
a test for it.
Bootstrapped and regtested.
Regards
Robin
---
Robin Dapp (3):
S/390: Rework shift count handling.
S/390: Shift count tests.
S/390: Define shift_truncation_mask.
gcc/config/s390
Ping.
> gcc/testsuite/ChangeLog:
>
> 2019-05-15 Robin Dapp
>
> * gcc.dg/tree-ssa/gen-vect-26.c: Do not expect unaligned access
> vectorization on s390.
> * gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
> * gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
>
>> Now, in order to get rid of the subregs in the pattern combine creates,
>> I would need to be able to do something like
>>
>> (define_subst "subreg_subst"
>> [(set (match_operand:DI 0 "" "")
>> (shift:DI (match_operand:DI 1 "" "")
>>(subreg:SI (match_dup:DI 2)))]
>>
>>
Hi,
this patch changes three gen-vect testcases so they do not expect
vectorization of an unaligned access. Vectorization happens regardless,
we just ignore misalignment.
Regards
Robin
--
gcc/testsuite/ChangeLog:
2019-05-15 Robin Dapp
* gcc.dg/tree-ssa/gen-vect-26.c: Do
Hi,
this patch adds -march=z900 to a test case that expects larl for loading
a value via the GOT. On z10 and later, lgrl is used which is tested in
a new test case.
Regards
Robin
--
gcc/testsuite/ChangeLog:
2019-05-15 Robin Dapp
* gcc.target/s390/global-array-element-pic.c: Add
> It would really help if you could provide testcases which show the
> suboptimal code and any analysis you've done.
I tried introducing a define_subst pattern that substitutes something
one of two other subst patterns already changed.
The first subst pattern helps remove a superfluous and on
>> Bit tests on x86 also truncate [1], if the bit base operand specifies
>> a register, and we don't use BT with a memory location as a bit base.
>> I don't know what is referred with "(real or pretended) bit field
>> operations" in the documentation for SHIFT_COUNT_TRUNCATED:
>>
>> However,
Hi,
while trying to improve s390 code generation for rotate and shift I
noticed superfluous subregs for shift count operands. In our backend we
already have quite cumbersome patterns that would need to be duplicated
(or complicated further by more subst patterns) in order to get rid of
the
> Robin, have you been testing with --disable-multilib or something
> similar?
yes, I believe so... stupid mistake :(
Thanks for fixing it so quickly.
ll: all-am
+PWD_COMMAND = $${PWDCMD-pwd}
.SUFFIXES:
$(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am
$(am__configure_deps)
Regards
Robin
--
gcc/d/ChangeLog:
2019-04-24 Robin Dapp
* typeinfo.cc (create_typeinfo): Set fields with proper length.
gcc/testsuite/Change
Hi Rainer,
> I noticed you missed one piece of Iain's typeinfo.cc patch, btw.:
>
> diff --git a/gcc/d/typeinfo.cc b/gcc/d/typeinfo.cc
> --- a/gcc/d/typeinfo.cc
> +++ b/gcc/d/typeinfo.cc
> @@ -886,7 +886,7 @@ public:
> if (cd->isCOMinterface ())
> flags |= ClassFlags::isCOMclass;
>
Hi,
> + Establish an ANTI dependency between r11 and r15 restores from FPRs
> + to prevent the instructions scheduler from reordering them since
> + this would break CFI. No further handling in the sched_reorder
> + hook is required since the r11 and r15 restore will never appear in
> +
Hi Rainer,
> This will occur on any 32-bit target. The following patch (using
> ssize_t instead) allowed the code to compile:
thanks, included your fix and attempted a more generic version of the
186 test.
I also continued debugging some fails further:
- Most of the MurmurHash fails are
Hi,
this patch adds the pipeline description and the cpu model number for
arch13.
Bootstrapped and regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2019-04-10 Robin Dapp
* config/s390/8561.md: New file.
* config/s390/driver-native.c (s390_host_detect_local_cpu): Add
Hi,
> Are the values inside the tables the problem? Or just some of the
> helper functions/templates that interact with them to generate the
> static data?
>
> If the latter, then a rebuild of the files may not be necessary.
I managed to get this to work without rebuilding the files. After
Hi,
the unicode tables in std.internal.unicode_tables are apparently auto
generated and loaded at (libphobos) compile time. They are also in
little endian format. Is the tool to generate them available somewhere?
I wanted to start converting them to big endian before loading but
this will
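For illustration, an endian-independent way to read a little-endian value is to assemble it from its bytes, which gives the same result on little- and big-endian hosts. The 32-bit entry width and the name load_le32 are assumptions made for this sketch; it does not reflect how libphobos actually stores or loads its tables.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored in little-endian order.
   The result does not depend on the byte order of the host. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
           | ((uint32_t)p[1] << 8)
           | ((uint32_t)p[2] << 16)
           | ((uint32_t)p[3] << 24);
}
```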
> This would mean that StructFlags and ClassFlags will also both have a
> wrong value as well.
Yes, can confirm that m_flags = 0 (instead of 1) for a struct containing
a pointer.
> If there's a compiler/library discrepancy, the compiler should be
> adjusted to write out the value at the correct
Hi,
> Alignment is written to TypeInfo, I don't think it should ever be
> zero. That would mean that it isn't being generated by the compiler,
> or read by the library correctly, so something else is amiss.
it took me a while to see that in libphobos/libdruntime/object.d
override @property
Hi,
during the last few days I tried to get D running on s390x (apparently
the first Big Endian platform to try it?). I did not yet go through the
code systematically and add a version(SystemZ) in every place where it
might be needed but rather tried to fix test failures as they arose.
After
Hi,
r269586 puts single quotes around option names. This patch fixes tests
that expect the old format.
Regards
Robin
---
gcc/testsuite/ChangeLog:
2019-03-15 Robin Dapp
* gcc.target/s390/target-attribute/tattr-1.c (htm0):
-mhtm -> '-mhtm'.
* gcc.target/s390/tar
Hi,
this patch sets the inlining parameters for z13 and later to rather
aggressive values in response to PR85103 that caused performance
regressions in SPEC2006's sjeng and gobmk benchmarks.
Regards
Robin
--
gcc/ChangeLog:
2019-03-12 Robin Dapp
* config/s390/s390.c
This fixes a newly introduced test failure.
---
2019-03-12 Robin Dapp
* gcc.target/s390/memset-1.c: Do not require stcy.
diff --git a/gcc/testsuite/gcc.target/s390/memset-1.c b/gcc/testsuite/gcc.target/s390/memset-1.c
index 3e201df1aed..9463a77208b 100644
--- a/gcc/testsuite
> Please adjust the year and the author in gcc/config/s390/3906.md. Ok with
> that change.
Changed that and also simplified the longrunning checks.
gcc/ChangeLog:
2019-03-12 Robin Dapp
* config/s390/s390.c (LONGRUNNING_THRESHOLD): Remove.
(s390_is_fpd
This patch adds handling of group-of-two instructions.
---
gcc/config/s390/s390.c | 36 +++-
1 file changed, 35 insertions(+), 1 deletion(-)
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 4dcf1be4445..78a707267e8 100644
---
This patch adapts some scheduling-related parameters.
---
gcc/config/s390/s390.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 78a707267e8..901807e7833 100644
--- a/gcc/config/s390/s390.c
+++
This patch makes the scheduling score execution-side aware.
---
gcc/config/s390/s390.c | 32 ++--
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 249df00268a..4dcf1be4445 100644
---
This patch adds the z14 pipeline description.
---
gcc/config/s390/3906.md | 282
gcc/config/s390/s390.c | 23 +++-
gcc/config/s390/s390.h | 2 +-
gcc/config/s390/s390.md | 3 +
4 files changed, 307 insertions(+), 3 deletions(-)
create mode 100644
This patch adds a scheduling state struct and changes
the handling of end-group conditions.
---
gcc/config/s390/s390.c | 158 ++---
1 file changed, 68 insertions(+), 90 deletions(-)
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index
This patch adapts the z13 pipeline description.
---
gcc/config/s390/2964.md | 372 ++--
gcc/config/s390/s390.c | 39 ++---
2 files changed, 226 insertions(+), 185 deletions(-)
diff --git a/gcc/config/s390/2964.md b/gcc/config/s390/2964.md
index
This patch makes the detection of long-running instructions
independent of their latency and checks the execution unit
instead.
---
gcc/config/s390/s390.c | 73 +++---
1 file changed, 55 insertions(+), 18 deletions(-)
diff --git a/gcc/config/s390/s390.c
Hi,
this patch set adds new pipeline descriptions for z13 and z14. Based
on that, the scoring and some properties are handled differently in
the scheduler hooks.
Regards
Robin
Robin Dapp (7):
S/390: Change z13 pipeline description.
S/390: Add z14 pipeline description.
S/390: Change
Hi,
this patch implements vector copysign using vector select on S/390.
Regtested and bootstrapped on s390x.
Regards
Robin
--
gcc/ChangeLog:
2019-02-07 Robin Dapp
* config/s390/vector.md: Implement vector copysign.
gcc/testsuite/ChangeLog:
2019-02-07 Robin Dapp
> This looks pretty reasonable. ISTM it ought to be able to go forward if
> it's tested independently.
The test suite already passes, any other tests you have in mind? To be
honest I suppose noce_convert_multiple_sets will currently never
successfully return (due to the costing problems I
> This may ultimately be too simplistic. There are targets where some
> constants are OK, but others may not be. By checking the predicate
> like this I think you can cause over-aggressive if-conversion if the
> target allows a range of integers in the expander's operand predicate,
> but allows
This patch implements noce_conversion_profitable_p by checking which
transformation ifcvt used and only returning true if
noce_convert_multiple_sets created fewer than MAX_IFCVT_INSNS insns.
--
gcc/ChangeLog:
2018-11-14 Robin Dapp
* config/s390/s390.c (MAX_IFCVT_INSNS): Define
This patch saves the number of created conditional moves by
noce_convert_multiple_sets in the IF_INFO struct. This may be used by
the backend to decide more easily whether to accept a generated sequence or
not.
--
gcc/ChangeLog:
2018-11-14 Robin Dapp
* ifcvt.c
New test.
--
gcc/testsuite/ChangeLog:
2018-11-14 Robin Dapp
* gcc.target/s390/ifcvt-two-insns-int.c: New test.
---
.../gcc.target/s390/ifcvt-two-insns-int.c | 26 +++
1 file changed, 26 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two
created if the destination of a set is used in an emitted condition
check.
--
gcc/ChangeLog:
2018-11-14 Robin Dapp
* ifcvt.c (check_need_temps): New function.
(noce_convert_multiple_sets): Only create temporaries if needed.
---
gcc/ifcvt.c | 54
This patch introduces an enum for ifcvt's various noce transformations.
As the transformation might be queried by the backend, I find it nicer
to allow checking for a proper type instead of a string comparison.
--
gcc/ChangeLog:
2018-11-14 Robin Dapp
* ifcvt.c (noce_try_move): Use
This patch checks whether the current target supports conditional moves
with immediate then/else operands and allows noce_convert_multiple_sets
to deal with constants subsequently.
Also, minor refactoring is performed.
--
gcc/ChangeLog:
2018-11-14 Robin Dapp
* ifcvt.c
Hi,
the following patch set was created in an attempt to allow multiple sets to be
if-converted. I was not able to make it work out of the box since I found the
cost estimation for the newly created sequence to always be much higher than
the sequence before.
This is due to
Hi,
the attached patch increases the move costs for moves involving the CC
register. This saves us some instructions in SPEC CPU2006.
Regards
Robin
--
gcc/ChangeLog:
2018-11-05 Robin Dapp
* config/s390/s390.c (s390_register_move_cost): Increase costs
for moves involving
Hi,
this is v2 of the patch with less quirky pattern syntax and two tests.
Regards
Robin
--
gcc/ChangeLog:
2018-10-26 Robin Dapp
* config/s390/s390.md: QImode and HImode for load on condition.
gcc/testsuite/ChangeLog:
2018-10-26 Robin Dapp
* gcc.target/s390/ifcvt
/ChangeLog:
2018-10-26 Robin Dapp
* config/s390/predicates.md: Fix typo.
* config/s390/s390.md: Allow immediates for load on condition.
gcc/testsuite/ChangeLog:
2018-10-26 Robin Dapp
* gcc.dg/loop-8.c: On s390, always run the test with -march=zEC12.
diff --git a/gcc
> Still OK :-)
Committed as r265304.
Regards
Robin
Hi,
this enables QImode and HImode for load on condition. For SPEC2006 this
reduces code size overall, performance impact is negligible.
Regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2018-10-18 Robin Dapp
* config/s390/s390.md: Add movcc for QImode and HImode.
diff --git
/ChangeLog:
2018-10-16 Robin Dapp
* haifa-sched.c (priority): Add force_recompute parameter.
(apply_replacement): Call priority () with force_recompute = true.
(restore_pattern): Likewise.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1fdc9df9fb2..2c84ce38143 100644
Hi,
this allows immediates in the load-on-condition expander on z13 or later.
Regtested on z14.
Regards
Robin
--
gcc/ChangeLog:
2018-10-17 Robin Dapp
* config/s390/predicates.md:
Allow immediate operand in loc_operand for z13.
* config/s390/s390.md: Use
> A C++ style nit/question: instead of adding a new overload
>
> priority (rtx_insn *, bool)
>
> you can add a parameter with a default value in the existing
> static function
>
> priority (rtx_insn *insn, bool force_recompute = false)
Sometimes I'm still stuck in C land with GCC :),
ng.
Regards
Robin
gcc/ChangeLog:
2018-10-15 Robin Dapp
* haifa-sched.c (priority): Add force_recompute parameter.
(apply_replacement):
Call priority () with force_recompute = true.
(restore_pattern): Likewise.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-s
didn't bootstrap for me on
x86). The actual code changes throughout SPEC2006 are minor and the
performance impact is negligible provided we do not hit a fixable bad
case as described in my last message.
Regards
Robin
--
gcc/ChangeLog:
2018-10-10 Robin Dapp
* haifa-sched.c
Hi,
committed only the zEC12 part for now. Performance behavior of z13 with
the patch is still unclear and will be tackled separately.
Regards
Robin
ping, any insight on this?
Regards
Robin
Hi,
I'm working on some insn latency changes in the s390 backend and noticed
a regression in the SPEC2006 bzip2 test case that was due to some insns
being scheduled differently.
The sequence in short form before my change is
;; | insn | prio |
;; | 823 |1 | %r1=%r1+0x1
Sorry, forgot the [S/390] tag in the subject.
Similar to zEC12, the change in latencies helps match the real machine's
behavior better.
--
gcc/ChangeLog:
2018-09-06 Robin Dapp
* config/s390/2964.md: Increase latencies for some FP instructions.
---
gcc/config/s390/2964.md | 80 ++---
1 file
Hi,
this patch increases the latency of some floating point instructions to better
match the real machine's behavior.
Regards
Robin
--
gcc/ChangeLog:
2018-09-06 Robin Dapp
* config/s390/2827.md: Increase latencies for some FP instructions.
---
gcc/config/s390/2827.md | 14
d the comment.
Regards
Robin
--
gcc/ChangeLog:
2018-07-16 Robin Dapp
* config/s390/s390.c (preferred_la_operand_p): Do not use
LA with index register on z196 or later.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 23c3f3db621..d8b47c6fe67 100644
--- a/g
il
without the patch as we can get lucky with the alignment).
Regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2018-07-12 Robin Dapp
* config/s390/s390.c (s390_default_align): Set default function
alignment to 16.
(s390_override_options_after_change
"Os")))
void bar () {};
I did not observe that the default alignment, once set, was reset anywhere.
Regards
Robin
--
gcc/ChangeLog:
2018-07-11 Robin Dapp
* config/s390/s390.c (s390_default_align): Set default
function alignment.
(s390_override_options_after_ch
Hi,
we recently hit a problem where fwprop would not propagate a memory
address into an insn because our backend (s390) tells it that the
address_cost ()s for an address with index are higher than for one
without. Subsequently, should_replace_address () returns false and no
propagation is
Hi,
this patch avoids emitting LA on z13 and later when the address has both
an index and a base since a regular add is faster in that case.
Regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2018-07-05 Robin Dapp
* config/s390/s390.c (preferred_la_operand_p): Do not use
explicitly state -march=z13 -mtune=zEC12.
Regards
Robin
--
gcc/ChangeLog:
2018-06-04 Robin Dapp
* config/s390/s390.h (enum processor_flags): Do not use
default tune parameter when -march was specified.
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index a372981ff3a
Hi,
when investigating a regression, I realized that we create a superfluous
load on S390. The snippet looks something like
LA %r10, 0(%r8,%r9)
LLH %r4, 0(%r10)
meaning the address in r10 is computed by an LA even though LLH supports
the addressing already. The same address is used multiple
> Preserving the sched state across basic blocks for your case works only if
> the BBs are traversed
> with the fall through edges coming first. Is that the case? We probably
> should have a description
> for s390_last_sched_state stating this.
Committed as attached with an additional comment
gcc/ChangeLog:
2017-10-17 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (s390_bb_fallthru_entry_likely): New
function.
(s390_sched_init): Do not reset s390_sched_state if we entered
the current basic block via a fallthru edge and all othe
This patch fixes cases where we start a new group although the previous one has
not ended.
Regression tested on s390x.
gcc/ChangeLog:
2017-10-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (s390_has_ok_fallthru): New function.
(s390_sched_score): Tempo
This patch introduces balancing of long-running instructions that may clog the
pipeline.
gcc/ChangeLog:
2017-10-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (NUM_SIDES): New constant.
(LONGRUNNING_THRESHOLD): New constant.
(LATENCY_FACTOR
ChangeLog:
2017-07-31 Robin Dapp <rd...@linux.vnet.ibm.com>
* MAINTAINERS (write after approval): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 250740)
+++ MAINTAINERS (working copy)
@@ -356,6
> Do you have an example where wrong code is generated through the
> noce_convert_multiple_sets_p path (with or without bodged costs)?
>
> Both AArch64 and x86-64 reject your testcase along this codepath because
> of the constant set of 1. If we work around that by setting bla = n rather
> than
Hi,
recently I wondered why a snippet like the following is not being
if-converted at all on s390:
int foo (int *a, unsigned int n)
{
  int min = 99;
  int bla = 0;
  for (int i = 0; i < n; i++)
    {
      if (a[i] < min)
        {
          min = a[i];
          bla = 1;
        }
    }
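As a rough sketch of what if-converting the loop body above amounts to, the branch can be replaced by two selects on the same condition. Since the snippet is cut off before its return statement, the versions below hand the results back through out-parameters; that interface and the names scan_branch and scan_select are assumptions made for this example.

```c
/* Branchy version of the loop from the snippet. */
static void scan_branch(const int *a, unsigned int n,
                        int *min_out, int *bla_out)
{
    int min = 99;
    int bla = 0;
    for (unsigned int i = 0; i < n; i++) {
        if (a[i] < min) {
            min = a[i];
            bla = 1;
        }
    }
    *min_out = min;
    *bla_out = bla;
}

/* If-converted version: both sets become selects on one condition. */
static void scan_select(const int *a, unsigned int n,
                        int *min_out, int *bla_out)
{
    int min = 99;
    int bla = 0;
    for (unsigned int i = 0; i < n; i++) {
        int c = a[i] < min;   /* evaluate the condition once */
        min = c ? a[i] : min; /* first conditional move */
        bla = c ? 1 : bla;    /* second one, with a constant operand */
    }
    *min_out = min;
    *bla_out = bla;
}
```

The second select has the constant 1 as an operand, which is the kind of set that needs conditional moves with immediate operands.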
the body_cost_vec parameter
which is not used elsewhere.
Regards
Robin
--
gcc/ChangeLog:
2017-07-12 Robin Dapp <rd...@linux.vnet.ibm.com>
* (vect_enhance_data_refs_alignment):
Remove body_cost_vec from _vect_peel_extended_info.
tree-vect-data-
[3/3] Tests
--
gcc/testsuite/ChangeLog:
2017-07-05 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.dg/wrapped-binop-simplify-signed-1.c: New test.
* gcc.dg/wrapped-binop-simplify-signed-2.c: New test.
* gcc.dg/wrapped-binop-simplify-unsigned-1.c: Ne