not always be generated. This patch uses separate flags
for 2-, 4-, and 8-byte alignment to fix the problem.
Bootstrapped, no regressions on s390.
Regards
Robin
gcc/testsuite/ChangeLog:
2015-10-23 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.target/s390/load-relative-check.c: Ne
Hi,
recently, I came across a problem that keeps a load instruction in a
loop although it is loop-invariant.
A simple example is:
#include <stdio.h>
#define SZ 256
int a[SZ], b[SZ], c[SZ];
int main() {
  int i;
  for (i = 0; i < SZ; i++) {
    a[i] = b[i] + c[i];
  }
  printf("%d\n", a[0]);
}
The
On 09/15/2015 05:25 PM, Jeff Law wrote:
> On 09/15/2015 06:11 AM, Robin Dapp wrote:
>> Hi,
>>
>> recently, I came across a problem that keeps a load instruction in a
>> loop although it is loop-invariant.
[..]
> You might want to check your costing model -- cprop
Hi,
the attached patch renames the constm1_operand predicate to
all_ones_operand and introduces a check for int mode.
It should be applied on top of the last patch ([Patch] S/390: Simplify
vector conditionals).
Regtested on s390.
Regards
Robin
gcc/ChangeLog:
2015-12-15 Robin Dapp <
ree-level.
Bootstrapped and regression-tested on s390.
Regards
Robin
gcc/ChangeLog:
2015-12-15 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (s390_expand_vcond): Convert vector
conditional into shift.
* config/s390/vector.md: Change operand predicate.
?
No regressions on s390x and amd64.
Regards
Robin
--
gcc/ChangeLog:
2016-04-13 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vectorizer.h
(dr_misalignment): Introduce named DR_MISALIGNMENT constants.
(aligned_access_p): Use constants.
(known_alignment_for_ac
et perform bootstrapping
and more testing due to the premature nature of the patch.
Thanks
Robin
gcc/ChangeLog:
2016-03-17 Robin Dapp <rd...@linux.vnet.ibm.com>
* cfgloop.h (struct GTY): Add second number of iterations
* loop-doloop.c (doloop_condition_get)
As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526, we
currently fail to simplify cases like
(unsigned long)(a - 1) + 1
to
(unsigned long)a
when VRP knows that (a - 1) does not overflow.
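As a rough illustration of why the no-overflow condition matters, the sketch below compares both forms at run time. It assumes a is a 32-bit unsigned value and the outer type is 64 bits wide; the function names are made up for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: subtract in 32 bits, widen to 64 bits, then add.
   Assumes int is at most 32 bits wide, so a - 1u is computed in
   uint32_t and wraps to UINT32_MAX when a == 0.  */
uint64_t widen_then_add (uint32_t a)
{
  return (uint64_t) (a - 1u) + 1u;
}

/* Simplified form the transformation aims for.  */
uint64_t simplified (uint32_t a)
{
  return (uint64_t) a;
}

/* Returns 1 if both forms give the same result for this value of a.  */
int forms_agree (uint32_t a)
{
  return widen_then_add (a) == simplified (a);
}

/* Small self-check (hypothetical helper, only for illustration).  */
void check_forms (void)
{
  assert (forms_agree (1));
  assert (forms_agree (256));
  /* a == 0: the inner subtraction wraps, so the forms differ.  */
  assert (!forms_agree (0));
}
```

For every a >= 1 the two forms agree. Only a == 0 breaks the equivalence, which is why the simplification needs range information for a.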
This patch introduces a match.pd pattern as well as a helper function
that checks for
I skimmed through the code to see where transformations like
(a - 1) -> (a + UINT_MAX) are performed. It seems there are only two
places, match.pd (/* A - B -> A + (-B) if B is easily negatable. */)
and fold-const.c.
In order to be able to reliably know whether to zero-extend or to
sign-extend
Ping.
To put it shortly, I'm not sure how to differentiate between:
example range of a: [3,3]
(ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(-1 + 1), sign extend
example range of a: [0,0]
(ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(UINT_MAX + 1), no
sign extend
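The two cases above can be replayed with concrete numbers. This is only a sketch under the assumption of a 32-bit inner type and a 64-bit outer type; inner_add_wraps and combined_constant are hypothetical helpers, not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Does the 32-bit addition a + cst wrap around zero?  */
int inner_add_wraps (uint32_t a, uint32_t cst)
{
  return (uint32_t) (a + cst) < a;
}

/* Constant that has to be added to (uint64_t) a so that the result
   equals (uint64_t) (a + inner_cst) + outer_cst.  The conversion to
   int32_t relies on the usual two's complement behavior.  */
uint64_t combined_constant (uint32_t a, uint32_t inner_cst,
                            uint64_t outer_cst)
{
  uint64_t widened = inner_add_wraps (a, inner_cst)
    ? (uint64_t) (int64_t) (int32_t) inner_cst  /* sign-extend */
    : (uint64_t) inner_cst;                     /* zero-extend */
  return widened + outer_cst;
}

/* Small self-check (hypothetical helper, only for illustration).  */
void check_combined_constant (void)
{
  /* a == 3: inner add wraps, constant becomes -1 + 1 == 0.  */
  assert (combined_constant (3, UINT32_MAX, 1) == 0);
  /* a == 0: no wrap, constant becomes UINT_MAX + 1 == 2^32.  */
  assert (combined_constant (0, UINT32_MAX, 1) == 0x100000000ULL);
}
```

The same inner constant therefore needs a different extension depending on whether the inner addition wraps for the values a can take.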
In this case, there
for now because I find
extract_range_from_binary_expr_1 somewhat lengthy and hard to follow
already :) Wouldn't it be better to "separate concerns"/split it up in
the long run and merge the functionality needed here at some time?
Bootstrapped and reg-tested on s390x, bootstrap on x86 running.
Hi,
the following patch changes "nopr %r7" to "nopr %r0" which is
advantageous from a hardware perspective. It will only be emitted for
hotpatching and should not impact normal code.
Bootstrapped and regression tested on s390 and s390x.
Regards
Robin
gcc/ChangeLog:
20
Ping.
diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
index 2beadbc..d66fcb1 100644
--- a/gcc/gimple-match-head.c
+++ b/gcc/gimple-match-head.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see
#include "internal-fn.h"
#include "case-cfn-macros.h"
#include
This causes a performance regression in the xalancbmk SPECint2006
benchmark on s390x. At first sight, the produced asm output doesn't look
too different but I'll have a closer look. Is the fwprop order supposed
to have major performance implications?
Regards
Robin
> This changes it from PRE on
idn't manage to run it independently in this
directory via RUNTESTFLAGS=vect.exp=... or otherwise)
Bootstrapped on x86 and s390.
--
gcc/ChangeLog:
2016-09-26 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-loop-manip.c (create_intersect_range_checks_index):
Add tree
i_p().
ok to commit?
Regards
Robin
--
gcc/ChangeLog:
2016-09-26 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-loop-manip.c (create_intersect_range_checks_index):
Add tree_fits_uhwi_p check.
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
inde
Ping.
flow. Do you think it
should be handled differently?
Revised version attached.
Regards
Robin
--
gcc/ChangeLog:
2016-09-20 Robin Dapp <rd...@linux.vnet.ibm.com>
PR middle-end/69526
This enables combining of wrapped binary operations and fixes
the tree l
gah, this
+ return true;
+ if (TREE_CODE (t1) != SSA_NAME)
should of course be like this
+ if (TREE_CODE (t1) != SSA_NAME)
+ return true;
in the last patch.
> Also the '=' in the split line goes to the next line according to
> coding conventions.
fixed, I had only looked at an instance one function above which had it
wrong as well. Also changed comment grammar slightly.
Regards
Robin
--
gcc/ChangeLog:
2016-09-27 Robin Dap
This introduces an ICE ("bogus comparison result type") on s390 for the
following test case:
#include <stdlib.h>
void foo(int dim)
{
  int ba, sign;
  ba = abs (dim);
  sign = dim / ba;
}
Doing
diff --git a/gcc/match.pd b/gcc/match.pd
index ba7e013..2455592 100644
--- a/gcc/match.pd
+++
Ping :)
Ping.
>> + /* Sign-extend @1 to TYPE. */
>> + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED);
>>
>> not sure why you do always sign-extend. If the inner op is unsigned
>> and we widen then that's certainly bogus considering your UINT_MAX
>> example above. Does
>>
>>
Found some time to look into this again.
> Index: tree-ssa-propagate.c
> ===
> --- tree-ssa-propagate.c (revision 240133)
> +++ tree-ssa-propagate.c (working copy)
> @@ -1105,10 +1105,10 @@
Ping. Any idea how to tackle this?
Perhaps I'm still missing how some cases are handled or not handled,
sorry for the noise.
> I'm not sure there is anything to "interpret" -- the operation is unsigned
> and overflow is when the operation may wrap around zero. There might
> be clever ways of re-writing the expression to
>
> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type)
> produces (uint64_t)uint32 + -1U + 1. This simply means that we cannot ignore
> overflow of the inner operation and for some reason your change
> to extract_range_from_binary_expr didn't catch this. That is _8 +
Hi,
this patch fixes the vcond shift testcase that failed since setting
PARAM_MIN_VECT_LOOP_BOUND in the s390 backend.
Regards
Robin
--
gcc/testsuite/ChangeLog:
2017-03-27 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.target/s390/vector/vcond-shift.c (void foo): In
Hi,
when looking at various vectorization examples on s390x I noticed that
we still peel vf/2 iterations for alignment even though vectorization
costs of unaligned loads and stores are the same as normal loads/stores.
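For readers unfamiliar with peeling for alignment, here is a minimal sketch of how many scalar iterations would have to run before an access becomes aligned. It assumes the start address and the alignment are both multiples of the element size; peel_iterations is a hypothetical helper and not how the vectorizer computes it internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of leading scalar iterations needed until the address of the
   accessed element is a multiple of ALIGN bytes.  ADDR is the address
   of the first element, ELEM_SIZE the size of one element.  */
size_t peel_iterations (uintptr_t addr, size_t elem_size, size_t align)
{
  size_t misalign = addr % align;
  if (misalign == 0)
    return 0;
  return (align - misalign) / elem_size;
}

/* Small self-check (hypothetical helper, only for illustration).  */
void check_peel_iterations (void)
{
  /* 4-byte ints, 16-byte vectors: 0x1004 is 12 bytes short.  */
  assert (peel_iterations (0x1004, 4, 16) == 3);
  assert (peel_iterations (0x1000, 4, 16) == 0);
}
```

When the misalignment is not known at compile time this count cannot be computed statically, which is the situation the vf/2 figure refers to.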
A simple example is
void foo(int *restrict a, int *restrict b, unsigned int
Hi Bin,
> Seems Richi added code like below comparing costs between aligned and
> unaligned loads, and only peeling if it's beneficial:
>
> /* In case there are only loads with different unknown misalignments,
> use
> peeling only if it may help to align other accesses in the loop
> Note I was very conservative here to allow store bandwidth starved
> CPUs to benefit from aligning a store.
>
> I think it would be reasonable to apply the same heuristic to the
> store case that we only peel for same cost if peeling would at least
> align two refs.
Do you mean checking if
gards
Robin
[1] https://gcc.gnu.org/ml/gcc/2017-01/msg00234.html
[2] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01562.html
--
gcc/ChangeLog:
2017-03-02 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (s390_option_override_internal): Set
PARAM_MIN_VECT_LOOP_
ChangeLog:
2017-07-31 Robin Dapp <rd...@linux.vnet.ibm.com>
* MAINTAINERS (write after approval): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 250740)
+++ MAINTAINERS (working copy)
@@ -356,6
the body_cost_vec parameter
which is not used elsewhere.
Regards
Robin
--
gcc/ChangeLog:
2017-07-12 Robin Dapp <rd...@linux.vnet.ibm.com>
* (vect_enhance_data_refs_alignment):
Remove body_cost_vec from _vect_peel_extended_info.
tree-vect-data-
Hi,
recently I wondered why a snippet like the following is not being
if-converted at all on s390:
int foo (int *a, unsigned int n)
{
  int min = 99;
  int bla = 0;
  for (int i = 0; i < n; i++)
    {
      if (a[i] < min)
        {
          min = a[i];
          bla = 1;
        }
    }
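The listing cuts the function off before its return statement, so the sketch below uses its own hypothetical functions and result struct rather than completing foo. It shows, by hand, the shape if-conversion would aim for: both assignments in the branch become conditional selects that a target could map to load-on-condition style instructions.

```c
#include <assert.h>

struct scan_result { int min; int bla; };

/* Branchy form, same loop shape as the snippet above.  */
struct scan_result scan_branchy (const int *a, unsigned int n)
{
  struct scan_result r = { 99, 0 };
  for (unsigned int i = 0; i < n; i++)
    if (a[i] < r.min)
      {
        r.min = a[i];
        r.bla = 1;
      }
  return r;
}

/* Hand if-converted form: one comparison, two conditional selects,
   no branch inside the loop body.  */
struct scan_result scan_if_converted (const int *a, unsigned int n)
{
  struct scan_result r = { 99, 0 };
  for (unsigned int i = 0; i < n; i++)
    {
      int cond = a[i] < r.min;
      r.min = cond ? a[i] : r.min;
      r.bla = cond ? 1 : r.bla;
    }
  return r;
}

/* Small self-check (hypothetical helper, only for illustration).  */
void check_scan (void)
{
  int v[] = { 120, 7, 50, 3 };
  struct scan_result x = scan_branchy (v, 4);
  struct scan_result y = scan_if_converted (v, 4);
  assert (x.min == y.min && x.bla == y.bla);
}
```

Both variants compute the same values. The second one needs two sets converted under a single condition, which is the "multiple sets" case the discussion is about.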
> Do you have an example where wrong code is generated through the
> noce_convert_multiple_sets_p path (with or without bodged costs)?
>
> Both AArch64 and x86-64 reject your testcase along this codepath because
> of the constant set of 1. If we work around that by setting bla = n rather
> than
[3/3] Tests
--
gcc/testsuite/ChangeLog:
2017-07-05 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.dg/wrapped-binop-simplify-signed-1.c: New test.
* gcc.dg/wrapped-binop-simplify-signed-2.c: New test.
* gcc.dg/wrapped-binop-simplify-unsigned-1.c: Ne
> While the initialization value doesn't matter (wi::add will overwrite it)
> better initialize both to false ;) Ah, you mean because we want to
> transform only if get_range_info returned VR_RANGE. Indeed somewhat
> unintuitive (but still the best variant for now).
> so I'm still missing a
> ideally you'd use a wide-int here and defer the tree allocation to the result
Did that in the attached version.
> So I guess we never run into the outer_op == minus case as the above is
> clearly wrong for that?
Right, damn, not only was the treatment for this missing but it was
bogus in the
Included the workaround for SLP now. With it, testsuite is clean on x86
as well.
gcc/ChangeLog:
2017-05-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_get_data_access_cost):
Workaround for SLP handling.
(vect_enhance_data_refs_ali
gcc/ChangeLog:
2017-05-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling):
Return peeling info and set costs to zero for unlimited cost
model.
(vect_enhance_data_refs_alignment): Also inspect all da
gcc/ChangeLog:
2017-05-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vectorizer.h (dr_misalignment): Introduce
DR_MISALIGNMENT_UNKNOWN.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Refactoring.
(vect_update_misalignment_for_peel
gcc/ChangeLog:
2017-05-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Change
comment and rename variable.
(vect_get_peeling_costs_all_drs): New function.
(vect_peeling_hash_get_lowest_cost
gcc/ChangeLog:
2017-05-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Remove check for supportable_dr_alignment, compute costs for
doing no peeling at all, compare to the best peeling costs so
far
gcc/testsuite/ChangeLog:
2017-05-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.target/s390/vector/vec-nopeel-2.c: New test.
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c
new file mode 100644
index 0
Included the requested changes in the patches (to follow). I removed
the alignment count check now altogether.
> I'm not sure why you test for unlimited_cost_model here as I said
> elsewhere I'm not sure
> what not cost modeling means for static decisions. The purpose of
> unlimited_cost_model
ping.
> use INTEGRAL_TYPE_P.
Done.
> but you do not actually _use_ vr_outer. Do you think that if
> vr_outer is a VR_RANGE then the outer operation may not
> possibly have wrapped? That's a false conclusion.
These were remains of a previous version. vr_outer is indeed not needed
anymore; removed.
r max overflow, split/anti range). Test
suite on s390x has no regressions, bootstrap is ok, x86 running.
Regards
Robin
--
gcc/ChangeLog:
2017-06-19 Robin Dapp <rd...@linux.vnet.ibm.com>
* match.pd: Simplify wrapped binary operations.
diff --git a/gcc/match.pd b/gcc/match.pd
in
Ping.
> I can guess what is happening here. It's a 40 bits unsigned long long
> field, (s.b-8) will be like:
> _1 = s.b
> _2 = _1 + 0xf8
> Also get_range_info returns value range [0, 0xFF] for _1.
> You'd need to check if _1(with range [0, 0xFF]) + 0xf8
> overflows
> Hmm, won't "(uint32_t + uint32_t-CST) doesn't overflow" be a sufficient
> condition for such a transformation?
Yes, in principle this should suffice. What we're actually looking for
is something like a "proper" (or no) overflow, i.e. an overflow in both
min and max of the value range. In
(a +
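The "overflow in both min and max" idea can be sketched as a classification of a value range against a constant. This is an assumption-laden illustration for unsigned 32-bit values; the enum and classify_wrap are hypothetical names and do not correspond to the helper in the patch.

```c
#include <assert.h>
#include <stdint.h>

enum wrap_kind { WRAP_NONE, WRAP_ALL, WRAP_MIXED };

/* Classify a + cst for a in [min, max] (min <= max, unsigned 32 bit).
   For an unsigned add of a constant, wrapping is monotone in a, so
   checking the two end points of the range is enough.  */
enum wrap_kind classify_wrap (uint32_t min, uint32_t max, uint32_t cst)
{
  int lo_wraps = (uint32_t) (min + cst) < min;
  int hi_wraps = (uint32_t) (max + cst) < max;
  if (lo_wraps && hi_wraps)
    return WRAP_ALL;    /* "proper" overflow for every value */
  if (!lo_wraps && !hi_wraps)
    return WRAP_NONE;   /* no overflow for any value */
  return WRAP_MIXED;    /* depends on a; not safe to combine */
}

/* Small self-check (hypothetical helper, only for illustration).  */
void check_classify_wrap (void)
{
  assert (classify_wrap (1, 10, UINT32_MAX) == WRAP_ALL);
  assert (classify_wrap (0, 0, UINT32_MAX) == WRAP_NONE);
  assert (classify_wrap (0, 10, UINT32_MAX) == WRAP_MIXED);
}
```

Only the first two outcomes give a single, range-independent way to widen the constant. The mixed case has to be left alone.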
This tries to fold unconditionally and fixes some test cases.
gcc/ChangeLog:
2017-05-18 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-ssa-propagate.c
(substitute_and_fold_dom_walker::before_dom_children):
Always try to fold.
gcc/testsuite/ChangeLog:
2017-05-18 Robi
> Any reason to expose tree-vrp.c internal interface here? The function
> looks quite expensive. Overflow check can be done by get_range_info
> and simple wi::cmp calls. Existing code like in
> tree-ssa-loop-niters.c already does that. Also could you avoid using
> comma expressions in
match.pd part of the patch.
gcc/ChangeLog:
2017-05-18 Robin Dapp <rd...@linux.vnet.ibm.com>
* match.pd: Simplify wrapped binary operations.
* tree-vrp.c (extract_range_from_binary_expr_1): Add overflow
parameter.
(extract_range_from_binary_expr): Li
New testcases.
gcc/testsuite/ChangeLog:
2017-05-18 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.dg/wrapped-binop-simplify-signed-1.c: New test.
* gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test.
* gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test.
diff
The last version of the patch series caused some regressions for ppc64.
This was largely due to incorrect handling of unsupportable alignment
and should be fixed with the new version.
p2 and p5 have not changed but I'm posting the whole series again for
reference. p1 only changed comment
gcc/testsuite/ChangeLog:
2017-05-23 Robin Dapp <rd...@linux.vnet.ibm.com>
* gcc.target/s390/vector/vec-nopeel-2.c: New test.
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c
new file mode 100644
index 0
gcc/ChangeLog:
2017-05-23 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
Create DR_HAS_NEGATIVE_STEP.
(vect_update_misalignment_for_peel): Define DR_MISALIGNMENT.
(vect_enhance_data_refs_alignment
gcc/ChangeLog:
2017-05-23 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Rename.
(vect_get_peeling_costs_all_drs): Create function.
(vect_peeling_hash_get_lowest_cost):
gcc/ChangeLog:
2017-05-23 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling):
Return peeling info and set costs to zero for unlimited cost
model.
(vect_enhance_data_refs_alignment): Also inspect all da
gcc/ChangeLog:
2017-05-23 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_get_data_access_cost):
Workaround for SLP handling.
(vect_enhance_data_refs_alignment):
Compute costs for doing no peeling at all, compare to the best
p
ld series itself (-p3)
doesn't apply to trunk anymore (because of the change in
vect_enhance_data_refs_alignment).
Regards
Robin
--
gcc/ChangeLog:
2017-05-24 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_get_peeling_costs_all_drs):
Introduce unkno
> Not sure I've understood the series TBH, but is the npeel == vf / 2
> there specifically for the "unknown number of peels" case? How do
> we distinguish that from the case in which the number of peels is
> known to be vf / 2 at compile time? Or have I missed the point
> completely? (probably
> http://gcc.gnu.org/ml/gcc-testresults/2017-06/msg00297.html
What machine is this running on? power4 BE? The tests are compiled with
--with-cpu-64=power4 apparently. I cannot reproduce this on power7
-m32. Is it possible to get more detailed logs or machine access to
reproduce?
Regards
Robin
> Patch 6 breaks no-vfa-vect-57.c on powerpc.
Which CPU model (power6/7/8?) and which compile options (-maltivec/
-mpower8-vector?) have been used for running and compiling the test? As
discussed in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80925
this has an influence on the cost function and
gcc/ChangeLog:
2017-04-26 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost):
Change cost model.
(vect_peeling_hash_choose_best_peeling): Return extended peel info.
(vect_peeling_supportable): Return peeling
Wrap some frequently used snippets in separate functions.
gcc/ChangeLog:
2017-04-26 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Rename.
(vect_get_peeling_costs_all_drs): Create fu
Some refactoring and definitions to use for (unknown) DR_MISALIGNMENT,
gcc/ChangeLog:
2017-04-26 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-data-ref.h (struct data_reference): Create DR_HAS_NEGATIVE_STEP.
* tree-vectorizer.h (dr_misalignment): Define DR_MISALI
Hi,
> This one only works for known misalignment, otherwise it's overkill.
>
> OTOH if with some refactoring we can end up using a single cost model
> that would be great. That is for the SAME_ALIGN_REFS we want to
> choose the unknown misalignment with the maximum number of
> SAME_ALIGN_REFS.
gcc/ChangeLog:
2017-05-08 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost):
Remove unused variable.
(vect_enhance_data_refs_alignment):
Compare best peelings costs to doing no peeling and choose no
p
> So the new part is the last point? There's a lot of refactoring in
3/3 that
> makes it hard to see what is actually changed ... you need to resist
> in doing this, it makes review very hard.
The new part is actually spread across the three last "-"s. Attached is
a new version of [3/3] split
gcc/ChangeLog:
2017-05-08 Robin Dapp <rd...@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling):
Return peel info.
(vect_enhance_data_refs_alignment):
Compute full costs when peeling for unknown alignment, compare
to
> Since this commit (r248678), I've noticed regressions on some arm targets.
> Executed from: gcc.dg/tree-ssa/tree-ssa.exp
> gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment
> of access forced using peeling" 1
> gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect
>
gcc/ChangeLog:
2017-10-17 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (s390_bb_fallthru_entry_likely): New
function.
(s390_sched_init): Do not reset s390_sched_state if we entered
the current basic block via a fallthru edge and all othe
This patch introduces balancing of long-running instructions that may clog the
pipeline.
gcc/ChangeLog:
2017-10-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (NUM_SIDES): New constant.
(LONGRUNNING_THRESHOLD): New constant.
(LATENCY_FACTOR
This patch fixes cases where we start a new group although the previous one has
not ended.
Regression tested on s390x.
gcc/ChangeLog:
2017-10-11 Robin Dapp <rd...@linux.vnet.ibm.com>
* config/s390/s390.c (s390_has_ok_fallthru): New function.
(s390_sched_score): Tempo
> Preserving the sched state across basic blocks for your case works only if
> the BBs are traversed
> with the fall through edges coming first. Is that the case? We probably
> should have a description
> for s390_last_sched_state stating this.
Committed as attached with an additional comment
explicitly state -march=z13 -mtune=zEC12.
Regards
Robin
--
gcc/ChangeLog:
2018-06-04 Robin Dapp
* config/s390/s390.h (enum processor_flags): Do not use
default tune parameter when -march was specified.
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index a372981ff3a
Hi,
when investigating a regression, I realized that we create a superfluous
load on S390. The snippet looks something like
LA %r10, 0(%r8,%r9)
LLH %r4, 0(%r10)
meaning the address in r10 is computed by an LA even though LLH supports
the addressing already. The same address is used multiple
d the comment.
Regards
Robin
--
gcc/ChangeLog:
2018-07-16 Robin Dapp
* config/s390/s390.c (preferred_la_operand_p): Do not use
LA with index register on z196 or later.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 23c3f3db621..d8b47c6fe67 100644
--- a/g
Hi,
this patch increases the latency of some floating point instructions to better
match the real machine's behavior.
Regards
Robin
--
gcc/ChangeLog:
2018-09-06 Robin Dapp
* config/s390/2827.md: Increase latencies for some FP instructions.
---
gcc/config/s390/2827.md | 14
Sorry, forgot the [S/390] tag in the subject.
Similar to zEC12, the change in latencies helps match the real machine's
behavior better.
--
gcc/ChangeLog:
2018-09-06 Robin Dapp
* config/s390/2964.md: Increase latencies for some FP instructions.
---
gcc/config/s390/2964.md | 80 ++---
1 file
Hi,
this patch avoids emitting LA on z13 and later when the address has both
an index and a base since a regular add is faster in that case.
Regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2018-07-05 Robin Dapp
* config/s390/s390.c (preferred_la_operand_p): Do not use
Hi,
we recently hit a problem where fwprop would not propagate a memory
address into an insn because our backend (s390) tells it that the
address_cost ()s for an address with index are higher than for one
without. Subsequently, should_replace_address () returns false and no
propagation is
"Os")))
void bar () {};
I did not observe that the default alignment, once set, was reset anywhere.
Regards
Robin
--
gcc/ChangeLog:
2018-07-11 Robin Dapp
* config/s390/s390.c (s390_default_align): Set default
function alignment.
(s390_override_options_after_ch
il
without the patch as we can get lucky with the alignment).
Regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2018-07-12 Robin Dapp
* config/s390/s390.c (s390_default_align): Set default function
alignment to 16.
(s390_override_options_after_change
ping, any insight on this?
Regards
Robin
Hi,
committed only the zEC12 part for now. Performance behavior of z13 with
the patch is still unclear and will be tackled separately.
Regards
Robin
ng.
Regards
Robin
gcc/ChangeLog:
2018-10-15 Robin Dapp
* haifa-sched.c (priority): Add force_recompute parameter.
(apply_replacement):
Call priority () with force_recompute = true.
(restore_pattern): Likewise.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-s
> A C++ style nit/question: instead of adding a new overload
>
> priority (rtx_insn *, bool)
>
> you can add a parameter with a default value in the existing
> static function
>
> priority (rtx_insn *insn, bool force_recompute = false)
Sometimes I'm still stuck in C land with GCC :),
> Still OK :-)
Committed as r265304.
Regards
Robin
/ChangeLog:
2018-10-16 Robin Dapp
* haifa-sched.c (priority): Add force_recompute parameter.
(apply_replacement): Call priority () with force_recompute = true.
(restore_pattern): Likewise.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1fdc9df9fb2..2c84ce38143 100644
Hi,
this enables QImode and HImode for load on condition. For SPEC2006 this
reduces code size overall; the performance impact is negligible.
Regtested on s390x.
Regards
Robin
--
gcc/ChangeLog:
2018-10-18 Robin Dapp
* config/s390/s390.md: Add movcc for QImode and HImode.
diff --git
Hi,
this allows immediates in the load-on-condition expander on z13 or later.
Regtested on z14.
Regards
Robin
--
gcc/ChangeLog:
2018-10-17 Robin Dapp
* config/s390/predicates.md:
Allow immediate operand in loc_operand for z13.
* config/s390/s390.md: Use
/ChangeLog:
2018-10-26 Robin Dapp
* config/s390/predicates.md: Fix typo.
* config/s390/s390.md: Allow immediates for load on condition.
gcc/testsuite/ChangeLog:
2018-10-26 Robin Dapp
* gcc.dg/loop-8.c: On s390, always run the test with -march=zEC12.
diff --git a/gcc
Hi,
this is v2 of the patch with less quirky pattern syntax and two tests.
Regards
Robin
--
gcc/ChangeLog:
2018-10-26 Robin Dapp
* config/s390/s390.md: QImode and HImode for load on condition.
gcc/testsuite/ChangeLog:
2018-10-26 Robin Dapp
* gcc.target/s390/ifcvt
Hi,
the attached patch increases the move costs for moves involving the CC
register. This saves us some instructions in SPEC CPU2006.
Regards
Robin
--
gcc/ChangeLog:
2018-11-05 Robin Dapp
* config/s390/s390.c (s390_register_move_cost): Increase costs
for moves involving