On 09/30/2014 02:52 AM, Jakub Jelinek wrote:
On Tue, Sep 30, 2014 at 11:03:47AM +0400, Varvara Rainchik wrote:
Corrected patch: call pthread_setspecific (gomp_tls_key, NULL) in
gomp_thread_start if HAVE_TLS is not defined.
2014-09-19 Varvara Rainchik varvara.rainc...@intel.com
*
On 09/29/2014 11:12 AM, Jiong Wang wrote:
+inline rtx single_set_no_clobber_use (const rtx_insn *insn)
+{
+  if (!INSN_P (insn))
+    return NULL_RTX;
+
+  if (GET_CODE (PATTERN (insn)) == SET)
+    return PATTERN (insn);
+
+  /* Defer to the more expensive case, and return NULL_RTX if
On 09/25/2014 08:05 AM, James Greenhalgh wrote:
On Fri, Sep 19, 2014 at 05:57:06PM +0100, Richard Henderson wrote:
On 09/11/2014 01:29 AM, James Greenhalgh wrote:
+;; Predicates used by the various SIMD shift operations. These
+;; fall in to 3 categories.
+;; Shifts with a range 0
On 09/20/2014 11:23 AM, Segher Boessenkool wrote:
+(define_code_attr iorxor [(ior ior) (xor xor)])
+(define_code_attr IORXOR [(ior IOR) (xor XOR)])
You don't need these. They are code and CODE respectively.
r~
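For reference, the built-in spellings look like this in a machine description (an illustrative sketch, not from the patch; the iterator and pattern names are mine):

```
;; With a code iterator such as:
;;   (define_code_iterator LOGIC [ior xor])
;; <code> expands to the lowercase code name ("ior"/"xor") and
;; <CODE> to the uppercase one, so no define_code_attr is needed:
(define_insn "<code>si3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (LOGIC:SI (match_operand:SI 1 "register_operand" "r")
                  (match_operand:SI 2 "register_operand" "r")))]
  ""
  "<code> %0,%1,%2")
```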
On 09/11/2014 01:29 AM, James Greenhalgh wrote:
+;; Predicates used by the various SIMD shift operations. These
+;; fall in to 3 categories.
+;; Shifts with a range 0-(bit_size - 1) (aarch64_simd_shift_imm)
+;; Shifts with a range 1-bit_size (aarch64_simd_shift_imm_offset)
+;; Shifts
On 09/04/2014 07:04 AM, Ramana Radhakrishnan wrote:
gcc/Changelog
2014-09-04 Marcus Shawcroft marcus.shawcr...@arm.com
Ramana Radhakrishnan ramana.radhakrish...@arm.com
* config/aarch64/aarch64-elf-raw.h (ENDFILE_SPEC): Add
crtfastmath.o.
*
On 09/03/2014 04:06 AM, Marcus Shawcroft wrote:
On 22 August 2014 23:05, Richard Henderson r...@redhat.com wrote:
Generic code already handles calls_alloca for determining
the need for a frame pointer.
* config/aarch64/aarch64.c (aarch64_frame_pointer_required): Don't
check
Is it intentional or not that AArch64 does not define __ARM_NEON__?
Otherwise, here's a better way to fold the test bits. AArch64 of
course does not have dN+1 overlap the high part of the qM register,
like AArch32, so the current
l = vpadd_u8 (vget_low_u8 (t), vget_high_u8 (t));
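For what it's worth, the pairwise-add step quoted above can be modeled in portable C (a hedged scalar sketch; the names `t` and `l` follow the quoted intrinsic line, the model function is mine):

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of l = vpadd_u8 (vget_low_u8 (t), vget_high_u8 (t)):
   with the low and high halves of the 16-byte vector t as the two
   operands, each output lane is the sum of one adjacent input pair.  */
static void
vpadd_low_high_model (const uint8_t t[16], uint8_t l[8])
{
  for (int i = 0; i < 8; i++)
    l[i] = (uint8_t) (t[2 * i] + t[2 * i + 1]);
}
```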
On 09/02/2014 08:34 AM, Kyrill Tkachov wrote:
2014-09-02 Kyrylo Tkachov kyrylo.tkac...@arm.com
* config/aarch64/predicates.md (aarch64_comparison_operation):
New special predicate.
* config/aarch64/aarch64.md (*csinc2<mode>_insn): Use
aarch64_comparison_operation instead of
On 09/02/2014 08:51 AM, Ramana Radhakrishnan wrote:
The ADDV instruction isn't available on the AArch32 side IIRC. Given that
situation there is no intrinsic for ADDV on the AArch32 side which is why this
doesn't exist in the AArch32 version of arm_neon.h :(
Whoops, yes indeed. I clearly
On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
Index: config/i386/i386.c
===
--- config/i386/i386.c(revision 211826)
+++ config/i386/i386.c(working copy)
@@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx
On 09/02/2014 01:59 AM, Matthew Fortune wrote:
gcc/
* target.def (TARGET_DWARF_FRAME_REG_MODE): New target hook.
* targhooks.c (default_dwarf_frame_reg_mode): New function.
* targhooks.h (default_dwarf_frame_reg_mode): New prototype.
* doc/tm.texi.in
On 08/06/2014 03:05 AM, Varvara Rainchik wrote:
* libgomp.h (gomp_thread): For non TLS case create thread data.
* team.c (create_non_tls_thread_data): New function.
---
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index a1482cc..cf3ec8f 100644
---
On 08/26/2014 10:15 AM, David Malcolm wrote:
Attached is a revised version of #225, with the following changes:
* fix for the above: avoid introducing a new shadow name note within
force_nonfallthru_and_redirect by introducing a new local rtx_insn *
new_head and renaming note to it in the
On 08/26/2014 05:58 AM, Jiong Wang wrote:
On 22/08/14 23:05, Richard Henderson wrote:
Don't continually re-read data from cfun->machine.
* config/aarch64/aarch64.c (aarch64_expand_prologue): Load
cfun->machine->frame.hard_fp_offset into a local variable.
---
gcc/config/aarch64
On 08/28/2014 05:47 PM, David Malcolm wrote:
-	  insn = as_a <rtx_insn *> (
-	    gen_extend_insn (op0, t, promoted_nominal_mode,
-			     data->passed_mode, unsignedp));
-	  emit_insn (insn);
+	  rtx pat = gen_extend_insn (op0, t,
On 08/27/2014 08:48 AM, David Malcolm wrote:
Alternatively, should this simply use single_set?
Yes.
(though I think that's a more invasive change, especially since some of
the logic is for non-SETs).
I don't think that's the case. Take the tests in order:
if (mn10300_tune_cpu ==
On 08/27/2014 09:32 AM, David Malcolm wrote:
* gcc/config/mn10300/mn10300.c (is_load_insn): Rename to...
(set_is_load_p): ...this, updating to work on a SET pattern rather
than an insn.
(is_store_insn): Rename to...
(set_is_store_p): ...this, updating to work on
On 08/26/2014 05:59 AM, Evgeny Stupachenko wrote:
+(define_insn_and_split "avx2_rotate<mode>_perm"
+  [(set (match_operand:V_256 0 "register_operand" "=x")
+       (vec_select:V_256
+         (match_operand:V_256 1 "register_operand" "x")
+         (match_parallel 2 "palignr_operand"
+ [(match_operand
On 08/26/2014 09:00 AM, David Malcolm wrote:
OK for trunk?
David Malcolm (3):
Convert nonlocal_goto_handler_labels from an EXPR_LIST to an INSN_LIST
Convert forced_labels from an EXPR_LIST to an INSN_LIST
Use rtx_insn in more places in dwarf2cfi.c
Ok to all. Thanks.
r~
On 08/22/2014 02:14 PM, Joseph S. Myers wrote:
Tested with no regressions for cross to arm-none-eabi (it also fixes
failures of gcc.dg/noncompile/920507-1.c, which is PR 61330). OK to
commit?
2014-08-22 Joseph Myers jos...@codesourcery.com
PR target/60606
PR target/61330
Delay cfi restore opcodes until the stack frame is deallocated.
This reduces the number of cfi advance opcodes required.
We perform a similar optimization in the x86_64 epilogue.
* config/aarch64/aarch64.c (aarch64_popwb_single_reg): Remove.
(aarch64_popwb_pair_reg): Remove.
Generic code already handles calls_alloca for determining
the need for a frame pointer.
* config/aarch64/aarch64.c (aarch64_frame_pointer_required): Don't
check calls_alloca.
---
gcc/config/aarch64/aarch64.c | 5 -
1 file changed, 5 deletions(-)
diff --git
We were marking more than necessary in aarch64_set_frame_expr.
Fold the reduced function into aarch64_expand_prologue as necessary.
* config/aarch64/aarch64.c (aarch64_set_frame_expr): Remove.
(aarch64_expand_prologue): Use REG_CFA_ADJUST_CFA directly,
or no special markup
disabling the frame pointer. But fwiw,
it was 220k from 1.2M in cc1plus, or just shy of 20%.
Ok?
r~
Richard Henderson (4):
aarch64: Improve epilogue unwind info
aarch64: Tidy prologue unwind notes
aarch64: Tidy prologue local variables
aarch64: Don't duplicate calls_alloca check
gcc
Don't continually re-read data from cfun->machine.
* config/aarch64/aarch64.c (aarch64_expand_prologue): Load
cfun->machine->frame.hard_fp_offset into a local variable.
---
gcc/config/aarch64/aarch64.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git
On 08/19/2014 07:12 AM, Marek Polacek wrote:
On some archs, C[TL]Z_DEFINED_VALUE_AT_ZERO macros return only
true/false, so -Wbool-compare would warn.
Then we should fix them to return 0/1 instead.
r~
On 08/19/2014 06:29 AM, Kyrill Tkachov wrote:
+(define_special_predicate "cc_register_zero"
+  (and (match_code "reg")
+       (and (match_test "REGNO (op) == CC_REGNUM")
+            (ior (match_test "mode == GET_MODE (op)")
+                 (ior (match_test "mode == VOIDmode
+
On 08/19/2014 08:54 AM, Marek Polacek wrote:
Works as well. So is the following ok once the regtest finishes?
Bootstrapped on x86_64-linux.
2014-08-19 Marek Polacek pola...@redhat.com
* config/alpha/alpha.h (CLZ_DEFINED_VALUE_AT_ZERO,
CTZ_DEFINED_VALUE_AT_ZERO): Return
(define_special_predicate "cc_register_zero"
  (match_code "reg")
{
  return (REGNO (op) == CC_REGNUM
          && (GET_MODE (op) == CCmode
              || GET_MODE (op) == CC_Zmode
              || GET_MODE (op) == CC_NZmode));
})
... and now that I read the backend more closely, I see _zero
On 08/06/2014 10:19 AM, David Malcolm wrote:
@@ -2772,11 +2772,11 @@ mn10300_adjust_sched_cost (rtx insn, rtx link, rtx
dep, int cost)
if (!TARGET_AM33)
return 1;
-  if (GET_CODE (insn) == PARALLEL)
-    insn = XVECEXP (insn, 0, 0);
+  if (GET_CODE (PATTERN (insn)) == PARALLEL)
On 08/06/2014 10:23 AM, David Malcolm wrote:
gcc/
* rtl.h (rtx_expr_list::insn): New method.
---
gcc/rtl.h | 9 +
1 file changed, 9 insertions(+)
diff --git a/gcc/rtl.h b/gcc/rtl.h
index d028be1..d5811c2 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -414,6 +414,10 @@ public:
On 08/06/2014 10:23 AM, David Malcolm wrote:
gcc/
* function.h (struct rtl_data): Strengthen fields x_return_label
and x_naked_return_label from rtx to rtx_code_label *.
---
gcc/function.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
#207 - #220 are all OK.
r~
On 08/06/2014 10:23 AM, David Malcolm wrote:
else if (computed_jump_p (insn))
{
for (rtx_expr_list *lab = forced_labels; lab; lab = lab->next ())
-  maybe_record_trace_start (lab->element (), insn);
+  maybe_record_trace_start (lab->insn (), insn);
}
I
On 08/06/2014 10:23 AM, David Malcolm wrote:
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 59d633d..5e42a97 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -1604,6 +1604,7 @@ force_nonfallthru_and_redirect (edge e, basic_block
target, rtx jump_label)
if (EDGE_COUNT (e->src->succs) >= 2
On 08/06/2014 10:23 AM, David Malcolm wrote:
gcc/
* output.h (insn_current_reference_address): Strengthen param
from rtx to rtx_insn *.
* final.c (insn_current_reference_address): Likewise.
#223 and #224 are ok.
r~
On 08/06/2014 10:23 AM, David Malcolm wrote:
gcc/
* rtl.h (tablejump_p): Strengthen first param from const_rtx to
const rtx_insn *.
(label_is_jump_target_p): Likewise for second param.
* rtlanal.c (tablejump_p): Likewise for param insn.
On 08/06/2014 10:23 AM, David Malcolm wrote:
/
rtx-classes-status.txt: Delete
---
rtx-classes-status.txt | 9 -
1 file changed, 9 deletions(-)
delete mode 100644 rtx-classes-status.txt
#230 - #236 are ok.
r~
On 08/06/2014 10:23 AM, David Malcolm wrote:
This patch updates NEXT_INSN and PREV_INSN to work on rtx_insn *, rather
than plain rtx - plus miscellaneous fixes needed to get there.
Ug. Bigger than I'd have liked, but still ok.
r~
On 08/19/2014 02:35 PM, David Malcolm wrote:
This one is quite ugly: the pre-existing code has two locals named
note, both of type rtx, with one shadowing the other. This patch
introduces a third, within the scope where the name note is used for
insns. In the other scopes the two other note
On 08/19/2014 10:17 AM, Marek Polacek wrote:
My recent patch broke bootstrap on ppc64, because, by default,
char on ppc defaults to be an unsigned char. But the code relied
on char being signed by default.
Furthermore, the compat warning about // comments shouldn't be issued
in C++ mode at
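The plain-char signedness pitfall described in that snippet is easy to demonstrate in portable C (a sketch; the check is standard C, the helper names are mine):

```c
#include <assert.h>
#include <limits.h>

/* Whether plain char is signed is implementation-defined: it is
   typically signed on x86 Linux but unsigned on ppc and arm Linux,
   so code relying on a negative plain-char value breaks there.  */
static int
plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* Store -1 in a plain char and check whether it reads back negative;
   this must agree with CHAR_MIN on any conforming implementation.  */
static int
reads_back_negative (void)
{
  char c = -1;
  return c < 0;
}
```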
On 08/13/2014 05:29 AM, Kyrill Tkachov wrote:
Is the attached patch ok? It just moves the section as you suggested. I did a
build of the Linux kernel with and without this patch to make sure no code-gen
was accidentally affected.
Looks good.
We'd need to store a mapping from constant to
On 08/03/2014 03:39 AM, Richard Sandiford wrote:
+struct rtx_subrtx_bound_info {
+ unsigned char start;
+ unsigned char count;
+};
Given this structure is only two bytes...
+ /* The bounds to use for iterating over subrtxes. */
+ const rtx_subrtx_bound_info *m_bounds;
... wouldn't it
On 07/29/2014 06:11 AM, Uros Bizjak wrote:
Perhaps even better solution for mainline would be to detect a recent
enough linker and skip the workaround in that case? I guess that 2.25
will have this issue fixed?
Certainly 2.25 will have this fixed. If you want to add a check for binutils
On 07/26/2014 05:35 AM, Uros Bizjak wrote:
On Mon, May 2, 2011 at 9:21 AM, Uros Bizjak ubiz...@gmail.com wrote:
It looks that GP relative relocations do not fit anymore into GPREL16
reloc, so bootstrap on alpha hosts fail in stage2 with relocation
truncated to fit: GPREL16 against I
I noticed this while backporting support to the 4.9 branch.
I'm not sure what I was thinking when I wrote this originally;
probably too much cut-and-paste from another implementation.
Anyway, sanity tested and committed.
r~
* config/aarch64/sjlj.S (_ITM_beginTransaction): Use post-inc
On 07/07/2014 02:10 AM, Richard Biener wrote:
On Mon, Jun 30, 2014 at 5:54 PM, Richard Henderson r...@redhat.com wrote:
On 06/29/2014 11:14 AM, Uros Bizjak wrote:
if (MEM_READONLY_P (x))
+if (GET_CODE (mem_addr) == AND)
+ return 1;
return 0;
Certainly missing braces here
On 07/03/2014 02:53 AM, Evgeny Stupachenko wrote:
-expand_vec_perm_palignr (struct expand_vec_perm_d *d)
+expand_vec_perm_palignr (struct expand_vec_perm_d *d, int insn_num)
insn_num might as well be bool avx2, since it's only ever set to two values.
- /* Even with AVX, palignr only operates
On 07/07/2014 07:34 AM, Richard Biener wrote:
Ugh. I wasn't aware of that - is this documented anywhere? What
exactly does such address conflict with? Does it inhibit type-based analysis?
Dunno if it's documented anywhere. Such addresses conflict with anything,
unless it can be proven not
On 07/07/2014 09:35 AM, Uros Bizjak wrote:
On Mon, Jul 7, 2014 at 5:01 PM, Richard Henderson r...@redhat.com wrote:
Early alpha can't store sub-4-byte quantities. Altivec can't store anything
but 16 byte quantities. In order to perform smaller stores, we have to do a
read-modify-write
On 06/29/2014 12:51 PM, Uros Bizjak wrote:
I believe that attached v2 patch addresses all your review comments.
Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Looks good, thanks.
r~
On 06/29/2014 11:14 AM, Uros Bizjak wrote:
if (MEM_READONLY_P (x))
+if (GET_CODE (mem_addr) == AND)
+ return 1;
return 0;
Certainly missing braces here. But with that fixed the patch looks plausible.
I'll look at it closer later today.
r~
On 06/26/2014 02:15 PM, Uros Bizjak wrote:
* except.c (emit_note_eh_region_end): New helper function.
(convert_to_eh_region_ranges): Use emit_note_eh_region_end to
emit EH_REGION_END note.
This bit looks good.
rtx insn, next, prev;
- for (insn = get_insns (); insn; insn =
On 06/27/2014 10:04 AM, Uros Bizjak wrote:
This happened due to the way stores to QImode and HImode locations are
implemented on non-BWX targets. The sequence reads full word, does its
magic to the part and stores the full word with changed part back to
the memory. However - the scheduler
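The load/modify/store sequence described there can be modeled in C (a sketch under my own naming; the real Alpha sequence uses insbl/mskbl-style instructions, this only shows the word-granularity read-modify-write that the scheduler sees as full-word accesses):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of storing one byte on a target without byte stores: read the
   containing aligned 32-bit word, merge the byte, write the whole word
   back.  The full-word load and store are what the scheduler sees.  */
static void
store_byte_rmw (uint8_t *base, size_t offset, uint8_t value)
{
  size_t word_off = offset & ~(size_t) 3;       /* containing word     */
  unsigned shift = (unsigned) (offset & 3) * 8; /* little-endian lane  */

  uint32_t word;
  memcpy (&word, base + word_off, 4);           /* full-word read      */
  word = (word & ~((uint32_t) 0xff << shift))   /* clear the byte lane */
         | ((uint32_t) value << shift);         /* insert the new byte */
  memcpy (base + word_off, &word, 4);           /* full-word write     */
}
```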
On 06/26/2014 02:43 AM, Uros Bizjak wrote:
2014-06-26 Uros Bizjak ubiz...@gmail.com
PR target/61586
* config/alpha/alpha.c (alpha_handle_trap_shadows): Handle BARRIER RTX.
testsuite/ChangeLog:
2014-06-26 Uros Bizjak ubiz...@gmail.com
PR target/61586
*
On 06/25/2014 08:28 AM, Jeff Law wrote:
Ask an ARM maintainer if the new code is actually better than the old code.
It isn't.
It appears that with the peep2 pass moved that we actually if-convert the
fall-thru path of the conditional and eliminate the conditional. Which, on the
surface seems
On 06/25/2014 06:35 AM, Kai Tietz wrote:
Hello,
so there seems to be a fallout caused by moving peephole2 pass. See PR/61608.
So we need indeed 2 peephole2 passes.
We don't need a second peephole pass. Please try this.
I think there's room for cleanup here, depending on when we leave
On 06/23/2014 02:29 AM, Ramana Radhakrishnan wrote:
On 20/06/14 21:28, Richard Henderson wrote:
There aren't too many users of the cmpelim pass, and previously they were all
small embedded targets without an FPU.
I'm a bit surprised that Ramana decided to enable this pass for aarch64
On 06/20/2014 02:59 PM, Kai Tietz wrote:
So I suggest following change of passes.def:
Index: passes.def
===
--- passes.def (Revision 211850)
+++ passes.def (Arbeitskopie)
@@ -384,7 +384,6 @@ along with GCC; see the file
On 06/20/2014 01:42 PM, Marek Polacek wrote:
2014-06-20 Marek Polacek pola...@redhat.com
* genpreds.c (verify_rtx_codes): New function.
(main): Call it.
* rtl.h (RTX_FLD_WIDTH, RTX_HWINT_WIDTH): Define.
(struct rtx_def): Use them.
Looks pretty good. Just a few
On 06/23/2014 09:22 AM, Jeff Law wrote:
On 06/23/14 08:32, Richard Biener wrote:
Btw, there is now no DCE after peephole2? Is peephole2 expected to
cleanup after itself?
There were cases where we wanted to change the insns we would output to fit
into the 4:1:1 issue model of the PPro, but to
On 06/23/2014 08:55 AM, Ramana Radhakrishnan wrote:
Agreed, this is why cmpelim looks interesting for Thumb1. (We may need another
hook or something to disable it in configurations we don't need it in, but you
know ... )
Yeah. Feel free to change targetm.flags_regnum from a variable to a
On 06/20/2014 08:56 AM, Kai Tietz wrote:
+(define_split
+  [(set (match_operand:W 0 "register_operand")
+        (match_operand:W 1 "memory_operand"))
+   (set (pc) (match_dup 0))]
+  "!TARGET_X32 && peep2_reg_dead_p (2, operands[0])"
+  [(set (pc) (match_dup 1))])
+
Huh? You can't use peep2 data
On 06/20/2014 10:52 AM, Kai Tietz wrote:
2014-06-20 Kai Tietz kti...@redhat.com
PR target/39284
* passes.def (peephole2): Add second peephole2 pass before
split before sched2 pass.
* config/i386/i386.md (peehole2): To combine
indirect jump with memory.
(split2):
There aren't too many users of the cmpelim pass, and previously they were all
small embedded targets without an FPU.
I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
that target is not so limited as the block comment for the pass describes.
Honestly, whatever is being
On 06/19/2014 05:39 AM, Tom de Vries wrote:
2014-06-19 Tom de Vries t...@codesourcery.com
* final.c (collect_fn_hard_reg_usage): Add and use variable
function_used_regs.
Looks good, thanks.
r~
On 06/19/2014 01:39 AM, Tom de Vries wrote:
On 19-06-14 05:53, Richard Henderson wrote:
Do we in fact make sure this isn't an ifunc resolver? I don't immediately
see
how those get wired up in the cgraph...
Richard,
using the patch below I changed the
gcc/testsuite/gcc.target/i386/fuse
On 06/19/2014 09:06 AM, Tom de Vries wrote:
2014-06-19 Tom de Vries t...@codesourcery.com
* final.c (collect_fn_hard_reg_usage): Don't save function_used_regs if
it contains all call_used_regs.
Ok.
r~
On 06/19/2014 09:07 AM, Tom de Vries wrote:
2014-06-19 Tom de Vries t...@codesourcery.com
* final.c (collect_fn_hard_reg_usage): Add separate IOR_HARD_REG_SET for
get_call_reg_set_usage.
Ok, as far as it goes, but...
It seems like there should be quite a bit of overlap with
On 06/19/2014 09:37 AM, Tom de Vries wrote:
On 19-06-14 05:59, Richard Henderson wrote:
On 06/01/2014 04:27 AM, Tom de Vries wrote:
+ if (TARGET_AAPCS_BASED)
+{
+ /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+ linker. We need to add these to allow
On 06/19/2014 09:40 AM, Richard Henderson wrote:
It appears that regs_ever_live includes any register mentioned explicitly, and
thus the only registers it doesn't contain are those killed by the callees.
That should be an easier scan than the rtl, since we have those already
collected
On 06/19/2014 11:25 AM, Tom de Vries wrote:
On 19-06-14 05:53, Richard Henderson wrote:
On 06/01/2014 03:00 AM, Tom de Vries wrote:
+aarch64_emit_call_insn (rtx pat)
+{
+ rtx insn = emit_call_insn (pat);
+
+ rtx *fusage = &CALL_INSN_FUNCTION_USAGE (insn);
+ clobber_reg (fusage
On 06/19/2014 12:36 PM, Jan Hubicka wrote:
On 06/19/2014 09:06 AM, Tom de Vries wrote:
2014-06-19 Tom de Vries t...@codesourcery.com
* final.c (collect_fn_hard_reg_usage): Don't save function_used_regs if
it contains all call_used_regs.
Ok.
When we now have way to represent
On 02/28/2014 01:32 AM, Ramana Radhakrishnan wrote:
Hi,
This defines TARGET_FLAGS_REGNUM for AArch64 to be CC_REGNUM. Noticed this
turns on the cmpelim pass after reload and in a few examples and a couple of
benchmarks I noticed a number of comparisons getting deleted. A similar patch
On 06/18/2014 03:57 PM, Kyle McMartin wrote:
pretty sure we need a similar fix for tlsgd_small, since __tls_get_addr
could clobber CC as well.
As I replied in IRC, no, because tlsgd_small is modeled with an actual
CALL_INSN, and thus call-clobbered registers work as normal.
r~
On 06/01/2014 03:00 AM, Tom de Vries wrote:
+/* Emit call insn with PAT and do aarch64-specific handling. */
+
+bool
+aarch64_emit_call_insn (rtx pat)
+{
+ rtx insn = emit_call_insn (pat);
+
+ rtx *fusage = &CALL_INSN_FUNCTION_USAGE (insn);
+ clobber_reg (fusage, gen_rtx_REG
On 06/01/2014 03:00 AM, Tom de Vries wrote:
+aarch64_emit_call_insn (rtx pat)
+{
+ rtx insn = emit_call_insn (pat);
+
+ rtx *fusage = &CALL_INSN_FUNCTION_USAGE (insn);
+ clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM));
+ clobber_reg (fusage, gen_rtx_REG (word_mode,
On 06/01/2014 04:27 AM, Tom de Vries wrote:
+ if (TARGET_AAPCS_BASED)
+{
+ /* For AAPCS, IP and CC can be clobbered by veneers inserted by the
+ linker. We need to add these to allow
+ arm_call_fusage_contains_non_callee_clobbers to return true. */
+ rtx *fusage =
On 05/19/2014 07:30 AM, Tom de Vries wrote:
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+    {
+      HARD_REG_SET insn_used_regs;
+
+      if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+      find_all_hard_reg_sets (insn, insn_used_regs, false);
+
+ if
On 06/17/2014 05:33 AM, Evgeny Stupachenko wrote:
+ 1st vec: 0 1 2 3 4 5 6 7
+ 2nd vec: 8 9 10 11 12 13 14 15
+ 3rd vec: 16 17 18 19 20 21 22 23
+
+ The output sequence should be:
+
+ 1st vec: 0 3 6 9 12 15 18 21
+ 2nd vec: 1 4 7 10 13 16 19 22
+ 3rd vec: 2
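The index pattern quoted above is a 3-way de-interleave, which can be modeled in plain C (a sketch; the function and names are mine, only the input/output sequences come from the patch comment):

```c
#include <assert.h>

/* Treating the three 8-element input vectors as one 24-element
   sequence, output vector j gathers elements j, j+3, j+6, ...
   (stride 3), matching the quoted before/after sequences.  */
enum { NVEC = 3, NELT = 8 };

static void
deinterleave3 (const int in[NVEC * NELT], int out[NVEC * NELT])
{
  for (int j = 0; j < NVEC; j++)
    for (int i = 0; i < NELT; i++)
      out[j * NELT + i] = in[j + 3 * i];
}
```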
Trivial fix for missing clobber of the flags over the tlsdesc call.
Ok for all branches?
r~
* config/aarch64/aarch64.md (tlsdesc_small_PTR): Clobber CC_REGNUM.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a4d8887..1ee2cae 100644
---
On 06/13/2014 08:36 AM, Jeff Law wrote:
So you may have answered this already, but why can't this be a combiner
pattern?
Until pass_duplicate_computed_gotos, we (intentionally) have a single indirect
branch in the entire function. This vastly reduces the size of the CFG.
Peep2 is currently
On 06/09/2014 03:13 AM, Evgeny Stupachenko wrote:
+  /* First we apply one operand permutation to the part where
+     elements stay not in their respective lanes.  */
+  dcopy = *d;
+  if (which == 2)
+    dcopy.op0 = dcopy.op1 = d->op1;
+  else
+    dcopy.op0 = dcopy.op1 = d->op0;
+
On 06/09/2014 12:10 PM, Evgeny Stupachenko wrote:
Nice catch.
Patch with corresponding changes:
Looks ok with an appropriate changelog.
r~
On 06/02/2014 02:32 PM, Uros Bizjak wrote:
+ * config/i386/i386.c (decide_alg): Correctly handle maximum size of
+ stringop algorithm.
Looks good.
r~
On 06/05/2014 08:29 AM, Evgeny Stupachenko wrote:
+  /* Figure out where permutation elements stay not in their
+     respective lanes.  */
+  for (i = 0, which = 0; i < nelt; ++i)
+    {
+      unsigned e = d->perm[i];
+      if (e != i)
+        which |= (e < nelt ? 1 : 2);
+    }
+ /* We
On 06/05/2014 09:47 AM, Kai Tietz wrote:
+(define_insn "*sibcall_intern"
+  [(call (unspec [(mem:QI (match_operand:W 0 "memory_operand"))]
Probably best to use memory_nox32_operand here (and the other define_insn
patterns) too.
Otherwise ok.
r~
On 06/04/2014 10:06 AM, Evgeny Stupachenko wrote:
Is it ok to use the following pattern?
patch passed bootstrap and make check, but one test failed:
gcc/testsuite/gcc.target/i386/vect-rebuild.c
It failed on /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
which is now
On 06/04/2014 05:37 AM, Kai Tietz wrote:
+(define_peephole2
+  [(set (match_operand:DI 0 "register_operand")
+        (match_operand:DI 1 "memory_operand"))
+   (call (mem:QI (match_operand:DI 2 "register_operand"))
+         (match_operand 3))]
+  "TARGET_64BIT && REG_P (operands[0])
+   && REG_P
On 06/04/2014 02:23 PM, Evgeny Stupachenko wrote:
Thanks. Moving pattern down helps. Now make check for the following
patch passed:
Excellent. This version looks good.
r~
On 06/03/2014 11:21 AM, Kai Tietz wrote:
case AX_REG:
case DX_REG:
- return true;
+ return (regno != DX_REG || !TARGET_64BIT || ix86_abi != MS_ABI);
You might as well eliminate the first test, and split the case entries:
case AX_REG:
return true;
case DX_REG:
On 06/03/2014 12:56 PM, Kai Tietz wrote:
+(define_insn "*sibcall_intern"
+  [(call (unspec [(mem:QI (match_operand:W 0 "memory_operand"))]
+                 UNSPEC_PEEPSIB)
+        (match_operand 1))]
+
+  "* SIBLING_CALL_P (insn) = 1; return ix86_output_call_insn (insn, operands[0]);"
+  [(set_attr "type"
On 06/03/2014 01:15 PM, Kai Tietz wrote:
- Original Message -
On 06/03/2014 12:56 PM, Kai Tietz wrote:
+(define_insn "*sibcall_intern"
+  [(call (unspec [(mem:QI (match_operand:W 0 "memory_operand"))]
+                 UNSPEC_PEEPSIB)
+        (match_operand 1))]
+
+ * SIBLING_CALL_P (insn) = 1; return
The scheme that alpha uses to split symbolic references into trackable pairs of
relocations doesn't really handle asms. Of course, normal asms don't have the
problem seen here because they're interested in producing instructions, whereas
this case is system tap creating some annotations.
The
On 05/30/2014 01:08 AM, Kai Tietz wrote:
(define_predicate "sibcall_memory_operand"
  (match_operand 0 "memory_operand")
{
return CONSTANT_P (op);
})
Surely that always returns false? Surely XEXP (op, 0) so that you look at the
address, not the memory.
r~
On 05/28/2014 01:43 AM, Kai Tietz wrote:
+ if (GET_CODE (op) == CONST)
+op = XEXP (op, 0);
+ return (GET_CODE (op) == SYMBOL_REF || CONSTANT_P (op));
Surely all this boils down to just CONSTANT_P (op),
without having to look through the CONST at all.
Otherwise this looks ok.
r~
On 05/28/2014 02:54 PM, Jeff Law wrote:
On 05/28/14 15:52, Jakub Jelinek wrote:
On Wed, May 28, 2014 at 05:28:31PM -0400, Kai Tietz wrote:
Yes, I missed the plus-part.
I am just running bootstrap with regression testing for altering predicate
to:
(define_predicate "sibcall_memory_operand"
On 05/22/2014 02:33 PM, Kai Tietz wrote:
* config/i386/i386.c (ix86_expand_call): Enforce for sibcalls
on memory the use of accumulator-register.
I don't like this at all.
I'm fine with allowing memories that are fully symbolic, e.g.
extern void (*foo)(void);
void f(void) { foo();
On 05/27/2014 08:39 AM, Jeff Law wrote:
But the value we want may be sitting around in a convenient register (such as
r11). So if we force the sibcall to use %rax, then we have to generate a
copy. Yet if we have a constraint for the set of registers allowed here, then
we give the register
On 05/27/2014 09:48 AM, Jeff Law wrote:
leaofs(base, index, scale), %eax
...
call*0(%eax)
we might as well include the memory load
movofs(base, index, scale), %eax
...
call*%eax
Ok. My misunderstanding.
Granted, this probably doesn't happen