Re: [PATCH, testsuite]: Do not run simulate-thread on alpha*-*-linux*

2011-11-16 Thread Uros Bizjak
On Wed, Nov 16, 2011 at 10:46 PM, Richard Henderson  wrote:
>> +/* Checks for an atomic sequence of instructions beginning with a LWARX/LDARX
>> +   instruction and ending with a STWCX/STDCX instruction.  If such a sequence
>> +   is found, attempt to step through it.  A breakpoint is placed at the end of
>> +   the sequence.  */
>> +
>
> s/LWARX/LDL_L/g
> s/LDARX/LDQ_L/g
> s/STWCX/STL_C/g
> s/STDCX/STQ_C/g
>
>> +          int immediate = ((insn & 0x001f) << 2);
>> +
>> +          if (bc_insn_count >= 1)
>> +            return 0; /* More than one conditional branch found, fallback
>> +                         to the standard single-step code.  */
>> +
>> +       breaks[1] = loc + ALPHA_INSN_SIZE + immediate;
>
> The immediate needs to be sign-extended.  Add:
>
>  immediate = (immediate ^ 0x40) - 0x40;
>
> Otherwise your computation of the address is correct.

Thanks, corrected gdb patch is at [1].

[1] http://sourceware.org/ml/gdb-patches/2011-11/msg00457.html

Uros.


Re: a question about IVOPTS: find_interesting_uses_address

2011-11-16 Thread Richard Guenther
On Wed, Nov 16, 2011 at 12:28 PM, Eric Botcazou  wrote:
>> Huh, IVOPTs should never cause a different size memory read.  I wonder
>> if the original issue would still reproduce with the fix reverted.
>
> The original issue was unaligned arrays in packed structures.  I don't see
> what could have changed since then.

Ah, so it was expand being confused about the constructed access (not seeing
a component-ref), thus the usual old issue we have there.  There's indeed no
trivial fix for this but to create the [TARGET_]MEM_REF with an explicitly
unaligned type and hope that the target implements the movmisalign optab
(the only case this explicit alignment is honored).  And double-check the
TARGET_MEM_REF expansion code properly duplicates the movmisalign
path from the MEM_REF expander.

I wish somebody would fix this path for non-movmisalign strict-align targets.

Richard.

> --
> Eric Botcazou
>


Re: When is the hardware related register is allocated?

2011-11-16 Thread Joern Rennecke

Quoting Feng LI :

> So I suppose with (clobber (reg:CC FLAGS_REG)), the compiler
> will be able to tell that this register is modified by this instruction and
> behave correctly when necessary?


Depends.  It will stop the compiler moving a FLAGS_REG setter and/or user
such that FLAGS_REG becomes live across your insn pattern.
But if you emit this pattern in a place where FLAGS_REG is already live
(e.g. when you generate it from some combiner pattern, or create it
via a split / peephole2 without FLAGS_REG clobber), there is a problem.


Re: [PATCH, testsuite]: Do not run simulate-thread on alpha*-*-linux*

2011-11-16 Thread Richard Henderson
> +/* Checks for an atomic sequence of instructions beginning with a LWARX/LDARX
> +   instruction and ending with a STWCX/STDCX instruction.  If such a sequence
> +   is found, attempt to step through it.  A breakpoint is placed at the end of
> +   the sequence.  */
> +

s/LWARX/LDL_L/g
s/LDARX/LDQ_L/g
s/STWCX/STL_C/g
s/STDCX/STQ_C/g

> +          int immediate = ((insn & 0x001f) << 2);
> +
> +          if (bc_insn_count >= 1)
> +            return 0; /* More than one conditional branch found, fallback
> +                         to the standard single-step code.  */
> +
> +          breaks[1] = loc + ALPHA_INSN_SIZE + immediate;

The immediate needs to be sign-extended.  Add:

  immediate = (immediate ^ 0x40) - 0x40;

Otherwise your computation of the address is correct.
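
The XOR-and-subtract idiom above can be sketched as a standalone helper. This is only an illustration: the name sign_extend and its bits parameter are assumptions made for this example, not gdb functions. The 7-bit width matches the (insn & 0x001f) << 2 field in the quoted snippet, whose sign bit is 0x40.

```c
#include <stdint.h>

/* Hypothetical helper, not part of gdb: interpret the low `bits` bits of
   `value` as a two's-complement number.  Valid for 1 <= bits <= 31.  */
int32_t
sign_extend (uint32_t value, unsigned bits)
{
  uint32_t sign = (uint32_t) 1 << (bits - 1);

  /* Keep only the field itself.  */
  value &= (sign << 1) - 1;

  /* Flipping the sign bit and subtracting it again maps values with the
     sign bit set to negative numbers and leaves the others unchanged.  */
  return (int32_t) (value ^ sign) - (int32_t) sign;
}
```

With the 7-bit field from the snippet, sign_extend (0x7c, 7) yields -4 and sign_extend (0x3c, 7) yields 60, which is the same result as (immediate ^ 0x40) - 0x40.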


r~


Re: [PATCH, testsuite]: Do not run simulate-thread on alpha*-*-linux*

2011-11-16 Thread Uros Bizjak
On Sun, Nov 13, 2011 at 10:41 PM, Richard Henderson  wrote:
> On 11/12/2011 07:38 AM, Uros Bizjak wrote:
>> On Fri, Nov 11, 2011 at 8:58 PM, Dominique Dhumieres wrote:
>>
>>> For the record, Jakub's comment on IRC:
>>>
>>>> with older gdb you simply had to find the stwcx
>>>> or whatever SC insn is, put a breakpoint after
>>>> it and continue instead of single stepping
>>
>> I'm not familiar enough with gdb scripting to implement this, but if
>> someone fixes the _.gdb script, I can test the correct fix...
>
> It's not in the gdb script, it's in gdb proper.
>
> If you care to have a peek, look at
>
>  /* Handles single stepping of atomic sequences.  */
>  set_gdbarch_software_single_step (gdbarch, ppc_deal_with_atomic_sequence);
>
> in gdb/rs6000-tdep.c.  It ought to be fairly easy to write
> something similar for alpha-tdep.c.

Indeed, the attached is a WIP gdb patch that survives all
simulate-thread.exp tests. It is almost a copy of rs6000-tdep.c (*)

(*) Please note the error in how the relative address of the branch is
calculated in the ppc_deal_with_atomic_sequence function in rs6000-tdep.c.
The target address immediate should be relative to the "loc" variable, not
"pc".

Richard, can you please help with the calculation of branch target
address for alpha... I'm sure that the helper function should be
available somewhere...
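
The core of the approach (find the closing store-conditional and stop just after it) can be sketched independently of gdb. This is a simplified illustration under stated assumptions: it scans an in-memory array rather than target memory, ignores branches inside the sequence, and the names atomic_sequence_end and insn_opcode are hypothetical. The opcode values are the ones used in the attached patch, and the major opcode is taken to be the top six bits of the instruction word.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as used in the attached patch.  */
enum { LDL_L = 0x2a, LDQ_L = 0x2b, STL_C = 0x2e, STQ_C = 0x2f };

/* Major opcode: the top six bits of a 32-bit Alpha instruction word.  */
unsigned
insn_opcode (uint32_t insn)
{
  return insn >> 26;
}

/* Hypothetical helper: if insns[start] is a load-locked instruction and a
   store-conditional follows within max_len instructions, return the index
   of the instruction right after the store-conditional (where a breakpoint
   would go).  Return 0 if no such sequence is found.  */
size_t
atomic_sequence_end (const uint32_t *insns, size_t count,
                     size_t start, size_t max_len)
{
  unsigned op;
  size_t i;

  if (start >= count)
    return 0;

  op = insn_opcode (insns[start]);
  if (op != LDL_L && op != LDQ_L)
    return 0;

  for (i = start + 1; i < count && i - start <= max_len; i++)
    {
      op = insn_opcode (insns[i]);
      if (op == STL_C || op == STQ_C)
        return i + 1;
    }

  return 0;
}
```

A return value of 0 is unambiguous here because a valid result is always at least start + 2.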

Uros.
--- alpha-tdep.c	2011-11-16 21:57:26.998112380 +0100
+++ alpha-tdep.c.ub	2011-11-16 21:56:11.739733014 +0100
@@ -65,6 +65,7 @@
 /* Branch instruction format */
 #define BR_RA(insn) MEM_RA(insn)
 
+static const int beq_opcode = 0x39;
 static const int bne_opcode = 0x3d;
 
 /* Operate instruction format */
@@ -762,6 +763,96 @@
 }
 
 
+
+static const int ldl_l_opcode = 0x2a;
+static const int ldq_l_opcode = 0x2b;
+static const int stl_c_opcode = 0x2e;
+static const int stq_c_opcode = 0x2f;
+
+/* Checks for an atomic sequence of instructions beginning with a LWARX/LDARX
+   instruction and ending with a STWCX/STDCX instruction.  If such a sequence
+   is found, attempt to step through it.  A breakpoint is placed at the end of 
+   the sequence.  */
+
+int 
+alpha_deal_with_atomic_sequence (struct frame_info *frame)
+{
+  struct gdbarch *gdbarch = get_frame_arch (frame);
+  struct address_space *aspace = get_frame_address_space (frame);
+  CORE_ADDR pc = get_frame_pc (frame);
+  CORE_ADDR breaks[2] = {-1, -1};
+  CORE_ADDR loc = pc;
+  CORE_ADDR closing_insn; /* Instruction that closes the atomic sequence.  */
+  unsigned int insn = alpha_read_insn (gdbarch, loc);
+  int insn_count;
+  int index;
+  int last_breakpoint = 0; /* Defaults to 0 (no breakpoints placed).  */  
+  const int atomic_sequence_length = 16; /* Instruction sequence length.  */
+  int opcode; /* Branch instruction's OPcode.  */
+  int bc_insn_count = 0; /* Conditional branch instruction count.  */
+
+  /* Assume all atomic sequences start with a lwarx/ldarx instruction.  */
+  if (INSN_OPCODE (insn) != ldl_l_opcode
+      && INSN_OPCODE (insn) != ldq_l_opcode)
+    return 0;
+
+  /* Assume that no atomic sequence is longer than "atomic_sequence_length"
+     instructions.  */
+  for (insn_count = 0; insn_count < atomic_sequence_length; ++insn_count)
+    {
+      loc += ALPHA_INSN_SIZE;
+      insn = alpha_read_insn (gdbarch, loc);
+
+      /* Assume that there is at most one conditional branch in the atomic
+         sequence.  If a conditional branch is found, put a breakpoint in
+         its destination address.  */
+      if (INSN_OPCODE (insn) >= beq_opcode)
+        {
+          /* ??? */
+          int immediate = ((insn & 0x001f) << 2);
+
+          if (bc_insn_count >= 1)
+            return 0; /* More than one conditional branch found, fallback
+                         to the standard single-step code.  */
+
+          breaks[1] = loc + ALPHA_INSN_SIZE + immediate;
+
+          bc_insn_count++;
+          last_breakpoint++;
+        }
+
+      if (INSN_OPCODE (insn) == stl_c_opcode
+          || INSN_OPCODE (insn) == stq_c_opcode)
+        break;
+    }
+
+  /* Assume that the atomic sequence ends with a stwcx/stdcx instruction.  */
+  if (INSN_OPCODE (insn) != stl_c_opcode
+      && INSN_OPCODE (insn) != stq_c_opcode)
+    return 0;
+
+  closing_insn = loc;
+  loc += ALPHA_INSN_SIZE;
+
+  /* Insert a breakpoint right after the end of the atomic sequence.  */
+  breaks[0] = loc;
+
+  /* Check for duplicated breakpoints.  Check also for a breakpoint
+     placed (branch instruction's destination) at the stwcx/stdcx
+     instruction, this resets the reservation and take us back to the
+     lwarx/ldarx instruction at the beginning of the atomic sequence.  */
+  if (last_breakpoint && ((breaks[1] == breaks[0])
+                          || (breaks[1] == closing_insn)))
+    last_breakpoint = 0;
+
+  /* Effectively inserts the breakpoints.  */
+  for (index = 0; index <= last_breakpoint; index++)
+    insert_single_step_breakpoint (gdbarch, aspace, breaks[index]);
+
+  return 1;
+}
+
+
 /* Figure out where the longjmp will land.
We expect the first arg to be a point

Re: A question about redudant load elimination

2011-11-16 Thread Michael Matz
Hi,

On Wed, 16 Nov 2011, Jeff Law wrote:

> Right.  In theory, path isolation would make this optimizable.  Make a 
> copy of the block containing a[x] = 2 and make it the target when x != 
> 100.  At the source level it'd look something like this:
> 
>  int x;
>  extern void f(void);
> 
>  void g(int *a)
>  {
>    a[x] = 1;
>    if (x == 100) {
>      f();
>      a[x] = 2;
>    } else {
>      a[x] = 2;
>    }
>  }
> 
> 
> The problem then becomes identification of the load from "x" as
> redundant on the else path, which we're currently not capable of doing.

Also in this variant the first store to a[x] may-clobbers x itself.  The 
function call doesn't enter the picture.


Ciao,
Michael.


Re: A question about redudant load elimination

2011-11-16 Thread Jeff Law
On 11/16/11 04:00, Richard Guenther wrote:
> On Mon, Nov 14, 2011 at 9:01 AM, Jiangning Liu
>  wrote:
>> Hi,
>> 
>> For this test case,
>> 
>> int x;
>> extern void f(void);
>>
>> void g(int *a)
>> {
>>        a[x] = 1;
>>        if (x == 100)
>>                f();
>>        a[x] = 2;
>> }
>>
>> For trunk, the x86 assembly code is like below,
>>
>>        movl    x, %eax
>>        movl    16(%esp), %ebx
>>        movl    $1, (%ebx,%eax,4)
>>        movl    x, %eax   // Is this a redundant one?
>>        cmpl    $100, %eax
>>        je      .L4
>>        movl    $2, (%ebx,%eax,4)
>>        addl    $8, %esp
>>        .cfi_remember_state
>>        .cfi_def_cfa_offset 8
>>        popl    %ebx
>>        .cfi_restore 3
>>        .cfi_def_cfa_offset 4
>>        ret
>>        .p2align 4,,7
>>        .p2align 3
>> .L4:
>>        .cfi_restore_state
>>        call    f
>>        movl    x, %eax
>>        movl    $2, (%ebx,%eax,4)
>>        addl    $8, %esp
>>        .cfi_def_cfa_offset 8
>>        popl    %ebx
>>        .cfi_restore 3
>>        .cfi_def_cfa_offset 4
>>        Ret
>> 
>> Is the 2nd "movl x, %eax" a redundant one under a single-threaded
>> programming model? If yes, can this be optimized away?
> 
> f() may change the value of x, so you cannot optimize away the load
> on that execution path.
Right.  In theory, path isolation would make this optimizable.  Make a
copy of the block containing a[x] = 2 and make it the target when x !=
100.  At the source level it'd look something like this:

 int x;
 extern void f(void);

 void g(int *a)
 {
   a[x] = 1;
   if (x == 100) {
     f();
     a[x] = 2;
   } else {
     a[x] = 2;
   }
 }


The problem then becomes identification of the load from "x" as
redundant on the else path, which we're currently not capable of doing.

jeff


Re: Developing GCC

2011-11-16 Thread Ian Lance Taylor
"Rick C. Hodgin"  writes:

> What's the best way to learn about developing GCC?  Not developing in
> GCC, but understanding and extending the compiler's design itself?

Start by reading http://gcc.gnu.org/onlinedocs/gccint/ and
http://gcc.gnu.org/wiki/ .

Ian


Re: Developing GCC

2011-11-16 Thread Basile Starynkevitch
On Wed, Nov 16, 2011 at 08:39:23AM -0500, Rick C. Hodgin wrote:
> What's the best way to learn about developing GCC?  Not developing in
> GCC, but understanding and extending the compiler's design itself?

[It really depends on what you want to do inside GCC, but]

Learn more about the pass machinery and the plugin machinery, and perhaps
write some plugins yourself.

IMHO (but I suppose many would disagree), developing a plugin (or a GCC
MELT extension - see http://gcc-melt.org/ which also has some slides that
could be useful to you) is best for code that you do not intend to push into
the compiler (in other words, making a plugin probably makes more sense than
maintaining your own private branch of GCC).

Once you have mastered what plugins can give you, you will have understood
some of the possibilities for developing GCC.

Happy hacking.

PS. And don't forget the legal aspects & prerequisites:
http://gcc.gnu.org/contribute.html - they will take a long time, so start
worrying about them right now!

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: A question about redudant load elimination

2011-11-16 Thread Michael Matz
Hi,

On Wed, 16 Nov 2011, Eric Botcazou wrote:

> > f() may change the value of x, so you cannot optimize away the load on that
> > execution path.
> 
> This looks more like an aliasing issue with a, doesn't it?

Correct.  There's no call to f() between a[x] and x==100, but the store to 
a[x] might change x if x==0&&a==&x.
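
A small sketch of why the reload is needed. This is a hypothetical variant of the testcase, not the original: the index is passed by pointer (the names store_then_reload and idx are assumptions for illustration) so that the aliasing case can be exercised without indexing out of bounds.

```c
/* Hypothetical variant of g(): store through a, then re-read the index.
   If idx points into the array a, the store may change *idx, so the
   compiler cannot reuse the value it loaded before the store.  */
int
store_then_reload (int *a, int *idx)
{
  a[*idx] = 1;
  return *idx;
}
```

Called with int cell[2] = {0, 0} as store_then_reload (cell, &cell[0]), the store writes cell[0], which is also the index, and the function returns 1 rather than 0. With a separate index variable the index is unchanged and the function returns its original value.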


Ciao,
Michael.


Re: When is the hardware related register is allocated?

2011-11-16 Thread Feng LI
On Wed, Nov 16, 2011 at 12:18 PM, Joern Rennecke  wrote:
> Quoting Ian Lance Taylor :
>
>> Offhand I don't know of any way to get the compiler to save CC for you
>> around your instruction.  That's a stiff requirement.
>
> It's easy to do under explicit control of the pattern, using (a)
> match_scratch
> clobber(s) of (a) register(s) of the required class(es) - or if memory is
> needed, the stack top can be set aside by the prologue/epilogue code.
>

So I suppose with (clobber (reg:CC FLAGS_REG)), the compiler
will be able to tell that this register is modified by this instruction and
behave correctly when necessary? Here is the code:

(define_insn "spc_insn"
  [(set (match_operand:DI 0 "register_operand" "=r")
(unspec_volatile:DI [(match_operand:DI 1 "register_operand" "r")]
  UNSPEC_SPC))
   (clobber (reg:CC FLAGS_REG))]

Thanks,
Feng


Developing GCC

2011-11-16 Thread Rick C. Hodgin
What's the best way to learn about developing GCC?  Not developing in
GCC, but understanding and extending the compiler's design itself?

Thanks,
Rick C. Hodgin




Re: A question about redudant load elimination

2011-11-16 Thread Eric Botcazou
> f() may change the value of x, so you cannot optimize away the load on that
> execution path.

This looks more like an aliasing issue with a, doesn't it?

-- 
Eric Botcazou


Re: a question about IVOPTS: find_interesting_uses_address

2011-11-16 Thread Eric Botcazou
> Huh, IVOPTs should never cause a different size memory read.  I wonder
> if the original issue would still reproduce with the fix reverted.

The original issue was unaligned arrays in packed structures.  I don't see what 
could have changed since then.
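
For readers unfamiliar with the situation, here is a hypothetical illustration of the kind of access in question. It is not the testcase from the original bug report; struct record and sum_values are names invented for this sketch, and it relies on GCC's packed attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* With the packed attribute there is no padding after `tag`, so the array
   starts at offset 1 and its elements are not 4-byte aligned.  */
struct __attribute__ ((packed)) record
{
  uint8_t tag;
  uint32_t values[4];
};

/* A loop over the misaligned array.  On a strict-alignment target each
   element must be read with accesses that are safe at any address, so an
   optimization that rewrites the address computation must not lose the
   fact that the data may be unaligned.  */
uint32_t
sum_values (const struct record *r)
{
  uint32_t total = 0;
  size_t i;

  for (i = 0; i < 4; i++)
    total += r->values[i];

  return total;
}
```

Accessing the member through the struct, as done here, lets the compiler see the reduced alignment; the trouble described in this thread arises when that information is lost along the way.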

-- 
Eric Botcazou


Re: When is the hardware related register is allocated?

2011-11-16 Thread Joern Rennecke

Quoting Ian Lance Taylor :

> Offhand I don't know of any way to get the compiler to save CC for you
> around your instruction.  That's a stiff requirement.


It's easy to do under explicit control of the pattern, using (a) match_scratch
clobber(s) of (a) register(s) of the required class(es) - or if memory is
needed, the stack top can be set aside by the prologue/epilogue code.



Re: A question about redudant load elimination

2011-11-16 Thread Richard Guenther
On Mon, Nov 14, 2011 at 9:01 AM, Jiangning Liu  wrote:
> Hi,
>
> For this test case,
>
> int x;
> extern void f(void);
>
> void g(int *a)
> {
>        a[x] = 1;
>        if (x == 100)
>                f();
>        a[x] = 2;
> }
>
> For trunk, the x86 assembly code is like below,
>
>        movl    x, %eax
>        movl    16(%esp), %ebx
>        movl    $1, (%ebx,%eax,4)
>        movl    x, %eax   // Is this a redundant one?
>        cmpl    $100, %eax
>        je      .L4
>        movl    $2, (%ebx,%eax,4)
>        addl    $8, %esp
>        .cfi_remember_state
>        .cfi_def_cfa_offset 8
>        popl    %ebx
>        .cfi_restore 3
>        .cfi_def_cfa_offset 4
>        ret
>        .p2align 4,,7
>        .p2align 3
> .L4:
>        .cfi_restore_state
>        call    f
>        movl    x, %eax
>        movl    $2, (%ebx,%eax,4)
>        addl    $8, %esp
>        .cfi_def_cfa_offset 8
>        popl    %ebx
>        .cfi_restore 3
>        .cfi_def_cfa_offset 4
>        Ret
>
> Is the 2nd "movl x, %eax" a redundant one under a single-threaded
> programming model? If yes, can this be optimized away?

f() may change the value of x, so you cannot optimize away the load on that
execution path.

> Thanks,
> -Jiangning
>
>
>
>


Re: a question about IVOPTS: find_interesting_uses_address

2011-11-16 Thread Richard Guenther
On Mon, Nov 14, 2011 at 4:58 AM, Yuehai Du  wrote:
> Hi
>
>  I found that IVOPTS does not work well in some cases if the loop contains
> unaligned accesses.  It does not take this kind of memory access into
> account because of this check in find_interesting_uses_address:
>
>      /* Moreover, on strict alignment platforms, check that it is
>         sufficiently aligned.  */
>      if (STRICT_ALIGNMENT && may_be_unaligned_p (base, step))
>        goto fail;
>
>  This check was added to fix http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17949
> because it was converting byte loads into an unaligned word load on strict
> alignment platforms.
>
>  But some strict-alignment platforms also support unaligned access for
> particular modes.  For example, ARM NEON supports unaligned vector-mode
> access via VLD1/VST1, and both support the write-back addressing mode.
>
>  Moreover, ARM recently added unaligned access support for all ARMv6, ARMv7-A,
> ARMv7-R, and ARMv7-M architecture-based processors, controlled by the
> option -munaligned-access;
> see http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00513.html.
>
>  So what should we do now?  Should we use a target hook to check this, or
> something else?

Huh, IVOPTs should never cause a different size memory read.  I wonder
if the original issue would still reproduce with the fix reverted.  Can you
ponder in this direction (thus, remove the may_be_unaligned_p check
completely)?

Richard.

> --
> Yuehai Du
>