Re: Iterating over RTL in Graphite
On 02/17/2012 08:34 PM, David Malcolm wrote:
> On Thu, 2012-02-16 at 19:17 -0400, Arnaldo wrote:
>> Hello everyone,
>>
>> I'm working on an extension to the Graphite pass of GCC 4.4.0. My
>> intention is to associate costs to RTL instructions by adding them as
>> RTX attributes to a machine description file, and to read them back
>> during the Graphite pass by iterating through each basic block.
>>
>> Is the RTL available during this optimization pass? I'm not sure this
>> is the case as I get a segfault when trying to iterate over the RTL
>> with the code below ("internal compiler error: Segmentation fault"). I
>> don't need the fully resolved RTL, just to be able to read the
>> attribute given an RTL instruction.
>>
>> I've tried debugging the compiler with gdb but it can't find the
>> debugging symbols even though they're there. I'll keep trying to get
>> gdb to work but any leads on reading these attributes from within
>> Graphite is greatly appreciated.
>
> I don't know about GCC 4.4, but a while back I wrote a script using my
> GCC Python plugin to draw a "subway map" of GCC 4.6's passes:
> http://gcc.gnu.org/ml/gcc/2011-07/msg00157.html
> which you can see here:
> http://gcc-python-plugin.readthedocs.org/en/latest/tables-of-passes.html
>
> If I'm reading things correctly, the graphite passes happen whilst the
> code is still in gimple form: the blocks are converted to RTL form in
> the "expand" pass, which happens about 20 or so passes later.
>
> Caveat: I'm not familiar with the insides of the graphite, and am
> relatively new to gcc's insides, so I could be wrong; also the script
> relies on the pass flags, and they're not necessarily correct either...

Yes, graphite works on GIMPLE. I believe I have never seen RTL when
working on graphite, so I doubt it is easily available. (Maybe it is,
but it is definitely not used within graphite.)

Cheers
Tobi
gcc-4.6-20120217 is now available
Snapshot gcc-4.6-20120217 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.6-20120217/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch revision 184352

You'll find:

 gcc-4.6-20120217.tar.bz2             Complete GCC

  MD5=d3181a901518147c0ba4b2eb9be6ed46
  SHA1=deb2b657eef08fe3ba162a06ec817dff54b97ffc

Diffs from 4.6-20120210 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: Iterating over RTL in Graphite
On Thu, 2012-02-16 at 19:17 -0400, Arnaldo wrote:
> Hello everyone,
>
> I'm working on an extension to the Graphite pass of GCC 4.4.0. My
> intention is to associate costs to RTL instructions by adding them as
> RTX attributes to a machine description file, and to read them back
> during the Graphite pass by iterating through each basic block.
>
> Is the RTL available during this optimization pass? I'm not sure this
> is the case as I get a segfault when trying to iterate over the RTL
> with the code below ("internal compiler error: Segmentation fault"). I
> don't need the fully resolved RTL, just to be able to read the
> attribute given an RTL instruction.
>
> I've tried debugging the compiler with gdb but it can't find the
> debugging symbols even though they're there. I'll keep trying to get
> gdb to work but any leads on reading these attributes from within
> Graphite is greatly appreciated.

I don't know about GCC 4.4, but a while back I wrote a script using my
GCC Python plugin to draw a "subway map" of GCC 4.6's passes:
http://gcc.gnu.org/ml/gcc/2011-07/msg00157.html
which you can see here:
http://gcc-python-plugin.readthedocs.org/en/latest/tables-of-passes.html

If I'm reading things correctly, the graphite passes happen whilst the
code is still in gimple form: the blocks are converted to RTL form in
the "expand" pass, which happens about 20 or so passes later.

Caveat: I'm not familiar with the insides of the graphite, and am
relatively new to gcc's insides, so I could be wrong; also the script
relies on the pass flags, and they're not necessarily correct either...

Hope this is helpful
Dave
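The ordering argument above can be sketched as a small standalone C model.
This is only an illustration, not GCC code: the pass list is hypothetical and
heavily abbreviated, and the names pass_order, pass_index and rtl_available_p
are made up for this sketch. The only fact taken from the thread is that
"graphite" runs before "expand", the pass that creates the RTL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily abbreviated pass list.  Only the relative order
   of "graphite" and "expand" is taken from the thread; the real pass
   list is much longer.  */
static const char *const pass_order[] = {
  "ssa", "graphite", "ivopts", "expand", "combine", "final"
};

#define N_PASSES (sizeof pass_order / sizeof pass_order[0])

/* Return the position of NAME in pass_order, or -1 if it is not listed.  */
static int
pass_index (const char *name)
{
  size_t i;

  for (i = 0; i < N_PASSES; i++)
    if (strcmp (pass_order[i], name) == 0)
      return (int) i;
  return -1;
}

/* Return 1 if insns already exist when pass NAME starts, i.e. if NAME
   runs strictly after "expand" in this model, and 0 otherwise.  */
static int
rtl_available_p (const char *name)
{
  int pos = pass_index (name);
  int expand_pos = pass_index ("expand");

  assert (pos >= 0 && expand_pos >= 0);
  return pos > expand_pos;
}
```

In this model rtl_available_p ("graphite") is 0, which would be consistent
with a crash when walking insns from inside Graphite: at that point there
are no insns to walk yet.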
Re: Wrong REG_UNUSED notes for PARALLEL insns?
Richard Henderson wrote:
> On 02/17/12 07:11, Georg-Johann Lay wrote:
>> ; (insn 13 12 8 (parallel [
>> ;             (set (reg:SI 22 r22)
>> ;                 (mem:SI (lo_sum:PSI (reg:QI 21 r21)
>> ;                         (reg:HI 30 r30)) [0 S4 A8 AS7]))
>> ;             (clobber (reg:QI 21 r21))
>> ;             (clobber (reg:HI 30 r30))
>> ;         ]) movmem.c:42 15 {xload_si_libgcc}
>> ;      (expr_list:REG_DEAD (reg:QI 21 r21)
>> ;         (expr_list:REG_DEAD (reg:QI 31 r31)
>> ;             (expr_list:REG_UNUSED (reg:QI 21 r21)
>> ;                 (expr_list:REG_UNUSED (reg:HI 30 r30)
>> ;                     (nil))
>>     rcall __xload_4 ; 13 xload_si_libgcc [length = 1]
>>
>> Notice the REG_UNUSED for R21 and R30 which are wrong.
>
> No, that's correct.
>
> REG_UNUSED isn't about inputs, it's about outputs.

Yes, of course... maybe too much GCC for me in one single day.

Thanks for the answer

Johann
Re: Wrong REG_UNUSED notes for PARALLEL insns?
On 02/17/12 07:11, Georg-Johann Lay wrote:
> ; (insn 13 12 8 (parallel [
> ;             (set (reg:SI 22 r22)
> ;                 (mem:SI (lo_sum:PSI (reg:QI 21 r21)
> ;                         (reg:HI 30 r30)) [0 S4 A8 AS7]))
> ;             (clobber (reg:QI 21 r21))
> ;             (clobber (reg:HI 30 r30))
> ;         ]) movmem.c:42 15 {xload_si_libgcc}
> ;      (expr_list:REG_DEAD (reg:QI 21 r21)
> ;         (expr_list:REG_DEAD (reg:QI 31 r31)
> ;             (expr_list:REG_UNUSED (reg:QI 21 r21)
> ;                 (expr_list:REG_UNUSED (reg:HI 30 r30)
> ;                     (nil))
>     rcall __xload_4 ; 13 xload_si_libgcc [length = 1]
>
> Notice the REG_UNUSED for R21 and R30 which are wrong.

No, that's correct.

REG_UNUSED isn't about inputs, it's about outputs.
R21 and R30 are set (clobbered) by this insn, and
their results are (of course) UNUSED.


r~
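The point that REG_UNUSED describes outputs can be illustrated with a small
standalone C model. This is a sketch under stated assumptions, not GCC's
dataflow code: struct insn_model, reg_range, reg_unused_mask and
model_xload_si are hypothetical names, hard registers are modelled as bits in
a 32-bit mask, and it is assumed that an SImode value in r22 occupies
r22..r25 and an HImode value in r30 occupies r30..r31.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per hard register 0..31 (hypothetical model).  */
#define REG_BIT(n) (UINT32_C (1) << (n))

/* Mask for COUNT consecutive registers starting at FIRST.  */
static uint32_t
reg_range (unsigned first, unsigned count)
{
  uint32_t mask = 0;
  unsigned i;

  assert (first + count <= 32);
  for (i = 0; i < count; i++)
    mask |= REG_BIT (first + i);
  return mask;
}

/* Simplified insn: registers it reads, and registers it sets or clobbers.  */
struct insn_model
{
  uint32_t uses;
  uint32_t defs;
};

/* Registers that would carry a REG_UNUSED note in this model: outputs of
   the insn whose new value is not live afterwards.  The inputs of the
   insn play no role in the computation.  */
static uint32_t
reg_unused_mask (const struct insn_model *insn, uint32_t live_after)
{
  return insn->defs & ~live_after;
}

/* Model of the xload_si_libgcc insn from the dump: it reads r21 and
   r30/r31, sets r22..r25, and clobbers r21 and r30/r31.  */
static struct insn_model
model_xload_si (void)
{
  struct insn_model insn;

  insn.uses = REG_BIT (21) | reg_range (30, 2);
  insn.defs = reg_range (22, 4) | REG_BIT (21) | reg_range (30, 2);
  return insn;
}
```

If only the loaded value in r22..r25 is live after the insn, the model marks
r21 and r30/r31 as unused even though they are also inputs, which matches
the notes in the dump.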
Wrong REG_UNUSED notes for PARALLEL insns?
Consider the following insn from avr.md:

(define_insn "xload_<mode>_libgcc"
  [(set (reg:MOVMODE 22)
        (mem:MOVMODE (lo_sum:PSI (reg:QI 21)
                                 (reg:HI REG_Z))))
   (clobber (reg:QI 21))
   (clobber (reg:HI REG_Z))]
  "avr_xload_libgcc_p (<MODE>mode)"
  {
    rtx x_bytes = GEN_INT (GET_MODE_SIZE (<MODE>mode));

    output_asm_insn ("%~call __xload_%0", &x_bytes);
    return "";
  }
  [(set_attr "type" "xcall")
   (set_attr "cc" "clobber")])

which is used by some code. In the -S -dP output, there is

; (insn 13 12 8 (parallel [
;             (set (reg:SI 22 r22)
;                 (mem:SI (lo_sum:PSI (reg:QI 21 r21)
;                         (reg:HI 30 r30)) [0 S4 A8 AS7]))
;             (clobber (reg:QI 21 r21))
;             (clobber (reg:HI 30 r30))
;         ]) movmem.c:42 15 {xload_si_libgcc}
;      (expr_list:REG_DEAD (reg:QI 21 r21)
;         (expr_list:REG_DEAD (reg:QI 31 r31)
;             (expr_list:REG_UNUSED (reg:QI 21 r21)
;                 (expr_list:REG_UNUSED (reg:HI 30 r30)
;                     (nil))
    rcall __xload_4 ; 13 xload_si_libgcc [length = 1]

Notice the REG_UNUSED for R21 and R30 which are wrong.

As far as I understand PARALLEL, its elements operate in parallel in
time, i.e. a clobber takes effect *after* the elements' actions, whereas
a use or an input operand indicates that the respective operand is read.

R21 and R30 are obviously used as input operands to the MEM.

The surrounding code is correct, e.g. setting R21 is not optimized away.
However, I am worried that in other contexts the wrong notes might lead
to wrong code.

Thanks for any hints on this.

Johann
PR51782: Missing address-space information in .expand
I had a look into this again for the following small C program:

struct rgb { char r; };

char read_bug (const __flash struct rgb *s)
{
    struct rgb t = *s;
    return t.r;
}

char read_ok (const __flash struct rgb *s)
{
    return s->r;
}

compiled as

> avr-gcc flash-move.c -S -Os -dp -fdump-rtl-expand-details

and with the attached patch applied to tree-pretty-print.c.

With that patch, the .expand dump reads:

read_bug (const struct rgb * s)
{
  char t$r;

  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%] (fallthru,exec)
  t$r_4 = s_1(D){address-space-1 ->}r;
  return t$r_4;
  # SUCC: EXIT [100.0%]

}

;; Generating RTL for gimple basic block 2

;; return t$r_4;

(insn 6 5 7 (set (reg:QI 46)
        (mem:QI (reg/v/f:HI 44 [ s ]) [0 s_1(D){address-space-1 ->}r+0 S1 A8])) flash-move.c:7 -1
     (nil))

...

which is wrong because in insn 6 there is no "AS1" in the memory
attributes. Just compare with the respective dump of insn 6 of read_ok,
which is correct:

(insn 6 5 7 (set (reg:QI 46)
        (mem:QI (reg/v/f:HI 44 [ s ]) [0 s_1(D){address-space-1 ->}r+0 S1 A8 AS1])) flash-move.c:12 -1
     (nil))

Thus, the problem appears to be at a completely different place.

Or the changes to tree-pretty-print just serve to confuse me...

Johann Index: tree-pretty-print.c === --- tree-pretty-print.c (revision 183939) +++ tree-pretty-print.c (working copy) @@ -603,6 +603,7 @@ dump_generic_node (pretty_printer *buffe tree op0, op1; const char *str; bool is_expr; + addr_space_t as; if (node == NULL_TREE) return spc; @@ -837,7 +838,16 @@ dump_generic_node (pretty_printer *buffe { if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR) { - pp_string (buffer, "*"); +as = TYPE_ADDR_SPACE (TREE_TYPE (node)); +if (!ADDR_SPACE_GENERIC_P (as)) + { +pp_string (buffer, "'); dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false); } @@ -1181,6 +1191,7 @@ dump_generic_node (pretty_printer *buffe case COMPONENT_REF: op0 = TREE_OPERAND (node, 0); str = "."; + as = ADDR_SPACE_GENERIC; if (op0 && (TREE_CODE (op0) == INDIRECT_REF || (TREE_CODE (op0) == MEM_REF @@ -1206,6 +1217,7 @@ dump_generic_node (pretty_printer *buffe (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1 { op0 = TREE_OPERAND (op0, 0); + as = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op0))); str = "->"; } if (op_prio (op0) < op_prio (node)) @@ -1213,7 +1225,15 @@ dump_generic_node (pretty_printer *buffe dump_generic_node (buffer, op0, spc, flags, false); if (op_prio (op0) < op_prio (node)) pp_character (buffer, ')'); + if (!ADDR_SPACE_GENERIC_P (as)) +{ + pp_string (buffer, "{address-space-"); + pp_decimal_int (buffer, as); + pp_character (buffer, ' '); +} pp_string (buffer, str); + if (!ADDR_SPACE_GENERIC_P (as)) +pp_character (buffer, '}'); dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false); op0 = component_ref_field_offset (node); if (op0 && TREE_CODE (op0) != INTEGER_CST) @@ -1670,7 +1690,19 @@ dump_generic_node (pretty_printer *buffe || TREE_CODE (TREE_OPERAND (node, 0)) == FUNCTION_DECL)) ; /* Do not output '&' for strings and function pointers. 
*/ else - pp_string (buffer, op_symbol (node)); +{ + addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (node)); + + if (!ADDR_SPACE_GENERIC_P (as)) +{ + pp_string (buffer, "'); +} if (op_prio (TREE_OPERAND (node, 0)) < op_prio (node)) { struct rgb { char r; }; char read_bug (const __flash struct rgb *s) { struct rgb t = *s; return t.r; } char read_ok (const __flash struct rgb *s) { return s->r; } ;; Function read_bug (read_bug, funcdef_no=0, decl_uid=1320, cgraph_uid=0) read_bug (const struct rgb * s) { char t$r; # BLOCK 2 freq:1 # PRED: ENTRY [100.0%] (fallthru,exec) t$r_4 = s_1(D){address-space-1 ->}r; return t$r_4; # SUCC: EXIT [100.0%] } Partition map Partition 1 (s_1(D) - 1 ) Partition 3 (.MEM_3(D) - 3 ) Partition 4 (t$r_4 - 4 ) Partition 5 (.MEM_5 - 5 ) Partition map Partition 0 (s_1(D) - 1 ) Live on entry to BB2 : s_1(D) Conflict graph: After sorting: Coalesce List: Partition map Partition 0 (s_1(D) - 1 ) After Coalescing: Partition map Partition 0 (s_1(D) - 1 ) Partition 1 (t$r_4 - 4 ) Replacing Expressions t$r_4 replace with --> t$r_4 = s_1(D){address-space-1 ->}r; read_bug (const struct rgb * s) { char t$r; # BLOCK 2 freq:1 # PRED: ENTRY [100.0%] (fallthru,exec) t$r_4 = s_1(D){address-space-1 ->}r; return t$r_4; # SUCC: EXIT [100.0%] } ;; Generating RTL for gimple basic block 2 ;; return t$r_4; (insn 6 5 7 (set (reg:QI 46) (mem:QI (reg/v/f:H
A problem about loop store motion
Hi,

For this small test case,

int *l, *r;

int test_func(void)
{
  int i;
  int direction;
  static int pos;

  pos = 0;
  direction = 1;

  for ( i = 0; i <= 400; i++ )
    {
      if ( direction == 0 )
        pos = l[pos];
      else
        pos = r[pos];

      if ( pos == -1 )
        {
          pos = 0;
          direction = !direction;
        }
    }
  return i;
}

In the middle end, I don't see the store to pos being sunk out of the
loop by loop store motion. Any idea why?

The dump after lim is shown below; I expected this pass to create an
SSA symbol xxx_lsm.

;; Function test_func (test_func, funcdef_no=0, decl_uid=4057, cgraph_uid=0)

Symbols to be put in SSA form
{ .MEM }
Incremental SSA update started at block: 0
Number of blocks in CFG: 12
Number of blocks to update: 11 ( 92%)

test_func ()
{
  int pretmp.14;
  unsigned int pretmp.13;
  int prephitmp.12;
  int pretmp.11;
  unsigned int pretmp.10;
  int pretmp.9;
  int D.4088;
  static int pos;
  int direction;
  int i;
  _Bool D.4082;
  int pos.5;
  int * D.4078;
  int * r.4;
  int pos.3;
  int * D.4074;
  unsigned int D.4073;
  unsigned int pos.2;
  int pos.1;
  int * l.0;

:
  pos = 0;
  l.0_6 = l;
  r.4_12 = r;

:
  # i_32 = PHI
  # direction_37 = PHI
  # prephitmp.12_35 = PHI
  if (direction_37 == 0)
    goto ;
  else
    goto ;

:
  pos.1_7 = prephitmp.12_35;
  pos.2_8 = (unsigned int) pos.1_7;
  D.4073_9 = pos.2_8 * 4;
  D.4074_10 = l.0_6 + D.4073_9;
  pos.3_11 = *D.4074_10;
  pos = pos.3_11;
  goto ;

:
  pos.1_13 = prephitmp.12_35;
  pos.2_14 = (unsigned int) pos.1_13;
  D.4073_15 = pos.2_14 * 4;
  D.4078_16 = r.4_12 + D.4073_15;
  pos.5_17 = *D.4078_16;
  pos = pos.5_17;

:
  # prephitmp.12_31 = PHI
  pos.1_18 = prephitmp.12_31;
  if (pos.1_18 == -1)
    goto ;
  else
    goto ;

:
  goto ;

:
  pos = 0;
  D.4088_36 = direction_37 ^ 1;
  direction_20 = D.4088_36 & 1;

:
  # direction_2 = PHI
  i_21 = i_32 + 1;
  if (i_21 != 401)
    goto ;
  else
    goto ;

:
  pretmp.11_1 = pos;
  goto ;

:
  return 401;

}

Thanks,
-Jiangning
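For reference, the transformation asked about can be written out by hand.
This is a sketch of the expected effect of store motion, not output from
GCC. The names test_func_original, test_func_lsm and pos_lsm are
hypothetical, and pos is moved to file scope here only so that its final
value can be inspected. The rewrite is valid because the address of pos
never escapes, so neither l nor r can point to it.

```c
#include <assert.h>

int *l, *r;

/* File scope in this sketch so the final value can be checked; in the
   original test case pos is a function-local static.  */
static int pos;

/* The loop as posted: pos is loaded and stored in every iteration.  */
int
test_func_original (void)
{
  int i;
  int direction;

  pos = 0;
  direction = 1;

  for (i = 0; i <= 400; i++)
    {
      if (direction == 0)
        pos = l[pos];
      else
        pos = r[pos];

      if (pos == -1)
        {
          pos = 0;
          direction = !direction;
        }
    }
  return i;
}

/* Hand-applied store motion: the loop works on a local copy and pos is
   written back once, after the loop.  */
int
test_func_lsm (void)
{
  int i;
  int direction;
  int pos_lsm = 0;   /* hypothetical temporary, named after the expected
                        xxx_lsm symbol */

  direction = 1;

  for (i = 0; i <= 400; i++)
    {
      if (direction == 0)
        pos_lsm = l[pos_lsm];
      else
        pos_lsm = r[pos_lsm];

      if (pos_lsm == -1)
        {
          pos_lsm = 0;
          direction = !direction;
        }
    }

  pos = pos_lsm;     /* the single store sunk out of the loop */
  return i;
}
```

Both functions return 401 and leave the same value in pos for any l and r
tables that keep the index in range, so the loop body of the second version
contains no stores to memory at all.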