Re: [PATCH 1/3] gdb.trace: Move more target dependencies to trace-support.exp

2016-02-20 Thread Marcin Kościelnicki

Whoops, sorry for that mail - I typoed gdb-patches to gcc-patches.

On 20/02/16 14:56, Marcin Kościelnicki wrote:

While groveling through the old PPC64 tracepoint support patch, I've
noticed a few target dependencies in the testsuite that both me and
Antoine missed for s390 and ARM tracepoints, respectively.  This patch
moves them all to one place, so that anyone working on a new target
will hopefully see the whole set of needed changes.

For some strange reason, the call_insn setting code already knew about
powerpc, s390, and mips - I went ahead and added the remaining
information about those.  I'm not particularly sure if I got mips right,
but that won't matter anyway until someone actually writes tracepoint
support for that.

Tested on x86_64, i386, ppc, ppc64, ppc64le, s390, s390x.  Would be good
to test it on aarch64.

gdb/testsuite/ChangeLog:

* gdb.trace/entry-values.exp: Move call_insn setting to
trace-support.exp.
* gdb.trace/ftrace.exp: Move arg0exp setting to trace-support.exp.
* gdb.trace/mi-trace-unavailable.exp (proc_trace_unavailable): Move
pcnum setting to trace-support.exp, change fixed register 0 to
gpr0num variable.
* lib/trace-support.exp: Add setting pcnum, gpr0num, arg0exp,
call_insn; add powerpc, s390, and mips support.




[PATCH 1/3] gdb.trace: Move more target dependencies to trace-support.exp

2016-02-20 Thread Marcin Kościelnicki
While groveling through the old PPC64 tracepoint support patch, I've
noticed a few target dependencies in the testsuite that both me and
Antoine missed for s390 and ARM tracepoints, respectively.  This patch
moves them all to one place, so that anyone working on a new target
will hopefully see the whole set of needed changes.

For some strange reason, the call_insn setting code already knew about
powerpc, s390, and mips - I went ahead and added the remaining
information about those.  I'm not particularly sure if I got mips right,
but that won't matter anyway until someone actually writes tracepoint
support for that.

Tested on x86_64, i386, ppc, ppc64, ppc64le, s390, s390x.  Would be good
to test it on aarch64.

gdb/testsuite/ChangeLog:

* gdb.trace/entry-values.exp: Move call_insn setting to
trace-support.exp.
* gdb.trace/ftrace.exp: Move arg0exp setting to trace-support.exp.
* gdb.trace/mi-trace-unavailable.exp (proc_trace_unavailable): Move
pcnum setting to trace-support.exp, change fixed register 0 to
gpr0num variable.
* lib/trace-support.exp: Add setting pcnum, gpr0num, arg0exp,
call_insn; add powerpc, s390, and mips support.
---
Split up and removed arm in this version.

 gdb/testsuite/ChangeLog  | 11 
 gdb/testsuite/gdb.trace/entry-values.exp | 26 +
 gdb/testsuite/gdb.trace/ftrace.exp   | 11 
 gdb/testsuite/gdb.trace/mi-trace-unavailable.exp | 25 +++--
 gdb/testsuite/lib/trace-support.exp  | 69 +++-
 5 files changed, 87 insertions(+), 55 deletions(-)

diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index 5676cac..ec2773b 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,3 +1,14 @@
+2016-02-19  Marcin Kościelnicki  
+
+   * gdb.trace/entry-values.exp: Move call_insn setting to
+   trace-support.exp.
+   * gdb.trace/ftrace.exp: Move arg0exp setting to trace-support.exp.
+   * gdb.trace/mi-trace-unavailable.exp (proc_trace_unavailable): Move
+   pcnum setting to trace-support.exp, change fixed register 0 to
+   gpr0num variable.
+   * lib/trace-support.exp: Add setting pcnum, gpr0num, arg0exp,
+   call_insn; add powerpc, s390, and mips support.
+
 2016-02-18  Iain Buclaw  
 
* lib/future.exp: Add D support.
diff --git a/gdb/testsuite/gdb.trace/entry-values.exp 
b/gdb/testsuite/gdb.trace/entry-values.exp
index 825928d..75c788a 100644
--- a/gdb/testsuite/gdb.trace/entry-values.exp
+++ b/gdb/testsuite/gdb.trace/entry-values.exp
@@ -12,6 +12,8 @@
 #
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+load_lib "trace-support.exp"
 load_lib dwarf.exp
 
 # This test can only be run on targets which support DWARF-2 and use gas.
@@ -37,30 +39,6 @@ gdb_load ${binfile}1.o
 
 set returned_from_foo ""
 
-if { [istarget "arm*-*-*"] || [istarget "aarch64*-*-*"] } {
-set call_insn "bl"
-} elseif { [istarget "s390*-*-*"] } {
-set call_insn "brasl"
-} elseif { [istarget "powerpc*-*-*"] } {
-set call_insn "bl"
-} elseif { [istarget "mips*-*-*"] } {
-# Skip the delay slot after the instruction used to make a call
-# (which can be a jump or a branch) if it has one.
-#
-#  JUMP (or BRANCH) foo
-#  insn1
-#  insn2
-#
-# Most MIPS instructions used to make calls have a delay slot.
-# These include JAL, JALS, JALX, JALR, JALRS, BAL and BALS.
-# In this case the program continues from `insn2' when `foo'
-# returns.  The only exception is JALRC, in which case execution
-# resumes from `insn1' instead.
-set call_insn {jalrc|[jb]al[sxr]*[ \t][^\r\n]+\r\n}
-} else {
-set call_insn "call"
-}
-
 # Calculate the offset of the instruction in bar returned from foo.
 set test "disassemble bar"
 gdb_test_multiple $test $test {
diff --git a/gdb/testsuite/gdb.trace/ftrace.exp 
b/gdb/testsuite/gdb.trace/ftrace.exp
index 15ad7e7..b9b2d8b 100644
--- a/gdb/testsuite/gdb.trace/ftrace.exp
+++ b/gdb/testsuite/gdb.trace/ftrace.exp
@@ -239,17 +239,6 @@ test_ftrace_condition "(globvar >> 2) == 2" "globvar" { 8 
9 10 }
 # Test emit_call by accessing trace state variables.
 test_ftrace_condition "(\$tsv = \$tsv + 2) > 10" "globvar" { 6 7 8 9 10 }
 
-# This expression is used for testing emit_reg.
-if [is_amd64_regs_target] {
-set arg0exp "\$rdi"
-} elseif [is_x86_like_target] {
-set arg0exp "*(int *) (\$ebp + 8)"
-} elseif { [istarget "aarch64*-*-*"] } {
-set arg0exp "\$x0"
-} else {
-set arg0exp ""
-}
-
 if { "$arg0exp" != "" } {
 test_ftrace_condition "($a

[PATCH 3/3] gdb.trace: Remove unnecessary target check from ftrace.exp.

2016-02-20 Thread Marcin Kościelnicki
The check used hardcoded targets and wasn't doing anything useful anyway,
since unsupported architectures blow up on link due to missing the IPA
library before they ever get to that check.

gdb/testsuite/ChangeLog:

* gdb.trace/ftrace.exp: Remove unnecessary target check.
---
 gdb/testsuite/ChangeLog|   4 ++
 gdb/testsuite/gdb.trace/ftrace.exp | 135 ++---
 2 files changed, 70 insertions(+), 69 deletions(-)

diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index 4546cc7..8d695ec 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2016-02-19  Marcin Kościelnicki  
 
+   * gdb.trace/ftrace.exp: Remove unnecessary target check.
+
+2016-02-19  Marcin Kościelnicki  
+
* gdb.trace/entry-values.exp: Surround $call_insn with '\y'.
* lib/trace-support.exp: Change x86_64 call_insn to 'callq'.
 
diff --git a/gdb/testsuite/gdb.trace/ftrace.exp 
b/gdb/testsuite/gdb.trace/ftrace.exp
index b9b2d8b..b587896 100644
--- a/gdb/testsuite/gdb.trace/ftrace.exp
+++ b/gdb/testsuite/gdb.trace/ftrace.exp
@@ -84,97 +84,94 @@ proc test_fast_tracepoints {} {
 
 gdb_test "print gdb_agent_gdb_trampoline_buffer_error" ".*" ""
 
-if { [istarget "x86_64-*-*"] || [istarget "i\[34567\]86-*-*"] || 
[is_aarch64_target] } {
+gdb_test "ftrace set_point" "Fast tracepoint .*" \
+"fast tracepoint at a long insn"
 
-   gdb_test "ftrace set_point" "Fast tracepoint .*" \
-   "fast tracepoint at a long insn"
+gdb_trace_setactions "collect at set_point: define actions" \
+"" \
+"collect globvar, anarg" "^$"
 
-   gdb_trace_setactions "collect at set_point: define actions" \
-   "" \
-   "collect globvar, anarg" "^$"
+# Make a test of shorter fast tracepoints, 32-bit x86 only
 
-   # Make a test of shorter fast tracepoints, 32-bit x86 only
+if { [istarget "i?86-*-*"] } {
 
-   if { [istarget "i?86-*-*"] } {
+# A Linux target needs to be able to allocate trampolines in the
+# 16-bit range, check mmap_min_addr so we can warn testers.
+if { [istarget "i?86-*-linux*"] } {
 
-   # A Linux target needs to be able to allocate trampolines in the
-   # 16-bit range, check mmap_min_addr so we can warn testers.
-   if { [istarget "i?86-*-linux*"] } {
+set minaddr [exec sh -c "cat /proc/sys/vm/mmap_min_addr"]
 
-   set minaddr [exec sh -c "cat /proc/sys/vm/mmap_min_addr"]
+if { [expr $minaddr > 64512] } {
+warning "mmap_min_addr > 64512, fast tracepoint will fail"
+warning "do \"sudo sysctl -w vm.mmap_min_addr=32768\" to 
adjust"
+}
+}
 
-   if { [expr $minaddr > 64512] } {
-   warning "mmap_min_addr > 64512, fast tracepoint will fail"
-   warning "do \"sudo sysctl -w vm.mmap_min_addr=32768\" to 
adjust"
-   }
-   }
+gdb_test_multiple "ftrace four_byter" "set 4-byte fast tracepoint" {
+-re "May not have a fast tracepoint at .*\r\n$gdb_prompt $" {
+pass "4-byte fast tracepoint could not be set"
+}
+-re "Fast tracepoint .*\r\n$gdb_prompt $" {
+pass "4-byte fast tracepoint is set"
+set fourgood 1
+}
+}
 
-   gdb_test_multiple "ftrace four_byter" "set 4-byte fast tracepoint" {
-   -re "May not have a fast tracepoint at .*\r\n$gdb_prompt $" {
-   pass "4-byte fast tracepoint could not be set"
-   }
-   -re "Fast tracepoint .*\r\n$gdb_prompt $" {
-   pass "4-byte fast tracepoint is set"
-   set fourgood 1
-   }
-   }
+if { $fourgood } {
 
-   if { $fourgood } {
-
-   gdb_trace_setactions "collect at four_byter: define actions" \
-   "" \
-   "collect globvar, anarg" "^$"
-   }
-   }
-
-   run_trace_experiment
+gdb_trace_setactions "collect at four_byter: define actions" \
+"" \
+"collect globvar, anarg" "^$"
+}
+}
 
-   gdb_test "tfind pc *set_point" "Found trace frame .*" \
-   "tfind set_point frame, first time"
+run_trace_experiment
 
-   setup_k

[PATCH 2/3] gdb.trace: Surround $call_insn with \y in entry-values.exp

2016-02-20 Thread Marcin Kościelnicki
The PPC64 tracepoint patch added \y at the end of the call_insn pattern -
without that, it embarassed itself and matched the 'bl' in "Dump of
assem*bl*er code for function" as the powerpc call opcode.  Since that
sounds like a generally good idea, I've added \y before and after
call_insn for every target.  As a result, I had to change x86_64's mnemonic
to 'callq'.

gdb/testsuite/ChangeLog:

* gdb.trace/entry-values.exp: Surround $call_insn with '\y'.
* lib/trace-support.exp: Change x86_64 call_insn to 'callq'.
---
 gdb/testsuite/ChangeLog  | 5 +
 gdb/testsuite/gdb.trace/entry-values.exp | 2 +-
 gdb/testsuite/lib/trace-support.exp  | 2 +-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index ec2773b..4546cc7 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,5 +1,10 @@
 2016-02-19  Marcin Kościelnicki  
 
+   * gdb.trace/entry-values.exp: Surround $call_insn with '\y'.
+   * lib/trace-support.exp: Change x86_64 call_insn to 'callq'.
+
+2016-02-19  Marcin Kościelnicki  
+
* gdb.trace/entry-values.exp: Move call_insn setting to
trace-support.exp.
* gdb.trace/ftrace.exp: Move arg0exp setting to trace-support.exp.
diff --git a/gdb/testsuite/gdb.trace/entry-values.exp 
b/gdb/testsuite/gdb.trace/entry-values.exp
index 75c788a..2890112 100644
--- a/gdb/testsuite/gdb.trace/entry-values.exp
+++ b/gdb/testsuite/gdb.trace/entry-values.exp
@@ -42,7 +42,7 @@ set returned_from_foo ""
 # Calculate the offset of the instruction in bar returned from foo.
 set test "disassemble bar"
 gdb_test_multiple $test $test {
--re ".*$hex <\\+$decimal>:\[ \t\]+$call_insn\[^\r\n\]+\r\n\[ \]+$hex 
<\\+($decimal)>:.*$gdb_prompt $" {
+-re ".*$hex <\\+$decimal>:\[ \t\]+\\y$call_insn\\y\[^\r\n\]+\r\n\[ \]+$hex 
<\\+($decimal)>:.*$gdb_prompt $" {
set returned_from_foo $expect_out(1,string)
 }
 -re ".*$gdb_prompt $" {
diff --git a/gdb/testsuite/lib/trace-support.exp 
b/gdb/testsuite/lib/trace-support.exp
index 0565030..96f2c35 100644
--- a/gdb/testsuite/lib/trace-support.exp
+++ b/gdb/testsuite/lib/trace-support.exp
@@ -35,7 +35,7 @@ if [is_amd64_regs_target] {
 # register usage in tracepoint conditions.
 set arg0exp "\$rdi"
 # The mnemonic of the usual, unconditional call instruction.
-set call_insn "call"
+set call_insn "callq"
 # Number of the PC register.
 set pcnum 16
 # Number of any GPR (it's supposed to be some register that's not
-- 
2.7.1



Re: [PATCH] s390: New mcount call sequence for z900+ CPUs in 31-bit mode.

2016-02-15 Thread Marcin Kościelnicki

On 21/01/16 14:03, Marcin Kościelnicki wrote:

On TARGET_CPU_ZARCH && !TARGET_64BIT, we can use a similiar lean mcount
call sequence to TARGET_64BIT.  The longer sequences are now used only
on deprecated g5/g6 CPUs.

Tested on s390-ibm-linux-gnu on RHEL 7.2.

gcc/ChangeLog:

* config/s390/s390.c (s390_function_profiler): Add a new sequence
for z900+ CPUs in 31-bit mode.
---
This change was mentioned in the s390 split-stack thread.

  gcc/ChangeLog  | 5 +
  gcc/config/s390/s390.c | 7 +++
  2 files changed, 12 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0e77409..94b9bd0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2016-01-21  Marcin Kościelnicki  
+
+   * config/s390/s390.c (s390_function_profiler): Add a new sequence
+   for z900+ CPUs in 31-bit mode.
+
  2016-01-21  Richard Biener  

* graphite-optimize-isl.c (get_schedule_map): Fix typo.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3be64de..eb26f18 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -11974,6 +11974,13 @@ s390_function_profiler (FILE *file, int labelno)
output_asm_insn ("brasl\t%0,%4", op);
output_asm_insn ("lg\t%0,%1", op);
  }
+  else if (TARGET_CPU_ZARCH)
+{
+  output_asm_insn ("st\t%0,%1", op);
+  output_asm_insn ("larl\t%2,%3", op);
+  output_asm_insn ("brasl\t%0,%4", op);
+  output_asm_insn ("l\t%0,%1", op);
+}
else if (!flag_pic)
  {
op[6] = gen_label_rtx ();




Ping?


Re: [PATCH] s390: Add -fsplit-stack support

2016-02-15 Thread Marcin Kościelnicki

On 15/02/16 11:21, Andreas Krebbel wrote:

On 02/14/2016 05:01 PM, Marcin Kościelnicki wrote:

libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.


Applied. Thanks!

-Andreas-



Thanks.  And how about that testcase I submitted, does that look OK?

Marcin Kościelnicki


[PATCH] s390: Add -fsplit-stack support

2016-02-14 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Whoops, I noticed a problem introduced when removing ESA bits: in the
__morestack exception-handling path in 31-bit version, I neglected
to stuff GOT address in %r12, which is necessary for the PLT stub to
work.  The only change in this version is the added larl %r12,
_GLOBAL_OFFSET_TABLE_ line.

 gcc/ChangeLog|  30 ++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 214 +++-
 gcc/config/s390/s390.md  | 138 
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 611 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1018 insertions(+), 7 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e81d1fe..60a4608 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,33 @@
+2016-02-14  Marcin Kościelnicki  
+
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_DATA): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_space_check): New expand.
+   (split_stack_data): New insn.
+   (split_stack_call): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call): New expand.
+   (split_stack_cond_call_*): New insn.
+
 2016-02-14  Venkataramanan Kumar  
 
*  config/i386/znver1.md
diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
 }
 }
 
+/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
+   We don't verify it, since earlier versions just have padding at
+   its place, which works just as well.  */
+
+static bool
+s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+  struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)
 
@@ -117,4 +128,7 @@ s390_handle

Re: [PATCH] s390: Add -fsplit-stack support

2016-02-10 Thread Marcin Kościelnicki

On 04/02/16 13:44, Marcin Kościelnicki wrote:

On 03/02/16 18:27, Ulrich Weigand wrote:

Marcin Kościelnicki wrote:


libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Changes applied.  Testsuite still running, still works on my simple
tests.

As for common code prerequisites: #3 is no longer needed, and very
likely
so is #4 (it fixes problems that I've only seen with ESA mode, and
testsuite
runs just fine without it now).


OK, I see.  The patch is OK for mainline then, assuming testing passes.


Well, testing passes (as in, is no worse than x86 - the testsuite
doesn't really agree with -fsplit-stack in a few places involving
backtraces).  However, there's still the libgo issue to be taken care
of.  For my tests, I patched it up with:
[...]


I see the libgo patch has landed today.  Can we get this pushed?

Marcin Kościelnicki




[PATCH] testsuite/s390: Add __morestack test.

2016-02-07 Thread Marcin Kościelnicki
gcc/testsuite/ChangeLog:

* gcc.target/s390/morestack.c: New test.
---
Here's the promised test.

 gcc/testsuite/ChangeLog   |   4 +
 gcc/testsuite/gcc.target/s390/morestack.c | 260 ++
 2 files changed, 264 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/morestack.c

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 8f528b2..26d600f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2016-02-05  Marcin Kościelnicki  :
+
+   * gcc.target/s390/morestack.c: New test.
+
 2016-02-04  Martin Liska  
 
* g++.dg/asan/pr69276.C: New test.
diff --git a/gcc/testsuite/gcc.target/s390/morestack.c 
b/gcc/testsuite/gcc.target/s390/morestack.c
new file mode 100644
index 000..aa28b72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/morestack.c
@@ -0,0 +1,260 @@
+/* Checks proper behavior of __morestack function - specifically, GPR
+   values surviving, stack parameters being copied, and vararg
+   pointer being correct.  */
+
+/* { dg-do run } */
+/* { dg-options "" } */
+
+#include 
+
+void *orig_r15;
+
+/* 1. Function "test" saves registers, makes a stack frame, puts known
+ *values in registers, and calls __morestack, telling it to jump to
+ *testinner, with return address pointing to "testret".
+ * 2. "testinner" checks that parameter registers match what has been
+ *passed from "test", stack parameters were copied properly to
+ *the new stack, and the argument pointer matches the calling
+ *function's stack pointer.  It then leaves new values in volatile
+ *registers (including return value registers) and returns.
+ * 3. "testret" checks that return value registers contain the expected
+ *return value, callee-saved GPRs match the values from "test",
+ *and then returns to main. */
+
+extern unsigned long testparams[3];
+
+#ifdef __s390x__
+
+asm(
+  ".global test\n"
+  "test:\n"
+  ".type test, @function\n"
+  /* Save registers.  */
+  "stmg %r6, %r15, 0x30(%r15)\n"
+  /* Save original sp in a global.  */
+  "larl %r1, orig_r15\n"
+  "stg %r15, 0(%r1)\n"
+  /* Make a stack frame.  */
+  "aghi %r15, -168\n"
+  /* A stack parameter.  */
+  "lghi %r1, 0x1240\n"
+  "stg %r1, 160(%r15)\n"
+  /* Registers.  */
+  "lghi %r0, 0x1230\n"
+  "lghi %r2, 0x1232\n"
+  "lghi %r3, 0x1233\n"
+  "lghi %r4, 0x1234\n"
+  "lghi %r5, 0x1235\n"
+  "lghi %r6, 0x1236\n"
+  "lghi %r7, 0x1237\n"
+  "lghi %r8, 0x1238\n"
+  "lghi %r9, 0x1239\n"
+  "lghi %r10, 0x123a\n"
+  "lghi %r11, 0x123b\n"
+  "lghi %r12, 0x123c\n"
+  "lghi %r13, 0x123d\n"
+  /* Fake return address.  */
+  "larl %r14, testret\n"
+  /* Call morestack.  */
+  "larl %r1, testparams\n"
+  "jg __morestack\n"
+
+  /* Entry point.  */
+  "testinner:\n"
+  /* Check registers.  */
+  "cghi %r0, 0x1230\n"
+  "jne testerr\n"
+  "cghi %r2, 0x1232\n"
+  "jne testerr\n"
+  "cghi %r3, 0x1233\n"
+  "jne testerr\n"
+  "cghi %r4, 0x1234\n"
+  "jne testerr\n"
+  "cghi %r5, 0x1235\n"
+  "jne testerr\n"
+  "cghi %r6, 0x1236\n"
+  "jne testerr\n"
+  /* Check stack param.  */
+  "lg %r0, 0xa0(%r15)\n"
+  "cghi %r0, 0x1240\n"
+  "jne testerr\n"
+  /* Check argument pointer.  */
+  "aghi %r1, 8\n"
+  "larl %r2, orig_r15\n"
+  "cg %r1, 0(%r2)\n"
+  "jne testerr\n"
+  /* Modify volatile registers.  */
+  "lghi %r0, 0x1250\n"
+  "lghi %r1, 0x1251\n"
+  "lghi %r2, 0x1252\n"
+  "lghi %r3, 0x1253\n"
+  "lghi %r4, 0x1254\n"
+  "lghi %r5, 0x1255\n"
+  /* Return.  */
+  "br %r14\n"
+
+  /* Returns here.  */
+  "testret:\n"
+  /* Check return registers.  */
+  "cghi %r2, 0x1252\n"
+  "jne testerr\n"
+  /* Check callee-saved registers.  */
+  "cghi %r6, 0x1236\n"
+  "jne testerr\n"
+  "cghi %r7, 0x1237\n"
+  "jne testerr\n"
+  "cghi %r8, 0x1238\n"
+  "jne testerr\n"
+  "cghi %r9, 0x1239\n"
+  "jne testerr\n"
+  "cghi %r10, 0x123a\n"
+  "jne testerr\n"
+  "cghi %r11, 0x123b\n"
+  "jne testerr\n"
+  "cghi %r12, 0x123c\n"
+  "jne testerr\n"
+  "cghi %r13, 0x123d\n"
+  "jne testerr\n"
+  /* Return.  */
+  "lmg %r6, %r15, 0xd8(%r15)\n"
+  "br %r14\n" 
+
+  /* Parameters block.  */
+  ".section .data\n"
+  ".align 8\n"
+  "test

Re: [PATCH] s390: Add -fsplit-stack support

2016-02-05 Thread Marcin Kościelnicki

On 04/02/16 17:27, Ulrich Weigand wrote:

Marcin Kościelnicki wrote:


Fair enough.  Here's what I'm going to implement in gold:

- any PLT relocation: call
- PC32DBL on a larl: non-call
- PC32DBL otherwise: call
- any other relocation: non-call

Does that sound right?


Hmm, I'm wondering about the PC32DBL choices.  There are now
a large number of other non-call instructions that use PC32DBL,
including lrl, strl, crl, cgrl, cgfrl, ...

However, these all access *data* at the pointed-to location,
so it is quite unlikely they would ever be used with a
function symbol.  So, assuming that you also check that the
target of the relocation is a function symbol, treating only
larl as non-call might be OK.


Yeah, I make sure the symbol is a STT_FUNC.


Maybe a more conservative approach might be to  make the decision
the other way round: for PC32DBL check for *branch* instructions,
and treat only those are calls.  There's just a few branch
instruction using PC32DBL:

brasl  (call)
brcl   (conditional or unconditional sibcall)
brcth  (???)

where the last one is extremely unlikely (but theorically
possible) to be used as conditional sibcall combined with
a register decrement; I don't think this can ever happen
with current compilers however.


I'll stay with checking for larl - while I can imagine someone adding a 
new conditional branch instruction, I don't see a need for another 
larl-like instruction.  Besides, this way the failure mode for an 
unknown instruction would be producing an error, instead of silently 
emitting code with unfixed prologue.


For full completeness, there are also PC16DBL relocations that
*could* target called functions, but only when compiling with
the -msmall-exec flag to assume total executable size is less
than 64 KB.  These are used by the following instructions:

bras
brc
brct
brctg
brxh
brxhg
brxle
brxlg
crj
cgrj
clrj
clgrj
cij
cgij
clij
clgij

Note that those are *all* branch instructions, so it might
make sense to add any PC16DBL targetting a function symbol
to the list of calls, just in case.  (But since basically
nobody ever uses -msmall-exec, it doesn't really matter
much either.)


Ah right, I've added PC16DBL to the "always call" list.


Bye,
Ulrich



I've updated and resubmitted the gold patch.

Marcin Kościelnicki


Re: [PATCH] s390: Add -fsplit-stack support

2016-02-04 Thread Marcin Kościelnicki

On 04/02/16 16:06, Ulrich Weigand wrote:

Marcin Kościelnicki wrote:


Ugh. I take that back.  For -fPIC, the load-address sequence is:

  larl%r1,f@GOTENT
  lg  %r2,0(%r1)
  br  %r14


This is correct.


And (sibling) call sequence is:

  larl%r1,f@GOTENT
  lg  %r1,0(%r1)
  br  %r1


Oops.  That is actually a GCC bug.  The sibcall sequence really must be:

jg f@PLT

This is a real bug since it forces non-lazy symbol resolution for f
just because the compiler chose those use a sibcall optimization;
that's not supposed to happen.

It seems this bug was accidentally introduced here:

2010-04-20  Andreas Krebbel  

 PR target/43635
 * config/s390/s390.c (s390_emit_call): Turn direct into indirect
 calls for -fpic -m31 if they have been sibcall optimized.

since the patch doesn't check for TARGET_64BIT ...

Andreas, can you have a look?


It seems there's no proper way to recognize a call vs a load address -
so we can either go with emitting the marker, or have the same problem
as on ppc.

So - how much should we care?


I think we should fix that bug.  That won't help for existing objects,
but those don't use split stack either, so that shouldn't matter.

If we fix that bug before (or at the same time as) adding split-stack
support, the linker will still be able to distigunish function pointer
loads from calls (including sibcalls) on all objects using split stack.

Bye,
Ulrich




Fair enough.  Here's what I'm going to implement in gold:

- any PLT relocation: call
- PC32DBL on a larl: non-call
- PC32DBL otherwise: call
- any other relocation: non-call

Does that sound right?

Marcin Kościelnicki


Re: [PATCH] s390: Add -fsplit-stack support

2016-02-04 Thread Marcin Kościelnicki

On 03/02/16 18:27, Ulrich Weigand wrote:

Marcin Kościelnicki wrote:


libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Changes applied.  Testsuite still running, still works on my simple tests.

As for common code prerequisites: #3 is no longer needed, and very likely
so is #4 (it fixes problems that I've only seen with ESA mode, and testsuite
runs just fine without it now).


OK, I see.  The patch is OK for mainline then, assuming testing passes.


Well, testing passes (as in, is no worse than x86 - the testsuite 
doesn't really agree with -fsplit-stack in a few places involving 
backtraces).  However, there's still the libgo issue to be taken care 
of.  For my tests, I patched it up with:


diff --git a/libgo/runtime/proc.c b/libgo/runtime/proc.c
index c25a217..efa6806 100644
--- a/libgo/runtime/proc.c
+++ b/libgo/runtime/proc.c
@@ -2016,17 +2016,19 @@ doentersyscall()
m->locks++;

// Leave SP around for GC and traceback.
+   {
 #ifdef USING_SPLIT_STACK
-   g->gcstack = __splitstack_find(nil, nil, &g->gcstack_size,
-  &g->gcnext_segment, &g->gcnext_sp,
-  &g->gcinitial_sp);
+   size_t size_tmp;
+   g->gcstack = __splitstack_find(nil, nil, &size_tmp,
+  &g->gcnext_segment, 
&g->gcnext_sp,
+  &g->gcinitial_sp);
+   g->gcstack_size = size_tmp;
 #else
-   {
void *v;

g->gcnext_sp = (byte *) &v;
-   }
 #endif
+   }

g->status = Gsyscall;

@@ -2064,9 +2066,13 @@ runtime_entersyscallblock(void)

// Leave SP around for GC and traceback.
 #ifdef USING_SPLIT_STACK
-   g->gcstack = __splitstack_find(nil, nil, &g->gcstack_size,
-  &g->gcnext_segment, &g->gcnext_sp,
-  &g->gcinitial_sp);
+   {
+   size_t size_tmp;
+   g->gcstack = __splitstack_find(nil, nil, &size_tmp,
+  &g->gcnext_segment, 
&g->gcnext_sp,
+  &g->gcinitial_sp);
+   g->gcstack_size = size_tmp;
+   }
 #else
g->gcnext_sp = (byte *) &p;
 #endif

Andreas, did you have any luck with fixing this?  If not, I'll try 
submitting the above patch to gofrontend.




Thanks again,
Ulrich





Re: [PATCH] s390: Add -fsplit-stack support

2016-02-03 Thread Marcin Kościelnicki




The second issue I'm still not sure about is the magic nop marker
for frameless functions.  In an earlier mail you wrote:


Both currently supported
architectures always emit split-stack code on every function.


At least for rs6000 this doesn't appear to be true; in
rs6000_expand_split_stack_prologue we have:

   if (!info->push_p)
 return;

so it does nothing for frameless routines.

Now on i386 we do indeed generate code for frameless routines;
in fact, the *same* full stack check is generated as for any
other routine.  Now I'm wondering: is there are reason why
this check would be necessary (and there's simply a bug in
the rs6000 implementation)?  Then we obviously should do the
same on s390.


Try that on powerpc64(le):

$ cat a.c
#include 

void f(void) {
}

typedef void (*fptr)(void);

fptr g(void);

int main() {
 printf("%p\n", g());
}

$ cat b.c
void f(void);

typedef void (*fptr)(void);

fptr g(void) {
 return f;
}

$ gcc -O3 -fsplit-stack -c b.c
$ gcc -O3 -c a.c
$ gcc a.o b.o -fuse-ld=gold

I don't have a recent enough gcc for powerpc, but from what I've seen in
the code, this should explode with a linker error.

Of course, mixing split-stack and non-split-stack code when function
pointers are involved is sketchy anyway, so what's one more bug...

That said, for s390, we can avoid the above problem by checking the
relocation in gold now that ESA paths are gone - for direct function
calls (the only ones we care about), we should be seeing a relocation in
brasl.  So I'll remove the nopmark thing and add proper recognition in
gold.


Ugh. I take that back.  For -fPIC, the load-address sequence is:

larl%r1,f@GOTENT
lg  %r2,0(%r1)
br  %r14

And (sibling) call sequence is:

larl%r1,f@GOTENT
lg  %r1,0(%r1)
br  %r1

It seems there's no proper way to recognize a call vs a load address - 
so we can either go with emitting the marker, or have the same problem 
as on ppc.


So - how much should we care?





On the other hand, if rs6000 works fine *without* any code
in frameless routines, why wouldn't that work for s390 too?

Emitting a nop (that is always executed) still looks weird to me.


Bye,
Ulrich







[PATCH] s390: Add -fsplit-stack support

2016-02-03 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Changes applied.  Testsuite still running, still works on my simple tests.

As for common code prerequisites: #3 is no longer needed, and very likely
so is #4 (it fixes problems that I've only seen with ESA mode, and testsuite
runs just fine without it now).

 gcc/ChangeLog|  30 ++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 214 +++-
 gcc/config/s390/s390.md  | 138 
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 609 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1016 insertions(+), 7 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 92db764..8e3f9f7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,33 @@
+2016-02-03  Marcin Kościelnicki  
+
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_DATA): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_space_check): New expand.
+   (split_stack_data): New insn.
+   (split_stack_call): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call): New expand.
+   (split_stack_cond_call_*): New insn.
+
 2016-02-03  Kirill Yukhin  
 
PR target/69118
diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
 }
 }
 
+/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
+   We don't verify it, since earlier versions just have padding at
+   its place, which works just as well.  */
+
+static bool
+s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+  struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)
 
@@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts 
ATTRI

[PATCH] s390: Add -fsplit-stack support

2016-02-02 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Comment fixed, split_stack_marker gone, reorg gone.  Generated code seems sane,
but testsuite still running.

I will need to modify the gold patch to handle the "leaf function taking 
non-split
stack function address" issue - this will likely require messing with the target
independent plumbing, the hook for that doesn't seem to get enough params.

 gcc/ChangeLog|  30 ++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 214 +++-
 gcc/config/s390/s390.md  | 138 
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 609 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1016 insertions(+), 7 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9a2cec8..568dff4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,33 @@
+2016-02-02  Marcin Kościelnicki  
+
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_DATA): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_space_check): New expand.
+   (split_stack_data): New insn.
+   (split_stack_call): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call): New expand.
+   (split_stack_cond_call_*): New insn.
+
 2016-02-02  Thomas Schwinge  
 
* omp-builtins.def (BUILT_IN_GOACC_HOST_DATA): Remove.
diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
 }
 }
 
+/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
+   We don't verify it, since earlier versions just have padding at
+   its place, which works just as well.  */
+
+static bool
+s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+  struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 #undef TARGET

Re: [PATCH] s390: Add -fsplit-stack support

2016-02-02 Thread Marcin Kościelnicki

On 02/02/16 19:33, Ulrich Weigand wrote:

Marcin Kościelnicki wrote:


Here we go.  I've also removed the "see below", since I don't really
see anything below...


The "see below" refers to this code (which I agree isn't really obvious):

   if (TARGET_TPF_PROFILING)
 {
   /* Generate a BAS instruction to serve as a function
  entry intercept to facilitate the use of tracing
  algorithms located at the branch target.  */
   emit_insn (gen_prologue_tpf ());

What is not explicitly called out here is that this tracing function
actually refers to some hard registers, in particular r14, and assumes
they still have the original contents as at function entry.

That is why the prolog code avoid using r14 as temporary if the TPF
tracing mechanism is in use.  Now I think this doesn't apply to r12,
so this part of your patch should still be fine.  (In addition, TPF
is not going to support split stacks --or indeed the Go language--
anyway, so it doesn't really matter all that much.)


Very well, I'll improve the comment.



I do have two other issues; sorry for bringing those up again although
they've been discussed up in the past, but I still think we can find
some improvements here ...

The first is the question Andreas brought up, why we need the extra
set of insns introduced by s390_reorg.  I think this may really have
been necessary for the ESA case where data elements had to be intermixed
into code at a specific location.  But since we no longer support ESA,
we now just have a data block that can be placed anywhere.  For example,
we could just have an insn (at any point in the prolog stream) that
simply emits the full data block during final output, along the lines of
(note: needs to be updated for SImode vs. DImode.):

(define_insn "split_stack_data"
   [(unspec_volatile [(match_operand 0 "bras_sym_operand" "X")
  (match_operand 1 "bras_sym_operand" "X")
  (match_operand 2 "consttable_operand" "X")
  (match_operand 3 "consttable_operand" "X")]
 UNSPECV_SPLIT_STACK_DATA)]
   ""
{
   switch_to_section (targetm.asm_out.function_rodata_section
   (current_function_decl));

   output_asm_insn (\".align 3", operands);
   (*targetm.asm_out.internal_label) (asm_out_file, \"L\",
  CODE_LABEL_NUMBER (operands[0]));
   output_asm_insn (\".quad %2\", operands);
   output_asm_insn (\".quad %3\", operands);
   output_asm_insn (\".quad %1-%0\", operands);

   switch_to_section (current_function_section ());
   return "";
}
   [(set_attr "length" "0")])

Or possibly even cleaner, we can simply define the data block at the
tree level as if it were an initialized global variable of a certain
struct type, and just leave it to common code to emit it as usual.

Then we just have the code bits, but I don't really see much
difference between the split_stack_call and split_stack_sibcall
patterns (apart from the data block), so if code flow is OK with
the former insns, it should be OK with the latter too ..

[ Or else, if there *are* code flow issues, the other alternative
would be to emit the full call sequence, code and data, from a
single insn pattern during final output.  This might have the extra
benefit that the assembler sequence is fully fixed, and thus easier
to detect in the linker.  ]

Getting rid of the extra transformation in s390_reorg would not
just remove a bunch of code from the back-end (always good!),
it would also speed up compile time a bit.


When I wasn't using reorg, I had problems with gcc deleting the label in 
.rodata, since it wasn't used by any jump instruction.  I guess having a 
whole-block instruction that emits the label on its own should solve the 
issue, though - let's try that.



The second issue I'm still not sure about is the magic nop marker
for frameless functions.  In an earlier mail you wrote:


Both currently supported
architectures always emit split-stack code on every function.


At least for rs6000 this doesn't appear to be true; in
rs6000_expand_split_stack_prologue we have:

   if (!info->push_p)
 return;

so it does nothing for frameless routines.

Now on i386 we do indeed generate code for frameless routines;
in fact, the *same* full stack check is generated as for any
other routine.  Now I'm wondering: is there are reason why
this check would be necessary (and there's simply a bug in
the rs6000 implementation)?  Then we obviously should do the
same on s390.


Try that on powerpc64(le):

$ cat a.c
#include 

void f(void) {
}

typedef void (*fptr)(void);

fptr g(void);

int main() {
printf("%p\n", g());
}

$ cat b.c
void f(void);

typedef void (*fptr)(void);

fptr g(void) {
return f;
}

$ gcc -O3 -fsplit-stack -c b.c
$ gcc -O3 -c a.c
$ gcc a.o b.o -fuse-ld=gold

I don't have a recent enough gcc for powerpc, but from what I've seen in 
the code, this should explode with a linker error.


Of course, mixin

[PATCH] s390: Add -fsplit-stack support

2016-02-02 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_expand_split_stack_call): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_reorg): Lower the split-stack pseudo-insns.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
(UNSPECV_SPLIT_STACK_MARKER): New unspec.
(split_stack_prologue): New expand.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
(split_stack_space_check): New expand.
(split_stack_sibcall): New expand.
(split_stack_sibcall_*): New insn.
(split_stack_cond_sibcall): New expand.
(split_stack_cond_sibcall_*): New insn.
(split_stack_marker): New insn.
---
Here we go.  I've also removed the "see below", since I don't really
see anything below...

 gcc/ChangeLog|  37 +++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 323 ++-
 gcc/config/s390/s390.md  | 177 ++
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 609 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1171 insertions(+), 7 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9a2cec8..af86079 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,40 @@
+2016-02-02  Marcin Kościelnicki  
+
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_expand_split_stack_call): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_reorg): Lower the split-stack pseudo-insns.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
+   (UNSPECV_SPLIT_STACK_MARKER): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_call): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call): New expand.
+   (split_stack_cond_call_*): New insn.
+   (split_stack_space_check): New expand.
+   (split_stack_sibcall): New expand.
+   (split_stack_sibcall_*): New insn.
+   (split_stack_cond_sibcall): New expand.
+   (split_stack_cond_sibcall_*): New insn.
+   (split_stack_marker): New insn.
+
 2016-02-02  Thomas Schwinge  
 
* omp-builtins.def (BUILT_IN_GOACC_HOST_DATA): Remove.
diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_han

[PATCH] s390: Add -fsplit-stack support

2016-02-02 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_expand_split_stack_call): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_reorg): Lower the split-stack pseudo-insns.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
(UNSPECV_SPLIT_STACK_MARKER): New unspec.
(split_stack_prologue): New expand.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
(split_stack_space_check): New expand.
(split_stack_sibcall): New expand.
(split_stack_sibcall_*): New insn.
(split_stack_cond_sibcall): New expand.
(split_stack_cond_sibcall_*): New insn.
(split_stack_marker): New insn.
---
I've implemented most of your requested changes, with two exceptions:

- I don't use virtual_incoming_args_rtx in s390_expand_split_stack_prologue,
  since this causes constraint error - I suppose it just cannot be used after
  reload.
- It seems to me there's no problem with TPF and r1 - the conditional you
  mention is meant to avoid modifying r14 (which we do - by aiming at r1 and
  r12 for arg pointer and temp, respectively), not to ensure use of r1 as the
  temporary.  Unless there's a good reason to avoid modifying r12, the code
  seems fine to me.

As for the testcase we discussed, I'll submit it as a separate patch.


 gcc/ChangeLog|  37 +++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 321 +-
 gcc/config/s390/s390.md  | 177 ++
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 609 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1170 insertions(+), 6 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9a2cec8..af86079 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,40 @@
+2016-02-02  Marcin Kościelnicki  
+
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_expand_split_stack_call): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_reorg): Lower the split-stack pseudo-insns.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
+   (UNSPECV_SPLIT_STACK_MARKER): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_call): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call): New expand.
+   (split_stack_cond_call_*): New insn.
+   (split_stack_space_check): New expand.
+ 

Re: [PATCH] s390: Add -fsplit-stack support

2016-01-29 Thread Marcin Kościelnicki

On 29/01/16 14:33, Andreas Krebbel wrote:

Hi Marcin,

sorry for the late feedback.

A few comments regarding the split stack implementation:

The GNU coding style requires to replace every 8 leading blanks on a
line with a tab.  There are many lines in your patch violating this.
In case you are an emacs user `whitespace-cleanup' will fix this for
you.


OK, will do.


Could you please add a testcase checking the different
variants. I.e. with early exit, no-alloc in __morestack, and with an
actual allocation?


The testsuite with -fsplit-stack already hits all of them, and checking 
them manually is rather tricky (I don't know if it could be done in 
target-independent way at all), but I think it'd be reasonable to make 
assembly testcases calling __morestack for the last two cases, to check 
if the registers are being preserved, etc.




There are a few more comments inline.

Bye,

-Andreas-


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c881d52..71f6f38 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,38 @@
  2016-01-16  Marcin Kościelnicki  

+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_expand_split_stack_call): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_reorg): Lower the split-stack pseudo-insns.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
+   (UNSPECV_SPLIT_STACK_MARKER): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call_*): New insn.
+   (split_stack_space_check): New expand.
+   (split_stack_sibcall_*): New insn.
+   (split_stack_cond_sibcall_*): New insn.
+   (split_stack_marker): New insn.
+
+2016-01-02  Marcin Kościelnicki  
+
* cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
with side effects.

diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
  }
  }

+/* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
+   We don't verify it, since earlier versions just have padding at
+   its place, which works just as well.  */
+
+static bool
+s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
+  struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
  #undef TARGET_DEFAULT_TARGET_FLAGS
  #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)

@@ -117,4 +128,7 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
  #undef TARGET_OPTION_INIT_STRUCT
  #define TARGET_OPTION_INIT_STRUCT s390_option_init_struct

+#undef TARGET_SUPPORTS_SPLIT_STACK
+#define TARGET_SUPPORTS_SPLIT_STACK s390_supports_split_stack
+
  struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 633bc1e..09032c9 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -42,6 +42,7 @@ extern bool s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
  extern HOST_WIDE_INT s390_initial_elimination_offset (int, int);
  extern void s390_emit_prologue (void);
  extern void s390_emit_epilogue (bool);
+extern void s390_expand_split_stack_prologue (void);
  extern bool s390_can_use_simple_return_insn (void);
  extern bool s390_can_use_return_insn (void);
  extern void s390_function_profiler (FILE *, int);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3be64de..6afce7c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -426,6 +426,13 @@ struct GTY(()) machine_function
/* True if the current function may contain a tbegin clobbering
   FPRs.  */
bool tbegin_p;
+
+  /* For -fsplit-stack support: A stack local which holds a pointer to
+ the stack arguments for a function with a variable number of
+ arguments.  This is set at the start of

Re: [PATCH 4/5] Don't mark targets of unconditional jumps with side effects as FALLTHRU.

2016-01-22 Thread Marcin Kościelnicki

On 22/01/16 08:44, Andreas Krebbel wrote:

On 01/22/2016 12:10 AM, Jeff Law wrote:

On 01/21/2016 03:05 AM, Andreas Krebbel wrote:

On 01/02/2016 08:16 PM, Marcin Kościelnicki wrote:

When an unconditional jump with side effects targets an immediately
following label, rtl_tidy_fallthru_edge is called.  Since it has side
effects, it doesn't remove the jump, but the label is still marked
as fallthru.  This later causes a verification error.  Do nothing in this
case instead.

gcc/ChangeLog:

* cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
with side effects.


The change looks ok to me (although I'm not able to approve it). Could you 
please run regressions
tests on x86_64 with that change?

Perhaps a short comment in the code would be good.

I think the patch is technically fine, the question is does it fix a
visible bug?  I read the series as new feature enablement so I put this
patch into my gcc7 queue.


We need the patch for the S/390 split-stack implementation which we would like 
to see in GCC 6.  I'm
aware that this isn't stage 3 material but people seem to have reasons to 
really want split stack on
S/390 asap and we would have to backport this feature anyway. Therefore I would 
prefer to have it in
the official release already. That's the only common code change we would need 
for that.

I've started a bootstrap and regression test for the patch also on Power.

Do you see a chance we can get this into GCC 6?

Bye,

-Andreas-



I've tested the patch on x86_64, no regressions.

I'm not entirely sure if the patch needs to go in for the current 
version of split-stack support.


This patch fixed a showstopper bug on g5 CPUs when the patch still 
supported them.  I haven't seen this bug with the z900 sequences (which 
are now the only ones left), but since we're still using unconditional 
jumps with side effects, I left it in just to be safe.  The testsuite 
passes on s390x -fsplit-stack both with the patch and without it.


So, I don't know.  It seems to work now, probably because no 
optimization pass has a reason to touch that jump, but it may start to 
fail if someone adds a new optimization that tries to be smart with our 
prologue.


Re: [PATCH 5/5] s390: Add -fsplit-stack support

2016-01-21 Thread Marcin Kościelnicki

On 21/01/16 11:12, Andreas Krebbel wrote:

On 01/15/2016 10:08 PM, Marcin Kościelnicki wrote:

On 15/01/16 19:38, Andreas Krebbel wrote:

Marcin,

your implementation looks very good to me. Thanks!

But please be aware that we deprecated the support of g5 and g6 and intend to 
remove that code from
the back-end with the next GCC version.  So I would prefer if you could remove 
all the
!TARGET_CPU_ZARCH stuff from the implementation and just error out if 
split-stack is enabled with
-march g5/g6.  It currently makes the implementation more complicated and would 
have to be removed
anyway in the future.

Thanks!

https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01854.html


Bye,

-Andreas-




Very well, I'll do that.

Btw, as for dropping support for g5/g6: I've noticed
s390_function_profiler could also use larl+brasl for -m31 given
TARGET_CPU_ZARCH.  Should I submit a patch for that?  I'm asking because
gold with -fsplit-stack needs to know the exact sequence used, so if
it's going to change after g5/g6 removal, I'd better add it to gold now
(and make gcc always emit it for non-g5/g6, so that gold won't need to
look at the old one).


Yes please, that would be great. Good catch!

Thanks!

-Andreas-



I've submitted the gcc patch, and will soon update the gold patch.

Marcin Kościelnicki


[PATCH] s390: New mcount call sequence for z900+ CPUs in 31-bit mode.

2016-01-21 Thread Marcin Kościelnicki
On TARGET_CPU_ZARCH && !TARGET_64BIT, we can use a similiar lean mcount
call sequence to TARGET_64BIT.  The longer sequences are now used only
on deprecated g5/g6 CPUs.

Tested on s390-ibm-linux-gnu on RHEL 7.2.

gcc/ChangeLog:

* config/s390/s390.c (s390_function_profiler): Add a new sequence
for z900+ CPUs in 31-bit mode.
---
This change was mentioned in the s390 split-stack thread.

 gcc/ChangeLog  | 5 +
 gcc/config/s390/s390.c | 7 +++
 2 files changed, 12 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 0e77409..94b9bd0 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2016-01-21  Marcin Kościelnicki  
+
+   * config/s390/s390.c (s390_function_profiler): Add a new sequence
+   for z900+ CPUs in 31-bit mode.
+
 2016-01-21  Richard Biener  
 
* graphite-optimize-isl.c (get_schedule_map): Fix typo.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3be64de..eb26f18 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -11974,6 +11974,13 @@ s390_function_profiler (FILE *file, int labelno)
   output_asm_insn ("brasl\t%0,%4", op);
   output_asm_insn ("lg\t%0,%1", op);
 }
+  else if (TARGET_CPU_ZARCH)
+{
+  output_asm_insn ("st\t%0,%1", op);
+  output_asm_insn ("larl\t%2,%3", op);
+  output_asm_insn ("brasl\t%0,%4", op);
+  output_asm_insn ("l\t%0,%1", op);
+}
   else if (!flag_pic)
 {
   op[6] = gen_label_rtx ();
-- 
2.7.0



Re: [PATCH 4/5] Don't mark targets of unconditional jumps with side effects as FALLTHRU.

2016-01-21 Thread Marcin Kościelnicki

On 21/01/16 11:05, Andreas Krebbel wrote:

On 01/02/2016 08:16 PM, Marcin Kościelnicki wrote:

When an unconditional jump with side effects targets an immediately
following label, rtl_tidy_fallthru_edge is called.  Since it has side
effects, it doesn't remove the jump, but the label is still marked
as fallthru.  This later causes a verification error.  Do nothing in this
case instead.

gcc/ChangeLog:

* cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
with side effects.


The change looks ok to me (although I'm not able to approve it). Could you 
please run regressions
tests on x86_64 with that change?

Perhaps a short comment in the code would be good.

-Andreas-



OK, I'll run the testsuite and add a comment.


Re: [PATCH 2/5] s390: Fix missing .size directives.

2016-01-21 Thread Marcin Kościelnicki

On 21/01/16 10:58, Andreas Krebbel wrote:

On 01/20/2016 02:16 PM, Andreas Krebbel wrote:

On 01/02/2016 08:16 PM, Marcin Kościelnicki wrote:

It seems at some point the .size hook was hijacked to emit some
machine-specific directives, and the actual .size directive was
forgotten.  This caused problems for split-stack support, since
linker couldn't scan the function body for non-split-stack calls.

gcc/ChangeLog:

* config/s390/s390.c (s390_asm_declare_function_size): Add code
to actually emit the .size directive.


...


  s390_asm_declare_function_size (FILE *asm_out_file,
-   const char *fnname ATTRIBUTE_UNUSED, tree decl)
+   const char *fnname, tree decl)
  {
+  if (!flag_inhibit_size_directive)
+ASM_OUTPUT_MEASURED_SIZE (asm_out_file, fnname);
if (DECL_FUNCTION_SPECIFIC_TARGET (decl) == NULL)
  return;
fprintf (asm_out_file, "\t.machine pop\n");


It would be good to use the original ASM_DECLARE_FUNCTION_SIZE macro from 
config/elfos.h here.  This
probably would require to change its name in s390.h first and then use it from
s390_asm_declare_function_size. Not really beautiful but at least changes to 
the original macro
would not require adjusting our backend.


I've looked into how the other archs are doing this and didn't find anything 
better than just
including the code from the original macro. The real fix probably would be to 
turn this into a
target hook instead.

I've committed the patch now since it fixes a real problem not only with 
split-stack.

Thanks!

-Andreas-



I did a version that reincludes elfos.h, but it didn't finish testing 
(it made it through bootstrap) before you applied the patch:


diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 21a5687..c56b909 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6832,10 +6832,17 @@ s390_asm_output_function_prefix (FILE *asm_out_file,

 /* Write an extra function footer after the very end of the function.  */

+/* Get elfos.h's original ASM_DECLARE_FUNCTION_SIZE, so that we can 
delegate

+   to it below.  */
+
+#undef ASM_DECLARE_FUNCTION_SIZE
+#include "../elfos.h"
+
 void
 s390_asm_declare_function_size (FILE *asm_out_file,
-   const char *fnname ATTRIBUTE_UNUSED, 
tree decl)

+   const char *fnname, tree decl)
 {
+  ASM_DECLARE_FUNCTION_SIZE(asm_out_file, fnname, decl);
   if (DECL_FUNCTION_SPECIFIC_TARGET (decl) == NULL)
 return;
   fprintf (asm_out_file, "\t.machine pop\n");


But, this is much uglier, and the macro is very unlikely to change in 
the first place.  I guess we should stay with the applied patch.


Thanks,

Marcin


Re: [PATCH 1/5] s390: Use proper read-only data section for literals.

2016-01-20 Thread Marcin Kościelnicki

On 20/01/16 14:11, Andreas Krebbel wrote:

On 01/02/2016 08:16 PM, Marcin Kościelnicki wrote:

Previously, .rodata was hardcoded.  For C++ vague linkage functions,
this resulted in needlessly duplicated literals.  With the new split
stack support, this resulted in link errors, due to .rodata containing
relocations to the discarded text sections.

gcc/ChangeLog:

* config/s390/s390.md (pool_section_start): Use switch_to_section
to select proper read-only data section instead of hardcoding .rodata.
(pool_section_end): Use switch_to_section to match the above.
---
  gcc/ChangeLog   |  6 ++
  gcc/config/s390/s390.md | 11 +--
  2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 23ce209..2c572a7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-01-02  Marcin Kościelnicki  
+
+   * config/s390/s390.md (pool_section_start): Use switch_to_section
+   to select proper read-only data section instead of hardcoding .rodata.
+   (pool_section_end): Use switch_to_section to match the above.
+


This is ok if bootstrap and regression tests are clean. Thanks!

-Andreas-




The bootstrap and regression tests are indeed clean for this patch and 
#2.  I don't have commit access to gcc repo, how do I get this pushed?


Marcin Kościelnicki


[PATCH] s390: Add -fsplit-stack support

2016-01-16 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_expand_split_stack_call): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_reorg): Lower the split-stack pseudo-insns.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
(UNSPECV_SPLIT_STACK_MARKER): New unspec.
(split_stack_prologue): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call_*): New insn.
(split_stack_space_check): New expand.
(split_stack_sibcall_*): New insn.
(split_stack_cond_sibcall_*): New insn.
(split_stack_marker): New insn.
---
Support for !TARGET_CPU_ZARCH removed and sorried.  I've also cleaned up
the 31-bit versions of morestack.S routines to more closely mirror their
64-bit counterparts, since I can now use the newer opcodes.

I'm also submitting a new version of the gold patch, which has support for
old CPUs likewise removed.

 gcc/ChangeLog|  33 ++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 371 -
 gcc/config/s390/s390.md  | 109 +++
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 609 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1148 insertions(+), 6 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c881d52..71f6f38 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,38 @@
 2016-01-16  Marcin Kościelnicki  
 
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_expand_split_stack_call): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_reorg): Lower the split-stack pseudo-insns.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL): New unspec.
+   (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
+   (UNSPECV_SPLIT_STACK_MARKER): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_call_*): New insn.
+   (split_stack_cond_call_*): New insn.
+   (split_stack_space_check): New expand.
+   (split_stack_sibcall_*): New insn.
+   (split_stack_cond_sibcall_*): New insn.
+   (split_stack_marker): New insn.
+
+2016-01-02  Marcin Kościelnicki  
+
* cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
with side effects.
 
diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 4519c21..1e497e6 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -105,6 +105,17 @@ s390_handle_option (struct gcc_options *opts 
ATTRIBUTE_UNUSED,
 }
 }
 
+/* -fsplit-stack uses a field i

Re: [PATCH 5/5] s390: Add -fsplit-stack support

2016-01-15 Thread Marcin Kościelnicki

On 15/01/16 19:38, Andreas Krebbel wrote:

Marcin,

your implementation looks very good to me. Thanks!

But please be aware that we deprecated the support of g5 and g6 and intend to 
remove that code from
the back-end with the next GCC version.  So I would prefer if you could remove 
all the
!TARGET_CPU_ZARCH stuff from the implementation and just error out if 
split-stack is enabled with
-march g5/g6.  It currently makes the implementation more complicated and would 
have to be removed
anyway in the future.

Thanks!

https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01854.html


Bye,

-Andreas-




Very well, I'll do that.

Btw, as for dropping support for g5/g6: I've noticed 
s390_function_profiler could also use larl+brasl for -m31 given 
TARGET_CPU_ZARCH.  Should I submit a patch for that?  I'm asking because 
gold with -fsplit-stack needs to know the exact sequence used, so if 
it's going to change after g5/g6 removal, I'd better add it to gold now 
(and make gcc always emit it for non-g5/g6, so that gold won't need to 
look at the old one).


What about the other patches?  #1 and #2 should be ready to go.  I'm not 
sure how I should go about getting #3 and #4 reviewed.  We don't need #3 
anymore once g5/g6 support is removed, but #4 might still be necessary - 
we still have that unconditional jump.


Marcin Kościelnicki


Re: [RFC] [PR 68191] s390: Add -fsplit-stack support.

2016-01-03 Thread Marcin Kościelnicki

On 03/01/16 04:20, Ian Lance Taylor wrote:

On Sat, Jan 2, 2016 at 11:16 AM, Marcin Kościelnicki  wrote:


The differences start in the __morestack calling convention.  Basically,
since pushing things on stuck is unwieldy and there's only one free
register (%r0 could be used for static chain, %r2-%r6 contain arguments,
%r6-%r15 are callee-saved), I stuff the parameters somewhere in .rodata
or .text section, and pass the address of the parameter block in %r1.
The parameter block also contains a (position-relative) address that
__morestack should jump to (x86 just mangles the return address from
__morestack to compute that).  On zSeries CPUs, the parameter block
is stuffed somewhere in .rodata, its address loaded to %r1 by larl
instruction, and __morestack is sibling-called by jg instruction.


Does that work in a multi-threaded program if two different threads
are calling the same function at the same time and both threads need
to split the stack?


For a few more details - __morestack takes three parameters:

- function's frame size (initial frame size if it happens to use alloca 
or VLAs later)

- size function's arguments on stack (not including varargs, if any)
- a pointer to the label where execution should be continued after stack 
is allocated


All three are per-function consts.  The first two are computed by the 
compiler (though frame size can be mangled by linker for functions 
calling non-split-stack code), and the third by the linker (since it 
involves relocation).  Since the parameters are known at link time, 
they're put in a per-function block in .rodata or .text and never 
change.  Simultanous access to that area is not a problem, since it's 
never written.


Marcin Kościelnicki



Ian





Re: [RFC] [PR 68191] s390: Add -fsplit-stack support.

2016-01-03 Thread Marcin Kościelnicki

On 03/01/16 04:20, Ian Lance Taylor wrote:

On Sat, Jan 2, 2016 at 11:16 AM, Marcin Kościelnicki  wrote:


The differences start in the __morestack calling convention.  Basically,
since pushing things on stuck is unwieldy and there's only one free
register (%r0 could be used for static chain, %r2-%r6 contain arguments,
%r6-%r15 are callee-saved), I stuff the parameters somewhere in .rodata
or .text section, and pass the address of the parameter block in %r1.
The parameter block also contains a (position-relative) address that
__morestack should jump to (x86 just mangles the return address from
__morestack to compute that).  On zSeries CPUs, the parameter block
is stuffed somewhere in .rodata, its address loaded to %r1 by larl
instruction, and __morestack is sibling-called by jg instruction.


Does that work in a multi-threaded program if two different threads
are calling the same function at the same time and both threads need
to split the stack?

Ian



Sure, why not?  The parameters are link-time constants after all.

Marcin Kościelnicki


[PATCH 5/5] s390: Add -fsplit-stack support

2016-01-02 Thread Marcin Kościelnicki
libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_split_branches): Don't split split-stack pseudo-insns, rewire
split-stack prologue conditional jump instead of splitting it.
(s390_chunkify_start): Don't reload const pool register on split-stack
prologue conditional jumps.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_expand_split_stack_call_esa): New function.
(s390_expand_split_stack_call_zarch): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_reorg): Lower the split-stack pseudo-insns.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL_ZARCH): New unspec.
(UNSPECV_SPLIT_STACK_CALL_ESA): New unspec.
(UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
(UNSPECV_SPLIT_STACK_MARKER): New unspec.
(split_stack_prologue): New expand.
(split_stack_call_esa): New insn.
(split_stack_call_zarch_*): New insn.
(split_stack_cond_call_zarch_*): New insn.
(split_stack_space_check): New expand.
(split_stack_sibcall_basr): New insn.
(split_stack_sibcall_*): New insn.
(split_stack_cond_sibcall_*): New insn.
(split_stack_marker): New insn.
---
 gcc/ChangeLog|  41 ++
 gcc/common/config/s390/s390-common.c |  14 +
 gcc/config/s390/s390-protos.h|   1 +
 gcc/config/s390/s390.c   | 538 +-
 gcc/config/s390/s390.md  | 133 +++
 libgcc/ChangeLog |   7 +
 libgcc/config.host   |   4 +-
 libgcc/config/s390/morestack.S   | 718 +++
 libgcc/config/s390/t-stack-s390  |   2 +
 libgcc/generic-morestack.c   |   4 +
 10 files changed, 1454 insertions(+), 8 deletions(-)
 create mode 100644 libgcc/config/s390/morestack.S
 create mode 100644 libgcc/config/s390/t-stack-s390

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4c7046f..a4f4dff 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,46 @@
 2016-01-02  Marcin Kościelnicki  
 
+   * common/config/s390/s390-common.c (s390_supports_split_stack):
+   New function.
+   (TARGET_SUPPORTS_SPLIT_STACK): New macro.
+   * config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
+   * config/s390/s390.c (struct machine_function): New field
+   split_stack_varargs_pointer.
+   (s390_split_branches): Don't split split-stack pseudo-insns, rewire
+   split-stack prologue conditional jump instead of splitting it.
+   (s390_chunkify_start): Don't reload const pool register on split-stack
+   prologue conditional jumps.
+   (s390_register_info): Mark r12 as clobbered if it'll be used as temp
+   in s390_emit_prologue.
+   (s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
+   vararg pointer.
+   (morestack_ref): New global.
+   (SPLIT_STACK_AVAILABLE): New macro.
+   (s390_expand_split_stack_prologue): New function.
+   (s390_expand_split_stack_call_esa): New function.
+   (s390_expand_split_stack_call_zarch): New function.
+   (s390_live_on_entry): New function.
+   (s390_va_start): Use split-stack vararg pointer if appropriate.
+   (s390_reorg): Lower the split-stack pseudo-insns.
+   (s390_asm_file_end): Emit the split-stack note sections.
+   (TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
+   * config/s390/s390.md: (UNSPEC_STACK_CHECK): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL_ZARCH): New unspec.
+   (UNSPECV_SPLIT_STACK_CALL_ESA): New unspec.
+   (UNSPECV_SPLIT_STACK_SIBCALL): New unspec.
+   (UNSPECV_SPLIT_STACK_MARKER): New unspec.
+   (split_stack_prologue): New expand.
+   (split_stack_call_esa): New insn.
+   (split_stack_call_zarch_*): New insn.
+   (split_stack_cond_call_zarch_*): New insn.
+   (split_

[PATCH 1/5] s390: Use proper read-only data section for literals.

2016-01-02 Thread Marcin Kościelnicki
Previously, .rodata was hardcoded.  For C++ vague linkage functions,
this resulted in needlessly duplicated literals.  With the new split
stack support, this resulted in link errors, due to .rodata containing
relocations to the discarded text sections.

gcc/ChangeLog:

* config/s390/s390.md (pool_section_start): Use switch_to_section
to select proper read-only data section instead of hardcoding .rodata.
(pool_section_end): Use switch_to_section to match the above.
---
 gcc/ChangeLog   |  6 ++
 gcc/config/s390/s390.md | 11 +--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 23ce209..2c572a7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-01-02  Marcin Kościelnicki  
+
+   * config/s390/s390.md (pool_section_start): Use switch_to_section
+   to select proper read-only data section instead of hardcoding .rodata.
+   (pool_section_end): Use switch_to_section to match the above.
+
 2016-01-01  Sandra Loosemore  
 
PR 1078
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index a1fc96a..0ebefd6 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -10247,13 +10247,20 @@
 (define_insn "pool_section_start"
   [(unspec_volatile [(const_int 1)] UNSPECV_POOL_SECTION)]
   ""
-  ".section\t.rodata"
+{
+  switch_to_section (targetm.asm_out.function_rodata_section
+(current_function_decl));
+  return "";
+}
   [(set_attr "length" "0")])
 
 (define_insn "pool_section_end"
   [(unspec_volatile [(const_int 0)] UNSPECV_POOL_SECTION)]
   ""
-  ".previous"
+{
+  switch_to_section (current_function_section ());
+  return "";
+}
   [(set_attr "length" "0")])
 
 (define_insn "main_base_31_small"
-- 
2.6.4



[RFC] [PR 68191] s390: Add -fsplit-stack support.

2016-01-02 Thread Marcin Kościelnicki
Here's my attempt at adding -fsplit-stack support for s390 targets
(bug 68191).  Patches 1 and 2 fix s390-specific issues affecting split
stack code, and can be pushed independently of the main course.  Patches
3 and 4 attempt to fix target-independent issues involving unconditional
jumps with side effects (see below).  I'm not exactly sure I'm doing
the right thing in these, and I'd really welcome some feedback about
them and the general approach taken.  Patch 5 is split stack support
proper.  This patch should be used along with the matching glibc and
gold patches (I'll soon link them all in the bugzilla entry).

The generic approach is identical to x86: I add a new __private_ss
field to the TCB in glibc, add a target-specific __morestack function
and friends, emit a split-stack prologue, teach va_start to deal with
a dedicated vararg pointer, and teach gold to recognize the split-stack
prologue and handle non-split-stack calls by bumping the requested
frame size.

The differences start in the __morestack calling convention.  Basically,
since pushing things on stuck is unwieldy and there's only one free
register (%r0 could be used for static chain, %r2-%r6 contain arguments,
%r6-%r15 are callee-saved), I stuff the parameters somewhere in .rodata
or .text section, and pass the address of the parameter block in %r1.
The parameter block also contains a (position-relative) address that
__morestack should jump to (x86 just mangles the return address from
__morestack to compute that).  On zSeries CPUs, the parameter block
is stuffed somewhere in .rodata, its address loaded to %r1 by larl
instruction, and __morestack is sibling-called by jg instruction.
On older CPUs, lacking long jump and PC-relative load-address
instructions, I use the following sequence instead:

# load .L1 to %r1
basr %r1, 0 
.L1:
# Load __morestack to %r1
a %r1, .L2-.L1(%r1)
# Jump to __morestack and stuff return address (aka param block address)
# to %r1.
basr %r1, %r1
# param block comes here
.L3:
.long 
.long 
.long .L4-.L3
# relative __morestack address here
.L2:
.long __morestack-.L1
.L4:
# __morestack jumps here

As on other targets, the call to __morestack is conditional, based
on comparing the stack pointer with a field in TCB.  For zSeries,
I just make the jump to __morestack a conditional one, while for
older CPUs I emit a jump over the sequence.  Also, for vararg
functions, I need to stuff the vararg pointer in some register. Since
%r1 is again the only one guaranteed to be free, it's the one used.
If __morestack is called, it'll leave the correct pointer in %r1.
Otherwise, I emit a simple load-address instruction.  Since I only
need that instruction in the not-called branch (as opposed to x86
that emits it on both branches), I get terser code.

Now, here come the problems.  To keep optimization passes from
destroying the above sequence (as well as the simpler ones with larl),
I emit a pseudo-insn (split_stack_call_*) that is expanded to the
above in machine-dependent reorg phase, just like normal const pools.
The instruction is considered to be an unconditional jump to the .L4
label (since __morestack will jump to an arbitrary address selected
by param block anyway, that's what it effectively is).  For a zSeries
CPU with a conditional call, I represent the sequence as a conditional
jump instead.  So overall the sequences, as emitted by
s390_expand_split_stack_prologue, look as follows:

# (1) Old CPU, unconditional

.L4:
# Normal prologue starts here.

# (2) zSeries CPU, unconditional

.L4:
# Normal prologue starts here.

# Which will expand to:
larl %r1, .L3
jg __morestack
.section .rodata
.L3:
# Or .long for 31-bit target.
.quad 
.quad 
.quad .L4-.L3
.text

# (3) Old CPU, conditional

jhe .L5

.L5:
# Compute vararg pointer (vararg functions only)
la %r1, 96(%r15)
.L4:
# Normal prologue starts here.

# (4) zSeries CPU, conditional


# Compute vararg pointer (vararg functions only)
la %r1, 160(%r15)
.L4:
# Normal prologue starts here.
# Expands as above, except with jgl instead of jg.

Case (4) is the least problematic: conditional jumps with side effects
appear to work quite well.  However, the other variants involve an
unconditional jump with side effects, which causes two problems:

- If the jump is to immediately following label (which will happen always
  in cases (1) and (2), and for non-vararg functions in (3)),
  rtl_tidy_fallthru_edge mistakenly marks it as a fallthru edge, even
  though it correctly figures the jump cannot be removed due to the side
  effects.  This causes a verification failure later.
- In case (3), since the call to __morestack is considered to be unlikely,
  the basic block with the call pseudo-insn will be moved to the end of
  the function if we're optimizing.  Since it already ends with
  an unconditional jump, no new jump will be inserted (as opposed to x86).
  Soon afterwards, reposition_prologue_and_epilogue_notes will move
  NOTE_INSN_PROLOGUE_END after the last prologue instruction, which i

[PATCH 3/5] Fix NOTE_INSN_PROLOGUE_END after unconditional jump.

2016-01-02 Thread Marcin Kościelnicki
With the new s390 split-stack support, when optimization is enabled,
the cold path of calling __morestack is likely to be moved to the
end of the function.  This will result in the function ending in
split_stack_call_esa, which is an unconditional jump instruction and
part of the function prologue.  reposition_prologue_and_epilogue_notes
will insert NOTE_INSN_PROLOGUE_END right after it (and before the
following barrier), causing a verification error.  Insert it after
the barrier instead (and outside of basic block).

gcc/ChangeLog:

* function.c (reposition_prologue_and_epilogue_notes): Avoid
verification error if the last insn of prologue is an unconditional
jump.
---
 gcc/ChangeLog  | 6 ++
 gcc/function.c | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6aef3f9..56e31f6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
 2016-01-02  Marcin Kościelnicki  
 
+   * function.c (reposition_prologue_and_epilogue_notes): Avoid
+   verification error if the last insn of prologue is an unconditional
+   jump.
+
+2016-01-02  Marcin Kościelnicki  
+
* config/s390/s390.c (s390_asm_declare_function_size): Add code
to actually emit the .size directive.
 
diff --git a/gcc/function.c b/gcc/function.c
index 035a49e..921945f 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -6348,6 +6348,12 @@ reposition_prologue_and_epilogue_notes (void)
  /* Avoid placing note between CODE_LABEL and BASIC_BLOCK note.  */
  if (LABEL_P (last))
last = NEXT_INSN (last);
+ if (BARRIER_P (last) && BLOCK_FOR_INSN (note))
+   {
+ if (BB_END (BLOCK_FOR_INSN (note)) == note)
+   BB_END (BLOCK_FOR_INSN (note)) = PREV_INSN (note);
+ BLOCK_FOR_INSN (note) = 0;
+   }
  reorder_insns (note, note, last);
}
 }
-- 
2.6.4



[PATCH 2/5] s390: Fix missing .size directives.

2016-01-02 Thread Marcin Kościelnicki
It seems at some point the .size hook was hijacked to emit some
machine-specific directives, and the actual .size directive was
forgotten.  This caused problems for split-stack support, since
linker couldn't scan the function body for non-split-stack calls.

gcc/ChangeLog:

* config/s390/s390.c (s390_asm_declare_function_size): Add code
to actually emit the .size directive.
---
 gcc/ChangeLog  | 5 +
 gcc/config/s390/s390.c | 4 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2c572a7..6aef3f9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2016-01-02  Marcin Kościelnicki  
 
+   * config/s390/s390.c (s390_asm_declare_function_size): Add code
+   to actually emit the .size directive.
+
+2016-01-02  Marcin Kościelnicki  
+
* config/s390/s390.md (pool_section_start): Use switch_to_section
to select proper read-only data section instead of hardcoding .rodata.
(pool_section_end): Use switch_to_section to match the above.
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 16045f0..9dc8d1e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6834,8 +6834,10 @@ s390_asm_output_function_prefix (FILE *asm_out_file,
 
 void
 s390_asm_declare_function_size (FILE *asm_out_file,
-   const char *fnname ATTRIBUTE_UNUSED, tree decl)
+   const char *fnname, tree decl)
 {
+  if (!flag_inhibit_size_directive)
+ASM_OUTPUT_MEASURED_SIZE (asm_out_file, fnname);
   if (DECL_FUNCTION_SPECIFIC_TARGET (decl) == NULL)
 return;
   fprintf (asm_out_file, "\t.machine pop\n");
-- 
2.6.4



[PATCH 4/5] Don't mark targets of unconditional jumps with side effects as FALLTHRU.

2016-01-02 Thread Marcin Kościelnicki
When an unconditional jump with side effects targets an immediately
following label, rtl_tidy_fallthru_edge is called.  Since it has side
effects, it doesn't remove the jump, but the label is still marked
as fallthru.  This later causes a verification error.  Do nothing in this
case instead.

gcc/ChangeLog:

* cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
with side effects.
---
 gcc/ChangeLog | 5 +
 gcc/cfgrtl.c  | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 56e31f6..4c7046f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2016-01-02  Marcin Kościelnicki  
 
+   * cfgrtl.c (rtl_tidy_fallthru_edge): Bail for unconditional jumps
+   with side effects.
+
+2016-01-02  Marcin Kościelnicki  
+
* function.c (reposition_prologue_and_epilogue_notes): Avoid
verification error if the last insn of prologue is an unconditional
jump.
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index fbfc7cd..dc4c2b1 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -1762,6 +1762,8 @@ rtl_tidy_fallthru_edge (edge e)
  If block B consisted only of this single jump, turn it into a deleted
  note.  */
   q = BB_END (b);
+  if (JUMP_P (q) && !onlyjump_p (q))
+return;
   if (JUMP_P (q)
   && onlyjump_p (q)
   && (any_uncondjump_p (q)
-- 
2.6.4