Re: [PATCH] S/390: Allow to use r1 to r4 as literal pool base.

2015-12-16 Thread Ulrich Weigand
Dominik Vogt wrote:
> On Wed, Dec 16, 2015 at 01:51:45PM +0100, Ulrich Weigand wrote:
> > Dominik Vogt wrote:
> > > > r2 through r4 should be fine.  [ Not sure if there will be many (any?) 
> > > > cases
> > > > where one of those is unused but r5 isn't, however. ]
> > > 
> > > This can happen if the function only uses register pairs
> > > (__int128).  Actually I'm not sure whether r2 and r4 are valid
> > > candidates.
> > 
> > Huh?  Why not?
> 
> Because I'm not sure it is possible to write code where r2 (r4) is
> free but r3 (r5) is not - at least when s390_emit_prologue is
> called.  Writing code that uses r4 and r5 but not r3 was diffucult
> enough:

Ah, OK.  I agree that it will rarely happen (it could in more
complex cases where something initially uses r2 but a very late
optimization pass manages to eliminate that use).

However, when it is *is* free, it is a valid candidate in the
sense that it would be safe and correct to use it.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] S/390: Allow to use r1 to r4 as literal pool base.

2015-12-16 Thread Dominik Vogt
On Wed, Dec 16, 2015 at 01:51:45PM +0100, Ulrich Weigand wrote:
> Dominik Vogt wrote:
> > > r2 through r4 should be fine.  [ Not sure if there will be many (any?) 
> > > cases
> > > where one of those is unused but r5 isn't, however. ]
> > 
> > This can happen if the function only uses register pairs
> > (__int128).  Actually I'm not sure whether r2 and r4 are valid
> > candidates.
> 
> Huh?  Why not?

Because I'm not sure it is possible to write code where r2 (r4) is
free but r3 (r5) is not - at least when s390_emit_prologue is
called.  Writing code that uses r4 and r5 but not r3 was diffucult
enough:

  __int128 gi;
  const int c = 0x12345678u;
  int foo(void)
  {
gi += c;
return c;
  }

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] S/390: Allow to use r1 to r4 as literal pool base.

2015-12-16 Thread Ulrich Weigand
Dominik Vogt wrote:
> On Mon, Dec 14, 2015 at 04:08:32PM +0100, Ulrich Weigand wrote:
> > I don't think that r1 is actually safe here.  Note that it may be used
> > (unconditionally) as temp register in s390_emit_prologue in certain cases;
> > the upcoming split-stack code will also need to use r1 in some cases.
> 
> How about the attached patch?  It also allows to use r0 as the
> temp register if possible (needs more testing).

This doesn't look safe either.  In particular:

- you use cfun_save_high_fprs_p at a place where its value might not yet
  have been determined (when calling s390_get_prologue_temp_regno from
  s390_init_frame_layout *before* the s390_register_info/s390_frame_info
  calls)

- r0 might hold the incoming static chain value, in which case it cannot
  be used as temp register

> If that's too
> much effort, I'm fine with limiting the original patch to r4 to
> r2.

That seems preferable to me.

> > r2 through r4 should be fine.  [ Not sure if there will be many (any?) cases
> > where one of those is unused but r5 isn't, however. ]
> 
> This can happen if the function only uses register pairs
> (__int128).  Actually I'm not sure whether r2 and r4 are valid
> candidates.

Huh?  Why not?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] S/390: Allow to use r1 to r4 as literal pool base.

2015-12-15 Thread Dominik Vogt
On Mon, Dec 14, 2015 at 04:08:32PM +0100, Ulrich Weigand wrote:
> Dominik Vogt wrote:
> 
> > The attached patch enables using r1 to r4 as the literal pool base pointer 
> > if
> > one of them is unused in a leaf function.  The unpatched code supports only 
> > r5
> > and r13.
> 
> I don't think that r1 is actually safe here.  Note that it may be used
> (unconditionally) as temp register in s390_emit_prologue in certain cases;
> the upcoming split-stack code will also need to use r1 in some cases.

How about the attached patch?  It also allows to use r0 as the
temp register if possible (needs more testing).  If that's too
much effort, I'm fine with limiting the original patch to r4 to
r2.

> r2 through r4 should be fine.  [ Not sure if there will be many (any?) cases
> where one of those is unused but r5 isn't, however. ]

This can happen if the function only uses register pairs
(__int128).  Actually I'm not sure whether r2 and r4 are valid
candidates.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.c (s390_init_frame_layout): Try r4 to r1 for the
literal pool pointer.
(s390_get_prologue_temp_regno): New function to choose the temp_reg for
the prologue.  Allow to use r0 if that's safe.
(s390_emit_prologue): Move code choosing the temp_reg to a separate
function.  Add assertions.
>From 806973409adc48c8ca701d55fdbad897b0e31c78 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Fri, 11 Dec 2015 11:33:23 +0100
Subject: [PATCH] S/390: Allow to use r1 to r4 as literal pool base
 pointer.

The old code only considered r5 and r13.
---
 gcc/config/s390/s390.c | 61 +++---
 1 file changed, 48 insertions(+), 13 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index bc6f05b..c45b992 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -9506,6 +9506,26 @@ s390_frame_info (void)
   & ~(STACK_BOUNDARY / BITS_PER_UNIT - 1));
 }
 
+/* Returns the register number that is used as a temp register in the prologue.
+ */
+static int
+s390_get_prologue_temp_regno (void)
+{
+  if (!has_hard_reg_initial_val (Pmode, RETURN_REGNUM)
+  && !crtl->is_leaf
+  && !TARGET_TPF_PROFILING)
+return RETURN_REGNUM;
+  if (cfun_save_high_fprs_p)
+/* Needs an address register.  */
+return 1;
+  else if (TARGET_BACKCHAIN)
+/* Does not need an address register.  */
+return 0;
+
+  /* No temp register needed.  */
+  return -1;
+}
+
 /* Generate frame layout.  Fills in register and frame data for the current
function in cfun->machine.  This routine can be called multiple times;
it will re-do the complete frame layout every time.  */
@@ -9543,10 +9563,24 @@ s390_init_frame_layout (void)
 	 as base register to avoid save/restore overhead.  */
   if (!base_used)
 	cfun->machine->base_reg = NULL_RTX;
-  else if (crtl->is_leaf && !df_regs_ever_live_p (5))
-	cfun->machine->base_reg = gen_rtx_REG (Pmode, 5);
   else
-	cfun->machine->base_reg = gen_rtx_REG (Pmode, BASE_REGNUM);
+	{
+	  int br = 0;
+
+	  if (crtl->is_leaf)
+	{
+	  int temp_regno;
+
+	  temp_regno = s390_get_prologue_temp_regno ();
+	  /* Prefer r5 (most likely to be free).  */
+	  for (br = 5;
+		   br >= 1 && (br == temp_regno || df_regs_ever_live_p (br));
+		   br--)
+		;
+	}
+	  cfun->machine->base_reg =
+	gen_rtx_REG (Pmode, (br > 0) ? br : BASE_REGNUM);
+	}
 
   s390_register_info ();
   s390_frame_info ();
@@ -10385,19 +10419,16 @@ s390_emit_prologue (void)
 {
   rtx insn, addr;
   rtx temp_reg;
+  int temp_regno;
   int i;
   int offset;
   int next_fpr = 0;
 
-  /* Choose best register to use for temp use within prologue.
- See below for why TPF must use the register 1.  */
-
-  if (!has_hard_reg_initial_val (Pmode, RETURN_REGNUM)
-  && !crtl->is_leaf
-  && !TARGET_TPF_PROFILING)
-temp_reg = gen_rtx_REG (Pmode, RETURN_REGNUM);
-  else
-temp_reg = gen_rtx_REG (Pmode, 1);
+  /* If new uses of temp_reg are introduced into the prologue, be sure to
+ update the conditions in s390_get_prologue_temp_regno().  Otherwise the
+ prologue might overwrite the literal pool pointer in r1.  */
+  temp_regno = s390_get_prologue_temp_regno ();
+  temp_reg = (temp_regno >= 0) ? gen_rtx_REG (Pmode, temp_regno) : NULL_RTX;
 
   s390_save_gprs_to_fprs ();
 
@@ -10551,7 +10582,10 @@ s390_emit_prologue (void)
 
   /* Save incoming stack pointer into temp reg.  */
   if (TARGET_BACKCHAIN || next_fpr)
-	insn = emit_insn (gen_move_insn (temp_reg, stack_pointer_rtx));
+	{
+	  gcc_assert (temp_regno >= 0);
+	  insn = emit_insn (gen_move_insn (temp_reg, stack_pointer_rtx));
+	}
 
   /* Subtract frame size from stack pointer.  */
 
@@ -10606,6 +10640,7 @@ s390_emit_prologue (void)
 
   if (cfun_save_high_fprs_p && next_fpr)
 {
+  gcc_assert (temp_regno >= 1);
   /* If the stack might be accessed through a diff

Re: [PATCH] S/390: Allow to use r1 to r4 as literal pool base.

2015-12-14 Thread Ulrich Weigand
Dominik Vogt wrote:

> The attached patch enables using r1 to r4 as the literal pool base pointer if
> one of them is unused in a leaf function.  The unpatched code supports only r5
> and r13.

I don't think that r1 is actually safe here.  Note that it may be used
(unconditionally) as temp register in s390_emit_prologue in certain cases;
the upcoming split-stack code will also need to use r1 in some cases.

r2 through r4 should be fine.  [ Not sure if there will be many (any?) cases
where one of those is unused but r5 isn't, however. ]

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com