Re: patch to fix PR93564

2020-02-28 Thread Vladimir Makarov



On 2020-02-24 4:54 a.m., Christophe Lyon wrote:

Hi,

On Sun, 23 Feb 2020 at 22:26, Vladimir Makarov  wrote:

The following patch is for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

The patch was successfully bootstrapped on x86-64 and benchmarked on
SPEC2000.



It seems this patch causes regression on some arm cores (seen on
cortex-a15 and cortex-57).
When configuring
--target arm-none-linux-gnueabihf
--with-cpu cortex-a57
--with-fpu crypto-neon-fp-armv8

gcc.target/arm/unaligned-memcpy-2.c: ldrd found 1 times
FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times ldrd 0
gcc.target/arm/unaligned-memcpy-2.c: strd found 2 times
FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times strd 1

This one regresses only on cortex-a15:
gcc.target/arm/unaligned-memcpy-3.c: ldrd found 2 times
FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times ldrd 1
gcc.target/arm/unaligned-memcpy-3.c: strd found 1 times
FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times strd 0


I've just submitted a patch which I hope will solve some of the failures.




Re: patch to fix PR93564

2020-02-27 Thread Vladimir Makarov



On 2020-02-27 7:33 a.m., Andrew Stubbs wrote:

On 26/02/2020 15:16, Andrew Stubbs wrote:
The problem appears to be that the high-part of a register pair is 
not marked as "ever live".  I'm trying to figure out whether this is 
some kind of target-specific issue that has merely been exposed, but 
it's difficult to see what's going on. I'm pretty sure I've never 
seen this one before.


I'm now pretty sure your patch didn't cause this issue so much as 
expose it.


Either way, it's fixed now.

Thank you for informing me about this.  Such heuristic changes should 
not affect generated code correctness unless they trigger a hidden bug 
in RA or machine-depended code used by RA.





Re: patch to fix PR93564

2020-02-27 Thread Andrew Stubbs

On 26/02/2020 15:16, Andrew Stubbs wrote:
The problem appears to be that the high-part of a register pair is not 
marked as "ever live".  I'm trying to figure out whether this is some 
kind of target-specific issue that has merely been exposed, but it's 
difficult to see what's going on. I'm pretty sure I've never seen this 
one before.


I'm now pretty sure your patch didn't cause this issue so much as expose it.

Either way, it's fixed now.

Andrew



Re: patch to fix PR93564

2020-02-26 Thread Andrew Stubbs

On 23/02/2020 21:25, Vladimir Makarov wrote:

The following patch is for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

The patch was successfully bootstrapped on x86-64 and benchmarked on 
SPEC2000.


Since this patch I get an ICE with checking enabled, for amdgcn-amdhsa:

during RTL pass: reload
dump file: /scratch/astubbs/amd/upstreamB/tmp/target.283r.reload
../gcc.c-torture/execute/ieee/compare-fp-1.c: In function 'ieq':
../gcc.c-torture/execute/ieee/compare-fp-1.c:33:1: internal 
compiler error: in lra_constraints, at lra-constraints.c:5051

   33 | }
  | ^
0x103d7cf lra_constraints(bool)
/gcc/lra-constraints.c:5051
0x102541a lra(_IO_FILE*)
/gcc/lra.c:2437
0xfcc9c9 do_reload
/gcc/ira.c:5523
0xfcce50 execute
/gcc/ira.c:5709
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

This happens at all optimization levels (but not -O0).

The problem appears to be that the high-part of a register pair is not 
marked as "ever live".  I'm trying to figure out whether this is some 
kind of target-specific issue that has merely been exposed, but it's 
difficult to see what's going on. I'm pretty sure I've never seen this 
one before.


Andrew


Re: patch to fix PR93564

2020-02-24 Thread Christophe Lyon
Hi,

On Sun, 23 Feb 2020 at 22:26, Vladimir Makarov  wrote:
>
> The following patch is for
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564
>
> The patch was successfully bootstrapped on x86-64 and benchmarked on
> SPEC2000.
>


It seems this patch causes regression on some arm cores (seen on
cortex-a15 and cortex-57).
When configuring
--target arm-none-linux-gnueabihf
--with-cpu cortex-a57
--with-fpu crypto-neon-fp-armv8

gcc.target/arm/unaligned-memcpy-2.c: ldrd found 1 times
FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times ldrd 0
gcc.target/arm/unaligned-memcpy-2.c: strd found 2 times
FAIL: gcc.target/arm/unaligned-memcpy-2.c scan-assembler-times strd 1

This one regresses only on cortex-a15:
gcc.target/arm/unaligned-memcpy-3.c: ldrd found 2 times
FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times ldrd 1
gcc.target/arm/unaligned-memcpy-3.c: strd found 1 times
FAIL: gcc.target/arm/unaligned-memcpy-3.c scan-assembler-times strd 0

Christophe


patch to fix PR93564

2020-02-23 Thread Vladimir Makarov

The following patch is for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93564

The patch was successfully bootstrapped on x86-64 and benchmarked on 
SPEC2000.



commit 3133bed5d0327e8a9cd0a601b7ecdb9de4fc825d
Author: Vladimir N. Makarov 
Date:   Sun Feb 23 16:20:05 2020 -0500

Changing cost propagation and ordering colorable bucket heuristics for PR93564.

2020-02-23  Vladimir Makarov  

PR rtl-optimization/93564
* ira-color.c (struct update_cost_queue_elem): New member start.
(queue_update_cost, get_next_update_cost): Add new arg start.
(allocnos_conflict_p): New function.
(update_costs_from_allocno): Add new arg conflict_cost_update_p.
Add checking conflicts with allocnos_conflict_p.
(update_costs_from_prefs, restore_costs_from_copies): Adjust
update_costs_from_allocno calls.
(update_conflict_hard_regno_costs): Add checking conflicts with
allocnos_conflict_p.  Adjust calls of queue_update_cost and
get_next_update_cost.
(assign_hard_reg): Adjust calls of queue_update_cost.  Add
debugging print.
(bucket_allocno_compare_func): Restore previous version.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cea52f00523..b3b9d92a1ec 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,20 @@
+2020-02-23  Vladimir Makarov  
+
+	PR rtl-optimization/93564
+	* ira-color.c (struct update_cost_queue_elem): New member start.
+	(queue_update_cost, get_next_update_cost): Add new arg start.
+	(allocnos_conflict_p): New function.
+	(update_costs_from_allocno): Add new arg conflict_cost_update_p.
+	Add checking conflicts with allocnos_conflict_p.
+	(update_costs_from_prefs, restore_costs_from_copies): Adjust
+	update_costs_from_allocno calls.
+	(update_conflict_hard_regno_costs): Add checking conflicts with
+	allocnos_conflict_p.  Adjust calls of queue_update_cost and
+	get_next_update_cost.
+	(assign_hard_reg): Adjust calls of queue_update_cost.  Add
+	debugging print.
+	(bucket_allocno_compare_func): Restore previous version.
+
 2020-02-21  John David Anglin  
 
 	* gcc/config/pa/pa.c (pa_function_value): Fix check for word and
diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 0bcc80463b0..0ffdd192020 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -1199,6 +1199,10 @@ struct update_cost_queue_elem
  connecting this allocno to the one being allocated.  */
   int divisor;
 
+  /* Allocno from which we started chaining costs of connected
+ allocnos. */
+  ira_allocno_t start;
+
   /* Allocno from which we are chaining costs of connected allocnos.
  It is used not go back in graph of allocnos connected by
  copies.  */
@@ -1258,10 +1262,11 @@ start_update_cost (void)
   update_cost_queue = NULL;
 }
 
-/* Add (ALLOCNO, FROM, DIVISOR) to the end of update_cost_queue, unless
+/* Add (ALLOCNO, START, FROM, DIVISOR) to the end of update_cost_queue, unless
ALLOCNO is already in the queue, or has NO_REGS class.  */
 static inline void
-queue_update_cost (ira_allocno_t allocno, ira_allocno_t from, int divisor)
+queue_update_cost (ira_allocno_t allocno, ira_allocno_t start,
+		   ira_allocno_t from, int divisor)
 {
   struct update_cost_queue_elem *elem;
 
@@ -1270,6 +1275,7 @@ queue_update_cost (ira_allocno_t allocno, ira_allocno_t from, int divisor)
   && ALLOCNO_CLASS (allocno) != NO_REGS)
 {
   elem->check = update_cost_check;
+  elem->start = start;
   elem->from = from;
   elem->divisor = divisor;
   elem->next = NULL;
@@ -1282,10 +1288,11 @@ queue_update_cost (ira_allocno_t allocno, ira_allocno_t from, int divisor)
 }
 
 /* Try to remove the first element from update_cost_queue.  Return
-   false if the queue was empty, otherwise make (*ALLOCNO, *FROM,
-   *DIVISOR) describe the removed element.  */
+   false if the queue was empty, otherwise make (*ALLOCNO, *START,
+   *FROM, *DIVISOR) describe the removed element.  */
 static inline bool
-get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *from, int *divisor)
+get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *start,
+		  ira_allocno_t *from, int *divisor)
 {
   struct update_cost_queue_elem *elem;
 
@@ -1294,6 +1301,7 @@ get_next_update_cost (ira_allocno_t *allocno, ira_allocno_t *from, int *divisor)
 
   *allocno = update_cost_queue;
   elem = &update_cost_queue_elems[ALLOCNO_NUM (*allocno)];
+  *start = elem->start;
   *from = elem->from;
   *divisor = elem->divisor;
   update_cost_queue = elem->next;
@@ -1325,18 +1333,41 @@ update_allocno_cost (ira_allocno_t allocno, int hard_regno,
   return true;
 }
 
+/* Return TRUE if allocnos A1 and A2 conflicts. Here we are
+   interesting only in conflicts of allocnos with intersected allocno
+   classes. */
+static bool
+allocnos_conflict_p (ira_allocno_t a1, ira_allocno_t a2)
+{
+  ira_object_t obj, conflict_obj;
+  ira_object_conflict_iterator oci;
+  int word, nwo