[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.168 -> 1.169
---
Log message:

add a note

---
Diffs of the changes:  (+9 -0)

 README.txt |    9 +
 1 files changed, 9 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.168 llvm/lib/Target/X86/README.txt:1.169
--- llvm/lib/Target/X86/README.txt:1.168	Wed May  9 19:08:04 2007
+++ llvm/lib/Target/X86/README.txt	Fri May 18 15:18:14 2007
@@ -26,6 +26,15 @@
 
 ... which should only be one imul instruction.
 
+or:
+
+unsigned long long int t2(unsigned int a, unsigned int b) {
+  return (unsigned long long)a * b;
+}
+
+... which should be one mul instruction.
+
+
 This can be done with a custom expander, but it would be nice to move this to
 generic code.

_______________________________________________
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.167 -> 1.168
---
Log message:

add some notes

---
Diffs of the changes:  (+28 -0)

 README.txt |   28
 1 files changed, 28 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.167 llvm/lib/Target/X86/README.txt:1.168
--- llvm/lib/Target/X86/README.txt:1.167	Sat May  5 17:10:24 2007
+++ llvm/lib/Target/X86/README.txt	Wed May  9 19:08:04 2007
@@ -1094,5 +1094,33 @@
 has this xform, but it is currently disabled until the alignment fields of
 the load/store nodes are trustworthy.
 
+//===-===//
+Sometimes it is better to codegen subtractions from a constant (e.g. 7-x) with
+a neg instead of a sub instruction.  Consider:
+
+int test(char X) { return 7-X; }
+
+we currently produce:
+_test:
+        movl $7, %eax
+        movsbl 4(%esp), %ecx
+        subl %ecx, %eax
+        ret
+
+We would use one fewer register if codegen'd as:
+
+        movsbl 4(%esp), %eax
+        neg %eax
+        add $7, %eax
+        ret
+
+Note that this isn't beneficial if the load can be folded into the sub.  In
+this case, we want a sub:
+
+int test(int X) { return 7-X; }
+_test:
+        movl $7, %eax
+        subl 4(%esp), %eax
+        ret
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.166 -> 1.167
---
Log message:

move CodeGen/X86/overlap-add.ll here.

---
Diffs of the changes:  (+27 -0)

 README.txt |   27 +++
 1 files changed, 27 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.166 llvm/lib/Target/X86/README.txt:1.167
--- llvm/lib/Target/X86/README.txt:1.166	Mon Apr 16 19:02:37 2007
+++ llvm/lib/Target/X86/README.txt	Sat May  5 17:10:24 2007
@@ -1004,6 +1004,33 @@
         movl %edi, %eax
         ret
 
+Another example is:
+
+;; X's live range extends beyond the shift, so the register allocator
+;; cannot coalesce it with Y.  Because of this, a copy needs to be
+;; emitted before the shift to save the register value before it is
+;; clobbered.  However, this copy is not needed if the register
+;; allocator turns the shift into an LEA.  This also occurs for ADD.
+
+; Check that the shift gets turned into an LEA.
+; RUN: llvm-upgrade < %s | llvm-as | llc -march=x86 -x86-asm-syntax=intel | \
+; RUN:   not grep {mov E.X, E.X}
+
+%G = external global int
+
+int %test1(int %X, int %Y) {
+        %Z = add int %X, %Y
+        volatile store int %Y, int* %G
+        volatile store int %Z, int* %G
+        ret int %X
+}
+
+int %test2(int %X) {
+        %Z = add int %X, 1  ;; inc
+        volatile store int %Z, int* %G
+        ret int %X
+}
+
 //===-===//
 
 We use push/pop of stack space around calls in situations where we don't have
 to.
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.165 -> 1.166
---
Log message:

SSE4 is apparently public now.

---
Diffs of the changes:  (+2 -0)

 README.txt |    2 ++
 1 files changed, 2 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.165 llvm/lib/Target/X86/README.txt:1.166
--- llvm/lib/Target/X86/README.txt:1.165	Sat Apr 14 18:06:09 2007
+++ llvm/lib/Target/X86/README.txt	Mon Apr 16 19:02:37 2007
@@ -3,6 +3,8 @@
 //===-===//
 
 Missing features:
+  - Support for SSE4: http://www.intel.com/software/penryn
+    http://softwarecommunity.intel.com/isn/Downloads/Intel%20SSE4%20Programming%20Reference.pdf
   - support for 3DNow!
   - weird abis?
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.164 -> 1.165
---
Log message:

add a note

---
Diffs of the changes:  (+18 -0)

 README.txt |   18 ++
 1 files changed, 18 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.164 llvm/lib/Target/X86/README.txt:1.165
--- llvm/lib/Target/X86/README.txt:1.164	Wed Apr 11 00:34:00 2007
+++ llvm/lib/Target/X86/README.txt	Sat Apr 14 18:06:09 2007
@@ -1049,3 +1049,21 @@
 
 //===-===//
 
+This:
+#include
+unsigned test(float f) {
+  return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f ));
+}
+
+Compiles to:
+_test:
+        movss 4(%esp), %xmm0
+        movd %xmm0, %eax
+        ret
+
+it should compile to a move from the stack slot directly into eax.  DAGCombine
+has this xform, but it is currently disabled until the alignment fields of
+the load/store nodes are trustworthy.
+
+
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.163 -> 1.164
---
Log message:

done

---
Diffs of the changes:  (+0 -28)

 README.txt |   28
 1 files changed, 28 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.163 llvm/lib/Target/X86/README.txt:1.164
--- llvm/lib/Target/X86/README.txt:1.163	Tue Apr 10 16:14:01 2007
+++ llvm/lib/Target/X86/README.txt	Wed Apr 11 00:34:00 2007
@@ -1049,31 +1049,3 @@
 
 //===-===//
 
-Consider:
-
-int isnegative(unsigned int X) {
-   return !(X < 2147483648U);
-}
-
-We currently compile this to:
-
-define i32 @isnegative(i32 %X) {
-        icmp slt i32 %X, 0              ; <i1>:0 [#uses=1]
-        %retval = zext i1 %0 to i32             ; <i32> [#uses=1]
-        ret i32 %retval
-}
-
-and:
-
-_isnegative:
-        cmpl $0, 4(%esp)
-        sets %al
-        movzbl %al, %eax
-        ret
-
-We should produce:
-
-        movl    4(%esp), %eax
-        shrl    $31, %eax
-        ret
-
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.162 -> 1.163
---
Log message:

new micro optzn

---
Diffs of the changes:  (+30 -0)

 README.txt |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.162 llvm/lib/Target/X86/README.txt:1.163
--- llvm/lib/Target/X86/README.txt:1.162	Tue Apr  3 18:41:34 2007
+++ llvm/lib/Target/X86/README.txt	Tue Apr 10 16:14:01 2007
@@ -1047,3 +1047,33 @@
 }
 
+//===-===//
+
+Consider:
+
+int isnegative(unsigned int X) {
+   return !(X < 2147483648U);
+}
+
+We currently compile this to:
+
+define i32 @isnegative(i32 %X) {
+        icmp slt i32 %X, 0              ; <i1>:0 [#uses=1]
+        %retval = zext i1 %0 to i32             ; <i32> [#uses=1]
+        ret i32 %retval
+}
+
+and:
+
+_isnegative:
+        cmpl $0, 4(%esp)
+        sets %al
+        movzbl %al, %eax
+        ret
+
+We should produce:
+
+        movl    4(%esp), %eax
+        shrl    $31, %eax
+        ret
+
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.161 -> 1.162
---
Log message:

make a new missing features section

---
Diffs of the changes:  (+6 -2)

 README.txt |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.161 llvm/lib/Target/X86/README.txt:1.162
--- llvm/lib/Target/X86/README.txt:1.161	Tue Apr  3 18:37:20 2007
+++ llvm/lib/Target/X86/README.txt	Tue Apr  3 18:41:34 2007
@@ -2,6 +2,12 @@
 // Random ideas for the X86 backend.
 //===-===//
 
+Missing features:
+  - support for 3DNow!
+  - weird abis?
+
+//===-===//
+
 Add a MUL2U and MUL2S nodes to represent a multiply that returns both the Hi
 and Lo parts (combination of MUL and MULH[SU] into one node).  Add this to
 X86, & make the dag combiner produce it when needed.  This will eliminate one
@@ -1040,6 +1046,4 @@
         return 0;
 }
 
-//===-===//
-Add support for 3DNow!
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.160 -> 1.161
---
Log message:

Updated

---
Diffs of the changes:  (+4 -0)

 README.txt |    4
 1 files changed, 4 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.160 llvm/lib/Target/X86/README.txt:1.161
--- llvm/lib/Target/X86/README.txt:1.160	Wed Mar 28 13:17:19 2007
+++ llvm/lib/Target/X86/README.txt	Tue Apr  3 18:37:20 2007
@@ -1039,3 +1039,7 @@
         return f(decode);
     return 0;
 }
+
+//===-===//
+
+Add support for 3DNow!
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.159 -> 1.160
---
Log message:

add a note

---
Diffs of the changes:  (+8 -0)

 README.txt |    8
 1 files changed, 8 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.159 llvm/lib/Target/X86/README.txt:1.160
--- llvm/lib/Target/X86/README.txt:1.159	Wed Mar 21 16:16:39 2007
+++ llvm/lib/Target/X86/README.txt	Wed Mar 28 13:17:19 2007
@@ -23,6 +23,14 @@
 
 //===-===//
 
+CodeGen/X86/lea-3.ll:test3 should be a single LEA, not a shift/move.  The X86
+backend knows how to three-addressify this shift, but it appears the register
+allocator isn't even asking it to do so in this case.  We should investigate
+why this isn't happening, it could have significant impact on other important
+cases for X86 as well.
+
+//===-===//
+
 This should be one DIV/IDIV instruction, not a libcall:
 
 unsigned test(unsigned long long X, unsigned Y) {
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.158 -> 1.159
---
Log message:

add generation of unnecessary push/pop around calls

---
Diffs of the changes:  (+42 -0)

 README.txt |   42 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 42 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.158 llvm/lib/Target/X86/README.txt:1.159
--- llvm/lib/Target/X86/README.txt:1.158	Wed Mar 14 16:03:53 2007
+++ llvm/lib/Target/X86/README.txt	Wed Mar 21 16:16:39 2007
@@ -989,3 +989,45 @@
         ret
 
 //===-===//
+
+We use push/pop of stack space around calls in situations where we don't have
+to.  Call to f below produces:
+        subl $16, %esp      <
+        movl %eax, (%esp)
+        call L_f$stub
+        addl $16, %esp      <
+The stack push/pop can be moved into the prolog/epilog.  It does this because
+it's building the frame pointer, but this should not be sufficient, only the
+use of alloca should cause it to do this.
+(There are other issues shown by this code, but this is one.)
+
+typedef struct _range_t {
+    float fbias;
+    float fscale;
+    int ibias;
+    int iscale;
+    int ishift;
+    unsigned char lut[];
+} range_t;
+
+struct _decode_t {
+    int type:4;
+    int unit:4;
+    int alpha:8;
+    int N:8;
+    int bpc:8;
+    int bpp:16;
+    int skip:8;
+    int swap:8;
+    const range_t*const*range;
+};
+
+typedef struct _decode_t decode_t;
+
+extern int f(const decode_t* decode);
+
+int decode_byte (const decode_t* decode) {
+  if (decode->swap != 0)
+    return f(decode);
+  return 0;
+}
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.157 -> 1.158
---
Log message:

Notes about codegen issues.

---
Diffs of the changes:  (+47 -0)

 README.txt |   47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 47 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.157 llvm/lib/Target/X86/README.txt:1.158
--- llvm/lib/Target/X86/README.txt:1.157	Thu Mar  1 23:04:52 2007
+++ llvm/lib/Target/X86/README.txt	Wed Mar 14 16:03:53 2007
@@ -339,6 +339,53 @@
 
 //===-===//
 
+We are generating far worse code than gcc:
+
+volatile short X, Y;
+
+void foo(int N) {
+  int i;
+  for (i = 0; i < N; i++) { X = i; Y = i*4; }
+}
+
+LBB1_1: #bb.preheader
+        xorl %ecx, %ecx
+        xorw %dx, %dx
+LBB1_2: #bb
+        movl L_X$non_lazy_ptr, %esi
+        movw %dx, (%esi)
+        movw %dx, %si
+        shlw $2, %si
+        movl L_Y$non_lazy_ptr, %edi
+        movw %si, (%edi)
+        incl %ecx
+        incw %dx
+        cmpl %eax, %ecx
+        jne LBB1_2      #bb
+
+vs.
+
+        xorl    %edx, %edx
+        movl    L_X$non_lazy_ptr-"L001$pb"(%ebx), %esi
+        movl    L_Y$non_lazy_ptr-"L001$pb"(%ebx), %ecx
+L4:
+        movw    %dx, (%esi)
+        leal    0(,%edx,4), %eax
+        movw    %ax, (%ecx)
+        addl    $1, %edx
+        cmpl    %edx, %edi
+        jne     L4
+
+There are 3 issues:
+
+1. Lack of post regalloc LICM.
+2. Poor sub-regclass support.  That leads to inability to promote the 16-bit
+   arithmetic op to 32-bit and making use of leal.
+3. LSR unable to reuse IV for a different type (i16 vs. i32) even though
+   the cast would be free.
+
+//===-===//
+
 Teach the coalescer to coalesce vregs of different register classes.  e.g.
 FR32 / FR64 to VR128.
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.156 -> 1.157
---
Log message:

add a note

---
Diffs of the changes:  (+22 -0)

 README.txt |   22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.156 llvm/lib/Target/X86/README.txt:1.157
--- llvm/lib/Target/X86/README.txt:1.156	Mon Feb 12 15:20:26 2007
+++ llvm/lib/Target/X86/README.txt	Thu Mar  1 23:04:52 2007
@@ -920,3 +920,25 @@
 Though this probably isn't worth it.
 
 //===-===//
+
+We need to teach the codegen to convert two-address INC instructions to LEA
+when the flags are dead.  For example, on X86-64, compile:
+
+int foo(int A, int B) {
+  return A+1;
+}
+
+to:
+
+_foo:
+        leal    1(%edi), %eax
+        ret
+
+instead of:
+
+_foo:
+        incl %edi
+        movl %edi, %eax
+        ret
+
+//===-===//
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.155 -> 1.156
---
Log message:

more notes

---
Diffs of the changes:  (+26 -3)

 README.txt |   29 ++++++++++++++++++++++++++---
 1 files changed, 26 insertions(+), 3 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.155 llvm/lib/Target/X86/README.txt:1.156
--- llvm/lib/Target/X86/README.txt:1.155	Mon Feb 12 14:26:34 2007
+++ llvm/lib/Target/X86/README.txt	Mon Feb 12 15:20:26 2007
@@ -874,15 +874,15 @@
     if (X) abort();
 }
 
-is currently compiled to (with -static):
+is currently compiled to:
 
 _test:
         subl $12, %esp
         cmpl $0, 16(%esp)
-        jne LBB1_1      #cond_true
+        jne LBB1_1
         addl $12, %esp
         ret
-LBB1_1: #cond_true
+LBB1_1:
         call L_abort$stub
 
 It would be better to produce:
@@ -895,5 +895,28 @@
         ret
 
 This can be applied to any no-return function call that takes no arguments etc.
+Alternatively, the stack save/restore logic could be shrink-wrapped, producing
+something like this:
+
+_test:
+        cmpl $0, 4(%esp)
+        jne LBB1_1
+        ret
+LBB1_1:
+        subl $12, %esp
+        call L_abort$stub
+
+Both are useful in different situations.  Finally, it could be shrink-wrapped
+and tail called, like this:
+
+_test:
+        cmpl $0, 4(%esp)
+        jne LBB1_1
+        ret
+LBB1_1:
+        pop %eax   # realign stack.
+        call L_abort$stub
+
+Though this probably isn't worth it.
 
 //===-===//
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.154 -> 1.155
---
Log message:

add a note

---
Diffs of the changes:  (+29 -0)

 README.txt |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.154 llvm/lib/Target/X86/README.txt:1.155
--- llvm/lib/Target/X86/README.txt:1.154	Thu Feb  8 17:53:38 2007
+++ llvm/lib/Target/X86/README.txt	Mon Feb 12 14:26:34 2007
@@ -868,3 +868,32 @@
 
 //===-===//
+
+This code:
+
+void test(int X) {
+  if (X) abort();
+}
+
+is currently compiled to (with -static):
+
+_test:
+        subl $12, %esp
+        cmpl $0, 16(%esp)
+        jne LBB1_1      #cond_true
+        addl $12, %esp
+        ret
+LBB1_1: #cond_true
+        call L_abort$stub
+
+It would be better to produce:
+
+_test:
+        subl $12, %esp
+        cmpl $0, 16(%esp)
+        jne L_abort$stub
+        addl $12, %esp
+        ret
+
+This can be applied to any no-return function call that takes no arguments etc.
+
+//===-===//
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.153 -> 1.154
---
Log message:

This is done.

---
Diffs of the changes:  (+0 -14)

 README.txt |   14 --------------
 1 files changed, 14 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.153 llvm/lib/Target/X86/README.txt:1.154
--- llvm/lib/Target/X86/README.txt:1.153	Sun Jan 21 01:03:37 2007
+++ llvm/lib/Target/X86/README.txt	Thu Feb  8 17:53:38 2007
@@ -665,20 +665,6 @@
 
 //===-===//
 
-We generate really bad code in some cases due to lowering SETCC/SELECT at
-legalize time, which prevents the post-legalize dag combine pass from
-understanding the code.  As a silly example, this prevents us from folding
-stuff like this:
-
-bool %test(ulong %x) {
-  %tmp = setlt ulong %x, 4294967296
-  ret bool %tmp
-}
-
-into x.h == 0
-
-//===-===//
-
 We currently compile sign_extend_inreg into two shifts:
 
 long foo(long X) {
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.152 -> 1.153
---
Log message:

add a note

---
Diffs of the changes:  (+52 -0)

 README.txt |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 52 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.152 llvm/lib/Target/X86/README.txt:1.153
--- llvm/lib/Target/X86/README.txt:1.152	Mon Jan 15 00:25:39 2007
+++ llvm/lib/Target/X86/README.txt	Sun Jan 21 01:03:37 2007
@@ -830,3 +830,55 @@
 
 the pxor is not needed, we could compare the value against itself.
 
+//===-===//
+
+These two functions have identical effects:
+
+unsigned int f(unsigned int i, unsigned int n) {++i; if (i == n) ++i; return i;}
+unsigned int f2(unsigned int i, unsigned int n) {++i; i += i == n; return i;}
+
+We currently compile them to:
+
+_f:
+        movl 4(%esp), %eax
+        movl %eax, %ecx
+        incl %ecx
+        movl 8(%esp), %edx
+        cmpl %edx, %ecx
+        jne LBB1_2      #UnifiedReturnBlock
+LBB1_1: #cond_true
+        addl $2, %eax
+        ret
+LBB1_2: #UnifiedReturnBlock
+        movl %ecx, %eax
+        ret
+_f2:
+        movl 4(%esp), %eax
+        movl %eax, %ecx
+        incl %ecx
+        cmpl 8(%esp), %ecx
+        sete %cl
+        movzbl %cl, %ecx
+        leal 1(%ecx,%eax), %eax
+        ret
+
+both of which are inferior to GCC's:
+
+_f:
+        movl    4(%esp), %edx
+        leal    1(%edx), %eax
+        addl    $2, %edx
+        cmpl    8(%esp), %eax
+        cmove   %edx, %eax
+        ret
+_f2:
+        movl    4(%esp), %eax
+        addl    $1, %eax
+        xorl    %edx, %edx
+        cmpl    8(%esp), %eax
+        sete    %dl
+        addl    %edx, %eax
+        ret
+
+//===-===//
+
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.151 -> 1.152
---
Log message:

add some notes

---
Diffs of the changes:  (+68 -0)

 README.txt |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 68 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.151 llvm/lib/Target/X86/README.txt:1.152
--- llvm/lib/Target/X86/README.txt:1.151	Fri Jan 12 13:20:47 2007
+++ llvm/lib/Target/X86/README.txt	Mon Jan 15 00:25:39 2007
@@ -762,3 +762,71 @@
 We should inline lrintf and probably other libc functions.
 
 //===-===//
+
+Start using the flags more.  For example, compile:
+
+int add_zf(int *x, int y, int a, int b) {
+     if ((*x += y) == 0)
+          return a;
+     else
+          return b;
+}
+
+to:
+       addl    %esi, (%rdi)
+       movl    %edx, %eax
+       cmovne  %ecx, %eax
+       ret
+instead of:
+
+_add_zf:
+        addl (%rdi), %esi
+        movl %esi, (%rdi)
+        testl %esi, %esi
+        cmove %edx, %ecx
+        movl %ecx, %eax
+        ret
+
+and:
+
+int add_zf(int *x, int y, int a, int b) {
+     if ((*x + y) < 0)
+          return a;
+     else
+          return b;
+}
+
+to:
+
+add_zf:
+        addl    (%rdi), %esi
+        movl    %edx, %eax
+        cmovns  %ecx, %eax
+        ret
+
+instead of:
+
+_add_zf:
+        addl (%rdi), %esi
+        testl %esi, %esi
+        cmovs %edx, %ecx
+        movl %ecx, %eax
+        ret
+
+//===-===//
+
+This:
+#include
+int foo(double X) { return isnan(X); }
+
+compiles to (-m64):
+
+_foo:
+        pxor %xmm1, %xmm1
+        ucomisd %xmm1, %xmm0
+        setp %al
+        movzbl %al, %eax
+        ret
+
+the pxor is not needed, we could compare the value against itself.
+
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ATTAsmPrinter.cpp X86AsmPrinter.cpp X86AsmPrinter.h X86ISelDAGToDAG.cpp X86ISelLowering.cpp X86RegisterInfo.cpp X86Subtarget.cpp X86Subtarget.h X8
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.150 -> 1.151
X86ATTAsmPrinter.cpp updated: 1.83 -> 1.84
X86AsmPrinter.cpp updated: 1.224 -> 1.225
X86AsmPrinter.h updated: 1.41 -> 1.42
X86ISelDAGToDAG.cpp updated: 1.141 -> 1.142
X86ISelLowering.cpp updated: 1.313 -> 1.314
X86RegisterInfo.cpp updated: 1.188 -> 1.189
X86Subtarget.cpp updated: 1.47 -> 1.48
X86Subtarget.h updated: 1.25 -> 1.26
X86TargetMachine.cpp updated: 1.134 -> 1.135
---
Log message:

* PIC codegen for X86/Linux has been implemented
* PIC-aware internal structures in X86 Codegen have been refactored
* Visibility (default/weak) has been added
* Docs fixes (external weak linkage, visibility, formatting)

---
Diffs of the changes:  (+201 -97)

 README.txt           |    4 -
 X86ATTAsmPrinter.cpp |  109 +++
 X86AsmPrinter.cpp    |   13 ++
 X86AsmPrinter.h      |   11 -
 X86ISelDAGToDAG.cpp  |   17 +++
 X86ISelLowering.cpp  |   79 +++-
 X86RegisterInfo.cpp  |    4 +
 X86Subtarget.cpp     |   15 +++
 X86Subtarget.h       |   25 +--
 X86TargetMachine.cpp |   21 +
 10 files changed, 201 insertions(+), 97 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.150 llvm/lib/Target/X86/README.txt:1.151
--- llvm/lib/Target/X86/README.txt:1.150	Fri Jan  5 19:30:45 2007
+++ llvm/lib/Target/X86/README.txt	Fri Jan 12 13:20:47 2007
@@ -534,10 +534,6 @@
 
 //===-===//
 
-We should handle __attribute__ ((__visibility__ ("hidden"))).
-
-//===-===//
-
 int %foo(int* %a, int %t) {
 entry:
         br label %cond_true

Index: llvm/lib/Target/X86/X86ATTAsmPrinter.cpp
diff -u llvm/lib/Target/X86/X86ATTAsmPrinter.cpp:1.83 llvm/lib/Target/X86/X86ATTAsmPrinter.cpp:1.84
--- llvm/lib/Target/X86/X86ATTAsmPrinter.cpp:1.83	Sat Jan  6 18:41:20 2007
+++ llvm/lib/Target/X86/X86ATTAsmPrinter.cpp	Fri Jan 12 13:20:47 2007
@@ -19,6 +19,7 @@
 #include "X86MachineFunctionInfo.h"
 #include "X86TargetMachine.h"
 #include "X86TargetAsmInfo.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/CallingConv.h"
 #include "llvm/Module.h"
 #include "llvm/Support/Mangler.h"
@@ -29,6 +30,21 @@
 
 STATISTIC(EmittedInsts, "Number of machine instrs printed");
 
+static std::string computePICLabel(unsigned fnNumber,
+                                   const X86Subtarget* Subtarget)
+{
+  std::string label;
+
+  if (Subtarget->isTargetDarwin()) {
+    label = "\"L" + utostr_32(fnNumber) + "$pb\"";
+  } else if (Subtarget->isTargetELF()) {
+    label = ".Lllvm$" + utostr_32(fnNumber) + "$piclabel";
+  } else
+    assert(0 && "Don't know how to print PIC label!\n");
+
+  return label;
+}
+
 /// getSectionForFunction - Return the section that we should emit the
 /// specified function body into.
 std::string X86ATTAsmPrinter::getSectionForFunction(const Function &F) const {
@@ -109,12 +125,15 @@
     }
     break;
   }
+  if (F->hasHiddenVisibility())
+    O << "\t.hidden " << CurrentFnName << "\n";
+
   O << CurrentFnName << ":\n";
   // Add some workaround for linkonce linkage on Cygwin\MinGW
   if (Subtarget->isTargetCygMing() &&
       (F->getLinkage() == Function::LinkOnceLinkage ||
       F->getLinkage() == Function::WeakLinkage))
-    O << "_llvm$workaround$fake$stub_" << CurrentFnName << ":\n";
+    O << "Lllvm$workaround$fake$stub$" << CurrentFnName << ":\n";
 
   if (Subtarget->isTargetDarwin() ||
       Subtarget->isTargetELF() ||
@@ -193,9 +212,14 @@
     if (!isMemOp) O << '$';
     O << TAI->getPrivateGlobalPrefix() << "JTI" << getFunctionNumber() << "_"
       << MO.getJumpTableIndex();
-    if (X86PICStyle == PICStyle::Stub &&
-        TM.getRelocationModel() == Reloc::PIC_)
-      O << "-\"L" << getFunctionNumber() << "$pb\"";
+
+    if (TM.getRelocationModel() == Reloc::PIC_) {
+      if (Subtarget->isPICStyleStub())
+        O << "-\"L" << getFunctionNumber() << "$pb\"";
+      else if (Subtarget->isPICStyleGOT())
+        O << "@GOTOFF";
+    }
+
     if (isMemOp && Subtarget->is64Bit() && !NotRIPRel)
       O << "(%rip)";
     return;
@@ -205,9 +229,14 @@
     if (!isMemOp) O << '$';
     O << TAI->getPrivateGlobalPrefix() << "CPI" << getFunctionNumber() << "_"
       << MO.getConstantPoolIndex();
-    if (X86PICStyle == PICStyle::Stub &&
-        TM.getRelocationModel() == Reloc::PIC_)
-      O << "-\"L" << getFunctionNumber() << "$pb\"";
+
+    if (TM.getRelocationModel() == Reloc::PIC_) {
+      if (Subtarget->isPICStyleStub())
+        O << "-\"L" << getFunctionNumber() << "$pb\"";
+      if (Subtarget->isPICStyleGOT())
+        O << "@GOTOFF";
+    }
+
     int Offset = MO.getOffset();
     if (Offset > 0)
       O << "+" << Offset;
@@ -228,11 +257,11 @@
     bool isExt = (GV->isExternal() || GV->hasWeakLinkage() ||
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.149 -> 1.150
---
Log message:

new note

---
Diffs of the changes:  (+5 -0)

 README.txt |    5 +
 1 files changed, 5 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.149 llvm/lib/Target/X86/README.txt:1.150
--- llvm/lib/Target/X86/README.txt:1.149	Wed Jan  3 13:12:31 2007
+++ llvm/lib/Target/X86/README.txt	Fri Jan  5 19:30:45 2007
@@ -761,3 +761,8 @@
     return 0;
 }
 
+//===-===//
+
+We should inline lrintf and probably other libc functions.
+
+//===-===//
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.148 -> 1.149
---
Log message:

fix testcase.  It's not safe to strictly evaluate a load that should be lazy.

---
Diffs of the changes:  (+2 -1)

 README.txt |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.148 llvm/lib/Target/X86/README.txt:1.149
--- llvm/lib/Target/X86/README.txt:1.148	Thu Dec 21 19:03:22 2006
+++ llvm/lib/Target/X86/README.txt	Wed Jan  3 13:12:31 2007
@@ -755,8 +755,9 @@
 //===-===//
 
 This could be a single 16-bit load.
+
 int f(char *p) {
-    if (p[0] == 1 && p[1] == 2) return 1;
+    if ((p[0] == 1) & (p[1] == 2)) return 1;
     return 0;
 }
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.147 -> 1.148
---
Log message:

add a note

---
Diffs of the changes:  (+9 -0)

 README.txt |    9 +
 1 files changed, 9 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.147 llvm/lib/Target/X86/README.txt:1.148
--- llvm/lib/Target/X86/README.txt:1.147	Sun Dec 10 19:20:25 2006
+++ llvm/lib/Target/X86/README.txt	Thu Dec 21 19:03:22 2006
@@ -751,3 +751,12 @@
 //===-===//
 
 In c99 mode, the preprocessor doesn't like assembly comments like #TRUNCATE.
+
+//===-===//
+
+This could be a single 16-bit load.
+int f(char *p) {
+    if (p[0] == 1 && p[1] == 2) return 1;
+    return 0;
+}
+
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.145 -> 1.146
---
Log message:

New entries.

---
Diffs of the changes:  (+20 -0)

 README.txt |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.145 llvm/lib/Target/X86/README.txt:1.146
--- llvm/lib/Target/X86/README.txt:1.145	Tue Nov 14 02:08:46 2006
+++ llvm/lib/Target/X86/README.txt	Tue Nov 28 13:59:25 2006
@@ -730,3 +730,23 @@
 except that mul isn't a commutative 2-addr instruction.  I guess this has
 to be done at isel time based on the #uses to mul?
 
+//===-===//
+
+Make sure the instruction which starts a loop does not cross a cacheline
+boundary.  This requires knowing the exact length of each machine instruction.
+That is somewhat complicated, but doable.  Example 256.bzip2:
+
+In the new trace, the hot loop has an instruction which crosses a cacheline
+boundary.  In addition to potential cache misses, this can't help decoding as I
+imagine there has to be some kind of complicated decoder reset and realignment
+to grab the bytes from the next cacheline.
+
+532  532 0x3cfc movb     (1809(%esp, %esi), %bl   <<<--- spans 2 64 byte lines
+942  942 0x3d03 movl     %dh, (1809(%esp, %esi)
+937  937 0x3d0a incl     %esi
+  3    3 0x3d0b cmpb     %bl, %dl
+ 27   27 0x3d0d jnz      0x62db
+
+//===-===//
+
+In c99 mode, the preprocessor doesn't like assembly comments like #TRUNCATE.
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.144 -> 1.145
---
Log message:

it would be nice if ctlz were lowered to bsf etc.

---
Diffs of the changes:  (+9 -0)

 README.txt |    9 +
 1 files changed, 9 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.144 llvm/lib/Target/X86/README.txt:1.145
--- llvm/lib/Target/X86/README.txt:1.144	Fri Nov 10 16:03:35 2006
+++ llvm/lib/Target/X86/README.txt	Tue Nov 14 02:08:46 2006
@@ -114,6 +114,15 @@
 however, check that these are defined for 0 and 32.  Our intrinsics are, GCC's
 aren't.
 
+Another example (use predsimplify to eliminate a select):
+
+int foo (unsigned long j) {
+  if (j)
+    return __builtin_ffs (j) - 1;
+  else
+    return 0;
+}
+
 //===-===//
 
 Use push/pop instructions in prolog/epilog sequences instead of stores off
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.142 -> 1.143
---
Log message:

this part implemented.

---
Diffs of the changes:  (+0 -29)

 README.txt |   29 -----------------------------
 1 files changed, 29 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.142 llvm/lib/Target/X86/README.txt:1.143
--- llvm/lib/Target/X86/README.txt:1.142	Thu Oct 12 17:01:26 2006
+++ llvm/lib/Target/X86/README.txt	Sun Oct 22 16:40:12 2006
@@ -607,35 +607,6 @@
         cmp eax, 6
         jz label
 
-If we aren't going to do this, we should lower the switch better.  We compile
-the code to:
-
-_f:
-        movl 8(%esp), %eax
-        movl 4(%esp), %ecx
-        cmpl $6, %ecx
-        jl LBB1_4       #entry
-        jmp LBB1_3      #entry
-LBB1_3: #entry
-        cmpl $6, %ecx
-        je LBB1_1       #bb
-        jmp LBB1_2      #UnifiedReturnBlock
-LBB1_4: #entry
-        cmpl $4, %ecx
-        jne LBB1_2      #UnifiedReturnBlock
-LBB1_1: #bb
-        incl %eax
-        ret
-LBB1_2: #UnifiedReturnBlock
-        ret
-
-In the code above, the 'if' is turned into a 'switch' at the mid-level.  It
-looks like the 'lower to branches' mode could be improved a little here.  In
-particular, the fall-through to LBB1_3 doesn't need a branch.  It would also be
-nice to eliminate the redundant "cmp 6", maybe by lowering to a linear sequence
-of compares if there are below a certain number of cases (instead of a binary
-sequence)?
-
 //===-===//
 
 GCC's ix86_expand_int_movcc function (in i386.c) has a ton of interesting
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.141 -> 1.142 --- Log message: add a note --- Diffs of the changes: (+27 -0) README.txt | 27 +++ 1 files changed, 27 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.141 llvm/lib/Target/X86/README.txt:1.142 --- llvm/lib/Target/X86/README.txt:1.141Fri Oct 6 12:39:34 2006 +++ llvm/lib/Target/X86/README.txt Thu Oct 12 17:01:26 2006 @@ -734,3 +734,30 @@ ret //===-===// + +Consider the expansion of: + +uint %test3(uint %X) { +%tmp1 = rem uint %X, 255 +ret uint %tmp1 +} + +Currently it compiles to: + +... +movl $2155905153, %ecx +movl 8(%esp), %esi +movl %esi, %eax +mull %ecx +... + +This could be "reassociated" into: + +movl $2155905153, %eax +movl 8(%esp), %ecx +mull %ecx + +to avoid the copy. In fact, the existing two-address stuff would do this +except that mul isn't a commutative 2-addr instruction. I guess this has +to be done at isel time based on the #uses to mul? + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
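For reference, the movl $2155905153 / mull pair in the note above comes from the standard magic-number expansion of unsigned remainder: 2155905153 is 0x80808081 = ceil(2^39 / 255). A hedged C sketch of that expansion (this illustrates the transform, not LLVM's implementation of it):

```c
#include <assert.h>
#include <stdint.h>

/* X % 255 without a divide: multiply by ceil(2^39/255) and shift to
   get the quotient, then subtract q*255 to recover the remainder. */
static uint32_t rem255(uint32_t x) {
    uint32_t q = (uint32_t)(((uint64_t)x * 2155905153u) >> 39); /* x / 255 */
    return x - q * 255u;                                        /* x % 255 */
}
```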
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.140 -> 1.141 --- Log message: ugly codegen --- Diffs of the changes: (+22 -0) README.txt | 22 ++ 1 files changed, 22 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.140 llvm/lib/Target/X86/README.txt:1.141 --- llvm/lib/Target/X86/README.txt:1.140Fri Oct 6 03:21:07 2006 +++ llvm/lib/Target/X86/README.txt Fri Oct 6 12:39:34 2006 @@ -711,4 +711,26 @@ into x.h == 0 +//===-===// +We currently compile sign_extend_inreg into two shifts: + +long foo(long X) { + return (long)(signed char)X; +} + +becomes: + +_foo: +movl 4(%esp), %eax +shll $24, %eax +sarl $24, %eax +ret + +This could be: + +_foo: +movsbl 4(%esp),%eax +ret + +//===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
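The two-shift sequence and the movsbl form in the note above compute the same value: both sign-extend the low byte. A sketch with fixed-width types (the note's `long` is 32-bit here); note the arithmetic right shift of a signed value is implementation-defined in C, but it is exactly what the shll/sarl pair relies on:

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low byte of x two ways. */
static int32_t sext_shifts(int32_t x) {
    return (int32_t)((uint32_t)x << 24) >> 24;  /* shll $24 ; sarl $24 */
}

static int32_t sext_byte(int32_t x) {
    return (int32_t)(int8_t)x;                  /* movsbl */
}
```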
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.138 -> 1.139 --- Log message: implemented --- Diffs of the changes: (+0 -35) README.txt | 35 --- 1 files changed, 35 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.138 llvm/lib/Target/X86/README.txt:1.139 --- llvm/lib/Target/X86/README.txt:1.138Thu Sep 21 00:46:00 2006 +++ llvm/lib/Target/X86/README.txt Thu Sep 21 01:14:54 2006 @@ -642,41 +642,6 @@ //===-===// -Compile: -int %test(ulong *%tmp) { -%tmp = load ulong* %tmp ; [#uses=1] -%tmp.mask = shr ulong %tmp, ubyte 50; [#uses=1] -%tmp.mask = cast ulong %tmp.mask to ubyte -%tmp2 = and ubyte %tmp.mask, 3 ; [#uses=1] -%tmp2 = cast ubyte %tmp2 to int ; [#uses=1] -ret int %tmp2 -} - -to: - -_test: -movl 4(%esp), %eax -movl 4(%eax), %eax -shrl $18, %eax -andl $3, %eax -ret - -instead of: - -_test: -movl 4(%esp), %eax -movl 4(%eax), %eax -shrl $18, %eax -# TRUNCATE movb %al, %al -andb $3, %al -movzbl %al, %eax -ret - -This saves a movzbl, and saves a truncate if it doesn't get coallesced right. -This is a simple DAGCombine to propagate the zext through the and. - -//===-===// - GCC's ix86_expand_int_movcc function (in i386.c) has a ton of interesting simplifications for integer "x cmp y ? a : b". For example, instead of: ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.137 -> 1.138 --- Log message: Fit in 80-cols --- Diffs of the changes: (+10 -9) README.txt | 19 ++- 1 files changed, 10 insertions(+), 9 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.137 llvm/lib/Target/X86/README.txt:1.138 --- llvm/lib/Target/X86/README.txt:1.137Wed Sep 20 01:32:10 2006 +++ llvm/lib/Target/X86/README.txt Thu Sep 21 00:46:00 2006 @@ -544,9 +544,9 @@ br label %cond_true cond_true: ; preds = %cond_true, %entry -%x.0.0 = phi int [ 0, %entry ], [ %tmp9, %cond_true ] ; [#uses=3] -%t_addr.0.0 = phi int [ %t, %entry ], [ %tmp7, %cond_true ] ; [#uses=1] -%tmp2 = getelementptr int* %a, int %x.0.0 ; [#uses=1] +%x.0.0 = phi int [ 0, %entry ], [ %tmp9, %cond_true ] +%t_addr.0.0 = phi int [ %t, %entry ], [ %tmp7, %cond_true ] +%tmp2 = getelementptr int* %a, int %x.0.0 %tmp3 = load int* %tmp2 ; [#uses=1] %tmp5 = add int %t_addr.0.0, %x.0.0 ; [#uses=1] %tmp7 = add int %tmp5, %tmp3; [#uses=2] @@ -633,11 +633,12 @@ LBB1_2: #UnifiedReturnBlock ret -In the code above, the 'if' is turned into a 'switch' at the mid-level. It looks -like the 'lower to branches' mode could be improved a little here. In particular, -the fall-through to LBB1_3 doesn't need a branch. It would also be nice to -eliminate the redundant "cmp 6", maybe by lowering to a linear sequence of -compares if there are below a certain number of cases (instead of a binary sequence)? +In the code above, the 'if' is turned into a 'switch' at the mid-level. It +looks like the 'lower to branches' mode could be improved a little here. In +particular, the fall-through to LBB1_3 doesn't need a branch. It would also be +nice to eliminate the redundant "cmp 6", maybe by lowering to a linear sequence +of compares if there are below a certain number of cases (instead of a binary +sequence)? 
//===-===// @@ -645,7 +646,7 @@ int %test(ulong *%tmp) { %tmp = load ulong* %tmp ; [#uses=1] %tmp.mask = shr ulong %tmp, ubyte 50; [#uses=1] -%tmp.mask = cast ulong %tmp.mask to ubyte ; [#uses=1] +%tmp.mask = cast ulong %tmp.mask to ubyte %tmp2 = and ubyte %tmp.mask, 3 ; [#uses=1] %tmp2 = cast ubyte %tmp2 to int ; [#uses=1] ret int %tmp2 ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.136 -> 1.137 --- Log message: add a note --- Diffs of the changes: (+16 -0) README.txt | 16 1 files changed, 16 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.136 llvm/lib/Target/X86/README.txt:1.137 --- llvm/lib/Target/X86/README.txt:1.136Mon Sep 18 00:36:54 2006 +++ llvm/lib/Target/X86/README.txt Wed Sep 20 01:32:10 2006 @@ -734,3 +734,19 @@ addl $12, %esp The add\sub pair is really unneeded here. + +//===-===// + +We generate really bad code in some cases due to lowering SETCC/SELECT at +legalize time, which prevents the post-legalize dag combine pass from +understanding the code. As a silly example, this prevents us from folding +stuff like this: + +bool %test(ulong %x) { + %tmp = setlt ulong %x, 4294967296 + ret bool %tmp +} + +into x.h == 0 + + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
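The fold the note above asks for, stated as a C identity: an unsigned 64-bit compare against 2^32 depends only on the high half, i.e. x < 2^32 exactly when x.h == 0. A small sketch to check the equivalence:

```c
#include <assert.h>
#include <stdint.h>

/* setlt ulong %x, 4294967296  vs.  testing the high dword for zero. */
static int lt_2_32(uint64_t x) {
    return x < 4294967296ULL;
}

static int high_is_zero(uint64_t x) {
    return (uint32_t)(x >> 32) == 0;  /* x.h == 0 */
}
```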
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.135 -> 1.136 --- Log message: add a note. Our 64-bit shifts are ~30% slower than gcc's --- Diffs of the changes: (+2 -1) README.txt |3 ++- 1 files changed, 2 insertions(+), 1 deletion(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.135 llvm/lib/Target/X86/README.txt:1.136 --- llvm/lib/Target/X86/README.txt:1.135Sun Sep 17 15:25:45 2006 +++ llvm/lib/Target/X86/README.txt Mon Sep 18 00:36:54 2006 @@ -59,7 +59,8 @@ But that requires good 8-bit subreg support. - +64-bit shifts (in general) expand to really bad code. Instead of using +cmovs, we should expand to a conditional branch like GCC produces. //===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ATTAsmPrinter.cpp X86AsmPrinter.cpp X86ISelDAGToDAG.cpp X86ISelLowering.cpp X86RegisterInfo.cpp
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.134 -> 1.135 X86ATTAsmPrinter.cpp updated: 1.62 -> 1.63 X86AsmPrinter.cpp updated: 1.197 -> 1.198 X86ISelDAGToDAG.cpp updated: 1.108 -> 1.109 X86ISelLowering.cpp updated: 1.260 -> 1.261 X86RegisterInfo.cpp updated: 1.169 -> 1.170 --- Log message: Added some eye-candy for Subtarget type checking Added X86 StdCall & FastCall calling conventions. Codegen will follow. --- Diffs of the changes: (+33 -7) README.txt | 26 ++ X86ATTAsmPrinter.cpp |2 +- X86AsmPrinter.cpp|4 ++-- X86ISelDAGToDAG.cpp |2 +- X86ISelLowering.cpp |2 +- X86RegisterInfo.cpp |4 ++-- 6 files changed, 33 insertions(+), 7 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.134 llvm/lib/Target/X86/README.txt:1.135 --- llvm/lib/Target/X86/README.txt:1.134Fri Sep 15 22:30:19 2006 +++ llvm/lib/Target/X86/README.txt Sun Sep 17 15:25:45 2006 @@ -707,3 +707,29 @@ //===-===// +Currently we don't have elimination of redundant stack manipulations. Consider +the code: + +int %main() { +entry: + call fastcc void %test1( ) + call fastcc void %test2( sbyte* cast (void ()* %test1 to sbyte*) ) + ret int 0 +} + +declare fastcc void %test1() + +declare fastcc void %test2(sbyte*) + + +This currently compiles to: + + subl $16, %esp + call _test5 + addl $12, %esp + subl $16, %esp + movl $_test5, (%esp) + call _test6 + addl $12, %esp + +The add\sub pair is really unneeded here. 
Index: llvm/lib/Target/X86/X86ATTAsmPrinter.cpp diff -u llvm/lib/Target/X86/X86ATTAsmPrinter.cpp:1.62 llvm/lib/Target/X86/X86ATTAsmPrinter.cpp:1.63 --- llvm/lib/Target/X86/X86ATTAsmPrinter.cpp:1.62 Thu Sep 14 13:23:27 2006 +++ llvm/lib/Target/X86/X86ATTAsmPrinter.cppSun Sep 17 15:25:45 2006 @@ -63,7 +63,7 @@ ".section __TEXT,__textcoal_nt,coalesced,pure_instructions", F); O << "\t.globl\t" << CurrentFnName << "\n"; O << "\t.weak_definition\t" << CurrentFnName << "\n"; -} else if (Subtarget->TargetType == X86Subtarget::isCygwin) { +} else if (Subtarget->isTargetCygwin()) { EmitAlignment(4, F); // FIXME: This should be parameterized somewhere. O << "\t.section\t.llvm.linkonce.t." << CurrentFnName << ",\"ax\"\n"; Index: llvm/lib/Target/X86/X86AsmPrinter.cpp diff -u llvm/lib/Target/X86/X86AsmPrinter.cpp:1.197 llvm/lib/Target/X86/X86AsmPrinter.cpp:1.198 --- llvm/lib/Target/X86/X86AsmPrinter.cpp:1.197 Thu Sep 14 13:23:27 2006 +++ llvm/lib/Target/X86/X86AsmPrinter.cpp Sun Sep 17 15:25:45 2006 @@ -83,7 +83,7 @@ } else O << TAI->getCOMMDirective() << name << "," << Size; } else { - if (Subtarget->TargetType != X86Subtarget::isCygwin) { + if (!Subtarget->isTargetCygwin()) { if (I->hasInternalLinkage()) O << "\t.local\t" << name << "\n"; } @@ -101,7 +101,7 @@ O << "\t.globl " << name << "\n" << "\t.weak_definition " << name << "\n"; SwitchToDataSection(".section __DATA,__const_coal,coalesced", I); -} else if (Subtarget->TargetType == X86Subtarget::isCygwin) { +} else if (Subtarget->isTargetCygwin()) { O << "\t.section\t.llvm.linkonce.d." << name << ",\"aw\"\n" << "\t.weak " << name << "\n"; } else { Index: llvm/lib/Target/X86/X86ISelDAGToDAG.cpp diff -u llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1.108 llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1.109 --- llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:1.108 Thu Sep 14 18:55:02 2006 +++ llvm/lib/Target/X86/X86ISelDAGToDAG.cpp Sun Sep 17 15:25:45 2006 @@ -468,7 +468,7 @@ /// the main function. 
void X86DAGToDAGISel::EmitSpecialCodeForMain(MachineBasicBlock *BB, MachineFrameInfo *MFI) { - if (Subtarget->TargetType == X86Subtarget::isCygwin) + if (Subtarget->isTargetCygwin()) BuildMI(BB, X86::CALLpcrel32, 1).addExternalSymbol("__main"); // Switch the FPU to 64-bit precision mode for better compatibility and speed. Index: llvm/lib/Target/X86/X86ISelLowering.cpp diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.260 llvm/lib/Target/X86/X86ISelLowering.cpp:1.261 --- llvm/lib/Target/X86/X86ISelLowering.cpp:1.260 Sun Sep 17 08:06:18 2006 +++ llvm/lib/Target/X86/X86ISelLowering.cpp Sun Sep 17 15:25:45 2006 @@ -3907,7 +3907,7 @@ MachineFunction &MF = DAG.getMachineFunction(); const Function* Fn = MF.getFunction(); if (Fn->hasExternalLinkage() && - Subtarget->TargetType == X86Subtarget::isCygwin && + Subtarget->isTargetCygwin() && Fn->getName() == "main") MF.getInfo()->setForceFramePointer(true); Index: llvm/lib/Target/X86/X86RegisterInfo.cpp diff -u llvm/lib/Target/X86/X86Regis
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.133 -> 1.134 --- Log message: add a note --- Diffs of the changes: (+3 -0) README.txt |3 +++ 1 files changed, 3 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.133 llvm/lib/Target/X86/README.txt:1.134 --- llvm/lib/Target/X86/README.txt:1.133Wed Sep 13 18:37:16 2006 +++ llvm/lib/Target/X86/README.txt Fri Sep 15 22:30:19 2006 @@ -18,6 +18,9 @@ ... which should only be one imul instruction. +This can be done with a custom expander, but it would be nice to move this to +generic code. + //===-===// This should be one DIV/IDIV instruction, not a libcall: ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.132 -> 1.133 --- Log message: add note about switch lowering --- Diffs of the changes: (+29 -0) README.txt | 29 + 1 files changed, 29 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.132 llvm/lib/Target/X86/README.txt:1.133 --- llvm/lib/Target/X86/README.txt:1.132Tue Sep 12 23:19:50 2006 +++ llvm/lib/Target/X86/README.txt Wed Sep 13 18:37:16 2006 @@ -607,6 +607,34 @@ cmp eax, 6 jz label +If we aren't going to do this, we should lower the switch better. We compile +the code to: + +_f: +movl 8(%esp), %eax +movl 4(%esp), %ecx +cmpl $6, %ecx +jl LBB1_4 #entry +jmp LBB1_3 #entry +LBB1_3: #entry +cmpl $6, %ecx +je LBB1_1 #bb +jmp LBB1_2 #UnifiedReturnBlock +LBB1_4: #entry +cmpl $4, %ecx +jne LBB1_2 #UnifiedReturnBlock +LBB1_1: #bb +incl %eax +ret +LBB1_2: #UnifiedReturnBlock +ret + +In the code above, the 'if' is turned into a 'switch' at the mid-level. It looks +like the 'lower to branches' mode could be improved a little here. In particular, +the fall-through to LBB1_3 doesn't need a branch. It would also be nice to +eliminate the redundant "cmp 6", maybe by lowering to a linear sequence of +compares if there are below a certain number of cases (instead of a binary sequence)? + //===-===// Compile: @@ -675,3 +703,4 @@ etc. //===-===// + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
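The or/cmp trick at the top of the diff above, written out in C: 4 and 6 differ only in bit 1, so forcing that bit with `a | 2` collapses both cases into a single compare. A sketch of the equivalence:

```c
#include <assert.h>

/* Reference form from the note, and the single-compare form. */
static int f_ref(int a, int b) {
    if (a == 4 || a == 6)
        b++;
    return b;
}

static int f_or(int a, int b) {
    if ((a | 2) == 6)   /* or eax, 2 ; cmp eax, 6 ; jz */
        b++;
    return b;
}
```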
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.131 -> 1.132 --- Log message: new note --- Diffs of the changes: (+32 -0) README.txt | 32 1 files changed, 32 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.131 llvm/lib/Target/X86/README.txt:1.132 --- llvm/lib/Target/X86/README.txt:1.131Tue Sep 12 22:54:54 2006 +++ llvm/lib/Target/X86/README.txt Tue Sep 12 23:19:50 2006 @@ -643,3 +643,35 @@ This is a simple DAGCombine to propagate the zext through the and. //===-===// + +GCC's ix86_expand_int_movcc function (in i386.c) has a ton of interesting +simplifications for integer "x cmp y ? a : b". For example, instead of: + +int G; +void f(int X, int Y) { + G = X < 0 ? 14 : 13; +} + +compiling to: + +_f: +movl $14, %eax +movl $13, %ecx +movl 4(%esp), %edx +testl %edx, %edx +cmovl %eax, %ecx +movl %ecx, _G +ret + +it could be: +_f: +movl4(%esp), %eax +sarl$31, %eax +notl%eax +addl$14, %eax +movl%eax, _G +ret + +etc. + +//===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
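The sarl/notl/addl sequence from the note above, written out in C: it computes X < 0 ? 14 : 13 with no compare and no cmov. This sketch assumes the usual arithmetic right shift for signed values (implementation-defined in C, but what the sarl relies on):

```c
#include <assert.h>
#include <stdint.h>

/* Branchless X < 0 ? 14 : 13, mirroring sarl $31 ; notl ; addl $14. */
static int32_t select_14_13(int32_t x) {
    int32_t m = ~(x >> 31);  /* 0 when x < 0, -1 otherwise */
    return m + 14;           /* 14 when x < 0, 13 otherwise */
}
```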
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.130 -> 1.131 --- Log message: new note --- Diffs of the changes: (+13 -0) README.txt | 13 + 1 files changed, 13 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.130 llvm/lib/Target/X86/README.txt:1.131 --- llvm/lib/Target/X86/README.txt:1.130Tue Sep 12 22:22:10 2006 +++ llvm/lib/Target/X86/README.txt Tue Sep 12 22:54:54 2006 @@ -45,6 +45,19 @@ Another useful one would be ~0ULL >> X and ~0ULL << X. +One better solution for 1LL << x is: +xorl%eax, %eax +xorl%edx, %edx +testb $32, %cl +sete%al +setne %dl +sall%cl, %eax +sall%cl, %edx + +But that requires good 8-bit subreg support. + + + //===-===// Compile this: ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
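The sete/setne sequence from the note above, modeled in C: on ia32 the low word of 1LL << x is (x < 32) shifted by x & 31, and the high word is (x >= 32) shifted by the same amount — exactly what the two set instructions feed into the two sall instructions (sall masks the count to 31 in hardware; the mask is explicit here):

```c
#include <assert.h>

/* 1LL << x for 0 <= x <= 63, via the sete/setne expansion. */
static unsigned long long shift1(unsigned x) {
    unsigned lo = (unsigned)((x & 32) == 0) << (x & 31); /* sete %al ; sall %cl, %eax */
    unsigned hi = (unsigned)((x & 32) != 0) << (x & 31); /* setne %dl ; sall %cl, %edx */
    return ((unsigned long long)hi << 32) | lo;
}
```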
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ISelLowering.cpp
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.129 -> 1.130 X86ISelLowering.cpp updated: 1.255 -> 1.256 --- Log message: Compile X > -1 -> test X,X; js dest This implements CodeGen/X86/jump_sign.ll. --- Diffs of the changes: (+23 -28) README.txt | 12 X86ISelLowering.cpp | 39 +++ 2 files changed, 23 insertions(+), 28 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.129 llvm/lib/Target/X86/README.txt:1.130 --- llvm/lib/Target/X86/README.txt:1.129Tue Sep 12 01:36:01 2006 +++ llvm/lib/Target/X86/README.txt Tue Sep 12 22:22:10 2006 @@ -630,15 +630,3 @@ This is a simple DAGCombine to propagate the zext through the and. //===-===// - -Instead of: - - cmpl $4294967295, %edx - jg LBB1_8 #cond_false49 - -emit: - - testl %edx, %edx - js LBB1_8 - -This saves a byte of code space. Index: llvm/lib/Target/X86/X86ISelLowering.cpp diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.255 llvm/lib/Target/X86/X86ISelLowering.cpp:1.256 --- llvm/lib/Target/X86/X86ISelLowering.cpp:1.255 Tue Sep 12 16:03:39 2006 +++ llvm/lib/Target/X86/X86ISelLowering.cpp Tue Sep 12 22:22:10 2006 @@ -1866,13 +1866,23 @@ /// translateX86CC - do a one to one translation of a ISD::CondCode to the X86 /// specific condition code. It returns a false if it cannot do a direct -/// translation. X86CC is the translated CondCode. Flip is set to true if the -/// the order of comparison operands should be flipped. +/// translation. X86CC is the translated CondCode. LHS/RHS are modified as +/// needed. static bool translateX86CC(ISD::CondCode SetCCOpcode, bool isFP, - unsigned &X86CC, bool &Flip) { - Flip = false; + unsigned &X86CC, SDOperand &LHS, SDOperand &RHS, + SelectionDAG &DAG) { X86CC = X86ISD::COND_INVALID; if (!isFP) { +if (SetCCOpcode == ISD::SETGT) { + if (ConstantSDNode *RHSC = dyn_cast(RHS)) +if (RHSC->isAllOnesValue()) { + // X > -1 -> X == 0, jump on sign. 
+ RHS = DAG.getConstant(0, RHS.getValueType()); + X86CC = X86ISD::COND_S; + return true; +} +} + switch (SetCCOpcode) { default: break; case ISD::SETEQ: X86CC = X86ISD::COND_E; break; @@ -1893,6 +1903,7 @@ // 0 | 0 | 1 | X < Y // 1 | 0 | 0 | X == Y // 1 | 1 | 1 | unordered +bool Flip = false; switch (SetCCOpcode) { default: break; case ISD::SETUEQ: @@ -1914,16 +1925,13 @@ case ISD::SETUO: X86CC = X86ISD::COND_P; break; case ISD::SETO: X86CC = X86ISD::COND_NP; break; } +if (Flip) + std::swap(LHS, RHS); } return X86CC != X86ISD::COND_INVALID; } -static bool translateX86CC(SDOperand CC, bool isFP, unsigned &X86CC, - bool &Flip) { - return translateX86CC(cast(CC)->get(), isFP, X86CC, Flip); -} - /// hasFPCMov - is there a floating point cmov for the specific X86 condition /// code. Current x86 isa includes the following FP cmov instructions: /// fcmovb, fcomvbe, fcomve, fcmovu, fcmovae, fcmova, fcmovne, fcmovnu. @@ -3620,12 +3628,11 @@ ISD::CondCode SetCCOpcode = cast(CC)->get(); const MVT::ValueType *VTs = DAG.getNodeValueTypes(MVT::Other, MVT::Flag); bool isFP = MVT::isFloatingPoint(Op.getOperand(1).getValueType()); - bool Flip; unsigned X86CC; VTs = DAG.getNodeValueTypes(MVT::i8, MVT::Flag); - if (translateX86CC(CC, isFP, X86CC, Flip)) { -if (Flip) std::swap(Op0, Op1); + if (translateX86CC(cast(CC)->get(), isFP, X86CC, + Op0, Op1, DAG)) { SDOperand Ops1[] = { Chain, Op0, Op1 }; Cond = DAG.getNode(X86ISD::CMP, VTs, 2, Ops1, 3).getValue(1); SDOperand Ops2[] = { DAG.getConstant(X86CC, MVT::i8), Cond }; @@ -4356,13 +4363,13 @@ break; } -bool Flip; unsigned X86CC; -translateX86CC(CC, true, X86CC, Flip); +SDOperand LHS = Op.getOperand(1); +SDOperand RHS = Op.getOperand(2); +translateX86CC(CC, true, X86CC, LHS, RHS, DAG); const MVT::ValueType *VTs = DAG.getNodeValueTypes(MVT::Other, MVT::Flag); -SDOperand Ops1[] = { DAG.getEntryNode(), Op.getOperand(Flip?2:1), - Op.getOperand(Flip?1:2) }; +SDOperand Ops1[] = { DAG.getEntryNode(), LHS, RHS }; SDOperand Cond = 
DAG.getNode(Opc, VTs, 2, Ops1, 3); VTs = DAG.getNodeValueTypes(MVT::i8, MVT::Flag); SDOperand Ops2[] = { DAG.getConstant(X86CC, MVT::i8), Cond }; ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
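The transform this commit implements, as a C identity: for signed x, x > -1 holds exactly when the sign bit is clear, which is why the cmp against -1 can become test x,x followed by a sign-flag jump. A small equivalence check:

```c
#include <assert.h>
#include <stdint.h>

/* x > -1  vs.  "sign bit of x is zero". */
static int gt_minus_one(int32_t x) {
    return x > -1;
}

static int sign_clear(int32_t x) {
    return ((uint32_t)x >> 31) == 0;  /* test %x, %x ; jns */
}
```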
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.128 -> 1.129 --- Log message: add a note --- Diffs of the changes: (+14 -0) README.txt | 14 ++ 1 files changed, 14 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.128 llvm/lib/Target/X86/README.txt:1.129 --- llvm/lib/Target/X86/README.txt:1.128Mon Sep 11 18:00:56 2006 +++ llvm/lib/Target/X86/README.txt Tue Sep 12 01:36:01 2006 @@ -628,3 +628,17 @@ This saves a movzbl, and saves a truncate if it doesn't get coallesced right. This is a simple DAGCombine to propagate the zext through the and. + +//===-===// + +Instead of: + + cmpl $4294967295, %edx + jg LBB1_8 #cond_false49 + +emit: + + testl %edx, %edx + js LBB1_8 + +This saves a byte of code space. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.127 -> 1.128 --- Log message: Testcase noticed from PR906: http://llvm.org/PR906 --- Diffs of the changes: (+34 -0) README.txt | 34 ++ 1 files changed, 34 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.127 llvm/lib/Target/X86/README.txt:1.128 --- llvm/lib/Target/X86/README.txt:1.127Mon Sep 11 17:57:51 2006 +++ llvm/lib/Target/X86/README.txt Mon Sep 11 18:00:56 2006 @@ -594,3 +594,37 @@ cmp eax, 6 jz label +//===-===// + +Compile: +int %test(ulong *%tmp) { +%tmp = load ulong* %tmp ; [#uses=1] +%tmp.mask = shr ulong %tmp, ubyte 50; [#uses=1] +%tmp.mask = cast ulong %tmp.mask to ubyte ; [#uses=1] +%tmp2 = and ubyte %tmp.mask, 3 ; [#uses=1] +%tmp2 = cast ubyte %tmp2 to int ; [#uses=1] +ret int %tmp2 +} + +to: + +_test: +movl 4(%esp), %eax +movl 4(%eax), %eax +shrl $18, %eax +andl $3, %eax +ret + +instead of: + +_test: +movl 4(%esp), %eax +movl 4(%eax), %eax +shrl $18, %eax +# TRUNCATE movb %al, %al +andb $3, %al +movzbl %al, %eax +ret + +This saves a movzbl, and saves a truncate if it doesn't get coallesced right. +This is a simple DAGCombine to propagate the zext through the and. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
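The DAGCombine the note above asks for, as a C identity: the truncate to a byte is dead because the following `and 3` keeps only bits 0-1, so the whole expression is one shift of the high dword plus a mask. A sketch of the equivalence:

```c
#include <assert.h>
#include <stdint.h>

/* (tmp >> 50) & 3 with and without the intermediate byte truncate. */
static uint32_t extract_ref(uint64_t v) {
    return (uint32_t)((uint8_t)(v >> 50) & 3);  /* with the truncate */
}

static uint32_t extract_folded(uint64_t v) {
    return ((uint32_t)(v >> 32) >> 18) & 3;     /* shrl $18 ; andl $3 */
}
```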
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.126 -> 1.127 --- Log message: add compilable testcase --- Diffs of the changes: (+6 -1) README.txt |7 ++- 1 files changed, 6 insertions(+), 1 deletion(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.126 llvm/lib/Target/X86/README.txt:1.127 --- llvm/lib/Target/X86/README.txt:1.126Mon Sep 11 00:35:17 2006 +++ llvm/lib/Target/X86/README.txt Mon Sep 11 17:57:51 2006 @@ -581,7 +581,12 @@ Codegen: -if ((variable == 4) || (variable == 6)) { stuff } +int f(int a, int b) { + if (a == 4 || a == 6) +b++; + return b; +} + as: ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.124 -> 1.125 --- Log message: Update README file. --- Diffs of the changes: (+3 -98) README.txt | 101 + 1 files changed, 3 insertions(+), 98 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.124 llvm/lib/Target/X86/README.txt:1.125 --- llvm/lib/Target/X86/README.txt:1.124Tue Aug 15 21:47:44 2006 +++ llvm/lib/Target/X86/README.txt Mon Sep 11 00:25:15 2006 @@ -80,15 +80,6 @@ //===-===// -Model X86 EFLAGS as a real register to avoid redudant cmp / test. e.g. - - cmpl $1, %eax - setg %al - testb %al, %al # unnecessary - jne .BB7 - -//===-===// - Count leading zeros and count trailing zeros: int clz(int X) { return __builtin_clz(X); } @@ -126,6 +117,8 @@ should be made smart enough to cannonicalize the load into the RHS of a compare when it can invert the result of the compare for free. +//===-===// + How about intrinsics? An example is: *res = _mm_mulhi_epu16(*A, _mm_mul_epu32(*B, *C)); @@ -140,51 +133,6 @@ //===-===// -The DAG Isel doesn't fold the loads into the adds in this testcase. The -pattern selector does. This is because the chain value of the load gets -selected first, and the loads aren't checking to see if they are only used by -and add. - -.ll: - -int %test(int* %x, int* %y, int* %z) { -%X = load int* %x -%Y = load int* %y -%Z = load int* %z -%a = add int %X, %Y -%b = add int %a, %Z -ret int %b -} - -dag isel: - -_test: -movl 4(%esp), %eax -movl (%eax), %eax -movl 8(%esp), %ecx -movl (%ecx), %ecx -addl %ecx, %eax -movl 12(%esp), %ecx -movl (%ecx), %ecx -addl %ecx, %eax -ret - -pattern isel: - -_test: -movl 12(%esp), %ecx -movl 4(%esp), %edx -movl 8(%esp), %eax -movl (%eax), %eax -addl (%edx), %eax -addl (%ecx), %eax -ret - -This is bad for register pressure, though the dag isel is producing a -better schedule. 
:) - -//===-===// - In many cases, LLVM generates code like this: _test: @@ -198,7 +146,7 @@ _test: movl 8(%esp), %ebx - xor %eax, %eax +xor %eax, %eax cmpl %ebx, 4(%esp) setl %al ret @@ -207,38 +155,6 @@ //===-===// -We should generate 'test' instead of 'cmp' in various cases, e.g.: - -bool %test(int %X) { -%Y = shl int %X, ubyte 1 -%C = seteq int %Y, 0 -ret bool %C -} -bool %test(int %X) { -%Y = and int %X, 8 -%C = seteq int %Y, 0 -ret bool %C -} - -This may just be a matter of using 'test' to write bigger patterns for X86cmp. - -An important case is comparison against zero: - -if (X == 0) ... - -instead of: - - cmpl $0, %eax - je LBB4_2 #cond_next - -use: - test %eax, %eax - jz LBB4_2 - -which is smaller. - -//===-===// - We should generate bts/btr/etc instructions on targets where they are cheap or when codesize is important. e.g., for: @@ -564,17 +480,6 @@ //===-===// -Some ideas for instruction selection code simplification: 1. A pre-pass to -determine which chain producing node can or cannot be folded. The generated -isel code would then use the information. 2. The same pre-pass can force -ordering of TokenFactor operands to allow load / store folding. 3. During isel, -instead of recursively going up the chain operand chain, mark the chain operand -as available and put it in some work list. Select other nodes in the normal -manner. The chain operands are selected after all other nodes are selected. Uses -of chain nodes are modified after instruction selection is completed. - -//===-===// - Another instruction selector deficiency: void %bar() { ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.123 -> 1.124 --- Log message: add a note --- Diffs of the changes: (+13 -0) README.txt | 13 + 1 files changed, 13 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.123 llvm/lib/Target/X86/README.txt:1.124 --- llvm/lib/Target/X86/README.txt:1.123Wed Aug 2 00:31:20 2006 +++ llvm/lib/Target/X86/README.txt Tue Aug 15 21:47:44 2006 @@ -709,3 +709,16 @@ When using fastcc abi, align stack slot of argument of type double on 8 byte boundary to improve performance. + +//===-===// + +Codegen: + +if ((variable == 4) || (variable == 6)) { stuff } + +as: + +or eax, 2 +cmp eax, 6 +jz label + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.122 -> 1.123 --- Log message: Update the readme to remove duplicate information and clarify the loop problem. --- Diffs of the changes: (+20 -45) README.txt | 65 ++--- 1 files changed, 20 insertions(+), 45 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.122 llvm/lib/Target/X86/README.txt:1.123 --- llvm/lib/Target/X86/README.txt:1.122Wed Jul 26 16:49:52 2006 +++ llvm/lib/Target/X86/README.txt Wed Aug 2 00:31:20 2006 @@ -198,7 +198,7 @@ _test: movl 8(%esp), %ebx - xor %eax, %eax + xor %eax, %eax cmpl %ebx, 4(%esp) setl %al ret @@ -340,22 +340,6 @@ //===-===// -Investigate whether it is better to codegen the following - -%tmp.1 = mul int %x, 9 -to - - movl4(%esp), %eax - leal(%eax,%eax,8), %eax - -as opposed to what llc is currently generating: - - imull $9, 4(%esp), %eax - -Currently the load folding imull has a higher complexity than the LEA32 pattern. - -//===-===// - We are currently lowering large (1MB+) memmove/memcpy to rep/stosl and rep/movsl We should leave these as libcalls for everything over a much lower threshold, since libc is hand tuned for medium and large mem ops (avoiding RFO for large @@ -671,35 +655,26 @@ //===-===// -Consider: -int foo(int *a, int t) { -int x; -for (x=0; x<40; ++x) - t = t + a[x] + x; -return t; -} - -We generate: -LBB1_1: #cond_true -movl %ecx, %esi -movl (%edx,%eax,4), %edi -movl %esi, %ecx -addl %edi, %ecx -addl %eax, %ecx -incl %eax -cmpl $40, %eax -jne LBB1_1 #cond_true - -GCC generates: - -L2: -addl(%ecx,%edx,4), %eax -addl%edx, %eax -addl$1, %edx -cmpl$40, %edx -jne L2 +int %foo(int* %a, int %t) { +entry: +br label %cond_true + +cond_true: ; preds = %cond_true, %entry +%x.0.0 = phi int [ 0, %entry ], [ %tmp9, %cond_true ] ; [#uses=3] +%t_addr.0.0 = phi int [ %t, %entry ], [ %tmp7, %cond_true ] ; [#uses=1] +%tmp2 = getelementptr int* %a, int %x.0.0 ; [#uses=1] +%tmp3 = load int* %tmp2 ; [#uses=1] +%tmp5 = add int 
%t_addr.0.0, %x.0.0 ; [#uses=1] +%tmp7 = add int %tmp5, %tmp3; [#uses=2] +%tmp9 = add int %x.0.0, 1 ; [#uses=2] +%tmp = setgt int %tmp9, 39 ; [#uses=1] +br bool %tmp, label %bb12, label %cond_true + +bb12: ; preds = %cond_true +ret int %tmp7 +} -Smells like a register coallescing/reassociation issue. +is pessimized by -loop-reduce and -indvars //===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.121 -> 1.122 --- Log message: New entry. --- Diffs of the changes: (+5 -0) README.txt |5 + 1 files changed, 5 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.121 llvm/lib/Target/X86/README.txt:1.122 --- llvm/lib/Target/X86/README.txt:1.121Fri Jul 21 18:07:23 2006 +++ llvm/lib/Target/X86/README.txt Wed Jul 26 16:49:52 2006 @@ -729,3 +729,8 @@ 002aflds(%esp,1) 002daddl$0x04,%esp 0030ret + +//===-===// + +When using fastcc abi, align stack slot of argument of type double on 8 byte +boundary to improve performance. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.120 -> 1.121 --- Log message: Done. --- Diffs of the changes: (+0 -5) README.txt |5 - 1 files changed, 5 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.120 llvm/lib/Target/X86/README.txt:1.121 --- llvm/lib/Target/X86/README.txt:1.120Wed Jul 19 16:29:30 2006 +++ llvm/lib/Target/X86/README.txt Fri Jul 21 18:07:23 2006 @@ -707,11 +707,6 @@ //===-===// -JIT should resolve __cxa_atexit on Mac OS X. In a non-jit environment, the -symbol is dynamically resolved by the linker. - -//===-===// - u32 to float conversion improvement: float uint32_2_float( unsigned u ) { ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.119 -> 1.120 --- Log message: New entry. --- Diffs of the changes: (+25 -0) README.txt | 25 + 1 files changed, 25 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.119 llvm/lib/Target/X86/README.txt:1.120 --- llvm/lib/Target/X86/README.txt:1.119Wed Jul 19 01:06:24 2006 +++ llvm/lib/Target/X86/README.txt Wed Jul 19 16:29:30 2006 @@ -709,3 +709,28 @@ JIT should resolve __cxa_atexit on Mac OS X. In a non-jit environment, the symbol is a dynamically resolved by the linker. + +//===-===// + +u32 to float conversion improvement: + +float uint32_2_float( unsigned u ) { + float fl = (int) (u & 0x); + float fh = (int) (u >> 16); + fh *= 0x1.0p16f; + return fh + fl; +} + +subl$0x04,%esp +0003movl0x08(%esp,1),%eax +0007movl%eax,%ecx +0009shrl$0x10,%ecx +000ccvtsi2ss%ecx,%xmm0 +0010andl$0x,%eax +0015cvtsi2ss%eax,%xmm1 +0019mulss 0x0078,%xmm0 +0021addss %xmm1,%xmm0 +0025movss %xmm0,(%esp,1) +002aflds(%esp,1) +002daddl$0x04,%esp +0030ret ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
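The conversion trick in that entry splits the u32 into two 16-bit halves that are each exact in float, so only the final add can round. A standalone C sketch of it follows; the mask constant was truncated in the archived diff above, so the 0xFFFF below is assumed from the low/high 16-bit split the note describes:

```c
#include <assert.h>

/* u32 -> float via two exact 16-bit halves; only the final add rounds. */
static float uint32_2_float(unsigned u) {
    float fl = (int)(u & 0xFFFF); /* low 16 bits, exact in float */
    float fh = (int)(u >> 16);    /* high 16 bits, exact in float */
    fh *= 0x1.0p16f;              /* scale the high half by 2^16 */
    return fh + fl;
}
```

Both casts go through int so the compiler can use plain cvtsi2ss, which is the point of the transformation.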
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.118 -> 1.119 --- Log message: Misc. new entry. --- Diffs of the changes: (+5 -0) README.txt |5 + 1 files changed, 5 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.118 llvm/lib/Target/X86/README.txt:1.119 --- llvm/lib/Target/X86/README.txt:1.118Fri Jun 16 19:45:49 2006 +++ llvm/lib/Target/X86/README.txt Wed Jul 19 01:06:24 2006 @@ -704,3 +704,8 @@ //===-===// Use cpuid to auto-detect CPU features such as SSE, SSE2, and SSE3. + +//===-===// + +JIT should resolve __cxa_atexit on Mac OS X. In a non-JIT environment, the +symbol is dynamically resolved by the linker.
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.117 -> 1.118 --- Log message: A new entry. --- Diffs of the changes: (+2 -0) README.txt |2 ++ 1 files changed, 2 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.117 llvm/lib/Target/X86/README.txt:1.118 --- llvm/lib/Target/X86/README.txt:1.117Thu Jun 15 16:33:31 2006 +++ llvm/lib/Target/X86/README.txt Fri Jun 16 19:45:49 2006 @@ -702,3 +702,5 @@ Smells like a register coalescing/reassociation issue. //===-===// + +Use cpuid to auto-detect CPU features such as SSE, SSE2, and SSE3.
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.116 -> 1.117 --- Log message: Add a note that Nate noticed. --- Diffs of the changes: (+34 -0) README.txt | 34 ++ 1 files changed, 34 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.116 llvm/lib/Target/X86/README.txt:1.117 --- llvm/lib/Target/X86/README.txt:1.116Sun Jun 4 04:08:00 2006 +++ llvm/lib/Target/X86/README.txt Thu Jun 15 16:33:31 2006 @@ -668,3 +668,37 @@ //===-===// We should handle __attribute__ ((__visibility__ ("hidden"))). + +//===-===// + +Consider: +int foo(int *a, int t) { +int x; +for (x=0; x<40; ++x) + t = t + a[x] + x; +return t; +} + +We generate: +LBB1_1: #cond_true +movl %ecx, %esi +movl (%edx,%eax,4), %edi +movl %esi, %ecx +addl %edi, %ecx +addl %eax, %ecx +incl %eax +cmpl $40, %eax +jne LBB1_1 #cond_true + +GCC generates: + +L2: +addl(%ecx,%edx,4), %eax +addl%edx, %eax +addl$1, %edx +cmpl$40, %edx +jne L2 + +Smells like a register coalescing/reassociation issue. + +//===-===//
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.115 -> 1.116 --- Log message: A few new entries. --- Diffs of the changes: (+19 -0) README.txt | 19 +++ 1 files changed, 19 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.115 llvm/lib/Target/X86/README.txt:1.116 --- llvm/lib/Target/X86/README.txt:1.115Tue May 30 02:37:37 2006 +++ llvm/lib/Target/X86/README.txt Sun Jun 4 04:08:00 2006 @@ -538,6 +538,9 @@ sarl $24, %eax ret +SIGN_EXTEND_INREG can be implemented as (sext (trunc)) to take advantage of +sub-registers. + //===-===// Consider this: @@ -649,3 +652,19 @@ However, if we care more about code size, then imull is better. It's two bytes shorter than movl + leal. + +//===-===// + +Implement CTTZ, CTLZ with bsf and bsr. + +//===-===// + +It appears gcc place string data with linkonce linkage in +.section __TEXT,__const_coal,coalesced instead of +.section __DATA,__const_coal,coalesced. +Take a look at darwin.h, there are other Darwin assembler directives that we +do not make use of. + +//===-===// + +We should handle __attribute__ ((__visibility__ ("hidden"))). ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
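On the CTTZ/CTLZ entry in that diff: bsf and bsr return the index of the lowest/highest set bit, which maps directly onto count-trailing-zeros and count-leading-zeros. A hedged C sketch using the GCC-style builtins (like bsf/bsr, they are undefined for a zero input):

```c
#include <assert.h>

/* CTTZ: index of the lowest set bit, which is exactly what bsf returns.
   CTLZ: leading zeros; bsr gives the highest set bit's index, so for a
   32-bit value ctlz(x) == 31 - bsr(x). x must be nonzero in both cases. */
static int cttz32(unsigned x) { return __builtin_ctz(x); }
static int ctlz32(unsigned x) { return __builtin_clz(x); }
```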
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.114 -> 1.115 --- Log message: Add a note about integer multiplication by constants. --- Diffs of the changes: (+27 -0) README.txt | 27 +++ 1 files changed, 27 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.114 llvm/lib/Target/X86/README.txt:1.115 --- llvm/lib/Target/X86/README.txt:1.114Tue May 30 01:23:50 2006 +++ llvm/lib/Target/X86/README.txt Tue May 30 02:37:37 2006 @@ -622,3 +622,30 @@ operand? i.e. Print as 32-bit super-class register / 16-bit sub-class register. Do this for the cases where a truncate / anyext is guaranteed to be eliminated. For IA32 that is truncate from 32 to 16 and anyext from 16 to 32. + +//===-===// + +For this: + +int test(int a) +{ + return a * 3; +} + +We currently emit + imull $3, 4(%esp), %eax + +Perhaps this is what we really should generate? Is imull three or four +cycles? Note: ICC generates this: + movl4(%esp), %eax + leal(%eax,%eax,2), %eax + +The current instruction priority is based on pattern complexity. The former is +more "complex" because it folds a load so the latter will not be emitted. + +Perhaps we should use AddedComplexity to give LEA32r a higher priority? We +should always try to match LEA first since the LEA matching code does some +estimate to determine whether the match is profitable. + +However, if we care more about code size, then imull is better. It's two bytes +shorter than movl + leal.
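Both sequences in that entry compute a*3; the leal form folds the multiply into the scaled-index address computation. A small C equivalence check (a reasonable compiler will turn either body into one of the two instruction sequences):

```c
#include <assert.h>

/* leal (%eax,%eax,2), %eax computes eax + eax*2, i.e. a*3. */
static int mul3_lea(int a)  { return a + 2 * a; }
static int mul3_imul(int a) { return a * 3; }
```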
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.113 -> 1.114 --- Log message: Some new entries about truncate / anyext --- Diffs of the changes: (+16 -0) README.txt | 16 1 files changed, 16 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.113 llvm/lib/Target/X86/README.txt:1.114 --- llvm/lib/Target/X86/README.txt:1.113Mon May 22 00:54:49 2006 +++ llvm/lib/Target/X86/README.txt Tue May 30 01:23:50 2006 @@ -606,3 +606,19 @@ The current isel scheme will not allow the load to be folded in the call since the load's chain result is read by the callseq_start. + +//===-===// + +Don't forget to find a way to squash noop truncates in the JIT environment. + +//===-===// + +Implement anyext in the same manner as truncate that would allow them to be +eliminated. + +//===-===// + +How about implementing truncate / anyext as a property of machine instruction +operand? i.e. Print as 32-bit super-class register / 16-bit sub-class register. +Do this for the cases where a truncate / anyext is guaranteed to be eliminated. +For IA32 that is truncate from 32 to 16 and anyext from 16 to 32. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.111 -> 1.112 --- Log message: A new entry --- Diffs of the changes: (+10 -0) README.txt | 10 ++ 1 files changed, 10 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.111 llvm/lib/Target/X86/README.txt:1.112 --- llvm/lib/Target/X86/README.txt:1.111Fri May 19 15:55:31 2006 +++ llvm/lib/Target/X86/README.txt Sat May 20 02:44:53 2006 @@ -577,3 +577,13 @@ //===-===// +Some ideas for instruction selection code simplification: 1. A pre-pass to +determine which chain producing node can or cannot be folded. The generated +isel code would then use the information. 2. The same pre-pass can force +ordering of TokenFactor operands to allow load / store folding. 3. During isel, +instead of recursively going up the chain operand chain, mark the chain operand +as available and put it in some work list. Select other nodes in the normal +manner. The chain operands are selected after all other nodes are selected. Uses +of chain nodes are modified after instruction selection is completed. + + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.110 -> 1.111 --- Log message: Add a note --- Diffs of the changes: (+38 -0) README.txt | 38 ++ 1 files changed, 38 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.110 llvm/lib/Target/X86/README.txt:1.111 --- llvm/lib/Target/X86/README.txt:1.110Fri May 19 15:51:43 2006 +++ llvm/lib/Target/X86/README.txt Fri May 19 15:55:31 2006 @@ -539,3 +539,41 @@ ret //===-===// + +Consider this: + +typedef struct pair { float A, B; } pair; +void pairtest(pair P, float *FP) { +*FP = P.A+P.B; +} + +We currently generate this code with llvmgcc4: + +_pairtest: +subl $12, %esp +movl 20(%esp), %eax +movl %eax, 4(%esp) +movl 16(%esp), %eax +movl %eax, (%esp) +movss (%esp), %xmm0 +addss 4(%esp), %xmm0 +movl 24(%esp), %eax +movss %xmm0, (%eax) +addl $12, %esp +ret + +we should be able to generate: +_pairtest: +movss 4(%esp), %xmm0 +movl 12(%esp), %eax +addss 8(%esp), %xmm0 +movss %xmm0, (%eax) +ret + +The issue is that llvmgcc4 is forcing the struct to memory, then passing it as +integer chunks. It does this so that structs like {short,short} are passed in +a single 32-bit integer stack slot. We should handle the safe cases above much +nicer, while still handling the hard cases. + +//===-===// + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.107 -> 1.108 --- Log message: Particularly ugly code. --- Diffs of the changes: (+14 -0) README.txt | 14 ++ 1 files changed, 14 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.107 llvm/lib/Target/X86/README.txt:1.108 --- llvm/lib/Target/X86/README.txt:1.107Thu May 18 12:38:16 2006 +++ llvm/lib/Target/X86/README.txt Fri May 19 14:41:33 2006 @@ -36,6 +36,20 @@ //===-===// +On darwin/x86, we should codegen: + +ret double 0.00e+00 + +as fld0/ret, not as: + +movl $0, 4(%esp) +movl $0, (%esp) +fldl (%esp) + ... +ret + +//===-===// + This should use fiadd on chips where it is profitable: double foo(double P, int *I) { return P+*I; } ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.106 -> 1.107 --- Log message: add a note --- Diffs of the changes: (+15 -0) README.txt | 15 +++ 1 files changed, 15 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.106 llvm/lib/Target/X86/README.txt:1.107 --- llvm/lib/Target/X86/README.txt:1.106Wed May 17 16:20:51 2006 +++ llvm/lib/Target/X86/README.txt Thu May 18 12:38:16 2006 @@ -380,6 +380,21 @@ This may just be a matter of using 'test' to write bigger patterns for X86cmp. +An important case is comparison against zero: + +if (X == 0) ... + +instead of: + + cmpl $0, %eax + je LBB4_2 #cond_next + +use: + test %eax, %eax + jz LBB4_2 + +which is smaller. + //===-===// SSE should implement 'select_cc' using 'emulated conditional moves' that use ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.105 -> 1.106 --- Log message: Another entry --- Diffs of the changes: (+9 -0) README.txt |9 + 1 files changed, 9 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.105 llvm/lib/Target/X86/README.txt:1.106 --- llvm/lib/Target/X86/README.txt:1.105Wed May 17 14:05:31 2006 +++ llvm/lib/Target/X86/README.txt Wed May 17 16:20:51 2006 @@ -1183,3 +1183,12 @@ shll $24, %eax sarl $24, %eax ret + +//===-===// + +Some useful information in the Apple Altivec / SSE Migration Guide: + +http://developer.apple.com/documentation/Performance/Conceptual/ +Accelerate_sse_migration/index.html + +e.g. SSE select using and, andnot, or. Various SSE compare translations. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.104 -> 1.105 --- Log message: Another entry --- Diffs of the changes: (+12 -0) README.txt | 12 1 files changed, 12 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.104 llvm/lib/Target/X86/README.txt:1.105 --- llvm/lib/Target/X86/README.txt:1.104Tue May 9 01:54:05 2006 +++ llvm/lib/Target/X86/README.txt Wed May 17 14:05:31 2006 @@ -1171,3 +1171,15 @@ ret or use pxor (to make a zero vector) and shuffle (to insert it). + +//===-===// + +Bad codegen: + +char foo(int x) { return x; } + +_foo: + movl 4(%esp), %eax + shll $24, %eax + sarl $24, %eax + ret ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
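The shll $24 / sarl $24 pair in that entry is the generic lowering of a 32-to-8-bit truncate followed by sign extension. In C the same value comes from a cast through signed char; the narrowing conversion is implementation-defined for out-of-range values, so two's-complement wrap-around (as on x86) is assumed below:

```c
#include <assert.h>

/* Sign-extend the low 8 bits of x, which is what foo's
   shll $24 / sarl $24 pair computes. The int -> signed char narrowing
   is implementation-defined for values above 127; two's-complement
   wrap (as x86 compilers implement it) is assumed. */
static int sext8(int x) { return (signed char)(x & 0xFF); }
```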
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.103 -> 1.104 --- Log message: Remove a completed entry. --- Diffs of the changes: (+0 -42) README.txt | 42 -- 1 files changed, 42 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.103 llvm/lib/Target/X86/README.txt:1.104 --- llvm/lib/Target/X86/README.txt:1.103Mon May 8 16:39:45 2006 +++ llvm/lib/Target/X86/README.txt Tue May 9 01:54:05 2006 @@ -1126,48 +1126,6 @@ //===-===// -This testcase: - -%G1 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] -%G2 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] -%G3 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] -%G4 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] - -implementation ; Functions: - -void %test() { -%tmp = load <4 x float>* %G1; <<4 x float>> [#uses=2] -%tmp2 = load <4 x float>* %G2 ; <<4 x float>> [#uses=2] -%tmp135 = shufflevector <4 x float> %tmp, <4 x float> %tmp2, <4 x uint> < uint 0, uint 4, uint 1, uint 5 >; <<4 x float>> [#uses=1] -store <4 x float> %tmp135, <4 x float>* %G3 -%tmp293 = shufflevector <4 x float> %tmp, <4 x float> %tmp2, <4 x uint> < uint 1, uint undef, uint 3, uint 4 >; <<4 x float>> [#uses=1] -store <4 x float> %tmp293, <4 x float>* %G4 -ret void -} - -Compiles (llc -march=x86 -mcpu=yonah -relocation-model=static) to: - -_test: -movaps _G2, %xmm0 -movaps _G1, %xmm1 -movaps %xmm1, %xmm2 -2) shufps $3, %xmm0, %xmm2 -movaps %xmm1, %xmm3 -2) shufps $1, %xmm0, %xmm3 -1) unpcklps %xmm0, %xmm1 -2) shufps $128, %xmm2, %xmm3 -1) movaps %xmm1, _G3 -movaps %xmm3, _G4 -ret - -The 1) marked instructions could be scheduled better for reduced register -pressure. The scheduling issue is more pronounced without -static. - -The 2) marked instructions are the lowered form of the 1,undef,3,4 -shufflevector. 
It seems that there should be a better way to do it :) - -//===-===// - If shorter, we should use things like: movzwl %ax, %eax instead of: ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.102 -> 1.103 --- Log message: Another bad case I noticed --- Diffs of the changes: (+37 -0) README.txt | 37 + 1 files changed, 37 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.102 llvm/lib/Target/X86/README.txt:1.103 --- llvm/lib/Target/X86/README.txt:1.102Mon May 8 16:24:21 2006 +++ llvm/lib/Target/X86/README.txt Mon May 8 16:39:45 2006 @@ -1176,3 +1176,40 @@ The former can also be used when the two-addressy nature of the 'and' would require a copy to be inserted (in X86InstrInfo::convertToThreeAddress). +//===-===// + +This code generates ugly code, probably due to costs being off or something: + +void %test(float* %P, <4 x float>* %P2 ) { +%xFloat0.688 = load float* %P +%loadVector37.712 = load <4 x float>* %P2 +%inFloat3.713 = insertelement <4 x float> %loadVector37.712, float 0.00e+00, uint 3 +store <4 x float> %inFloat3.713, <4 x float>* %P2 +ret void +} + +Generates: + +_test: +pxor %xmm0, %xmm0 +movd %xmm0, %eax;; EAX = 0! +movl 8(%esp), %ecx +movaps (%ecx), %xmm0 +pinsrw $6, %eax, %xmm0 +shrl $16, %eax ;; EAX = 0 again! +pinsrw $7, %eax, %xmm0 +movaps %xmm0, (%ecx) +ret + +It would be better to generate: + +_test: +movl 8(%esp), %ecx +movaps (%ecx), %xmm0 + xor %eax, %eax +pinsrw $6, %eax, %xmm0 +pinsrw $7, %eax, %xmm0 +movaps %xmm0, (%ecx) +ret + +or use pxor (to make a zero vector) and shuffle (to insert it). ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.101 -> 1.102 --- Log message: add a note --- Diffs of the changes: (+9 -0) README.txt |9 + 1 files changed, 9 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.101 llvm/lib/Target/X86/README.txt:1.102 --- llvm/lib/Target/X86/README.txt:1.101Tue May 2 17:43:31 2006 +++ llvm/lib/Target/X86/README.txt Mon May 8 16:24:21 2006 @@ -1166,4 +1166,13 @@ The 2) marked instructions are the lowered form of the 1,undef,3,4 shufflevector. It seems that there should be a better way to do it :) +//===-===// + +If shorter, we should use things like: +movzwl %ax, %eax +instead of: +andl $65535, %EAX + +The former can also be used when the two-addressy nature of the 'and' would +require a copy to be inserted (in X86InstrInfo::convertToThreeAddress). ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
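The two instructions in that note are interchangeable zero-extensions; in C terms, masking with 65535 and casting through a 16-bit unsigned short (as on IA-32) produce the same value:

```c
#include <assert.h>

/* andl $65535, %eax and movzwl %ax, %eax both compute x & 0xFFFF;
   the cast form is what codegen lowers to movzwl. */
static unsigned zext16_and(unsigned x)  { return x & 65535u; }
static unsigned zext16_cast(unsigned x) { return (unsigned short)x; }
```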
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.100 -> 1.101 --- Log message: Remove some stuff from the README --- Diffs of the changes: (+0 -21) README.txt | 21 - 1 files changed, 21 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.100 llvm/lib/Target/X86/README.txt:1.101 --- llvm/lib/Target/X86/README.txt:1.100Thu Apr 27 16:40:57 2006 +++ llvm/lib/Target/X86/README.txt Tue May 2 17:43:31 2006 @@ -1126,27 +1126,6 @@ //===-===// -typedef short v8i16 __attribute__ ((__vector_size__ (16))); -v8i16 test(v8i16 x, v8i16 y) { - return x + y; -} - -compiles to - -_test: - paddw %xmm0, %xmm1 - movaps %xmm1, %xmm0 - ret - -It should be - - paddw %xmm1, %xmm0 - ret - -since paddw is commutative. - -//===-===// - This testcase: %G1 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.99 -> 1.100 --- Log message: Add a note --- Diffs of the changes: (+44 -0) README.txt | 44 1 files changed, 44 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.99 llvm/lib/Target/X86/README.txt:1.100 --- llvm/lib/Target/X86/README.txt:1.99 Thu Apr 27 03:31:33 2006 +++ llvm/lib/Target/X86/README.txt Thu Apr 27 16:40:57 2006 @@ -1144,3 +1144,47 @@ ret since paddw is commutative. + +//===-===// + +This testcase: + +%G1 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] +%G2 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] +%G3 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] +%G4 = weak global <4 x float> zeroinitializer ; <<4 x float>*> [#uses=1] + +implementation ; Functions: + +void %test() { +%tmp = load <4 x float>* %G1; <<4 x float>> [#uses=2] +%tmp2 = load <4 x float>* %G2 ; <<4 x float>> [#uses=2] +%tmp135 = shufflevector <4 x float> %tmp, <4 x float> %tmp2, <4 x uint> < uint 0, uint 4, uint 1, uint 5 >; <<4 x float>> [#uses=1] +store <4 x float> %tmp135, <4 x float>* %G3 +%tmp293 = shufflevector <4 x float> %tmp, <4 x float> %tmp2, <4 x uint> < uint 1, uint undef, uint 3, uint 4 >; <<4 x float>> [#uses=1] +store <4 x float> %tmp293, <4 x float>* %G4 +ret void +} + +Compiles (llc -march=x86 -mcpu=yonah -relocation-model=static) to: + +_test: +movaps _G2, %xmm0 +movaps _G1, %xmm1 +movaps %xmm1, %xmm2 +2) shufps $3, %xmm0, %xmm2 +movaps %xmm1, %xmm3 +2) shufps $1, %xmm0, %xmm3 +1) unpcklps %xmm0, %xmm1 +2) shufps $128, %xmm2, %xmm3 +1) movaps %xmm1, _G3 +movaps %xmm3, _G4 +ret + +The 1) marked instructions could be scheduled better for reduced register +pressure. The scheduling issue is more pronounced without -static. + +The 2) marked instructions are the lowered form of the 1,undef,3,4 +shufflevector. 
It seems that there should be a better way to do it :) + + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.98 -> 1.99 --- Log message: A couple of new entries. --- Diffs of the changes: (+37 -0) README.txt | 37 + 1 files changed, 37 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.98 llvm/lib/Target/X86/README.txt:1.99 --- llvm/lib/Target/X86/README.txt:1.98 Mon Apr 24 18:30:10 2006 +++ llvm/lib/Target/X86/README.txt Thu Apr 27 03:31:33 2006 @@ -1107,3 +1107,40 @@ So icc is smart enough to know that B is in memory so it doesn't load it and store it back to stack. + +//===-===// + +__m128d test1( __m128d A, __m128d B) { + return _mm_shuffle_pd(A, B, 0x3); +} + +compiles to + +shufpd $3, %xmm1, %xmm0 + +Perhaps it's better to use unpckhpd instead? + +unpckhpd %xmm1, %xmm0 + +Don't know if unpckhpd is faster. But it is shorter. + +//===-===// + +typedef short v8i16 __attribute__ ((__vector_size__ (16))); +v8i16 test(v8i16 x, v8i16 y) { + return x + y; +} + +compiles to + +_test: + paddw %xmm0, %xmm1 + movaps %xmm1, %xmm0 + ret + +It should be + + paddw %xmm1, %xmm0 + ret + +since paddw is commutative. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.97 -> 1.98 --- Log message: Add a new entry. --- Diffs of the changes: (+32 -0) README.txt | 32 1 files changed, 32 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.97 llvm/lib/Target/X86/README.txt:1.98 --- llvm/lib/Target/X86/README.txt:1.97 Mon Apr 24 12:38:16 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 24 18:30:10 2006 @@ -1075,3 +1075,35 @@ There is also one case we do worse on PPC. //===-===// + +For this: + +#include +void test(__m128d *r, __m128d *A, double B) { + *r = _mm_loadl_pd(*A, &B); +} + +We generate: + + subl $12, %esp + movsd 24(%esp), %xmm0 + movsd %xmm0, (%esp) + movl 20(%esp), %eax + movapd (%eax), %xmm0 + movlpd (%esp), %xmm0 + movl 16(%esp), %eax + movapd %xmm0, (%eax) + addl $12, %esp + ret + +icc generates: + +movl 4(%esp), %edx #3.6 +movl 8(%esp), %eax #3.6 +movapd(%eax), %xmm0 #4.22 +movlpd12(%esp), %xmm0 #4.8 +movapd%xmm0, (%edx) #4.3 +ret #5.1 + +So icc is smart enough to know that B is in memory so it doesn't load it and +store it back to stack.
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.96 -> 1.97 --- Log message: Remove a completed entry. --- Diffs of the changes: (+0 -55) README.txt | 55 --- 1 files changed, 55 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.96 llvm/lib/Target/X86/README.txt:1.97 --- llvm/lib/Target/X86/README.txt:1.96 Sun Apr 23 14:47:09 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 24 12:38:16 2006 @@ -999,61 +999,6 @@ //===-===// -Use the 0's in the top part of movss from memory (and from other instructions -that generate them) to build vectors more efficiently. Consider: - -vector float test(float a) { - return (vector float){ 0.0, a, 0.0, 0.0}; -} - -We currently generate this as: - -_test: -sub %ESP, 28 -movss %XMM0, DWORD PTR [%ESP + 32] -movss DWORD PTR [%ESP + 4], %XMM0 -mov DWORD PTR [%ESP + 12], 0 -mov DWORD PTR [%ESP + 8], 0 -mov DWORD PTR [%ESP], 0 -movaps %XMM0, XMMWORD PTR [%ESP] -add %ESP, 28 -ret - -Something like this should be sufficient: - -_test: - movss %XMM0, DWORD PTR [%ESP + 4] - shufps %XMM0, %XMM0, 81 - ret - -... which takes advantage of the zero elements provided by movss. -Even xoring a register and shufps'ing IT would be better than the -above code. - -Likewise, for this: - -vector float test(float a, float b) { - return (vector float){ b, a, 0.0, 0.0}; -} - -_test: -pxor %XMM0, %XMM0 -movss %XMM1, %XMM0 -movss %XMM2, DWORD PTR [%ESP + 4] -unpcklps %XMM2, %XMM1 -movss %XMM0, DWORD PTR [%ESP + 8] -unpcklps %XMM0, %XMM1 -unpcklps %XMM0, %XMM2 -ret - -... where we do use pxor, it would be better to use the zero'd -elements that movss provides to turn this into 2 shufps's instead -of 3 unpcklps's. - -Another example: {0.0, 0.0, a, b } - -//===-===// - Consider: __m128 test(float a) { ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.95 -> 1.96 --- Log message: Add a note --- Diffs of the changes: (+7 -0) README.txt |7 +++ 1 files changed, 7 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.95 llvm/lib/Target/X86/README.txt:1.96 --- llvm/lib/Target/X86/README.txt:1.95 Fri Apr 21 16:05:22 2006 +++ llvm/lib/Target/X86/README.txt Sun Apr 23 14:47:09 2006 @@ -1123,3 +1123,10 @@ //===-===// +We generate significantly worse code for this than GCC: +http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21150 +http://gcc.gnu.org/bugzilla/attachment.cgi?id=8701 + +There is also one case we do worse on PPC. + +//===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.94 -> 1.95 --- Log message: fix thinko --- Diffs of the changes: (+2 -2) README.txt |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.94 llvm/lib/Target/X86/README.txt:1.95 --- llvm/lib/Target/X86/README.txt:1.94 Fri Apr 21 16:03:21 2006 +++ llvm/lib/Target/X86/README.txt Fri Apr 21 16:05:22 2006 @@ -1068,8 +1068,8 @@ movss %xmm1, %xmm0 ret -Because mulss multiplies 0*0 = 0.0, the top elements of xmm1 are already zerod. -We could compile this to: +Because mulss doesn't modify the top 3 elements, the top elements of +xmm1 are already zero'd. We could compile this to: movss 4(%esp), %xmm0 mulss %xmm0, %xmm0 ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.93 -> 1.94 --- Log message: add some low-prio notes --- Diffs of the changes: (+69 -0) README.txt | 69 + 1 files changed, 69 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.93 llvm/lib/Target/X86/README.txt:1.94 --- llvm/lib/Target/X86/README.txt:1.93 Wed Apr 19 00:53:27 2006 +++ llvm/lib/Target/X86/README.txt Fri Apr 21 16:03:21 2006 @@ -1054,3 +1054,72 @@ //===-===// +Consider: + +__m128 test(float a) { + return _mm_set_ps(0.0, 0.0, 0.0, a*a); +} + +This compiles into: + +movss 4(%esp), %xmm1 +mulss %xmm1, %xmm1 +xorps %xmm0, %xmm0 +movss %xmm1, %xmm0 +ret + +Because mulss multiplies 0*0 = 0.0, the top elements of xmm1 are already zerod. +We could compile this to: + +movss 4(%esp), %xmm0 +mulss %xmm0, %xmm0 +ret + +//===-===// + +Here's a sick and twisted idea. Consider code like this: + +__m128 test(__m128 a) { + float b = *(float*)&A; + ... + return _mm_set_ps(0.0, 0.0, 0.0, b); +} + +This might compile to this code: + +movaps c(%esp), %xmm1 +xorps %xmm0, %xmm0 +movss %xmm1, %xmm0 +ret + +Now consider if the ... code caused xmm1 to get spilled. This might produce +this code: + +movaps c(%esp), %xmm1 +movaps %xmm1, c2(%esp) +... + +xorps %xmm0, %xmm0 +movaps c2(%esp), %xmm1 +movss %xmm1, %xmm0 +ret + +However, since the reload is only used by these instructions, we could +"fold" it into the uses, producing something like this: + +movaps c(%esp), %xmm1 +movaps %xmm1, c2(%esp) +... + +movss c2(%esp), %xmm0 +ret + +... saving two instructions. + +The basic idea is that a reload from a spill slot, can, if only one 4-byte +chunk is used, bring in 3 zeros to the one element instead of 4 elements. +This can be used to simplify a variety of shuffle operations, where the +elements are fixed zeros. + +//===-===// +
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.92 -> 1.93 --- Log message: Add a note. --- Diffs of the changes: (+58 -0) README.txt | 58 ++ 1 files changed, 58 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.92 llvm/lib/Target/X86/README.txt:1.93 --- llvm/lib/Target/X86/README.txt:1.92 Mon Apr 17 22:45:01 2006 +++ llvm/lib/Target/X86/README.txt Wed Apr 19 00:53:27 2006 @@ -996,3 +996,61 @@ movaps %xmm3, %xmm2 movaps %xmm4, %xmm3 jne LBB_main_4 # cond_true44 + +//===-===// + +Use the 0's in the top part of movss from memory (and from other instructions +that generate them) to build vectors more efficiently. Consider: + +vector float test(float a) { + return (vector float){ 0.0, a, 0.0, 0.0}; +} + +We currently generate this as: + +_test: +sub %ESP, 28 +movss %XMM0, DWORD PTR [%ESP + 32] +movss DWORD PTR [%ESP + 4], %XMM0 +mov DWORD PTR [%ESP + 12], 0 +mov DWORD PTR [%ESP + 8], 0 +mov DWORD PTR [%ESP], 0 +movaps %XMM0, XMMWORD PTR [%ESP] +add %ESP, 28 +ret + +Something like this should be sufficient: + +_test: + movss %XMM0, DWORD PTR [%ESP + 4] + shufps %XMM0, %XMM0, 81 + ret + +... which takes advantage of the zero elements provided by movss. +Even xoring a register and shufps'ing IT would be better than the +above code. + +Likewise, for this: + +vector float test(float a, float b) { + return (vector float){ b, a, 0.0, 0.0}; +} + +_test: +pxor %XMM0, %XMM0 +movss %XMM1, %XMM0 +movss %XMM2, DWORD PTR [%ESP + 4] +unpcklps %XMM2, %XMM1 +movss %XMM0, DWORD PTR [%ESP + 8] +unpcklps %XMM0, %XMM1 +unpcklps %XMM0, %XMM2 +ret + +... where we do use pxor, it would be better to use the zero'd +elements that movss provides to turn this into 2 shufps's instead +of 3 unpcklps's. + +Another example: {0.0, 0.0, a, b } + +//===-===// + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.91 -> 1.92 --- Log message: Correct comments --- Diffs of the changes: (+6 -6) README.txt | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.91 llvm/lib/Target/X86/README.txt:1.92 --- llvm/lib/Target/X86/README.txt:1.91 Mon Apr 17 20:22:57 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 17 22:45:01 2006 @@ -982,17 +982,17 @@ jne LBB_main_4 # cond_true44 There are two problems. 1) No need to two loop induction variables. We can -compare against 262144 * 16. 2) Poor register allocation decisions. We should +compare against 262144 * 16. 2) Known register coalescer issue. We should be able eliminate one of the movaps: - addps %xmm1, %xmm2 - subps %xmm3, %xmm2 + addps %xmm2, %xmm1<=== Commute! + subps %xmm3, %xmm1 movaps (%ecx), %xmm4 - movaps %xmm2, %xmm2 <=== Eliminate! - addps %xmm4, %xmm2 + movaps %xmm1, %xmm1 <=== Eliminate! + addps %xmm4, %xmm1 addl $16, %ecx incl %edx cmpl $262144, %edx - movaps %xmm3, %xmm1 + movaps %xmm3, %xmm2 movaps %xmm4, %xmm3 jne LBB_main_4 # cond_true44 ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.90 -> 1.91 --- Log message: Another entry --- Diffs of the changes: (+35 -0) README.txt | 35 +++ 1 files changed, 35 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.90 llvm/lib/Target/X86/README.txt:1.91 --- llvm/lib/Target/X86/README.txt:1.90 Mon Apr 17 19:21:01 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 17 20:22:57 2006 @@ -961,3 +961,38 @@ to three-address transformation. It also exposes some other problems. See MOV32ri -3 and the spills. + +//===-===// + +http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25500 + +LLVM is producing bad code. + +LBB_main_4:# cond_true44 + addps %xmm1, %xmm2 + subps %xmm3, %xmm2 + movaps (%ecx), %xmm4 + movaps %xmm2, %xmm1 + addps %xmm4, %xmm1 + addl $16, %ecx + incl %edx + cmpl $262144, %edx + movaps %xmm3, %xmm2 + movaps %xmm4, %xmm3 + jne LBB_main_4 # cond_true44 + +There are two problems. 1) No need to two loop induction variables. We can +compare against 262144 * 16. 2) Poor register allocation decisions. We should +be able eliminate one of the movaps: + + addps %xmm1, %xmm2 + subps %xmm3, %xmm2 + movaps (%ecx), %xmm4 + movaps %xmm2, %xmm2 <=== Eliminate! + addps %xmm4, %xmm2 + addl $16, %ecx + incl %edx + cmpl $262144, %edx + movaps %xmm3, %xmm1 + movaps %xmm4, %xmm3 + jne LBB_main_4 # cond_true44 ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
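Fix (1) from the note — fold the counter into the pointer by comparing against base plus 262144*16 bytes — can be sketched in C (scaled down and parameterized here; the function name and block layout are illustrative):

```c
/* Single-induction-variable form of the loop: the pointer both advances
   and serves as the exit test, so the separate incl/cmpl counter pair
   disappears.  Each "block" is 16 bytes (4 floats), matching addl $16. */
float sum_first_of_each_block(const float *base, int nblocks) {
    const float *p = base;
    const float *end = base + 4 * nblocks;  /* base + nblocks*16 bytes */
    float acc = 0.0f;
    for (; p != end; p += 4)                /* one IV: the pointer itself */
        acc += *p;
    return acc;
}
```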
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.89 -> 1.90 --- Log message: Another entry. --- Diffs of the changes: (+151 -0) README.txt | 151 + 1 files changed, 151 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.89 llvm/lib/Target/X86/README.txt:1.90 --- llvm/lib/Target/X86/README.txt:1.89 Sat Apr 15 00:37:34 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 17 19:21:01 2006 @@ -810,3 +810,154 @@ How about andps, andpd, and pand? Do we really care about the type of the packed elements? If not, why not always use the "ps" variants which are likely to be shorter. + +//===-===// + +We are emitting bad code for this: + +float %test(float* %V, int %I, int %D, float %V) { +entry: + %tmp = seteq int %D, 0 + br bool %tmp, label %cond_true, label %cond_false23 + +cond_true: + %tmp3 = getelementptr float* %V, int %I + %tmp = load float* %tmp3 + %tmp5 = setgt float %tmp, %V + %tmp6 = tail call bool %llvm.isunordered.f32( float %tmp, float %V ) + %tmp7 = or bool %tmp5, %tmp6 + br bool %tmp7, label %UnifiedReturnBlock, label %cond_next + +cond_next: + %tmp10 = add int %I, 1 + %tmp12 = getelementptr float* %V, int %tmp10 + %tmp13 = load float* %tmp12 + %tmp15 = setle float %tmp13, %V + %tmp16 = tail call bool %llvm.isunordered.f32( float %tmp13, float %V ) + %tmp17 = or bool %tmp15, %tmp16 + %retval = select bool %tmp17, float 0.00e+00, float 1.00e+00 + ret float %retval + +cond_false23: + %tmp28 = tail call float %foo( float* %V, int %I, int %D, float %V ) + ret float %tmp28 + +UnifiedReturnBlock:; preds = %cond_true + ret float 0.00e+00 +} + +declare bool %llvm.isunordered.f32(float, float) + +declare float %foo(float*, int, int, float) + + +It exposes a known load folding problem: + + movss (%edx,%ecx,4), %xmm1 + ucomiss %xmm1, %xmm0 + +As well as this: + +LBB_test_2:# cond_next + movss LCPI1_0, %xmm2 + pxor %xmm3, %xmm3 + ucomiss %xmm0, %xmm1 + jbe LBB_test_6 # cond_next +LBB_test_5:# cond_next + movaps %xmm2, 
%xmm3 +LBB_test_6:# cond_next + movss %xmm3, 40(%esp) + flds 40(%esp) + addl $44, %esp + ret + +Clearly it's unnecessary to clear %xmm3. It's also not clear why we are emitting +three moves (movss, movaps, movss). + +//===-===// + +External test Nurbs exposed some problems. Look for +__ZN15Nurbs_SSE_Cubic17TessellateSurfaceE, bb cond_next140. This is what icc +emits: + +movaps(%edx), %xmm2 #59.21 +movaps(%edx), %xmm5 #60.21 +movaps(%edx), %xmm4 #61.21 +movaps(%edx), %xmm3 #62.21 +movl 40(%ecx), %ebp#69.49 +shufps$0, %xmm2, %xmm5 #60.21 +movl 100(%esp), %ebx #69.20 +movl (%ebx), %edi #69.20 +imull %ebp, %edi#69.49 +addl (%eax), %edi #70.33 +shufps$85, %xmm2, %xmm4 #61.21 +shufps$170, %xmm2, %xmm3#62.21 +shufps$255, %xmm2, %xmm2#63.21 +lea (%ebp,%ebp,2), %ebx #69.49 +negl %ebx #69.49 +lea -3(%edi,%ebx), %ebx #70.33 +shll $4, %ebx #68.37 +addl 32(%ecx), %ebx#68.37 +testb $15, %bl #91.13 +jne L_B1.24 # Prob 5% #91.13 + +This is the llvm code after instruction scheduling: + +cond_next140 (0xa910740, LLVM BB @0xa90beb0): + %reg1078 = MOV32ri -3 + %reg1079 = ADD32rm %reg1078, %reg1068, 1, %NOREG, 0 + %reg1037 = MOV32rm %reg1024, 1, %NOREG, 40 + %reg1080 = IMUL32rr %reg1079, %reg1037 + %reg1081 = MOV32rm %reg1058, 1, %NOREG, 0 + %reg1038 = LEA32r %reg1081, 1, %reg1080, -3 + %reg1036 = MOV32rm %reg1024, 1, %NOREG, 32 + %reg1082 = SHL32ri %reg1038, 4 + %reg1039 = ADD32rr %reg1036, %reg1082 + %reg1083 = MOVAPSrm %reg1059, 1, %NOREG, 0 + %reg1034 = SHUFPSrr %reg1083, %reg1083, 170 + %reg1032 = SHUFPSrr %reg1083, %reg1083, 0 + %reg1035 = SHUFPSrr %reg1083, %reg1083, 255 + %reg1033 = S
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ISelLowering.cpp X86InstrSSE.td
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.88 -> 1.89 X86ISelLowering.cpp updated: 1.167 -> 1.168 X86InstrSSE.td updated: 1.91 -> 1.92 --- Log message: Silly bug --- Diffs of the changes: (+11 -18) README.txt |5 - X86ISelLowering.cpp | 22 ++ X86InstrSSE.td |2 +- 3 files changed, 11 insertions(+), 18 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.88 llvm/lib/Target/X86/README.txt:1.89 --- llvm/lib/Target/X86/README.txt:1.88 Fri Apr 14 02:24:04 2006 +++ llvm/lib/Target/X86/README.txt Sat Apr 15 00:37:34 2006 @@ -810,8 +810,3 @@ How about andps, andpd, and pand? Do we really care about the type of the packed elements? If not, why not always use the "ps" variants which are likely to be shorter. - -//===-===// - -Make sure XMM registers are spilled to 128-bit locations (if not already) and -add vector SSE opcodes to X86RegisterInfo::foldMemoryOperand(). Index: llvm/lib/Target/X86/X86ISelLowering.cpp diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.167 llvm/lib/Target/X86/X86ISelLowering.cpp:1.168 --- llvm/lib/Target/X86/X86ISelLowering.cpp:1.167 Fri Apr 14 22:13:24 2006 +++ llvm/lib/Target/X86/X86ISelLowering.cpp Sat Apr 15 00:37:34 2006 @@ -1724,27 +1724,26 @@ return false; // Expect 1, 1, 3, 3 - unsigned NumNodes = 0; for (unsigned i = 0; i < 2; ++i) { SDOperand Arg = N->getOperand(i); if (Arg.getOpcode() == ISD::UNDEF) continue; assert(isa(Arg) && "Invalid VECTOR_SHUFFLE mask!"); unsigned Val = cast(Arg)->getValue(); if (Val != 1) return false; -NumNodes++; } + + bool HasHi = false; for (unsigned i = 2; i < 4; ++i) { SDOperand Arg = N->getOperand(i); if (Arg.getOpcode() == ISD::UNDEF) continue; assert(isa(Arg) && "Invalid VECTOR_SHUFFLE mask!"); unsigned Val = cast(Arg)->getValue(); if (Val != 3) return false; -NumNodes++; +HasHi = true; } - // Don't use movshdup if the resulting vector contains only one undef node. - // Use {p}shuf* instead. 
- return NumNodes > 1; + // Don't use movshdup if it can be done with a shufps. + return HasHi; } /// isMOVSLDUPMask - Return true if the specified VECTOR_SHUFFLE operand @@ -1756,27 +1755,26 @@ return false; // Expect 0, 0, 2, 2 - unsigned NumNodes = 0; for (unsigned i = 0; i < 2; ++i) { SDOperand Arg = N->getOperand(i); if (Arg.getOpcode() == ISD::UNDEF) continue; assert(isa(Arg) && "Invalid VECTOR_SHUFFLE mask!"); unsigned Val = cast(Arg)->getValue(); if (Val != 0) return false; -NumNodes++; } + + bool HasHi = false; for (unsigned i = 2; i < 4; ++i) { SDOperand Arg = N->getOperand(i); if (Arg.getOpcode() == ISD::UNDEF) continue; assert(isa(Arg) && "Invalid VECTOR_SHUFFLE mask!"); unsigned Val = cast(Arg)->getValue(); if (Val != 2) return false; -NumNodes++; +HasHi = true; } - // Don't use movsldup if the resulting vector contains only one undef node. - // Use {p}shuf* instead. - return NumNodes > 1; + // Don't use movshdup if it can be done with a shufps. + return HasHi; } /// isSplatMask - Return true if the specified VECTOR_SHUFFLE operand specifies Index: llvm/lib/Target/X86/X86InstrSSE.td diff -u llvm/lib/Target/X86/X86InstrSSE.td:1.91 llvm/lib/Target/X86/X86InstrSSE.td:1.92 --- llvm/lib/Target/X86/X86InstrSSE.td:1.91 Fri Apr 14 18:32:40 2006 +++ llvm/lib/Target/X86/X86InstrSSE.td Sat Apr 15 00:37:34 2006 @@ -169,7 +169,7 @@ // PDIi8 - SSE2 instructions with ImmT == Imm8 and TB and OpSize prefixes. // S3I - SSE3 instructions with TB and OpSize prefixes. // S3SI - SSE3 instructions with XS prefix. -// S3SI - SSE3 instructions with XD prefix. +// S3DI - SSE3 instructions with XD prefix. class SSI o, Format F, dag ops, string asm, list pattern> : I, XS, Requires<[HasSSE1]>; class SDI o, Format F, dag ops, string asm, list pattern> ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.87 -> 1.88 --- Log message: New entry --- Diffs of the changes: (+5 -0) README.txt |5 + 1 files changed, 5 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.87 llvm/lib/Target/X86/README.txt:1.88 --- llvm/lib/Target/X86/README.txt:1.87 Thu Apr 13 00:09:45 2006 +++ llvm/lib/Target/X86/README.txt Fri Apr 14 02:24:04 2006 @@ -810,3 +810,8 @@ How about andps, andpd, and pand? Do we really care about the type of the packed elements? If not, why not always use the "ps" variants which are likely to be shorter. + +//===-===// + +Make sure XMM registers are spilled to 128-bit locations (if not already) and +add vector SSE opcodes to X86RegisterInfo::foldMemoryOperand(). ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.86 -> 1.87 --- Log message: Update --- Diffs of the changes: (+12 -0) README.txt | 12 1 files changed, 12 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.86 llvm/lib/Target/X86/README.txt:1.87 --- llvm/lib/Target/X86/README.txt:1.86 Wed Apr 12 16:21:57 2006 +++ llvm/lib/Target/X86/README.txt Thu Apr 13 00:09:45 2006 @@ -191,6 +191,18 @@ should be made smart enough to cannonicalize the load into the RHS of a compare when it can invert the result of the compare for free. +How about intrinsics? An example is: + *res = _mm_mulhi_epu16(*A, _mm_mul_epu32(*B, *C)); + +compiles to + pmuludq (%eax), %xmm0 + movl 8(%esp), %eax + movdqa (%eax), %xmm1 + pmulhuw %xmm0, %xmm1 + +The transformation probably requires a X86 specific pass or a DAG combiner +target specific hook. + //===-===// LSR should be turned on for the X86 backend and tuned to take advantage of its ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ISelLowering.cpp X86InstrSSE.td
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.85 -> 1.86 X86ISelLowering.cpp updated: 1.163 -> 1.164 X86InstrSSE.td updated: 1.79 -> 1.80 --- Log message: All "integer" logical ops (pand, por, pxor) are now promoted to v2i64. Clean up and fix various logical ops issues. --- Diffs of the changes: (+71 -146) README.txt |4 + X86ISelLowering.cpp | 45 - X86InstrSSE.td | 168 3 files changed, 71 insertions(+), 146 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.85 llvm/lib/Target/X86/README.txt:1.86 --- llvm/lib/Target/X86/README.txt:1.85 Mon Apr 10 16:51:03 2006 +++ llvm/lib/Target/X86/README.txt Wed Apr 12 16:21:57 2006 @@ -794,3 +794,7 @@ X86RegisterInfo::copyRegToReg() returns X86::MOVAPSrr for VR128. Is it possible to choose between movaps, movapd, and movdqa based on types of source and destination? + +How about andps, andpd, and pand? Do we really care about the type of the packed +elements? If not, why not always use the "ps" variants which are likely to be +shorter. 
Index: llvm/lib/Target/X86/X86ISelLowering.cpp diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.163 llvm/lib/Target/X86/X86ISelLowering.cpp:1.164 --- llvm/lib/Target/X86/X86ISelLowering.cpp:1.163 Wed Apr 12 12:12:36 2006 +++ llvm/lib/Target/X86/X86ISelLowering.cpp Wed Apr 12 16:21:57 2006 @@ -275,6 +275,9 @@ if (Subtarget->hasSSE1()) { addRegisterClass(MVT::v4f32, X86::VR128RegisterClass); +setOperationAction(ISD::AND,MVT::v4f32, Legal); +setOperationAction(ISD::OR, MVT::v4f32, Legal); +setOperationAction(ISD::XOR,MVT::v4f32, Legal); setOperationAction(ISD::ADD,MVT::v4f32, Legal); setOperationAction(ISD::SUB,MVT::v4f32, Legal); setOperationAction(ISD::MUL,MVT::v4f32, Legal); @@ -301,36 +304,43 @@ setOperationAction(ISD::SUB,MVT::v8i16, Legal); setOperationAction(ISD::SUB,MVT::v4i32, Legal); setOperationAction(ISD::MUL,MVT::v2f64, Legal); -setOperationAction(ISD::LOAD, MVT::v2f64, Legal); + setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v16i8, Custom); setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v8i16, Custom); +setOperationAction(ISD::INSERT_VECTOR_ELT, MVT::v8i16, Custom); + +// Custom lower build_vector, vector_shuffle, and extract_vector_elt. 
+for (unsigned VT = (unsigned)MVT::v16i8; VT != (unsigned)MVT::v2i64; VT++) { + setOperationAction(ISD::BUILD_VECTOR,(MVT::ValueType)VT, Custom); + setOperationAction(ISD::VECTOR_SHUFFLE, (MVT::ValueType)VT, Custom); + setOperationAction(ISD::EXTRACT_VECTOR_ELT, (MVT::ValueType)VT, Custom); +} setOperationAction(ISD::BUILD_VECTOR, MVT::v2f64, Custom); -setOperationAction(ISD::BUILD_VECTOR, MVT::v16i8, Custom); -setOperationAction(ISD::BUILD_VECTOR, MVT::v8i16, Custom); -setOperationAction(ISD::BUILD_VECTOR, MVT::v4i32, Custom); setOperationAction(ISD::BUILD_VECTOR, MVT::v2i64, Custom); setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v2f64, Custom); -setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16i8, Custom); -setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v8i16, Custom); -setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v4i32, Custom); setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v2i64, Custom); setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v2f64, Custom); -setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v8i16, Custom); -setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v4i32, Custom); -setOperationAction(ISD::INSERT_VECTOR_ELT, MVT::v8i16, Custom); +setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v2i64, Custom); -// Promote v16i8, v8i16, v4i32 selects to v2i64. Custom lower v2i64, v2f64, -// and v4f32 selects. -for (unsigned VT = (unsigned)MVT::v16i8; - VT != (unsigned)MVT::v2i64; VT++) { - setOperationAction(ISD::SELECT, (MVT::ValueType)VT, Promote); - AddPromotedToType (ISD::SELECT, (MVT::ValueType)VT, MVT::v2i64); +// Promote v16i8, v8i16, v4i32 load, select, and, or, xor to v2i64. 
+for (unsigned VT = (unsigned)MVT::v16i8; VT != (unsigned)MVT::v2i64; VT++) { + setOperationAction(ISD::AND,(MVT::ValueType)VT, Promote); + AddPromotedToType (ISD::AND,(MVT::ValueType)VT, MVT::v2i64); + setOperationAction(ISD::OR, (MVT::ValueType)VT, Promote); + AddPromotedToType (ISD::OR, (MVT::ValueType)VT, MVT::v2i64); + setOperationAction(ISD::XOR,(MVT::ValueType)VT, Promote); + AddPromotedToType (ISD::XOR,(MVT::ValueType)VT, MVT::v2i64); setOperationAction(ISD::LOAD, (MVT::ValueType)VT, Promote); AddPromotedToType (ISD::LOAD, (MVT::ValueTy
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.84 -> 1.85 --- Log message: add a note --- Diffs of the changes: (+23 -0) README.txt | 23 +++ 1 files changed, 23 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.84 llvm/lib/Target/X86/README.txt:1.85 --- llvm/lib/Target/X86/README.txt:1.84 Mon Apr 10 16:42:57 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 10 16:51:03 2006 @@ -675,6 +675,29 @@ //===-===// +Better codegen for: + +void f(float a, float b, vector float * out) { *out = (vector float){ a, 0.0, 0.0, b}; } +void f(float a, float b, vector float * out) { *out = (vector float){ a, b, 0.0, 0}; } + +For the later we generate: + +_f: +pxor %xmm0, %xmm0 +movss 8(%esp), %xmm1 +movaps %xmm0, %xmm2 +unpcklps %xmm1, %xmm2 +movss 4(%esp), %xmm1 +unpcklps %xmm0, %xmm1 +unpcklps %xmm2, %xmm1 +movl 12(%esp), %eax +movaps %xmm1, (%eax) +ret + +This seems like it should use shufps, one for each of a & b. + +//===-===// + Adding to the list of cmp / test poor codegen issues: int test(__m128 *A, __m128 *B) { ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
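One way to see the "one shufps for each of a & b" suggestion: each scalar arrives via movss with three zero lanes already in place, so a single shufps can merge them. An intrinsic sketch (the function name and the immediate choice 0x14 are ours) for the `{a, 0.0, 0.0, b}` case:

```c
#include <xmmintrin.h>

/* Sketch of a shufps-based build for {a, 0.0, 0.0, b}: the two movss
   loads already carry the zeros, so one shufps finishes the vector.
   Immediate 0x14 selects {a, 0} from va and {0, b} from vb. */
void build_a00b(float a, float b, float out[4]) {
    __m128 va = _mm_set_ss(a);                         /* {a, 0, 0, 0} */
    __m128 vb = _mm_set_ss(b);                         /* {b, 0, 0, 0} */
    _mm_storeu_ps(out, _mm_shuffle_ps(va, vb, 0x14));  /* {a, 0, 0, b} */
}
```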
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.82 -> 1.83 --- Log message: Correct an entry --- Diffs of the changes: (+2 -2) README.txt |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.82 llvm/lib/Target/X86/README.txt:1.83 --- llvm/lib/Target/X86/README.txt:1.82 Mon Apr 10 02:22:03 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 10 16:41:39 2006 @@ -770,8 +770,8 @@ #include -void test(__m128 *res, __m128 *A) { - *res = _mm_shuffle_ps(*A, *A, 0xF0); +void test(__m128 *res, __m128 *A, __m128 *B) { + *res = _mm_shuffle_ps(*A, *B, 0xF0); } We should emit ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.83 -> 1.84 --- Log message: Remove an entry that is now done. --- Diffs of the changes: (+0 -13) README.txt | 13 - 1 files changed, 13 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.83 llvm/lib/Target/X86/README.txt:1.84 --- llvm/lib/Target/X86/README.txt:1.83 Mon Apr 10 16:41:39 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 10 16:42:57 2006 @@ -768,19 +768,6 @@ //===-===// -#include - -void test(__m128 *res, __m128 *A, __m128 *B) { - *res = _mm_shuffle_ps(*A, *B, 0xF0); -} - -We should emit - shufps $240, (%eax), %xmm0 -instead of - pshufd $240, (%eax), %xmm0 - -//===-===// - X86RegisterInfo::copyRegToReg() returns X86::MOVAPSrr for VR128. Is it possible to choose between movaps, movapd, and movdqa based on types of source and destination? ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.81 -> 1.82 --- Log message: New entries --- Diffs of the changes: (+19 -0) README.txt | 19 +++ 1 files changed, 19 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.81 llvm/lib/Target/X86/README.txt:1.82 --- llvm/lib/Target/X86/README.txt:1.81 Fri Apr 7 16:19:53 2006 +++ llvm/lib/Target/X86/README.txt Mon Apr 10 02:22:03 2006 @@ -765,3 +765,22 @@ A Mac OS X IA-32 specific ABI bug wrt returning value > 8 bytes: http://llvm.org/bugs/show_bug.cgi?id=729 + +//===-===// + +#include + +void test(__m128 *res, __m128 *A) { + *res = _mm_shuffle_ps(*A, *A, 0xF0); +} + +We should emit + shufps $240, (%eax), %xmm0 +instead of + pshufd $240, (%eax), %xmm0 + +//===-===// + +X86RegisterInfo::copyRegToReg() returns X86::MOVAPSrr for VR128. Is it possible +to choose between movaps, movapd, and movdqa based on types of source and +destination? ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.80 -> 1.81 --- Log message: Keep track of an Mac OS X / x86 ABI bug. --- Diffs of the changes: (+5 -0) README.txt |5 + 1 files changed, 5 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.80 llvm/lib/Target/X86/README.txt:1.81 --- llvm/lib/Target/X86/README.txt:1.80 Thu Apr 6 18:21:24 2006 +++ llvm/lib/Target/X86/README.txt Fri Apr 7 16:19:53 2006 @@ -760,3 +760,8 @@ movddup 8(%esp), %xmm0 movapd %xmm0, (%eax) ret + +//===-===// + +A Mac OS X IA-32 specific ABI bug wrt returning value > 8 bytes: +http://llvm.org/bugs/show_bug.cgi?id=729 ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.79 -> 1.80 --- Log message: New entries. --- Diffs of the changes: (+56 -0) README.txt | 56 1 files changed, 56 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.79 llvm/lib/Target/X86/README.txt:1.80 --- llvm/lib/Target/X86/README.txt:1.79 Wed Apr 5 18:46:04 2006 +++ llvm/lib/Target/X86/README.txt Thu Apr 6 18:21:24 2006 @@ -704,3 +704,59 @@ so a any extend (which becomes a zero extend) is added. We probably need some kind of target DAG combine hook to fix this. + +//===-===// + +How to decide when to use the "floating point version" of logical ops? Here are +some code fragments: + + movaps LCPI5_5, %xmm2 + divps %xmm1, %xmm2 + mulps %xmm2, %xmm3 + mulps 8656(%ecx), %xmm3 + addps 8672(%ecx), %xmm3 + andps LCPI5_6, %xmm2 + andps LCPI5_1, %xmm3 + por %xmm2, %xmm3 + movdqa %xmm3, (%edi) + + movaps LCPI5_5, %xmm1 + divps %xmm0, %xmm1 + mulps %xmm1, %xmm3 + mulps 8656(%ecx), %xmm3 + addps 8672(%ecx), %xmm3 + andps LCPI5_6, %xmm1 + andps LCPI5_1, %xmm3 + orps %xmm1, %xmm3 + movaps %xmm3, 112(%esp) + movaps %xmm3, (%ebx) + +Due to some minor source change, the later case ended up using orps and movaps +instead of por and movdqa. Does it matter? + +//===-===// + +Use movddup to splat a v2f64 directly from a memory source. e.g. + +#include + +void test(__m128d *r, double A) { + *r = _mm_set1_pd(A); +} + +llc: + +_test: + movsd 8(%esp), %xmm0 + unpcklpd %xmm0, %xmm0 + movl 4(%esp), %eax + movapd %xmm0, (%eax) + ret + +icc: + +_test: + movl 4(%esp), %eax + movddup 8(%esp), %xmm0 + movapd %xmm0, (%eax) + ret ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
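The source pattern in the movddup entry is just a v2f64 splat; a runnable intrinsic form (SSE2 semantics — movddup itself needs SSE3, and whether the compiler emits it is the point of the note):

```c
#include <emmintrin.h>

/* The splat from the note: icc covers this with a single movddup (SSE3);
   _mm_set1_pd expresses the same {a, a} semantics portably on SSE2. */
void splat_pd(double *out2, double a) {
    __m128d v = _mm_set1_pd(a);   /* {a, a} */
    _mm_storeu_pd(out2, v);
}
```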
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.78 -> 1.79 --- Log message: An entry about comi / ucomi intrinsics. --- Diffs of the changes: (+31 -0) README.txt | 31 +++ 1 files changed, 31 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.78 llvm/lib/Target/X86/README.txt:1.79 --- llvm/lib/Target/X86/README.txt:1.78 Tue Mar 28 21:03:46 2006 +++ llvm/lib/Target/X86/README.txt Wed Apr 5 18:46:04 2006 @@ -673,3 +673,34 @@ Better codegen for vector_shuffles like this { x, 0, 0, 0 } or { x, 0, x, 0}. Perhaps use pxor / xorp* to clear a XMM register first? +//===-===// + +Adding to the list of cmp / test poor codegen issues: + +int test(__m128 *A, __m128 *B) { + if (_mm_comige_ss(*A, *B)) +return 3; + else +return 4; +} + +_test: + movl 8(%esp), %eax + movaps (%eax), %xmm0 + movl 4(%esp), %eax + movaps (%eax), %xmm1 + comiss %xmm0, %xmm1 + setae %al + movzbl %al, %ecx + movl $3, %eax + movl $4, %edx + cmpl $0, %ecx + cmove %edx, %eax + ret + +Note the setae, movzbl, cmpl, cmove can be replaced with a single cmovae. There +are a number of issues. 1) We are introducing a setcc between the result of the +intrisic call and select. 2) The intrinsic is expected to produce a i32 value +so a any extend (which becomes a zero extend) is added. + +We probably need some kind of target DAG combine hook to fix this. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
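For reference, the entry's test case behaves like this compilable version (the test values below are ours): `_mm_comige_ss` compares only the low lanes, which is why the whole select should reduce to comiss + cmovae rather than setae/movzbl/cmpl/cmove.

```c
#include <xmmintrin.h>

/* Runnable form of the entry's source: a low-lane >= compare feeding a
   select between two constants. */
int comige_select(__m128 *a, __m128 *b) {
    return _mm_comige_ss(*a, *b) ? 3 : 4;
}
```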
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.77 -> 1.78 --- Log message: Another entry about shuffles. --- Diffs of the changes: (+6 -0) README.txt |6 ++ 1 files changed, 6 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.77 llvm/lib/Target/X86/README.txt:1.78 --- llvm/lib/Target/X86/README.txt:1.77 Tue Mar 28 00:55:45 2006 +++ llvm/lib/Target/X86/README.txt Tue Mar 28 21:03:46 2006 @@ -667,3 +667,9 @@ Use movhps to update upper 64-bits of a v4sf value. Also movlps on lower half of a v4sf value. + +//===-===// + +Better codegen for vector_shuffles like this { x, 0, 0, 0 } or { x, 0, x, 0}. +Perhaps use pxor / xorp* to clear a XMM register first? + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.76 -> 1.77 --- Log message: Update --- Diffs of the changes: (+2 -23) README.txt | 25 ++--- 1 files changed, 2 insertions(+), 23 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.76 llvm/lib/Target/X86/README.txt:1.77 --- llvm/lib/Target/X86/README.txt:1.76 Mon Mar 27 20:49:12 2006 +++ llvm/lib/Target/X86/README.txt Tue Mar 28 00:55:45 2006 @@ -665,26 +665,5 @@ //===-===// -Is it really a good idea to use movlhps to move 1 double-precision FP value from -low quadword of source to high quadword of destination? - -e.g. - -void test2 (v2sd *b, double X, double Y) { - v2sd a = (v2sd) {X, X*Y}; - *b = a; -} - - movsd 8(%esp), %xmm0 - movapd %xmm0, %xmm1 - mulsd 16(%esp), %xmm1 - movlhps %xmm1, %xmm0 - movl 4(%esp), %eax - movapd %xmm0, (%eax) - ret - -icc uses unpcklpd instead. - -//===-===// - -Use movhps and movlhps to update upper 64-bits of a v4sf value. +Use movhps to update upper 64-bits of a v4sf value. Also movlps on lower half +of a v4sf value. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.75 -> 1.76 --- Log message: Added a couple of entries about movhps and movlhps. --- Diffs of the changes: (+26 -0) README.txt | 26 ++ 1 files changed, 26 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.75 llvm/lib/Target/X86/README.txt:1.76 --- llvm/lib/Target/X86/README.txt:1.75 Mon Mar 27 20:44:05 2006 +++ llvm/lib/Target/X86/README.txt Mon Mar 27 20:49:12 2006 @@ -662,3 +662,29 @@ Obviously it would have been better for the first mov (or any op) to store directly %esp[0] if there are no other uses. + +//===-===// + +Is it really a good idea to use movlhps to move 1 double-precision FP value from +low quadword of source to high quadword of destination? + +e.g. + +void test2 (v2sd *b, double X, double Y) { + v2sd a = (v2sd) {X, X*Y}; + *b = a; +} + + movsd 8(%esp), %xmm0 + movapd %xmm0, %xmm1 + mulsd 16(%esp), %xmm1 + movlhps %xmm1, %xmm0 + movl 4(%esp), %eax + movapd %xmm0, (%eax) + ret + +icc uses unpcklpd instead. + +//===-===// + +Use movhps and movlhps to update upper 64-bits of a v4sf value. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
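The `test2` fragment in the entry uses a GCC vector typedef; an equivalent intrinsic form (the helper name and values are ours) makes the `{X, X*Y}` layout explicit — this is the vector the movlhps-vs-unpcklpd choice is about:

```c
#include <emmintrin.h>

/* Intrinsic form of test2 from the entry: build {x, x*y} in one v2sd.
   Note _mm_set_pd takes the HIGH element first, so the low lane is x. */
void test2_pd(double *out2, double x, double y) {
    __m128d a = _mm_set_pd(x * y, x);   /* {x, x*y} */
    _mm_storeu_pd(out2, a);
}
```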
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.74 -> 1.75 --- Log message: All unpack cases are now being handled. --- Diffs of the changes: (+0 -4) README.txt |4 1 files changed, 4 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.74 llvm/lib/Target/X86/README.txt:1.75 --- llvm/lib/Target/X86/README.txt:1.74 Sun Mar 26 13:19:27 2006 +++ llvm/lib/Target/X86/README.txt Mon Mar 27 20:44:05 2006 @@ -662,7 +662,3 @@ Obviously it would have been better for the first mov (or any op) to store directly %esp[0] if there are no other uses. - -//===-===// - -Add more vector shuffle special cases using unpckhps and unpcklps. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
Re: [llvm-commits] CVS: llvm/lib/Target/X86/README.txt
The original note for implementing this (which I wrote) indicated that this should only be done for very small memory blocks, probably < 8 bytes, but certainly less than 64. I don't know what the magic number is where there's a tradeoff, and it's probably different for different targets, but certainly megabytes is WAAAY too big :)

Reid.

On Sun, 2006-03-26 at 13:19 -0600, Nate Begeman wrote:
> Changes in directory llvm/lib/Target/X86:
> README.txt updated: 1.73 -> 1.74
> Log message: Readme note
>
> +We are currently lowering large (1MB+) memmove/memcpy to rep/stosl and rep/movsl
> +We should leave these as libcalls for everything over a much lower threshold,
> +since libc is hand tuned for medium and large mem ops (avoiding RFO for large
> +stores, TLB preheating, etc)
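Reid's point amounts to a size-threshold dispatch. A hedged sketch of that policy in C (the 64-byte cutoff and the function name are illustrative, not measured values from the thread):

```c
#include <string.h>

/* Hypothetical threshold dispatch: expand tiny copies inline (standing in
   for the unrolled load/store expansion a compiler would emit), and call
   the hand-tuned libc memcpy for everything larger, since libc handles
   the medium/large cases (RFO avoidance, TLB preheating, etc.). */
enum { INLINE_COPY_MAX = 64 };   /* illustrative cutoff, not a tuned value */

void *copy(void *dst, const void *src, size_t n) {
    if (n <= INLINE_COPY_MAX) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        while (n--) *d++ = *s++;   /* inline expansion for small blocks */
        return dst;
    }
    return memcpy(dst, src, n);    /* libcall for everything else */
}
```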
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.73 -> 1.74 --- Log message: Readme note --- Diffs of the changes: (+7 -0) README.txt |7 +++ 1 files changed, 7 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.73 llvm/lib/Target/X86/README.txt:1.74 --- llvm/lib/Target/X86/README.txt:1.73 Fri Mar 24 01:12:19 2006 +++ llvm/lib/Target/X86/README.txt Sun Mar 26 13:19:27 2006 @@ -542,6 +542,13 @@ //===-===// +We are currently lowering large (1MB+) memmove/memcpy to rep/stosl and rep/movsl +We should leave these as libcalls for everything over a much lower threshold, +since libc is hand tuned for medium and large mem ops (avoiding RFO for large +stores, TLB preheating, etc) + +//===-===// + Lower memcpy / memset to a series of SSE 128 bit move instructions when it's feasible. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ISelLowering.cpp
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.72 -> 1.73 X86ISelLowering.cpp updated: 1.127 -> 1.128 --- Log message: Gabor points out that we can't spell. :) --- Diffs of the changes: (+4 -4) README.txt |4 ++-- X86ISelLowering.cpp |4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.72 llvm/lib/Target/X86/README.txt:1.73 --- llvm/lib/Target/X86/README.txt:1.72 Fri Mar 24 00:40:32 2006 +++ llvm/lib/Target/X86/README.txt Fri Mar 24 01:12:19 2006 @@ -547,7 +547,7 @@ //===-===// -Teach the coallescer to commute 2-addr instructions, allowing us to eliminate +Teach the coalescer to commute 2-addr instructions, allowing us to eliminate the reg-reg copy in this example: float foo(int *x, float *y, unsigned c) { @@ -642,7 +642,7 @@ //===-===// -Teach the coallescer to coales vregs of different register classes. e.g. FR32 / +Teach the coalescer to coalesce vregs of different register classes. e.g. FR32 / FR64 to VR128. //===-===// Index: llvm/lib/Target/X86/X86ISelLowering.cpp diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.127 llvm/lib/Target/X86/X86ISelLowering.cpp:1.128 --- llvm/lib/Target/X86/X86ISelLowering.cpp:1.127 Fri Mar 24 00:40:32 2006 +++ llvm/lib/Target/X86/X86ISelLowering.cpp Fri Mar 24 01:12:19 2006 @@ -660,10 +660,10 @@ // EDX". Anything more is illegal. // // FIXME: The linscan register allocator currently has problem with -// coallescing. At the time of this writing, whenever it decides to coallesce +// coalescing. At the time of this writing, whenever it decides to coalesce // a physreg with a virtreg, this increases the size of the physreg's live // range, and the live range cannot ever be reduced. This causes problems if -// too many physregs are coalleced with virtregs, which can cause the register +// too many physregs are coaleced with virtregs, which can cause the register // allocator to wedge itself. 
// // This code triggers this problem more often if we pass args in registers, ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt X86ISelLowering.cpp
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.71 -> 1.72 X86ISelLowering.cpp updated: 1.126 -> 1.127 --- Log message: All v2f64 shuffle cases can be handled. --- Diffs of the changes: (+8 -1) README.txt |4 X86ISelLowering.cpp |5 - 2 files changed, 8 insertions(+), 1 deletion(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.71 llvm/lib/Target/X86/README.txt:1.72 --- llvm/lib/Target/X86/README.txt:1.71 Thu Mar 23 20:57:03 2006 +++ llvm/lib/Target/X86/README.txt Fri Mar 24 00:40:32 2006 @@ -655,3 +655,7 @@ Obviously it would have been better for the first mov (or any op) to store directly %esp[0] if there are no other uses. + +//===-===// + +Add more vector shuffle special cases using unpckhps and unpcklps. Index: llvm/lib/Target/X86/X86ISelLowering.cpp diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.126 llvm/lib/Target/X86/X86ISelLowering.cpp:1.127 --- llvm/lib/Target/X86/X86ISelLowering.cpp:1.126 Thu Mar 23 20:58:06 2006 +++ llvm/lib/Target/X86/X86ISelLowering.cpp Fri Mar 24 00:40:32 2006 @@ -2329,7 +2329,10 @@ return DAG.getNode(ISD::VECTOR_SHUFFLE, VT, V1, DAG.getNode(ISD::UNDEF, V1.getValueType()), PermMask); -} else if (NumElems == 2 || X86::isSHUFPMask(PermMask.Val)) { +} else if (NumElems == 2) { + // All v2f64 cases are handled. + return SDOperand(); +} else if (X86::isSHUFPMask(PermMask.Val)) { SDOperand Elt = PermMask.getOperand(0); if (cast(Elt)->getValue() >= NumElems) { // Swap the operands and change mask. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.70 -> 1.71 --- Log message: A new entry --- Diffs of the changes: (+11 -0) README.txt | 11 +++ 1 files changed, 11 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.70 llvm/lib/Target/X86/README.txt:1.71 --- llvm/lib/Target/X86/README.txt:1.70 Tue Mar 21 01:18:26 2006 +++ llvm/lib/Target/X86/README.txt Thu Mar 23 20:57:03 2006 @@ -644,3 +644,14 @@ Teach the coallescer to coales vregs of different register classes. e.g. FR32 / FR64 to VR128. + +//===-===// + +mov $reg, 48(%esp) +... +leal 48(%esp), %eax +mov %eax, (%esp) +call _foo + +Obviously it would have been better for the first mov (or any op) to store +directly %esp[0] if there are no other uses. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.69 -> 1.70 --- Log message: Combine 2 entries --- Diffs of the changes: (+6 -8) README.txt | 14 ++ 1 files changed, 6 insertions(+), 8 deletions(-) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.69 llvm/lib/Target/X86/README.txt:1.70 --- llvm/lib/Target/X86/README.txt:1.69 Tue Mar 21 01:12:57 2006 +++ llvm/lib/Target/X86/README.txt Tue Mar 21 01:18:26 2006 @@ -485,6 +485,12 @@ //===-===// +Should generate min/max for stuff like: + +void minf(float a, float b, float *X) { + *X = a <= b ? a : b; +} + Make use of floating point min / max instructions. Perhaps introduce ISD::FMIN and ISD::FMAX node types? @@ -636,13 +642,5 @@ //===-===// -Should generate min/max for stuff like: - -void minf(float a, float b, float *X) { - *X = a <= b ? a : b; -} - -//===-===// - Teach the coallescer to coales vregs of different register classes. e.g. FR32 / FR64 to VR128. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.68 -> 1.69 --- Log message: Add a note about x86 register coallescing --- Diffs of the changes: (+2 -0) README.txt |2 ++ 1 files changed, 2 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.68 llvm/lib/Target/X86/README.txt:1.69 --- llvm/lib/Target/X86/README.txt:1.68 Sun Mar 19 16:27:41 2006 +++ llvm/lib/Target/X86/README.txt Tue Mar 21 01:12:57 2006 @@ -644,3 +644,5 @@ //===-===// +Teach the coallescer to coales vregs of different register classes. e.g. FR32 / +FR64 to VR128. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.67 -> 1.68 --- Log message: add a note with a testcase --- Diffs of the changes: (+11 -0) README.txt | 11 +++ 1 files changed, 11 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.67 llvm/lib/Target/X86/README.txt:1.68 --- llvm/lib/Target/X86/README.txt:1.67 Sun Mar 19 00:08:11 2006 +++ llvm/lib/Target/X86/README.txt Sun Mar 19 16:27:41 2006 @@ -633,3 +633,14 @@ The following tests perform worse with LSR: lambda, siod, optimizer-eval, ackermann, hash2, nestedloop, strcat, and Treesor. + +//===-===// + +Should generate min/max for stuff like: + +void minf(float a, float b, float *X) { + *X = a <= b ? a : b; +} + +//===-===// + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.66 -> 1.67 --- Log message: Remember which tests are hurt by LSR. --- Diffs of the changes: (+4 -0) README.txt |4 1 files changed, 4 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.66 llvm/lib/Target/X86/README.txt:1.67 --- llvm/lib/Target/X86/README.txt:1.66 Thu Mar 16 16:44:22 2006 +++ llvm/lib/Target/X86/README.txt Sun Mar 19 00:08:11 2006 @@ -629,3 +629,7 @@ dependent LICM pass or 2) makeing SelectDAG represent the whole function. //===-===// + +The following tests perform worse with LSR: + +lambda, siod, optimizer-eval, ackermann, hash2, nestedloop, strcat, and Treesor. ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.65 -> 1.66 --- Log message: A new entry. --- Diffs of the changes: (+45 -0) README.txt | 45 + 1 files changed, 45 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.65 llvm/lib/Target/X86/README.txt:1.66 --- llvm/lib/Target/X86/README.txt:1.65 Wed Mar 8 19:39:46 2006 +++ llvm/lib/Target/X86/README.txt Thu Mar 16 16:44:22 2006 @@ -584,3 +584,48 @@ //===-===// +%X = weak global int 0 + +void %foo(int %N) { + %N = cast int %N to uint + %tmp.24 = setgt int %N, 0 + br bool %tmp.24, label %no_exit, label %return + +no_exit: + %indvar = phi uint [ 0, %entry ], [ %indvar.next, %no_exit ] + %i.0.0 = cast uint %indvar to int + volatile store int %i.0.0, int* %X + %indvar.next = add uint %indvar, 1 + %exitcond = seteq uint %indvar.next, %N + br bool %exitcond, label %return, label %no_exit + +return: + ret void +} + +compiles into: + + .text + .align 4 + .globl _foo +_foo: + movl 4(%esp), %eax + cmpl $1, %eax + jl LBB_foo_4# return +LBB_foo_1: # no_exit.preheader + xorl %ecx, %ecx +LBB_foo_2: # no_exit + movl L_X$non_lazy_ptr, %edx + movl %ecx, (%edx) + incl %ecx + cmpl %eax, %ecx + jne LBB_foo_2 # no_exit +LBB_foo_3: # return.loopexit +LBB_foo_4: # return + ret + +We should hoist "movl L_X$non_lazy_ptr, %edx" out of the loop after +remateralization is implemented. This can be accomplished with 1) a target +dependent LICM pass or 2) makeing SelectDAG represent the whole function. + +//===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.64 -> 1.65 --- Log message: a couple of miscellaneous things. --- Diffs of the changes: (+18 -0) README.txt | 18 ++ 1 files changed, 18 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.64 llvm/lib/Target/X86/README.txt:1.65 --- llvm/lib/Target/X86/README.txt:1.64 Sat Mar 4 19:15:18 2006 +++ llvm/lib/Target/X86/README.txt Wed Mar 8 19:39:46 2006 @@ -566,3 +566,21 @@ jb LBB_foo_3# no_exit //===-===// + +Codegen: + if (copysign(1.0, x) == copysign(1.0, y)) +into: + if (x^y & mask) +when using SSE. + +//===-===// + +Optimize this into something reasonable: + x * copysign(1.0, y) * copysign(1.0, z) + +//===-===// + +Optimize copysign(x, *y) to use an integer load from y. + +//===-===// + ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
[llvm-commits] CVS: llvm/lib/Target/X86/README.txt
Changes in directory llvm/lib/Target/X86: README.txt updated: 1.63 -> 1.64 --- Log message: add a note for something evan noticed --- Diffs of the changes: (+28 -0) README.txt | 28 1 files changed, 28 insertions(+) Index: llvm/lib/Target/X86/README.txt diff -u llvm/lib/Target/X86/README.txt:1.63 llvm/lib/Target/X86/README.txt:1.64 --- llvm/lib/Target/X86/README.txt:1.63 Sat Mar 4 01:49:50 2006 +++ llvm/lib/Target/X86/README.txt Sat Mar 4 19:15:18 2006 @@ -538,3 +538,31 @@ Lower memcpy / memset to a series of SSE 128 bit move instructions when it's feasible. + +//===-===// + +Teach the coallescer to commute 2-addr instructions, allowing us to eliminate +the reg-reg copy in this example: + +float foo(int *x, float *y, unsigned c) { + float res = 0.0; + unsigned i; + for (i = 0; i < c; i++) { +float xx = (float)x[i]; +xx = xx * y[i]; +xx += res; +res = xx; + } + return res; +} + +LBB_foo_3: # no_exit +cvtsi2ss %XMM0, DWORD PTR [%EDX + 4*%ESI] +mulss %XMM0, DWORD PTR [%EAX + 4*%ESI] +addss %XMM0, %XMM1 +inc %ESI +cmp %ESI, %ECX +movaps %XMM1, %XMM0 +jb LBB_foo_3# no_exit + +//===-===// ___ llvm-commits mailing list llvm-commits@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits