[llvm-commits] CVS: llvm/docs/ReleaseNotes.html

2006-04-17 Thread Chris Lattner


Changes in directory llvm/docs:

ReleaseNotes.html updated: 1.348 -> 1.349
---
Log message:

Add some more notes, many still missing



---
Diffs of the changes:  (+31 -2)

 ReleaseNotes.html |   33 +++--
 1 files changed, 31 insertions(+), 2 deletions(-)


Index: llvm/docs/ReleaseNotes.html
diff -u llvm/docs/ReleaseNotes.html:1.348 llvm/docs/ReleaseNotes.html:1.349
--- llvm/docs/ReleaseNotes.html:1.348   Tue Apr 18 01:18:36 2006
+++ llvm/docs/ReleaseNotes.html Tue Apr 18 01:32:08 2006
@@ -171,13 +171,42 @@
 
 
 
+
+Optimizer 
+Improvements
+
+
+
+The Loop Unswitching pass (-loop-unswitch) has had several bugs
+fixed, has several new features, and is enabled by default in llvmgcc3
+now.
+The Loop Strength Reduction pass (-loop-reduce) is now enabled for
+the X86 backend.
+The Instruction Combining pass (-instcombine) now includes a
+framework and implementation for simplifying code based on whether computed
+bits are demanded or not.
+The Scalar Replacement of Aggregates pass (-scalarrepl) can now
+promote simple unions to registers.
+Several LLVM passes are significantly faster
+(see http://llvm.org/PR681).
+
+
+
 
 

 Other New Features
 
 
 
-foo
+LLVM now supports first class global ctor/dtor initialization lists, no
+longer forcing targets to use "__main".
+LLVM supports assigning globals and functions to a particular section
+in the result executable using the GCC section attribute.
+Adding intrinsics to LLVM is now
+significantly easier.
+llvmgcc4 now fully supports C99 Variable Length Arrays, including dynamic
+stack deallocation.
+
 
 
 
@@ -701,7 +730,7 @@
   src="http://www.w3.org/Icons/valid-html401"; alt="Valid HTML 4.01!" />
 
   http://llvm.org/";>The LLVM Compiler Infrastructure
-  Last modified: $Date: 2006/04/18 06:18:36 $
+  Last modified: $Date: 2006/04/18 06:32:08 $
 
 
 



___
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


[llvm-commits] CVS: llvm/projects/sample/autoconf/AutoRegen.sh configure.ac

2006-04-17 Thread Reid Spencer


Changes in directory llvm/projects/sample/autoconf:

AutoRegen.sh updated: 1.2 -> 1.3
configure.ac updated: 1.4 -> 1.5
---
Log message:

Have the AutoRegen.sh script prompt the user for the LLVM src and obj 
directories if it can't find them. Then, replace those values into the
configure.ac script and pass them to the LLVM_CONFIG_PROJECT so that the
values become the default for llvm_src and llvm_obj variables. In this way
the user is required to input this exactly once, and the scripts take it
from there.


---
Diffs of the changes:  (+29 -4)

 AutoRegen.sh |   26 --
 configure.ac |7 +--
 2 files changed, 29 insertions(+), 4 deletions(-)


Index: llvm/projects/sample/autoconf/AutoRegen.sh
diff -u llvm/projects/sample/autoconf/AutoRegen.sh:1.2 llvm/projects/sample/autoconf/AutoRegen.sh:1.3
--- llvm/projects/sample/autoconf/AutoRegen.sh:1.2  Thu Feb 24 12:42:34 2005
+++ llvm/projects/sample/autoconf/AutoRegen.sh  Tue Apr 18 01:27:47 2006
@@ -7,20 +7,42 @@
 test -f configure.ac || die "Can't find 'autoconf' dir; please cd into it first"
 autoconf --version | egrep '2\.5[0-9]' > /dev/null
 if test $? -ne 0 ; then
-   die "Your autoconf was not detected as being 2.5x"
+  die "Your autoconf was not detected as being 2.5x"
 fi
 cwd=`pwd`
 if test -d ../../../autoconf/m4 ; then
   cd ../../../autoconf/m4
   llvm_m4=`pwd`
+  llvm_src_root=../../..
+  llvm_obj_root=../../..
   cd $cwd
 elif test -d ../../llvm/autoconf/m4 ; then
   cd ../../llvm/autoconf/m4
   llvm_m4=`pwd`
+  llvm_src_root=../..
+  llvm_obj_root=../..
   cd $cwd
 else
-  die "Can't find the LLVM autoconf/m4 directory. The project should be checked out to projects directory"
+  while true ; do
+echo "LLVM source root not found." 
+read -p "Enter full path to LLVM source:"
+if test -d "$REPLY/autoconf/m4" ; then
+  llvm_src_root="$REPLY"
+  llvm_m4="$REPLY/autoconf/m4"
+  read -p "Enter full path to LLVM objects (empty for same as source):"
+  if test -d "$REPLY" ; then
+llvm_obj_root="$REPLY"
+  else
+llvm_obj_root="$llvm_src_root"
+  fi
+  break
+fi
+  done
 fi
+# Patch the LLVM_ROOT in configure.ac, if it needs it
+cp configure.ac configure.bak
+sed -e "s#^LLVM_SRC_ROOT=.*#LLVM_SRC_ROOT=\"$llvm_src_root\"#" \
+-e "s#^LLVM_OBJ_ROOT=.*#LLVM_OBJ_ROOT=\"$llvm_obj_root\"#" configure.bak > configure.ac
 echo "Regenerating aclocal.m4 with aclocal"
 rm -f aclocal.m4
 aclocal -I $llvm_m4 -I "$llvm_m4/.." || die "aclocal failed"


Index: llvm/projects/sample/autoconf/configure.ac
diff -u llvm/projects/sample/autoconf/configure.ac:1.4 llvm/projects/sample/autoconf/configure.ac:1.5
--- llvm/projects/sample/autoconf/configure.ac:1.4  Thu Feb 24 12:50:53 2005
+++ llvm/projects/sample/autoconf/configure.ac  Tue Apr 18 01:27:47 2006
@@ -3,13 +3,16 @@
 dnl **
 AC_INIT([[[SAMPLE]]],[[[x.xx]]],[EMAIL PROTECTED])
 
+dnl Identify where LLVM source tree is
+LLVM_SRC_ROOT="../../"
+LLVM_OBJ_ROOT="../../"
 dnl Tell autoconf that the auxilliary files are actually located in
 dnl the LLVM autoconf directory, not here.
-AC_CONFIG_AUX_DIR(../../autoconf)
+AC_CONFIG_AUX_DIR($LLVM_SRC_ROOT/autoconf)
 
 dnl Tell autoconf that this is an LLVM project being configured
 dnl This provides the --with-llvmsrc and --with-llvmobj options
-LLVM_CONFIG_PROJECT
+LLVM_CONFIG_PROJECT($LLVM_SRC_ROOT,$LLVM_OBJ_ROOT)
 
 dnl Verify that the source directory is valid
 AC_CONFIG_SRCDIR(["Makefile.common.in"])





[llvm-commits] CVS: llvm/autoconf/m4/config_project.m4

2006-04-17 Thread Reid Spencer


Changes in directory llvm/autoconf/m4:

config_project.m4 updated: 1.1 -> 1.2
---
Log message:

Make it possible to default the llvm_src and llvm_obj variables based on
the arguments to the macro. This better supports the AutoRegen.sh script
in projects/sample/autoconf.


---
Diffs of the changes:  (+2 -2)

 config_project.m4 |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)


Index: llvm/autoconf/m4/config_project.m4
diff -u llvm/autoconf/m4/config_project.m4:1.1 llvm/autoconf/m4/config_project.m4:1.2
--- llvm/autoconf/m4/config_project.m4:1.1  Thu Feb 24 12:25:24 2005
+++ llvm/autoconf/m4/config_project.m4  Tue Apr 18 01:25:37 2006
@@ -4,11 +4,11 @@
 AC_DEFUN([LLVM_CONFIG_PROJECT],
   [AC_ARG_WITH([llvmsrc],
 AS_HELP_STRING([--with-llvmsrc],[Location of LLVM Source Code]),
-[llvm_src="$withval"],[llvm_src=`cd ${srcdir}/../..; pwd`])
+[llvm_src="$withval"],[llvm_src="]$1["])
   AC_SUBST(LLVM_SRC,$llvm_src)
   AC_ARG_WITH([llvmobj],
 AS_HELP_STRING([--with-llvmobj],[Location of LLVM Object Code]),
-[llvm_obj="$withval"],[llvm_obj=`cd ../..; pwd`])
+[llvm_obj="$withval"],[llvm_obj="]$2["])
   AC_SUBST(LLVM_OBJ,$llvm_obj)
   AC_CONFIG_COMMANDS([setup],,[llvm_src="${LLVM_SRC}"])
 ])





[llvm-commits] CVS: llvm/docs/ReleaseNotes.html

2006-04-17 Thread Chris Lattner


Changes in directory llvm/docs:

ReleaseNotes.html updated: 1.347 -> 1.348
---
Log message:

add a bunch of stuff, pieces still missing


---
Diffs of the changes:  (+171 -47)

 ReleaseNotes.html |  218 ++
 1 files changed, 171 insertions(+), 47 deletions(-)


Index: llvm/docs/ReleaseNotes.html
diff -u llvm/docs/ReleaseNotes.html:1.347 llvm/docs/ReleaseNotes.html:1.348
--- llvm/docs/ReleaseNotes.html:1.347   Thu Mar  2 18:34:26 2006
+++ llvm/docs/ReleaseNotes.html Tue Apr 18 01:18:36 2006
@@ -4,7 +4,7 @@
 
   
   
-  LLVM 1.7cvs Release Notes
+  LLVM 1.7 Release Notes
 
 
 
@@ -60,38 +60,152 @@
 
 
 
-This is the seventh public release of the LLVM Compiler Infrastructure. This
-release incorporates a large number of enhancements and additions (primarily in
-the code generator), which combine to improve the quality of the code generated
-by LLVM by up to 30% in some cases.  This release is also the first release to
-have first-class support for Mac OS X: all of the major bugs have been shaken
-out and it is now as well supported as Linux on X86.
+This is the eighth public release of the LLVM Compiler Infrastructure. This
+release incorporates a large number of enhancements and new features,
+including vector support (Intel SSE and Altivec), a new GCC4.0-based
+C/C++ front-end, Objective C/C++ support, inline assembly support, and many
+other big features.
+
 
 
 
 

 
-New Features in LLVM 1.7cvs
+New Features in LLVM 1.7
 
 
+
+GCC4.0-based llvm-gcc
+front-end
+
+
+
+LLVM 1.7 includes a brand new llvm-gcc, based on GCC 4.0.1.  This version
+of llvm-gcc solves many serious long-standing problems with llvm-gcc, including
+all of those blocked by the llvm-gcc 4 meta bug (http://llvm.org/PR498).
+In addition, llvm-gcc4 implements support for many new features,
+including GCC inline assembly, generic vector support, SSE and Altivec
+intrinsics, and several new GCC attributes.  Finally, llvm-gcc4 is
+significantly faster than llvm-gcc3, respects -O options, and its -c/-S options
+correspond to GCC's (they emit native code).
+
+If you can use it, llvm-gcc4 offers significant new functionality, and we
+hope that it will replace llvm-gcc3 completely in a future release.
+Unfortunately, it does not currently support C++ exception handling at all, and
+it only works on Apple Mac OS/X machines with X86 or PowerPC processors.
+
+
+
+
+
+Inline Assembly
+Support
+
+
+
+The LLVM IR and llvm-gcc4 front-end now fully support arbitrary GCC inline assembly.  The LLVM X86 and PowerPC
+code generators have initial support for it,
+being able to compile basic statements, but are missing some features.  Please
+report any inline asm statements that crash the compiler or that are miscompiled
+as bugs.
+
+
+
+
+New SPARC backend
+
+
+
+LLVM 1.7 includes a new, fully functional, SPARC backend built in the
+target-independent code generator.  This SPARC backend includes support for 
+SPARC V8 and SPARC V9 subtargets (controlling whether V9 features can be used),
+and targets the 32-bit SPARC ABI.
+
+The LLVM 1.7 release is the last release that will include the LLVM "SparcV9"
+backend, which was the very first LLVM native code generator.  In 1.8, it will
+be removed, replaced with the new SPARC backend.
+
+
+
+
+Generic Vector Support
+
+
+
+
+LLVM now includes significantly extended support for SIMD vectors in its
+core instruction set.  It now includes three new instructions for manipulating
+vectors: extractelement,
+insertelement, and
+shufflevector.  Further,
+many bugs in vector handling have been fixed, and vectors are now supported by
+the target-independent code generator.  For example, if a vector operation is
+not supported by a particular target, it will be correctly broken down and
+executed as scalar operations.
+
+Because llvm-gcc3 does not support GCC generic vectors or vector intrinsics,
+llvm-gcc4 must be used.
+
+
+
+
+Intel SSE and PowerPC 
+Altivec support
+
+
+
+
+The LLVM X86 backend now supports Intel SSE 1, 2, and 3, and now uses scalar
+SSE operations to implement scalar floating point math when the target supports
+SSE1 (for floats) or SSE2 (for doubles).  Vector SSE instructions are generated
+by llvm-gcc4 when the generic vector mechanism or specific SSE intrinsics are 
+used.
+
+
+The LLVM PowerPC backend now supports the Altivec instruction set, including
+both GCC -maltivec and -faltivec modes.  Altivec instructions are generated
+by llvm-gcc4 when the generic vector mechanism or specific Altivec intrinsics
+are used.
+
+
+
+
+
+
+Other New Features
+
 
 
-New C front-end.
-New SPARC backend.
-Inline assembly support.
+foo
 
 
 
+
 

 
-Significant changes in LLVM 1.7cvs
+Significant Changes in LLVM 1.7
 
 
 
 
-Removed the llvm.readport/llvm.writeport/llvm.readio/llvm.writeio
-intrinsics.
-Separated the other intrinsics based on type.
+The LLVM intrinsics used to be overloaded based on type: for example,
+llvm.ctpop could wor

[llvm-commits] CVS: llvm/tools/llc/llc.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/tools/llc:

llc.cpp updated: 1.129 -> 1.130
---
Log message:

Add a warning.


---
Diffs of the changes:  (+3 -0)

 llc.cpp |3 +++
 1 files changed, 3 insertions(+)


Index: llvm/tools/llc/llc.cpp
diff -u llvm/tools/llc/llc.cpp:1.129 llvm/tools/llc/llc.cpp:1.130
--- llvm/tools/llc/llc.cpp:1.129Wed Mar 22 23:27:47 2006
+++ llvm/tools/llc/llc.cpp  Tue Apr 18 00:31:20 2006
@@ -216,6 +216,9 @@
 sys::RemoveFileOnSignal(sys::Path(OutputFilename));
   }
 }
+
+if (FileType != TargetMachine::AssemblyFile)
+  std::cerr << "WARNING: only -filetype=asm is currently supported.\n";
 
 // Ask the target to add backend passes as necessary.
 if (Target.addPassesToEmitFile(Passes, *Out, FileType, Fast)) {





[llvm-commits] CVS: llvm/tools/llvm-db/llvm-db.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/tools/llvm-db:

llvm-db.cpp updated: 1.9 -> 1.10
---
Log message:

Add a warning


---
Diffs of the changes:  (+1 -0)

 llvm-db.cpp |1 +
 1 files changed, 1 insertion(+)


Index: llvm/tools/llvm-db/llvm-db.cpp
diff -u llvm/tools/llvm-db/llvm-db.cpp:1.9 llvm/tools/llvm-db/llvm-db.cpp:1.10
--- llvm/tools/llvm-db/llvm-db.cpp:1.9  Thu Apr 21 18:59:36 2005
+++ llvm/tools/llvm-db/llvm-db.cpp  Tue Apr 18 00:26:10 2006
@@ -50,6 +50,7 @@
 // main Driver function
 //
 int main(int argc, char **argv, char * const *envp) {
+  std::cout << "NOTE: llvm-db is known useless right now.\n";
   try {
 cl::ParseCommandLineOptions(argc, argv,
 " llvm source-level debugger\n");





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCISelLowering.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCISelLowering.cpp updated: 1.163 -> 1.164
---
Log message:

Use vmladduhm to do v8i16 multiplies, which is faster and simpler than doing
even/odd halves.  Thanks to Nate for telling me what's what.
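
As a rough illustration of why a single vmladduhm suffices, here is a scalar
C++ sketch of the instruction's per-lane behaviour: multiply two 16-bit lanes,
add a third, and keep the low 16 bits. The type alias and function names
(V8i16, vmladduhm_model, mul_v8i16) are made up for this sketch and are not
LLVM or Altivec APIs. Passing a zero addend, as the patch does with its Zero
splat, leaves just the truncated product.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Eight unsigned 16-bit lanes, standing in for an Altivec v8i16 register.
using V8i16 = std::array<uint16_t, 8>;

// Hypothetical scalar model of vmladduhm: per lane, (a * b + c) mod 2^16.
V8i16 vmladduhm_model(const V8i16 &a, const V8i16 &b, const V8i16 &c) {
  V8i16 r{};
  for (unsigned i = 0; i != 8; ++i) {
    // Widen to 32 bits so the intermediate product cannot overflow.
    uint32_t wide = static_cast<uint32_t>(a[i]) * b[i] + c[i];
    r[i] = static_cast<uint16_t>(wide);  // keep only the low 16 bits
  }
  return r;
}

// A v8i16 multiply is then a multiply-add with a zero addend.
V8i16 mul_v8i16(const V8i16 &a, const V8i16 &b) {
  return vmladduhm_model(a, b, V8i16{});
}
```

Because the low 16 bits of a product are the same for signed and unsigned
operands, this one form covers both interpretations of the lanes.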



---
Diffs of the changes:  (+3 -18)

 PPCISelLowering.cpp |   21 +++--
 1 files changed, 3 insertions(+), 18 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
diff -u llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.163 llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.164
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.163   Mon Apr 17 22:57:35 2006
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp Mon Apr 17 23:28:57 2006
@@ -1602,25 +1602,10 @@
   } else if (Op.getValueType() == MVT::v8i16) {
 SDOperand LHS = Op.getOperand(0), RHS = Op.getOperand(1);
 
-// Multiply the even 16-bit parts, producing 32-bit sums.
-SDOperand EvenParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmuleuh,
-   LHS, RHS, DAG, MVT::v4i32);
-EvenParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, EvenParts);
-
-// Multiply the odd 16-bit parts, producing 32-bit sums.
-SDOperand OddParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmulouh,
-  LHS, RHS, DAG, MVT::v4i32);
-OddParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, OddParts);
+SDOperand Zero = BuildSplatI(0, 1, MVT::v8i16, DAG);
 
-// Merge the results together.
-std::vector Ops;
-for (unsigned i = 0; i != 4; ++i) {
-  Ops.push_back(DAG.getConstant(2*i+1, MVT::i16));
-  Ops.push_back(DAG.getConstant(2*i+1+8, MVT::i16));
-}
-
-return DAG.getNode(ISD::VECTOR_SHUFFLE, MVT::v8i16, EvenParts, OddParts,
-   DAG.getNode(ISD::BUILD_VECTOR, MVT::v8i16, Ops));
+return BuildIntrinsicOp(Intrinsic::ppc_altivec_vmladduhm,
+LHS, RHS, Zero, DAG);
   } else if (Op.getValueType() == MVT::v16i8) {
 SDOperand LHS = Op.getOperand(0), RHS = Op.getOperand(1);
 





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCISelLowering.cpp README_ALTIVEC.txt

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCISelLowering.cpp updated: 1.162 -> 1.163
README_ALTIVEC.txt updated: 1.27 -> 1.28
---
Log message:

Implement v16i8 multiply with this code:

vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4

This implements CodeGen/PowerPC/vec_mul.ll.  With this, v16i8 multiplies are
6.79x faster than before.

Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.

Remove the 'integer multiplies' todo from the README file.
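
The even/odd approach can be sketched in scalar C++ as follows. This is only
a model under stated assumptions: the helper names (V16i8, mul_bytes_widened,
mul_v16i8) are invented for the illustration, and each 16-bit product is laid
out big-endian (high byte first), as on PowerPC. The final loop plays the role
of the vperm, picking the low byte of every product using the same indices as
the patch (2*i+1 from the first operand, 2*i+1+16 from the second).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Sixteen unsigned 8-bit lanes, standing in for an Altivec v16i8 register.
using V16i8 = std::array<uint8_t, 16>;

// Hypothetical model of vmuleub (parity 0) or vmuloub (parity 1): multiply
// the even or odd bytes and return the eight 16-bit products viewed as
// sixteen bytes in big-endian order (high byte, then low byte).
static V16i8 mul_bytes_widened(const V16i8 &a, const V16i8 &b,
                               unsigned parity) {
  V16i8 out{};
  for (unsigned k = 0; k != 8; ++k) {
    unsigned prod = static_cast<unsigned>(a[2 * k + parity]) *
                    b[2 * k + parity];
    out[2 * k] = static_cast<uint8_t>(prod >> 8);        // high byte
    out[2 * k + 1] = static_cast<uint8_t>(prod & 0xFF);  // low byte
  }
  return out;
}

// Model of the whole lowering: two widening multiplies plus one permute.
V16i8 mul_v16i8(const V16i8 &a, const V16i8 &b) {
  V16i8 even = mul_bytes_widened(a, b, 0);
  V16i8 odd = mul_bytes_widened(a, b, 1);
  V16i8 r{};
  for (unsigned i = 0; i != 8; ++i) {
    r[2 * i] = even[2 * i + 1];     // shuffle index 2*i+1 (first operand)
    r[2 * i + 1] = odd[2 * i + 1];  // shuffle index 2*i+1+16 (second operand)
  }
  return r;
}
```

Each result lane ends up holding the low 8 bits of the corresponding byte
product, which is what a v16i8 multiply is expected to produce.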



---
Diffs of the changes:  (+25 -11)

 PPCISelLowering.cpp |   27 +--
 README_ALTIVEC.txt  |9 -
 2 files changed, 25 insertions(+), 11 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
diff -u llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.162 llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.163
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.162   Mon Apr 17 22:43:48 2006
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp Mon Apr 17 22:57:35 2006
@@ -229,6 +229,7 @@
 setOperationAction(ISD::MUL, MVT::v4f32, Legal);
 setOperationAction(ISD::MUL, MVT::v4i32, Custom);
 setOperationAction(ISD::MUL, MVT::v8i16, Custom);
+setOperationAction(ISD::MUL, MVT::v16i8, Custom);
 
 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4f32, Custom);
 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4i32, Custom);
@@ -1601,12 +1602,12 @@
   } else if (Op.getValueType() == MVT::v8i16) {
 SDOperand LHS = Op.getOperand(0), RHS = Op.getOperand(1);
 
-// Multiply the even 16-parts, producing 32-bit sums.
+// Multiply the even 16-bit parts, producing 32-bit sums.
 SDOperand EvenParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmuleuh,
LHS, RHS, DAG, MVT::v4i32);
 EvenParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, EvenParts);
 
-// Multiply the odd 16-parts, producing 32-bit sums.
+// Multiply the odd 16-bit parts, producing 32-bit sums.
 SDOperand OddParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmulouh,
   LHS, RHS, DAG, MVT::v4i32);
 OddParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, OddParts);
@@ -1620,6 +1621,28 @@
 
 return DAG.getNode(ISD::VECTOR_SHUFFLE, MVT::v8i16, EvenParts, OddParts,
DAG.getNode(ISD::BUILD_VECTOR, MVT::v8i16, Ops));
+  } else if (Op.getValueType() == MVT::v16i8) {
+SDOperand LHS = Op.getOperand(0), RHS = Op.getOperand(1);
+
+// Multiply the even 8-bit parts, producing 16-bit sums.
+SDOperand EvenParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmuleub,
+   LHS, RHS, DAG, MVT::v8i16);
+EvenParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v16i8, EvenParts);
+
+// Multiply the odd 8-bit parts, producing 16-bit sums.
+SDOperand OddParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmuloub,
+  LHS, RHS, DAG, MVT::v8i16);
+OddParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v16i8, OddParts);
+
+// Merge the results together.
+std::vector Ops;
+for (unsigned i = 0; i != 8; ++i) {
+  Ops.push_back(DAG.getConstant(2*i+1, MVT::i8));
+  Ops.push_back(DAG.getConstant(2*i+1+16, MVT::i8));
+}
+
+return DAG.getNode(ISD::VECTOR_SHUFFLE, MVT::v16i8, EvenParts, OddParts,
+   DAG.getNode(ISD::BUILD_VECTOR, MVT::v16i8, Ops));
   } else {
 assert(0 && "Unknown mul to lower!");
 abort();


Index: llvm/lib/Target/PowerPC/README_ALTIVEC.txt
diff -u llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.27 llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.28
--- llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.27 Mon Apr 17 16:52:03 2006
+++ llvm/lib/Target/PowerPC/README_ALTIVEC.txt  Mon Apr 17 22:57:35 2006
@@ -75,15 +75,6 @@
 
 
//===--===//
 
-Implement multiply for vector integer types, to avoid the horrible scalarized
-code produced by legalize.
-
-void test(vector int *X, vector int *Y) {
-  *X = *X * *Y;
-}
-
-//===--===//
-
 extract_vector_elt of an arbitrary constant vector can be done with the 
 following instructions:
 





[llvm-commits] CVS: llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll

2006-04-17 Thread Chris Lattner


Changes in directory llvm/test/Regression/CodeGen/PowerPC:

vec_mul.ll updated: 1.1 -> 1.2
---
Log message:

Add tests for v8i16 and v16i8


---
Diffs of the changes:  (+16 -2)

 vec_mul.ll |   18 --
 1 files changed, 16 insertions(+), 2 deletions(-)


Index: llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll
diff -u llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll:1.1 llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll:1.2
--- llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll:1.1 Mon Apr 17 22:22:16 2006
+++ llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll Mon Apr 17 22:54:50 2006
@@ -1,11 +1,25 @@
 ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 &&
-; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | not grep mullw
+; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | not grep mullw &&
 ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | grep vmsumuhm
 
-<4 x int> %test(<4 x int>* %X, <4 x int>* %Y) {
+<4 x int> %test_v4i32(<4 x int>* %X, <4 x int>* %Y) {
 %tmp = load <4 x int>* %X
 %tmp2 = load <4 x int>* %Y
 %tmp3 = mul <4 x int> %tmp, %tmp2
 ret <4 x int> %tmp3
 }
 
+<8 x short> %test_v8i16(<8 x short>* %X, <8 x short>* %Y) {
+%tmp = load <8 x short>* %X
+%tmp2 = load <8 x short>* %Y
+%tmp3 = mul <8 x short> %tmp, %tmp2
+ret <8 x short> %tmp3
+}
+
+<16 x sbyte> %test_v16i8(<16 x sbyte>* %X, <16 x sbyte>* %Y) {
+%tmp = load <16 x sbyte>* %X
+%tmp2 = load <16 x sbyte>* %Y
+%tmp3 = mul <16 x sbyte> %tmp, %tmp2
+ret <16 x sbyte> %tmp3
+}
+





[llvm-commits] CVS: llvm/lib/Target/X86/README.txt

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.91 -> 1.92
---
Log message:

Correct comments

---
Diffs of the changes:  (+6 -6)

 README.txt |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.91 llvm/lib/Target/X86/README.txt:1.92
--- llvm/lib/Target/X86/README.txt:1.91 Mon Apr 17 20:22:57 2006
+++ llvm/lib/Target/X86/README.txt  Mon Apr 17 22:45:01 2006
@@ -982,17 +982,17 @@
jne LBB_main_4  # cond_true44
 
 There are two problems. 1) No need for two loop induction variables. We can
-compare against 262144 * 16. 2) Poor register allocation decisions. We should
+compare against 262144 * 16. 2) Known register coalescer issue. We should
 be able to eliminate one of the movaps:
 
-   addps %xmm1, %xmm2
-   subps %xmm3, %xmm2
+   addps %xmm2, %xmm1<=== Commute!
+   subps %xmm3, %xmm1
movaps (%ecx), %xmm4
-   movaps %xmm2, %xmm2   <=== Eliminate!
-   addps %xmm4, %xmm2
+   movaps %xmm1, %xmm1   <=== Eliminate!
+   addps %xmm4, %xmm1
addl $16, %ecx
incl %edx
cmpl $262144, %edx
-   movaps %xmm3, %xmm1
+   movaps %xmm3, %xmm2
movaps %xmm4, %xmm3
jne LBB_main_4  # cond_true44





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCISelLowering.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCISelLowering.cpp updated: 1.161 -> 1.162
---
Log message:

Lower v8i16 multiply into this code:

li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4

where v4 is:
LCPI1_0:;  <16 x ubyte>
.byte   2
.byte   3
.byte   18
.byte   19
.byte   6
.byte   7
.byte   22
.byte   23
.byte   10
.byte   11
.byte   26
.byte   27
.byte   14
.byte   15
.byte   30
.byte   31

This is 5.07x faster on the G5 (measured) than lowering to scalar code + 
loads/stores.
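
The constant-pool vector listed above follows directly from the halfword
shuffle indices used in the patch (2*i+1 and 2*i+1+8). A small C++ sketch,
with a made-up function name (build_v8i16_merge_mask), shows how each
halfword index h expands to the byte pair 2*h, 2*h+1 of the vperm control
vector; it assumes the usual vperm numbering in which bytes 0-15 select from
the first source and 16-31 from the second.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical helper: expand the v8i16 shuffle indices into the 16-byte
// vperm control vector.
std::array<uint8_t, 16> build_v8i16_merge_mask() {
  std::array<uint8_t, 16> mask{};
  unsigned out = 0;
  for (unsigned i = 0; i != 4; ++i) {
    unsigned fromEven = 2 * i + 1;     // low halfword of even product i
    unsigned fromOdd = 2 * i + 1 + 8;  // low halfword of odd product i
    for (unsigned h : {fromEven, fromOdd}) {
      // Each halfword occupies two consecutive bytes.
      mask[out++] = static_cast<uint8_t>(2 * h);
      mask[out++] = static_cast<uint8_t>(2 * h + 1);
    }
  }
  return mask;
}
```

The first four bytes come out as 2, 3, 18, 19, matching the start of
LCPI1_0 in the log message.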



---
Diffs of the changes:  (+51 -25)

 PPCISelLowering.cpp |   76 ++--
 1 files changed, 51 insertions(+), 25 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
diff -u llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.161 llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.162
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.161   Mon Apr 17 22:24:30 2006
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp Mon Apr 17 22:43:48 2006
@@ -228,6 +228,7 @@
 
 setOperationAction(ISD::MUL, MVT::v4f32, Legal);
 setOperationAction(ISD::MUL, MVT::v4i32, Custom);
+setOperationAction(ISD::MUL, MVT::v8i16, Custom);
 
 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4f32, Custom);
 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4i32, Custom);
@@ -1573,31 +1574,56 @@
 }
 
 static SDOperand LowerMUL(SDOperand Op, SelectionDAG &DAG) {
-  assert(Op.getValueType() == MVT::v4i32 && "Unknown mul to lower!");
-  SDOperand LHS = Op.getOperand(0);
-  SDOperand RHS = Op.getOperand(1);
-  
-  SDOperand Zero  = BuildSplatI(  0, 1, MVT::v4i32, DAG);
-  SDOperand Neg16 = BuildSplatI(-16, 4, MVT::v4i32, DAG);  // +16 as shift amt.
-  
-  SDOperand RHSSwap =   // = vrlw RHS, 16
-BuildIntrinsicOp(Intrinsic::ppc_altivec_vrlw, RHS, Neg16, DAG);
-  
-  // Shrinkify inputs to v8i16.
-  LHS = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, LHS);
-  RHS = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, RHS);
-  RHSSwap = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, RHSSwap);
-  
-  // Low parts multiplied together, generating 32-bit results (we ignore the top
-  // parts).
-  SDOperand LoProd = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmulouh,
-  LHS, RHS, DAG, MVT::v4i32);
-  
-  SDOperand HiProd = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmsumuhm,
-  LHS, RHSSwap, Zero, DAG, MVT::v4i32);
-  // Shift the high parts up 16 bits.
-  HiProd = BuildIntrinsicOp(Intrinsic::ppc_altivec_vslw, HiProd, Neg16, DAG);
-  return DAG.getNode(ISD::ADD, MVT::v4i32, LoProd, HiProd);
+  if (Op.getValueType() == MVT::v4i32) {
+SDOperand LHS = Op.getOperand(0), RHS = Op.getOperand(1);
+
+SDOperand Zero  = BuildSplatI(  0, 1, MVT::v4i32, DAG);
+SDOperand Neg16 = BuildSplatI(-16, 4, MVT::v4i32, DAG); // +16 as shift amt.
+
+SDOperand RHSSwap =   // = vrlw RHS, 16
+  BuildIntrinsicOp(Intrinsic::ppc_altivec_vrlw, RHS, Neg16, DAG);
+
+// Shrinkify inputs to v8i16.
+LHS = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, LHS);
+RHS = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, RHS);
+RHSSwap = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, RHSSwap);
+
+// Low parts multiplied together, generating 32-bit results (we ignore the
+// top parts).
+SDOperand LoProd = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmulouh,
+LHS, RHS, DAG, MVT::v4i32);
+
+SDOperand HiProd = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmsumuhm,
+LHS, RHSSwap, Zero, DAG, MVT::v4i32);
+// Shift the high parts up 16 bits.
+HiProd = BuildIntrinsicOp(Intrinsic::ppc_altivec_vslw, HiProd, Neg16, DAG);
+return DAG.getNode(ISD::ADD, MVT::v4i32, LoProd, HiProd);
+  } else if (Op.getValueType() == MVT::v8i16) {
+SDOperand LHS = Op.getOperand(0), RHS = Op.getOperand(1);
+
+// Multiply the even 16-parts, producing 32-bit sums.
+SDOperand EvenParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmuleuh,
+   LHS, RHS, DAG, MVT::v4i32);
+EvenParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, EvenParts);
+
+// Multiply the odd 16-parts, producing 32-bit sums.
+SDOperand OddParts = BuildIntrinsicOp(Intrinsic::ppc_altivec_vmulouh,
+  LHS, RHS, DAG, MVT::v4i32);
+OddParts = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, OddParts);
+
+// Merge the results together.
+std::vector Ops;
+for (unsigned i = 0; i != 4; ++i) {
+  Ops.push_back(DAG.getConstant(2*i+1, MVT::i16));
+  Ops.push_back(DAG.getConstant(2*i+1+8, MVT::i16));
+}
+
+return DAG.get

[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCISelLowering.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCISelLowering.cpp updated: 1.160 -> 1.161
---
Log message:

Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll
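
For readers who want the arithmetic behind the sequence, here is a scalar C++
sketch of what happens in one 32-bit lane. It is a model only: the function
names (rotl16, mul32_via_halfwords) are invented for the illustration, and the
comments map each step to the instruction it stands in for. The idea is that
a*b mod 2^32 equals aLo*bLo plus (aHi*bLo + aLo*bHi) shifted left by 16; the
aHi*bHi term falls entirely outside the low 32 bits and is never computed.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word by 16 bits, i.e. swap its two halfwords.
static uint32_t rotl16(uint32_t x) { return (x << 16) | (x >> 16); }

// Hypothetical scalar model of one v4i32 lane of the custom lowering.
uint32_t mul32_via_halfwords(uint32_t a, uint32_t b) {
  uint32_t aHi = a >> 16, aLo = a & 0xFFFFu;

  // vrlw RHS, 16: after the swap, the high half holds bLo and the low
  // half holds bHi.
  uint32_t bSwap = rotl16(b);
  uint32_t sHi = bSwap >> 16, sLo = bSwap & 0xFFFFu;

  // vmulouh: full 32-bit product of the two low halfwords.
  uint32_t loProd = aLo * (b & 0xFFFFu);

  // vmsumuhm with a zero accumulator: aHi*bLo + aLo*bHi, modulo 2^32.
  uint32_t hiProd = aHi * sHi + aLo * sLo;

  // vslw by 16, then the final vector add.
  hiProd <<= 16;
  return loProd + hiProd;
}
```

Unsigned 32-bit arithmetic wraps in C++, which mirrors the modulo behaviour
the vector instructions rely on.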


---
Diffs of the changes:  (+53 -10)

 PPCISelLowering.cpp |   63 +++-
 1 files changed, 53 insertions(+), 10 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
diff -u llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.160 llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.161
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.160   Mon Apr 17 13:09:22 2006
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp Mon Apr 17 22:24:30 2006
@@ -227,6 +227,7 @@
 addRegisterClass(MVT::v16i8, PPC::VRRCRegisterClass);
 
 setOperationAction(ISD::MUL, MVT::v4f32, Legal);
+setOperationAction(ISD::MUL, MVT::v4i32, Custom);
 
 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4f32, Custom);
 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v4i32, Custom);
@@ -1062,14 +1063,27 @@
   return DAG.getNode(ISD::BIT_CONVERT, VT, Res);
 }
 
-/// BuildIntrinsicBinOp - Return a binary operator intrinsic node with the
+/// BuildIntrinsicOp - Return a binary operator intrinsic node with the
 /// specified intrinsic ID.
-static SDOperand BuildIntrinsicBinOp(unsigned IID, SDOperand LHS, SDOperand RHS,
- SelectionDAG &DAG) {
-  return DAG.getNode(ISD::INTRINSIC_WO_CHAIN, LHS.getValueType(),
+static SDOperand BuildIntrinsicOp(unsigned IID, SDOperand LHS, SDOperand RHS,
+  SelectionDAG &DAG, 
+  MVT::ValueType DestVT = MVT::Other) {
+  if (DestVT == MVT::Other) DestVT = LHS.getValueType();
+  return DAG.getNode(ISD::INTRINSIC_WO_CHAIN, DestVT,
  DAG.getConstant(IID, MVT::i32), LHS, RHS);
 }
 
+/// BuildIntrinsicOp - Return a ternary operator intrinsic node with the
+/// specified intrinsic ID.
+static SDOperand BuildIntrinsicOp(unsigned IID, SDOperand Op0, SDOperand Op1,
+  SDOperand Op2, SelectionDAG &DAG, 
+  MVT::ValueType DestVT = MVT::Other) {
+  if (DestVT == MVT::Other) DestVT = Op0.getValueType();
+  return DAG.getNode(ISD::INTRINSIC_WO_CHAIN, DestVT,
+ DAG.getConstant(IID, MVT::i32), Op0, Op1, Op2);
+}
+
+
 /// BuildVSLDOI - Return a VECTOR_SHUFFLE that is a vsldoi of the specified
 /// amount.  The result has the specified value type.
 static SDOperand BuildVSLDOI(SDOperand LHS, SDOperand RHS, unsigned Amt,
@@ -1145,8 +1159,8 @@
   SDOperand OnesV = BuildSplatI(-1, 4, MVT::v4i32, DAG);
   
   // Make the VSLW intrinsic, computing 0x8000_.
-  SDOperand Res = BuildIntrinsicBinOp(Intrinsic::ppc_altivec_vslw, OnesV, 
-  OnesV, DAG);
+  SDOperand Res = BuildIntrinsicOp(Intrinsic::ppc_altivec_vslw, OnesV, 
+   OnesV, DAG);
   
   // xor by OnesV to invert it.
   Res = DAG.getNode(ISD::XOR, MVT::v4i32, Res, OnesV);
@@ -1175,7 +1189,7 @@
   Intrinsic::ppc_altivec_vslb, Intrinsic::ppc_altivec_vslh, 0,
   Intrinsic::ppc_altivec_vslw
 };
-return BuildIntrinsicBinOp(IIDs[SplatSize-1], Op, Op, DAG);
+return BuildIntrinsicOp(IIDs[SplatSize-1], Op, Op, DAG);
   }
   
   // vsplti + srl self.
@@ -1185,7 +1199,7 @@
   Intrinsic::ppc_altivec_vsrb, Intrinsic::ppc_altivec_vsrh, 0,
   Intrinsic::ppc_altivec_vsrw
 };
-return BuildIntrinsicBinOp(IIDs[SplatSize-1], Op, Op, DAG);
+return BuildIntrinsicOp(IIDs[SplatSize-1], Op, Op, DAG);
   }
   
   // vsplti + sra self.
@@ -1195,7 +1209,7 @@
   Intrinsic::ppc_altivec_vsrab, Intrinsic::ppc_altivec_vsrah, 0,
   Intrinsic::ppc_altivec_vsraw
 };
-return BuildIntrinsicBinOp(IIDs[SplatSize-1], Op, Op, DAG);
+return BuildIntrinsicOp(IIDs[SplatSize-1], Op, Op, DAG);
   }
   
   // vsplti + rol self.
@@ -1206,7 +1220,7 @@
   Intrinsic::ppc_altivec_vrlb, Intrinsic::ppc_altivec_vrlh, 0,
   Intrinsic::ppc_altivec_vrlw
 };
-return BuildIntrinsicBinOp(IIDs[SplatSize-1], Op, Op, DAG);
+return BuildIntrinsicOp(IIDs[SplatSize-1], Op, Op, DAG);
   }
 
   // t = vsplti c, result = vsldoi t, t, 1
@@ -1558,6 +1572,34 @@
   return DAG.getLoad(Op.getValueType(), Store, FIdx, DAG.getSrcValue(NULL));
 }
 
+static SDOperand LowerMUL(SDOperand Op, SelectionDAG &DAG) {
+  assert(Op.getValueType() == MVT::v4i32 && "Unknown mul to lower!");
+  SDOperand LHS = Op.getOperand(0);
+  SDOperand RHS = Op.getOperand(1);
+  
+  SDOperand Zero  = BuildSplatI(  0, 1, MVT::v4i32, DAG);
+  SDOperand Neg16 = BuildS

[llvm-commits] CVS: llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll

2006-04-17 Thread Chris Lattner


Changes in directory llvm/test/Regression/CodeGen/PowerPC:

vec_mul.ll added (r1.1)
---
Log message:

new testcase


---
Diffs of the changes:  (+11 -0)

 vec_mul.ll |   11 +++
 1 files changed, 11 insertions(+)


Index: llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll
diff -c /dev/null llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll:1.1
*** /dev/null   Mon Apr 17 22:22:26 2006
--- llvm/test/Regression/CodeGen/PowerPC/vec_mul.ll Mon Apr 17 22:22:16 2006
***
*** 0 
--- 1,11 
+ ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 &&
+ ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | not grep mullw
+ ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | grep vmsumuhm
+ 
+ <4 x int> %test(<4 x int>* %X, <4 x int>* %Y) {
+ %tmp = load <4 x int>* %X
+ %tmp2 = load <4 x int>* %Y
+ %tmp3 = mul <4 x int> %tmp, %tmp2
+ ret <4 x int> %tmp3
+ }
+ 



___
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


[llvm-commits] CVS: llvm/lib/Target/X86/README.txt

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.90 -> 1.91
---
Log message:

Another entry

---
Diffs of the changes:  (+35 -0)

 README.txt |   35 +++
 1 files changed, 35 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.90 llvm/lib/Target/X86/README.txt:1.91
--- llvm/lib/Target/X86/README.txt:1.90 Mon Apr 17 19:21:01 2006
+++ llvm/lib/Target/X86/README.txt  Mon Apr 17 20:22:57 2006
@@ -961,3 +961,38 @@
 to three-address transformation.
 
 It also exposes some other problems. See MOV32ri -3 and the spills.
+
+//===-===//
+
+http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25500
+
+LLVM is producing bad code.
+
+LBB_main_4:# cond_true44
+   addps %xmm1, %xmm2
+   subps %xmm3, %xmm2
+   movaps (%ecx), %xmm4
+   movaps %xmm2, %xmm1
+   addps %xmm4, %xmm1
+   addl $16, %ecx
+   incl %edx
+   cmpl $262144, %edx
+   movaps %xmm3, %xmm2
+   movaps %xmm4, %xmm3
+   jne LBB_main_4  # cond_true44
+
+There are two problems. 1) There is no need for two loop induction variables;
+we can compare against 262144 * 16. 2) Poor register allocation decisions. We
+should be able to eliminate one of the movaps:
+
+   addps %xmm1, %xmm2
+   subps %xmm3, %xmm2
+   movaps (%ecx), %xmm4
+   movaps %xmm2, %xmm2   <=== Eliminate!
+   addps %xmm4, %xmm2
+   addl $16, %ecx
+   incl %edx
+   cmpl $262144, %edx
+   movaps %xmm3, %xmm1
+   movaps %xmm4, %xmm3
+   jne LBB_main_4  # cond_true44
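
Problem 1 can be illustrated at the source level. This is a minimal sketch, not taken from the README: `Vec4`, `sum_two_ivs` and `sum_one_iv` are hypothetical names, and the loop body is a plain accumulation rather than the addps/subps recurrence above. The second version drops the counter and compares the byte offset against `count * 16`, which is the form the entry asks for (262144 * 16 in the listing).

```cpp
#include <cstddef>

// One 16-byte vector of four floats, standing in for an XMM register.
struct Vec4 { float v[4]; };

// Two induction variables: the pointer and the counter both advance,
// mirroring "addl $16, %ecx" and "incl %edx" in the listing.
float sum_two_ivs(const Vec4 *p, std::size_t count) {
  float total = 0.0f;
  for (std::size_t i = 0; i != count; ++i, ++p)
    for (float x : p->v) total += x;
  return total;
}

// One induction variable: only the byte offset advances, and the exit
// test compares it against count * 16.
float sum_one_iv(const Vec4 *base, std::size_t count) {
  float total = 0.0f;
  const char *bytes = reinterpret_cast<const char *>(base);
  const std::size_t end = count * sizeof(Vec4);
  for (std::size_t off = 0; off != end; off += sizeof(Vec4)) {
    const Vec4 *p = reinterpret_cast<const Vec4 *>(bytes + off);
    for (float x : p->v) total += x;
  }
  return total;
}
```

Both functions visit the same elements in the same order, so they return the same result; the difference is only in how many values the loop has to keep updated.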





[llvm-commits] CVS: llvm/utils/PerfectShuffle/PerfectShuffle.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/utils/PerfectShuffle:

PerfectShuffle.cpp updated: 1.6 -> 1.7
---
Log message:

Fix a build failure on Vladimir's tester.


---
Diffs of the changes:  (+1 -0)

 PerfectShuffle.cpp |1 +
 1 files changed, 1 insertion(+)


Index: llvm/utils/PerfectShuffle/PerfectShuffle.cpp
diff -u llvm/utils/PerfectShuffle/PerfectShuffle.cpp:1.6 
llvm/utils/PerfectShuffle/PerfectShuffle.cpp:1.7
--- llvm/utils/PerfectShuffle/PerfectShuffle.cpp:1.6Mon Apr 17 00:25:16 2006
+++ llvm/utils/PerfectShuffle/PerfectShuffle.cppMon Apr 17 19:21:25 2006
@@ -16,6 +16,7 @@
 
 #include 
 #include 
+#include 
 
 struct Operator;
 





[llvm-commits] CVS: llvm/lib/Target/X86/README.txt

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

README.txt updated: 1.89 -> 1.90
---
Log message:

Another entry.


---
Diffs of the changes:  (+151 -0)

 README.txt |  151 +
 1 files changed, 151 insertions(+)


Index: llvm/lib/Target/X86/README.txt
diff -u llvm/lib/Target/X86/README.txt:1.89 llvm/lib/Target/X86/README.txt:1.90
--- llvm/lib/Target/X86/README.txt:1.89 Sat Apr 15 00:37:34 2006
+++ llvm/lib/Target/X86/README.txt  Mon Apr 17 19:21:01 2006
@@ -810,3 +810,154 @@
 How about andps, andpd, and pand? Do we really care about the type of the packed
 elements? If not, why not always use the "ps" variants which are likely to be
 shorter.
+
+//===-===//
+
+We are emitting bad code for this:
+
+float %test(float* %V, int %I, int %D, float %V) {
+entry:
+   %tmp = seteq int %D, 0
+   br bool %tmp, label %cond_true, label %cond_false23
+
+cond_true:
+   %tmp3 = getelementptr float* %V, int %I
+   %tmp = load float* %tmp3
+   %tmp5 = setgt float %tmp, %V
+   %tmp6 = tail call bool %llvm.isunordered.f32( float %tmp, float %V )
+   %tmp7 = or bool %tmp5, %tmp6
+   br bool %tmp7, label %UnifiedReturnBlock, label %cond_next
+
+cond_next:
+   %tmp10 = add int %I, 1
+   %tmp12 = getelementptr float* %V, int %tmp10
+   %tmp13 = load float* %tmp12
+   %tmp15 = setle float %tmp13, %V
+   %tmp16 = tail call bool %llvm.isunordered.f32( float %tmp13, float %V )
+   %tmp17 = or bool %tmp15, %tmp16
+   %retval = select bool %tmp17, float 0.00e+00, float 1.00e+00
+   ret float %retval
+
+cond_false23:
+   %tmp28 = tail call float %foo( float* %V, int %I, int %D, float %V )
+   ret float %tmp28
+
+UnifiedReturnBlock:; preds = %cond_true
+   ret float 0.00e+00
+}
+
+declare bool %llvm.isunordered.f32(float, float)
+
+declare float %foo(float*, int, int, float)
+
+
+It exposes a known load folding problem:
+
+   movss (%edx,%ecx,4), %xmm1
+   ucomiss %xmm1, %xmm0
+
+As well as this:
+
+LBB_test_2:# cond_next
+   movss LCPI1_0, %xmm2
+   pxor %xmm3, %xmm3
+   ucomiss %xmm0, %xmm1
+   jbe LBB_test_6  # cond_next
+LBB_test_5:# cond_next
+   movaps %xmm2, %xmm3
+LBB_test_6:# cond_next
+   movss %xmm3, 40(%esp)
+   flds 40(%esp)
+   addl $44, %esp
+   ret
+
+Clearly it's unnecessary to clear %xmm3. It's also not clear why we are emitting
+three moves (movss, movaps, movss).
+
+//===-===//
+
+External test Nurbs exposed some problems. Look for
+__ZN15Nurbs_SSE_Cubic17TessellateSurfaceE, bb cond_next140. This is what icc
+emits:
+
+movaps(%edx), %xmm2 #59.21
+movaps(%edx), %xmm5 #60.21
+movaps(%edx), %xmm4 #61.21
+movaps(%edx), %xmm3 #62.21
+movl  40(%ecx), %ebp#69.49
+shufps$0, %xmm2, %xmm5  #60.21
+movl  100(%esp), %ebx   #69.20
+movl  (%ebx), %edi  #69.20
+imull %ebp, %edi#69.49
+addl  (%eax), %edi  #70.33
+shufps$85, %xmm2, %xmm4 #61.21
+shufps$170, %xmm2, %xmm3#62.21
+shufps$255, %xmm2, %xmm2#63.21
+lea   (%ebp,%ebp,2), %ebx   #69.49
+negl  %ebx  #69.49
+lea   -3(%edi,%ebx), %ebx   #70.33
+shll  $4, %ebx  #68.37
+addl  32(%ecx), %ebx#68.37
+testb $15, %bl  #91.13
+jne   L_B1.24   # Prob 5%   #91.13
+
+This is the llvm code after instruction scheduling:
+
+cond_next140 (0xa910740, LLVM BB @0xa90beb0):
+   %reg1078 = MOV32ri -3
+   %reg1079 = ADD32rm %reg1078, %reg1068, 1, %NOREG, 0
+   %reg1037 = MOV32rm %reg1024, 1, %NOREG, 40
+   %reg1080 = IMUL32rr %reg1079, %reg1037
+   %reg1081 = MOV32rm %reg1058, 1, %NOREG, 0
+   %reg1038 = LEA32r %reg1081, 1, %reg1080, -3
+   %reg1036 = MOV32rm %reg1024, 1, %NOREG, 32
+   %reg1082 = SHL32ri %reg1038, 4
+   %reg1039 = ADD32rr %reg1036, %reg1082
+   %reg1083 = MOVAPSrm %reg1059, 1, %NOREG, 0
+   %reg1034 = SHUFPSrr %reg1083, %reg1083, 170
+   %reg1032 = SHUFPSrr %reg1083, %reg1083, 0
+   %reg1035 = SHUFPSrr %reg1083, %reg1083, 255
+   %reg1033 = S

[llvm-commits] CVS: llvm-test/autoconf/configure.ac

2006-04-17 Thread Evan Cheng


Changes in directory llvm-test/autoconf:

configure.ac updated: 1.31 -> 1.32
---
Log message:

Added nurbs as an external test.


---
Diffs of the changes:  (+1 -0)

 configure.ac |1 +
 1 files changed, 1 insertion(+)


Index: llvm-test/autoconf/configure.ac
diff -u llvm-test/autoconf/configure.ac:1.31 
llvm-test/autoconf/configure.ac:1.32
--- llvm-test/autoconf/configure.ac:1.31Fri Apr  7 13:53:21 2006
+++ llvm-test/autoconf/configure.ac Mon Apr 17 19:02:53 2006
@@ -74,6 +74,7 @@
 EXTERNAL_BENCHMARK(sweep3d,${LLVM_EXTERNALS}/sweep3d)
 EXTERNAL_BENCHMARK(fpgrowth,${LLVM_EXTERNALS}/fpgrowth)
 EXTERNAL_BENCHMARK(alp,${LLVM_EXTERNALS}/alp)
+EXTERNAL_BENCHMARK(nurbs,${LLVM_EXTERNALS}/nurbs)
 
 dnl LLC Diff Option
 AC_ARG_ENABLE(llc_diffs,





[llvm-commits] CVS: llvm/lib/Target/X86/X86ISelLowering.cpp X86InstrSSE.td

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86ISelLowering.cpp updated: 1.172 -> 1.173
X86InstrSSE.td updated: 1.99 -> 1.100
---
Log message:

Use movss to insert_vector_elt(v, s, 0).


---
Diffs of the changes:  (+37 -19)

 X86ISelLowering.cpp |   51 ---
 X86InstrSSE.td  |5 +
 2 files changed, 37 insertions(+), 19 deletions(-)


Index: llvm/lib/Target/X86/X86ISelLowering.cpp
diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.172 
llvm/lib/Target/X86/X86ISelLowering.cpp:1.173
--- llvm/lib/Target/X86/X86ISelLowering.cpp:1.172   Mon Apr 17 17:04:06 2006
+++ llvm/lib/Target/X86/X86ISelLowering.cpp Mon Apr 17 17:45:49 2006
@@ -3015,28 +3015,41 @@
 N2 = DAG.getConstant(cast(N2)->getValue(), MVT::i32);
   return DAG.getNode(X86ISD::PINSRW, VT, N0, N1, N2);
 } else if (MVT::getSizeInBits(BaseVT) == 32) {
-  // Use two pinsrw instructions to insert a 32 bit value.
   unsigned Idx = cast(N2)->getValue();
-  Idx <<= 1;
-  if (MVT::isFloatingPoint(N1.getValueType())) {
-if (N1.getOpcode() == ISD::LOAD) {
-  // Just load directly from f32mem to R32.
-  N1 = DAG.getLoad(MVT::i32, N1.getOperand(0), N1.getOperand(1),
-   N1.getOperand(2));
-} else {
-  N1 = DAG.getNode(ISD::SCALAR_TO_VECTOR, MVT::v4f32, N1);
-  N1 = DAG.getNode(ISD::BIT_CONVERT, MVT::v4i32, N1);
-  N1 = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, MVT::i32, N1,
-   DAG.getConstant(0, MVT::i32));
+  if (Idx == 0) {
+// Use a movss.
+N1 = DAG.getNode(ISD::SCALAR_TO_VECTOR, VT, N1);
+MVT::ValueType MaskVT = MVT::getIntVectorWithNumElements(4);
+MVT::ValueType BaseVT = MVT::getVectorBaseType(MaskVT);
+std::vector MaskVec;
+MaskVec.push_back(DAG.getConstant(4, BaseVT));
+for (unsigned i = 1; i <= 3; ++i)
+  MaskVec.push_back(DAG.getConstant(i, BaseVT));
+return DAG.getNode(ISD::VECTOR_SHUFFLE, VT, N0, N1,
+   DAG.getNode(ISD::BUILD_VECTOR, MaskVT, MaskVec));
+  } else {
+// Use two pinsrw instructions to insert a 32 bit value.
+Idx <<= 1;
+if (MVT::isFloatingPoint(N1.getValueType())) {
+  if (N1.getOpcode() == ISD::LOAD) {
+// Just load directly from f32mem to R32.
+N1 = DAG.getLoad(MVT::i32, N1.getOperand(0), N1.getOperand(1),
+ N1.getOperand(2));
+  } else {
+N1 = DAG.getNode(ISD::SCALAR_TO_VECTOR, MVT::v4f32, N1);
+N1 = DAG.getNode(ISD::BIT_CONVERT, MVT::v4i32, N1);
+N1 = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, MVT::i32, N1,
+ DAG.getConstant(0, MVT::i32));
+  }
 }
+N0 = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, N0);
+N0 = DAG.getNode(X86ISD::PINSRW, MVT::v8i16, N0, N1,
+ DAG.getConstant(Idx, MVT::i32));
+N1 = DAG.getNode(ISD::SRL, MVT::i32, N1, DAG.getConstant(16, MVT::i8));
+N0 = DAG.getNode(X86ISD::PINSRW, MVT::v8i16, N0, N1,
+ DAG.getConstant(Idx+1, MVT::i32));
+return DAG.getNode(ISD::BIT_CONVERT, VT, N0);
   }
-  N0 = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, N0);
-  N0 = DAG.getNode(X86ISD::PINSRW, MVT::v8i16, N0, N1,
-   DAG.getConstant(Idx, MVT::i32));
-  N1 = DAG.getNode(ISD::SRL, MVT::i32, N1, DAG.getConstant(16, MVT::i8));
-  N0 = DAG.getNode(X86ISD::PINSRW, MVT::v8i16, N0, N1,
-   DAG.getConstant(Idx+1, MVT::i32));
-  return DAG.getNode(ISD::BIT_CONVERT, VT, N0);
 }
 
 return SDOperand();


Index: llvm/lib/Target/X86/X86InstrSSE.td
diff -u llvm/lib/Target/X86/X86InstrSSE.td:1.99 
llvm/lib/Target/X86/X86InstrSSE.td:1.100
--- llvm/lib/Target/X86/X86InstrSSE.td:1.99 Mon Apr 17 16:33:57 2006
+++ llvm/lib/Target/X86/X86InstrSSE.td  Mon Apr 17 17:45:49 2006
@@ -2414,6 +2414,11 @@
   MOVSLDUP_shuffle_mask)),
   (MOVSLDUPrm addr:$src)>, Requires<[HasSSE3]>;
 
+// vector_shuffle v1, v2 <4, 1, 2, 3>
+def : Pat<(v4i32 (vector_shuffle VR128:$src1, VR128:$src2,
+  MOVS_shuffle_mask)),
+  (MOVLPSrr VR128:$src1, VR128:$src2)>;
+
 // 128-bit logical shifts
 def : Pat<(int_x86_sse2_psll_dq VR128:$src1, imm:$src2),
   (v2i64 (PSLLDQri VR128:$src1, (PSxLDQ_imm imm:$src2)))>,
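
The `<4, 1, 2, 3>` mask used for the movss case can be read as follows. This is a minimal sketch that assumes the usual two-input shuffle convention (mask values 0 to 3 pick from the first vector, 4 to 7 from the second); `vector_shuffle` and `insert_elt0` here are hypothetical stand-ins, not the SelectionDAG API.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<float, 4>;
using Mask4 = std::array<unsigned, 4>;

// Two-input shuffle: mask values 0-3 select from v1, 4-7 select from v2.
V4 vector_shuffle(const V4 &v1, const V4 &v2, const Mask4 &mask) {
  V4 result{};
  for (unsigned i = 0; i != 4; ++i) {
    assert(mask[i] < 8 && "mask element out of range");
    result[i] = mask[i] < 4 ? v1[mask[i]] : v2[mask[i] - 4];
  }
  return result;
}

// insert_vector_elt(v, s, 0) expressed as a shuffle with mask <4, 1, 2, 3>.
// Only lane 0 of the scalar_to_vector operand matters; the zeros in the
// other lanes are placeholders, since the mask never reads them.
V4 insert_elt0(const V4 &v, float s) {
  const V4 scalar_vec = {s, 0.0f, 0.0f, 0.0f};
  return vector_shuffle(v, scalar_vec, Mask4{4, 1, 2, 3});
}
```

Lane 0 of the result comes from the new scalar and lanes 1 to 3 are passed through from the original vector, which is the behaviour the commit maps onto a single move.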





[llvm-commits] CVS: llvm/lib/Transforms/Scalar/InstructionCombining.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Transforms/Scalar:

InstructionCombining.cpp updated: 1.468 -> 1.469
---
Log message:

Turn x86 unaligned load/store intrinsics into aligned load/store instructions
if the pointer is known aligned.


---
Diffs of the changes:  (+16 -1)

 InstructionCombining.cpp |   17 -
 1 files changed, 16 insertions(+), 1 deletion(-)


Index: llvm/lib/Transforms/Scalar/InstructionCombining.cpp
diff -u llvm/lib/Transforms/Scalar/InstructionCombining.cpp:1.468 
llvm/lib/Transforms/Scalar/InstructionCombining.cpp:1.469
--- llvm/lib/Transforms/Scalar/InstructionCombining.cpp:1.468   Sat Apr 15 19:51:47 2006
+++ llvm/lib/Transforms/Scalar/InstructionCombining.cpp Mon Apr 17 17:26:56 2006
@@ -5471,7 +5471,11 @@
 default: break;
 case Intrinsic::ppc_altivec_lvx:
 case Intrinsic::ppc_altivec_lvxl:
-  // Turn lvx -> load if the pointer is known aligned.
+case Intrinsic::x86_sse_loadu_ps:
+case Intrinsic::x86_sse2_loadu_pd:
+case Intrinsic::x86_sse2_loadu_dq:
+  // Turn PPC lvx -> load if the pointer is known aligned.
+  // Turn X86 loadups -> load if the pointer is known aligned.
   if (GetKnownAlignment(II->getOperand(1), TD) >= 16) {
 Value *Ptr = InsertCastBefore(II->getOperand(1),
   PointerType::get(II->getType()), CI);
@@ -5487,6 +5491,17 @@
 return new StoreInst(II->getOperand(1), Ptr);
   }
   break;
+case Intrinsic::x86_sse_storeu_ps:
+case Intrinsic::x86_sse2_storeu_pd:
+case Intrinsic::x86_sse2_storeu_dq:
+case Intrinsic::x86_sse2_storel_dq:
+  // Turn X86 storeu -> store if the pointer is known aligned.
+  if (GetKnownAlignment(II->getOperand(1), TD) >= 16) {
+const Type *OpPtrTy = PointerType::get(II->getOperand(2)->getType());
+Value *Ptr = InsertCastBefore(II->getOperand(1), OpPtrTy, CI);
+return new StoreInst(II->getOperand(2), Ptr);
+  }
+  break;
 case Intrinsic::ppc_altivec_vperm:
   // Turn vperm(V1,V2,mask) -> shuffle(V1,V2,mask) if mask is a constant.
   if (ConstantPacked *Mask = dyn_cast(II->getOperand(3))) {





[llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/CodeGen/SelectionDAG:

LegalizeDAG.cpp updated: 1.367 -> 1.368
---
Log message:

Fix handling of calls in functions that use vectors.  This fixes a crash on
the code in GCC PR26546.


---
Diffs of the changes:  (+1 -13)

 LegalizeDAG.cpp |   14 +-
 1 files changed, 1 insertion(+), 13 deletions(-)


Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.367 
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.368
--- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.367 Mon Apr 17 14:21:01 2006
+++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp   Mon Apr 17 17:10:08 2006
@@ -433,19 +433,7 @@
   if (OperandsLeadToDest) return true;
 
   // Okay, this node looks safe, legalize it and return false.
-  switch (getTypeAction(N->getValueType(0))) {
-  case Legal:
-LegalizeOp(SDOperand(N, 0));
-break;
-  case Promote:
-PromoteOp(SDOperand(N, 0));
-break;
-  case Expand: {
-SDOperand X, Y;
-ExpandOp(SDOperand(N, 0), X, Y);
-break;
-  }
-  }
+  HandleOp(SDOperand(N, 0));
   return false;
 }
 





[llvm-commits] CVS: llvm/lib/Target/X86/X86ISelLowering.cpp

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86ISelLowering.cpp updated: 1.171 -> 1.172
---
Log message:

Use two pinsrw to insert an element into v4i32 / v4f32 vector.

---
Diffs of the changes:  (+30 -3)

 X86ISelLowering.cpp |   33 ++---
 1 files changed, 30 insertions(+), 3 deletions(-)


Index: llvm/lib/Target/X86/X86ISelLowering.cpp
diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.171 
llvm/lib/Target/X86/X86ISelLowering.cpp:1.172
--- llvm/lib/Target/X86/X86ISelLowering.cpp:1.171   Mon Apr 17 15:43:08 2006
+++ llvm/lib/Target/X86/X86ISelLowering.cpp Mon Apr 17 17:04:06 2006
@@ -309,6 +309,9 @@
 setOperationAction(ISD::SCALAR_TO_VECTOR,   MVT::v16i8, Custom);
 setOperationAction(ISD::SCALAR_TO_VECTOR,   MVT::v8i16, Custom);
 setOperationAction(ISD::INSERT_VECTOR_ELT,  MVT::v8i16, Custom);
+setOperationAction(ISD::INSERT_VECTOR_ELT,  MVT::v4i32, Custom);
+// Implement v4f32 insert_vector_elt in terms of SSE2 v8i16 ones.
+setOperationAction(ISD::INSERT_VECTOR_ELT,  MVT::v4f32, Custom);
 
 // Custom lower build_vector, vector_shuffle, and extract_vector_elt.
 for (unsigned VT = (unsigned)MVT::v16i8; VT != (unsigned)MVT::v2i64; VT++) {
@@ -3002,14 +3005,38 @@
 // as its second argument.
 MVT::ValueType VT = Op.getValueType();
 MVT::ValueType BaseVT = MVT::getVectorBaseType(VT);
+SDOperand N0 = Op.getOperand(0);
+SDOperand N1 = Op.getOperand(1);
+SDOperand N2 = Op.getOperand(2);
 if (MVT::getSizeInBits(BaseVT) == 16) {
-  SDOperand N1 = Op.getOperand(1);
-  SDOperand N2 = Op.getOperand(2);
   if (N1.getValueType() != MVT::i32)
 N1 = DAG.getNode(ISD::ANY_EXTEND, MVT::i32, N1);
   if (N2.getValueType() != MVT::i32)
 N2 = DAG.getConstant(cast(N2)->getValue(), MVT::i32);
-  return DAG.getNode(X86ISD::PINSRW, VT, Op.getOperand(0), N1, N2);
+  return DAG.getNode(X86ISD::PINSRW, VT, N0, N1, N2);
+} else if (MVT::getSizeInBits(BaseVT) == 32) {
+  // Use two pinsrw instructions to insert a 32 bit value.
+  unsigned Idx = cast(N2)->getValue();
+  Idx <<= 1;
+  if (MVT::isFloatingPoint(N1.getValueType())) {
+if (N1.getOpcode() == ISD::LOAD) {
+  // Just load directly from f32mem to R32.
+  N1 = DAG.getLoad(MVT::i32, N1.getOperand(0), N1.getOperand(1),
+   N1.getOperand(2));
+} else {
+  N1 = DAG.getNode(ISD::SCALAR_TO_VECTOR, MVT::v4f32, N1);
+  N1 = DAG.getNode(ISD::BIT_CONVERT, MVT::v4i32, N1);
+  N1 = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, MVT::i32, N1,
+   DAG.getConstant(0, MVT::i32));
+}
+  }
+  N0 = DAG.getNode(ISD::BIT_CONVERT, MVT::v8i16, N0);
+  N0 = DAG.getNode(X86ISD::PINSRW, MVT::v8i16, N0, N1,
+   DAG.getConstant(Idx, MVT::i32));
+  N1 = DAG.getNode(ISD::SRL, MVT::i32, N1, DAG.getConstant(16, MVT::i8));
+  N0 = DAG.getNode(X86ISD::PINSRW, MVT::v8i16, N0, N1,
+   DAG.getConstant(Idx+1, MVT::i32));
+  return DAG.getNode(ISD::BIT_CONVERT, VT, N0);
 }
 
 return SDOperand();
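
The two-pinsrw lowering can be modelled outside the compiler. This is a minimal sketch under the assumption of a little-endian lane layout (as on x86); `pinsrw`, `insert32` and `extract32` are hypothetical helpers that mimic the DAG nodes built above, not LLVM functions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V8i16 = std::array<std::uint16_t, 8>;

// Models pinsrw: writes the low 16 bits of `value` into 16-bit lane `idx`.
V8i16 pinsrw(V8i16 v, std::uint32_t value, unsigned idx) {
  assert(idx < 8);
  v[idx] = static_cast<std::uint16_t>(value & 0xFFFFu);
  return v;
}

// Inserts a 32-bit value into 32-bit lane `idx` (0-3) of a vector that is
// viewed as eight 16-bit lanes, using two 16-bit inserts.
V8i16 insert32(V8i16 v, std::uint32_t value, unsigned idx) {
  assert(idx < 4);
  const unsigned lo = idx << 1;        // corresponds to "Idx <<= 1"
  v = pinsrw(v, value, lo);            // low half goes to lane 2*Idx
  v = pinsrw(v, value >> 16, lo + 1);  // high half, after the SRL by 16
  return v;
}

// Reads 32-bit lane `idx` back from the 16-bit view.
std::uint32_t extract32(const V8i16 &v, unsigned idx) {
  return static_cast<std::uint32_t>(v[2 * idx]) |
         (static_cast<std::uint32_t>(v[2 * idx + 1]) << 16);
}
```

The model only covers the integer path; for a floating-point scalar the patch first obtains the same 32 bits as an i32, either by reloading or through the bit_convert and extract sequence.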





[llvm-commits] CVS: llvm/lib/Target/PowerPC/README_ALTIVEC.txt

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

README_ALTIVEC.txt updated: 1.26 -> 1.27
---
Log message:

remove done item


---
Diffs of the changes:  (+2 -19)

 README_ALTIVEC.txt |   21 ++---
 1 files changed, 2 insertions(+), 19 deletions(-)


Index: llvm/lib/Target/PowerPC/README_ALTIVEC.txt
diff -u llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.26 
llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.27
--- llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.26 Mon Apr 17 12:29:41 2006
+++ llvm/lib/Target/PowerPC/README_ALTIVEC.txt  Mon Apr 17 16:52:03 2006
@@ -5,8 +5,8 @@
 
 
//===--===//
 
-Altivec support.  The first should be a single lvx from the constant pool, the
-second should be a xor/stvx:
+The first should be a single lvx from the constant pool, the second should be 
+a xor/stvx:
 
 void foo(void) {
   int x[8] __attribute__((aligned(128))) = { 1, 1, 1, 17, 1, 1, 1, 1 };
@@ -39,23 +39,6 @@
 
 
//===--===//
 
-There are a wide range of vector constants we can generate with combinations of
-altivec instructions.
-
-Examples, these work with all widths:
-  Splat(+/- 16,18,20,22,24,28,30):  t = vspliti I/2,  r = t+t
-  Splat(+/- 17,19,21,23,25,29): t = vsplti +/-15, t2 = vsplti I-15, r=t + t2
-  Splat(31):t = vsplti FB,  r = srl t,t
-  Splat(256):  t = vsplti 1, r = vsldoi t, t, 1
-
-Lots more are listed here:
-http://www.informatik.uni-bremen.de/~hobold/AltiVec.html
-
-This should be added to the ISD::BUILD_VECTOR case in 
-PPCTargetLowering::LowerOperation.
-
-//===--===//
-
 FABS/FNEG can be codegen'd with the appropriate and/xor of -0.0.
 
 
//===--===//





[llvm-commits] CVS: llvm/test/Regression/CodeGen/PowerPC/vec_vrsave.ll

2006-04-17 Thread Chris Lattner


Changes in directory llvm/test/Regression/CodeGen/PowerPC:

vec_vrsave.ll added (r1.1)
---
Log message:

New testcase, shouldn't touch vrsave


---
Diffs of the changes:  (+7 -0)

 vec_vrsave.ll |7 +++
 1 files changed, 7 insertions(+)


Index: llvm/test/Regression/CodeGen/PowerPC/vec_vrsave.ll
diff -c /dev/null llvm/test/Regression/CodeGen/PowerPC/vec_vrsave.ll:1.1
*** /dev/null   Mon Apr 17 16:48:13 2006
--- llvm/test/Regression/CodeGen/PowerPC/vec_vrsave.ll  Mon Apr 17 16:48:03 2006
***
*** 0 
--- 1,7 
+ ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | grep vrlw &&
+ ; RUN: llvm-as < %s | llc -march=ppc32 -mcpu=g5 | not grep spr
+ 
+ <4 x int> %test_rol() {
+ ret <4 x int> < int -11534337, int -11534337, int -11534337, int -11534337 >
+ }
+ 
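
The constant in this test is the one that the PPCRegisterInfo.cpp log messages in this thread show being built as `vspltisw v2, -12` followed by `vrlw v2, v2, v2`. The arithmetic can be checked with a small sketch; `rotl32` and `splat_then_rotate` are hypothetical helper names, and the model assumes that vrlw rotates each word left by the low 5 bits of the matching word in its second operand.

```cpp
#include <cstdint>

// Rotate left by `n` bits; only the low 5 bits of `n` are used.
std::uint32_t rotl32(std::uint32_t x, std::uint32_t n) {
  n &= 31u;
  return n == 0 ? x : (x << n) | (x >> (32u - n));
}

// Value of each 32-bit lane after "vspltisw t, imm" and "vrlw r, t, t".
std::int32_t splat_then_rotate(std::int32_t imm) {
  const std::uint32_t t = static_cast<std::uint32_t>(imm);
  return static_cast<std::int32_t>(rotl32(t, t));
}
```

For an immediate of -12 each lane starts as 0xFFFFFFF4, the rotate amount is 20, and the result is 0xFF4FFFFF, which is -11534337 as a signed value.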





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCRegisterInfo.cpp updated: 1.60 -> 1.61
---
Log message:

Don't diddle VRSAVE if no registers need to be added/removed from it.  This
allows us to codegen functions as:

_test_rol:
vspltisw v2, -12
vrlw v2, v2, v2
blr

instead of:

_test_rol:
mfvrsave r2, 256
mr r3, r2
mtvrsave r3
vspltisw v2, -12
vrlw v2, v2, v2
mtvrsave r2
blr

Testcase here: CodeGen/PowerPC/vec_vrsave.ll



---
Diffs of the changes:  (+53 -4)

 PPCRegisterInfo.cpp |   57 
 1 files changed, 53 insertions(+), 4 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
diff -u llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.60 
llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.61
--- llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.60Mon Apr 17 16:22:06 2006
+++ llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp Mon Apr 17 16:48:13 2006
@@ -25,6 +25,7 @@
 #include "llvm/CodeGen/MachineLocation.h"
 #include "llvm/CodeGen/SelectionDAGNodes.h"
 #include "llvm/Target/TargetFrameInfo.h"
+#include "llvm/Target/TargetInstrInfo.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Target/TargetOptions.h"
 #include "llvm/Support/CommandLine.h"
@@ -346,6 +347,54 @@
  PPC::V24, PPC::V25, PPC::V26, PPC::V27, PPC::V28, PPC::V29, PPC::V30, PPC::V31
 };
 
+/// RemoveVRSaveCode - We have found that this function does not need any code
+/// to manipulate the VRSAVE register, even though it uses vector registers.
+/// This can happen when the only registers used are known to be live in or out
+/// of the function.  Remove all of the VRSAVE related code from the function.
+static void RemoveVRSaveCode(MachineInstr *MI) {
+  MachineBasicBlock *Entry = MI->getParent();
+  MachineFunction *MF = Entry->getParent();
+
+  // We know that the MTVRSAVE instruction immediately follows MI.  Remove it.
+  MachineBasicBlock::iterator MBBI = MI;
+  ++MBBI;
+  assert(MBBI != Entry->end() && MBBI->getOpcode() == PPC::MTVRSAVE);
+  MBBI->eraseFromParent();
+  
+  bool RemovedAllMTVRSAVEs = true;
+  // See if we can find and remove the MTVRSAVE instruction from all of the
+  // epilog blocks.
+  const TargetInstrInfo &TII = *MF->getTarget().getInstrInfo();
+  for (MachineFunction::iterator I = MF->begin(), E = MF->end(); I != E; ++I) {
+// If the last instruction is a return, look for an MTVRSAVE in this epilog.
+if (!I->empty() && TII.isReturn(I->back().getOpcode())) {
+  bool FoundIt = false;
+  for (MBBI = I->end(); MBBI != I->begin(); ) {
+--MBBI;
+if (MBBI->getOpcode() == PPC::MTVRSAVE) {
+  MBBI->eraseFromParent();  // remove it.
+  FoundIt = true;
+  break;
+}
+  }
+  RemovedAllMTVRSAVEs &= FoundIt;
+}
+  }
+
+  // If we found and removed all MTVRSAVE instructions, remove the read of
+  // VRSAVE as well.
+  if (RemovedAllMTVRSAVEs) {
+MBBI = MI;
+assert(MBBI != Entry->begin() && "UPDATE_VRSAVE is first instr in block?");
+--MBBI;
+assert(MBBI->getOpcode() == PPC::MFVRSAVE && "VRSAVE instrs wandered?");
+MBBI->eraseFromParent();
+  }
+  
+  // Finally, nuke the UPDATE_VRSAVE.
+  MI->eraseFromParent();
+}
+
 // HandleVRSaveUpdate - MI is the UPDATE_VRSAVE instruction introduced by the
 // instruction selector.  Based on the vector registers that have been used,
 // transform this into the appropriate ORI instruction.
@@ -375,9 +424,9 @@
   unsigned DstReg = MI->getOperand(0).getReg();
   // If no registers are used, turn this into a copy.
   if (UsedRegMask == 0) {
-if (SrcReg != DstReg)
-  BuildMI(*MI->getParent(), MI, PPC::OR4, 2, DstReg)
-.addReg(SrcReg).addReg(SrcReg);
+// Remove all VRSAVE code.
+RemoveVRSaveCode(MI);
+return;
   } else if ((UsedRegMask & 0x) == UsedRegMask) {
 BuildMI(*MI->getParent(), MI, PPC::ORI, 2, DstReg)
 .addReg(SrcReg).addImm(UsedRegMask);
@@ -392,7 +441,7 @@
   }
   
   // Remove the old UPDATE_VRSAVE instruction.
-  MI->getParent()->erase(MI);
+  MI->eraseFromParent();
 }
 
 





[llvm-commits] CVS: llvm/include/llvm/CodeGen/MachineInstr.h

2006-04-17 Thread Chris Lattner


Changes in directory llvm/include/llvm/CodeGen:

MachineInstr.h updated: 1.165 -> 1.166
---
Log message:

Add some convenience methods.


---
Diffs of the changes:  (+10 -0)

 MachineInstr.h |   10 ++
 1 files changed, 10 insertions(+)


Index: llvm/include/llvm/CodeGen/MachineInstr.h
diff -u llvm/include/llvm/CodeGen/MachineInstr.h:1.165 
llvm/include/llvm/CodeGen/MachineInstr.h:1.166
--- llvm/include/llvm/CodeGen/MachineInstr.h:1.165  Sat Feb 25 03:52:55 2006
+++ llvm/include/llvm/CodeGen/MachineInstr.hMon Apr 17 16:35:08 2006
@@ -499,6 +499,16 @@
   /// clone - Create a copy of 'this' instruction that is identical in
   /// all ways except that the instruction has no parent, prev, or next.
   MachineInstr* clone() const;
+  
+  /// removeFromParent - This method unlinks 'this' from the containing basic
+  /// block, and returns it, but does not delete it.
+  MachineInstr *removeFromParent();
+  
+  /// eraseFromParent - This method unlinks 'this' from the containing basic
+  /// block and deletes it.
+  void eraseFromParent() {
+delete removeFromParent();
+  }
 
   //
   // Debugging support
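
The split between the two new methods is about ownership: one unlinks and hands the object back, the other unlinks and destroys it. This is a minimal sketch of the same pattern using `std::list` rather than LLVM's own list types; `Instr` and `Block` are hypothetical stand-ins for MachineInstr and MachineBasicBlock.

```cpp
#include <cassert>
#include <list>

struct Block;

struct Instr {
  Block *parent = nullptr;

  // Unlinks this instruction from its block and returns it; the caller
  // now owns the object.
  Instr *removeFromParent();

  // Unlinks this instruction from its block and deletes it.
  void eraseFromParent() { delete removeFromParent(); }
};

struct Block {
  std::list<Instr *> instrs;

  void push_back(Instr *i) {
    i->parent = this;
    instrs.push_back(i);
  }
};

Instr *Instr::removeFromParent() {
  assert(parent && "Not embedded in a basic block!");
  parent->instrs.remove(this);
  parent = nullptr;
  return this;
}
```

After `eraseFromParent()` the pointer is dangling and must not be used again, while the result of `removeFromParent()` can be reinserted elsewhere or deleted later.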





[llvm-commits] CVS: llvm/lib/CodeGen/MachineInstr.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/CodeGen:

MachineInstr.cpp updated: 1.107 -> 1.108
---
Log message:

Add a MachineInstr::eraseFromParent convenience method.


---
Diffs of the changes:  (+9 -0)

 MachineInstr.cpp |9 +
 1 files changed, 9 insertions(+)


Index: llvm/lib/CodeGen/MachineInstr.cpp
diff -u llvm/lib/CodeGen/MachineInstr.cpp:1.107 
llvm/lib/CodeGen/MachineInstr.cpp:1.108
--- llvm/lib/CodeGen/MachineInstr.cpp:1.107 Thu Apr 21 17:33:33 2005
+++ llvm/lib/CodeGen/MachineInstr.cpp   Mon Apr 17 16:35:41 2006
@@ -100,6 +100,15 @@
   return new MachineInstr(*this);
 }
 
+/// removeFromParent - This method unlinks 'this' from the containing basic
+/// block, and returns it, but does not delete it.
+MachineInstr *MachineInstr::removeFromParent() {
+  assert(getParent() && "Not embedded in a basic block!");
+  getParent()->remove(this);
+  return this;
+}
+
+
 /// OperandComplete - Return true if it's illegal to add a new operand
 ///
 bool MachineInstr::OperandsComplete() const {





[llvm-commits] CVS: llvm/lib/Target/X86/X86InstrSSE.td

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86InstrSSE.td updated: 1.98 -> 1.99
---
Log message:

Encoding bug

---
Diffs of the changes:  (+1 -1)

 X86InstrSSE.td |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm/lib/Target/X86/X86InstrSSE.td
diff -u llvm/lib/Target/X86/X86InstrSSE.td:1.98 
llvm/lib/Target/X86/X86InstrSSE.td:1.99
--- llvm/lib/Target/X86/X86InstrSSE.td:1.98 Mon Apr 17 13:05:01 2006
+++ llvm/lib/Target/X86/X86InstrSSE.td  Mon Apr 17 16:33:57 2006
@@ -2179,7 +2179,7 @@
  "movsd {$src, $dst|$dst, $src}",
  [(set FR64:$dst, (vector_extract (v2f64 VR128:$src),
(i32 0)))]>;
-def MOVPDI2DIrr  : PDI<0x6E, MRMSrcReg, (ops R32:$dst, VR128:$src),
+def MOVPDI2DIrr  : PDI<0x7E, MRMSrcReg, (ops R32:$dst, VR128:$src),
"movd {$src, $dst|$dst, $src}",
[(set R32:$dst, (vector_extract (v4i32 VR128:$src),
 (i32 0)))]>;





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCRegisterInfo.cpp updated: 1.59 -> 1.60
---
Log message:

Vectors that are known live-in and live-out are clearly already marked in 
the vrsave register for the caller.  This allows us to codegen a function as:

_test_rol:
mfspr r2, 256
mr r3, r2
mtspr 256, r3
vspltisw v2, -12
vrlw v2, v2, v2
mtspr 256, r2
blr

instead of:

_test_rol:
mfspr r2, 256
oris r3, r2, 40960
mtspr 256, r3
vspltisw v0, -12
vrlw v2, v0, v0
mtspr 256, r2
blr



---
Diffs of the changes:  (+16 -0)

 PPCRegisterInfo.cpp |   16 
 1 files changed, 16 insertions(+)


Index: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
diff -u llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.59 
llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.60
--- llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.59Mon Apr 17 16:07:20 2006
+++ llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp Mon Apr 17 16:22:06 2006
@@ -355,6 +355,22 @@
 if (UsedRegs[VRRegNo[i]])
   UsedRegMask |= 1 << (31-i);
   
+  // Live in and live out values already must be in the mask, so don't bother
+  // marking them.
+  MachineFunction *MF = MI->getParent()->getParent();
+  for (MachineFunction::livein_iterator I = 
+   MF->livein_begin(), E = MF->livein_end(); I != E; ++I) {
+unsigned RegNo = PPCRegisterInfo::getRegisterNumbering(I->first);
+if (VRRegNo[RegNo] == I->first)// If this really is a vector reg.
+  UsedRegMask &= ~(1 << (31-RegNo));   // Doesn't need to be marked.
+  }
+  for (MachineFunction::liveout_iterator I = 
+   MF->liveout_begin(), E = MF->liveout_end(); I != E; ++I) {
+unsigned RegNo = PPCRegisterInfo::getRegisterNumbering(*I);
+if (VRRegNo[RegNo] == *I)  // If this really is a vector reg.
+  UsedRegMask &= ~(1 << (31-RegNo));   // Doesn't need to be marked.
+  }
+  
   unsigned SrcReg = MI->getOperand(1).getReg();
   unsigned DstReg = MI->getOperand(0).getReg();
   // If no registers are used, turn this into a copy.
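
The mask arithmetic in this patch can be sketched on its own. The code below is an illustration, not the PPCRegisterInfo code: `vrsave_bit` and `vrsave_mask` are hypothetical names, and registers are passed as plain numbers 0 to 31 rather than as LLVM register enums.

```cpp
#include <cstdint>
#include <vector>

// Bit for vector register Vn in the VRSAVE mask; V0 is the most
// significant bit. Assumes n < 32.
std::uint32_t vrsave_bit(unsigned n) { return std::uint32_t{1} << (31 - n); }

// Mask of registers the function must add to VRSAVE: every used vector
// register, minus those that are live in or live out, which the caller
// has already marked.
std::uint32_t vrsave_mask(const std::vector<unsigned> &used,
                          const std::vector<unsigned> &live_in_or_out) {
  std::uint32_t mask = 0;
  for (unsigned n : used) mask |= vrsave_bit(n);
  for (unsigned n : live_in_or_out) mask &= ~vrsave_bit(n);
  return mask;
}
```

With V0 and V2 used and nothing excluded, the mask is 0xA0000000, whose upper half is 40960; that is the `oris r3, r2, 40960` in the older listing from the log message. When only V2 is used and it is the live-out return register, the mask is zero and the update degenerates to the plain `mr`.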





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCRegisterInfo.td

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCRegisterInfo.td updated: 1.33 -> 1.34
---
Log message:

Prefer to allocate V2-V5 before V0,V1.  This lets us generate code like this:

vspltisw v2, -12
vrlw v2, v2, v2

instead of:

vspltisw v0, -12
vrlw v2, v0, v0

when a function is returning a value.



---
Diffs of the changes:  (+1 -1)

 PPCRegisterInfo.td |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm/lib/Target/PowerPC/PPCRegisterInfo.td
diff -u llvm/lib/Target/PowerPC/PPCRegisterInfo.td:1.33 
llvm/lib/Target/PowerPC/PPCRegisterInfo.td:1.34
--- llvm/lib/Target/PowerPC/PPCRegisterInfo.td:1.33 Sat Mar 25 01:36:56 2006
+++ llvm/lib/Target/PowerPC/PPCRegisterInfo.td  Mon Apr 17 16:19:12 2006
@@ -260,7 +260,7 @@
   F22, F23, F24, F25, F26, F27, F28, F29, F30, F31]>;
 
 def VRRC : RegisterClass<"PPC", [v16i8,v8i16,v4i32,v4f32], 128,
- [V0, V1, V2, V3, V4, V5,
+ [V2, V3, V4, V5, V0, V1, 
   V6, V7, V8, V9, V10, V11, V12, V13, V14, V15, V16, V17, V18, V19, V20, V21,
   V22, V23, V24, V25, V26, V27, V28, V29, V30, V31]>;
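
Why the order of the list matters can be shown with a toy model. This is only a sketch under a strong assumption, namely that the allocator hands out the first free register in the listed order; real register allocators take more into account. pickFirstFree is a hypothetical helper and registers are plain numbers.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Return the first register in `order` that is not marked busy, or -1 if
// every listed register is taken. Registers are plain numbers 0..31 here.
inline int pickFirstFree(const std::vector<int> &order,
                         const std::array<bool, 32> &busy) {
  for (int reg : order) {
    assert(reg >= 0 && reg < 32 && "register number out of range");
    if (!busy[reg])
      return reg;
  }
  return -1;
}
```

With the old order a temporary lands in V0 and the result has to be produced into V2 separately; with the new order the first choice is already V2, which matches the two listings in the log message.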
 



___
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits


[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCCodeEmitter.cpp PPCRegisterInfo.cpp PPCRegisterInfo.h

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCCodeEmitter.cpp updated: 1.49 -> 1.50
PPCRegisterInfo.cpp updated: 1.58 -> 1.59
PPCRegisterInfo.h updated: 1.13 -> 1.14
---
Log message:

Move some knowledge about registers out of the code emitter into the register 
info.


---
Diffs of the changes:  (+47 -41)

 PPCCodeEmitter.cpp  |   42 +-
 PPCRegisterInfo.cpp |   42 ++
 PPCRegisterInfo.h   |4 
 3 files changed, 47 insertions(+), 41 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCCodeEmitter.cpp
diff -u llvm/lib/Target/PowerPC/PPCCodeEmitter.cpp:1.49 
llvm/lib/Target/PowerPC/PPCCodeEmitter.cpp:1.50
--- llvm/lib/Target/PowerPC/PPCCodeEmitter.cpp:1.49 Tue Mar 21 14:19:37 2006
+++ llvm/lib/Target/PowerPC/PPCCodeEmitter.cpp  Mon Apr 17 16:07:20 2006
@@ -141,52 +141,12 @@
   }
 }
 
-static unsigned enumRegToMachineReg(unsigned enumReg) {
-  switch (enumReg) {
-  case PPC::R0 :  case PPC::F0 :  case PPC::V0 : case PPC::CR0:  return  0;
-  case PPC::R1 :  case PPC::F1 :  case PPC::V1 : case PPC::CR1:  return  1;
-  case PPC::R2 :  case PPC::F2 :  case PPC::V2 : case PPC::CR2:  return  2;
-  case PPC::R3 :  case PPC::F3 :  case PPC::V3 : case PPC::CR3:  return  3;
-  case PPC::R4 :  case PPC::F4 :  case PPC::V4 : case PPC::CR4:  return  4;
-  case PPC::R5 :  case PPC::F5 :  case PPC::V5 : case PPC::CR5:  return  5;
-  case PPC::R6 :  case PPC::F6 :  case PPC::V6 : case PPC::CR6:  return  6;
-  case PPC::R7 :  case PPC::F7 :  case PPC::V7 : case PPC::CR7:  return  7;
-  case PPC::R8 :  case PPC::F8 :  case PPC::V8 : return  8;
-  case PPC::R9 :  case PPC::F9 :  case PPC::V9 : return  9;
-  case PPC::R10:  case PPC::F10:  case PPC::V10: return 10;
-  case PPC::R11:  case PPC::F11:  case PPC::V11: return 11;
-  case PPC::R12:  case PPC::F12:  case PPC::V12: return 12;
-  case PPC::R13:  case PPC::F13:  case PPC::V13: return 13;
-  case PPC::R14:  case PPC::F14:  case PPC::V14: return 14;
-  case PPC::R15:  case PPC::F15:  case PPC::V15: return 15;
-  case PPC::R16:  case PPC::F16:  case PPC::V16: return 16;
-  case PPC::R17:  case PPC::F17:  case PPC::V17: return 17;
-  case PPC::R18:  case PPC::F18:  case PPC::V18: return 18;
-  case PPC::R19:  case PPC::F19:  case PPC::V19: return 19;
-  case PPC::R20:  case PPC::F20:  case PPC::V20: return 20;
-  case PPC::R21:  case PPC::F21:  case PPC::V21: return 21;
-  case PPC::R22:  case PPC::F22:  case PPC::V22: return 22;
-  case PPC::R23:  case PPC::F23:  case PPC::V23: return 23;
-  case PPC::R24:  case PPC::F24:  case PPC::V24: return 24;
-  case PPC::R25:  case PPC::F25:  case PPC::V25: return 25;
-  case PPC::R26:  case PPC::F26:  case PPC::V26: return 26;
-  case PPC::R27:  case PPC::F27:  case PPC::V27: return 27;
-  case PPC::R28:  case PPC::F28:  case PPC::V28: return 28;
-  case PPC::R29:  case PPC::F29:  case PPC::V29: return 29;
-  case PPC::R30:  case PPC::F30:  case PPC::V30: return 30;
-  case PPC::R31:  case PPC::F31:  case PPC::V31: return 31;
-  default:
-std::cerr << "Unhandled reg in enumRegToRealReg!\n";
-abort();
-  }
-}
-
 int PPCCodeEmitter::getMachineOpValue(MachineInstr &MI, MachineOperand &MO) {
 
   int rv = 0; // Return value; defaults to 0 for unhandled cases
   // or things that get fixed up later by the JIT.
   if (MO.isRegister()) {
-rv = enumRegToMachineReg(MO.getReg());
+rv = PPCRegisterInfo::getRegisterNumbering(MO.getReg());
 
 // Special encoding for MTCRF and MFOCRF, which uses a bit mask for the
 // register, not the register number directly.
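
The comment above says that MTCRF and MFOCRF take a bit mask rather than a register number. A minimal sketch of that conversion, assuming the 8-bit field mask of those instructions in which CR0 is the most significant bit; crFieldMask is a hypothetical helper, not a function from this patch:

```cpp
#include <cassert>

// Turn a condition register field number (0..7) into the one-hot 8-bit
// field mask, with CR0 as the most significant bit of the mask.
inline unsigned crFieldMask(unsigned crNumber) {
  assert(crNumber < 8 && "there are only eight CR fields");
  return 0x80u >> crNumber;
}
```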


Index: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
diff -u llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.58 
llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.59
--- llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.58 Mon Apr 17 15:59:25 2006
+++ llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp Mon Apr 17 16:07:20 2006
@@ -35,6 +35,48 @@
 #include 
 using namespace llvm;
 
+/// getRegisterNumbering - Given the enum value for some register, e.g.
+/// PPC::F14, return the number that it corresponds to (e.g. 14).
+unsigned PPCRegisterInfo::getRegisterNumbering(unsigned RegEnum) {
+  switch (RegEnum) {
+case PPC::R0 :  case PPC::F0 :  case PPC::V0 : case PPC::CR0:  return  0;
+case PPC::R1 :  case PPC::F1 :  case PPC::V1 : case PPC::CR1:  return  1;
+case PPC::R2 :  case PPC::F2 :  case PPC::V2 : case PPC::CR2:  return  2;
+case PPC::R3 :  case PPC::F3 :  case PPC::V3 : case PPC::CR3:  return  3;
+case PPC::R4 :  case PPC::F4 :  case PPC::V4 : case PPC::CR4:  return  4;
+case PPC::R5 :  case PPC::F5 :  case PPC::V5 : case PPC::CR5:  return  5;
+case PPC::R6 :  case PPC::F6 :  case PPC::V6 : case PPC::CR6:  return  6;
+case PPC::R7 :  case PPC::F7 :  case PPC::V7 : case PPC::CR7:  return  7;
+case PPC::R8 :  case PPC::F8 :  case PPC::V8 : return  8;
+case PPC::R9 :  case PPC::F9 :  case PPC::V9 : return  9;
+case PPC::R10:  case 

[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCRegisterInfo.cpp updated: 1.57 -> 1.58
---
Log message:

Use a small table instead of macros to do this conversion.



---
Diffs of the changes:  (+13 -10)

 PPCRegisterInfo.cpp |   23 +--
 1 files changed, 13 insertions(+), 10 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
diff -u llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.57 
llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.58
--- llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp:1.57 Tue Apr 11 14:44:43 2006
+++ llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp Mon Apr 17 15:59:25 2006
@@ -295,21 +295,24 @@
   }
 }
 
+/// VRRegNo - Map from a numbered VR register to its enum value.
+///
+static const unsigned short VRRegNo[] = {
+ PPC::V0 , PPC::V1 , PPC::V2 , PPC::V3 , PPC::V4 , PPC::V5 , PPC::V6 , PPC::V7 , 
+ PPC::V8 , PPC::V9 , PPC::V10, PPC::V11, PPC::V12, PPC::V13, PPC::V14, PPC::V15, 
+ PPC::V16, PPC::V17, PPC::V18, PPC::V19, PPC::V20, PPC::V21, PPC::V22, PPC::V23,
+ PPC::V24, PPC::V25, PPC::V26, PPC::V27, PPC::V28, PPC::V29, PPC::V30, PPC::V31
+};
+
 // HandleVRSaveUpdate - MI is the UPDATE_VRSAVE instruction introduced by the
 // instruction selector.  Based on the vector registers that have been used,
 // transform this into the appropriate ORI instruction.
 static void HandleVRSaveUpdate(MachineInstr *MI, const bool *UsedRegs) {
   unsigned UsedRegMask = 0;
-#define HANDLEREG(N) if (UsedRegs[PPC::V##N]) UsedRegMask |= 1 << (31-N)
-  HANDLEREG( 0); HANDLEREG( 1); HANDLEREG( 2); HANDLEREG( 3);
-  HANDLEREG( 4); HANDLEREG( 5); HANDLEREG( 6); HANDLEREG( 7);
-  HANDLEREG( 8); HANDLEREG( 9); HANDLEREG(10); HANDLEREG(11);
-  HANDLEREG(12); HANDLEREG(13); HANDLEREG(14); HANDLEREG(15);
-  HANDLEREG(16); HANDLEREG(17); HANDLEREG(18); HANDLEREG(19);
-  HANDLEREG(20); HANDLEREG(21); HANDLEREG(22); HANDLEREG(23);
-  HANDLEREG(24); HANDLEREG(25); HANDLEREG(26); HANDLEREG(27);
-  HANDLEREG(28); HANDLEREG(29); HANDLEREG(30); HANDLEREG(31);
-#undef HANDLEREG
+  for (unsigned i = 0; i != 32; ++i)
+if (UsedRegs[VRRegNo[i]])
+  UsedRegMask |= 1 << (31-i);
+  
   unsigned SrcReg = MI->getOperand(1).getReg();
   unsigned DstReg = MI->getOperand(0).getReg();
   // If no registers are used, turn this into a copy.
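
The loop that replaces the HANDLEREG macros can be tried standalone. A minimal sketch, assuming the used-register array is indexed directly by register number (in the patch it is indexed by enum value through VRRegNo); UsedVRs, markUsed and buildVRSaveMask are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the UsedRegs array: used[n] is true when
// vector register Vn is used by the function.
using UsedVRs = std::array<bool, 32>;

// Record that Vn is used.
inline void markUsed(UsedVRs &used, unsigned n) {
  assert(n < 32 && "vector register number out of range");
  used[n] = true;
}

// Build the mask: Vn corresponds to bit (31 - n), so V0 is the most
// significant bit and V31 the least significant.
inline std::uint32_t buildVRSaveMask(const UsedVRs &used) {
  std::uint32_t mask = 0;
  for (unsigned n = 0; n != 32; ++n)
    if (used[n])
      mask |= std::uint32_t{1} << (31 - n);
  return mask;
}
```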





[llvm-commits] CVS: llvm/lib/Target/X86/X86ISelLowering.cpp

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86ISelLowering.cpp updated: 1.170 -> 1.171
---
Log message:

Implement v8i16, v16i8 splat using unpckl + pshufd.

---
Diffs of the changes:  (+56 -16)

 X86ISelLowering.cpp |   72 
 1 files changed, 56 insertions(+), 16 deletions(-)


Index: llvm/lib/Target/X86/X86ISelLowering.cpp
diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.170 
llvm/lib/Target/X86/X86ISelLowering.cpp:1.171
--- llvm/lib/Target/X86/X86ISelLowering.cpp:1.170   Mon Apr 17 15:32:50 2006
+++ llvm/lib/Target/X86/X86ISelLowering.cpp Mon Apr 17 15:43:08 2006
@@ -1759,13 +1759,9 @@
 
 /// isSplatMask - Return true if the specified VECTOR_SHUFFLE operand specifies
 /// a splat of a single element.
-bool X86::isSplatMask(SDNode *N) {
+static bool isSplatMask(SDNode *N) {
   assert(N->getOpcode() == ISD::BUILD_VECTOR);
 
-  // We can only splat 64-bit, and 32-bit quantities.
-  if (N->getNumOperands() != 4 && N->getNumOperands() != 2)
-return false;
-
   // This is a splat operation if each element of the permute is the same, and
   // if the value doesn't reference the second vector.
   SDOperand Elt = N->getOperand(0);
@@ -1781,6 +1777,17 @@
   return cast<ConstantSDNode>(Elt)->getValue() < N->getNumOperands();
 }
 
+/// isSplatMask - Return true if the specified VECTOR_SHUFFLE operand specifies
+/// a splat of a single element and it's a 2 or 4 element mask.
+bool X86::isSplatMask(SDNode *N) {
+  assert(N->getOpcode() == ISD::BUILD_VECTOR);
+
+  // We can only splat 64-bit, and 32-bit quantities.
+  if (N->getNumOperands() != 4 && N->getNumOperands() != 2)
+return false;
+  return ::isSplatMask(N);
+}
+
 /// getShuffleSHUFImmediate - Return the appropriate immediate to shuffle
 /// the specified isShuffleMask VECTOR_SHUFFLE mask with PSHUF* and SHUFP*
 /// instructions.
@@ -1947,6 +1954,43 @@
   return true;
 }
 
+/// getUnpacklMask - Returns a vector_shuffle mask for an unpackl operation
+/// of specified width.
+static SDOperand getUnpacklMask(unsigned NumElems, SelectionDAG &DAG) {
+  MVT::ValueType MaskVT = MVT::getIntVectorWithNumElements(NumElems);
+  MVT::ValueType BaseVT = MVT::getVectorBaseType(MaskVT);
+  std::vector<SDOperand> MaskVec;
+  for (unsigned i = 0, e = NumElems/2; i != e; ++i) {
+MaskVec.push_back(DAG.getConstant(i,BaseVT));
+MaskVec.push_back(DAG.getConstant(i + NumElems, BaseVT));
+  }
+  return DAG.getNode(ISD::BUILD_VECTOR, MaskVT, MaskVec);
+}
+
+/// PromoteSplat - Promote a splat of v8i16 or v16i8 to v4i32.
+///
+static SDOperand PromoteSplat(SDOperand Op, SelectionDAG &DAG) {
+  SDOperand V1 = Op.getOperand(0);
+  SDOperand PermMask = Op.getOperand(2);
+  MVT::ValueType VT = Op.getValueType();
+  unsigned NumElems = PermMask.getNumOperands();
+  PermMask = getUnpacklMask(NumElems, DAG);
+  while (NumElems != 4) {
+V1 = DAG.getNode(ISD::VECTOR_SHUFFLE, VT, V1, V1, PermMask);
+NumElems >>= 1;
+  }
+  V1 = DAG.getNode(ISD::BIT_CONVERT, MVT::v4i32, V1);
+
+  MVT::ValueType MaskVT = MVT::getIntVectorWithNumElements(4);
+  SDOperand Zero = DAG.getConstant(0, MVT::getVectorBaseType(MaskVT));
+  std::vector<SDOperand> ZeroVec(4, Zero);
+  SDOperand SplatMask = DAG.getNode(ISD::BUILD_VECTOR, MaskVT, ZeroVec);
+  SDOperand Shuffle = DAG.getNode(ISD::VECTOR_SHUFFLE, MVT::v4i32, V1,
+  DAG.getNode(ISD::UNDEF, MVT::v4i32),
+  SplatMask);
+  return DAG.getNode(ISD::BIT_CONVERT, VT, Shuffle);
+}
+
 /// LowerOperation - Provide custom lowering hooks for some operations.
 ///
 SDOperand X86TargetLowering::LowerOperation(SDOperand Op, SelectionDAG &DAG) {
@@ -2753,8 +2797,11 @@
 MVT::ValueType VT = Op.getValueType();
 unsigned NumElems = PermMask.getNumOperands();
 
-if (X86::isSplatMask(PermMask.Val))
-  return Op;
+if (isSplatMask(PermMask.Val)) {
+  if (NumElems <= 4) return Op;
+  // Promote it to a v4i32 splat.
+  return PromoteSplat(Op, DAG);
+}
 
 // Normalize the node to match x86 shuffle ops if needed
 if (V2.getOpcode() != ISD::UNDEF) {
@@ -2877,14 +2924,7 @@
   // : unpcklps 1, 3 ==> Y: 
   //   Step 2: unpcklps X, Y ==><3, 2, 1, 0>
   MVT::ValueType VT = Op.getValueType();
-  MVT::ValueType MaskVT = MVT::getIntVectorWithNumElements(NumElems);
-  MVT::ValueType BaseVT = MVT::getVectorBaseType(MaskVT);
-  std::vector<SDOperand> MaskVec;
-  for (unsigned i = 0, e = NumElems/2; i != e; ++i) {
-MaskVec.push_back(DAG.getConstant(i,BaseVT));
-MaskVec.push_back(DAG.getConstant(i + NumElems, BaseVT));
-  }
-  SDOperand PermMask = DAG.getNode(ISD::BUILD_VECTOR, MaskVT, MaskVec);
+  SDOperand PermMask = getUnpacklMask(NumElems, DAG);
   std::vector<SDOperand> V(NumElems);
   for (unsigned i = 0; i < NumElems; ++i)
 V[i] = DAG.getNode(ISD::SCALAR_TO_VECTOR, VT, Op.getOperand(i));
@@ -3208,7 +3248,7 @@
   // Only do shuffles on 128-bit vector types for
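
The mask built by getUnpacklMask, and what it does when a vector is unpacked with itself, can be modelled on plain arrays. This is a sketch only: unpacklMask and shuffle are hypothetical helpers working on std::vector, and they model shuffle indices, not the SelectionDAG nodes.

```cpp
#include <cassert>
#include <vector>

// Indices for an unpack-low shuffle of two numElems-wide vectors:
// <0, N, 1, N+1, ...>. Indices >= N select from the second vector.
inline std::vector<unsigned> unpacklMask(unsigned numElems) {
  assert(numElems % 2 == 0 && "expected an even element count");
  std::vector<unsigned> mask;
  for (unsigned i = 0; i != numElems / 2; ++i) {
    mask.push_back(i);
    mask.push_back(i + numElems);
  }
  return mask;
}

// Apply a two-input shuffle mask to vectors a and b of equal length.
inline std::vector<int> shuffle(const std::vector<int> &a,
                                const std::vector<int> &b,
                                const std::vector<unsigned> &mask) {
  assert(a.size() == b.size());
  std::vector<int> out;
  for (unsigned idx : mask)
    out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
  return out;
}
```

Unpacking a vector with itself duplicates each low element into an adjacent pair, so the first pair holds two copies of element 0. Viewed as a vector of elements twice as wide, lane 0 then carries only element 0, and a 4-element splat of lane 0 spreads it across the whole register.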

[llvm-commits] CVS: llvm/lib/Target/X86/X86ISelLowering.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/X86:

X86ISelLowering.cpp updated: 1.169 -> 1.170
---
Log message:

Implement returns of a vector; testcase here: CodeGen/X86/vec_return.ll


---
Diffs of the changes:  (+27 -23)

 X86ISelLowering.cpp |   50 +++---
 1 files changed, 27 insertions(+), 23 deletions(-)


Index: llvm/lib/Target/X86/X86ISelLowering.cpp
diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.169 
llvm/lib/Target/X86/X86ISelLowering.cpp:1.170
--- llvm/lib/Target/X86/X86ISelLowering.cpp:1.169   Mon Apr 17 02:24:10 2006
+++ llvm/lib/Target/X86/X86ISelLowering.cpp Mon Apr 17 15:32:50 2006
@@ -449,26 +449,6 @@
   ReturnAddrIndex = 0; // No return address slot generated yet.
   BytesToPopOnReturn = 0;  // Callee pops nothing.
   BytesCallerReserves = ArgOffset;
-
-  // Finally, inform the code generator which regs we return values in.
-  switch (getValueType(F.getReturnType())) {
-  default: assert(0 && "Unknown type!");
-  case MVT::isVoid: break;
-  case MVT::i1:
-  case MVT::i8:
-  case MVT::i16:
-  case MVT::i32:
-MF.addLiveOut(X86::EAX);
-break;
-  case MVT::i64:
-MF.addLiveOut(X86::EAX);
-MF.addLiveOut(X86::EDX);
-break;
-  case MVT::f32:
-  case MVT::f64:
-MF.addLiveOut(X86::ST0);
-break;
-  }
   return ArgValues;
 }
 
@@ -2676,15 +2656,30 @@
 default:
   assert(0 && "Do not know how to return this many arguments!");
   abort();
-case 1: 
+case 1:// ret void.
   return DAG.getNode(X86ISD::RET_FLAG, MVT::Other, Op.getOperand(0),
  DAG.getConstant(getBytesToPopOnReturn(), MVT::i16));
 case 2: {
   MVT::ValueType ArgVT = Op.getOperand(1).getValueType();
-  if (MVT::isInteger(ArgVT))
+  
+  if (MVT::isVector(ArgVT)) {
+// Integer or FP vector result -> XMM0.
+if (DAG.getMachineFunction().liveout_empty())
+  DAG.getMachineFunction().addLiveOut(X86::XMM0);
+Copy = DAG.getCopyToReg(Op.getOperand(0), X86::XMM0, Op.getOperand(1),
+SDOperand());
+  } else if (MVT::isInteger(ArgVT)) {
+// Integer result -> EAX
+if (DAG.getMachineFunction().liveout_empty())
+  DAG.getMachineFunction().addLiveOut(X86::EAX);
+
 Copy = DAG.getCopyToReg(Op.getOperand(0), X86::EAX, Op.getOperand(1),
 SDOperand());
-  else if (!X86ScalarSSE) {
+  } else if (!X86ScalarSSE) {
+// FP return with fp-stack value.
+if (DAG.getMachineFunction().liveout_empty())
+  DAG.getMachineFunction().addLiveOut(X86::ST0);
+
 std::vector<MVT::ValueType> Tys;
 Tys.push_back(MVT::Other);
 Tys.push_back(MVT::Flag);
@@ -2693,6 +2688,10 @@
 Ops.push_back(Op.getOperand(1));
 Copy = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops);
   } else {
+// FP return with ScalarSSE (return on fp-stack).
+if (DAG.getMachineFunction().liveout_empty())
+  DAG.getMachineFunction().addLiveOut(X86::ST0);
+
 SDOperand MemLoc;
 SDOperand Chain = Op.getOperand(0);
 SDOperand Value = Op.getOperand(1);
@@ -2729,6 +2728,11 @@
   break;
 }
 case 3:
+  if (DAG.getMachineFunction().liveout_empty()) {
+DAG.getMachineFunction().addLiveOut(X86::EAX);
+DAG.getMachineFunction().addLiveOut(X86::EDX);
+  }
+
   Copy = DAG.getCopyToReg(Op.getOperand(0), X86::EDX, Op.getOperand(2), 
   SDOperand());
   Copy = DAG.getCopyToReg(Copy, 
X86::EAX,Op.getOperand(1),Copy.getValue(1));
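
The register choices visible in this diff can be summarised in a small table-like function. A sketch only: RetKind and returnRegs are hypothetical, the register names are plain strings, and the five categories are a simplification of the operand-count and type checks in the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical classification of the value being returned.
enum class RetKind { Void, Int32OrSmaller, Int64, FloatingPoint, Vector };

// Registers marked live-out for each kind of return value, following the
// cases in the diff: vectors in XMM0, integers in EAX (plus EDX for the
// two-register case), floating point on the x87 stack in ST0.
inline std::vector<std::string> returnRegs(RetKind kind) {
  switch (kind) {
  case RetKind::Void:           return {};
  case RetKind::Int32OrSmaller: return {"EAX"};
  case RetKind::Int64:          return {"EAX", "EDX"};
  case RetKind::FloatingPoint:  return {"ST0"};
  case RetKind::Vector:         return {"XMM0"};
  }
  assert(false && "unknown return kind");
  return {};
}
```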





[llvm-commits] CVS: llvm/test/Regression/CodeGen/X86/vec_return.ll

2006-04-17 Thread Chris Lattner


Changes in directory llvm/test/Regression/CodeGen/X86:

vec_return.ll added (r1.1)
---
Log message:

New testcase


---
Diffs of the changes:  (+5 -0)

 vec_return.ll |5 +
 1 files changed, 5 insertions(+)


Index: llvm/test/Regression/CodeGen/X86/vec_return.ll
diff -c /dev/null llvm/test/Regression/CodeGen/X86/vec_return.ll:1.1
*** /dev/null   Mon Apr 17 15:32:37 2006
--- llvm/test/Regression/CodeGen/X86/vec_return.ll  Mon Apr 17 15:32:27 2006
***************
*** 0 ****
--- 1,5 ----
+ ; RUN: llvm-as < %s | llc -march=x86 -mcpu=yonah
+ 
+ <2 x double> %test() {
+   ret <2 x double> 
+ }





[llvm-commits] CVS: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/CodeGen/SelectionDAG:

LegalizeDAG.cpp updated: 1.366 -> 1.367
---
Log message:

Codegen insertelement with constant insertion points as scalar_to_vector 
and a shuffle.  For this:

void %test2(<4 x float>* %F, float %f) {
%tmp = load <4 x float>* %F ; <<4 x float>> [#uses=2]
%tmp3 = add <4 x float> %tmp, %tmp  ; <<4 x float>> [#uses=1]
%tmp2 = insertelement <4 x float> %tmp3, float %f, uint 2   ; <<4 x float>> [#uses=2]
%tmp6 = add <4 x float> %tmp2, %tmp2; <<4 x float>> [#uses=1]
store <4 x float> %tmp6, <4 x float>* %F
ret void
}

we now get this on X86 (which will get better):

_test2:
movl 4(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, %xmm1
shufps $3, %xmm1, %xmm1
movaps %xmm0, %xmm2
shufps $1, %xmm2, %xmm2
unpcklps %xmm1, %xmm2
movss 8(%esp), %xmm1
unpcklps %xmm1, %xmm0
unpcklps %xmm2, %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
ret

instead of:

_test2:
subl $28, %esp
movl 32(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%esp)
movss 36(%esp), %xmm0
movss %xmm0, 8(%esp)
movaps (%esp), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
addl $28, %esp
ret






---
Diffs of the changes:  (+28 -0)

 LegalizeDAG.cpp |   28 
 1 files changed, 28 insertions(+)


Index: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
diff -u llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.366 
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.367
--- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1.366 Sat Apr 15 20:36:45 2006
+++ llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp   Mon Apr 17 14:21:01 2006
@@ -867,6 +867,34 @@
   }
   // FALLTHROUGH
 case TargetLowering::Expand: {
+  // If the insert index is a constant, codegen this as a scalar_to_vector,
+  // then a shuffle that inserts it into the right position in the vector.
+  if (ConstantSDNode *InsertPos = dyn_cast<ConstantSDNode>(Tmp3)) {
+SDOperand ScVec = DAG.getNode(ISD::SCALAR_TO_VECTOR, 
+  Tmp1.getValueType(), Tmp2);
+
+unsigned NumElts = MVT::getVectorNumElements(Tmp1.getValueType());
+MVT::ValueType ShufMaskVT = MVT::getIntVectorWithNumElements(NumElts);
+MVT::ValueType ShufMaskEltVT = MVT::getVectorBaseType(ShufMaskVT);
+
+// We generate a shuffle of InVec and ScVec, so the shuffle mask should
+// be 0,1,2,3,4,5... with the appropriate element replaced with elt 0 of
+// the RHS.
+std::vector<SDOperand> ShufOps;
+for (unsigned i = 0; i != NumElts; ++i) {
+  if (i != InsertPos->getValue())
+ShufOps.push_back(DAG.getConstant(i, ShufMaskEltVT));
+  else
+ShufOps.push_back(DAG.getConstant(NumElts, ShufMaskEltVT));
+}
+SDOperand ShufMask = DAG.getNode(ISD::BUILD_VECTOR, ShufMaskVT,ShufOps);
+
+Result = DAG.getNode(ISD::VECTOR_SHUFFLE, Tmp1.getValueType(),
+ Tmp1, ScVec, ShufMask);
+Result = LegalizeOp(Result);
+break;
+  }
+  
   // If the target doesn't support this, we have to spill the input vector
   // to a temporary stack slot, update the element, then reload it.  This is
   // badness.  We could also load the value into a vector register (either
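
The shuffle mask described in the new comment is easy to reproduce on its own. A minimal sketch on plain integers; insertShuffleMask is a hypothetical helper that models only the indices, not the DAG nodes.

```cpp
#include <cassert>
#include <vector>

// Build the shuffle mask for inserting a scalar at insertPos: identity
// indices 0..N-1, except that insertPos selects index N, which is element 0
// of the second shuffle input (the scalar_to_vector result).
inline std::vector<unsigned> insertShuffleMask(unsigned numElts,
                                               unsigned insertPos) {
  assert(insertPos < numElts && "insert position out of range");
  std::vector<unsigned> mask;
  for (unsigned i = 0; i != numElts; ++i)
    mask.push_back(i != insertPos ? i : numElts);
  return mask;
}
```

For the testcase in the log message (4 elements, insert at index 2) the mask comes out as 0,1,4,3.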





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCISelLowering.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCISelLowering.cpp updated: 1.159 -> 1.160
---
Log message:

Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, add support for odd splats from -31 to -17.


---
Diffs of the changes:  (+14 -5)

 PPCISelLowering.cpp |   19 ++-
 1 files changed, 14 insertions(+), 5 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
diff -u llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.159 
llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.160
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.159   Mon Apr 17 12:55:10 2006
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp Mon Apr 17 13:09:22 2006
@@ -1127,6 +1127,9 @@
 if (SextVal >= -16 && SextVal <= 15)
   return BuildSplatI(SextVal, SplatSize, Op.getValueType(), DAG);
 
+
+// Two instruction sequences.
+
 // If this value is in the range [-32,30] and is even, use:
 //tmp = VSPLTI[bhw], result = add tmp, tmp
 if (SextVal >= -32 && SextVal <= 30 && (SextVal & 1) == 0) {
@@ -1154,7 +1157,7 @@
 unsigned SplatBitSize = SplatSize*8;
 static const char SplatCsts[] = {
   -1, 1, -2, 2, -3, 3, -4, 4, -5, 5, -6, 6, -7, 7,
-  -8, 8, -9, 9, -10, 10, -11, 11, -12, 12, -13, 14, -15
+  -8, 8, -9, 9, -10, 10, -11, 11, -12, 12, -13, 13, 14, -14, 15, -15, -16
 };
 for (unsigned idx = 0; idx < sizeof(SplatCsts)/sizeof(SplatCsts[0]); ++idx){
   // Indirect through the SplatCsts array so that we favor 'vsplti -1' for
@@ -1225,10 +1228,16 @@
 
 // Three instruction sequences.
 
-// Otherwise, in range [17,29]:  (vsplti 15) + (vsplti C).
-if (SextVal >= 0 && SextVal <= 29) {
-  SDOperand LHS = BuildSplatI(15, SplatSize, Op.getValueType(), DAG);
-  SDOperand RHS = BuildSplatI(SextVal-15, SplatSize, Op.getValueType(),DAG);
+// Odd, in range [17,31]:  (vsplti C)-(vsplti -16).
+if (SextVal >= 0 && SextVal <= 31) {
+  SDOperand LHS = BuildSplatI(SextVal-16, SplatSize, Op.getValueType(),DAG);
+  SDOperand RHS = BuildSplatI(-16, SplatSize, Op.getValueType(), DAG);
+  return DAG.getNode(ISD::SUB, Op.getValueType(), LHS, RHS);
+}
+// Odd, in range [-31,-17]:  (vsplti C)+(vsplti -16).
+if (SextVal >= -31 && SextVal <= 0) {
+  SDOperand LHS = BuildSplatI(SextVal+16, SplatSize, Op.getValueType(),DAG);
+  SDOperand RHS = BuildSplatI(-16, SplatSize, Op.getValueType(), DAG);
   return DAG.getNode(ISD::ADD, Op.getValueType(), LHS, RHS);
 }
   }
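
The arithmetic behind the add/sub sequences can be checked on plain integers. This is a sketch under stated assumptions: it covers only the "tmp + tmp" case for even values and the two new odd cases built around vsplti -16, and it leaves out the shift and rotate sequences driven by SplatCsts. SplatPlan, fitsVsplti, planSplat and evaluate are hypothetical names.

```cpp
#include <cassert>

// Two splat immediates combined with '+' or '-'.
struct SplatPlan {
  int lhs;
  int rhs;
  char op;
};

// A single vsplti immediate covers the 5-bit signed range [-16, 15].
inline bool fitsVsplti(int v) { return v >= -16 && v <= 15; }

// Plan a splat of `value` from two immediates. Returns false when one
// vsplti suffices or when the value is outside the ranges handled here.
inline bool planSplat(int value, SplatPlan &plan) {
  if (fitsVsplti(value))
    return false;
  if (value >= -32 && value <= 30 && value % 2 == 0) {
    plan = SplatPlan{value / 2, value / 2, '+'};   // tmp + tmp
    return true;
  }
  if (value >= 17 && value <= 31) {
    plan = SplatPlan{value - 16, -16, '-'};        // (C-16) - (-16)
    return true;
  }
  if (value >= -31 && value <= -17) {
    plan = SplatPlan{value + 16, -16, '+'};        // (C+16) + (-16)
    return true;
  }
  return false;
}

// Evaluate a plan back to the scalar it represents.
inline int evaluate(const SplatPlan &p) {
  return p.op == '+' ? p.lhs + p.rhs : p.lhs - p.rhs;
}
```

Subtracting -16 is what makes 31 reachable: 15 - (-16) = 31, whereas 15 + 15 stops at 30.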





[llvm-commits] CVS: llvm/lib/Target/X86/X86RegisterInfo.cpp

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86RegisterInfo.cpp updated: 1.139 -> 1.140
---
Log message:

Fix incorrect foldMemoryOperand entries.

---
Diffs of the changes:  (+6 -12)

 X86RegisterInfo.cpp |   18 ++
 1 files changed, 6 insertions(+), 12 deletions(-)


Index: llvm/lib/Target/X86/X86RegisterInfo.cpp
diff -u llvm/lib/Target/X86/X86RegisterInfo.cpp:1.139 
llvm/lib/Target/X86/X86RegisterInfo.cpp:1.140
--- llvm/lib/Target/X86/X86RegisterInfo.cpp:1.139   Sun Apr 16 01:58:19 2006
+++ llvm/lib/Target/X86/X86RegisterInfo.cpp Mon Apr 17 13:06:12 2006
@@ -316,12 +316,6 @@
 case X86::SETGEr:return MakeMInst( X86::SETGEm,  FrameIndex, MI);
 case X86::SETLEr:return MakeMInst( X86::SETLEm,  FrameIndex, MI);
 case X86::SETGr: return MakeMInst( X86::SETGm,   FrameIndex, MI);
-case X86::TEST8rr:   return MakeMRInst(X86::TEST8mr ,FrameIndex, MI);
-case X86::TEST16rr:  return MakeMRInst(X86::TEST16mr,FrameIndex, MI);
-case X86::TEST32rr:  return MakeMRInst(X86::TEST32mr,FrameIndex, MI);
-case X86::CMP8rr:return MakeMRInst(X86::CMP8mr , FrameIndex, MI);
-case X86::CMP16rr:   return MakeMRInst(X86::CMP16mr, FrameIndex, MI);
-case X86::CMP32rr:   return MakeMRInst(X86::CMP32mr, FrameIndex, MI);
 // Alias instructions
 case X86::MOV8r0:return MakeM0Inst(X86::MOV8mi, FrameIndex, MI);
 case X86::MOV16r0:   return MakeM0Inst(X86::MOV16mi, FrameIndex, MI);
@@ -394,18 +388,18 @@
 case X86::XOR8rr:return MakeRMInst(X86::XOR8rm , FrameIndex, MI);
 case X86::XOR16rr:   return MakeRMInst(X86::XOR16rm, FrameIndex, MI);
 case X86::XOR32rr:   return MakeRMInst(X86::XOR32rm, FrameIndex, MI);
-case X86::TEST8rr:   return MakeRMInst(X86::TEST8rm ,FrameIndex, MI);
-case X86::TEST16rr:  return MakeRMInst(X86::TEST16rm,FrameIndex, MI);
-case X86::TEST32rr:  return MakeRMInst(X86::TEST32rm,FrameIndex, MI);
-case X86::TEST8ri:   return MakeMIInst(X86::TEST8mi ,FrameIndex, MI);
-case X86::TEST16ri:  return MakeMIInst(X86::TEST16mi,FrameIndex, MI);
-case X86::TEST32ri:  return MakeMIInst(X86::TEST32mi,FrameIndex, MI);
 case X86::IMUL16rr:  return MakeRMInst(X86::IMUL16rm,FrameIndex, MI);
 case X86::IMUL32rr:  return MakeRMInst(X86::IMUL32rm,FrameIndex, MI);
 case X86::IMUL16rri: return MakeRMIInst(X86::IMUL16rmi, FrameIndex, MI);
 case X86::IMUL32rri: return MakeRMIInst(X86::IMUL32rmi, FrameIndex, MI);
 case X86::IMUL16rri8:return MakeRMIInst(X86::IMUL16rmi8, FrameIndex, MI);
 case X86::IMUL32rri8:return MakeRMIInst(X86::IMUL32rmi8, FrameIndex, MI);
+case X86::TEST8rr:   return MakeRMInst(X86::TEST8rm ,FrameIndex, MI);
+case X86::TEST16rr:  return MakeRMInst(X86::TEST16rm,FrameIndex, MI);
+case X86::TEST32rr:  return MakeRMInst(X86::TEST32rm,FrameIndex, MI);
+case X86::TEST8ri:   return MakeMIInst(X86::TEST8mi ,FrameIndex, MI);
+case X86::TEST16ri:  return MakeMIInst(X86::TEST16mi,FrameIndex, MI);
+case X86::TEST32ri:  return MakeMIInst(X86::TEST32mi,FrameIndex, MI);
 case X86::CMP8rr:return MakeRMInst(X86::CMP8rm , FrameIndex, MI);
 case X86::CMP16rr:   return MakeRMInst(X86::CMP16rm, FrameIndex, MI);
 case X86::CMP32rr:   return MakeRMInst(X86::CMP32rm, FrameIndex, MI);





[llvm-commits] CVS: llvm/lib/Target/X86/X86InstrSSE.td

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86InstrSSE.td updated: 1.97 -> 1.98
---
Log message:

Fix errors in patterns that were preventing load folding.

---
Diffs of the changes:  (+16 -16)

 X86InstrSSE.td |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)


Index: llvm/lib/Target/X86/X86InstrSSE.td
diff -u llvm/lib/Target/X86/X86InstrSSE.td:1.97 
llvm/lib/Target/X86/X86InstrSSE.td:1.98
--- llvm/lib/Target/X86/X86InstrSSE.td:1.97 Sun Apr 16 13:11:28 2006
+++ llvm/lib/Target/X86/X86InstrSSE.td  Mon Apr 17 13:05:01 2006
@@ -1360,20 +1360,20 @@
 }
 def PADDBrm : PDI<0xFC, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "paddb {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v16i8 (add VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (add VR128:$src1,
+ (bc_v16i8 (loadv2i64 addr:$src2))))]>;
 def PADDWrm : PDI<0xFD, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "paddw {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v8i16 (add VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (add VR128:$src1,
+ (bc_v8i16 (loadv2i64 addr:$src2))))]>;
 def PADDDrm : PDI<0xFE, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "paddd {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v4i32 (add VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (add VR128:$src1,
+ (bc_v4i32 (loadv2i64 addr:$src2))))]>;
 def PADDQrm : PDI<0xD4, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "paddd {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v2i64 (add VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (add VR128:$src1,
+ (loadv2i64 addr:$src2)))]>;
 
 let isCommutable = 1 in {
 def PADDSBrr : PDI<0xEC, MRMSrcReg, (ops VR128:$dst, VR128:$src1, VR128:$src2),
@@ -1426,20 +1426,20 @@
 
 def PSUBBrm : PDI<0xF8, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "psubb {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v16i8 (sub VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (sub VR128:$src1,
+ (bc_v16i8 (loadv2i64 addr:$src2))))]>;
 def PSUBWrm : PDI<0xF9, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "psubw {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v8i16 (sub VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (sub VR128:$src1,
+ (bc_v8i16 (loadv2i64 addr:$src2))))]>;
 def PSUBDrm : PDI<0xFA, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "psubd {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v4i32 (sub VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (sub VR128:$src1,
+ (bc_v4i32 (loadv2i64 addr:$src2))))]>;
 def PSUBQrm : PDI<0xFB, MRMSrcMem, (ops VR128:$dst, VR128:$src1, i128mem:$src2),
   "psubd {$src2, $dst|$dst, $src2}",
-  [(set VR128:$dst, (v2i64 (sub VR128:$src1,
-(load addr:$src2))))]>;
+  [(set VR128:$dst, (sub VR128:$src1,
+ (loadv2i64 addr:$src2)))]>;
 
 def PSUBSBrr : PDI<0xE8, MRMSrcReg, (ops VR128:$dst, VR128:$src1, VR128:$src2),
"psubsb {$src2, $dst|$dst, $src2}",





[llvm-commits] CVS: llvm/lib/Target/CBackend/Writer.cpp

2006-04-17 Thread Jeff Cohen


Changes in directory llvm/lib/Target/CBackend:

Writer.cpp updated: 1.259 -> 1.260
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+1 -1)

 Writer.cpp |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm/lib/Target/CBackend/Writer.cpp
diff -u llvm/lib/Target/CBackend/Writer.cpp:1.259 
llvm/lib/Target/CBackend/Writer.cpp:1.260
--- llvm/lib/Target/CBackend/Writer.cpp:1.259   Thu Mar 23 12:08:29 2006
+++ llvm/lib/Target/CBackend/Writer.cpp Mon Apr 17 12:55:40 2006
@@ -813,7 +813,7 @@
   << "extern void *__builtin_alloca(unsigned int);\n"
   << "#endif\n"
   << "#define alloca(x) __builtin_alloca(x)\n"
-  << "#elif defined(__FreeBSD__)\n"
+  << "#elif defined(__FreeBSD__) || defined(__OpenBSD__)\n"
   << "#define alloca(x) __builtin_alloca(x)\n"
   << "#elif !defined(_MSC_VER)\n"
   << "#include \n"





[llvm-commits] CVS: llvm-test/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c

2006-04-17 Thread Jeff Cohen


Changes in directory llvm-test/SingleSource/Regression/C:

2004-08-12-InlinerAndAllocas.c updated: 1.4 -> 1.5
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+1 -1)

 2004-08-12-InlinerAndAllocas.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm-test/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c
diff -u llvm-test/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c:1.4 
llvm-test/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c:1.5
--- llvm-test/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c:1.4  
Sun Jan 22 23:28:20 2006
+++ llvm-test/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c  Mon Apr 
17 12:55:41 2006
@@ -1,7 +1,7 @@
 // A compiler cannot inline Callee into main unless it is prepared to reclaim
 // the stack memory allocated in it.
 
-#ifdef __FreeBSD__
+#if defined(__FreeBSD__) || defined(__OpenBSD__)
 #include 
 #else
 #include 





[llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/MallocBench/make/arscan.c job.c make.h misc.c read.c

2006-04-17 Thread Jeff Cohen


Changes in directory llvm-test/MultiSource/Benchmarks/MallocBench/make:

arscan.c updated: 1.4 -> 1.5
job.c updated: 1.4 -> 1.5
make.h updated: 1.4 -> 1.5
misc.c updated: 1.4 -> 1.5
read.c updated: 1.5 -> 1.6
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+11 -8)

 arscan.c |5 +++--
 job.c|4 ++--
 make.h   |5 +++--
 misc.c   |2 +-
 read.c   |3 ++-
 5 files changed, 11 insertions(+), 8 deletions(-)


Index: llvm-test/MultiSource/Benchmarks/MallocBench/make/arscan.c
diff -u llvm-test/MultiSource/Benchmarks/MallocBench/make/arscan.c:1.4 
llvm-test/MultiSource/Benchmarks/MallocBench/make/arscan.c:1.5
--- llvm-test/MultiSource/Benchmarks/MallocBench/make/arscan.c:1.4  Tue Jul 
20 13:24:33 2004
+++ llvm-test/MultiSource/Benchmarks/MallocBench/make/arscan.c  Mon Apr 17 
12:55:40 2006
@@ -38,7 +38,8 @@
 #endif
 
 #if (defined(STDC_HEADERS) || defined(__GNU_LIBRARY__) || \
-  defined(POSIX)) || defined(__FreeBSD__) || defined(__APPLE__)
+  defined(POSIX)) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
+ defined(__APPLE__)
 #include 
 #include 
 #define ANSI_STRING
@@ -94,7 +95,7 @@
 #endif
 
 #if defined(__GNU_LIBRARY__) || defined(POSIX) || defined(_IBMR2) || \
-defined(__FreeBSD__) || defined(__APPLE__)
+defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
 #include 
 #else
 extern int read (), open (), close (), write (), fstat ();


Index: llvm-test/MultiSource/Benchmarks/MallocBench/make/job.c
diff -u llvm-test/MultiSource/Benchmarks/MallocBench/make/job.c:1.4 
llvm-test/MultiSource/Benchmarks/MallocBench/make/job.c:1.5
--- llvm-test/MultiSource/Benchmarks/MallocBench/make/job.c:1.4 Tue Jul 20 
13:24:33 2004
+++ llvm-test/MultiSource/Benchmarks/MallocBench/make/job.c Mon Apr 17 
12:55:40 2006
@@ -31,7 +31,7 @@
 char default_shell[] = "/bin/sh";
 
 #if defined(POSIX) || defined(__GNU_LIBRARY__) || defined(__FreeBSD__) || \
-defined(__APPLE__)
+defined(__OpenBSD__) || defined(__APPLE__)
 #include 
 #include 
 #define GET_NGROUPS_MAX sysconf (_SC_NGROUPS_MAX)
@@ -102,7 +102,7 @@
 
 
 #if defined(__GNU_LIBRARY__) || defined(POSIX) || defined(__CYGWIN__) || \
-defined(__FreeBSD__) || defined(__APPLE__)
+defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
 
 #include 
 #define GID_T   gid_t


Index: llvm-test/MultiSource/Benchmarks/MallocBench/make/make.h
diff -u llvm-test/MultiSource/Benchmarks/MallocBench/make/make.h:1.4 
llvm-test/MultiSource/Benchmarks/MallocBench/make/make.h:1.5
--- llvm-test/MultiSource/Benchmarks/MallocBench/make/make.h:1.4Tue Jul 
20 13:24:33 2004
+++ llvm-test/MultiSource/Benchmarks/MallocBench/make/make.hMon Apr 17 
12:55:40 2006
@@ -83,7 +83,8 @@
 
 
 #if (defined(STDC_HEADERS) || defined(__GNU_LIBRARY__) || \
-  defined(POSIX) || defined(__FreeBSD__) || defined(__APPLE__))
+  defined(POSIX) || defined(__FreeBSD__) || defined(__APPLE__) || \
+ defined(__OpenBSD__))
 #include 
 #include 
 #define ANSI_STRING
@@ -216,7 +217,7 @@
 #endif /* USG and don't have vfork.  */
 
 #if defined(__GNU_LIBRARY__) || defined(POSIX) || defined(__CYGWIN__) || \
-defined(__FreeBSD__) || defined(__APPLE__)
+defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
 
 #include 
 #include 


Index: llvm-test/MultiSource/Benchmarks/MallocBench/make/misc.c
diff -u llvm-test/MultiSource/Benchmarks/MallocBench/make/misc.c:1.4 
llvm-test/MultiSource/Benchmarks/MallocBench/make/misc.c:1.5
--- llvm-test/MultiSource/Benchmarks/MallocBench/make/misc.c:1.4Tue Jul 
20 13:24:33 2004
+++ llvm-test/MultiSource/Benchmarks/MallocBench/make/misc.cMon Apr 17 
12:55:40 2006
@@ -477,7 +477,7 @@
 #endif
 
 #if !defined(POSIX) && !defined(__GNU_LIBRARY__) && !defined(__CYGWIN__) && \
-!defined(__FreeBSD__) && !defined(__APPLE__)
+!defined(__FreeBSD__) && !defined(__OpenBSD__) && !defined(__APPLE__)
 extern int getuid (), getgid (), geteuid (), getegid ();
 #ifdef USG
 extern int setuid (), setgid ();


Index: llvm-test/MultiSource/Benchmarks/MallocBench/make/read.c
diff -u llvm-test/MultiSource/Benchmarks/MallocBench/make/read.c:1.5 
llvm-test/MultiSource/Benchmarks/MallocBench/make/read.c:1.6
--- llvm-test/MultiSource/Benchmarks/MallocBench/make/read.c:1.5Tue Sep 
28 12:24:36 2004
+++ llvm-test/MultiSource/Benchmarks/MallocBench/make/read.cMon Apr 17 
12:55:40 2006
@@ -23,7 +23,8 @@
 
 /* This is POSIX.2, but most systems using -DPOSIX probably don't have it.  */
 #if defined(__GNU_LIBRARY__) || defined(__CYGWIN__) || defined(__FreeBSD__) || \
-defined(__APPLE__) || (defined(__sun__) && defined(__svr4__))
+defined(__OpenBSD__) || defined(__APPLE__) || \
+(defined(__sun__) && defined(__svr4__))
 #include 
 #else
 #include "glob/glob.h"




[llvm-commits] CVS: llvm/lib/System/Mutex.cpp

2006-04-17 Thread Jeff Cohen


Changes in directory llvm/lib/System:

Mutex.cpp updated: 1.6 -> 1.7
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+1 -1)

 Mutex.cpp |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm/lib/System/Mutex.cpp
diff -u llvm/lib/System/Mutex.cpp:1.6 llvm/lib/System/Mutex.cpp:1.7
--- llvm/lib/System/Mutex.cpp:1.6   Wed Aug 24 05:07:21 2005
+++ llvm/lib/System/Mutex.cpp   Mon Apr 17 12:55:40 2006
@@ -75,7 +75,7 @@
 errorcode = pthread_mutexattr_settype(&attr, kind);
 assert(errorcode == 0);
 
-#ifndef __FreeBSD__
+#if !defined(__FreeBSD__) && !defined(__OpenBSD__)
 // Make it a process local mutex
 errorcode = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_PRIVATE);
 #endif





[llvm-commits] CVS: llvm-test/SingleSource/Benchmarks/Misc/mandel.c

2006-04-17 Thread Jeff Cohen


Changes in directory llvm-test/SingleSource/Benchmarks/Misc:

mandel.c updated: 1.10 -> 1.11
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+1 -1)

 mandel.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm-test/SingleSource/Benchmarks/Misc/mandel.c
diff -u llvm-test/SingleSource/Benchmarks/Misc/mandel.c:1.10 
llvm-test/SingleSource/Benchmarks/Misc/mandel.c:1.11
--- llvm-test/SingleSource/Benchmarks/Misc/mandel.c:1.10Tue Jul 20 
11:11:20 2004
+++ llvm-test/SingleSource/Benchmarks/Misc/mandel.c Mon Apr 17 12:55:40 2006
@@ -14,7 +14,7 @@
 
 #define I 1.0iF
 
-#if defined(__FreeBSD__)
+#if defined(__FreeBSD__) || defined(__OpenBSD__)
 #include 
 #elif defined(__APPLE__)
 #include 





[llvm-commits] CVS: llvm-test/MultiSource/Benchmarks/Olden/voronoi/newvor.c

2006-04-17 Thread Jeff Cohen


Changes in directory llvm-test/MultiSource/Benchmarks/Olden/voronoi:

newvor.c updated: 1.10 -> 1.11
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+1 -1)

 newvor.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm-test/MultiSource/Benchmarks/Olden/voronoi/newvor.c
diff -u llvm-test/MultiSource/Benchmarks/Olden/voronoi/newvor.c:1.10 
llvm-test/MultiSource/Benchmarks/Olden/voronoi/newvor.c:1.11
--- llvm-test/MultiSource/Benchmarks/Olden/voronoi/newvor.c:1.10Fri Jul 
15 19:26:48 2005
+++ llvm-test/MultiSource/Benchmarks/Olden/voronoi/newvor.c Mon Apr 17 
12:55:40 2006
@@ -165,7 +165,7 @@
 
 void delete_all_edges() { next_edge= 0; avail_edge = NYL;}
 
-#if defined(__APPLE__) || defined(__FreeBSD__)
+#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__OpenBSD__)
 #define MEMALIGN_IS_NOT_AVAILABLE
 #endif
 





[llvm-commits] CVS: llvm-test/MultiSource/Applications/hexxagon/hexxagonmove.cpp

2006-04-17 Thread Jeff Cohen


Changes in directory llvm-test/MultiSource/Applications/hexxagon:

hexxagonmove.cpp updated: 1.3 -> 1.4
---
Log message:

Add checks for __OpenBSD__.

---
Diffs of the changes:  (+1 -1)

 hexxagonmove.cpp |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)


Index: llvm-test/MultiSource/Applications/hexxagon/hexxagonmove.cpp
diff -u llvm-test/MultiSource/Applications/hexxagon/hexxagonmove.cpp:1.3 
llvm-test/MultiSource/Applications/hexxagon/hexxagonmove.cpp:1.4
--- llvm-test/MultiSource/Applications/hexxagon/hexxagonmove.cpp:1.3Wed Oct 
26 10:34:35 2005
+++ llvm-test/MultiSource/Applications/hexxagon/hexxagonmove.cppMon Apr 
17 12:55:40 2006
@@ -25,7 +25,7 @@
 
 #include 
 #include 
-#if defined(__FreeBSD__) || defined(__APPLE__)
+#if defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__APPLE__)
 #include 
 #else
 #include 





[llvm-commits] CVS: llvm/lib/Target/PowerPC/PPCISelLowering.cpp

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

PPCISelLowering.cpp updated: 1.158 -> 1.159
---
Log message:

Teach the ppc backend to use rol and vsldoi to generate splatted constants.
This implements vec_constants.ll:test_vsldoi and test_rol


---
Diffs of the changes:  (+49 -15)

 PPCISelLowering.cpp |   64 +++-
 1 files changed, 49 insertions(+), 15 deletions(-)


Index: llvm/lib/Target/PowerPC/PPCISelLowering.cpp
diff -u llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.158 
llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.159
--- llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1.158   Mon Apr 17 01:58:41 2006
+++ llvm/lib/Target/PowerPC/PPCISelLowering.cpp Mon Apr 17 12:55:10 2006
@@ -1070,6 +1070,22 @@
  DAG.getConstant(IID, MVT::i32), LHS, RHS);
 }
 
+/// BuildVSLDOI - Return a VECTOR_SHUFFLE that is a vsldoi of the specified
+/// amount.  The result has the specified value type.
+static SDOperand BuildVSLDOI(SDOperand LHS, SDOperand RHS, unsigned Amt,
+ MVT::ValueType VT, SelectionDAG &DAG) {
+  // Force LHS/RHS to be the right type.
+  LHS = DAG.getNode(ISD::BIT_CONVERT, MVT::v16i8, LHS);
+  RHS = DAG.getNode(ISD::BIT_CONVERT, MVT::v16i8, RHS);
+  
+  std::vector<SDOperand> Ops;
+  for (unsigned i = 0; i != 16; ++i)
+Ops.push_back(DAG.getConstant(i+Amt, MVT::i32));
+  SDOperand T = DAG.getNode(ISD::VECTOR_SHUFFLE, MVT::v16i8, LHS, RHS,
+DAG.getNode(ISD::BUILD_VECTOR, MVT::v16i8, Ops));
+  return DAG.getNode(ISD::BIT_CONVERT, VT, T);
+}
+
 // If this is a case we can't handle, return null and let the default
 // expansion code take care of it.  If we CAN select this case, and if it
 // selects to a single instruction, return Op.  Otherwise, if we can codegen
@@ -1179,11 +1195,34 @@
 return BuildIntrinsicBinOp(IIDs[SplatSize-1], Op, Op, DAG);
   }
   
-  // TODO: ROL.
+  // vsplti + rol self.
+  if (SextVal == (int)(((unsigned)i << TypeShiftAmt) |
+   ((unsigned)i >> (SplatBitSize-TypeShiftAmt {
+Op = BuildSplatI(i, SplatSize, Op.getValueType(), DAG);
+static const unsigned IIDs[] = { // Intrinsic to use for each size.
+  Intrinsic::ppc_altivec_vrlb, Intrinsic::ppc_altivec_vrlh, 0,
+  Intrinsic::ppc_altivec_vrlw
+};
+return BuildIntrinsicBinOp(IIDs[SplatSize-1], Op, Op, DAG);
+  }
+
+  // t = vsplti c, result = vsldoi t, t, 1
+  if (SextVal == ((i << 8) | (i >> (TypeShiftAmt-8)))) {
+SDOperand T = BuildSplatI(i, SplatSize, MVT::v16i8, DAG);
+return BuildVSLDOI(T, T, 1, Op.getValueType(), DAG);
+  }
+  // t = vsplti c, result = vsldoi t, t, 2
+  if (SextVal == ((i << 16) | (i >> (TypeShiftAmt-16)))) {
+SDOperand T = BuildSplatI(i, SplatSize, MVT::v16i8, DAG);
+return BuildVSLDOI(T, T, 2, Op.getValueType(), DAG);
+  }
+  // t = vsplti c, result = vsldoi t, t, 3
+  if (SextVal == ((i << 24) | (i >> (TypeShiftAmt-24)))) {
+SDOperand T = BuildSplatI(i, SplatSize, MVT::v16i8, DAG);
+return BuildVSLDOI(T, T, 3, Op.getValueType(), DAG);
+  }
 }
 
-
-
 // Three instruction sequences.
 
 // Otherwise, in range [17,29]:  (vsplti 15) + (vsplti C).
@@ -1224,6 +1263,10 @@
 return RHS;
   }
   
+  SDOperand OpLHS, OpRHS;
+  OpLHS = GeneratePerfectShuffle(PerfectShuffleTable[LHSID], LHS, RHS, DAG);
+  OpRHS = GeneratePerfectShuffle(PerfectShuffleTable[RHSID], LHS, RHS, DAG);
+  
   unsigned ShufIdxs[16];
   switch (OpNum) {
   default: assert(0 && "Unknown i32 permute!");
@@ -1256,24 +1299,15 @@
   ShufIdxs[i] = (i&3)+12;
 break;
   case OP_VSLDOI4:
-for (unsigned i = 0; i != 16; ++i)
-  ShufIdxs[i] = i+4;
-break;
+return BuildVSLDOI(OpLHS, OpRHS, 4, OpLHS.getValueType(), DAG);
   case OP_VSLDOI8:
-for (unsigned i = 0; i != 16; ++i)
-  ShufIdxs[i] = i+8;
-break;
+return BuildVSLDOI(OpLHS, OpRHS, 8, OpLHS.getValueType(), DAG);
   case OP_VSLDOI12:
-for (unsigned i = 0; i != 16; ++i)
-  ShufIdxs[i] = i+12;
-break;
+return BuildVSLDOI(OpLHS, OpRHS, 12, OpLHS.getValueType(), DAG);
   }
   std::vector<SDOperand> Ops;
   for (unsigned i = 0; i != 16; ++i)
 Ops.push_back(DAG.getConstant(ShufIdxs[i], MVT::i32));
-  SDOperand OpLHS, OpRHS;
-  OpLHS = GeneratePerfectShuffle(PerfectShuffleTable[LHSID], LHS, RHS, DAG);
-  OpRHS = GeneratePerfectShuffle(PerfectShuffleTable[RHSID], LHS, RHS, DAG);
   
   return DAG.getNode(ISD::VECTOR_SHUFFLE, OpLHS.getValueType(), OpLHS, OpRHS,
  DAG.getNode(ISD::BUILD_VECTOR, MVT::v16i8, Ops));
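
The relationship between the vsldoi shuffle mask (i+Amt) and the rotated splat constants matched above can be illustrated outside the backend. The following is only a rough sketch under stated assumptions: Vec16, splatWord, vsldoi, wordAt, rotl32 and checkRotateBySplat are hypothetical names for this note, not LLVM APIs, and the model assumes 32-bit big-endian lanes as on PowerPC.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte vector model; byte 0 is the most significant byte of
// word lane 0 (big-endian lane layout assumed).
using Vec16 = std::array<std::uint8_t, 16>;

// Splat a 32-bit value into all four word lanes.
Vec16 splatWord(std::uint32_t v) {
  Vec16 r{};
  for (unsigned i = 0; i != 16; ++i)
    r[i] = static_cast<std::uint8_t>(v >> (8 * (3 - (i & 3))));
  return r;
}

// Model of vsldoi: take 16 bytes starting at byte offset amt from the
// 32-byte concatenation lhs:rhs. Index i+amt mirrors the shuffle mask.
Vec16 vsldoi(const Vec16 &lhs, const Vec16 &rhs, unsigned amt) {
  Vec16 r{};
  for (unsigned i = 0; i != 16; ++i) {
    unsigned idx = i + amt; // 0..15 selects lhs, 16..31 selects rhs
    r[i] = idx < 16 ? lhs[idx] : rhs[idx - 16];
  }
  return r;
}

// Read back one 32-bit lane.
std::uint32_t wordAt(const Vec16 &v, unsigned lane) {
  std::uint32_t w = 0;
  for (unsigned b = 0; b != 4; ++b)
    w = (w << 8) | v[4 * lane + b];
  return w;
}

// Rotate left by n bits (n taken modulo 32).
std::uint32_t rotl32(std::uint32_t v, unsigned n) {
  n &= 31;
  return n ? (v << n) | (v >> (32 - n)) : v;
}

// vsldoi t, t, k on a word splat rotates every lane left by 8*k bits.
void checkRotateBySplat() {
  Vec16 t = splatWord(5u);
  for (unsigned amt = 1; amt <= 3; ++amt) {
    Vec16 r = vsldoi(t, t, amt);
    for (unsigned lane = 0; lane != 4; ++lane)
      assert(wordAt(r, lane) == rotl32(5u, 8 * amt));
  }
}
```

In this model, shifting a splat against itself by one, two or three bytes yields the splat of the value rotated by 8, 16 or 24 bits, which is the family of constants the new two-instruction cases appear to target.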





[llvm-commits] CVS: llvm/test/Regression/CodeGen/PowerPC/vec_constants.ll

2006-04-17 Thread Chris Lattner


Changes in directory llvm/test/Regression/CodeGen/PowerPC:

vec_constants.ll updated: 1.4 -> 1.5
---
Log message:

Some more cases that can be generated with two instructions


---
Diffs of the changes:  (+9 -0)

 vec_constants.ll |9 +
 1 files changed, 9 insertions(+)


Index: llvm/test/Regression/CodeGen/PowerPC/vec_constants.ll
diff -u llvm/test/Regression/CodeGen/PowerPC/vec_constants.ll:1.4 
llvm/test/Regression/CodeGen/PowerPC/vec_constants.ll:1.5
--- llvm/test/Regression/CodeGen/PowerPC/vec_constants.ll:1.4   Mon Apr 17 
01:58:16 2006
+++ llvm/test/Regression/CodeGen/PowerPC/vec_constants.ll   Mon Apr 17 
12:54:18 2006
@@ -37,3 +37,12 @@
   sbyte -104, sbyte -104, sbyte -104, sbyte -104,
   sbyte -104, sbyte -104, sbyte -104, sbyte -104>
 }
+
+<4 x int> %test_vsldoi() {
+ret <4 x int> 
+}
+
+<4 x int> %test_rol() {
+ret <4 x int> 
+}
+





[llvm-commits] CVS: llvm/lib/Target/PowerPC/README_ALTIVEC.txt

2006-04-17 Thread Chris Lattner


Changes in directory llvm/lib/Target/PowerPC:

README_ALTIVEC.txt updated: 1.25 -> 1.26
---
Log message:

add a note


---
Diffs of the changes:  (+10 -0)

 README_ALTIVEC.txt |   10 ++
 1 files changed, 10 insertions(+)


Index: llvm/lib/Target/PowerPC/README_ALTIVEC.txt
diff -u llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.25 
llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.26
--- llvm/lib/Target/PowerPC/README_ALTIVEC.txt:1.25 Mon Apr 17 00:28:54 2006
+++ llvm/lib/Target/PowerPC/README_ALTIVEC.txt  Mon Apr 17 12:29:41 2006
@@ -110,3 +110,13 @@
 We can do an arbitrary non-constant value by using lvsr/perm/ste.
 
 
//===--===//
+
+If we want to tie instruction selection into the scheduler, we can do some
+constant formation with different instructions.  For example, we can generate
+"vsplti -1" with "vcmpequw R,R" and 1,1,1,1 with "vsubcuw R,R", both of which
+use different execution units, thus could help scheduling.
+
+This is probably only reasonable for a post-pass scheduler.
+
+//===--===//
+
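
The two constant-forming idioms mentioned in the note can be checked with a scalar model of a single 32-bit lane. This is only a sketch: vcmpequwLane, vsubcuwLane and checkConstantIdioms are hypothetical helper names, and the carry-out definition of vsubcuw (carry of a + ~b + 1) is stated here as the assumed semantics.

```cpp
#include <cassert>
#include <cstdint>

// One lane of vcmpequw: all ones when the operands are equal, else zero.
std::uint32_t vcmpequwLane(std::uint32_t a, std::uint32_t b) {
  return a == b ? 0xFFFFFFFFu : 0u;
}

// One lane of vsubcuw: the carry out of the 32-bit sum a + ~b + 1,
// which is 1 exactly when a >= b (no borrow).
std::uint32_t vsubcuwLane(std::uint32_t a, std::uint32_t b) {
  std::uint64_t sum = std::uint64_t(a) + std::uint64_t(~b) + 1u;
  return static_cast<std::uint32_t>(sum >> 32);
}

// With both operands being the same register R, the results do not
// depend on the contents of R.
void checkConstantIdioms() {
  const std::uint32_t samples[] = {0u, 7u, 0xDEADBEEFu, 0xFFFFFFFFu};
  for (std::uint32_t r : samples) {
    assert(vcmpequwLane(r, r) == 0xFFFFFFFFu); // same bits as "vsplti -1"
    assert(vsubcuwLane(r, r) == 1u);           // same bits as 1,1,1,1
  }
}
```

Because neither result depends on the register contents, either instruction could stand in for the corresponding vsplti when a different execution unit is free.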





[llvm-commits] CVS: llvm-test/Makefile.config.in configure

2006-04-17 Thread Evan Cheng


Changes in directory llvm-test:

Makefile.config.in updated: 1.19 -> 1.20
configure updated: 1.32 -> 1.33
---
Log message:

Added Nurbs external test.

---
Diffs of the changes:  (+55 -1)

 Makefile.config.in |4 
 configure  |   52 +++-
 2 files changed, 55 insertions(+), 1 deletion(-)


Index: llvm-test/Makefile.config.in
diff -u llvm-test/Makefile.config.in:1.19 llvm-test/Makefile.config.in:1.20
--- llvm-test/Makefile.config.in:1.19   Mon Aug  8 16:26:08 2005
+++ llvm-test/Makefile.config.inMon Apr 17 03:02:47 2006
@@ -76,6 +76,10 @@
 @USE_ALP@
 ALP_ROOT := @ALP_ROOT@
 
+# Path to the NURBS source code
+@USE_NURBS@
+NURBS_ROOT := @NURBS_ROOT@
+
 # Disable LLC diffs for testing.
 @DISABLE_LLC_DIFFS@
 


Index: llvm-test/configure
diff -u llvm-test/configure:1.32 llvm-test/configure:1.33
--- llvm-test/configure:1.32Fri Apr  7 13:53:21 2006
+++ llvm-test/configure Mon Apr 17 03:02:47 2006
@@ -465,7 +465,7 @@
 # include 
 #endif"
 
-ac_subst_vars='SHELL PATH_SEPARATOR PACKAGE_NAME PACKAGE_TARNAME 
PACKAGE_VERSION PACKAGE_STRING PACKAGE_BUGREPORT exec_prefix prefix 
program_transform_name bindir sbindir libexecdir datadir sysconfdir 
sharedstatedir localstatedir libdir includedir oldincludedir infodir mandir 
build_alias host_alias target_alias DEFS ECHO_C ECHO_N ECHO_T LIBS LLVM_SRC 
LLVM_OBJ LLVM_EXTERNALS SPEC95_ROOT USE_SPEC95 SPEC2000_ROOT USE_SPEC2000 
POVRAY_ROOT USE_POVRAY NAMD_ROOT USE_NAMD SWEEP3D_ROOT USE_SWEEP3D 
FPGROWTH_ROOT USE_FPGROWTH ALP_ROOT USE_ALP DISABLE_LLC_DIFFS CXX CXXFLAGS 
LDFLAGS CPPFLAGS ac_ct_CXX EXEEXT OBJEXT CC CFLAGS ac_ct_CC CPP ifGNUmake LEX 
LEXLIB LEX_OUTPUT_ROOT FLEX YACC BISON build build_cpu build_vendor build_os 
host host_cpu host_vendor host_os EGREP LN_S ECHO AR ac_ct_AR RANLIB 
ac_ct_RANLIB STRIP ac_ct_STRIP CXXCPP F77 FFLAGS ac_ct_F77 LIBTOOL USE_F2C F2C 
F2C_BIN F2C_DIR F2C_INC F2C_LIB USE_F95 F95 F95_BIN F95_DIR F95_INC F95_LIB 
HAVE_RE_COMP LIBOBJS LTLIBOBJS'
+ac_subst_vars='SHELL PATH_SEPARATOR PACKAGE_NAME PACKAGE_TARNAME 
PACKAGE_VERSION PACKAGE_STRING PACKAGE_BUGREPORT exec_prefix prefix 
program_transform_name bindir sbindir libexecdir datadir sysconfdir 
sharedstatedir localstatedir libdir includedir oldincludedir infodir mandir 
build_alias host_alias target_alias DEFS ECHO_C ECHO_N ECHO_T LIBS LLVM_SRC 
LLVM_OBJ LLVM_EXTERNALS SPEC95_ROOT USE_SPEC95 SPEC2000_ROOT USE_SPEC2000 
POVRAY_ROOT USE_POVRAY NAMD_ROOT USE_NAMD SWEEP3D_ROOT USE_SWEEP3D 
FPGROWTH_ROOT USE_FPGROWTH ALP_ROOT USE_ALP NURBS_ROOT USE_NURBS 
DISABLE_LLC_DIFFS CXX CXXFLAGS LDFLAGS CPPFLAGS ac_ct_CXX EXEEXT OBJEXT CC 
CFLAGS ac_ct_CC CPP ifGNUmake LEX LEXLIB LEX_OUTPUT_ROOT FLEX YACC BISON build 
build_cpu build_vendor build_os host host_cpu host_vendor host_os EGREP LN_S 
ECHO AR ac_ct_AR RANLIB ac_ct_RANLIB STRIP ac_ct_STRIP CXXCPP F77 FFLAGS 
ac_ct_F77 LIBTOOL USE_F2C F2C F2C_BIN F2C_DIR F2C_INC F2C_LIB USE_F95 F95 
F95_BIN F95_DIR F95_INC F95_LIB HAVE_RE_COMP LIBOBJS LTLIBOBJS'
 ac_subst_files=''
 
 # Initialize some variables set by options.
@@ -1044,6 +1044,7 @@
   --with-sweep3d=DIR  Use sweep3d as a benchmark (srcs in DIR)
   --with-fpgrowth=DIR Use fpgrowth as a benchmark (srcs in DIR)
   --with-alp=DIR  Use alp as a benchmark (srcs in DIR)
+  --with-nurbs=DIRUse nurbs as a benchmark (srcs in DIR)
   --with-gnu-ld   assume the C compiler uses GNU ld [default=no]
   --with-pic  try to use only PIC/non-PIC objects [default=use
   both]
@@ -2028,6 +2029,53 @@
 
 
 
+# Check whether --with-nurbs or --without-nurbs was given.
+if test "${with_nurbs+set}" = set; then
+  withval="$with_nurbs"
+  checkresult=$withval
+else
+  checkresult=auto
+fi;
+echo "$as_me:$LINENO: checking for nurbs benchmark sources" >&5
+echo $ECHO_N "checking for nurbs benchmark sources... $ECHO_C" >&6
+case "$checkresult" in
+auto|yes)
+defaultdir=${LLVM_EXTERNALS}/nurbs
+   if test -d "$defaultdir"
+   then
+   NURBS_ROOT=$defaultdir
+
+   USE_NURBS=USE_NURBS=1
+
+checkresult="yes, found in $defaultdir"
+else
+checkresult=no
+fi
+;;
+no)
+
+
+checkresult=no
+;;
+*)  if test -d "$checkresult"
+then
+NURBS_ROOT="$checkresult"
+
+USE_NURBS=USE_NURBS=1
+
+checkresult="yes, in $checkresult"
+else
+
+
+checkresult="no, not found in $checkresult"
+fi
+;;
+esac
+echo "$as_me:$LINENO: result: $checkresult" >&5
+echo "${ECHO_T}$checkresult" >&6
+
+
+
 # Check whether --enable-llc_diffs or --disable-llc_diffs was given.
 if test "${enable_llc_diffs+set}" = set; then
   enableval="$enable_llc_diffs"
@@ -20766,6 +20814,8 @@
 s,@USE_FPGROWTH@,$USE_FPGROWTH,;t t
 s,@ALP_ROOT@,$ALP_ROOT,;t t
 s,@USE_ALP@,$USE_ALP,;t t
+s,@NURBS_ROOT@,$NURBS_ROOT,;t t
+s,@USE_NURBS@,$USE_NURBS,;t t
 s,@DISABLE_LLC_DIFFS@,$DISABLE_LLC_DIFFS,;t t
 s,@CXX@,$CXX,;t t
 s,@CXXFLA

[llvm-commits] CVS: llvm-test/External/Nurbs/Makefile README

2006-04-17 Thread Evan Cheng


Changes in directory llvm-test/External/Nurbs:

Makefile added (r1.1)
README added (r1.1)
---
Log message:

Initial commit of Nurbs as an External test.

---
Diffs of the changes:  (+16 -0)

 Makefile |   14 ++
 README   |2 ++
 2 files changed, 16 insertions(+)


Index: llvm-test/External/Nurbs/Makefile
diff -c /dev/null llvm-test/External/Nurbs/Makefile:1.1
*** /dev/null   Mon Apr 17 02:59:57 2006
--- llvm-test/External/Nurbs/Makefile   Mon Apr 17 02:59:47 2006
***
*** 0 
--- 1,14 
+ LEVEL = ../..
+ 
+ include $(LEVEL)/Makefile.config
+ 
+ PROG = nurbs
+ SourceDir := $(NURBS_ROOT)
+ 
+ CPPFLAGS =
+ LDFLAGS = -lstdc++
+ LIBS += -lstdc++
+ 
+ RUN_OPTIONS = /k all timed /t 500 /vsteps 64 /usteps 64 /vcp 20 /ucp 20
+ 
+ include $(LEVEL)/MultiSource/Makefile.multisrc


Index: llvm-test/External/Nurbs/README
diff -c /dev/null llvm-test/External/Nurbs/README:1.1
*** /dev/null   Mon Apr 17 02:59:59 2006
--- llvm-test/External/Nurbs/README Mon Apr 17 02:59:47 2006
***
*** 0 
--- 1,2 
+ Comparison of Uniform NURBS Surface Tessellation
+ http://www.intel.com/cd/ids/developer/asmo-na/eng/dc/code/19068.htm





[llvm-commits] CVS: llvm-test/External/Nurbs/

2006-04-17 Thread LLVM


Changes in directory llvm-test/External/Nurbs:

---
Log message:

Directory /var/cvs/llvm/llvm-test/External/Nurbs added to the repository


---
Diffs of the changes:  (+0 -0)

 0 files changed





[llvm-commits] CVS: llvm/lib/Target/X86/X86ISelLowering.cpp

2006-04-17 Thread Evan Cheng


Changes in directory llvm/lib/Target/X86:

X86ISelLowering.cpp updated: 1.168 -> 1.169
---
Log message:

FP SETOLE, SETOLT, SETUGE, SETUGT conditions were implemented incorrectly

---
Diffs of the changes:  (+4 -4)

 X86ISelLowering.cpp |8 
 1 files changed, 4 insertions(+), 4 deletions(-)


Index: llvm/lib/Target/X86/X86ISelLowering.cpp
diff -u llvm/lib/Target/X86/X86ISelLowering.cpp:1.168 
llvm/lib/Target/X86/X86ISelLowering.cpp:1.169
--- llvm/lib/Target/X86/X86ISelLowering.cpp:1.168   Sat Apr 15 00:37:34 2006
+++ llvm/lib/Target/X86/X86ISelLowering.cpp Mon Apr 17 02:24:10 2006
@@ -1238,16 +1238,16 @@
 default: break;
 case ISD::SETUEQ:
 case ISD::SETEQ: X86CC = X86ISD::COND_E;  break;
-case ISD::SETOLE: Flip = true; // Fallthrough
+case ISD::SETOLT: Flip = true; // Fallthrough
 case ISD::SETOGT:
 case ISD::SETGT: X86CC = X86ISD::COND_A;  break;
-case ISD::SETOLT: Flip = true; // Fallthrough
+case ISD::SETOLE: Flip = true; // Fallthrough
 case ISD::SETOGE:
 case ISD::SETGE: X86CC = X86ISD::COND_AE; break;
-case ISD::SETUGE: Flip = true; // Fallthrough
+case ISD::SETUGT: Flip = true; // Fallthrough
 case ISD::SETULT:
 case ISD::SETLT: X86CC = X86ISD::COND_B;  break;
-case ISD::SETUGT: Flip = true; // Fallthrough
+case ISD::SETUGE: Flip = true; // Fallthrough
 case ISD::SETULE:
 case ISD::SETLE: X86CC = X86ISD::COND_BE; break;
 case ISD::SETONE:
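
The operand-swap reasoning behind the corrected pairing can be checked with plain floating-point comparisons. The sketch below is illustrative only: the setO*/setU* and lowered* helpers are hypothetical stand-ins for the ordered and unordered predicates, not the backend's code, and "Flip" is modelled simply as swapping the two arguments.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Unordered means at least one operand is NaN.
inline bool isUnord(double a, double b) { return std::isnan(a) || std::isnan(b); }

// Ordered predicates: false whenever an operand is NaN.
inline bool setOLT(double a, double b) { return !isUnord(a, b) && a < b; }
inline bool setOLE(double a, double b) { return !isUnord(a, b) && a <= b; }
inline bool setOGT(double a, double b) { return !isUnord(a, b) && a > b; }
inline bool setOGE(double a, double b) { return !isUnord(a, b) && a >= b; }

// Unordered predicates: true whenever an operand is NaN.
inline bool setULT(double a, double b) { return isUnord(a, b) || a < b; }
inline bool setULE(double a, double b) { return isUnord(a, b) || a <= b; }
inline bool setUGT(double a, double b) { return isUnord(a, b) || a > b; }
inline bool setUGE(double a, double b) { return isUnord(a, b) || a >= b; }

// Pairing after the patch: swap the operands and reuse the partner predicate.
inline bool loweredOLT(double a, double b) { return setOGT(b, a); }
inline bool loweredOLE(double a, double b) { return setOGE(b, a); }
inline bool loweredUGT(double a, double b) { return setULT(b, a); }
inline bool loweredUGE(double a, double b) { return setULE(b, a); }

// Compare the flipped forms with the direct predicates, including NaN inputs.
void checkFlippedPredicates() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double vals[] = {1.0, 2.0, nan};
  for (double a : vals)
    for (double b : vals) {
      assert(loweredOLT(a, b) == setOLT(a, b));
      assert(loweredOLE(a, b) == setOLE(a, b));
      assert(loweredUGT(a, b) == setUGT(a, b));
      assert(loweredUGE(a, b) == setUGE(a, b));
    }
}
```

With equal operands the previous pairing goes wrong: setOLE(1.0, 1.0) is true, but the swapped setOGT(1.0, 1.0) is false, so SETOLE has to flip to the greater-or-equal case rather than the strictly-greater one.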


