Re: [PATCH] PR target/52137 - bdver2 scheduler needs to be added to bdver1 insn reservations

2012-02-21 Thread Quentin Neill
On Tue, Feb 21, 2012 at 2:46 AM, Jakub Jelinek  wrote:
> On Tue, Feb 21, 2012 at 08:49:13AM +0100, Uros Bizjak wrote:
>> > 2012-02-20  Quentin Neill  
>> >
>> >     PR target/52137
>> >     * gcc/config/i386/bdver1.md (bdver1_call, bdver1_push,
>
> Please leave out gcc/ prefix from the ChangeLog entry though.
>
>        Jakub

Hi,

Committed with update to ChangeLog entry.  Thanks for your
consideration for this patch.
-- 
Quentin


Re: [PATCH] PR target/52137 - bdver2 scheduler needs to be added to bdver1 insn reservations

2012-02-20 Thread Quentin Neill
On Mon, Feb 20, 2012 at 1:51 PM, Richard Guenther
 wrote:
> On Mon, Feb 20, 2012 at 6:38 PM, Quentin Neill
>  wrote:
>> Hi,
>>
>> The patch (http://gcc.gnu.org/bugzilla/attachment.cgi?id=26585)
>> attached to the bug
>> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52137) recovers
>> performance regressions for AMD's bdver2 processor.
>>
>> It passes bootstrap and make check on x86_64.
>>
>> The patch is simple and seems to have little risk, therefore AMD would
>> like to see if this patch can be included in 4.7.0.
>>
>> Okay to commit to trunk?
>
> Please follow development process and post patches together with a description
> and changelog entry to gcc-patches.  Attaching patches to bugs does't do it.
>
> Richard.


Of course, sorry about that.  ChangeLog below and patch attached.
-- 
Quentin


2012-02-20  Quentin Neill  

PR target/52137
* gcc/config/i386/bdver1.md (bdver1_call, bdver1_push,
bdver1_pop, bdver1_leave, bdver1_lea, bdver1_imul_DI, bdver1_imul,
bdver1_imul_mem_DI, bdver1_imul_mem, bdver1_idiv, bdver1_idiv_mem,
bdver1_str, bdver1_idirect, bdver1_ivector, bdver1_idirect_loadmov,
bdver1_idirect_load, bdver1_ivector_load, bdver1_idirect_movstore,
bdver1_idirect_both, bdver1_ivector_both, bdver1_idirect_store,
bdver1_ivector_store, bdver1_fldxf, bdver1_fld, bdver1_fstxf,
bdver1_fst, bdver1_fist, bdver1_fmov_bdver1, bdver1_fadd_load,
bdver1_fadd, bdver1_fmul_load, bdver1_fmul, bdver1_fsgn,
bdver1_fdiv_load, bdver1_fdiv, bdver1_fpspc_load, bdver1_fpspc,
bdver1_fcmov_load, bdver1_fcmov, bdver1_fcomi_load,
bdver1_fcomi, bdver1_fcom_load, bdver1_fcom,
bdver1_fxch, bdver1_ssevector_avx128_unaligned_load,
bdver1_ssevector_avx256_unaligned_load,
bdver1_ssevector_sse128_unaligned_load,
bdver1_ssevector_avx128_load, bdver1_ssevector_avx256_load,
bdver1_ssevector_sse128_load, bdver1_ssescalar_movq_load,
bdver1_ssescalar_vmovss_load, bdver1_ssescalar_sse128_load,
bdver1_mmxsse_load, bdver1_sse_store_avx256, bdver1_sse_store,
bdver1_mmxsse_store_short, bdver1_ssevector_avx256,
bdver1_movss_movsd, bdver1_mmxssemov, bdver1_sselog_load_256,
bdver1_sselog_256, bdver1_sselog_load, bdver1_sselog,
bdver1_ssecmp_load, bdver1_ssecmp, bdver1_ssecomi_load,
bdver1_ssecomi, bdver1_vcvtX2Y_avx256_load, bdver1_vcvtX2Y_avx256,
bdver1_ssecvt_cvtss2sd_load, bdver1_ssecvt_cvtss2sd,
bdver1_sseicvt_cvtsi2sd_load, bdver1_sseicvt_cvtsi2sd,
bdver1_ssecvt_cvtpd2ps_load, bdver1_ssecvt_cvtpd2ps,
bdver1_ssecvt_cvtdq2ps_load, bdver1_ssecvt_cvtdq2ps,
bdver1_ssecvt_cvtdq2pd_load, bdver1_ssecvt_cvtdq2pd,
bdver1_ssecvt_cvtps2pd_load, bdver1_ssecvt_cvtps2pd,
bdver1_ssecvt_cvtsX2si_load, bdver1_ssecvt_cvtsX2si,
bdver1_ssecvt_cvtpd2pi_load, bdver1_ssecvt_cvtpd2pi,
bdver1_ssecvt_cvtpd2dq_load, bdver1_ssecvt_cvtpd2dq,
bdver1_ssecvt_cvtps2pi_load, bdver1_ssecvt_cvtps2pi,
bdver1_ssemuladd_load_256, bdver1_ssemuladd_256,
bdver1_ssemuladd_load, bdver1_ssemuladd, bdver1_sseimul_load,
bdver1_sseimul, bdver1_sseiadd_load, bdver1_sseiadd,
bdver1_ssediv_double_load_256, bdver1_ssediv_double_256,
bdver1_ssediv_single_load_256, bdver1_ssediv_single_256,
bdver1_ssediv_double_load, bdver1_ssediv_double,
bdver1_ssediv_single_load, bdver1_ssediv_single, bdver1_sseins):
Add "bdver2" attribute.
diff --git a/gcc/config/i386/bdver1.md b/gcc/config/i386/bdver1.md
index 3cde476..10f95ff 100644
--- a/gcc/config/i386/bdver1.md
+++ b/gcc/config/i386/bdver1.md
@@ -123,50 +123,50 @@
 
 ;; Jump instructions are executed in the branch unit completely transparent to 
us.
 (define_insn_reservation "bdver1_call" 0
-(and (eq_attr "cpu" "bdver1")
+(and (eq_attr "cpu" "bdver1,bdver2")
  (eq_attr "type" "call,callv"))
 "bdver1-double,bdver1-agu,bdver1-ieu")
 ;; PUSH mem is double path.
 (define_insn_reservation "bdver1_push" 1
-(and (eq_attr "cpu" "bdver1")
+(and (eq_attr "cpu" "bdver1,bdver2")
  (eq_attr "type" "push"))
 "bdver1-direct,bdver1-agu,bdver1-store")
 ;; POP r16/mem are double path.
 (define_insn_reservation "bdver1_pop" 1
-(and (eq_attr "cpu" "bdver1")
+(and (eq_attr "cpu" "bdver1,bdver2")
  (eq_attr "type" "pop"))
 "bdver1-direct,(bdver1-ieu+bdver1-load)")
 ;; LEAVE no latency info so far, assume same with amdfam10.
 (define_insn_reservation

[PATCH] PR target/52137 - bdver2 scheduler needs to be added to bdver1 insn reservations

2012-02-20 Thread Quentin Neill
Hi,

The patch (http://gcc.gnu.org/bugzilla/attachment.cgi?id=26585)
attached to the bug
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52137) recovers
performance regressions for AMD's bdver2 processor.

It passes bootstrap and make check on x86_64.

The patch is simple and seems to have little risk, therefore AMD would
like to see if this patch can be included in 4.7.0.

Okay to commit to trunk?
-- 
Quentin


Re: PATCH: Add capability to contrib/compare_tests to handle directories

2012-02-16 Thread Quentin Neill
On Thu, Feb 16, 2012 at 8:18 AM, Quentin Neill
 wrote:
> On Tue, Feb 14, 2012 at 9:01 PM, Mike Stump  wrote:
>> On Feb 14, 2012, at 8:39 AM, Quentin Neill wrote:
>>> Thanks for the fix.  This seemed familiar, and upon review it looks
>>> like I never committed this fix:
>>> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01194.html
>>
>> Ah, ok, let's go with your version, it is much better.  Thanks.
>
> Committed.
> --
> Quentin

While working on this, I realized errors detected in directories with
more than one sum file don't show the sum file with the problem
(instead they show the sum1/sum2 files).  Patch below iterates over
sum files instead of concatenating.  Okay to commit?

(output before the patch)

$ compare_tests -strict /d/qneill/tst.w[io].zeroshift.*
# Comparing directories
## Dir1=/d/qneill/tst.wi.zeroshift.192188f71c30: 2 sum files
## Dir2=/d/qneill/tst.wo.zeroshift.bdcd46972d7d: 2 sum files

# Comparing 2 common sum files
## /bin/sh /d/paqa/gcc/compare_tests -strict /tmp/gxx-sum1.4434
/tmp/gxx-sum2.4434
Old tests that passed, that have disappeared: (Eeek!)

gcc.target/x86_64/abi/callabi/vaarg-5b.c (test for excess errors)
Strict test fails


(output after the patch)
$ ./compare_tests -strict /d/qneill/tst.w[io].zeroshift.*
# Comparing directories
## Dir1=/d/qneill/tst.wi.zeroshift.192188f71c30: 2 sum files
## Dir2=/d/qneill/tst.wo.zeroshift.bdcd46972d7d: 2 sum files

# Comparing 2 common sum files
## /bin/sh ./compare_tests -strict
/d/qneill/tst.wi.zeroshift.192188f71c30/bld/gcc/testsuite/gcc/gcc.sum
/d/qneill/tst.wo.zeroshift.bdcd46972d7d/bld/gcc/testsuite/gcc/gcc.sum
Old tests that passed, that have disappeared: (Eeek!)

gcc.target/x86_64/abi/callabi/vaarg-5b.c (test for excess errors)
Strict test fails

## Differences found: bld/gcc/testsuite/gcc/gcc.sum
## /bin/sh ./compare_tests -strict
/d/qneill/tst.wi.zeroshift.192188f71c30/bld/gcc/testsuite/gcc/g++.sum
/d/qneill/tst.wo.zeroshift.bdcd46972d7d/bld/gcc/testsuite/gcc/g++.sum
# 1 differences in 2 common sum files found



diff --git a/contrib/compare_tests b/contrib/compare_tests
index 2fc6e05..611faab 100755
--- a/contrib/compare_tests
+++ b/contrib/compare_tests
@@ -43,9 +43,7 @@ lst2=/tmp/$tool-lst2.$$
 lst3=/tmp/$tool-lst3.$$
 lst4=/tmp/$tool-lst4.$$
 lst5=/tmp/$tool-lst5.$$
-sum1=/tmp/$tool-sum1.$$
-sum2=/tmp/$tool-sum2.$$
-tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5 $sum1 $sum2"
+tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5"

 [ "$1" = "-strict" ] && strict=$1 && shift
 [ "$1" = "-?" ] && usage
@@ -86,15 +84,16 @@ if [ -d "$1" -a -d "$2" ] ; then
fi
cmnsums=`cat $lst5 | wc -l`
echo "# Comparing $cmnsums common sum files"
-   ( for fname in `cat $lst5`; do cat $1/$fname; done ) >$sum1
-   ( for fname in `cat $lst5`; do cat $2/$fname; done ) >$sum2
-   echo "## ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2"
-   ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2
-   ret=$?
-   if [ $ret -ne 0 ]; then
-   exit_status=`expr $exit_status + 1`
-   echo "## Differences found: $fname"
-   fi
+   for fname in `cat $lst5`
+   do
+   echo "## ${CONFIG_SHELL-/bin/sh} $0 $strict $1/$fname $2/$fname"
+   ${CONFIG_SHELL-/bin/sh} $0 $strict $1/$fname $2/$fname
+   ret=$?
+   if [ $ret -ne 0 ]; then
+   exit_status=`expr $exit_status + 1`
+   echo "## Differences found: $fname"
+   fi
+   done
if [ $exit_status -ne 0 ]; then
echo "# $exit_status differences in $cmnsums common
sum files found"
else




-- 
Quentin


Re: PATCH: Add capability to contrib/compare_tests to handle directories

2012-02-14 Thread Quentin Neill
On Tue, Feb 14, 2012 at 10:39 AM, Quentin Neill
 wrote:
> On Sat, Feb 11, 2012 at 8:13 AM, Mike Stump  wrote:
>> On Nov 4, 2011, at 8:23 PM, Quentin Neill wrote:
>>> My scenario about "ANY test results changed" is what I added with -strict.
>>> This patch concatenates the common .sum files before comparing.
>>
>> So, how exactly does this work for you:
>>
>> +       ( for fname in `cat $lst5`; do cat $1/$fname; done ) >$sum1
>> +       ( for fname in `cat $lst5`; do cat $2/$fname; done ) >$sum2
>> +       echo "## ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2"
>> +       ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2
>>
>> sum1 and sum2 appear to be variables that aren't set.
>
> Hi Mike,
>
> Thanks for the fix.  This seemed familiar, and upon review it looks
> like I never committed this fix:
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01194.html
>
>
> Do you prefer this patch with my original intent (declaring sum1/sum2
> with other tmps and removing the trap on line 52):

Horrible wording for the first patch description.  How about:

  "Do you prefer this patch with my original intent (declaring sum1/sum2
   with other tmps, and removing those files in the trap on line 52):"

-- 
Quentin


Re: PATCH: Add capability to contrib/compare_tests to handle directories

2012-02-14 Thread Quentin Neill
On Sat, Feb 11, 2012 at 8:13 AM, Mike Stump  wrote:
> On Nov 4, 2011, at 8:23 PM, Quentin Neill wrote:
>> My scenario about "ANY test results changed" is what I added with -strict.
>> This patch concatenates the common .sum files before comparing.
>
> So, how exactly does this work for you:
>
> +       ( for fname in `cat $lst5`; do cat $1/$fname; done ) >$sum1
> +       ( for fname in `cat $lst5`; do cat $2/$fname; done ) >$sum2
> +       echo "## ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2"
> +       ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2
>
> sum1 and sum2 appear to be variables that aren't set.

Hi Mike,

Thanks for the fix.  This seemed familiar, and upon review it looks
like I never committed this fix:
http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01194.html


Do you prefer this patch with my original intent (declaring sum1/sum2
with other tmps and removing the trap on line 52):

--- a/contrib/compare_tests
+++ b/contrib/compare_tests
@@ -43,7 +43,9 @@ lst2=/tmp/$tool-lst2.$$
 lst3=/tmp/$tool-lst3.$$
 lst4=/tmp/$tool-lst4.$$
 lst5=/tmp/$tool-lst5.$$
-tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5"
+sum1=/tmp/$tool-sum1.$$
+sum2=/tmp/$tool-sum2.$$
+tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5 $sum1 $sum2"


 [ "$1" = "-strict" ] && strict=$1 && shift
 [ "$1" = "-?" ] && usage
@@ -60,8 +62,8 @@ if [ -d "$1" -a -d "$2" ] ; then
echo "## Dir2=$2: `cat $lst2 | wc -l` sum files"
echo
# remove leading directory components to compare
-   sed -e "s|^$1/||" $lst1 | sort >$lst3
-   sed -e "s|^$2/||" $lst2 | sort >$lst4
+   sed -e "s|^$1[/]*||" $lst1 | sort >$lst3
+   sed -e "s|^$2[/]*||" $lst2 | sort >$lst4
comm -23 $lst3 $lst4 >$lst5
if [ -s $lst5 ] ; then
echo "# Extra sum files in Dir1=$1"
@@ -83,14 +85,11 @@ if [ -d "$1" -a -d "$2" ] ; then
exit $exit_status
fi
cmnsums=`cat $lst5 | wc -l`
-   sum1="/tmp/$tool-sum-1"
-   sum2="/tmp/$tool-sum-2"
echo "# Comparing $cmnsums common sum files"
( for fname in `cat $lst5`; do cat $1/$fname; done ) >$sum1
( for fname in `cat $lst5`; do cat $2/$fname; done ) >$sum2
echo "## ${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2"
${CONFIG_SHELL-/bin/sh} $0 $strict $sum1 $sum2
-   rm -f $sum1 $sum2
ret=$?
if [ $ret -ne 0 ]; then
exit_status=`expr $exit_status + 1`



Or would you prefer this minimal fix (to remove trailing directory slashes)?

@@ -60,8 +60,8 @@ if [ -d "$1" -a -d "$2" ] ; then
echo "## Dir2=$2: `cat $lst2 | wc -l` sum files"
echo
# remove leading directory components to compare
-   sed -e "s|^$1/||" $lst1 | sort >$lst3
-   sed -e "s|^$2/||" $lst2 | sort >$lst4
+   sed -e "s|^$1[/]*||" $lst1 | sort >$lst3
+   sed -e "s|^$2[/]*||" $lst2 | sort >$lst4
comm -23 $lst3 $lst4 >$lst5
if [ -s $lst5 ] ; then
echo "# Extra sum files in Dir1=$1"


And if so, okay to commit?
-- 
Quentin


Re: [PATCH] enable fma4 for bdver2

2012-02-06 Thread Quentin Neill
On Mon, Feb 6, 2012 at 9:32 AM, Quentin Neill
 wrote:
> On Tue, Nov 8, 2011 at 3:53 PM, Richard Henderson  wrote:
>>
>> On 11/07/2011 07:28 PM, Quentin Neill wrote:
>> > +       Add FMA4 to bdver2.
>> > +       * config/i386/i386.c (ix86_option_override_internal): Add FMA4
>> > to bdver2.
>>
>> Ok.
>>
>>
>> r~
>
> This patch was okay'd in stage 3 but was never committed.
>
> Okay to commit to trunk today?

Actually, it will need to wait for 4.8, nevermind.
-- 
Quentin


Re: [PATCH] enable fma4 for bdver2

2012-02-06 Thread Quentin Neill
On Tue, Nov 8, 2011 at 3:53 PM, Richard Henderson  wrote:
>
> On 11/07/2011 07:28 PM, Quentin Neill wrote:
> > +       Add FMA4 to bdver2.
> > +       * config/i386/i386.c (ix86_option_override_internal): Add FMA4
> > to bdver2.
>
> Ok.
>
>
> r~

This patch was okay'd in stage 3 but was never committed.

Okay to commit to trunk today?

[reposting to gcc-patches due to invalid mime-type, sorry for the duplicate]
-- 
Quentin


Re: adjust installation docs to discourage installing GMP, MPFR and MPC separately

2012-01-27 Thread Quentin Neill
On Fri, Jan 20, 2012 at 6:10 PM, Jonathan Wakely  wrote:
> On 20 January 2012 23:08, Quentin Neill  wrote:
>>
>> My 2c - I heartily recommend this patch.
>
> Thanks. I'm a bit surprised noone else has commented - I hoped this
> would be a no-brainer, or at least get some constructive feedback for
> further improvement.

Another +1 for this patch.  Who can approve?

Seems like this would be okay for stage4 too.

>> May I suggest updating
>> /cvs/gcc/wwwdocs/htdocs/install/prerequisites.html
> That's exactly the page I'm trying to change, it's generated from the
> texi file my patch changes.

Didn't know that, makes sense.

>> And perhaps add a page to the twiki
>> http://gcc.gnu.org/wiki/BuildingGCC to catch google searches for
>> "building gcc"?
>
> I recently added http://gcc.gnu.org/wiki/InstallingGCC
> I chose "installing" not "building" because I believe that of the
> people who fail to build GCC and look for support, most of them
> probably should have installed a pre-built package supplied by their
> OS or another third-party packager. Maybe it could be changed to
> attract more search hits.

Yes I saw your InstallingGCC and I like it.  I was thinking a
BuildingGCC would catch searches and could direct to InstallingGCC.
Does MoinMoin have a way to attach keyword data to a page?  Consider
where InstallingGCC appears in these different results from google:

? https://www.google.com/search?q=installing+gcc <-surprised me
? http://www.google.com/search?q=building+gcc
? http://www.google.com/search?q=building+gcc+wiki
#2 http://www.google.com/search?q=installing+gcc+wiki
#4 http://www.google.com/search?q=building+installing+gcc

Another suggestion would be to link from one or both of these:
http://gcc.gnu.org/install/index.html
http://gcc.gnu.org/install/build.html
-- 
Quentin


Re: adjust installation docs to discourage installing GMP, MPFR and MPC separately

2012-01-20 Thread Quentin Neill
On Thu, Jan 5, 2012 at 6:17 PM, Jonathan Wakely  wrote:
> Ping.
>
> I think this is a useful improvement to the docs and could prevent the
> most commonly-encountered bootstrap failure for inexpert users
> building GCC.
>
> OK for trunk?
>
>
> On 30 December 2011 13:29, Jonathan Wakely wrote:
> On 22 December 2011 00:23, Jonathan Wakely wrote:
>> The most frequently asked question on gcc-help, and a frequently
>> reported "bug" in bugzilla, is
>> http://gcc.gnu.org/wiki/FAQ#configure_suffix
>>
>> It is almost always caused by installing libgmp.so etc. in a
>> non-standard location and not using ldconfig, DT_RUNPATH,
>> $LD_LIBRARY_PATH or some other method to tell the dynamic linker how
>> to find them.  The current installation docs mention --with-gmp right
>> away, which probably gives the impression that it's the recommended
>> method, whereas it actually causes problems for the majority of users
>> who don't understand dynamic linker search paths.  I think rewording
>> the installation docs to suggest building the support libs as part of
>> GCC first, and only using --with-gmp second, might steer people in the
>> right direction.
>>
>> Separately, I also plan to write an "Installing GCC" page on the GCC
>> wiki which (among other things) discourages installing the support
>> libraries by hand, recommending installing prebuilt packages in
>> standard system dirs instead, or if that's not an option then using
>> contrib/download_prerequisites to get the sources and build them
>> in-tree.
>>
>>
>>        * doc/install.text (Prerequisites): Suggest building GMP, MPFR and
>>        MPC as part of GCC before describing configuring with --with-gmp etc.
>>
>> Tested by running "make doc" and viewing the resulting .info pages, OK
>> for trunk?
>
> Here's an updated patch which also adjusts the --with-gmp docs in the
> Configuration chapter to mention the option of putting the support lib
> sources under the GCC sources
>
>       * doc/install.texi (Prerequisites): Suggest building GMP, MPFR and
>       MPC as part of GCC before describing configuring with --with-gmp etc.
>       (Installing GCC: Configuration): State that --with-gmp etc. aren't
>       needed if sources are present.
>
> OK for trunk?

My 2c - I heartily recommend this patch.

May I suggest updating
/cvs/gcc/wwwdocs/htdocs/install/prerequisites.html (or wherever it
lives) with the same wording, since it is a frequent landing page for
people hunting for GCC building/installing information?

And perhaps add a page to the twiki
http://gcc.gnu.org/wiki/BuildingGCC to catch google searches for
"building gcc"?

-- 
Quentin


Re: [PR48743] cpuid family6 fix for Athlon: okay for trunk?

2011-12-19 Thread Quentin Neill
On Mon, Dec 12, 2011 at 3:00 PM, Uros Bizjak  wrote:
> Hello!
>
>> I'm testing the patch in
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48743#c4 against trunk
>
>
> --cut here--
> --- gcc-4.7-20110521/gcc/config/i386/driver-i386.c.~1~  2011-01-06
> 23:59:46.0 +0100
> +++ gcc-4.7-20110521/gcc/config/i386/driver-i386.c.~1~  2011-05-22
> 16:05:41.0 +0200
> @@ -506,7 +506,7 @@ const char *host_detect_local_cpu (int a
>        processor = PROCESSOR_AMDFAM10;
>       else if (has_sse2 || has_longmode)
>        processor = PROCESSOR_K8;
> -      else if (has_3dnowp)
> +      else if (has_3dnowp && family == 6)
>        processor = PROCESSOR_ATHLON;
>       else if (has_mmx)
>        processor = PROCESSOR_K6;
> --cut here--
>
> The patch looks correct to me.
>
>> Is this considered okay for stage3?
>
> Yes, since it fixes a bug.
>
>> If so, okay to commit after bootstrap testing on x86_64?
>
> OK, with a correct ChangeLog.
>
> Uros.

Fixed in r182489, updated http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48743#c7

I got the correct gcc/ChangeLog ... but looks like I committed with
two date/author lines in the svn log.  G.

I guess if someone has svnadmin permissions on gcc.gnu.org it could be fixed.
-- 
Quentin


Re: [PR48743] cpuid family6 fix for Athlon: okay for trunk?

2011-12-12 Thread Quentin Neill
On Mon, Dec 12, 2011 at 3:00 PM, Uros Bizjak  wrote:
> Hello!
>
>> I'm testing the patch in
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48743#c4 against trunk
>
>
> --cut here--
> --- gcc-4.7-20110521/gcc/config/i386/driver-i386.c.~1~  2011-01-06
> 23:59:46.0 +0100
> +++ gcc-4.7-20110521/gcc/config/i386/driver-i386.c.~1~  2011-05-22
> 16:05:41.0 +0200
> @@ -506,7 +506,7 @@ const char *host_detect_local_cpu (int a
>        processor = PROCESSOR_AMDFAM10;
>       else if (has_sse2 || has_longmode)
>        processor = PROCESSOR_K8;
> -      else if (has_3dnowp)
> +      else if (has_3dnowp && family == 6)
>        processor = PROCESSOR_ATHLON;
>       else if (has_mmx)
>        processor = PROCESSOR_K6;
> --cut here--
>
> The patch looks correct to me.
>
>> Is this considered okay for stage3?
>
> Yes, since it fixes a bug.
>
>> If so, okay to commit after bootstrap testing on x86_64?
>
> OK, with a correct ChangeLog.
>
> Uros.

Thanks.

Okay for backport to 4.6 and 4.5 too, after bootstrap tests?
-- 
Quentin


[PR48743] cpuid family6 fix for Athlon: okay for trunk?

2011-12-12 Thread Quentin Neill
Hi,

I'm testing the patch in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48743#c4 against trunk

Is this considered okay for stage3?

If so, okay to commit after bootstrap testing on x86_64?
-- 
Quentin


Re: PATCH: Add capability to contrib/compare_tests to handle directories

2011-11-08 Thread Quentin Neill
On Sat, Nov 5, 2011 at 4:26 PM, Mike Stump  wrote:
> On Nov 4, 2011, at 8:23 PM, Quentin Neill wrote:
>> This patch concatenates the common .sum files before comparing.
>>
>> Okay to commit?
>
> Ok, thanks for the contribution.
>

FYI I see my patch was missing these two fixes:
1. fix missing sum1 and sum2 temp variables
2. handle trailing slashes in dir args

Okay to commit?  Should I commit such a patch as trivial?
-- 
Quentin


Index: compare_tests
===
--- compare_tests   (revision 181166)
+++ compare_tests   (working copy)
@@ -43,7 +43,9 @@ lst2=/tmp/$tool-lst2.$$
 lst3=/tmp/$tool-lst3.$$
 lst4=/tmp/$tool-lst4.$$
 lst5=/tmp/$tool-lst5.$$
-tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5"
+sum1=/tmp/$tool-sum1.$$
+sum2=/tmp/$tool-sum2.$$
+tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5 $sum1 $sum2"

 [ "$1" = "-strict" ] && strict=$1 && shift
 [ "$1" = "-?" ] && usage
@@ -60,8 +62,8 @@ if [ -d "$1" -a -d "$2" ] ; then
echo "## Dir2=$2: `cat $lst2 | wc -l` sum files"
echo
# remove leading directory components to compare
-   sed -e "s|^$1/||" $lst1 | sort >$lst3
-   sed -e "s|^$2/||" $lst2 | sort >$lst4
+   sed -e "s|^$1[/]*||" $lst1 | sort >$lst3
+   sed -e "s|^$2[/]*||" $lst2 | sort >$lst4
comm -23 $lst3 $lst4 >$lst5
if [ -s $lst5 ] ; then
echo "# Extra sum files in Dir1=$1"


[PATCH] enable fma4 for bdver2

2011-11-07 Thread Quentin Neill
Hi,

This patch turns on FMA4 for the AMD BDVER2 processor.

Okay for trunk if no regressions?
-- 
Quentin


Index: ChangeLog
===
--- ChangeLog   (revision 181147)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2011-11-07  Quentin Neill  
+
+   Add FMA4 to bdver2.
+   * config/i386/i386.c (ix86_option_override_internal): Add FMA4
to bdver2.
+
 2011-11-07  Richard Henderson  

* optabs.h (OTI_sync_compare_and_swap, OTI_sync_lock_test_and_set,
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 181147)
+++ config/i386/i386.c  (working copy)
@@ -3044,7 +3044,7 @@ ix86_option_override_internal (bool main
   {"bdver2", PROCESSOR_BDVER2, CPU_BDVER2,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
| PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-   | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
+   | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
| PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
| PTA_FMA},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC64,


Re: Many testsuite failures on x86_64 due recent "fix" about f16cintrin.h header

2011-11-07 Thread Quentin Neill
On Mon, Nov 7, 2011 at 4:25 AM, Uros Bizjak  wrote:
> Hello!
>
> Attached patch fixes all remaining i386 testsuite failures. The header
> did say that we can include it also through immintrin.h ...
>
> 2011-11-07  Uros Bizjak  
>
>        * config/i386/f16cintrin: Remove extra _X86INTRIN_H_INCLUDED check.
>
> Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.
>
> Uros.

Thank you Dominique, Uros, Kai, and Eric.
-- 
Quentin


Re: Many testsuite failures on x86_64 due recent "fix" about f16cintrin.h header

2011-11-06 Thread Quentin Neill
On Sun, Nov 6, 2011 at 2:13 PM, Dominique Dhumieres  wrote:
> Following http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02901.html, I have 
> applied
> the following patch on x86_64-apple-darwin10
>
> --- ../_clean/gcc/config.gcc    2011-11-05 22:25:37.0 +0100
> +++ gcc/config.gcc      2011-11-06 12:35:57.0 +0100
> @@ -350,7 +350,7 @@ i[34567]86-*-*)
>                       immintrin.h x86intrin.h avxintrin.h xopintrin.h
>                       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
>                       lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
> -                      avx2intrin.h fmaintrin.h"
> +                      avx2intrin.h fmaintrin.h f16cintrin.h"
>        ;;
>  x86_64-*-*)
>        cpu_type=i386
> @@ -363,7 +363,7 @@ x86_64-*-*)
>                       immintrin.h x86intrin.h avxintrin.h xopintrin.h
>                       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
>                       lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
> -                      avx2intrin.h fmaintrin.h"
> +                      avx2intrin.h fmaintrin.h f16cintrin.h"
>        need_64bit_hwint=yes
>        ;;
>  ia64-*-*)
> --- ../_clean/gcc/config/i386/f16cintrin.h      2011-11-05 10:03:10.0 
> +0100
> +++ gcc/config/i386/f16cintrin.h        2011-11-06 16:55:05.0 +0100
> @@ -88,7 +88,8 @@ _mm256_cvtps_ph (__m256 __A, const int _
>
>  #define _mm256_cvtps_ph(A, I) \
>   ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I)))
> -#endif
> +#endif /* __OPTIMIZE__ */
> +#endif /* _F16CINTRIN_H_INCLUDED */
>
>  #endif /* __F16C__ */
>  #endif
>
> (the second part fixes a missing endif). However I still have most of the 
> failures:
>
> FAIL: gcc.target/i386/sse-14.c
> Excess errors:
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:95:1: error: implicit 
> declaration of function '_cvtss_sh' [-Werror=implicit-function-declaration]
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:96:1: error: implicit 
> declaration of function '_mm_cvtps_ph' [-Werror=implicit-function-declaration]
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:96:1: error: 
> incompatible types when returning type 'int' but '__m128i' was expected
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:97:1: error: implicit 
> declaration of function '_mm256_cvtps_ph' 
> [-Werror=implicit-function-declaration]
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/sse-14.c:97:1: error: 
> incompatible types when returning type 'int' but '__m128i' was expected
> ...
> FAIL: gcc.target/i386/testimm-1.c (test for excess errors)
> Excess errors:
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/testimm-1.c:36:6: error: 
> incompatible types when assigning to type '__m128i' from type 'int'
> /opt/gcc/work/gcc/testsuite/gcc.target/i386/testimm-1.c:45:6: error: 
> incompatible types when assigning to type '__m128i' from type 'int'
>
> At this point I have no idea about how to fix those.
>
> Cheers,
>
> Dominique

Thanks for the testing.
I committed the changes to add f16cintrin.h to config.gcc and added
the missing endif.
I'm not sure why my testing did not see the missing endif or the failures above.
My first guess would be that the missing endif was causing ifndef
guards to not work across other intrinsics headers, causing some
builtins not to be defined.
My test machine is down right now, but I will post a patch if I can
figure things out tonight.
-- 
Quentin


Re: PATCH: Add capability to contrib/compare_tests to handle directories

2011-11-04 Thread Quentin Neill
On Wed, Nov 2, 2011 at 12:32 PM, Quentin Neill
 wrote:
> On Tue, Oct 4, 2011 at 4:57 PM, Mike Stump  wrote:
>> On Oct 4, 2011, at 2:37 PM, Quentin Neill wrote:
>>> Ping?
>>
>> The problem with this patch is it reorders the listing so that lower 
>> priority things are after higher priority things.  The entire point of the 
>> routine is to list the high priority things first, so that a casual user can 
>> read the first line or two and not immediately if there is something they 
>> care about.
>>
>> Now, this property can be preserved simply by concatenating all the .sum 
>> files found in a directory into an all.sum file, and then running the script 
>> on those two files.  If one does that, then the priority order is preserved.
>
> Hi Mike,
> Thanks for the feedback.  From the comments, I assumed the usage was
> to compare two .log files not .sum files.
> Maybe it is a new scenario I'm imagining - comparing two builds to see
> if ANY test results changed, regardless of order.
> In any case, I will work on your changes when I get time (after stage1 
> probably)
> --
> Quentin

I got to this anyway.
My scenario about "ANY test results changed" is what I added with -strict.
This patch concatenates the common .sum files before comparing.

Okay to commit?
-- 
Quentin
From aaac4ffeedf4b3c9641d593de6526b4e872760d5 Mon Sep 17 00:00:00 2001
From: Quentin Neill 
Date: Fri, 4 Nov 2011 22:13:07 -0500
Subject: [PATCH] 2011-11-04   Quentin Neill  

* compare_tests: Add ability to compare all .sum
files from two build directories.
---
 contrib/ChangeLog |5 ++
 contrib/compare_tests |  103 -
 2 files changed, 98 insertions(+), 10 deletions(-)

diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index eddf6ec..b0afb27 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,8 @@
+2011-11-04   Quentin Neill  
+
+   * compare_tests: Add ability to compare all .sum
+   files from two build directories.
+
 2011-09-13   Diego Novillo  
 
* testsuite-management: New.
diff --git a/contrib/compare_tests b/contrib/compare_tests
index bed9742..a23fcf6 100755
--- a/contrib/compare_tests
+++ b/contrib/compare_tests
@@ -2,13 +2,35 @@
 # This script automatically test the given tool with the tool's test cases,
 # reporting anything of interest.
 
-# exits with 0 if there is nothing of interest
-# exits with 1 if there is something interesting
-# exits with 2 if an error occurred
-
-# Give two .sum files to compare them
+usage()
+{
+   if [ -n "$1" ] ; then
+   echo "$0: Error: $1" >&2
+   echo >&2
+   fi
+   cat >&2 <
+# Subdir comparison added by Quentin Neill 
 
 tool=gxx
 
@@ -16,10 +38,69 @@ tmp1=/tmp/$tool-testing.$$a
 tmp2=/tmp/$tool-testing.$$b
 now_s=/tmp/$tool-testing.$$d
 before_s=/tmp/$tool-testing.$$e
+lst1=/tmp/$tool-lst1.$$
+lst2=/tmp/$tool-lst2.$$
+lst3=/tmp/$tool-lst3.$$
+lst4=/tmp/$tool-lst4.$$
+lst5=/tmp/$tool-lst5.$$
+tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5"
+
+[ "$1" = "-strict" ] && strict=$1 && shift
+[ "$1" = "-?" ] && usage
+[ "$2" = "" ] && usage "Must specify both PREVIOUS and CURRENT"
+
+trap "rm -f $tmps" 0 1 2 3 5 9 13 15
+exit_status=0
 
-if [ "$2" = "" ]; then
-   echo "Usage: $0 previous current" >&2
-   exit 2
+if [ -d "$1" -a -d "$2" ] ; then
+   find "$1" -name '*.sum' >$lst1
+   find "$2" -name '*.sum' >$lst2
+   echo "# Comparing directories"
+   echo "## Dir1=$1: `cat $lst1 | wc -l` sum files"
+   echo "## Dir2=$2: `cat $lst2 | wc -l` sum files"
+   echo
+   # remove leading directory components to compare
+   sed -e "s|^$1/||" $lst1 | sort >$lst3
+   sed -e "s|^$2/||" $lst2 | sort >$lst4
+   comm -23 $lst3 $lst4 >$lst5
+   if [ -s $lst5 ] ; then
+   echo "# Extra sum files in Dir1=$1"
+   sed -e "s|^|< $1/|" $lst5
+   echo
+   [ -n "$strict" ] && exit_status=`expr $exit_status + 1`
+   fi
+   comm -13 $lst3 $lst4 >$lst5
+   if [ -s $lst5 ] ; then
+   echo "# Extra sum files in Dir2=$2"
+   sed -e "s|^|> $2/|" $lst5
+   echo
+   [ -n "$strict" ] && exit_status=`expr $exit_status + 1`
+   fi
+   comm -12 $lst3 $lst4 | sort -u >$lst5
+   if [ ! -s $lst5 ] ; then
+   echo "# No common sum files"
+   exit_st

Re: PATCH: Add capability to contrib/compare_tests to handle directories

2011-11-02 Thread Quentin Neill
On Tue, Oct 4, 2011 at 4:57 PM, Mike Stump  wrote:
> On Oct 4, 2011, at 2:37 PM, Quentin Neill wrote:
>> Ping?
>
> The problem with this patch is it reorders the listing so that lower priority 
> things are after higher priority things.  The entire point of the routine is 
> to list the high priority things first, so that a casual user can read the 
> first line or two and not immediately if there is something they care about.
>
> Now, this property can be preserved simply by concatenating all the .sum 
> files found in a directory into an all.sum file, and then running the script 
> on those two files.  If one does that, then the priority order is preserved.

Hi Mike,
Thanks for the feedback.  From the comments, I assumed the usage was
to compare two .log files not .sum files.
Maybe it is a new scenario I'm imagining - comparing two builds to see
if ANY test results changed, regardless of order.
In any case, I will work on your changes when I get time (after stage1 probably)
-- 
Quentin


Re: PATCH: Move f16c intrinsics into f16cintrin.h

2011-10-31 Thread Quentin Neill
On Mon, Oct 31, 2011 at 5:31 PM, Jakub Jelinek  wrote:
> On Mon, Oct 31, 2011 at 05:23:58PM -0500, Quentin Neill wrote:
>> Interested parties should view these threads from three years ago:
>> http://gcc.gnu.org/ml/gcc-patches/2008-11/threads.html#00145
>> http://gcc.gnu.org/ml/gcc-patches/2008-12/threads.html#00174
>>
>> Testing on x86_64, okay to commit if no regressions?
>
> You aren't installing the header, so it will cause regressions.
> config.gcc needs to be adjusted for it.

Arggh.  Thanks, my tests found that too.

Reposting, okay to commit after testing on x86_64 if no regressions?
--
Quentin Neill
From c0379bf7dacbe457813893cdaf381ae7206566c7 Mon Sep 17 00:00:00 2001
From: Quentin Neill 
Date: Mon, 31 Oct 2011 16:54:18 -0500
Subject: [PATCH] 2011-10-31  Quentin Neill  

Piledriver f16cintrin.h fix.
* config/i386/f16cintrin.h: Contents moved from immintrin.h.
* config/config.gcc: Add f16cintrin.h.
---
 gcc/ChangeLog|6 +++
 gcc/config.gcc   |4 +-
 gcc/config/i386/f16cintrin.h |   94 ++
 gcc/config/i386/immintrin.h  |   63 ++--
 4 files changed, 106 insertions(+), 61 deletions(-)
 create mode 100644 gcc/config/i386/f16cintrin.h

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index caed12e..14a4392 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2011-10-31  Quentin Neill  
+
+   Piledriver f16cintrin.h fix.
+   * config/i386/f16cintrin.h: Contents moved from immintrin.h.
+   * config/config.gcc: Add f16cintrin.h.
+
 2011-10-31  Richard Henderson  
 
* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2c18655..2b60e77 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -361,7 +361,7 @@ i[34567]86-*-*)
   immintrin.h x86intrin.h avxintrin.h xopintrin.h
   ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
   lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
-  avx2intrin.h fmaintrin.h"
+  avx2intrin.h fmaintrin.h f16cintrin.h"
;;
 x86_64-*-*)
cpu_type=i386
@@ -374,7 +374,7 @@ x86_64-*-*)
   immintrin.h x86intrin.h avxintrin.h xopintrin.h
   ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
   lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
-  avx2intrin.h fmaintrin.h"
+  avx2intrin.h fmaintrin.h f16cintrin.h"
need_64bit_hwint=yes
;;
 ia64-*-*)
diff --git a/gcc/config/i386/f16cintrin.h b/gcc/config/i386/f16cintrin.h
new file mode 100644
index 000..5ff836b
--- /dev/null
+++ b/gcc/config/i386/f16cintrin.h
@@ -0,0 +1,94 @@
+/* Copyright (C) 2011
+   Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _X86INTRIN_H_INCLUDED
+#if (!defined(_X86INTRIN_H_INCLUDED) && !defined(_IMMINTRIN_H_INCLUDED))
+# error "Never use  directly; include  or 
 instead."
+#endif
+
+#ifndef __F16C__
+# error "F16C instruction set not enabled"
+#else
+
+#ifndef _F16CINTRIN_H_INCLUDED
+#define _F16CINTRIN_H_INCLUDED
+
+extern __inline float __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_cvtsh_ss (unsigned short __S)
+{
+  __v8hi __H = __extension__ (__v8hi){ __S, 0, 0, 0, 0, 0, 0, 0 };
+  __v4sf __A = __builtin_ia32_vcvtph2ps (__H);
+  return __builtin_ia32_vec_ext_v4sf (__A, 0);
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm_cvtph_ps (__m128i __A)
+{
+  return (__m128) __builtin_ia32_vcvtph2ps ((__v8hi) __A);
+}
+
+extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm256_cvtph_ps (__m128i __A)
+{
+  return (__m256) __builtin_ia32_vcvtph2ps256 ((__v8hi) __A);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline unsigned s

PATCH: Move f16c intrinsics into f16cintrin.h

2011-10-31 Thread Quentin Neill
Hi,

This patch moves f16c intrinsics out of immintrin.h into their own
header f16cintrin.h

Interested parties should view these threads from three years ago:
http://gcc.gnu.org/ml/gcc-patches/2008-11/threads.html#00145
http://gcc.gnu.org/ml/gcc-patches/2008-12/threads.html#00174

Testing on x86_64, okay to commit if no regressions?
-- 
Quentin Neill
-- 

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index caed12e..5af1c78 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-10-31  Quentin Neill  
+
+   Piledriver f16cintrin.h fix.
+   * config/i386/f16cintrin.h: Contents moved from immintrin.h.
+
 2011-10-31  Richard Henderson  

* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.
diff --git a/gcc/config/i386/f16cintrin.h b/gcc/config/i386/f16cintrin.h
new file mode 100644
index 000..5ff836b
--- /dev/null
+++ b/gcc/config/i386/f16cintrin.h
@@ -0,0 +1,94 @@
+/* Copyright (C) 2011
+   Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _X86INTRIN_H_INCLUDED
+#if (!defined(_X86INTRIN_H_INCLUDED) && !defined(_IMMINTRIN_H_INCLUDED))
+# error "Never use  directly; include  or
 instead."
+#endif
+
+#ifndef __F16C__
+# error "F16C instruction set not enabled"
+#else
+
+#ifndef _F16CINTRIN_H_INCLUDED
+#define _F16CINTRIN_H_INCLUDED
+
+extern __inline float __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_cvtsh_ss (unsigned short __S)
+{
+  __v8hi __H = __extension__ (__v8hi){ __S, 0, 0, 0, 0, 0, 0, 0 };
+  __v4sf __A = __builtin_ia32_vcvtph2ps (__H);
+  return __builtin_ia32_vec_ext_v4sf (__A, 0);
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm_cvtph_ps (__m128i __A)
+{
+  return (__m128) __builtin_ia32_vcvtph2ps ((__v8hi) __A);
+}
+
+extern __inline __m256 __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm256_cvtph_ps (__m128i __A)
+{
+  return (__m256) __builtin_ia32_vcvtph2ps256 ((__v8hi) __A);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline unsigned short __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_cvtss_sh (float __F, const int __I)
+{
+  __v4sf __A =  __extension__ (__v4sf){ __F, 0, 0, 0 };
+  __v8hi __H = __builtin_ia32_vcvtps2ph (__A, __I);
+  return (unsigned short) __builtin_ia32_vec_ext_v8hi (__H, 0);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm_cvtps_ph (__m128 __A, const int __I)
+{
+  return (__m128i) __builtin_ia32_vcvtps2ph ((__v4sf) __A, __I);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm256_cvtps_ph (__m256 __A, const int __I)
+{
+  return (__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf) __A, __I);
+}
+#else
+#define _cvtss_sh(__F, __I)\
+  (__extension__   \
+   ({  \
+  __v4sf __A =  __extension__ (__v4sf){ __F, 0, 0, 0 };\
+  __v8hi __H = __builtin_ia32_vcvtps2ph (__A, __I);
\
+  (unsigned short) __builtin_ia32_vec_ext_v8hi (__H, 0);   \
+}))
+
+#define _mm_cvtps_ph(A, I) \
+  ((__m128i) __builtin_ia32_vcvtps2ph ((__v4sf)(__m128) A, (int) (I)))
+
+#define _mm256_cvtps_ph(A, I) \
+  ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I)))
+#endif
+
+#endif /* __F16C__ */
+#endif
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 102814e..986a573 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -76,6 +76,10 @@
 #include 
 #endif

+#ifdef __F16C__
+#include 
+#endif
+
 #ifdef __RDRND__
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -161,63 +165,4 @@ _rdrand64_step (unsigned long long *__P)
 #endif /* __RDRND__ */
 #endif /* __x86_64__  */

-#ifdef __F16C__
-extern __inline float __attribute__((__gnu_inline__,
__always_inl

Re: PATCH: Add capability to contrib/compare_tests to handle directories

2011-10-04 Thread Quentin Neill
Ping?

On Wed, Sep 7, 2011 at 12:21 PM, Quentin Neill
 wrote:
> Hi,
>
> Should not change behavior for comparing two files (expect for usage
> output), and also should be POSIX compliant.
>
> Tested on x86_64 tests logs and test directories, would be interested
> in help testing on other platforms.
>
> Ok to commit?
> --
> Quentin Neill
>
>
> From 4d4fa9d094745ace0b6e51faadb2f3ea40cb7c7f Mon Sep 17 00:00:00 2001
> From: Quentin Neill 
> Date: Wed, 7 Sep 2011 12:04:35 -0500
> Subject: [PATCH] Add capability to compare test log directories.
>
> ---
>  contrib/ChangeLog     |    4 ++
>  contrib/compare_tests |  107 
> -
>  2 files changed, 101 insertions(+), 10 deletions(-)
>
> diff --git a/contrib/ChangeLog b/contrib/ChangeLog
> index 07adb58..e2007e7 100644
> --- a/contrib/ChangeLog
> +++ b/contrib/ChangeLog
> @@ -1,3 +1,7 @@
> +2011-09-07  Quentin Neill  
> +
> +       * compare_tests: Add capability to compare test log directories.
> +
>  2011-08-25  Rainer Orth  
>
>        * gcc_update: Determine svn branch from hg convert_revision.
> diff --git a/contrib/compare_tests b/contrib/compare_tests
> index bed9742..b0e3321 100755
> --- a/contrib/compare_tests
> +++ b/contrib/compare_tests
> @@ -2,13 +2,36 @@
>  # This script automatically test the given tool with the tool's test cases,
>  # reporting anything of interest.
>
> -# exits with 0 if there is nothing of interest
> -# exits with 1 if there is something interesting
> -# exits with 2 if an error occurred
> -
> -# Give two .sum files to compare them
> +usage()
> +{
> +       if [ -n "$1" ] ; then
> +               echo "$0: Error: $1" >&2
> +               echo >&2
> +       fi
> +       cat >&2 < +Usage: $0 [-strict] PREVIOUS CURRENT
> +
> +       Compare the PREVIOUS and CURRENT test case logs, reporting anything
> of interest.
> +
> +       If PREVIOUS and CURRENT are directories, find and compare any *.log 
> files
> +       therein (except config.log files).
> +
> +       Unless -strict is given, these discrepancies are not counted as 
> errors:
> +               missing/extra log files when comparing directories
> +               tests that failed in PREVIOUS but pass in CURRENT
> +               tests that were not in PREVIOUS but appear in CURRENT
> +               tests in PREVIOUS that are missing in CURRENT
> +
> +       Exit with the following values:
> +               0 if there is nothing of interest
> +               1 if there are errors when comparing single test case files
> +               N for the number of errors found when comparing directories
> +EOUSAGE
> +       exit 2
> +}
>
>  # Written by Mike Stump 
> +# Subdir comparison added by Quentin Neill 
>
>  tool=gxx
>
> @@ -16,10 +39,72 @@ tmp1=/tmp/$tool-testing.$$a
>  tmp2=/tmp/$tool-testing.$$b
>  now_s=/tmp/$tool-testing.$$d
>  before_s=/tmp/$tool-testing.$$e
> +lst1=/tmp/$tool-lst1.$$
> +lst2=/tmp/$tool-lst2.$$
> +lst3=/tmp/$tool-lst3.$$
> +lst4=/tmp/$tool-lst4.$$
> +lst5=/tmp/$tool-lst5.$$
> +tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5"
> +
> +[ "$1" = "-strict" ] && strict=$1 && shift
> +[ "$1" = "-?" ] && usage
> +[ "$2" = "" ] && usage "Must specify both PREVIOUS and CURRENT"
> +
> +trap "rm -f $tmps" 0 1 2 3 5 9 13 15
> +exit_status=0
>
> -if [ "$2" = "" ]; then
> -       echo "Usage: $0 previous current" >&2
> -       exit 2
> +if [ -d "$1" -a -d "$2" ] ; then
> +       find "$1" \( ! -name config.log \) -name '*.log' >$lst1
> +       find "$2" \( ! -name config.log \) -name '*.log' >$lst2
> +       echo "# Comparing directories"
> +       echo "## Dir1=$1: `cat $lst1 | wc -l` log files"
> +       echo "## Dir2=$2: `cat $lst2 | wc -l` log files"
> +       echo
> +       # remove leading directory components to compare
> +       sed -e "s|^$1/||" $lst1 | sort >$lst3
> +       sed -e "s|^$2/||" $lst2 | sort >$lst4
> +       comm -23 $lst3 $lst4 >$lst5
> +       if [ -s $lst5 ] ; then
> +               echo "# Extra log files in Dir1=$1"
> +               sed -e "s|^|< $1/|" $lst5
> +               echo
> +               [ -n "$strict" ] && exit_status=`expr $exit_status + 1`
> +       fi
> +       comm -13 $lst3 $lst4 >$lst5
> +       if [ -s $lst5 ] ; then
> +    

PATCH: Add capability to contrib/compare_tests to handle directories

2011-09-07 Thread Quentin Neill
Hi,

Should not change behavior for comparing two files (expect for usage
output), and also should be POSIX compliant.

Tested on x86_64 tests logs and test directories, would be interested
in help testing on other platforms.

Ok to commit?
-- 
Quentin Neill


>From 4d4fa9d094745ace0b6e51faadb2f3ea40cb7c7f Mon Sep 17 00:00:00 2001
From: Quentin Neill 
Date: Wed, 7 Sep 2011 12:04:35 -0500
Subject: [PATCH] Add capability to compare test log directories.

---
 contrib/ChangeLog |4 ++
 contrib/compare_tests |  107 -
 2 files changed, 101 insertions(+), 10 deletions(-)

diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index 07adb58..e2007e7 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,7 @@
+2011-09-07  Quentin Neill  
+
+   * compare_tests: Add capability to compare test log directories.
+
 2011-08-25  Rainer Orth  

* gcc_update: Determine svn branch from hg convert_revision.
diff --git a/contrib/compare_tests b/contrib/compare_tests
index bed9742..b0e3321 100755
--- a/contrib/compare_tests
+++ b/contrib/compare_tests
@@ -2,13 +2,36 @@
 # This script automatically test the given tool with the tool's test cases,
 # reporting anything of interest.

-# exits with 0 if there is nothing of interest
-# exits with 1 if there is something interesting
-# exits with 2 if an error occurred
-
-# Give two .sum files to compare them
+usage()
+{
+   if [ -n "$1" ] ; then
+   echo "$0: Error: $1" >&2
+   echo >&2
+   fi
+   cat >&2 <
+# Subdir comparison added by Quentin Neill 

 tool=gxx

@@ -16,10 +39,72 @@ tmp1=/tmp/$tool-testing.$$a
 tmp2=/tmp/$tool-testing.$$b
 now_s=/tmp/$tool-testing.$$d
 before_s=/tmp/$tool-testing.$$e
+lst1=/tmp/$tool-lst1.$$
+lst2=/tmp/$tool-lst2.$$
+lst3=/tmp/$tool-lst3.$$
+lst4=/tmp/$tool-lst4.$$
+lst5=/tmp/$tool-lst5.$$
+tmps="$tmp1 $tmp2 $now_s $before_s $lst1 $lst2 $lst3 $lst4 $lst5"
+
+[ "$1" = "-strict" ] && strict=$1 && shift
+[ "$1" = "-?" ] && usage
+[ "$2" = "" ] && usage "Must specify both PREVIOUS and CURRENT"
+
+trap "rm -f $tmps" 0 1 2 3 5 9 13 15
+exit_status=0

-if [ "$2" = "" ]; then
-   echo "Usage: $0 previous current" >&2
-   exit 2
+if [ -d "$1" -a -d "$2" ] ; then
+   find "$1" \( ! -name config.log \) -name '*.log' >$lst1
+   find "$2" \( ! -name config.log \) -name '*.log' >$lst2
+   echo "# Comparing directories"
+   echo "## Dir1=$1: `cat $lst1 | wc -l` log files"
+   echo "## Dir2=$2: `cat $lst2 | wc -l` log files"
+   echo
+   # remove leading directory components to compare
+   sed -e "s|^$1/||" $lst1 | sort >$lst3
+   sed -e "s|^$2/||" $lst2 | sort >$lst4
+   comm -23 $lst3 $lst4 >$lst5
+   if [ -s $lst5 ] ; then
+   echo "# Extra log files in Dir1=$1"
+   sed -e "s|^|< $1/|" $lst5
+   echo
+   [ -n "$strict" ] && exit_status=`expr $exit_status + 1`
+   fi
+   comm -13 $lst3 $lst4 >$lst5
+   if [ -s $lst5 ] ; then
+   echo "# Extra log files in Dir2=$2"
+   sed -e "s|^|> $2/|" $lst5
+   echo
+   [ -n "$strict" ] && exit_status=`expr $exit_status + 1`
+   fi
+   comm -12 $lst3 $lst4 | sort -u >$lst5
+   if [ ! -s $lst5 ] ; then
+   echo "# No common log files"
+   exit_status=`expr $exit_status + 1`
+   exit $exit_status
+   fi
+   cmnlogs=`cat $lst5 | wc -l`
+   echo "# Comparing $cmnlogs common log files"
+   for fname in `cat $lst5`
+   do
+   f1="$1/$fname"
+   f2="$2/$fname"
+   echo "## ${CONFIG_SHELL-/bin/sh} $0 $strict $f1 $f2"
+   ${CONFIG_SHELL-/bin/sh} $0 $strict $f1 $f2
+   ret=$?
+   if [ $ret -ne 0 ]; then
+   exit_status=`expr $exit_status + 1`
+   echo "## Differences found: $fname"
+   fi
+   done
+   if [ $exit_status -ne 0 ]; then
+   echo "# $exit_status differences in $cmnlogs common log files 
found"
+   else
+   echo "# No differences found in $cmnlogs common log files"
+   fi
+   exit $exit_status
+elif [ -d "$1" -o -d "$2" ] ; then
+   usage "Must specify either two directories or two files"
 fi

 sed 's/^XFAIL/FAIL/; s/^XPASS/PASS/' < "$1" | awk '/^Running target /
{target =

Re: [PATCH] Fix ICEs with -mxop __builtin_ia32_vpermil2p[sd]{,256} and __builtin_ia32_vprot[bwdq]i intrinsics (PR target/49411)

2011-06-16 Thread Quentin Neill
On Thu, Jun 16, 2011 at 3:34 PM, Quentin Neill
 wrote:
> On Wed, Jun 15, 2011 at 11:40 AM, Quentin Neill
>  wrote:
>> On Wed, Jun 15, 2011 at 4:54 AM, Jakub Jelinek  wrote:
>>> Hi!
>>>
>>> All of these _mm{,256}_permute2_p[sd] and _mm_roti_epi{8,16,32,64}
>>> intrinsics ICE if the last argument is constant integer, but not in the
>>> expected range.
>>>
>>> I could only find MSFT documentation for these intrinsics, where for
>>> *permute2* it says that the last argument must be 0, 1, 2 or 3,
>>> for *roti* it says that the last argument is integer rotation count,
>>> preferrably constant and that if count is negative, it performs right
>>> rotation instead of left rotation.
>>> This patch adjusts the builtins to match that, if we want to instead
>>> e.g. always mandate _mm_roti_epi* last argument is constant integer,
>>> or constant integer in the range -N+1 .. N-1 where N is the number
>>> after _mm_roti_epi, or in the range 0 .. N-1, it can be easily adjusted.
>>>
>>> Regtested on x86_64-linux {-m32,-m64}, unfortunately on a SandyBridge
>>> box, so I couldn't verify if xop-rotate[12]-int.c actually succeeds
>>> on xop capable HW.
>>>
>>> [snip]
>>>
>>>        Jakub
>>>
>>
>> I will test on AMD HW.
>> --
>> Quentin
>
> Regtested on x86_64-linux on AMD Family 16h, and verified the
> xop-rotate[12]-int tests ran and passed.
> --
> Quentin

Does it need to also handle the VCVTP[SH]2P[HS] insns like this?

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 675888f..584f722 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -25571,6 +25571,10 @@ ix86_expand_multi_arg_builtin (enum insn_code
icode, tree exp, rtx target,
case CODE_FOR_xop_vpermil2v4sf3:
case CODE_FOR_xop_vpermil2v4df3:
case CODE_FOR_xop_vpermil2v8sf3:
+   case CODE_FOR_vcvtph2ps:
+   case CODE_FOR_vcvtph2ps256:
+   case CODE_FOR_vcvtps2ph:
+   case CODE_FOR_vcvtps2ph256:
  if (!CONST_INT_P (op))
{
  error ("last argument must be an immediate");

-- 
Quentin


Re: [PATCH] Fix ICEs with -mxop __builtin_ia32_vpermil2p[sd]{,256} and __builtin_ia32_vprot[bwdq]i intrinsics (PR target/49411)

2011-06-16 Thread Quentin Neill
On Wed, Jun 15, 2011 at 11:40 AM, Quentin Neill
 wrote:
> On Wed, Jun 15, 2011 at 4:54 AM, Jakub Jelinek  wrote:
>> Hi!
>>
>> All of these _mm{,256}_permute2_p[sd] and _mm_roti_epi{8,16,32,64}
>> intrinsics ICE if the last argument is constant integer, but not in the
>> expected range.
>>
>> I could only find MSFT documentation for these intrinsics, where for
>> *permute2* it says that the last argument must be 0, 1, 2 or 3,
>> for *roti* it says that the last argument is integer rotation count,
>> preferrably constant and that if count is negative, it performs right
>> rotation instead of left rotation.
>> This patch adjusts the builtins to match that, if we want to instead
>> e.g. always mandate _mm_roti_epi* last argument is constant integer,
>> or constant integer in the range -N+1 .. N-1 where N is the number
>> after _mm_roti_epi, or in the range 0 .. N-1, it can be easily adjusted.
>>
>> Regtested on x86_64-linux {-m32,-m64}, unfortunately on a SandyBridge
>> box, so I couldn't verify if xop-rotate[12]-int.c actually succeeds
>> on xop capable HW.
>>
>> [snip]
>>
>>        Jakub
>>
>
> I will test on AMD HW.
> --
> Quentin

Regtested on x86_64-linux on AMD Family 16h, and verified the
xop-rotate[12]-int tests ran and passed.
-- 
Quentin


Re: [PATCH] Fix ICEs with -mxop __builtin_ia32_vpermil2p[sd]{,256} and __builtin_ia32_vprot[bwdq]i intrinsics (PR target/49411)

2011-06-15 Thread Quentin Neill
On Wed, Jun 15, 2011 at 4:54 AM, Jakub Jelinek  wrote:
> Hi!
>
> All of these _mm{,256}_permute2_p[sd] and _mm_roti_epi{8,16,32,64}
> intrinsics ICE if the last argument is constant integer, but not in the
> expected range.
>
> I could only find MSFT documentation for these intrinsics, where for
> *permute2* it says that the last argument must be 0, 1, 2 or 3,
> for *roti* it says that the last argument is integer rotation count,
> preferrably constant and that if count is negative, it performs right
> rotation instead of left rotation.
> This patch adjusts the builtins to match that, if we want to instead
> e.g. always mandate _mm_roti_epi* last argument is constant integer,
> or constant integer in the range -N+1 .. N-1 where N is the number
> after _mm_roti_epi, or in the range 0 .. N-1, it can be easily adjusted.
>
> Regtested on x86_64-linux {-m32,-m64}, unfortunately on a SandyBridge
> box, so I couldn't verify if xop-rotate[12]-int.c actually succeeds
> on xop capable HW.
>
> 2011-06-15  Jakub Jelinek  
>
>        PR target/49411
>        * config/i386/i386.c (ix86_expand_multi_arg_builtins): If
>        last_arg_constant and last argument doesn't match its predicate,
>        for xop_vpermil23 error out and for xop_rotl3
>        if it is CONST_INT, mask it, otherwise expand using rotl3.
>
>        * gcc.target/i386/xop-vpermil2px-1.c: New test.
>        * gcc.target/i386/xop-vpermil2px-2.c: New test.
>        * gcc.target/i386/xop-rotate1-int.c: New test.
>        * gcc.target/i386/xop-rotate2-int.c: New test.
>
> --- gcc/config/i386/i386.c.jj   2011-06-09 16:56:56.0 +0200
> +++ gcc/config/i386/i386.c      2011-06-15 11:17:12.0 +0200
> @@ -26149,16 +26149,66 @@ ix86_expand_multi_arg_builtin (enum insn
>       int adjust = (comparison_p) ? 1 : 0;
>       enum machine_mode mode = insn_data[icode].operand[i+adjust+1].mode;
>
> -      if (last_arg_constant && i == nargs-1)
> +      if (last_arg_constant && i == nargs - 1)
>        {
> -         if (!CONST_INT_P (op))
> +         if (!insn_data[icode].operand[i + 1].predicate (op, mode))
>            {
> -             error ("last argument must be an immediate");
> -             return gen_reg_rtx (tmode);
> +             enum insn_code new_icode = icode;
> +             switch (icode)
> +               {
> +               case CODE_FOR_xop_vpermil2v2df3:
> +               case CODE_FOR_xop_vpermil2v4sf3:
> +               case CODE_FOR_xop_vpermil2v4df3:
> +               case CODE_FOR_xop_vpermil2v8sf3:
> +                 if (!CONST_INT_P (op))
> +                   {
> +                     error ("last argument must be an immediate");
> +                     return gen_reg_rtx (tmode);
> +                   }
> +                 error ("last argument must be in the range 0 .. 3");
> +                 return gen_reg_rtx (tmode);
> +               case CODE_FOR_xop_rotlv2di3:
> +                 new_icode = CODE_FOR_rotlv2di3;
> +                 goto xop_rotl;
> +               case CODE_FOR_xop_rotlv4si3:
> +                 new_icode = CODE_FOR_rotlv4si3;
> +                 goto xop_rotl;
> +               case CODE_FOR_xop_rotlv8hi3:
> +                 new_icode = CODE_FOR_rotlv8hi3;
> +                 goto xop_rotl;
> +               case CODE_FOR_xop_rotlv16qi3:
> +                 new_icode = CODE_FOR_rotlv16qi3;
> +               xop_rotl:
> +                 if (CONST_INT_P (op))
> +                   {
> +                     int mask = GET_MODE_BITSIZE (GET_MODE_INNER (tmode)) - 
> 1;
> +                     op = GEN_INT (INTVAL (op) & mask);
> +                     gcc_checking_assert
> +                       (insn_data[icode].operand[i + 1].predicate (op, 
> mode));
> +                   }
> +                 else
> +                   {
> +                     gcc_checking_assert
> +                       (nargs == 2
> +                        && insn_data[new_icode].operand[0].mode == tmode
> +                        && insn_data[new_icode].operand[1].mode == tmode
> +                        && insn_data[new_icode].operand[2].mode == mode
> +                        && insn_data[new_icode].operand[0].predicate
> +                           == insn_data[icode].operand[0].predicate
> +                        && insn_data[new_icode].operand[1].predicate
> +                           == insn_data[icode].operand[1].predicate);
> +                     icode = new_icode;
> +                     goto non_constant;
> +                   }
> +                 break;
> +               default:
> +                 gcc_unreachable ();
> +               }
>            }
>        }
>       else
>        {
> +       non_constant:
>          if (VECTOR_MODE_P (mode))
>            op = safe_vector_operand (op, mode);
>
> --- gcc/testsuite/gcc.target/i386/xop-vpermil2px-1.c.jj 2011-06-15 
> 10:18:29.0 +0200
> +++ gcc/testsuite/gcc.target/i386/xop-vpermil2px-1.c    2011-06-15 
> 10:41:13.0 +0200
> @@ -0,0 +1,25 @@
> +/*

Re: [PATCH] fix vfmsubaddpd/vfmaddsubpd generation

2011-05-19 Thread Quentin Neill
On Wed, May 18, 2011 at 10:32 AM, Uros Bizjak  wrote:
> Hello!
>
>> This patch fixes an obvious problem: the fma4_fmsubadd/fma4_fmaddsub
>> instruction templates don't generate vfmsubaddpd/vfmaddsubpd because
>> they don't use 
>>
>> This passes bootstrap on x86_64 on trunk.  Okay to commit?
>
> See comments in the code.
>
>> BTW, I'm testing on gcc-4_6-branch.  Should I post a different patch
>> thread, or just use this one?
>
> No, the patch is clear and simple enough, you don't need to post it twice.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 3625d9b..e86ea4e 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2011-05-17  Harsha Jagasia  
> +
> +       * config/i386/sse.md (fma4_fmsubadd): Use .
> +       (fma4_fmaddsub): Likewise
> +
>  2011-05-17  Richard Guenther  
>
> ChangeLog should be included in the message body, not in the patch.
> Please see [1] for details.
>
>        * gimple.c (iterative_hash_gimple_type): Simplify singleton
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 291bffb..7c4e6dd 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1663,7 +1663,7 @@
>           (match_operand:VF 3 "nonimmediate_operand" "xm,x")]
>          UNSPEC_FMADDSUB))]
>   "TARGET_FMA4"
> -  "vfmaddsubps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  "vfmaddsubp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
>   [(set_attr "type" "ssemuladd")
>    (set_attr "mode" "")])
>
> No,  mode attribute resolves to ps and pd for VF mode
> iterator, so "vfmaddsub".
>
> @@ -1676,7 +1676,7 @@
>             (match_operand:VF 3 "nonimmediate_operand" "xm,x"))]
>          UNSPEC_FMADDSUB))]
>   "TARGET_FMA4"
> -  "vfmsubaddps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  "vfmsubaddp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
>   [(set_attr "type" "ssemuladd")
>    (set_attr "mode" "")])
>
> Same here.
>
> OK everywhere with these two changes.
>
> [1] http://gcc.gnu.org/contribute.html.
>
> Thanks,
> Uros.

Uros,

Thanks for the feedback and gentle reminder.

I will apply your changes, retesting on both branches, and then commit.
-- 
Quentin


[PATCH] fix vfmsubaddpd/vfmaddsubpd generation

2011-05-17 Thread Quentin Neill
This patch fixes an obvious problem: the fma4_fmsubadd/fma4_fmaddsub
instruction templates don't generate vfmsubaddpd/vfmaddsubpd because
they don't use 

This passes bootstrap on x86_64 on trunk.  Okay to commit?

BTW, I'm testing on gcc-4_6-branch.  Should I post a different patch
thread, or just use this one?
-- 
Quentin
From aa70d4f6180f1c6712888b7328723232b5da8bdc Mon Sep 17 00:00:00 2001
From: Quentin Neill 
Date: Tue, 17 May 2011 10:24:17 -0500
Subject: [PATCH] 2011-05-17  Harsha Jagasia  

* config/i386/sse.md (fma4_fmsubadd): Use .
(fma4_fmaddsub): Likewise
---
 gcc/ChangeLog  |5 +
 gcc/config/i386/sse.md |4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3625d9b..e86ea4e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-05-17  Harsha Jagasia  
+
+   * config/i386/sse.md (fma4_fmsubadd): Use .
+   (fma4_fmaddsub): Likewise
+
 2011-05-17  Richard Guenther  
 
* gimple.c (iterative_hash_gimple_type): Simplify singleton
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 291bffb..7c4e6dd 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1663,7 +1663,7 @@
   (match_operand:VF 3 "nonimmediate_operand" "xm,x")]
  UNSPEC_FMADDSUB))]
   "TARGET_FMA4"
-  "vfmaddsubps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "vfmaddsubp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssemuladd")
(set_attr "mode" "")])
 
@@ -1676,7 +1676,7 @@
 (match_operand:VF 3 "nonimmediate_operand" "xm,x"))]
  UNSPEC_FMADDSUB))]
   "TARGET_FMA4"
-  "vfmsubaddps\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  "vfmsubaddp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssemuladd")
(set_attr "mode" "")])
 
-- 
1.7.1