On Sun, Oct 25, 2015 at 5:10 PM, Siarhei Siamashka
<siarhei.siamas...@gmail.com> wrote:
> On Sun, 25 Oct 2015 13:13:09 -0700
> Matt Turner <matts...@gmail.com> wrote:
>
>> On Sun, Oct 11, 2015 at 8:59 PM, Matt Turner <matts...@gmail.com> wrote:
>> > We had lots of hacks to handle the inability to include xmmintrin.h
>> > without compiling with -msse (lest SSE instructions be used in
>> > pixman-mmx.c). Some recent version of gcc relaxed this restriction.
>> >
>> > Change configure.ac to test that xmmintrin.h can be included and that we
>> > can use some intrinsics from it, and remove the work-around code from
>> > pixman-mmx.c.
>> >
>> > Evidently allows gcc 4.9.3 to optimize better as well:
>> >
>> >    text    data     bss     dec     hex filename
>> >  657078   30848     680  688606   a81de libpixman-1.so.0.33.3 before
>> >  656710   30848     680  688238   a806e libpixman-1.so.0.33.3 after
>> >
>> > Signed-off-by: Matt Turner <matts...@gmail.com>
>> > ---
>>
>> Ugh. This is apparently not sufficient...
>>
>> https://bugs.gentoo.org/show_bug.cgi?id=564024
>>
>> GCC allows you to *include* xmmintrin.h without enabling SSE, but it
>> still doesn't allow you to use any of the functions:
>>
>> conftest.c: In function ‘main’:
>> /usr/lib/gcc/x86_64-pc-linux-gnu/5.1.0/include/xmmintrin.h:1124:1:
>> error: inlining failed in call to always_inline ‘_mm_mulhi_pu16’:
>> target specific option mismatch
>> _mm_mulhi_pu16 (__m64 __A, __m64 __B)
>> ^
>> conftest.c:12:7: error: called from here
>> w = _mm_mulhi_pu16(w, w);
>
> Oh, looks like the restriction used to be relaxed for a while, but then
> GCC 4.9 started to be strict again:
> https://bugzilla.redhat.com/show_bug.cgi?id=1092991#c1
>
>> I'm not sure what to do except to revert.
>
> The real problem is that GCC does not provide a separate option for
> MMX2 (a common subset of 3DNOW and SSE). We usually solve compiler
> problems by reporting bugs to compiler developers. This particular
> case had not been handled according to the usual rule, and now
> we have a nice practical demonstration of the consequences ;-)
>
> BTW, we can still report a bug to GCC. Better late than never.

Yeah, I suppose. The disappointing thing is that Google says an
-m3dnowext flag existed at one point...

>> The MMX but no SSE case is important, at least it was in the past
>> because of OLPC's XO-1.
>
> I'm not sure how many OLPC XO-1 laptops might be still remaining in
> real use in the hands of real people:
> http://www.olpcnews.com/about_olpc_news/goodbye_one_laptop_per_child.html
>
>> Suggestions besides reverting this?
>
> Because OLPC XO-1 is using the AMD Geode processor, we could probably
> treat the code in pixman-mmx.c as 3dnow optimizations on x86 hardware?

The problem is that -m3dnow isn't sufficient. The instructions we want
to use are a subset of SSE that AMD implemented in the Athlon. We need
an -m3dnowext flag. We can't pass -march=athlon in MMX_CFLAGS either,
since the user is likely to have specified a -march= value of their own.

> Another option is to start using assembly instead of intrinsics.
> Unless a miracle happens and somebody decides to pay for this job,
> we definitely don't have resources to do a high quality assembly
> implementation for MMX/MMX2. But we still can take the assembly
> output of GCC and tweak it a bit. This is ugly and not very
> maintainable though.

Been there, done that with ARMv6. Not interested.

> Or we could simply do nothing and finally retire MMX support on x86.
> If OLPC XO-1 users still do exist, they can always contact us.

I don't care so much about XO-1, but I do want to retain the ability
to test the MMX code on x86. iwMMXt/loongson systems are slow, and
most development can be done on a fast desktop this way.
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman
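
For anyone following the thread, here is a rough sketch of what the
intrinsic in the conftest error computes, and of how its use could be
guarded at compile time. It is an illustration only, not code from
pixman or from the patch. The helper names (mulhi_u16, mulhi_pu16_ref,
mulhi_pu16_matches_ref) and the HAVE_MULHI_PU16_INTRINSIC macro are
hypothetical. The only assumptions about the toolchain are that
__MMX__ and __SSE__ are predefined when the corresponding instruction
sets are enabled, and that _mm_mulhi_pu16 returns, per 16-bit lane,
the high half of the unsigned product.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Only touch the intrinsic when the compiler says both MMX and SSE are
 * enabled; otherwise GCC rejects the call with "target specific option
 * mismatch", as in the conftest output quoted above. */
#if defined(__MMX__) && defined(__SSE__)
#include <xmmintrin.h>
#define HAVE_MULHI_PU16_INTRINSIC 1
#else
#define HAVE_MULHI_PU16_INTRINSIC 0
#endif

/* Hypothetical helper: high 16 bits of an unsigned 16x16 multiply,
 * which is what _mm_mulhi_pu16 produces in each lane. */
static uint16_t
mulhi_u16 (uint16_t a, uint16_t b)
{
    return (uint16_t) (((uint32_t) a * (uint32_t) b) >> 16);
}

/* Scalar reference for all four lanes of a 64-bit MMX value. */
static void
mulhi_pu16_ref (const uint16_t a[4], const uint16_t b[4], uint16_t out[4])
{
    int i;

    for (i = 0; i < 4; i++)
        out[i] = mulhi_u16 (a[i], b[i]);
}

/* Returns 1 when the intrinsic agrees with the scalar reference.
 * When the intrinsic is not available it returns 1 without comparing. */
static int
mulhi_pu16_matches_ref (const uint16_t a[4], const uint16_t b[4])
{
    uint16_t expect[4];

    mulhi_pu16_ref (a, b, expect);
#if HAVE_MULHI_PU16_INTRINSIC
    {
        uint16_t got[4];
        __m64 va, vb, vr;

        memcpy (&va, a, sizeof va);
        memcpy (&vb, b, sizeof vb);
        vr = _mm_mulhi_pu16 (va, vb);
        memcpy (got, &vr, sizeof got);
        _mm_empty ();   /* leave MMX state before any later x87 use */
        return memcmp (got, expect, sizeof got) == 0;
    }
#else
    (void) expect;
    return 1;
#endif
}
```

A scalar reference like this is handy for checking results on a fast
desktop, but it does not solve the actual problem in this thread:
without -msse (or some narrower flag) the intrinsic path is simply
compiled out.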