------- Comment #35 from michaelni at gmx dot at 2008-03-20 17:18 -------
Subject: Re: compiled trivial vector intrinsic code is inefficient

On Thu, Mar 20, 2008 at 09:49:22AM -0000, ubizjak at gmail dot com wrote:
>
> ------- Comment #34 from ubizjak at gmail dot com 2008-03-20 09:49 -------
> (In reply to comment #33)
>
> > Anyway I am glad ffmpeg compiles fine under icc.
>
> Me too. Now you will troll in their support lists.

No, truth be told, I don't plan to switch to icc yet. Somehow I do prefer to
use free tools. Of course, if the gap becomes too big, I, as well as most
others, will switch to icc ...

Also, ffmpeg uses asm() almost entirely instead of intrinsics, so this alone
is not as much of a problem for ffmpeg as it is for others who followed the
recommendation that "intrinsics are better than asm".

About trolling: well, I made no attempt to reply politely and diplomatically,
no. But "solving" a "problem" in some use case by dropping support for that
use case is kind of extreme.

The way I see it:
* It's non-trivial to place emms optimally and automatically
* There needs to be an emms between MMX code and FPU code

The solutions to this would be any one of
A. let the programmer place emms, as has been done in the past
B. don't support MMX at all
C. don't support the x87 FPU at all
D. place emms after every bunch of MMX instructions
E. solve a quite non-trivial problem and place emms optimally

The solution which has apparently been selected is B. Why was that chosen,
instead of, let's say, A? If I write SIMD code, then I know that I need an
emms on x86. It's trivial for the programmer to place it optimally.

[...]

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14552
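
To make option A concrete, here is a minimal sketch in C of the programmer placing emms by hand. It assumes GCC-style MMX intrinsics from <mmintrin.h>. The helper names add4_i16 and mean_of_sums are made up for this illustration and come from neither ffmpeg nor GCC. On targets without MMX the sketch falls back to plain scalar code so that it still builds.

```c
#include <stdint.h>
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical helper: lane-wise sum of two groups of four 16-bit values. */
void add4_i16(const int16_t *a, const int16_t *b, int16_t *out)
{
#if defined(__MMX__)
    __m64 va, vb, vr;
    memcpy(&va, a, sizeof va);      /* load 4 x int16 into an MMX value */
    memcpy(&vb, b, sizeof vb);
    vr = _mm_add_pi16(va, vb);      /* packed 16-bit add (wraps on overflow) */
    memcpy(out, &vr, sizeof vr);
#else
    /* Scalar fallback for targets without MMX. */
    for (int i = 0; i < 4; i++)
        out[i] = (int16_t)(a[i] + b[i]);
#endif
}

/* Hypothetical caller: MMX work first, then floating-point work. */
long double mean_of_sums(const int16_t *a, const int16_t *b)
{
    int16_t sums[4];
    long double total = 0.0L;

    add4_i16(a, b, sums);

#if defined(__MMX__)
    /* Option A: the programmer places emms once, after the MMX code and
     * before any floating-point code. emms marks the x87 registers as
     * empty again so that x87 instructions can use them. */
    _mm_empty();
#endif

    /* With GCC on x86, long double arithmetic is typically done on x87. */
    for (int i = 0; i < 4; i++)
        total += sums[i];
    return total / 4.0L;
}
```

The point of the sketch is only the position of the single _mm_empty() call: it sits at the boundary between the MMX part and the floating-point part, which the author of the code knows and the compiler would have to work out.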