Re: [flac-dev] Two questions about RG in flac
On 6/3/14, Robert Kausch wrote:
> On 03.06.2014 16:45, lvqcl wrote:
>> 2) to ALL:
>> I attached a small program. Compile and run it.
>> * Does it work correctly when compiled with -O3 -msse2 options?
>> * If yes, does it work correctly when compiled with -O3 -funroll-loops
>> -msse2 options?
>> ( and what is the version of your GCC? )
>
> I further reduced the testcase (attached).
>
> The bug only occurs if N >= 64; presumably the second loop is only SSE2
> optimized if that's the case.
>
> The problem seems to be that sum is interpreted as a 64 bit value if
> SSE2 was used in the loop (the lower 32 bits of the result give the
> expected value). If sum is evaluated another time before or after (!)
> the printf, the problem goes away. For example, changing the last line
> to "return sum + 1;" lets the problem disappear.
>
> I confirmed the bug with GCC 4.6.3 on Ubuntu. As on Windows, only 32 bit
> code generation is affected.
>
> You should file a bug report with the GCC team.
>

With gcc-3.3.6, 3.4.6, 4.3.0 and gcc-4.9.1 (svn r210839) the output is normal:
Sum = 64.00 (should be equal to 64)

With gcc-4.8.3 (release version) it's broken:
Sum = 206158430272.00 (should be equal to 64)

With clang-3.4.1 (compiled with gcc-4.8.3) the output is normal again.

This is on i686-linux (fedora9, glibc-2.8, kernel-2.6.27.35)

___
flac-dev mailing list
flac-dev@xiph.org
http://lists.xiph.org/mailman/listinfo/flac-dev
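The "lower 32 bits" observation can be checked against the broken gcc-4.8.3 output quoted above. What follows is only a minimal sketch, assuming the printed 206158430272.00 is an exact integer value; the names low32, high32 and REPORTED_SUM are made up for illustration and are not part of the attached testcase.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the attached testcase. */

/* Lower 32 bits of a 64-bit value. */
static uint32_t low32(uint64_t v)
{
    return (uint32_t)(v & UINT64_C(0xFFFFFFFF));
}

/* Upper 32 bits of a 64-bit value. */
static uint32_t high32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Value printed by the broken gcc-4.8.3 build, assumed to be exact. */
static const uint64_t REPORTED_SUM = UINT64_C(206158430272);
```

low32(REPORTED_SUM) is 64 and high32(REPORTED_SUM) is 48, that is, 206158430272 = 48 * 2^32 + 64. The low half carries the expected sum, which fits the idea that the conversion reads 64 bits where only 32 are valid. The sketch does not explain where the 48 in the upper half comes from.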
Re: [flac-dev] Two questions about RG in flac
On 03.06.2014 16:45, lvqcl wrote:
> 2) to ALL:
> I attached a small program. Compile and run it.
> * Does it work correctly when compiled with -O3 -msse2 options?
> * If yes, does it work correctly when compiled with -O3 -funroll-loops
> -msse2 options?
> ( and what is the version of your GCC? )

I further reduced the testcase (attached).

The bug only occurs if N >= 64; presumably the second loop is only SSE2
optimized if that's the case.

The problem seems to be that sum is interpreted as a 64 bit value if
SSE2 was used in the loop (the lower 32 bits of the result give the
expected value). If sum is evaluated another time before or after (!)
the printf, the problem goes away. For example, changing the last line
to "return sum + 1;" lets the problem disappear.

I confirmed the bug with GCC 4.6.3 on Ubuntu. As on Windows, only 32 bit
code generation is affected.

You should file a bug report with the GCC team.

#include <stdio.h>

#define N 64 /* problem is triggered only if N >= 64 */

unsigned A[N];

int main()
{
    unsigned i, sum = 0; /* both sum and A[] need to be unsigned for the bug to happen */

    for (i = 0; i < N; i++) A[i] = 1;   /* fill the array with ones */
    for (i = 0; i < N; i++) sum += A[i]; /* this loop gets vectorized when N >= 64 */

    /* the unsigned -> float conversion is where the wrong value shows up */
    printf("Sum = %f (should be equal to %i)\n", (float) sum, N);

    return 0;
}
Re: [flac-dev] Two questions about RG in flac
On 03.06.2014 16:45, lvqcl wrote:
> 2) to ALL:
> I attached a small program. Compile and run it.
> * Does it work correctly when compiled with -O3 -msse2 options?
> * If yes, does it work correctly when compiled with -O3 -funroll-loops
> -msse2 options?
> ( and what is the version of your GCC? )

Tested various versions of TDM-GCC on Windows.

A 32 bit executable produced with TDM-GCC 4.8.1 fails as soon as -O3 and
SSE2 come together. SSE2 is enabled by -O3 here, so compiling with -O3 is
sufficient to trigger the bug. Compiling with -O3 -mno-sse2 produces a
correctly working executable, just as -O2 -msse2 does. -funroll-loops
does not make any difference.

The same happens with TDM-GCC 4.4.1, 4.5.0, 4.6.1 and 4.7.1; the only
difference is that -O3 does not include SSE2 there, so it has to be
enabled manually with -msse2 to trigger the problem.

TDM-GCC 4.3.2 produces a correctly working executable even with
-O3 -msse2.

64 bit executables produced with any of the tested GCC versions work
correctly in all cases.
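Since the results above depend on the compiler version, on 32 vs. 64 bit code generation and on whether SSE2 ends up enabled, it may help to have the test program report these itself. A rough sketch using GCC's predefined macros (__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__, __SSE2__, __i386__, __x86_64__); the function name describe_build and the output format are assumptions made for illustration only.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of the attached testcase: formats the
   compiler version and the relevant target settings into buf.
   Returns whatever snprintf returns. */
static int describe_build(char *buf, size_t size)
{
#if defined(__GNUC__)
    /* Note: clang also defines these macros, with its own values. */
    const int major = __GNUC__;
    const int minor = __GNUC_MINOR__;
    const int patch = __GNUC_PATCHLEVEL__;
#else
    const int major = 0, minor = 0, patch = 0; /* not a GCC-compatible compiler */
#endif

#if defined(__SSE2__)
    const char *sse2 = "yes"; /* set by -msse2 or by a default that implies it */
#else
    const char *sse2 = "no";
#endif

#if defined(__x86_64__)
    const char *target = "x86-64";
#elif defined(__i386__)
    const char *target = "i386";
#else
    const char *target = "other";
#endif

    return snprintf(buf, size, "GCC-compatible %d.%d.%d, target %s, SSE2 %s",
                    major, minor, patch, target, sse2);
}
```

Printing this string next to the "Sum = ..." line would show directly whether a plain -O3 build already has SSE2 enabled (as reported for TDM-GCC 4.8.1) or whether -msse2 has to be added by hand.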