I can reproduce this bug outside of banshee and mono *wheee*. Also,
I think I understand the problem too now, and what really causes it.

First of all, the repro steps:

| $ cat a.c
| #include <stdio.h>
| #include <liboil/liboil.h>
| 
| int main() {
|     printf("oil_init...\n");
|     oil_init();
|     printf("Done...\n");
|     return 0;
| }
| 
| $ gcc -I/usr/include/liboil-0.3 -Wall -ggdb a.c -c -o a.o -mpreferred-stack-boundary=2
| $ gcc a.o /usr/lib/liboil-0.3.so
| $ ./a.out
| oil_init...
| zsh: segmentation fault  ./a.out

The segfault is caused by the optimize_all call inside of oil_init,
when it tries all possible implementations.

The crash happens here:
| 0xb453c013 <composite_in_argb_sse_2pix+35>:     movdqa 0xffff704c(%ebx),%xmm0
| 0xb453c01b <composite_in_argb_sse_2pix+43>:     movdqa %xmm0,0xffffffc8(%ebp)  <=== SEGV

This corresponds to the following code in composite_sse_2pix.c:
| static inline __m128i
| muldiv_255_sse2(__m128i a, __m128i b)
| {
|   __m128i ret;
|   __m128i roundconst = MC(8x0080);
| 
|   ret = _mm_mullo_epi16(a, b);
|   ret = _mm_adds_epu16(ret, roundconst);

The problem is that gcc somehow believes it has to copy that
MC(8x0080) constant to the stack. Gcc copies the constant using
movdqa, which requires its memory operands to be 16-byte aligned.
The faulting instruction is the store to the stack slot at
-0x38(%ebp). If that address is not a multiple of 16, the CPU raises
a #GP exception, which the kernel translates to a SEGV [1].
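
To make the alignment requirement concrete, here is a minimal sketch
(not liboil code; is_aligned_16, buf and count_safe_slots are names
made up for illustration). It assumes a stack kept at only a 4-byte
boundary, so that a 16-byte slot can start at any multiple of 4; just
one in four of those addresses is a legal movdqa memory operand:

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if p is a multiple of 16, i.e. usable as a movdqa memory operand. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Backing store forced to 16-byte alignment (GCC attribute syntax), so that
 * the offsets used below are predictable. */
static unsigned char buf[32] __attribute__((aligned(16)));

/* Hypothetical demo: walk the four possible 4-byte-aligned start offsets
 * inside one 16-byte block and count how many are safe for movdqa. */
static int count_safe_slots(void)
{
    int safe = 0;

    assert(is_aligned_16(buf));          /* sanity check on the attribute */
    for (int off = 0; off < 16; off += 4)
        safe += is_aligned_16(buf + off);
    return safe;
}
```

Only offset 0 passes the check, so whether the copy faults depends
entirely on where the frame happens to start.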

Normally this is not problematic, since gcc keeps the stack aligned
to 16 bytes by default. However, this doesn't seem to hold for
mono/banshee, or if one manually changes that alignment (as
-mpreferred-stack-boundary=2 in the repro above does: 2^2 = 4 bytes).

Gcc can be convinced to optimize roundconst away and directly use
the MC(8x0080) constant, so that particular segfault goes away
(patches attached).  There are however several other segfaults in
other places.

A fix can be found for some of them, but the problem is that you'd
have to prevent the use of any __m128 constants on the stack. This
means no local variables, no implicit copies by gcc, ...

That's quite a major PITA, if it's even possible at all.
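
For the loads and stores that are written out explicitly, the
unaligned intrinsics (movdqu) at least never fault on a misaligned
address, and a constant can be built with _mm_set1_epi16 instead of
being read through a struct. The following is only a sketch under
those assumptions: muldiv_255_lanes and muldiv_255_scalar are
hypothetical helpers, not the liboil functions, and nothing here stops
gcc from spilling a value to the stack with movdqa on its own, which
is the hard part.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>

/* Scalar reference: rounded a * b / 255 for a, b in 0..255. */
static uint16_t muldiv_255_scalar(uint16_t a, uint16_t b)
{
    unsigned t = (unsigned)a * b + 0x80;
    return (uint16_t)((t + (t >> 8)) >> 8);
}

/* Hypothetical helper: the same computation on eight 16-bit lanes.
 * a, b and out may have any alignment, because only the unaligned
 * load/store intrinsics are used. */
static void muldiv_255_lanes(const uint16_t *a, const uint16_t *b,
                             uint16_t *out)
{
    assert(a && b && out);

    __m128i va = _mm_loadu_si128((const __m128i *)a);   /* movdqu load */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    __m128i ret = _mm_mullo_epi16(va, vb);
    ret = _mm_adds_epu16(ret, _mm_set1_epi16(0x0080));  /* rounding constant */
    ret = _mm_adds_epu16(ret, _mm_srli_epi16(ret, 8));
    ret = _mm_srli_epi16(ret, 8);

    _mm_storeu_si128((__m128i *)out, ret);               /* movdqu store */
}
```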

The other possibility is to tell gcc that it's got to 16-byte align
those variables, no matter what. There's an alignment attribute for
that, which can be either applied to variables[2] or to types[3].
However, when I tried it out it didn't work: gcc (4.0) always
generated the same faulty code, which relies on the frame starting
at a multiple of 16.
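
For reference, the two spellings of the attribute look roughly like
this. It is a sketch of the syntax only: vec128_t is a hypothetical
stand-in for __m128i, and round_8x0080 and local_is_aligned are
made-up names. Static objects get their alignment from the linker and
are not the problem; whether a local variable actually honours the
attribute depends on the compiler, and as said above it made no
difference with gcc-4.0.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute applied to a type: every object of this type requests
 * 16-byte alignment. Hypothetical stand-in for __m128i. */
typedef struct {
    uint64_t ull[2];
} __attribute__((aligned(16))) vec128_t;

/* Attribute applied to a single variable. */
static const uint64_t round_8x0080[2] __attribute__((aligned(16))) = {
    0x0080008000800080ULL, 0x0080008000800080ULL
};

static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Reports whether a local of the aligned type really landed on a
 * 16-byte boundary in this build; the result is compiler-dependent. */
static int local_is_aligned(void)
{
    vec128_t tmp = { { 0, 0 } };

    assert(sizeof(tmp) == 16);
    return is_aligned_16(&tmp);
}
```

Calling local_is_aligned() from a binary built with the flags from the
repro shows whether a given compiler realigns the slot or not.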

To conclude, manually fixing up all this stuff seems impossible, and
getting gcc to solve it didn't work for me either.

So unless you have any better ideas, we could ask the gcc folks
whether they know a solution for this.

HTH,
Christian Aichinger

[1] http://enrico.phys.cmu.edu/QCDcluster/intel/vtune/reference/vc183.htm
[2] http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Variable-Attributes.html
[3] http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Type-Attributes.html
--- liboil-0.3.9.orig/liboil/sse/composite_sse_2pix.c   2006-06-08 04:41:50.000000000 +0200
+++ liboil-0.3.9/liboil/sse/composite_sse_2pix.c        2006-06-08 04:43:28.000000000 +0200
@@ -41,20 +41,10 @@
  * the channel value in the low byte.  This means 2 pixels per pass.
  */
 
-union m128_int {
-  __m128i m128;
-  uint64_t ull[2];
-};
-
-static const struct _SSEData {
-  union m128_int sse_8x00ff;
-  union m128_int sse_8x0080;
-} c = {
-    .sse_8x00ff.ull =  {0x00ff00ff00ff00ffULL, 0x00ff00ff00ff00ffULL},
-    .sse_8x0080.ull =  {0x0080008000800080ULL, 0x0080008000800080ULL},
-};
+static const __m128i c_sse_8x00ff = {0x00ff00ff00ff00ffULL, 0x00ff00ff00ff00ffULL};
+static const __m128i c_sse_8x0080 = {0x0080008000800080ULL, 0x0080008000800080ULL};
 
-#define MC(x) (c.sse_##x.m128)
+#define MC(x) (c_sse_##x)
 
 /* Shuffles the given value such that the alpha for each pixel appears in each
  * channel of the pixel.
--- liboil-0.3.9.orig/liboil/sse/composite_sse_4pix.c   2006-06-08 04:41:50.000000000 +0200
+++ liboil-0.3.9/liboil/sse/composite_sse_4pix.c        2006-06-08 05:15:38.000000000 +0200
@@ -32,20 +32,10 @@
 #include <emmintrin.h>
 #include <liboil/liboilcolorspace.h>
 
-union m128_int {
-  __m128i m128;
-  uint64_t ull[2];
-};
-
-static const struct _SSEData {
-  union m128_int sse_16xff;
-  union m128_int sse_8x0080;
-} c = {
-    .sse_16xff.ull =   {0xffffffffffffffffULL, 0xffffffffffffffffULL},
-    .sse_8x0080.ull =  {0x0080008000800080ULL, 0x0080008000800080ULL},
-};
+static const __m128i c_sse_16xff = {0xffffffffffffffffULL, 0xffffffffffffffffULL};
+static const __m128i c_sse_8x0080 = {0x0080008000800080ULL, 0x0080008000800080ULL};
 
-#define MC(x) (c.sse_##x.m128)
+#define MC(x) (c_sse_##x)
 
 /* non-SSE2 compositing support */
 #define COMPOSITE_OVER(d,s,m) ((d) + (s) - oil_muldiv_255((d),(m)))
