Hi all, I'm making some slow progress in my understanding of this bug.
So, I started with the assumption that everything was caused by these "Address out of bounds" problems: _X11TransWrite (ciptr=0x30c, buf=0x30c <Address 0x30c out of bounds>, size=780) And _X11TransRead (ciptr=0x20, buf=0x20 <Address 0x20 out of bounds>, size=32) As these two are pretty similar I will only focus on the "Write" (assuming that this is the exact same problem for both). If we extract only the interesting functions calls, we have: (1) _XFlushInt (dpy=0x83b0750, cv=0x0) at XlibInt.c:589 (2) _X11TransWrite (ciptr=0x83b0cf0, buf=0x83b0cf0 "@[EMAIL PROTECTED]", size=138087664) at Xtrans.c:843 (3) _X11TransSocketWrite (ciptr=0x83b0db0, buf=0x83b0db0 "<\001\002", size=138087856) at Xtranssock.c:1750 (4) _X11TransWrite (ciptr=0x30c, buf=0x30c <Address 0x30c out of bounds>, size=780) My guess now is that the parameter "size" is causing this out of bounds address as it is a quite unusual size for a message to put in a socket (size=138087856=131Mb). It is only a supposition, but the pointer to the buffer has no other choice than being out of bounds as the size of the message is that big. Therefore, we have to see where is calculated the size of the message. Actually, everything seems to take place in the function _XFlushInt: ================================================= _XFlushInt (dpy=0x83b0750, cv=0x0) at XlibInt.c:589 589 if (dpy->flags & XlibDisplayIOError) (gdb) 597 while (dpy->flags & XlibDisplayWriting) { (gdb) 605 size = todo = dpy->bufptr - dpy->buffer; (gdb) 606 if (!size) return; (gdb) 605 size = todo = dpy->bufptr - dpy->buffer; (gdb) 606 if (!size) return; (gdb) 612 for (ext = dpy->flushes; ext; ext = ext->next_flush) (gdb) 608 dpy->flags |= XlibDisplayWriting; (gdb) 610 dpy->bufptr = dpy->bufmax; (gdb) 608 dpy->flags |= XlibDisplayWriting; (gdb) 612 for (ext = dpy->flushes; ext; ext = ext->next_flush) (gdb) 610 dpy->bufptr = dpy->bufmax; (gdb) 612 for (ext = dpy->flushes; ext; ext = ext->next_flush) (gdb) 620 while (size) { (gdb) 614 bufindex = dpy->buffer; (gdb) 620 while (size) { (gdb) 621 ESET(0); (gdb) 622 write_stat = _X11TransWrite(dpy->trans_conn, (gdb) _X11TransWrite (ciptr=0x83b0cf0, buf=0x83b0cf0 "@[EMAIL PROTECTED]", size=138087664) at Xtrans.c:843 843 return ciptr->transptr->Write (ciptr, buf, size); (gdb) _X11TransSocketWrite (ciptr=0x83b0db0, buf=0x83b0db0 "<\001\002", size=138087856) at Xtranssock.c:1750 1750 return write (ciptr->fd, buf, size); (gdb) 1744 { (gdb) 1750 return write (ciptr->fd, buf, size); (gdb) 1752 } (gdb) _X11TransWrite (ciptr=0x30c, buf=0x30c <Address 0x30c out of bounds>, size=780) at Xtrans.c:844 844 } ================================================= The exact piece of code is the following (see in xc/lib/X11/XlibInt.c): ================================================= /* _XFlushInt - Internal version of _XFlush used to do multi-threaded * locking correctly. */ static void _XFlushInt (dpy, cv) register Display *dpy; register xcondition_t cv; { #endif /* XTHREADS*/ register long size, todo; register int write_stat; register char *bufindex; _XExtension *ext; /* This fix resets the bufptr to the front of the buffer so * additional appends to the bufptr will not corrupt memory. Since * the server is down, these appends are no-op's anyway but * callers of _XFlush() are not verifying this before they * call it. */ if (dpy->flags & XlibDisplayIOError) { dpy->bufptr = dpy->buffer; dpy->last_req = (char *)&_dummy_request; return; } #ifdef XTHREADS while (dpy->flags & XlibDisplayWriting) { if (dpy->lock) { ConditionWait(dpy, dpy->lock->writers); } else { _XWaitForWritable (dpy, cv); } } #endif size = todo = dpy->bufptr - dpy->buffer; if (!size) return; #ifdef XTHREADS dpy->flags |= XlibDisplayWriting; /* make sure no one else can put in data */ dpy->bufptr = dpy->bufmax; #endif for (ext = dpy->flushes; ext; ext = ext->next_flush) (*ext->before_flush)(dpy, &ext->codes, dpy->buffer, size); bufindex = dpy->buffer; /* * While write has not written the entire buffer, keep looping * until the entire buffer is written. bufindex will be * incremented and size decremented as buffer is written out. */ while (size) { ESET(0); write_stat = _X11TransWrite(dpy->trans_conn, bufindex, (int) todo); if (write_stat >= 0) { size -= write_stat; todo = size; bufindex += write_stat; } else if (ETEST()) { _XWaitForWritable(dpy #ifdef XTHREADS , cv #endif ); #ifdef SUNSYSV } else if (ECHECK(0)) { _XWaitForWritable(dpy #ifdef XTHREADS , cv #endif ); #endif #ifdef ESZTEST } else if (ESZTEST()) { if (todo > 1) todo >>= 1; else { _XWaitForWritable(dpy #ifdef XTHREADS , cv #endif ); } #endif } else if (!ECHECK(EINTR)) { /* Write failed! */ /* errno set by write system call. */ _XIOError(dpy); } } dpy->last_req = (char *)&_dummy_request; if ((dpy->request - dpy->last_request_read) >= SEQLIMIT && !(dpy->flags & XlibDisplayPrivSync)) { dpy->savedsynchandler = dpy->synchandler; dpy->synchandler = _XSeqSyncFunction; dpy->flags |= XlibDisplayPrivSync; } dpy->bufptr = dpy->buffer; #ifdef XTHREADS dpy->flags &= ~XlibDisplayWriting; #endif } ================================================= And we are executing only the following (I cut out uninteresting parts): ---> size = todo = dpy->bufptr - dpy->buffer; if (!size) return; #ifdef XTHREADS dpy->flags |= XlibDisplayWriting; /* make sure no one else can put in data */ dpy->bufptr = dpy->bufmax; #endif for (ext = dpy->flushes; ext; ext = ext->next_flush) (*ext->before_flush)(dpy, &ext->codes, dpy->buffer, size); bufindex = dpy->buffer; /* * While write has not written the entire buffer, keep looping * until the entire buffer is written. bufindex will be * incremented and size decremented as buffer is written out. */ while (size) { ESET(0); ---> write_stat = _X11TransWrite(dpy->trans_conn, bufindex, (int) todo); So, if I understand well here, "size" and "todo" are declared as "register long" (note the _register_ here because it might explain the persistance of the bug). And they have the exact same value which is a _difference_ of pointers ("dpy->bufptr - dpy->buffer"). I didn't manage to find where was the definition of the "Display" type was, I found only the one of "_XDisplay" in xc/lib/X11/Xlibint.h. And in this one, "bufptr" and "buffer" are both "char*" (sound ok). Then at the very end, when calling "_X11TransWrite", the "todo" (aka "size") is casted to an "int" (why not having them as "int" from the beginning ??? And why having "size" and "todo" as separate variables ???). I don't know exactly what's wrong here, but my guess is that on the Crusoe architecture, these manipulations do not behave as we usually intend to (my second guess is that the "register" key word makes the bug persistant). If someone has some ideas/theories about this, I would be pleased to have some help here. :) Regards -- Emmanuel Fleury Computer Science Department, | Office: B1-201 Aalborg University, | Phone: +45 96 35 72 23 Fredriks Bajersvej 7E, | Fax: +45 98 15 98 89 9220 Aalborg East, Denmark | Email: [EMAIL PROTECTED]