Bug#280384: X11 crashing on 2.4.28
I have been neglectful in mentioning that this fix for 280384 also fixes my original problem. Thanks to everybody for fixing it and for sending me the files so I could post them to my web server.

--- "Jurzitza, Dieter" <[EMAIL PROTECTED]> wrote:
> Dear listmembers,
> sorry for the delay. I'd really like to spend more time with the
> sparc issues if I could ;-).
> Replacing the XFree86 binary that comes with the regular distribution
> with the freshly built one that has been supplied by you solves the
> problem with X on my U60 / SMP / Creator 3d.
>
> So, it seems to me that my problem has been fixed by Richard (many
> thanks again) and I am looking forward for an official build to
> appear on the net prior to really using 2.4.28, because I do not want
> to get "off sync" with the regular distribution.
Bug#280384: X11 crashing on 2.4.28
Dear listmembers,

sorry for the delay. I'd really like to spend more time with the sparc issues if I could ;-).

Replacing the XFree86 binary that comes with the regular distribution with the freshly built one that you supplied solves the problem with X on my U60 / SMP / Creator 3d.

So, it seems to me that my problem has been fixed by Richard (many thanks again), and I am looking forward to an official build appearing on the net before I really use 2.4.28, because I do not want to get "off sync" with the regular distribution.

Thanks again to everybody for helping out,
take care

Dieter Jurzitza

-----Original Message-----
From: Ron Murray [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 08, 2004 7:46 PM
To: debian-sparc@lists.debian.org; [EMAIL PROTECTED]
Subject: Re: X11 crashing on 2.4.28

At Wed, 8 Dec 2004 13:00:46 -0500, Branden Robinson <[EMAIL PROTECTED]> wrote:

***
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the contents in this e-mail is strictly forbidden.
***
Bug#280384: X11 crashing on 2.4.28
At Wed, 8 Dec 2004 13:00:46 -0500, Branden Robinson <[EMAIL PROTECTED]> wrote:
> >    Only difference was that I didn't turn it on for all Linux, just
> > for ia86 and sparc. Wasn't sure whether it was a good idea or not.
> >
> >    I'll let everyone know how it went. Thanks for finding it.
>
> I haven't seen followup from you on how it went, but I went ahead and
> applied your patch anyway.

I replied to the list with the results I had, which were that I _thought_ it was fixed, but X still wouldn't start for me. I haven't seen any replies from anyone else. I now have X working using the framebuffer driver, so it seems that these patches do indeed fix the problem (I couldn't even do that before).

> Sorry about the red herring I threw out regarding PCI domain issues -- I
> didn't mean to lead anyone astray I was just stabbing in the dark.

Understood. It actually led me to realize the probable cause of the remaining X problem I have with this machine, in that fbdev works but the glint driver doesn't load. I've submitted a separate bug report on that one (#284111), since it's clearly not related to the mmap problem.

Thanks,

.Ron

-- 
Ron Murray ([EMAIL PROTECTED])
http://www.rjmx.net/~ron
GPG Public Key Fingerprint: F2C1 FC47 5EF7 0317 133C D66B 8ADA A3C4 D86C 74DE
Bug#280384: X11 crashing on 2.4.28
On Thu, Dec 02, 2004 at 07:45:21PM -0500, Ron Murray wrote:
>    Yep, I agree that you've probably found the problem. After I wrote
> my previous post, I did some poking around with gdb on the XFree86
> executable. I found a sequence of bytes that looked a lot like the
> ones you posted earlier, a little further on than you had (but my
> current copy of XFree86 has lots of debugging code inbuilt). They even
> had a call to malloc() in the middle of them. gdb claimed that the
> code was in the middle of ELFLoadModule(), so I looked, and there it
> was, complete with the same #ifdef you found earlier. I set up the
> patch, started the build, and went home. With any luck, I'll have a
> new (and hopefully functional) set of X packages when I get to work in
> the morning.
>
>    Only difference was that I didn't turn it on for all Linux, just
> for ia86 and sparc. Wasn't sure whether it was a good idea or not.
>
>    I'll let everyone know how it went. Thanks for finding it.

I haven't seen followup from you on how it went, but I went ahead and applied your patch anyway.

Thanks very much to you and the other people who contributed to this bug report.

Sorry about the red herring I threw out regarding PCI domain issues -- I didn't mean to lead anyone astray; I was just stabbing in the dark.

Expect this to be fixed in 4.3.0.dfsg.1-9, assuming your tests succeeded.

r2043 | branden | 2004-12-02 23:24:44 -0500 (Thu, 02 Dec 2004) | 6 lines
Changed paths:
   M /trunk/debian/CHANGESETS
   M /trunk/debian/TODO
   M /trunk/debian/changelog
   M /trunk/debian/patches/071_nonexecutable_malloced_mem.diff
   M /trunk/debian/patches/600_amd64_support.diff

Apply patch from Richard Mortimer to fix the XFree86 X server's ELF object
loader to set the PROT_EXEC flag on mmap()ed modules regardless of machine
architecture. (It was already trying to do this, but there are three
preprocessor statements involved, and we were only patching one.) (Closes:
#280384)

-- 
G. Branden Robinson                |    I've made up my mind.  Don't try to
Debian GNU/Linux                   |    confuse me with the facts.
[EMAIL PROTECTED]                  |    -- Indiana Senator Earl Landgrebe
http://people.debian.org/~branden/ |
Bug#280384: X11 crashing on 2.4.28
OK, the machine built all the X packages and I installed them with no problems. It does look like Richard's patch works, in that, according to the XFree86 log, the loader now correctly loads pcidata, which goes on to scan the PCI bus as it's supposed to. I think it should work with later 2.4 kernels now, although I haven't tested it. I'm willing to provide the packages I built if somebody wants to try them, but I can't put them up for ftp (we don't allow ftp servers here).

Working with 2.6 kernels is another problem, at least for my E250. Now startx grinds to a halt with the dreaded "no screens found", and indeed the log doesn't show it finding my display adaptor in the PCI scan. I suspect this is because 2.6 adds domains to the PCI system, and for totally unexplained reasons, my display adaptor is on domain 0001 instead of , and it doesn't look like that gets scanned. But that's for another bug report.

Thank you, Richard. I think it's fixed; we can be more certain once somebody tests it.

.Ron
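To make the domain suspicion above concrete: Linux 2.6 names PCI devices in sysfs as domain:bus:slot.function, all in hexadecimal. The following is only a minimal sketch of how such an address splits into fields. The struct, both function names and the sample address "0001:00:02.0" are hypothetical and are not taken from XFree86 or from the E250 in question; whether the X server's scan really stops at domain 0 is the unconfirmed guess from the message above.

```c
#include <stdio.h>

/* Hypothetical type, not part of XFree86: the four fields of a
 * Linux 2.6 style PCI address "DDDD:BB:SS.F" (all hexadecimal). */
struct pci_addr {
    unsigned domain;
    unsigned bus;
    unsigned slot;
    unsigned func;
};

/* Parse an address such as "0001:00:02.0".
 * Returns 0 on success, -1 if the text does not have all four fields. */
int parse_pci_addr(const char *text, struct pci_addr *out)
{
    /* sscanf returns the number of fields it converted. */
    int n = sscanf(text, "%x:%x:%x.%x",
                   &out->domain, &out->bus, &out->slot, &out->func);
    return n == 4 ? 0 : -1;
}

/* A scanner that only walks domain 0 would skip every device for
 * which this returns 0. */
int in_default_domain(const struct pci_addr *addr)
{
    return addr->domain == 0;
}
```

With the sample address, parse_pci_addr() fills in domain 1, so in_default_domain() reports that a domain-0-only scan would never see the device.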
Bug#280384: X11 crashing on 2.4.28
At Thu, 02 Dec 2004 22:41:46 +, Richard Mortimer <[EMAIL PROTECTED]> wrote:
>
> On Thu, 2004-12-02 at 19:27, Ron Murray wrote:
> > At Thu, 02 Dec 2004 13:45:59 -0500,
> > Ron Murray wrote:
> >    We have a minor problem. Richard's patch seems to refer to a
> > pristine xfree86-4.3.0 source.
>
> Damn! There are two similar #if defined lines. I made the patch against
> the wrong one!
>
> I also accept that I did make the patch against pristine sources -
> although in this case it means that you spotted my mistake.
>
> I still stand by my analysis. Hopefully the new patch (below) will work.
> Note I've taken the same approach as the one that my original patch
> clashed with. Basically I've removed the check for ia64 because I'm
> assuming that the non-executable issue could in future apply to all
> linux versions.
>
> Richard

   Yep, I agree that you've probably found the problem. After I wrote my previous post, I did some poking around with gdb on the XFree86 executable. I found a sequence of bytes that looked a lot like the ones you posted earlier, a little further on than you had (but my current copy of XFree86 has lots of debugging code inbuilt). They even had a call to malloc() in the middle of them. gdb claimed that the code was in the middle of ELFLoadModule(), so I looked, and there it was, complete with the same #ifdef you found earlier. I set up the patch, started the build, and went home. With any luck, I'll have a new (and hopefully functional) set of X packages when I get to work in the morning.

   Only difference was that I didn't turn it on for all Linux, just for ia86 and sparc. Wasn't sure whether it was a good idea or not.

   I'll let everyone know how it went. Thanks for finding it.

.Ron
Bug#280384: X11 crashing on 2.4.28
On Thu, 2004-12-02 at 19:27, Ron Murray wrote:
> At Thu, 02 Dec 2004 13:45:59 -0500,
> Ron Murray wrote:
>    We have a minor problem. Richard's patch seems to refer to a
> pristine xfree86-4.3.0 source.

Damn! There are two similar #if defined lines. I made the patch against the wrong one!

I also accept that I did make the patch against pristine sources - although in this case it means that you spotted my mistake.

I still stand by my analysis. Hopefully the new patch (below) will work. Note I've taken the same approach as the one that my original patch clashed with. Basically I've removed the check for ia64 because I'm assuming that the non-executable issue could in future apply to all linux versions.

Richard

--- xc/programs/Xserver/hw/xfree86/loader/elfloader.c.orig 2004-12-02 22:29:26.0 +
+++ xc/programs/Xserver/hw/xfree86/loader/elfloader.c 2004-12-02 22:38:37.0 +
@@ -2937,7 +2937,7 @@
         ErrorF( "Unable to allocate ELF sections\n" );
         return NULL;
     }
-# if defined(linux) && defined(__ia64__) || defined(__OpenBSD__)
+# if defined(linux) || defined(__OpenBSD__)
     {
         unsigned long page_size = getpagesize();
         unsigned long round;

-- 
[EMAIL PROTECTED]
Bug#280384: X11 crashing on 2.4.28
At Thu, 02 Dec 2004 13:45:59 -0500, Ron Murray wrote:
> > Anyone fancy compiling a new xserver binary?
> >
>
>    I'll set one going before I leave work this afternoon. Should have
> completed by tomorrow morning.
>

   We have a minor problem. Richard's patch seems to refer to a pristine xfree86-4.3.0 source. When I came to check the patch location on a build tree that had had the Debian patches applied, I found it to be quite different. Specifically, the line Richard wanted to change was now

# if defined(linux) || defined(__OpenBSD__)

instead of

# if defined(linux) && defined(__ia64__) || defined(__OpenBSD__)

   Clearly, there's a Debian patch involved here. I found it at debian/patches/071_nonexecutable_malloced_mem.diff and it goes:

> $Id: 071_nonexecutable_malloced_mem.diff 1044 2004-02-16 17:40:33Z branden $
>
> This patch fixes the assumption that data returned by malloc() is
> executable. In upstream revision 1.43, the assumption was fixed for
> ia64 only. We understand it is Linus' position that programs that
> assume data to be executable are broken, so we enable this code for
> all Linux platforms.
>
> Original patch (before upstream applied its own version) was by David
> Mosberger.
> diff -urN xc/programs/Xserver/hw/xfree86/loader/elfloader.c xc.new/programs/Xserver/hw/xfree86/loader/elfloader.c
> --- xc/programs/Xserver/hw/xfree86/loader/elfloader.c 2004-02-07 17:33:29.0 -0500
> +++ xc.new/programs/Xserver/hw/xfree86/loader/elfloader.c 2004-02-07 17:29:03.0 -0500
> @@ -957,7 +957,7 @@
>          ErrorF( "ELFCreateGOT() Unable to reallocate memory\n" );
>          return FALSE;
>      }
> -# if defined(linux) && defined(__ia64__) || defined(__OpenBSD__)
> +# if defined(linux) || defined(__OpenBSD__)
>      {
>          unsigned long page_size = getpagesize();
>          unsigned long round;

... which would indicate that Richard's suggestion is already in the current Debian package. I'd made a build log when I built the package here, and I have

> Applying patch debian/patches/071_nonexecutable_malloced_mem.diff ...
> successful.

in it, so I'm sure it's in the build.

   Richard, does this look likely? Are there any other places that could stuff up the exec bit?

.Ron
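For readers comparing the two #if lines quoted above: in the C preprocessor, as in C itself, && binds tighter than ||, so the pristine condition reads as (linux and ia64) or OpenBSD. The sketch below mirrors both conditions with plain integers so they can be evaluated for any platform. The function names are hypothetical, and each argument only stands in for one defined(...) test; nothing here is code from elfloader.c.

```c
/* Each argument stands in for one defined(...) test:
 * 1 if the macro is defined on the build platform, 0 otherwise. */

/* Pristine line:
 *   defined(linux) && defined(__ia64__) || defined(__OpenBSD__)
 * && binds tighter than ||, so the grouping is (linux && ia64) || openbsd. */
int pristine_enables_mprotect(int is_linux, int is_ia64, int is_openbsd)
{
    return (is_linux && is_ia64) || is_openbsd;
}

/* Line after 071_nonexecutable_malloced_mem.diff:
 *   defined(linux) || defined(__OpenBSD__) */
int patched_enables_mprotect(int is_linux, int is_openbsd)
{
    return is_linux || is_openbsd;
}
```

For a sparc Linux build (linux defined, __ia64__ and __OpenBSD__ not), the pristine form evaluates to 0 and the patched form to 1, which is why any copy of the condition left unpatched still skips the block on sparc.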
Bug#280384: X11 crashing on 2.4.28
At Thu, 02 Dec 2004 17:02:58 +, Richard Mortimer <[EMAIL PROTECTED]> wrote:
>
> Ok, I think that I've found the problem. The XFree86 binary does its own
> object loading and on sparc it is failing to set the PROT_EXEC bit when
> mapping executable code. This is falling over a change in the kernel
> which checks the executable bit and gives a Segmentation Fault.
>
> Full rationale, explanation and proposed patch below.

...

   Wow. Well done! That's certainly consistent with what I see.

>
> Anyone fancy compiling a new xserver binary?
>

   I'll set one going before I leave work this afternoon. Should have completed by tomorrow morning.

.Ron
Bug#280384: X11 crashing on 2.4.28
Ok, I think that I've found the problem. The XFree86 binary does its own object loading and on sparc it is failing to set the PROT_EXEC bit when mapping executable code. This is falling over a change in the kernel which checks the executable bit and gives a Segmentation Fault.

Full rationale, explanation and proposed patch below.

Richard

I was looking through the changes between 2.4.27 and 2.4.28 and there is a patch that adds a check that executed code is actually mapped as executable (one bit of it is)

diff -urN linux-2.4.27/arch/sparc64/mm/fault.c linux-2.4.28/arch/sparc64/mm/fault.c
--- linux-2.4.27/arch/sparc64/mm/fault.c 2004-08-07 16:26:04.0 -0700
+++ linux-2.4.28/arch/sparc64/mm/fault.c 2004-11-17 03:54:21.156379721 -0800
@@ -404,6 +404,16 @@
      */
 good_area:
     si_code = SEGV_ACCERR;
+
+    /* If we took a ITLB miss on a non-executable page, catch
+     * that here.
+     */
+    if ((fault_code & FAULT_CODE_ITLB) && !(vma->vm_flags & VM_EXEC)) {
+        BUG_ON(address != regs->tpc);
+        BUG_ON(regs->tstate & TSTATE_PRIV);
+        goto bad_area;
+    }
+
     if (fault_code & FAULT_CODE_WRITE) {
         if (!(vma->vm_flags & VM_WRITE))
             goto bad_area;

Now given that this reports a SIGSEGV if you hit this issue (see SEGV_ACCERR at the top of the patch) I figured that this would be something that could be triggered.

Now looking at the broken strace from 2.4.28 we see two mmaps during the loading of module pcidata. These correspond to the text (code) and data sections of the binary.

mmap(NULL, 163840, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x70272000
lseek(5, 229836, SEEK_SET) = 229836
read(5, "\0pci_vendor_003d\0pci_vendor_0e11"..., 157024) = 157024
brk(0) = 0x274000
brk(0x296000) = 0x296000
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7029a000
lseek(5, 380, SEEK_SET) = 380
read(5, "\201\303\340\10\220\20 \1\201\303\340\10\1\0\0\0\235\343"..., 1612) = 1612

Note that neither has PROT_EXEC set in the mmap. The second one is the text section that really needs it.

Now looking at the XFree86 code in xc/programs/Xserver/hw/xfree86/loader/elfloader.c

This gets memory for the data in one of two ways (chosen at compile time):

xf86loadermalloc - actually a call to the glibc2 malloc
or
mmap.

The mmap specifies PROT_EXEC but I've disassembled the XFree86 binary and it seems to use the xf86loadermalloc option.

   77514: 90 00 40 08   add %g1, %o0, %o0
   77518: 40 05 69 49   call 0x1d1a3c
   7751c: d0 24 60 48   st %o0, [ %l1 + 0x48 ]
   77520: 84 10 00 08   mov %o0, %g2
   77524: 80 a2 20 00   cmp %o0, 0
   77528: 02 80 00 77   be 0x77704

Apologies to those who don't read SPARC assembler! The call at 77518 is a call to malloc (from the symbol table)

001d1a3c DF *UND* 0234 GLIBC_2.0 malloc

I'm guessing that malloc doesn't set PROT_EXEC (people generally don't want it and it would create a security risk).

Now in the elfloader.c file there is a bit of conditional code for ia64 and OpenBSD that does an mprotect to add PROT_EXEC to the code. So it looks quite clear to me that we need to do the same for sparc, i.e. apply the following patch (untested I'm afraid)

--- xc/programs/Xserver/hw/xfree86/loader/elfloader.c.orig 2004-12-02 16:56:31.0 +
+++ xc/programs/Xserver/hw/xfree86/loader/elfloader.c 2004-12-02 16:57:42.0 +
@@ -893,7 +893,7 @@
         ErrorF( "ELFCreateGOT() Unable to reallocate memory\n" );
         return FALSE;
     }
-# if defined(linux) && defined(__ia64__) || defined(__OpenBSD__)
+# if defined(linux) && (defined(__ia64__) || defined(__sparc__)) || defined(__OpenBSD__)
     {
         unsigned long page_size = getpagesize();
         unsigned long round;

Anyone fancy compiling a new xserver binary?

On Thu, 2004-12-02 at 06:23, Jurzitza, Dieter wrote:
> Dear listmembers,
> I can confirm for my U60 that the XFree86-debug server comes up on 2.4.28. So
> I seem to be consistent with what Admar said and what Ron has been saying.
> What makes me wonder, though, is why does the binary loader work with 2.4.27
> and does not work with 2.4.28.
> And, moreover, if it is a loader issue it seems more plausible to me that I
> can observe additional side effects on 2.4.28 not being related to X11 (like
> very long reaction times on ping / ssh requests, not settling a network
> connection for quite a while)
> A probably dumb question:
> is that binary loader a simple file? would it be possible to get that loader
> from another version (like Debian Woody), or is it buried deep down in the
> kernel?
> Thank you for your inputs,
> take care
>
>
>
> Dieter Jurzitza

-- 
[EMAIL PROTECTED]
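The mprotect step discussed in the analysis above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in elfloader.c: the function name make_region_executable is hypothetical, and the sketch assumes a POSIX system where mprotect() and sysconf(_SC_PAGESIZE) are available. Because mprotect() works on whole pages and malloc() makes no promise about page alignment, the start address is rounded down and the length rounded up to page boundaries.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not the XFree86 function): make every page that
 * overlaps [addr, addr + len) readable, writable and executable.
 * Returns 0 on success, -1 on failure (errno is set by mprotect). */
int make_region_executable(void *addr, size_t len)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr;

    /* Distance from the start of the containing page to addr. */
    uintptr_t offset = start & (page_size - 1);

    /* Round the start down and the length up to whole pages. */
    uintptr_t aligned_start = start - offset;
    size_t aligned_len = (len + offset + page_size - 1) & ~(page_size - 1);

    return mprotect((void *)aligned_start, aligned_len,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

A loader would call something like this on the buffer it got from malloc() after copying the module's code into it and before jumping to it. Note that the neighbouring heap data sharing those pages becomes executable too, which is the security trade-off mentioned above.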
Bug#280384: X11 crashing on 2.4.28
Dear listmembers,

I can confirm for my U60 that the XFree86-debug server comes up on 2.4.28. So I seem to be consistent with what Admar said and what Ron has been saying.

What makes me wonder, though, is why the binary loader works with 2.4.27 and does not work with 2.4.28. And, moreover, if it is a loader issue, it seems more plausible to me that I can observe additional side effects on 2.4.28 that are not related to X11 (like very long reaction times on ping / ssh requests, and network connections not settling for quite a while).

A probably dumb question: is that binary loader a simple file? Would it be possible to get that loader from another version (like Debian Woody), or is it buried deep down in the kernel?

Thank you for your input,
take care

Dieter Jurzitza

-- 
HARMAN BECKER AUTOMOTIVE SYSTEMS
Dr.-Ing. Dieter Jurzitza
Manager Hardware Systems
System Development
Industriegebiet Ittersbach
Becker-Göring Str. 16
D-76307 Karlsbad / Germany
Phone: +49 (0)7248 71-1577
Fax: +49 (0)7248 71-1216
eMail: [EMAIL PROTECTED]
Internet: http://www.becker.de