Re: Stack size on 64-bit Cygwin
On Aug 19 14:36, Corinna Vinschen wrote:
> On Aug 19 07:43, Ryan Johnson wrote:
> > On 19/08/2013 7:39 AM, Corinna Vinschen wrote:
> > >On Aug 19 07:04, Ryan Johnson wrote:
> > >>So maybe emacs just had the incredibly bad luck to alloca() a large
> > >>buffer right at end-of-stack and then somehow managed to skip over
> > >>the 4 guard pages when accessing it?
> > >That's unlikely since alloca is designed to probe the stack in 4K
> > >steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
> > >exception handler.
> > ... and yet somehow emacs managed to get around that protection
> > (unintentionally), leading to all that fun over the last week. What
> > went wrong?
>
> Good question.  I don't know.

And then again, Emacs is not exactly an STC for *any* problem...


Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
Re: Stack size on 64-bit Cygwin
On Aug 19 07:43, Ryan Johnson wrote:
> On 19/08/2013 7:39 AM, Corinna Vinschen wrote:
> >On Aug 19 07:04, Ryan Johnson wrote:
> >>On 19/08/2013 6:49 AM, Ryan Johnson wrote:
> >>>One thing I don't understand, though: shouldn't a stack overflow
> >>>normally manifest as a seg fault when trying to access the invalid
> >>>addresses, rather than silent memory corruption?
> >That would be helpful.
> >
> >>>However, /proc/pid/maps for emacs shows:
> >>>>00010000-00020000 rw-s 00000000 0000:0000 0  [win heap 1 default shared]
> >>>>00020000-00030000 rw-s 00000000 0000:0000 0  [win heap 2 default shared]
> >>>>00030000-001E4000 ===p 00000000 0000:0000 0  [stack (tid 4896)]
> >>>>001E4000-001E6000 rw-g 001B4000 0000:0000 0  [stack (tid 4896)]
> >>>>001E6000-00230000 rw-p 001B6000 0000:0000 0  [stack (tid 4896)]
> >>>GDB reports that thread 4896 is the main thread... so I guess
> >>>Windows doesn't reserve a red zone around its stack, but instead
> >>>chooses to place the main thread stack right next to the
> >>>fully-mapped global shared heap to maximize the potential for Fun?
> >Right.  I have no idea what the two shared mem regions preceding the
> >stack are good for, though.
> >
> >>Some googling turns up
> >>http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706
> >>>Windows only uses reserved but only partially committed memory for
> >>>its stacks. In order to detect when to commit more stack, it
> >>>installs a one-shot guard page (btw the same type of guard page that
> >>>is used for the hotspot yellow and red zone) right at the edge of
> >>>the currently committed stack zone. When a thread accesses this
> >>>guard page an exception is thrown which Windows catches internally,
> >>>commits more stack and re-establishes the one-shot guard page at the
> >>>new edge of the committed zone. When Windows detects such an
> >>>exception inside the _last 4 pages_ of a stack (I couldn't find any
> >>>documentation for that on MSDN, I found this value from manually
> >>>testing on several Windows machines with 4k stack pages) it throws a
> >>>STACK_OVERFLOW_EXCEPTION.
> >>So maybe emacs just had the incredibly bad luck to alloca() a large
> >>buffer right at end-of-stack and then somehow managed to skip over
> >>the 4 guard pages when accessing it?
> >That's unlikely since alloca is designed to probe the stack in 4K
> >steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
> >exception handler.
> ... and yet somehow emacs managed to get around that protection
> (unintentionally), leading to all that fun over the last week. What
> went wrong?

Good question.  I don't know.


Corinna
Re: Stack size on 64-bit Cygwin
On 19/08/2013 7:39 AM, Corinna Vinschen wrote:
> On Aug 19 07:04, Ryan Johnson wrote:
>> On 19/08/2013 6:49 AM, Ryan Johnson wrote:
>>> One thing I don't understand, though: shouldn't a stack overflow
>>> normally manifest as a seg fault when trying to access the invalid
>>> addresses, rather than silent memory corruption?
> That would be helpful.
>
>>> However, /proc/pid/maps for emacs shows:
>>> 00010000-00020000 rw-s 00000000 0000:0000 0  [win heap 1 default shared]
>>> 00020000-00030000 rw-s 00000000 0000:0000 0  [win heap 2 default shared]
>>> 00030000-001E4000 ===p 00000000 0000:0000 0  [stack (tid 4896)]
>>> 001E4000-001E6000 rw-g 001B4000 0000:0000 0  [stack (tid 4896)]
>>> 001E6000-00230000 rw-p 001B6000 0000:0000 0  [stack (tid 4896)]
>>> GDB reports that thread 4896 is the main thread... so I guess
>>> Windows doesn't reserve a red zone around its stack, but instead
>>> chooses to place the main thread stack right next to the
>>> fully-mapped global shared heap to maximize the potential for Fun?
> Right.  I have no idea what the two shared mem regions preceding the
> stack are good for, though.
>
>> Some googling turns up
>> http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706
>>> Windows only uses reserved but only partially committed memory for
>>> its stacks. In order to detect when to commit more stack, it
>>> installs a one-shot guard page (btw the same type of guard page that
>>> is used for the hotspot yellow and red zone) right at the edge of
>>> the currently committed stack zone. When a thread accesses this
>>> guard page an exception is thrown which Windows catches internally,
>>> commits more stack and re-establishes the one-shot guard page at the
>>> new edge of the committed zone. When Windows detects such an
>>> exception inside the _last 4 pages_ of a stack (I couldn't find any
>>> documentation for that on MSDN, I found this value from manually
>>> testing on several Windows machines with 4k stack pages) it throws a
>>> STACK_OVERFLOW_EXCEPTION.
>> So maybe emacs just had the incredibly bad luck to alloca() a large
>> buffer right at end-of-stack and then somehow managed to skip over
>> the 4 guard pages when accessing it?
> That's unlikely since alloca is designed to probe the stack in 4K
> steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
> exception handler.

... and yet somehow emacs managed to get around that protection
(unintentionally), leading to all that fun over the last week. What
went wrong?

Ryan

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Re: Stack size on 64-bit Cygwin
On Aug 19 07:04, Ryan Johnson wrote:
> On 19/08/2013 6:49 AM, Ryan Johnson wrote:
> >One thing I don't understand, though: shouldn't a stack overflow
> >normally manifest as a seg fault when trying to access the invalid
> >addresses, rather than silent memory corruption?

That would be helpful.

> >However, /proc/pid/maps for emacs shows:
> >>00010000-00020000 rw-s 00000000 0000:0000 0  [win heap 1 default shared]
> >>00020000-00030000 rw-s 00000000 0000:0000 0  [win heap 2 default shared]
> >>00030000-001E4000 ===p 00000000 0000:0000 0  [stack (tid 4896)]
> >>001E4000-001E6000 rw-g 001B4000 0000:0000 0  [stack (tid 4896)]
> >>001E6000-00230000 rw-p 001B6000 0000:0000 0  [stack (tid 4896)]
> >GDB reports that thread 4896 is the main thread... so I guess
> >Windows doesn't reserve a red zone around its stack, but instead
> >chooses to place the main thread stack right next to the
> >fully-mapped global shared heap to maximize the potential for Fun?

Right.  I have no idea what the two shared mem regions preceding the
stack are good for, though.

> Some googling turns up
> http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706
> >Windows only uses reserved but only partially committed memory for
> >its stacks. In order to detect when to commit more stack, it
> >installs a one-shot guard page (btw the same type of guard page that
> >is used for the hotspot yellow and red zone) right at the edge of
> >the currently committed stack zone. When a thread accesses this
> >guard page an exception is thrown which Windows catches internally,
> >commits more stack and re-establishes the one-shot guard page at the
> >new edge of the committed zone. When Windows detects such an
> >exception inside the _last 4 pages_ of a stack (I couldn't find any
> >documentation for that on MSDN, I found this value from manually
> >testing on several Windows machines with 4k stack pages) it throws a
> >STACK_OVERFLOW_EXCEPTION.
> So maybe emacs just had the incredibly bad luck to alloca() a large
> buffer right at end-of-stack and then somehow managed to skip over
> the 4 guard pages when accessing it?

That's unlikely since alloca is designed to probe the stack in 4K
steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
exception handler.


Corinna
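
The point about probing in 4K steps can be illustrated with a small Python model. This is only a sketch of the idea, not the code the compiler actually emits: the function names and the starting stack pointer are made up for the example, and a 4096-byte page is assumed as in the thread.

```python
PAGE = 4096  # page size assumed in the thread ("4K steps")


def probe_addresses(sp, size, page=PAGE):
    """Addresses touched when the stack grows down from sp by size bytes
    and one address per page is touched on the way (a model of probing)."""
    new_sp = sp - size
    probes = []
    addr = sp - page
    while addr > new_sp:
        probes.append(addr)
        addr -= page
    probes.append(new_sp)  # finally touch the new top of stack itself
    return probes


def pages_touched(addresses, page=PAGE):
    """Page numbers hit by the given addresses, highest first."""
    return sorted({a // page for a in addresses}, reverse=True)


sp = 0x1E6800  # hypothetical stack pointer, chosen only for illustration
probed = pages_touched(probe_addresses(sp, 20000))
unprobed = pages_touched([sp - 20000])  # one access at the far end only

print("probed:  ", probed)
print("unprobed:", unprobed)
```

In the probed case consecutive page numbers differ by exactly one, so an access sequence like this cannot step over a guard page. The unprobed case lands several pages away in a single access, which is the kind of jump Ryan is speculating about.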
Re: Stack size on 64-bit Cygwin
On 19/08/2013 6:49 AM, Ryan Johnson wrote:
> On 19/08/2013 5:32 AM, Corinna Vinschen wrote:
>> On Aug 16 16:49, Ken Brown wrote:
>>> The problem that has been discussed at length in the thread "64-bit
>>> emacs crashes a lot" appears to have been solved on the emacs-devel
>>> list.  (I say "appears to" because I'm waiting for Ryan to confirm
>>> this.)  The problem went away for me when I built emacs with
>>> 'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
>>> emacs needs an unusually big stack or if the default stack size on
>>> 64-bit Cygwin should be increased for all applications.
>>>
>>> I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
>>> Cygwin.  Shouldn't 64-bit applications need a larger stack than
>>> 32-bit applications in general?
>> From my POV, if you have a stack-active application, just add the
>> aforementioned --stack linker option, or call peflags -x after the
>> build.  The latter can be done any time
> FYI, I just tried upping the stack size on /usr/bin/emacs-nox, but it
> still crashes. Most likely because the damage was already done during
> bootstrap, when it has much larger memory requirements than normal.
> Still no crashes so far in the version I linked with --stack, though.
>
> One thing I don't understand, though: shouldn't a stack overflow
> normally manifest as a seg fault when trying to access the invalid
> addresses, rather than silent memory corruption?
>
> However, /proc/pid/maps for emacs shows:
> 00010000-00020000 rw-s 00000000 0000:0000 0  [win heap 1 default shared]
> 00020000-00030000 rw-s 00000000 0000:0000 0  [win heap 2 default shared]
> 00030000-001E4000 ===p 00000000 0000:0000 0  [stack (tid 4896)]
> 001E4000-001E6000 rw-g 001B4000 0000:0000 0  [stack (tid 4896)]
> 001E6000-00230000 rw-p 001B6000 0000:0000 0  [stack (tid 4896)]
>
> GDB reports that thread 4896 is the main thread... so I guess Windows
> doesn't reserve a red zone around its stack, but instead chooses to
> place the main thread stack right next to the fully-mapped global
> shared heap to maximize the potential for Fun?

Some googling turns up
http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706

> Windows only uses reserved but only partially committed memory for
> its stacks. In order to detect when to commit more stack, it installs
> a one-shot guard page (btw the same type of guard page that is used
> for the hotspot yellow and red zone) right at the edge of the
> currently committed stack zone. When a thread accesses this guard
> page an exception is thrown which Windows catches internally, commits
> more stack and re-establishes the one-shot guard page at the new edge
> of the committed zone. When Windows detects such an exception inside
> the _last 4 pages_ of a stack (I couldn't find any documentation for
> that on MSDN, I found this value from manually testing on several
> Windows machines with 4k stack pages) it throws a
> STACK_OVERFLOW_EXCEPTION.

So maybe emacs just had the incredibly bad luck to alloca() a large
buffer right at end-of-stack and then somehow managed to skip over the
4 guard pages when accessing it?

Very strange...
Ryan
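
The guard-page mechanism described in the quoted HotSpot post can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the class name, the single guard page (the listing above shows an 8 KiB guard region), and the "last 4 pages" rule, which the quoted author found by experiment rather than in documentation. It is not how Windows is implemented.

```python
PAGE = 4096


class ToyStack:
    """Toy model of a downward-growing stack with a one-shot guard page."""

    def __init__(self, base, top, committed_pages=1, reserve_tail=4):
        # [base, top) is the reserved range; the stack grows from top to base.
        self.base, self.top = base, top
        self.limit = top - committed_pages * PAGE  # lowest committed address
        self.reserve_tail = reserve_tail           # pages never handed out

    def touch(self, addr):
        if not (self.base <= addr < self.top):
            return "outside stack"     # hits whatever else is mapped there
        if addr >= self.limit:
            return "ok"                # already committed
        guard_lo = self.limit - PAGE   # guard page sits just below the limit
        if addr < guard_lo:
            return "access violation"  # reserved but not committed
        if guard_lo < self.base + self.reserve_tail * PAGE:
            return "stack overflow"    # guard hit inside the last pages
        self.limit = guard_lo          # commit the page, move the guard down
        return "grown"


# Range and committed size taken from the /proc/pid/maps listing above.
s = ToyStack(0x30000, 0x230000, committed_pages=74)
print(s.touch(0x1E5FF8))  # just below the committed area: hits the guard
print(s.touch(0x1E0000))  # far below the guard, still inside the reservation
print(s.touch(0x2FFF0))   # below the reservation, i.e. inside "win heap 2"
```

In this model, skipping the guard but staying inside the reservation still faults. Only an access that lands below the whole 2 MiB range reaches the neighbouring shared heap without any exception, which would match the silent corruption Ryan describes.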
Re: Stack size on 64-bit Cygwin
On 19/08/2013 5:32 AM, Corinna Vinschen wrote:
> On Aug 16 16:49, Ken Brown wrote:
>> The problem that has been discussed at length in the thread "64-bit
>> emacs crashes a lot" appears to have been solved on the emacs-devel
>> list.  (I say "appears to" because I'm waiting for Ryan to confirm
>> this.)  The problem went away for me when I built emacs with
>> 'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
>> emacs needs an unusually big stack or if the default stack size on
>> 64-bit Cygwin should be increased for all applications.
>>
>> I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
>> Cygwin.  Shouldn't 64-bit applications need a larger stack than
>> 32-bit applications in general?
> From my POV, if you have a stack-active application, just add the
> aforementioned --stack linker option, or call peflags -x after the
> build.  The latter can be done any time

FYI, I just tried upping the stack size on /usr/bin/emacs-nox, but it
still crashes. Most likely because the damage was already done during
bootstrap, when it has much larger memory requirements than normal.
Still no crashes so far in the version I linked with --stack, though.

One thing I don't understand, though: shouldn't a stack overflow
normally manifest as a seg fault when trying to access the invalid
addresses, rather than silent memory corruption?

However, /proc/pid/maps for emacs shows:

00010000-00020000 rw-s 00000000 0000:0000 0  [win heap 1 default shared]
00020000-00030000 rw-s 00000000 0000:0000 0  [win heap 2 default shared]
00030000-001E4000 ===p 00000000 0000:0000 0  [stack (tid 4896)]
001E4000-001E6000 rw-g 001B4000 0000:0000 0  [stack (tid 4896)]
001E6000-00230000 rw-p 001B6000 0000:0000 0  [stack (tid 4896)]

GDB reports that thread 4896 is the main thread... so I guess Windows
doesn't reserve a red zone around its stack, but instead chooses to
place the main thread stack right next to the fully-mapped global
shared heap to maximize the potential for Fun?

Ryan
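
The three stack entries in that listing can be checked with a little arithmetic. The Python below is a sketch that reads the region boundaries as the 8-digit addresses 0x00030000, 0x001E4000, 0x001E6000 and 0x00230000 (an assumption wherever the listing shows fewer digits) and assumes 4096-byte pages; the labels are descriptive names chosen for the example.

```python
PAGE = 4096

# (start, end, label) for the three [stack (tid 4896)] entries
regions = [
    (0x00030000, 0x001E4000, "reserved"),   # ===p
    (0x001E4000, 0x001E6000, "guard"),      # rw-g
    (0x001E6000, 0x00230000, "committed"),  # rw-p
]

sizes = {label: end - start for start, end, label in regions}
total = sum(sizes.values())

for label, size in sizes.items():
    print(f"{label:9s} {size:8d} bytes  {size // PAGE:4d} pages")
print(f"{'total':9s} {total:8d} bytes  {total // PAGE:4d} pages")
```

Under those assumptions the three pieces add up to exactly 2097152 bytes, the 2 MiB default discussed elsewhere in this thread, and the lowest stack address is the same 0x00030000 at which "win heap 2" ends.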
Re: Stack size on 64-bit Cygwin
On 8/19/2013 5:32 AM, Corinna Vinschen wrote:
> On Aug 16 16:49, Ken Brown wrote:
>> The problem that has been discussed at length in the thread "64-bit
>> emacs crashes a lot" appears to have been solved on the emacs-devel
>> list.  (I say "appears to" because I'm waiting for Ryan to confirm
>> this.)  The problem went away for me when I built emacs with
>> 'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
>> emacs needs an unusually big stack or if the default stack size on
>> 64-bit Cygwin should be increased for all applications.
>>
>> I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
>> Cygwin.  Shouldn't 64-bit applications need a larger stack than
>> 32-bit applications in general?
>
> Well, in fact 2 Megs is a pretty big stack to begin with.  If you check
> the Windows executables in C:\Windows\system32, you'll notice that a
> predominant number of them have their stacksize set to <= 1 Meg.
>
> Also, if you don't set the default stack size explicitly when building
> applications with VC++, the default stacksize will be set to 1 Meg on
> both platforms, x86 and x64.
>
> So, by setting the default stacksize to 2 Megs, gcc is already leaning
> towards the safe side and it's *much* more than most applications really
> need.
>
> From my POV, if you have a stack-active application, just add the
> aforementioned --stack linker option, or call peflags -x after the
> build.  The latter can be done any time, for instance:
>
>   tcsh$ peflags -x /bin/bash
>   /bin/bash: stack reserve size      : 2097152 (0x200000) bytes
>   tcsh$ bash -c 'ulimit -s'
>   2025
>   tcsh$ peflags -x0x400000 /bin/bash
>   /bin/bash: stack reserve size      : 4194304 (0x400000) bytes
>   tcsh$ bash -c 'ulimit -s'
>   4073

OK, thanks.  I'll just use the --stack option the next time I rebuild
emacs.  But it's good to know that users can change this themselves
with peflags.

Ken
Re: Stack size on 64-bit Cygwin
On Aug 16 16:49, Ken Brown wrote:
> The problem that has been discussed at length in the thread "64-bit
> emacs crashes a lot" appears to have been solved on the emacs-devel
> list.  (I say "appears to" because I'm waiting for Ryan to confirm
> this.)  The problem went away for me when I built emacs with
> 'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
> emacs needs an unusually big stack or if the default stack size on
> 64-bit Cygwin should be increased for all applications.
>
> I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
> Cygwin.  Shouldn't 64-bit applications need a larger stack than
> 32-bit applications in general?

Well, in fact 2 Megs is a pretty big stack to begin with.  If you check
the Windows executables in C:\Windows\system32, you'll notice that a
predominant number of them have their stacksize set to <= 1 Meg.

Also, if you don't set the default stack size explicitly when building
applications with VC++, the default stacksize will be set to 1 Meg on
both platforms, x86 and x64.

So, by setting the default stacksize to 2 Megs, gcc is already leaning
towards the safe side and it's *much* more than most applications really
need.

From my POV, if you have a stack-active application, just add the
aforementioned --stack linker option, or call peflags -x after the
build.  The latter can be done any time, for instance:

  tcsh$ peflags -x /bin/bash
  /bin/bash: stack reserve size      : 2097152 (0x200000) bytes
  tcsh$ bash -c 'ulimit -s'
  2025
  tcsh$ peflags -x0x400000 /bin/bash
  /bin/bash: stack reserve size      : 4194304 (0x400000) bytes
  tcsh$ bash -c 'ulimit -s'
  4073


Corinna
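
The numbers in that transcript can be related with a few lines of Python. This is just a worked calculation, assuming that `ulimit -s` reports in units of 1024 bytes (as bash documents); the helper name is made up for the example, and the thread itself does not say where the remaining difference comes from.

```python
def describe(reserve_bytes, ulimit_kib):
    """Return the reserve size in hex, in KiB, and its gap to ulimit -s."""
    reserve_kib = reserve_bytes // 1024
    return hex(reserve_bytes), reserve_kib, reserve_kib - ulimit_kib


# (stack reserve size from peflags, value printed by `ulimit -s`)
for reserve, reported in [(2097152, 2025), (4194304, 4073)]:
    print(describe(reserve, reported))
```

Both cases give the same gap of 23 KiB between the PE stack reserve and what `ulimit -s` shows, so the reported limit tracks the reserve size set with `--stack` or `peflags -x` one to one.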
Re: Stack size on 64-bit Cygwin
On 16/08/2013 4:49 PM, Ken Brown wrote:
> The problem that has been discussed at length in the thread "64-bit
> emacs crashes a lot" appears to have been solved on the emacs-devel
> list.  (I say "appears to" because I'm waiting for Ryan to confirm
> this.)

WJFFM so far (fingers crossed!)

> The problem went away for me when I built emacs with
> 'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
> emacs needs an unusually big stack or if the default stack size on
> 64-bit Cygwin should be increased for all applications.

I could easily imagine running into trouble by doubling pointer sizes,
if GC calls routinely reach 10k+ stack frames deep like somebody
mentioned a couple days ago...

Ryan
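
To put rough numbers on that, here is a back-of-the-envelope calculation in Python. The 10,000-frame depth is the figure mentioned above; the 24 pointer-sized slots per frame is a purely hypothetical value picked for illustration, not a measurement of emacs.

```python
def bytes_per_frame(stack_bytes, frames):
    """Average stack budget per frame for a recursion of the given depth."""
    return stack_bytes // frames


def stack_needed(frames, slots_per_frame, pointer_size):
    """Stack used if every frame holds slots_per_frame pointer-sized slots."""
    return frames * slots_per_frame * pointer_size


FRAMES = 10_000  # depth mentioned in the thread
SLOTS = 24       # hypothetical frame layout, for illustration only

for stack in (2 * 1024 * 1024, 4 * 1024 * 1024):
    print(stack, "byte stack:", bytes_per_frame(stack, FRAMES), "bytes/frame")

for ptr in (4, 8):
    print(ptr, "byte pointers:", stack_needed(FRAMES, SLOTS, ptr), "bytes")
```

With a 2 MiB stack, a 10,000-deep recursion has only about 209 bytes per frame to spend, and about 419 bytes with 4 MiB. Under the hypothetical 24-slot frame, going from 4-byte to 8-byte pointers doubles the stack used from 960000 to 1920000 bytes, which leaves little headroom below 2 MiB.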