Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Corinna Vinschen
On Aug 19 14:36, Corinna Vinschen wrote:
> On Aug 19 07:43, Ryan Johnson wrote:
> > On 19/08/2013 7:39 AM, Corinna Vinschen wrote:
> > >On Aug 19 07:04, Ryan Johnson wrote:
> > >>So maybe emacs just had the incredibly bad luck to alloca() a large
> > >>buffer right at end-of-stack and then somehow managed to skip over
> > >>the 4 guard pages when accessing it?
> > >That's unlikely since alloca is designed to probe the stack in 4K
> > >steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
> > >exception handler.
> > ... and yet somehow emacs managed to get around that protection
> > (unintentionally), leading to all that fun over the last week. What
> > went wrong?
> 
> Good question.  I don't know.

And then again, Emacs is not exactly an STC for *any* problem...


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat




Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Corinna Vinschen
On Aug 19 07:43, Ryan Johnson wrote:
> On 19/08/2013 7:39 AM, Corinna Vinschen wrote:
> >On Aug 19 07:04, Ryan Johnson wrote:
> >>On 19/08/2013 6:49 AM, Ryan Johnson wrote:
> >>>One thing I don't understand, though: shouldn't a stack overflow
> >>>normally manifest as a seg fault when trying to access the invalid
> >>>addresses, rather than silent memory corruption?
> >That would be helpful.
> >
> >>>However, /proc/pid/maps for emacs shows:
> >>>00010000-00020000 rw-s 00000000 0000:0000 0 [win heap 1 default shared]
> >>>00020000-00030000 rw-s 00000000 0000:0000 0 [win heap 2 default shared]
> >>>00030000-001E4000 ===p 00000000 0000:0000 0 [stack (tid 4896)]
> >>>001E4000-001E6000 rw-g 001B4000 0000:0000 0 [stack (tid 4896)]
> >>>001E6000-00230000 rw-p 001B6000 0000:0000 0 [stack (tid 4896)]
> >>>GDB reports that thread 4896 is the main thread... so I guess
> >>>Windows doesn't reserve a red zone around its stack, but instead
> >>>chooses to place the main thread stack right next to the
> >>>fully-mapped global shared heap to maximize the potential for Fun?
> >Right.  I have no idea what the two shared mem regions preceding the
> >stack are good for, though.
> >
> >
> >>Some googling turns up
> >>http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706
> >>>Windows only uses reserved but only partially committed memory for
> >>>its stacks. In order to detect when to commit more stack, it installs
> >>>a one-shot guard page (btw the same type of guard page that is used
> >>>for the hotspot yellow and red zone) right at the edge of the
> >>>currently committed stack zone. When a thread accesses this guard
> >>>page an exception is thrown which Windows catches internally, commits
> >>>more stack and re-establishes the one-shot guard page at the new edge
> >>>of the committed zone. When Windows detects such an exception inside
> >>>the _last 4 pages_ of a stack (I couldn't find any documentation for
> >>>that on MSDN, I found this value from manually testing on several
> >>>Windows machines with 4k stack pages) it throws a
> >>>STACK_OVERFLOW_EXCEPTION.
> >>So maybe emacs just had the incredibly bad luck to alloca() a large
> >>buffer right at end-of-stack and then somehow managed to skip over
> >>the 4 guard pages when accessing it?
> >That's unlikely since alloca is designed to probe the stack in 4K
> >steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
> >exception handler.
> ... and yet somehow emacs managed to get around that protection
> (unintentionally), leading to all that fun over the last week. What
> went wrong?

Good question.  I don't know.


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat




Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Ryan Johnson

On 19/08/2013 7:39 AM, Corinna Vinschen wrote:

On Aug 19 07:04, Ryan Johnson wrote:

On 19/08/2013 6:49 AM, Ryan Johnson wrote:

One thing I don't understand, though: shouldn't a stack overflow
normally manifest as a seg fault when trying to access the invalid
addresses, rather than silent memory corruption?

That would be helpful.


However, /proc/pid/maps for emacs shows:

00010000-00020000 rw-s 00000000 0000:0000 0 [win heap 1 default shared]
00020000-00030000 rw-s 00000000 0000:0000 0 [win heap 2 default shared]
00030000-001E4000 ===p 00000000 0000:0000 0 [stack (tid 4896)]
001E4000-001E6000 rw-g 001B4000 0000:0000 0 [stack (tid 4896)]
001E6000-00230000 rw-p 001B6000 0000:0000 0 [stack (tid 4896)]

GDB reports that thread 4896 is the main thread... so I guess
Windows doesn't reserve a red zone around its stack, but instead
chooses to place the main thread stack right next to the
fully-mapped global shared heap to maximize the potential for Fun?

Right.  I have no idea what the two shared mem regions preceding the
stack are good for, though.



Some googling turns up
http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706

Windows only uses reserved but only partially committed memory for its
stacks. In order to detect when to commit more stack, it installs a
one-shot guard page (btw the same type of guard page that is used for
the hotspot yellow and red zone) right at the edge of the currently
committed stack zone. When a thread accesses this guard page an
exception is thrown which Windows catches internally, commits more
stack and re-establishes the one-shot guard page at the new edge of the
committed zone. When Windows detects such an exception inside the
_last 4 pages_ of a stack (I couldn't find any documentation for that
on MSDN, I found this value from manually testing on several Windows
machines with 4k stack pages) it throws a STACK_OVERFLOW_EXCEPTION.

So maybe emacs just had the incredibly bad luck to alloca() a large
buffer right at end-of-stack and then somehow managed to skip over
the 4 guard pages when accessing it?

That's unlikely since alloca is designed to probe the stack in 4K
steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
exception handler.

... and yet somehow emacs managed to get around that protection
(unintentionally), leading to all that fun over the last week. What went
wrong?


Ryan


--
Problem reports:   http://cygwin.com/problems.html
FAQ:   http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple



Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Corinna Vinschen
On Aug 19 07:04, Ryan Johnson wrote:
> On 19/08/2013 6:49 AM, Ryan Johnson wrote:
> >One thing I don't understand, though: shouldn't a stack overflow
> >normally manifest as a seg fault when trying to access the invalid
> >addresses, rather than silent memory corruption?

That would be helpful.

> >However, /proc/pid/maps for emacs shows:
> >00010000-00020000 rw-s 00000000 0000:0000 0 [win heap 1 default shared]
> >00020000-00030000 rw-s 00000000 0000:0000 0 [win heap 2 default shared]
> >00030000-001E4000 ===p 00000000 0000:0000 0 [stack (tid 4896)]
> >001E4000-001E6000 rw-g 001B4000 0000:0000 0 [stack (tid 4896)]
> >001E6000-00230000 rw-p 001B6000 0000:0000 0 [stack (tid 4896)]
> >GDB reports that thread 4896 is the main thread... so I guess
> >Windows doesn't reserve a red zone around its stack, but instead
> >chooses to place the main thread stack right next to the
> >fully-mapped global shared heap to maximize the potential for Fun?

Right.  I have no idea what the two shared mem regions preceding the
stack are good for, though.


> Some googling turns up
> http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706
> >Windows only uses reserved but only partially committed memory for
> >its stacks. In order to detect when to commit more stack, it installs
> >a one-shot guard page (btw the same type of guard page that is used
> >for the hotspot yellow and red zone) right at the edge of the
> >currently committed stack zone. When a thread accesses this guard
> >page an exception is thrown which Windows catches internally, commits
> >more stack and re-establishes the one-shot guard page at the new edge
> >of the committed zone. When Windows detects such an exception inside
> >the _last 4 pages_ of a stack (I couldn't find any documentation for
> >that on MSDN, I found this value from manually testing on several
> >Windows machines with 4k stack pages) it throws a
> >STACK_OVERFLOW_EXCEPTION.
> So maybe emacs just had the incredibly bad luck to alloca() a large
> buffer right at end-of-stack and then somehow managed to skip over
> the 4 guard pages when accessing it?

That's unlikely since alloca is designed to probe the stack in 4K
steps.  And STATUS_STACK_OVERFLOW is translated to a SEGV by Cygwin's
exception handler.
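
To make the difference between probing and not probing concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the `ToyStack` class, its single guard page, and the "fault" / "outside stack" labels are invented for illustration, and it does not model the real Windows behaviour (for instance the "last 4 pages" rule quoted above, or the two-page guard region in the listing). The layout mimics the stack shown in the /proc/pid/maps output, assuming the fully padded addresses 0x30000 to 0x230000.

```python
PAGE = 4096  # 4K stack pages, as in the quoted description


class ToyStack:
    """Toy model of a downward-growing stack with a one-shot guard page."""

    def __init__(self, base, top, committed_pages):
        self.base = base  # lowest reserved address
        self.top = top    # one past the highest stack address
        # Lowest committed address; the guard page sits directly below it.
        self.commit_low = top - committed_pages * PAGE
        self.guard = self.commit_low - PAGE

    def touch(self, addr):
        """Classify one access and grow the stack if the guard page is hit."""
        if addr < self.base or addr >= self.top:
            return "outside stack"  # whatever is mapped there decides
        if addr >= self.commit_low:
            return "ok"
        if addr >= self.guard:
            # Guard hit: commit this page and re-arm the guard one page lower.
            self.commit_low = self.guard
            self.guard -= PAGE
            return "guard hit"
        return "fault"  # reserved but not committed


def probe_down(stack, start, size):
    """Touch one address per page while moving from start down by size bytes."""
    end = start - size
    addr = start
    results = []
    while addr > end:
        addr = max(addr - PAGE, end)
        results.append(stack.touch(addr))
    return results


def fresh():
    # Hypothetical layout: 2 MiB reserved at 0x30000, 74 pages committed.
    return ToyStack(base=0x30000, top=0x230000, committed_pages=74)


size = 16 * PAGE
stepped = probe_down(fresh(), 0x1E6000, size)   # probing every 4K
jumped = fresh().touch(0x1E6000 - size)         # one access, no probing
far_jump = fresh().touch(0x1E6000 - 0x1C0000)   # lands below the stack base
print(set(stepped), jumped, far_jump)
```

In this model the page-by-page probe always lands on the guard page and the stack grows normally. A single unprobed access a few pages down still faults, because the memory is reserved but not committed. Only an access that jumps past the whole reserved area ends up in a neighbouring mapping, which is the case that would corrupt memory silently instead of raising an exception.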


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat




Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Ryan Johnson

On 19/08/2013 6:49 AM, Ryan Johnson wrote:

On 19/08/2013 5:32 AM, Corinna Vinschen wrote:

On Aug 16 16:49, Ken Brown wrote:

The problem that has been discussed at length in the thread "64-bit
emacs crashes a lot" appears to have been solved on the emacs-devel
list.  (I say "appears to" because I'm waiting for Ryan to confirm
this.)  The problem went away for me when I built emacs with
'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
emacs needs an unusually big stack or if the default stack size on
64-bit Cygwin should be increased for all applications.

I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
Cygwin.  Shouldn't 64-bit applications need a larger stack than
32-bit applications in general?

From my POV, if you have a stack-active application, just add the
aforementioned --stack linker option, or call peflags -x after the
build.  The latter can be done any time

FYI, I just tried upping the stack size on /usr/bin/emacs-nox, but it
still crashes. Most likely because the damage was already done during
bootstrap, when it has much larger memory requirements than normal.


Still no crashes so far in the version I linked with --stack, though.

One thing I don't understand, though: shouldn't a stack overflow 
normally manifest as a seg fault when trying to access the invalid 
addresses, rather than silent memory corruption?


However, /proc/pid/maps for emacs shows:
00010000-00020000 rw-s 00000000 0000:0000 0 [win heap 1 default shared]
00020000-00030000 rw-s 00000000 0000:0000 0 [win heap 2 default shared]
00030000-001E4000 ===p 00000000 0000:0000 0 [stack (tid 4896)]
001E4000-001E6000 rw-g 001B4000 0000:0000 0 [stack (tid 4896)]
001E6000-00230000 rw-p 001B6000 0000:0000 0 [stack (tid 4896)]
GDB reports that thread 4896 is the main thread... so I guess Windows 
doesn't reserve a red zone around its stack, but instead chooses to 
place the main thread stack right next to the fully-mapped global 
shared heap to maximize the potential for Fun?


Some googling turns up
http://comments.gmane.org/gmane.comp.java.openjdk.hotspot.runtime.devel/7706

Windows only uses reserved but only partially committed memory for its
stacks. In order to detect when to commit more stack, it installs a
one-shot guard page (btw the same type of guard page that is used for
the hotspot yellow and red zone) right at the edge of the currently
committed stack zone. When a thread accesses this guard page an
exception is thrown which Windows catches internally, commits more
stack and re-establishes the one-shot guard page at the new edge of the
committed zone. When Windows detects such an exception inside the
_last 4 pages_ of a stack (I couldn't find any documentation for that
on MSDN, I found this value from manually testing on several Windows
machines with 4k stack pages) it throws a STACK_OVERFLOW_EXCEPTION.
So maybe emacs just had the incredibly bad luck to alloca() a large 
buffer right at end-of-stack and then somehow managed to skip over the 4 
guard pages when accessing it?


Very strange...


Ryan






Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Ryan Johnson

On 19/08/2013 5:32 AM, Corinna Vinschen wrote:

On Aug 16 16:49, Ken Brown wrote:

The problem that has been discussed at length in the thread "64-bit
emacs crashes a lot" appears to have been solved on the emacs-devel
list.  (I say "appears to" because I'm waiting for Ryan to confirm
this.)  The problem went away for me when I built emacs with
'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
emacs needs an unusually big stack or if the default stack size on
64-bit Cygwin should be increased for all applications.

I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
Cygwin.  Shouldn't 64-bit applications need a larger stack than
32-bit applications in general?

From my POV, if you have a stack-active application, just add the
aforementioned --stack linker option, or call peflags -x after the
build.  The latter can be done any time

FYI, I just tried upping the stack size on /usr/bin/emacs-nox, but it
still crashes. Most likely because the damage was already done during
bootstrap, when it has much larger memory requirements than normal.


Still no crashes so far in the version I linked with --stack, though.

One thing I don't understand, though: shouldn't a stack overflow 
normally manifest as a seg fault when trying to access the invalid 
addresses, rather than silent memory corruption?


However, /proc/pid/maps for emacs shows:
00010000-00020000 rw-s 00000000 0000:0000 0 [win heap 1 default shared]
00020000-00030000 rw-s 00000000 0000:0000 0 [win heap 2 default shared]
00030000-001E4000 ===p 00000000 0000:0000 0 [stack (tid 4896)]
001E4000-001E6000 rw-g 001B4000 0000:0000 0 [stack (tid 4896)]
001E6000-00230000 rw-p 001B6000 0000:0000 0 [stack (tid 4896)]
GDB reports that thread 4896 is the main thread... so I guess Windows 
doesn't reserve a red zone around its stack, but instead chooses to 
place the main thread stack right next to the fully-mapped global shared 
heap to maximize the potential for Fun?
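
For reference, the region sizes in that listing can be worked out with a few lines of Python. This is just arithmetic on the addresses, taking the stack regions as 00030000-001E4000, 001E4000-001E6000 and 001E6000-00230000 and the second heap as 00020000-00030000 (the zero-padded forms are an assumption about how the addresses read in full). The labels "reserved", "guard" and "committed" are my reading of the `===p`, `rw-g` and `rw-p` flags.

```python
# (start, end, flags, label) for the regions, addresses written out in full.
# The zero-padded forms are an assumption about the original listing.
regions = [
    (0x00020000, 0x00030000, "rw-s", "win heap 2 default shared"),
    (0x00030000, 0x001E4000, "===p", "stack, reserved"),
    (0x001E4000, 0x001E6000, "rw-g", "stack, guard"),
    (0x001E6000, 0x00230000, "rw-p", "stack, committed"),
]


def size_kib(start, end):
    """Size of an address range in KiB."""
    return (end - start) // 1024


sizes = {label: size_kib(start, end) for start, end, _flags, label in regions}
stack_total_kib = sum(
    kib for label, kib in sizes.items() if label.startswith("stack")
)
# True when each region starts exactly where the previous one ends.
contiguous = all(a[1] == b[0] for a, b in zip(regions, regions[1:]))

for label, kib in sizes.items():
    print(f"{label:28s} {kib:5d} KiB")
print("stack total:", stack_total_kib, "KiB; contiguous:", contiguous)
```

Under that reading the three stack regions add up to exactly 2 MiB, which matches the default reserve size, and the shared heap ends at the very address where the stack reservation begins.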


Ryan





Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Ken Brown

On 8/19/2013 5:32 AM, Corinna Vinschen wrote:

On Aug 16 16:49, Ken Brown wrote:

The problem that has been discussed at length in the thread "64-bit
emacs crashes a lot" appears to have been solved on the emacs-devel
list.  (I say "appears to" because I'm waiting for Ryan to confirm
this.)  The problem went away for me when I built emacs with
'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
emacs needs an unusually big stack or if the default stack size on
64-bit Cygwin should be increased for all applications.

I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
Cygwin.  Shouldn't 64-bit applications need a larger stack than
32-bit applications in general?


Well, in fact 2 Megs is a pretty big stack to begin with.  If you check
the Windows executables in C:\Windows\system32, you'll notice that a
predominant number of them have their stacksize set to <= 1 Meg.  Also,
if you don't set the default stack size explicitly when building
applications with VC++, the default stacksize will be set to 1 Meg on
both platforms, x86 and x64.

So, by setting the default stacksize to 2 Megs, gcc is already leaning
towards the safe side and it's *much* more than most applications really
need.  From my POV, if you have a stack-active application, just add the
aforementioned --stack linker option, or call peflags -x after the
build.  The latter can be done any time, for instance:

   tcsh$ peflags -x /bin/bash
   /bin/bash: stack reserve size  : 2097152 (0x200000) bytes
   tcsh$ bash -c 'ulimit -s'
   2025
   tcsh$ peflags -x0x400000 /bin/bash
   /bin/bash: stack reserve size  : 4194304 (0x400000) bytes
   tcsh$ bash -c 'ulimit -s'
   4073


OK, thanks.  I'll just use the --stack option the next time I rebuild 
emacs.  But it's good to know that users can change this themselves with 
peflags.


Ken




Re: Stack size on 64-bit Cygwin

2013-08-19 Thread Corinna Vinschen
On Aug 16 16:49, Ken Brown wrote:
> The problem that has been discussed at length in the thread "64-bit
> emacs crashes a lot" appears to have been solved on the emacs-devel
> list.  (I say "appears to" because I'm waiting for Ryan to confirm
> this.)  The problem went away for me when I built emacs with
> 'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that
> emacs needs an unusually big stack or if the default stack size on
> 64-bit Cygwin should be increased for all applications.
> 
> I noticed that ulimit -s gives 2025 on both 32-bit Cygwin and 64-bit
> Cygwin.  Shouldn't 64-bit applications need a larger stack than
> 32-bit applications in general?

Well, in fact 2 Megs is a pretty big stack to begin with.  If you check
the Windows executables in C:\Windows\system32, you'll notice that a
predominant number of them have their stacksize set to <= 1 Meg.  Also,
if you don't set the default stack size explicitly when building
applications with VC++, the default stacksize will be set to 1 Meg on
both platforms, x86 and x64.

So, by setting the default stacksize to 2 Megs, gcc is already leaning
towards the safe side and it's *much* more than most applications really
need.  From my POV, if you have a stack-active application, just add the
aforementioned --stack linker option, or call peflags -x after the
build.  The latter can be done any time, for instance:

  tcsh$ peflags -x /bin/bash
  /bin/bash: stack reserve size  : 2097152 (0x200000) bytes
  tcsh$ bash -c 'ulimit -s'
  2025
  tcsh$ peflags -x0x400000 /bin/bash
  /bin/bash: stack reserve size  : 4194304 (0x400000) bytes
  tcsh$ bash -c 'ulimit -s'
  4073
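
As a quick cross-check of those numbers, a small Python sketch relates the reserve sizes to the ulimit values. It assumes that `ulimit -s` reports KiB; the `describe` helper is only for illustration and is not part of any tool.

```python
# Stack reserve sizes reported by peflags -x in the example above (bytes).
reserve_default = 2097152
reserve_raised = 4194304

# ulimit -s values printed by bash in the same example (assumed to be KiB).
ulimit_default = 2025
ulimit_raised = 4073


def describe(reserve_bytes, ulimit_kib):
    """Return the reserve size in hex and KiB, plus its gap to ulimit -s."""
    reserve_kib = reserve_bytes // 1024
    return {
        "hex": hex(reserve_bytes),
        "reserve_kib": reserve_kib,
        "gap_kib": reserve_kib - ulimit_kib,
    }


default_info = describe(reserve_default, ulimit_default)
raised_info = describe(reserve_raised, ulimit_raised)
print(default_info)
print(raised_info)
```

Under that assumption the value reported by `ulimit -s` sits the same number of KiB below the reserve size in both cases, so it appears to track the PE header setting directly. Why the gap has that particular size is not something the example explains.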


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat




Re: Stack size on 64-bit Cygwin

2013-08-16 Thread Ryan Johnson

On 16/08/2013 4:49 PM, Ken Brown wrote:
The problem that has been discussed at length in the thread "64-bit 
emacs crashes a lot" appears to have been solved on the emacs-devel 
list.  (I say "appears to" because I'm waiting for Ryan to confirm this.) 

WJFFM so far (fingers crossed!)

The problem went away for me when I built emacs with 
'LDFLAGS=-Wl,--stack,4194304'.  I'm wondering if it's just that emacs 
needs an unusually big stack or if the default stack size on 64-bit 
Cygwin should be increased for all applications.
I could easily imagine running into trouble by doubling pointer sizes, 
if GC calls routinely reach 10k+ stack frames deep like somebody 
mentioned a couple days ago...
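
A back-of-the-envelope sketch of that concern, in Python. It takes "10k+ frames" as exactly 10,000 and uses the 2097152-byte default reserve alongside the 4194304 value from the --stack option. The real per-frame size in emacs is not known here, so this only shows the average budget each frame would get.

```python
def bytes_per_frame(stack_bytes, frames):
    """Average stack budget per frame if `frames` nested calls share the stack."""
    return stack_bytes // frames


frames = 10_000  # "10k+ stack frames deep", taken as exactly 10,000 here
default_budget = bytes_per_frame(2097152, frames)  # default 2 MiB reserve
raised_budget = bytes_per_frame(4194304, frames)   # --stack,4194304
print(default_budget, raised_budget)
```

If frames grow when pointers double in size, a budget that was adequate on 32-bit could plausibly run out on 64-bit at the same recursion depth.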


Ryan

