Re: Sparc64 rthreads Instablilty

2024-02-16 Thread Martin Pieuchot
On 15/02/24(Thu) 20:06, Kurt Miller wrote: > On Feb 15, 2024, at 3:01 PM, Miod Vallat wrote: > > > >> Has been running for the last few hours without any issue. > >> OK claudio@ on that diff. > > > > But it's your diff! I only polished it a bit. > > > > I have also been testing various

Re: Sparc64 rthreads Instablilty

2024-02-15 Thread Kurt Miller
On Feb 15, 2024, at 3:01 PM, Miod Vallat wrote: > >> Has been running for the last few hours without any issue. >> OK claudio@ on that diff. > > But it's your diff! I only polished it a bit. > I have also been testing various versions of my test program for a few hours as well. It has not

Re: Sparc64 rthreads Instablilty

2024-02-15 Thread Miod Vallat
> Has been running for the last few hours without any issue. > OK claudio@ on that diff. But it's your diff! I only polished it a bit.

Re: Sparc64 rthreads Instablilty

2024-02-15 Thread Claudio Jeker
On Thu, Feb 15, 2024 at 04:38:07PM +0100, Claudio Jeker wrote: > On Thu, Feb 15, 2024 at 03:30:39PM +, Miod Vallat wrote: > > > > A lot of this points towards a register window error in __tfork() which > > > > affects only the parent process. > > > > > > I think the issue is that cpu_fork()

Re: Sparc64 rthreads Instablilty

2024-02-15 Thread Claudio Jeker
On Thu, Feb 15, 2024 at 03:30:39PM +, Miod Vallat wrote: > > > A lot of this points towards a register window error in __tfork() which > > > affects only the parent process. > > > > I think the issue is that cpu_fork() copies the u_pcb from parent to child > > and with that the special user

Re: Sparc64 rthreads Instablilty

2024-02-15 Thread Miod Vallat
> > A lot of this points towards a register window error in __tfork() which > > affects only the parent process. > > I think the issue is that cpu_fork() copies the u_pcb from parent to child > and with that the special user register windows that got spilled because > of some TL>0 fault. Since

Re: Sparc64 rthreads Instablilty

2024-02-15 Thread Claudio Jeker
On Tue, Nov 07, 2023 at 12:09:36PM +0100, Claudio Jeker wrote: > On Tue, Oct 31, 2023 at 01:18:42PM +0100, Claudio Jeker wrote: > > On Mon, Oct 23, 2023 at 11:06:53PM +, Kurt Miller wrote: > > > I experimented with adding a nanosleep after pthread_create() to > > > see if that would resolve

Re: Sparc64 rthreads Instablilty

2023-11-07 Thread Claudio Jeker
On Tue, Oct 31, 2023 at 01:18:42PM +0100, Claudio Jeker wrote: > On Mon, Oct 23, 2023 at 11:06:53PM +, Kurt Miller wrote: > > I experimented with adding a nanosleep after pthread_create() to > > see if that would resolve the segfault issue - it does, but it > > also exposed a new failure mode

Re: Sparc64 rthreads Instablilty

2023-10-25 Thread Kurt Miller
On Oct 25, 2023, at 4:26 AM, Claudio Jeker wrote: > > On Mon, Oct 23, 2023 at 11:06:53PM +, Kurt Miller wrote: >> I experimented with adding a nanosleep after pthread_create() to >> see if that would resolve the segfault issue - it does, but it >> also exposed a new failure mode on -current.

Re: Sparc64 rthreads Instablilty

2023-10-25 Thread Claudio Jeker
On Mon, Oct 23, 2023 at 11:06:53PM +, Kurt Miller wrote: > I experimented with adding a nanosleep after pthread_create() to > see if that would resolve the segfault issue - it does, but it > also exposed a new failure mode on -current. Every so often > the test program would not exit now.

Re: Sparc64 rthreads Instablilty

2023-10-23 Thread Kurt Miller
I experimented with adding a nanosleep after pthread_create() to see if that would resolve the segfault issue - it does, but it also exposed a new failure mode on -current. Every so often the test program would not exit now. Thinking it may be related to the detached threads I reworked the test

Re: Sparc64 rthreads Instablilty

2023-10-19 Thread Kurt Miller
On Aug 16, 2023, at 4:14 PM, Kurt Miller wrote: > >> On Aug 14, 2023, at 5:42 PM, Theo Buehler wrote: >> >> On Mon, Aug 14, 2023 at 08:47:22PM +, Miod Vallat wrote: >>> For what it's worth, I couldn't get your test to fail on a dual-cpu >>> sun4u. Either it's a sun4v-specific issue or it

Re: Sparc64 rthreads Instablilty

2023-09-02 Thread Theo Buehler
On Sat, Sep 02, 2023 at 11:52:28AM +0100, Martin Pieuchot wrote: > On 13/08/23(Sun) 22:59, Kurt Miller wrote: > > I’ve been hunting an intermittent JDK crash on sparc64 for some time now. > > Since egdb has not been up to the task, I created a small C program which > > reproduces the problem. This

Re: Sparc64 rthreads Instablilty

2023-09-02 Thread Martin Pieuchot
On 13/08/23(Sun) 22:59, Kurt Miller wrote: > I’ve been hunting an intermittent JDK crash on sparc64 for some time now. > Since egdb has not been up to the task, I created a small C program which > reproduces the problem. This partially mimics the JDK startup where a number > of detached threads

Re: Sparc64 rthreads Instablilty

2023-08-16 Thread Kurt Miller
> On Aug 14, 2023, at 5:42 PM, Theo Buehler wrote: > > On Mon, Aug 14, 2023 at 08:47:22PM +, Miod Vallat wrote: >> For what it's worth, I couldn't get your test to fail on a dual-cpu >> sun4u. Either it's a sun4v-specific issue or it needs many more cpus to >> trigger. > > I can reproduce

Re: Sparc64 rthreads Instablilty

2023-08-14 Thread Theo Buehler
On Mon, Aug 14, 2023 at 08:47:22PM +, Miod Vallat wrote: > For what it's worth, I couldn't get your test to fail on a dual-cpu > sun4u. Either it's a sun4v-specific issue or it needs many more cpus to > trigger. I can reproduce the segfault, but seemingly not the killed process on 16-cpu LDOM

Re: Sparc64 rthreads Instablilty

2023-08-14 Thread Miod Vallat
For what it's worth, I couldn't get your test to fail on a dual-cpu sun4u. Either it's a sun4v-specific issue or it needs many more cpus to trigger.

Re: Sparc64 rthreads Instablilty

2023-08-14 Thread Kurt Miller
> On Aug 13, 2023, at 11:38 PM, Kurt Miller wrote: > > The test program as an attachment as my mua mangles inline > code - sorry. Also on cvs:~kurt/startup.c > > The attachment had NTHREADS set to 400. That was for a one-off test. All my testing has been at 40 threads. Also, I noticed when

Re: Sparc64 rthreads Instablilty

2023-08-13 Thread Kurt Miller
The test program as an attachment as my mua mangles inline code - sorry. Also on cvs:~kurt/startup.c startup.c Description: Binary data

Sparc64 rthreads Instablilty

2023-08-13 Thread Kurt Miller
I’ve been hunting an intermittent JDK crash on sparc64 for some time now. Since egdb has not been up to the task, I created a small C program which reproduces the problem. This partially mimics the JDK startup where a number of detached threads are created. When each thread is created the main