Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-17 Thread Gregory Shimansky

Evgueni Brevnov wrote:

In other words we will observe the crash as we do now if sem_wait
completes unsuccessfully for whatever reason...


Well, it shouldn't return an error except when interrupted by a signal,
should it? The two other possible errors are EINVAL and EDEADLK, which
should never happen.


Maybe we should add an assertion right after the call that sem_wait was
successful. It would catch this situation quickly and be a good starting
point for investigation.
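
A minimal sketch in C of the loop and the assertion discussed here. The helper names are made up for illustration and are not taken from HARMONY-2203 or the classlib sources; the sketch assumes plain POSIX semaphores rather than the hysem wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: wait on a semaphore, retrying only when the wait
 * was interrupted by a signal (EINTR). Any other failure is returned. */
static int wait_retrying_on_eintr(sem_t *sem)
{
    int rc;
    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);
    return rc; /* 0 on success, -1 with errno set otherwise */
}

/* Hypothetical caller: after the retry loop, a failure can no longer be
 * EINTR, so a debug build stops right at the point of the problem. */
static void wait_or_abort(sem_t *sem)
{
    int rc = wait_retrying_on_eintr(sem);
    assert(rc == 0 && "sem_wait failed with an error other than EINTR");
    (void)rc; /* keeps NDEBUG builds free of unused-variable warnings */
}
```

With this shape, signal interruptions (including those caused by stepping in gdb) are retried silently, while EINVAL or anything else unexpected trips the assertion instead of letting the code after the wait run on a semaphore that was never acquired.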



On 11/17/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

Gregory,

The code which goes after sem_wait doesn't work properly if sem_wait
returns with an error code. So we need to either loop until sem_wait
returns successfully or adjust the code after sem_wait to handle
irregular cases.

Thanks
Evgueni

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> Yes - that's why I was poking him to see the patch.  I was going to
> suggest something very similar.
>
> geir
>
>
> Gregory Shimansky wrote:
> > Evgueni Brevnov wrote:
> >> You can look at the change here
> >> http://issues.apache.org/jira/browse/HARMONY-2203
> >
> > Could someone who knows classlib native code internals better than me
> > comment on this JIRA? I've added my comment from the general POV.
> >
> > I would change the loop to detect only signal interruption like
> >
> > while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);
> >
> > Other than that I agree with the patch. If someone does not know, every
> > step in gdb also interrupts sem_wait calls, so such loops are a common
> > practice when using semaphores.
> >
> > If someone knows classlib internal logic with this asynchronous handlers
> > stuff please write your opinion.
> >
>






--
Gregory



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-16 Thread Evgueni Brevnov

In other words we will observe the crash as we do now if sem_wait
completes unsuccessfully for whatever reason...

On 11/17/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

Gregory,

The code which goes after sem_wait doesn't work properly if sem_wait
returns with an error code. So we need to either loop until sem_wait
returns successfully or adjust the code after sem_wait to handle
irregular cases.

Thanks
Evgueni

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> Yes - that's why I was poking him to see the patch.  I was going to
> suggest something very similar.
>
> geir
>
>
> Gregory Shimansky wrote:
> > Evgueni Brevnov wrote:
> >> You can look at the change here
> >> http://issues.apache.org/jira/browse/HARMONY-2203
> >
> > Could someone who knows classlib native code internals better than me
> > comment on this JIRA? I've added my comment from the general POV.
> >
> > I would change the loop to detect only signal interruption like
> >
> > while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);
> >
> > Other than that I agree with the patch. If someone does not know, every
> > step in gdb also interrupts sem_wait calls, so such loops are a common
> > practice when using semaphores.
> >
> > If someone knows classlib internal logic with this asynchronous handlers
> > stuff please write your opinion.
> >
>



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-16 Thread Evgueni Brevnov

Gregory,

The code which goes after sem_wait doesn't work properly if sem_wait
returns with an error code. So we need to either loop until sem_wait
returns successfully or adjust the code after sem_wait to handle
irregular cases.

Thanks
Evgueni

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:

Yes - that's why I was poking him to see the patch.  I was going to
suggest something very similar.

geir


Gregory Shimansky wrote:
> Evgueni Brevnov wrote:
>> You can look at the change here
>> http://issues.apache.org/jira/browse/HARMONY-2203
>
> Could someone who knows classlib native code internals better than me
> comment on this JIRA? I've added my comment from the general POV.
>
> I would change the loop to detect only signal interruption like
>
> while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);
>
> Other than that I agree with the patch. If someone does not know, every
> step in gdb also interrupts sem_wait calls, so such loops are a common
> practice when using semaphores.
>
> If someone knows classlib internal logic with this asynchronous handlers
> stuff please write your opinion.
>



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-16 Thread Geir Magnusson Jr.
Yes - that's why I was poking him to see the patch.  I was going to 
suggest something very similar.


geir


Gregory Shimansky wrote:

Evgueni Brevnov wrote:

You can look at the change here
http://issues.apache.org/jira/browse/HARMONY-2203


Could someone who knows classlib native code internals better than me
comment on this JIRA? I've added my comment from the general POV.


I would change the loop to detect only signal interruption like

while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);

Other than that I agree with the patch. If someone does not know, every
step in gdb also interrupts sem_wait calls, so such loops are a common 
practice when using semaphores.


If someone knows classlib internal logic with this asynchronous handlers 
stuff please write your opinion.




Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-16 Thread Gregory Shimansky

Evgueni Brevnov wrote:

You can look at the change here
http://issues.apache.org/jira/browse/HARMONY-2203


Could someone who knows classlib native code internals better than me
comment on this JIRA? I've added my comment from the general POV.


I would change the loop to detect only signal interruption, like

while (sem_wait(&wakeUpASynchReporter) == -1 && errno == EINTR);

Other than that I agree with the patch. If someone does not know, every
step in gdb also interrupts sem_wait calls, so such loops are a common
practice when using semaphores.


If someone knows the classlib internal logic behind this asynchronous
handler stuff, please write your opinion.


--
Gregory



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-16 Thread Evgueni Brevnov

You can look at the change here
http://issues.apache.org/jira/browse/HARMONY-2203

On 11/16/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

I haven't published it yet...will file a JIRA soon...

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> ah. whew.
>
> can you point me to that change you made?
>
> geir
>
> Evgueni Brevnov wrote:
> > I'm not aware if classlib uses SIGUSR2. In this particular case
> > classlib (to be more precise it is the portlib module) does sem_wait
> > which is interrupted by TM's SIGUSR2 signal. I replaced "hysem_wait"
> > with "while (hysem_wait() != 0) {}". It helped to pass all tests.
> >
> > Evgueni
> >
> > On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> >> um... classlib uses SIGUSR2 as well?  Doesn't our thread manager use it?
> >>
> >> Evgueni Brevnov wrote:
> >> > Hey,
> >> >
> >> > Seems like the pretty old problem shows itself again. I'm talking
> >> > about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
> >> > uses system semaphores for synchronization purposes...and hysem_wait
> >> > is interrupted by the signal:
> >> >
> >> > (gdb) p perror("sym_wait error:")
> >> > sym_wait error:: Interrupted system call
> >> >
> >> > Do we have good (universal) solution for such cases?
> >> >
> >> > Thanks
> >> > Evgueni
> >> >
> >> > On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >>
> >> >> Gregory Shimansky wrote:
> >> >> > Evgueni Brevnov wrote:
> >> >> >> hmmm strange. The patch was tested on multi-processor system
> >> >> >> running SUSE9. I will check if the patch misses something. Anyway, we
> >> >> >> need to wait with the patch submission until we 100% sure how
> >> >> >> hythread_monitor_init should behave.
> >> >> >>
> >> >> >> Thanks
> >> >> >> Evgueni
> >> >> >>
> >> >> >> On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
> >> >> >>> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> >> >> >>> > Hi,
> >> >> >>> >
> >> >> >>> > While investigating deadlock scenario which is described in
> >> >> >>> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
> >> >> >>> > implementation of hythread_monitor_init /
> >> >> >>> > hythread_monitor_init_with_name initializes and acquires a monitor.
> >> >> >>> > Original spec reads: "Acquire and initialize a new monitor from the
> >> >> >>> > threading library" AFAIU that doesn't mean to lock the monitor but
> >> >> >>> > get it from the threading library. So the hythread_monitor_init should
> >> >> >>> > not lock the monitor.
> >> >> >>> >
> >> >> >>> > Could somebody comment on that?
> >> >> >>>
> >> >> >>> It might be that semantic is different on different platforms which is
> >> >> >>> probably even worse. Your patch in HARMONY-2149 breaks nearly all of
> >> >> >>> acceptance tests on Linux while everything on Windows works (ok I tested on
> >> >> >>> laptop with 1 processor while Linux was a HT server, sometimes it is
> >> >> >>> important for threading).
> >> >> >
> >> >> > I've tried to investigate the problem but didn't find the end of it yet.
> >> >> > The bug seems to be ubuntu specific (shall we maybe call this
> >> >> > distribution buggy and move on?).
> >> >>
> >> >> There is something odd about it, I'll admit...  Remember the EOMEM bugs
> >> >> I found in forking?
> >> >>
> >> >>
> >> >> > I didn't reproduce it on
> >> >> > gentoo, all tests work just fine.
> >> >> >
> >> >> > The bug looks like this, on tests gc.Force, gc.LOS, gc.List, gc.NPE,
> >> >> > gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM
> >> >> > segfaults. The stack looks like an infinite recursion of 4 stack frames:
> >> >> >
> >> >> > #0  0xb6dcb814 in null_java_reference_handler (signum=11,
> >> >> > info=0xb71a503c, context=0xb71a50bc) at
> >> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
> >> >> > #1
> >> >> > #2  0xb6dcc20a in get_stack_addr () at
> >> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
> >> >> > #3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
> >> >> > at
> >> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
> >> >> > #4  0xb6dcb900 in null_java_reference_handler (signum=11,
> >> >> > info=0xb71a546c, context=0xb71a54ec) at
> >> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451
> >> >> >
> >> >> > and so on. The stack is very long. When I run VM with -Xtrace:signals I
> >> >> > get a very long log of messages that "NPE or SOE detected at ...". The
> >> >> > first time address always varies, but it ap

Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-15 Thread Evgueni Brevnov

I haven't published it yet...will file a JIRA soon...

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:

ah. whew.

can you point me to that change you made?

geir

Evgueni Brevnov wrote:
> I'm not aware if classlib uses SIGUSR2. In this particular case
> classlib (to be more precise it is the portlib module) does sem_wait
> which is interrupted by TM's SIGUSR2 signal. I replaced "hysem_wait"
> with "while (hysem_wait() != 0) {}". It helped to pass all tests.
>
> Evgueni
>
> On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
>> um... classlib uses SIGUSR2 as well?  Doesn't our thread manager use it?
>>
>> Evgueni Brevnov wrote:
>> > Hey,
>> >
>> > Seems like the pretty old problem shows itself again. I'm talking
>> > about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
>> > uses system semaphores for synchronization purposes...and hysem_wait
>> > is interrupted by the signal:
>> >
>> > (gdb) p perror("sym_wait error:")
>> > sym_wait error:: Interrupted system call
>> >
>> > Do we have good (universal) solution for such cases?
>> >
>> > Thanks
>> > Evgueni
>> >
>> > On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
>> >>
>> >>
>> >> Gregory Shimansky wrote:
>> >> > Evgueni Brevnov wrote:
>> >> >> hmmm strange. The patch was tested on multi-processor system
>> >> >> running SUSE9. I will check if the patch misses something. Anyway, we
>> >> >> need to wait with the patch submission until we 100% sure how
>> >> >> hythread_monitor_init should behave.
>> >> >>
>> >> >> Thanks
>> >> >> Evgueni
>> >> >>
>> >> >> On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>> >> >>> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
>> >> >>> > Hi,
>> >> >>> >
>> >> >>> > While investigating deadlock scenario which is described in
>> >> >>> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
>> >> >>> > implementation of hythread_monitor_init /
>> >> >>> > hythread_monitor_init_with_name initializes and acquires a monitor.
>> >> >>> > Original spec reads: "Acquire and initialize a new monitor from the
>> >> >>> > threading library" AFAIU that doesn't mean to lock the monitor but
>> >> >>> > get it from the threading library. So the hythread_monitor_init should
>> >> >>> > not lock the monitor.
>> >> >>> >
>> >> >>> > Could somebody comment on that?
>> >> >>>
>> >> >>> It might be that semantic is different on different platforms which is
>> >> >>> probably even worse. Your patch in HARMONY-2149 breaks nearly all of
>> >> >>> acceptance tests on Linux while everything on Windows works (ok I tested on
>> >> >>> laptop with 1 processor while Linux was a HT server, sometimes it is
>> >> >>> important for threading).
>> >> >
>> >> > I've tried to investigate the problem but didn't find the end of it yet.
>> >> > The bug seems to be ubuntu specific (shall we maybe call this
>> >> > distribution buggy and move on?).
>> >>
>> >> There is something odd about it, I'll admit...  Remember the EOMEM bugs
>> >> I found in forking?
>> >>
>> >>
>> >> > I didn't reproduce it on
>> >> > gentoo, all tests work just fine.
>> >> >
>> >> > The bug looks like this, on tests gc.Force, gc.LOS, gc.List, gc.NPE,
>> >> > gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM
>> >> > segfaults. The stack looks like an infinite recursion of 4 stack frames:
>> >> >
>> >> > #0  0xb6dcb814 in null_java_reference_handler (signum=11,
>> >> > info=0xb71a503c, context=0xb71a50bc) at
>> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
>> >> > #1
>> >> > #2  0xb6dcc20a in get_stack_addr () at
>> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
>> >> > #3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
>> >> > at
>> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
>> >> > #4  0xb6dcb900 in null_java_reference_handler (signum=11,
>> >> > info=0xb71a546c, context=0xb71a54ec) at
>> >> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451
>> >> >
>> >> > and so on. The stack is very long. When I run VM with -Xtrace:signals I
>> >> > get a very long log of messages that "NPE or SOE detected at ...". The
>> >> > first time address always varies, but it appears to be memcpy. The next
>> >> > addresses are always the same, they point to get_stack_addr function.
>> >> >
>> >> > So I tried to find out why memcpy crashes in the first place. It appears
>> >> > to be a struct copy called from jsig_handler hysig. The stack looks like
>> >> > this (if I can trust gdb on ubuntu):
>> >> >
>> >> > #0  0xb7a9b9dc in memcpy () from /
Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-15 Thread Geir Magnusson Jr.

ah. whew.

can you point me to that change you made?

geir

Evgueni Brevnov wrote:

I'm not aware if classlib uses SIGUSR2. In this particular case
classlib (to be more precise it is the portlib module) does sem_wait
which is interrupted by TM's SIGUSR2 signal. I replaced "hysem_wait"
with "while (hysem_wait() != 0) {}". It helped to pass all tests.

Evgueni

On 11/16/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:

um... classlib uses SIGUSR2 as well?  Doesn't our thread manager use it?

Evgueni Brevnov wrote:
> Hey,
>
> Seems like the pretty old problem shows itself again. I'm talking
> about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
> uses system semaphores for synchronization purposes...and hysem_wait
> is interrupted by the signal:
>
> (gdb) p perror("sym_wait error:")
> sym_wait error:: Interrupted system call
>
> Do we have good (universal) solution for such cases?
>
> Thanks
> Evgueni
>
> On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:
>>
>>
>> Gregory Shimansky wrote:
>> > Evgueni Brevnov wrote:
>> >> hmmm strange. The patch was tested on multi-processor system
>> >> running SUSE9. I will check if the patch misses something. Anyway, we
>> >> need to wait with the patch submission until we 100% sure how
>> >> hythread_monitor_init should behave.
>> >>
>> >> Thanks
>> >> Evgueni
>> >>
>> >> On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>> >>> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
>> >>> > Hi,
>> >>> >
>> >>> > While investigating deadlock scenario which is described in
>> >>> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
>> >>> > implementation of hythread_monitor_init /
>> >>> > hythread_monitor_init_with_name initializes and acquires a monitor.
>> >>> > Original spec reads: "Acquire and initialize a new monitor from the
>> >>> > threading library" AFAIU that doesn't mean to lock the monitor but
>> >>> > get it from the threading library. So the hythread_monitor_init should
>> >>> > not lock the monitor.
>> >>> >
>> >>> > Could somebody comment on that?
>> >>>
>> >>> It might be that semantic is different on different platforms which is
>> >>> probably even worse. Your patch in HARMONY-2149 breaks nearly all of
>> >>> acceptance tests on Linux while everything on Windows works (ok I tested on
>> >>> laptop with 1 processor while Linux was a HT server, sometimes it is
>> >>> important for threading).
>> >
>> > I've tried to investigate the problem but didn't find the end of it yet.
>> > The bug seems to be ubuntu specific (shall we maybe call this
>> > distribution buggy and move on?).
>>
>> There is something odd about it, I'll admit...  Remember the EOMEM bugs
>> I found in forking?
>>
>>
>> > I didn't reproduce it on
>> > gentoo, all tests work just fine.
>> >
>> > The bug looks like this, on tests gc.Force, gc.LOS, gc.List, gc.NPE,
>> > gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM
>> > segfaults. The stack looks like an infinite recursion of 4 stack frames:
>> >
>> > #0  0xb6dcb814 in null_java_reference_handler (signum=11,
>> > info=0xb71a503c, context=0xb71a50bc) at
>> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
>> > #1
>> > #2  0xb6dcc20a in get_stack_addr () at
>> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
>> > #3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
>> > at
>> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
>> > #4  0xb6dcb900 in null_java_reference_handler (signum=11,
>> > info=0xb71a546c, context=0xb71a54ec) at
>> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451
>> >
>> > and so on. The stack is very long. When I run VM with -Xtrace:signals I
>> > get a very long log of messages that "NPE or SOE detected at ...". The
>> > first time address always varies, but it appears to be memcpy. The next
>> > addresses are always the same, they point to get_stack_addr function.
>> >
>> > So I tried to find out why memcpy crashes in the first place. It appears
>> > to be a struct copy called from jsig_handler hysig. The stack looks like
>> > this (if I can trust gdb on ubuntu):
>> >
>> > #0  0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
>> > #1  0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
>> >  at hysigunix.c:169
>> > #2  0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
>> > #3  0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
>> > at
>> > /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
>> >
>> > #4  0xb7bb0ed4 in dummy_worker (opaque=0x

Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-15 Thread Geir Magnusson Jr.

um... classlib uses SIGUSR2 as well?  Doesn't our thread manager use it?

Evgueni Brevnov wrote:

Hey,

Seems like the pretty old problem shows itself again. I'm talking
about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
uses system semaphores for synchronization purposes...and hysem_wait
is interrupted by the signal:

(gdb) p perror("sym_wait error:")
sym_wait error:: Interrupted system call

Do we have good (universal) solution for such cases?

Thanks
Evgueni

On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:



Gregory Shimansky wrote:
> Evgueni Brevnov wrote:
>> hmmm strange. The patch was tested on multi-processor system
>> running SUSE9. I will check if the patch misses something. Anyway, we
>> need to wait with the patch submission until we 100% sure how
>> hythread_monitor_init should behave.
>>
>> Thanks
>> Evgueni
>>
>> On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>>> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
>>> > Hi,
>>> >
>>> > While investigating deadlock scenario which is described in
>>> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
>>> > implementation of hythread_monitor_init /
>>> > hythread_monitor_init_with_name initializes and acquires a monitor.
>>> > Original spec reads: "Acquire and initialize a new monitor from the
>>> > threading library" AFAIU that doesn't mean to lock the monitor but
>>> > get it from the threading library. So the hythread_monitor_init should
>>> > not lock the monitor.
>>> >
>>> > Could somebody comment on that?
>>>
>>> It might be that semantic is different on different platforms which is
>>> probably even worse. Your patch in HARMONY-2149 breaks nearly all of
>>> acceptance tests on Linux while everything on Windows works (ok I tested on
>>> laptop with 1 processor while Linux was a HT server, sometimes it is
>>> important for threading).
>
> I've tried to investigate the problem but didn't find the end of it yet.
> The bug seems to be ubuntu specific (shall we maybe call this
> distribution buggy and move on?).

There is something odd about it, I'll admit...  Remember the EOMEM bugs
I found in forking?


> I didn't reproduce it on
> gentoo, all tests work just fine.
>
> The bug looks like this, on tests gc.Force, gc.LOS, gc.List, gc.NPE,
> gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM
> segfaults. The stack looks like an infinite recursion of 4 stack frames:
>
> #0  0xb6dcb814 in null_java_reference_handler (signum=11,
> info=0xb71a503c, context=0xb71a50bc) at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
> #1
> #2  0xb6dcc20a in get_stack_addr () at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
> #3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
> at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
> #4  0xb6dcb900 in null_java_reference_handler (signum=11,
> info=0xb71a546c, context=0xb71a54ec) at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451
>
> and so on. The stack is very long. When I run VM with -Xtrace:signals I
> get a very long log of messages that "NPE or SOE detected at ...". The
> first time address always varies, but it appears to be memcpy. The next
> addresses are always the same, they point to get_stack_addr function.
>
> So I tried to find out why memcpy crashes in the first place. It appears
> to be a struct copy called from jsig_handler hysig. The stack looks like
> this (if I can trust gdb on ubuntu):
>
> #0  0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
> #1  0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
>  at hysigunix.c:169
> #2  0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
> #3  0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
> at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
>
> #4  0xb7bb0ed4 in dummy_worker (opaque=0x0) at threadproc/unix/thread.c:138
> #5  0xb7b65341 in start_thread () from lib/tls/i686/cmov/libpthread.so.0
> #6  0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6
>
> In jsig_handler a struct of type sigaction is copied
>
> act = saved_sigaction[sig];
>
> and gcc replaces this statement with a call to memcpy it seems. But the
> parameter sig is quite weird if you look at it. It is sig=-1215196204...
> Now if I could only find where and this sig happened there... I cannot
> find it in the depth of classlib native code this late at night.
>






Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-15 Thread Evgueni Brevnov

Hey,

Seems like the pretty old problem shows itself again. I'm talking
about SIGUSR2 signal :-(...Classlib's asynchronous signal reporter
uses system semaphores for synchronization purposes...and hysem_wait
is interrupted by the signal:

(gdb) p perror("sym_wait error:")
sym_wait error:: Interrupted system call

Do we have good (universal) solution for such cases?

Thanks
Evgueni

On 11/15/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:



Gregory Shimansky wrote:
> Evgueni Brevnov wrote:
>> hmmm strange. The patch was tested on multi-processor system
>> running SUSE9. I will check if the patch misses something. Anyway, we
>> need to wait with the patch submission until we 100% sure how
>> hythread_monitor_init should behave.
>>
>> Thanks
>> Evgueni
>>
>> On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
>>> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
>>> > Hi,
>>> >
>>> > While investigating deadlock scenario which is described in
>>> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
>>> > implementation of hythread_monitor_init /
>>> > hythread_monitor_init_with_name initializes and acquires a monitor.
>>> > Original spec reads: "Acquire and initialize a new monitor from the
>>> > threading library" AFAIU that doesn't mean to lock the monitor but
>>> > get it from the threading library. So the hythread_monitor_init should
>>> > not lock the monitor.
>>> >
>>> > Could somebody comment on that?
>>>
>>> It might be that semantic is different on different platforms which is
>>> probably even worse. Your patch in HARMONY-2149 breaks nearly all of
>>> acceptance tests on Linux while everything on Windows works (ok I
>>> tested on
>>> laptop with 1 processor while Linux was a HT server, sometimes it is
>>> important for threading).
>
> I've tried to investigate the problem but didn't find the end of it yet.
> The bug seems to be ubuntu specific (shall we maybe call this
> distribution buggy and move on?).

There is something odd about it, I'll admit...  Remember the EOMEM bugs
I found in forking?


> I didn't reproduce it on
> gentoo, all tests work just fine.
>
> The bug looks like this, on tests gc.Force, gc.LOS, gc.List, gc.NPE,
> gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest VM
> segfaults. The stack looks like an infinite recursion of 4 stack frames:
>
> #0  0xb6dcb814 in null_java_reference_handler (signum=11,
> info=0xb71a503c, context=0xb71a50bc) at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
> #1
> #2  0xb6dcc20a in get_stack_addr () at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
> #3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
> at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
> #4  0xb6dcb900 in null_java_reference_handler (signum=11,
> info=0xb71a546c, context=0xb71a54ec) at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451
>
> and so on. The stack is very long. When I run VM with -Xtrace:signals I
> get a very long log of messages that "NPE or SOE detected at ...". The
> first time address always varies, but it appears to be memcpy. The next
> addresses are always the same, they point to get_stack_addr function.
>
> So I tried to find out why memcpy crashes in the first place. It appears
> to be a struct copy called from jsig_handler hysig. The stack looks like
> this (if I can trust gdb on ubuntu):
>
> #0  0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
> #1  0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
>  at hysigunix.c:169
> #2  0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
> #3  0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
> at
> /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
>
> #4  0xb7bb0ed4 in dummy_worker (opaque=0x0) at threadproc/unix/thread.c:138
> #5  0xb7b65341 in start_thread () from lib/tls/i686/cmov/libpthread.so.0
> #6  0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6
>
> In jsig_handler a struct of type sigaction is copied
>
> act = saved_sigaction[sig];
>
> and gcc replaces this statement with a call to memcpy it seems. But the
> parameter sig is quite weird if you look at it. It is sig=-1215196204...
> Now if I could only find where and this sig happened there... I cannot
> find it in the depth of classlib native code this late at night.
>
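
The quoted trace shows `saved_sigaction[sig]` being indexed with sig=-1215196204, which reads far outside the table. A minimal sketch in C of a range check before such a lookup follows. The table name, its size, and the helper names are hypothetical and are not taken from hysigunix.c; the real bound used by classlib is not shown in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical table size, chosen only for illustration. */
#define SAVED_TABLE_SIZE 65

/* Hypothetical stand-in for a per-signal table of saved handlers. */
static struct sigaction saved_table[SAVED_TABLE_SIZE];

/* Return the saved entry for sig, or NULL when sig cannot be a valid
 * index (zero, negative, or beyond the end of the table). */
static const struct sigaction *lookup_saved_action(int sig)
{
    if (sig <= 0 || sig >= SAVED_TABLE_SIZE) {
        return NULL;
    }
    return &saved_table[sig];
}

/* Debug-build variant: treat an out-of-range number as a programming
 * error so the failure shows up at the lookup, not inside memcpy. */
static const struct sigaction *lookup_saved_action_checked(int sig)
{
    const struct sigaction *act = lookup_saved_action(sig);
    assert(act != NULL);
    return act;
}
```

A check like this would not explain where the bogus value came from, but it would turn the crash in memcpy into a failure at the lookup itself, which is easier to trace back.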




Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-14 Thread Geir Magnusson Jr.



Gregory Shimansky wrote:

Evgueni Brevnov wrote:

hmmm strange. The patch was tested on multi-processor system
running SUSE9. I will check if the patch misses something. Anyway, we
need to wait with the patch submission until we 100% sure how
hythread_monitor_init should behave.

Thanks
Evgueni

On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:

On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?

It might be that semantic is different on different platforms which is
probably even worse. Your patch in HARMONY-2149 breaks nearly all of
acceptance tests on Linux while everything on Windows works (ok I tested on
laptop with 1 processor while Linux was a HT server, sometimes it is
important for threading).


I've tried to investigate the problem but didn't find the end of it yet. 
The bug seems to be ubuntu specific (shall we maybe call this 
distribution buggy and move on?). 


There is something odd about it, I'll admit...  Remember the EOMEM bugs 
I found in forking?



I didn't reproduce it on gentoo, all tests work just fine.

The bug looks like this: on tests gc.Force, gc.LOS, gc.List, gc.NPE,
gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest the VM
segfaults. The stack looks like an infinite recursion of 4 stack frames:


#0  0xb6dcb814 in null_java_reference_handler (signum=11, info=0xb71a503c, context=0xb71a50bc)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
#1  <signal handler called>
#2  0xb6dcc20a in get_stack_addr ()
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
#3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
#4  0xb6dcb900 in null_java_reference_handler (signum=11, info=0xb71a546c, context=0xb71a54ec)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451

and so on. The stack is very long. When I run VM with -Xtrace:signals I 
get a very long log of messages that "NPE or SOE detected at ...". The 
first time address always varies, but it appears to be memcpy. The next 
addresses are always the same, they point to get_stack_addr function.


So I tried to find out why memcpy crashes in the first place. It appears 
to be a struct copy called from jsig_handler hysig. The stack looks like 
this (if I can trust gdb on ubuntu):


#0  0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
#1  0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
    at hysigunix.c:169
#2  0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
#3  0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
#4  0xb7bb0ed4 in dummy_worker (opaque=0x0) at threadproc/unix/thread.c:138
#5  0xb7b65341 in start_thread () from lib/tls/i686/cmov/libpthread.so.0
#6  0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6

In jsig_handler a struct of type sigaction is copied

act = saved_sigaction[sig];

and gcc seems to replace this statement with a call to memcpy. But the
parameter sig looks quite weird: sig=-1215196204...
Now if I could only find where this sig value comes from... I cannot
find it in the depths of the classlib native code this late at night.






Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Evgueni Brevnov

Gregory,

I can't reproduce the problem you described on my local Ubuntu
machine, so I can only guess. My guess is that
mapPortLibSignalToUnix can't find a corresponding signal in the map.
That's why you have an undefined sig (-1215196204) in jsig_handler. I can
think of two reasons why everything works fine on my machine:
1) Another signal is generated on my build.
2) It is just a matter of luck that eax contains some proper value
upon returning from mapPortLibSignalToUnix.
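
To make the guess above concrete, here is a minimal sketch of a table lookup
that returns a defined value on every path, plus a range check before the
result is used as an array index. All demo_* names, the flag values and the
table size are hypothetical; this is not the code from hysignal.c or
hysigunix.c, only an illustration of the kind of guard that would turn a
garbage index into a detectable error.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical port-library flag values, for illustration only. */
#define DEMO_PORT_SIG_SEGV 0x04u
#define DEMO_PORT_SIG_INT  0x08u

struct demo_sig_map {
    unsigned int port_flag;   /* port-library side of the mapping */
    int unix_signal;          /* corresponding Unix signal number */
};

static const struct demo_sig_map demo_map[] = {
    { DEMO_PORT_SIG_SEGV, SIGSEGV },
    { DEMO_PORT_SIG_INT,  SIGINT  },
};

/*
 * Returns the Unix signal for a port-library flag, or -1 when the flag is
 * not in the table. The explicit "return -1" means the caller never reads
 * a leftover register value when the lookup fails.
 */
int demo_map_port_signal_to_unix(unsigned int port_flag)
{
    size_t i;

    for (i = 0; i < sizeof demo_map / sizeof demo_map[0]; i++) {
        if (demo_map[i].port_flag == port_flag)
            return demo_map[i].unix_signal;
    }
    return -1;
}

/* Caller-side guard: nonzero only if sig can safely index a table. */
int demo_is_valid_index(int sig, int table_size)
{
    return sig >= 0 && sig < table_size;
}
```

A caller would check demo_is_valid_index(sig, table_size) before doing a
struct copy such as act = saved_sigaction[sig], and report an error instead
of indexing when the check fails.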

That's it for now

Thanks
Evgueni

On 11/14/06, Alexei Fedotov <[EMAIL PROTECTED]> wrote:

Evgueni,
That was great.

Artem,
It's nice to see you online. Could you please check the last comments
to http://issues.apache.org/jira/browse/HARMONY-1904 and decide what
should we do about this issue?



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Alexei Fedotov

Evgueni,
That was great.

Artem,
It's nice to see you online. Could you please check the last comments
to http://issues.apache.org/jira/browse/HARMONY-1904 and decide what
should we do about this issue?


Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Gregory Shimansky

Evgueni Brevnov wrote:

hmmm strange. The patch was tested on multi-processor system
running SUSE9. I will check if the patch misses something. Anyway, we
need to wait with the patch submission until we 100% sure how
hythread_monitor_init should behave.

Thanks
Evgueni

On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:

On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?

It might be that semantic is different on different platforms which is
probably even worse. Your patch in HARMONY-2149 breaks nearly all of
acceptance tests on Linux while everything on Windows works (ok I tested on
laptop with 1 processor while Linux was a HT server, sometimes it is
important for threading).


I've tried to investigate the problem but didn't find the end of it yet. 
The bug seems to be ubuntu specific (shall we maybe call this 
distribution buggy and move on?). I didn't reproduce it on 
gentoo, all tests work just fine.


The bug looks like this: on tests gc.Force, gc.LOS, gc.List, gc.NPE,
gc.PhantomReferenceTest, gc.WeakReferenceTest, stress.WeakHashMapTest the VM
segfaults. The stack looks like an infinite recursion of 4 stack frames:


#0  0xb6dcb814 in null_java_reference_handler (signum=11, info=0xb71a503c, context=0xb71a50bc)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:443
#1  <signal handler called>
#2  0xb6dcc20a in get_stack_addr ()
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:293
#3  0xb6dcb6cd in check_stack_overflow (info=0xb71a546c, uc=0xb71a54ec)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:399
#4  0xb6dcb900 in null_java_reference_handler (signum=11, info=0xb71a546c, context=0xb71a54ec)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/vmcore/src/util/linux/signals_ia32.cpp:451

and so on. The stack is very long. When I run VM with -Xtrace:signals I 
get a very long log of messages that "NPE or SOE detected at ...". The 
first time address always varies, but it appears to be memcpy. The next 
addresses are always the same, they point to get_stack_addr function.


So I tried to find out why memcpy crashes in the first place. It appears 
to be a struct copy called from jsig_handler hysig. The stack looks like 
this (if I can trust gdb on ubuntu):


#0  0xb7a9b9dc in memcpy () from /lib/tls/i686/cmov/libc.so.6
#1  0xb7ba0fa0 in jsig_handler (sig=-1215196204, siginfo=0x0, uc=0x0)
    at hysigunix.c:169
#2  0xb7f9ec8b in asynchSignalReporter (userData=0x0) at hysignal.c:971
#3  0xb7baa8ef in thread_start_proc (thd=0x807a8e8, p_args=0x807a8d8)
    at /nfs/ims/proj/drl/mrt1/users/gregory/Harmony/enhanced/drlvm/trunk/vm/thread/src/thread_native_basic.c:712
#4  0xb7bb0ed4 in dummy_worker (opaque=0x0) at threadproc/unix/thread.c:138
#5  0xb7b65341 in start_thread () from lib/tls/i686/cmov/libpthread.so.0
#6  0xb7af94ee in clone () from /lib/tls/i686/cmov/libc.so.6

In jsig_handler a struct of type sigaction is copied

act = saved_sigaction[sig];

and gcc seems to replace this statement with a call to memcpy. But the
parameter sig looks quite weird: sig=-1215196204...
Now if I could only find where this sig value comes from... I cannot
find it in the depths of the classlib native code this late at night.


--
Gregory



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Artem Aliev

Oops,


You were right. I took a look at the classlib hythread code.
It looks like I misunderstood the documentation.
This is a bug.

Thanks
Artem


On 11/13/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

Could someone familiar with classlib's implementation comment on that ?

Thanks in advance.
Evgueni

On 11/13/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> Hello Artem,
>
> Are you 100% sure? I've looked at the classlib's implementation and
> can't find where the monitor is acquired. Moreover if you look at the
> initializeSignalTools() located in
> modules\portlib\src\main\native\port\linux\hysignal.c you will find
> that it initializes new monitors with hythread_monitor_init_with_name
> and never frees these monitors. That turned out to be the reason of a
> deadlock in HARMONY-2006.
>
> Thanks
> Evgueni
>
> On 11/13/06, Artem Aliev <[EMAIL PROTECTED]> wrote:
> > > It turned out that DRL
> > > implementation of hythread_monitor_init /
> > > hythread_monitor_init_with_name initializes and acquires a monitor.
> >
> > Eugeni,
> >
> > Both drlvm and classlib hythread work this way.
> > This is the original hythread design, which was implemented in drlvm
> > for compatibility reasons.
> >
> > Thanks
> > Artem
> >
> >
> >
> > On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > While investigating deadlock scenario which is described in
> > > HARMONY-2006 I found out one interesting thing. It turned out that DRL
> > > implementation of hythread_monitor_init /
> > > hythread_monitor_init_with_name initializes and acquires a monitor.
> > > Original spec reads: "Acquire and initialize a new monitor from the
> > > threading library" AFAIU that doesn't mean to lock the monitor but
> > > get it from the threading library. So the hythread_monitor_init should
> > > not lock the monitor.
> > >
> > > Could somebody comment on that?
> > >
> > > Thanks
> > > Evgueni
> > >
> >
>



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Evgueni Brevnov

Could someone familiar with classlib's implementation comment on that ?

Thanks in advance.
Evgueni

On 11/13/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

Hello Artem,

Are you 100% sure? I've looked at the classlib's implementation and
can't find where the monitor is acquired. Moreover if you look at the
initializeSignalTools() located in
modules\portlib\src\main\native\port\linux\hysignal.c you will find
that it initializes new monitors with hythread_monitor_init_with_name
and never frees these monitors. That turned out to be the reason of a
deadlock in HARMONY-2006.

Thanks
Evgueni

On 11/13/06, Artem Aliev <[EMAIL PROTECTED]> wrote:
> > It turned out that DRL
> > implementation of hythread_monitor_init /
> > hythread_monitor_init_with_name initializes and acquires a monitor.
>
> Eugeni,
>
> Both drlvm and classlib hythread work this way.
> This is the original hythread design, which was implemented in drlvm
> for compatibility reasons.
>
> Thanks
> Artem
>
>
>
> On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > While investigating deadlock scenario which is described in
> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
> > implementation of hythread_monitor_init /
> > hythread_monitor_init_with_name initializes and acquires a monitor.
> > Original spec reads: "Acquire and initialize a new monitor from the
> > threading library" AFAIU that doesn't mean to lock the monitor but
> > get it from the threading library. So the hythread_monitor_init should
> > not lock the monitor.
> >
> > Could somebody comment on that?
> >
> > Thanks
> > Evgueni
> >
>



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Evgueni Brevnov

Hello Artem,

Are you 100% sure? I've looked at the classlib's implementation and
can't find where the monitor is acquired. Moreover if you look at the
initializeSignalTools() located in
modules\portlib\src\main\native\port\linux\hysignal.c you will find
that it initializes new monitors with hythread_monitor_init_with_name
and never frees these monitors. That turned out to be the reason of a
deadlock in HARMONY-2006.
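
The hazard can be sketched with a plain POSIX mutex standing in for a
hythread monitor (an assumption made for illustration; the real monitor
implementation is different). If the initializing thread leaves the lock
held, as an init that also acquires would, a second thread cannot enter
until somebody releases it. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct demo_probe {
    pthread_mutex_t *mutex;   /* the "monitor" under test */
    int trylock_result;       /* what the second thread observed */
};

/* Second thread: try to enter once, without blocking, and record the result. */
static void *demo_try_enter(void *arg)
{
    struct demo_probe *probe = arg;

    probe->trylock_result = pthread_mutex_trylock(probe->mutex);
    if (probe->trylock_result == 0)
        pthread_mutex_unlock(probe->mutex);
    return NULL;
}

/*
 * Initializes a mutex, optionally leaves it locked (modelling an init that
 * also acquires), and reports what a second thread sees: 0 if it could
 * enter, EBUSY if the initializing thread still holds the lock.
 */
int demo_second_thread_trylock(int init_also_locks)
{
    pthread_mutex_t mutex;
    pthread_t thread;
    struct demo_probe probe;
    int rc;

    rc = pthread_mutex_init(&mutex, NULL);
    assert(rc == 0);
    if (init_also_locks) {
        rc = pthread_mutex_lock(&mutex);
        assert(rc == 0);
    }

    probe.mutex = &mutex;
    probe.trylock_result = -1;
    rc = pthread_create(&thread, NULL, demo_try_enter, &probe);
    assert(rc == 0);
    rc = pthread_join(thread, NULL);
    assert(rc == 0);

    if (init_also_locks)
        pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return probe.trylock_result;
}
```

With init_also_locks set to 1 the second thread gets EBUSY; a blocking
enter at that point would wait forever if the owner never exits the monitor.
With 0 the second thread enters immediately.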

Thanks
Evgueni

On 11/13/06, Artem Aliev <[EMAIL PROTECTED]> wrote:

> It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.

Eugeni,

Both drlvm and classlib hythread work this way.
This is the original hythread design, which was implemented in drlvm
for compatibility reasons.

Thanks
Artem



On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?
>
> Thanks
> Evgueni
>



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-13 Thread Artem Aliev

It turned out that DRL
implementation of hythread_monitor_init /
hythread_monitor_init_with_name initializes and acquires a monitor.


Eugeni,

Both drlvm and classlib hythread work this way.
This is the original hythread design, which was implemented in drlvm
for compatibility reasons.

Thanks
Artem



On 11/10/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

Hi,

While investigating deadlock scenario which is described in
HARMONY-2006 I found out one interesting thing. It turned out that DRL
implementation of hythread_monitor_init /
hythread_monitor_init_with_name initializes and acquires a monitor.
Original spec reads: "Acquire and initialize a new monitor from the
threading library" AFAIU that doesn't mean to lock the monitor but
get it from the threading library. So the hythread_monitor_init should
not lock the monitor.

Could somebody comment on that?

Thanks
Evgueni



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-11 Thread Alexei Fedotov

All,

Evgueni's patch is a step in the right direction. Considering
pthread_mutex_init as a conventional example, monitor shouldn't be
locked at _init function. Test errors on Linux can just tell us that
there are more places that rely on the incorrect contract of the
function.

-- Alexei

On 11/11/06, Evgueni Brevnov <[EMAIL PROTECTED]> wrote:

hmmm strange. The patch was tested on multi-processor system
running SUSE9. I will check if the patch misses something. Anyway, we
need to wait with the patch submission until we 100% sure how
hythread_monitor_init should behave.

Thanks
Evgueni

On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
> On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> > Hi,
> >
> > While investigating deadlock scenario which is described in
> > HARMONY-2006 I found out one interesting thing. It turned out that DRL
> > implementation of hythread_monitor_init /
> > hythread_monitor_init_with_name initializes and acquires a monitor.
> > Original spec reads: "Acquire and initialize a new monitor from the
> > threading library" AFAIU that doesn't mean to lock the monitor but
> > get it from the threading library. So the hythread_monitor_init should
> > not lock the monitor.
> >
> > Could somebody comment on that?
>
> It might be that semantic is different on different platforms which is
> probably even worse. Your patch in HARMONY-2149 breaks nearly all of
> acceptance tests on Linux while everything on Windows works (ok I tested on
> laptop with 1 processor while Linux was a HT server, sometimes it is
> important for threading).
>
> I think we need more investigation on whether or not the monitor has to be
> locked in init.
>
> --
> Gregory Shimansky, Intel Middleware Products Division
>




--
Thank you,
Alexei


Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-11 Thread Evgueni Brevnov

hmmm strange. The patch was tested on multi-processor system
running SUSE9. I will check if the patch misses something. Anyway, we
need to wait with the patch submission until we 100% sure how
hythread_monitor_init should behave.

Thanks
Evgueni

On 11/11/06, Gregory Shimansky <[EMAIL PROTECTED]> wrote:

On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?

It might be that semantic is different on different platforms which is
probably even worse. Your patch in HARMONY-2149 breaks nearly all of
acceptance tests on Linux while everything on Windows works (ok I tested on
laptop with 1 processor while Linux was a HT server, sometimes it is
important for threading).

I think we need more investigation on whether or not the monitor has to be
locked in init.

--
Gregory Shimansky, Intel Middleware Products Division



Re: [drlvm][threading] Should hythread_monitor_init() aquire the monitor?

2006-11-10 Thread Gregory Shimansky
On Friday 10 November 2006 17:45 Evgueni Brevnov wrote:
> Hi,
>
> While investigating deadlock scenario which is described in
> HARMONY-2006 I found out one interesting thing. It turned out that DRL
> implementation of hythread_monitor_init /
> hythread_monitor_init_with_name initializes and acquires a monitor.
> Original spec reads: "Acquire and initialize a new monitor from the
> threading library" AFAIU that doesn't mean to lock the monitor but
> get it from the threading library. So the hythread_monitor_init should
> not lock the monitor.
>
> Could somebody comment on that?

It might be that semantic is different on different platforms which is 
probably even worse. Your patch in HARMONY-2149 breaks nearly all of 
acceptance tests on Linux while everything on Windows works (ok I tested on 
laptop with 1 processor while Linux was a HT server, sometimes it is 
important for threading).

I think we need more investigation on whether or not the monitor has to be 
locked in init.

-- 
Gregory Shimansky, Intel Middleware Products Division