Re: RFR: 8231666: ThreadIdTable::grow() invokes invalid thread transition

Daniil Titov Tue, 08 Oct 2019 11:02:27 -0700

Hi David and Robbin,

Thank you for reviewing this fix!


Best regards,
Daniil

On 10/8/19, 1:49 AM, "Robbin Ehn" <robbin....@oracle.com> wrote:

    Great, thanks!
    
    /Robbin
    
    On 2019-10-07 18:41, Daniil Titov wrote:
    > Hi Robbin,
    > 
    > Yes, I ran my benchmark [1]. Please see below the output showing that the 
table was grown.
    > 
    > ../jdk/build/linux-x64-release/images/jdk/bin/java -cp . 
-Xlog:thread+table=info ThreadStartupTest
    > Starting the test
    > [0.185s][info][thread,table] Grown to size:512
    > The test finished.
    > Execution time:15673 ms
    > 
    > 
    > [1] https://cr.openjdk.java.net/~dtitov/tests/ThreadStartupTest.java
    > 
    > Thanks!
    > Daniil
    > 
    > 
    > On 10/7/19, 12:34 AM, "Robbin Ehn" <robbin....@oracle.com> wrote:
    > 
    >      Hi Daniil,
    >      
    >      Yes, good, but:
    >      
    >      >>      >> Testing:  Mach5 tier1, tier2, and tier3 tests 
successfully passed.
    >      >>      And if you have not done so, you should test this with the 
benchmark you
    >      >> have as
    >      >>      a stress test and see that this does what we think.
    >      
    >      Can you please test it with your benchmark, if you have not done so?
    >      
    >      /Robbin
    >      
    >      >>      Thanks, Robbin
    >      >>      >>
    >      >>      >> Thank you!
    >      >>      >>
    >      >>      >> Best regards,
    >      >>      >> Daniil
    >      >>      >>
    >      >>      >> On 10/2/19, 3:26 PM, "David Holmes" 
<david.hol...@oracle.com> wrote:
    >      >>      >>
    >      >>      >>      Hi Daniil,
    >      >>      >>      On 3/10/2019 2:21 am, Daniil Titov wrote:
    >      >>      >>      > Hi David and Robbin,
    >      >>      >>      >
    >      >>      >>      > Could we consider  making the ServiceThread 
responsible for the
    >      >>      >> ThreadIdTable resizing in the similar way how
    >      >>      >>      > it works for  StringTable  and 
ResolvedMethodTable, rather than
    >      >> having
    >      >>      >> ThreadIdTable::add() method calling ThreadIdTable::grow()?
    >      >>      >>      > As I understand It should solve  the current  
issue and
    >      >> address the
    >      >>      >> concern that  the doing the resizing could be a 
relatively long and
    >      >>      >>      > doing it without polling  for safepoints or while 
the holding
    >      >>      >> Threads_lock is not desirable.
    >      >>      >>      I originally rejected copying that part of the code 
from the other
    >      >>      >>      tables as it seems to introduce unnecessary 
complexity. Having a
    >      >>      >>      separate thread trying to grow the table when it 
could be
    >      >> concurrently
    >      >>      >>      having threads added and removed seems like it could 
introduce
    >      >> hard to
    >      >>      >>      diagnose performance pathologies. It also adds what 
we know to be a
    >      >>      >>      potentially long running action to the workload of 
the service
    >      >> thread,
    >      >>      >>      which means it may also impact the other tasks the 
service thread is
    >      >>      >>      doing, thus potentially introducing even more hard 
to diagnose
    >      >>      >>      performance pathologies.
    >      >>      >>      So this change does concern me. But go ahead and 
trial it.
    >      >>      >>      Thanks,
    >      >>      >>      David
    >      >>      >>      > Thank you,
    >      >>      >>      > Daniil
    >      >>      >>      >
    >      >>      >>      >
    >      >>      >>      > On 10/2/19, 6:25 AM, "David Holmes" 
<david.hol...@oracle.com>
    >      >> wrote:
    >      >>      >>      >
    >      >>      >>      >      Hi Robbin,
    >      >>      >>      >
    >      >>      >>      >      On 2/10/2019 7:58 pm, Robbin Ehn wrote:
    >      >>      >>      >      > Hi David,
    >      >>      >>      >      >
    >      >>      >>      >      >> What if the table is full and must be 
grown?
    >      >>      >>      >      >
    >      >>      >>      >      > The table uses chaining, it just means load 
factor tip
    >      >> over what is
    >      >>      >>      >      > considered a good backing array size.
    >      >>      >>      >
    >      >>      >>      >      Coleen raised a good question in a separate 
discussion,
    >      >> which made me
    >      >>      >>      >      realize that once the table has been 
initially populated all
    >      >>      >> subsequent
    >      >>      >>      >      additions, and hence all subsequent calls to 
grow() always
    >      >> happen
    >      >>      >> with
    >      >>      >>      >      the Threads_lock held. So we can't just defer 
the grow().
    >      >>      >>      >
    >      >>      >>      >      >> That aside, I'd like to know how expensive 
it is to
    >      >> grow this
    >      >>      >> table.
    >      >>      >>      >      >> What are we talking about here?
    >      >>      >>      >      >
    >      >>      >>      >      > We use global counter which on 
write_synchronize must
    >      >> scan all
    >      >>      >>      >      > threads to make sure they have seen the 
update (there some
    >      >>      >>      >      > optimization to avoid it if there is no 
readers at all).
    >      >> Since this
    >      >>      >>      >      > table contains the threads, we get double 
penalized, for
    >      >> each new
    >      >>      >>      >      > thread the synchronization cost increase 
AND the number
    >      >> of items.
    >      >>      >>      >      >
    >      >>      >>      >      > With concurrent reads you still need many 
thousands of
    >      >> threads, but
    >      >>      >>      >      > I think I saw someone mentioning 100k 
threads, assuming
    >      >> concurrent
    >      >>      >>      >      > queries the resize can take hundreds of ms 
to finish.
    >      >> Note that
    >      >>      >> reads
    >      >>      >>      >      > and inserts still in operate roughly at the 
same speed
    >      >> while
    >      >>      >>      >      > resizing. So a longer resize is only 
problematic if we
    >      >> do not
    >      >>      >>      >      > respect safepoints.
    >      >>      >>      >      I think if anything were capable of running 
100K threads
    >      >> we would be
    >      >>      >>      >      hitting far worse scalability bottlenecks 
than this. But
    >      >> this does
    >      >>      >> seem
    >      >>      >>      >      problematic.
    >      >>      >>      >
    >      >>      >>      >      Thanks,
    >      >>      >>      >      David
    >      >>      >>      >      -----
    >      >>      >>      >
    >      >>      >>      >      > Thanks, Robbin
    >      >>      >>      >      >
    >      >>      >>      >      >>
    >      >>      >>      >      >> David
    >      >>      >>      >      >>
    >      >>      >>      >      >>> /Robbin
    >      >>      >>      >      >>>
    >      >>      >>      >      >>> On 2019-10-02 08:46, David Holmes wrote:
    >      >>      >>      >      >>>> Hi Daniil,
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>> On 2/10/2019 4:13 pm, Daniil Titov wrote:
    >      >>      >>      >      >>>>> Please review a change that fixes the 
issue. The
    >      >> problem
    >      >>      >> here is
    >      >>      >>      >      >>>>> that that the thread is added to the 
ThreadIdTable
    >      >>      >> (introduced in
    >      >>      >>      >      >>>>> [3]) while the Threads_lock is held by
    >      >>      >>      >      >>>>> JVM_StartThread. When new thread is 
added  to the
    >      >> thread
    >      >>      >> table the
    >      >>      >>      >      >>>>> table checks if its load factor is 
greater than
    >      >> required and
    >      >>      >> if so
    >      >>      >>      >      >>>>> it grows itself while polling for 
safepoints.
    >      >>      >>      >      >>>>> After changes [4]  an attempt to block 
the thread while
    >      >>      >> holding the
    >      >>      >>      >      >>>>> Threads_lock  results in assertion in
    >      >>      >>      >      >>>>> Thread::check_possible_safepoint().
    >      >>      >>      >      >>>>>
    >      >>      >>      >      >>>>> The fix  proposed by David Holmes ( 
thank you,
    >      >> David!)  is
    >      >>      >> to skip
    >      >>      >>      >      >>>>> the ThreadBlockInVM inside 
ThreadIdTable::grow()
    >      >> method if the
    >      >>      >>      >      >>>>> current thread owns the Threads_lock.
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>> Sorry but looking at the fix in context 
now I think
    >      >> it would be
    >      >>      >>      >      >>>> better to do this:
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>>      while (gt.do_task(jt)) {
    >      >>      >>      >      >>>>        if (Threads_lock->owner() == jt) {
    >      >>      >>      >      >>>>          gt.pause(jt);
    >      >>      >>      >      >>>>          ThreadBlockInVM tbivm(jt);
    >      >>      >>      >      >>>>          gt.cont(jt);
    >      >>      >>      >      >>>>        }
    >      >>      >>      >      >>>>      }
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>> This way we don't waste time with the 
pause/cont when
    >      >> there's no
    >      >>      >>      >      >>>> safepoint pause going to happen - and 
the owner()
    >      >> check is
    >      >>      >> quicker
    >      >>      >>      >      >>>> than owned_by_self(). That partially 
addresses a general
    >      >>      >> concern I
    >      >>      >>      >      >>>> have about how long it may take to grow 
the table, as
    >      >> we are
    >      >>      >>      >      >>>> deferring safepoints until it is 
complete in this
    >      >>      >> JVM_StartThread
    >      >>      >>      >      >>>> usecase.
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>> In the test you don't need all of:
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>>    32  * @run clean ThreadStartTest
    >      >>      >>      >      >>>>    33  * @run build ThreadStartTest
    >      >>      >>      >      >>>>    34  * @run main ThreadStartTest
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>> just the last @run suffices to build and 
run the test.
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>> Thanks,
    >      >>      >>      >      >>>> David
    >      >>      >>      >      >>>> -----
    >      >>      >>      >      >>>>
    >      >>      >>      >      >>>>> Testing : Mach 5 tier1 and tier2 
completed
    >      >> successfully,
    >      >>      >> tier3 is
    >      >>      >>      >      >>>>> in progress.
    >      >>      >>      >      >>>>>
    >      >>      >>      >      >>>>> [1] Webrev:
    >      >>      >> http://cr.openjdk.java.net/~dtitov/8231666/webrev.01/
    >      >>      >>      >      >>>>> [2] Bug:
    >      >> https://bugs.openjdk.java.net/browse/JDK-8231666
    >      >>      >>      >      >>>>> [3] 
https://bugs.openjdk.java.net/browse/JDK-8185005
    >      >>      >>      >      >>>>> [4] 
https://bugs.openjdk.java.net/browse/JDK-8184732
    >      >>      >>      >      >>>>>
    >      >>      >>      >      >>>>> Best regards,
    >      >>      >>      >      >>>>> Danill
    >      >>      >>      >      >>>>>
    >      >>      >>      >      >>>>>
    >      >>      >>      >
    >      >>      >>      >
    >      >>      >>      >
    >      >>      >>
    >      >>      >>
    >      >>
    >      >>
    >      
    > 
    >

Re: RFR: 8231666: ThreadIdTable::grow() invokes invalid thread transition

Reply via email to