[jira] [Commented] (HBASE-22867) The ForkJoinPool in CleanerChore will spawn thousands of threads in our cluster with thousands table

Duo Zhang (JIRA) Fri, 16 Aug 2019 06:42:32 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-22867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909059#comment-16909059
 ]


Duo Zhang commented on HBASE-22867:
-----------------------------------

OK, in both fork and submit, ForkJoinPool will try to create new workers if 
too few workers are active. The code is very 'Doug Lea', so it is not easy to 
fully understand, but at least the comments say this...

So I do not think we should use ForkJoinPool here. This is not an in-memory 
computation: some tasks may be pending for a long time, and while they wait 
the pool keeps adding workers, which introduces lots of threads...
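
To illustrate the alternative, here is a minimal sketch (not the CleanerChore 
code and not necessarily the fix for this issue) of walking a tree on a 
fixed-size pool and composing results with CompletableFuture, so no worker 
ever blocks waiting on a child task. The names BoundedTreeWalk, Dir, list and 
countAsync are hypothetical, and the sleep stands in for a slow file system 
call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BoundedTreeWalk {

  /** Hypothetical stand-in for a directory on a slow file system. */
  static final class Dir {
    final String name;
    final List<Dir> children = new ArrayList<>();

    Dir(String name) {
      this.name = name;
    }

    Dir add(Dir child) {
      children.add(child);
      return this;
    }
  }

  /** Hypothetical stand-in for a slow, blocking listing call. */
  static List<Dir> list(Dir dir) {
    try {
      Thread.sleep(5);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return dir.children;
  }

  /**
   * Counts the directories under dir, including dir itself. Each listing runs
   * on the pool; the results are combined by callbacks, so a worker never
   * joins on a child task and the pool has no reason to grow.
   */
  static CompletableFuture<Integer> countAsync(Dir dir, ExecutorService pool) {
    return CompletableFuture.supplyAsync(() -> list(dir), pool)
        .thenCompose(children -> {
          CompletableFuture<Integer> total = CompletableFuture.completedFuture(1);
          for (Dir child : children) {
            total = total.thenCombine(countAsync(child, pool), Integer::sum);
          }
          return total;
        });
  }

  public static void main(String[] args) {
    // The thread count is capped at 2 no matter how large the tree is.
    ExecutorService pool = Executors.newFixedThreadPool(2);
    Dir root = new Dir("root")
        .add(new Dir("a").add(new Dir("a1")))
        .add(new Dir("b"));
    // Only the calling thread blocks here, not a pool worker.
    System.out.println(countAsync(root, pool).join()); // prints 4
    pool.shutdown();
  }
}
```

The trade-off is that pending work sits in the executor's queue instead of 
turning into extra threads, so a slow file system delays the cleanup rather 
than inflating the thread count.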

> The ForkJoinPool in CleanerChore will spawn thousands of threads in our 
> cluster with thousands table
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-22867
>                 URL: https://issues.apache.org/jira/browse/HBASE-22867
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zheng Hu
>            Priority: Critical
>         Attachments: 31162.stack.1
>
>
> The thousands of spawned threads make the safepoint cost 80+s in our Master 
> JVM process.
> {code}
> 2019-08-15,19:35:35,861 INFO [main-SendThread(zjy-hadoop-prc-zk02.bj:11000)] 
> org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard 
> from server in 82260ms for sessionid 0x1691332e2d3aae5, closing socket 
> connection and attempting reconnect
> {code}
> The stdout from the JVM (it shows 9126 threads and a sync cost of 80+s):
> {code}
> vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
> 32358.859: ForceAsyncSafepoint              [    9126         67            474    ]      [     1    28 86596    87   101    ]  0
> {code}
> Also we got the jstack: 
> {code}
> $ cat 31162.stack.1  | grep 'ForkJoinPool-1-worker' | wc -l
> 8648
> {code}
> It's a dangerous bug, so I am marking it as a blocker.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
