I am testing an aspect of a POC I have written in C# to determine how well
it performs when processing multiple tasks at one time. I’m using Ignite
2.2.



Briefly, a client node sends a request to a server node for a list of
items. Each of the items requested is sent back to a listener in the client
(using Ignite messaging) and then processed (rendered onto a bitmap). After
sending the request, the client enters a wait state that completes once all
parts of the request have arrived and been rendered. The wait state will
terminate after 2 minutes, which short-circuits the request in the client node.
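For reference, the wait state follows the usual block-with-timeout pattern: count down as parts arrive, and give up after two minutes. A minimal sketch (the type and member names here are illustrative, not the actual POC code, and it assumes the expected part count is known up front):

```csharp
using System;
using System.Threading;

class RenderWaiter
{
    private readonly CountdownEvent _partsRemaining;

    public RenderWaiter(int expectedParts)
    {
        _partsRemaining = new CountdownEvent(expectedParts);
    }

    // Called by the Ignite message listener each time a rendered part arrives.
    public void PartArrived() => _partsRemaining.Signal();

    // Blocks until every part has arrived, or gives up after two minutes.
    // Returns false when the request was short-circuited by the timeout.
    public bool WaitForCompletion() =>
        _partsRemaining.Wait(TimeSpan.FromMinutes(2));
}
```

The stalled requests described below are the case where WaitForCompletion returns false.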



This works well from a functional perspective, but I ran into problems with
multithreading.



To test multi-threading, I essentially did this:



    int nThreads = 4;
    Parallel.For(0, nThreads, x => PerformRender(<some arguments>));



This works well for values of nThreads up to 7. However, once nThreads is
set to more than 7, the requests start stalling, and no progress is made on
the first eight requests until the wait states start timing out at two
minutes.



Looking in the log, it is common to see entries like this while the
requests are stalled:



    WARN  2017-10-04 13:41:57,051 252758ms IgniteKernal%Raptor ? - Possible
    thread pool starvation detected (no task completed in last 30000ms, is
    public thread pool size large enough?)

Here an internal monitor is warning of possible thread pool starvation: no
task in the public pool has completed in the last 30 seconds, meaning no
progress is being made on the requests.



    INFO  2017-10-04 15:47:21,182 263409ms IgniteKernal%Raptor ? -

Metrics for local node (to disable set 'metricsLogFrequency' to 0)

    ^-- Node [id=c966b7cc, name=Raptor, uptime=00:04:10:105]

    ^-- H/N/C [hosts=1, nodes=7, CPUs=8]

    ^-- CPU [cur=0.07%, avg=1.76%, GC=0%]

    ^-- PageMemory [pages=0]

    ^-- Heap [used=103MB, free=88.58%, comm=497MB]

    ^-- Non heap [used=38MB, free=-1%, comm=39MB]

    ^-- Public thread pool [active=8, idle=0, qSize=274]

    ^-- System thread pool [active=0, idle=0, qSize=0]

    ^-- Outbound messages queue [size=0]

Here we see the public thread pool is at its default size of 8, with all 8
threads active, none idle, and 274 tasks queued. This suggests that once
the pool becomes saturated with requests, something stalls them.



I tried increasing the number of threads in the public pool in the grid
configuration, and this eliminated the stalls, which suggests Ignite does
not cope well with a fully committed public thread pool.
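For the record, the change amounts to a one-line configuration tweak. A sketch of what I did, assuming the Ignite.NET `IgniteConfiguration.PublicThreadPoolSize` property (the value 32 is arbitrary; pick something comfortably above the peak number of concurrent requests):

```csharp
using Apache.Ignite.Core;

var cfg = new IgniteConfiguration
{
    // The default public pool size tracks the CPU count (8 on this box).
    // Raising it leaves headroom so concurrent render requests cannot
    // fully occupy the pool and starve message processing.
    PublicThreadPoolSize = 32
};

using (var ignite = Ignition.Start(cfg))
{
    // ... submit render requests as before ...
}
```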



Is this a known issue with Ignite?



Thanks,

Raymond.
