[jira] [Commented] (IGNITE-4395) Implement communication backpressure per policy - SYSTEM or PUBLIC

2017-02-06 Thread Dmitry Karachentsev (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853652#comment-15853652
 ] 

Dmitry Karachentsev commented on IGNITE-4395:
-

[review|http://reviews.ignite.apache.org/ignite/review/IGNT-CR-89] 
[PR#1495|https://github.com/apache/ignite/pull/1495]


> Implement communication backpressure per policy - SYSTEM or PUBLIC
> --
>
> Key: IGNITE-4395
> URL: https://issues.apache.org/jira/browse/IGNITE-4395
> Project: Ignite
> Issue Type: Improvement
> Components: cache, compute
> Affects Versions: 1.7
> Reporter: Dmitry Karachentsev
> Assignee: Dmitry Karachentsev
> Fix For: 1.9
>
>
> 1) Start two data nodes with some cache.
> 2) From one node, asynchronously submit a large number of jobs to the other.
> Those jobs perform some cache operations.
> 3) The grid hangs almost immediately: all threads are sleeping except the
> public ones, which are waiting for responses.
> This happens because all cache and job messages are queued on communication
> and limited by the default number (1024). It looks like the jobs are waiting
> for cache responses that cannot be received because of this limit.
> The proper solution is to have communication backpressure per policy
> (SYSTEM or PUBLIC) rather than at a single point as it is now. This could be
> achieved by having two queues per communication session or (which looks a bit
> easier to implement) by having separate connections.
> [PR#1331|https://github.com/apache/ignite/pull/1331] contains a test that
> leads to a grid hang.
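
The per-policy idea from the description can be illustrated with a minimal sketch. This is not Ignite code and not the change in the linked PR: the class PerPolicyBackpressure and its methods tryEnqueue and acknowledge are hypothetical names, and a plain counter of unacknowledged messages stands in for the real communication queue. It only shows that when each policy has its own limit, exhausting the PUBLIC limit no longer stops SYSTEM messages.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch: one unacknowledged-message limit per policy instead of a single shared one. */
public class PerPolicyBackpressure {
    /** The two policies named in the issue. */
    public enum Policy { SYSTEM, PUBLIC }

    /** One independent permit pool per policy (assumption: a simple counter models the queue limit). */
    private final Map<Policy, Semaphore> permits = new EnumMap<>(Policy.class);

    public PerPolicyBackpressure(int limitPerPolicy) {
        for (Policy p : Policy.values())
            permits.put(p, new Semaphore(limitPerPolicy));
    }

    /** Returns false when this policy's own limit is reached; other policies are unaffected. */
    public boolean tryEnqueue(Policy policy) {
        return permits.get(policy).tryAcquire();
    }

    /** Called when a message of the given policy has been acknowledged, freeing one slot. */
    public void acknowledge(Policy policy) {
        permits.get(policy).release();
    }

    public static void main(String[] args) {
        PerPolicyBackpressure bp = new PerPolicyBackpressure(2);

        // Job messages fill the PUBLIC limit.
        bp.tryEnqueue(Policy.PUBLIC);
        bp.tryEnqueue(Policy.PUBLIC);

        // A third job message is rejected, but a cache (SYSTEM) message still gets through.
        System.out.println("PUBLIC accepted: " + bp.tryEnqueue(Policy.PUBLIC)); // false
        System.out.println("SYSTEM accepted: " + bp.tryEnqueue(Policy.SYSTEM)); // true
    }
}
```

With a single shared limit, the last call would also be rejected, which corresponds to the hang described above: jobs hold all the slots while waiting for cache responses that cannot be queued.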



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (IGNITE-4395) Implement communication backpressure per policy - SYSTEM or PUBLIC

2016-12-27 Thread Dmitry Karachentsev (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15780178#comment-15780178
 ] 

Dmitry Karachentsev commented on IGNITE-4395:
-

Should be implemented after this improvement 
[IGNITE-3220|https://issues.apache.org/jira/browse/IGNITE-3220]

> Implement communication backpressure per policy - SYSTEM or PUBLIC
> --
>
> Key: IGNITE-4395
> URL: https://issues.apache.org/jira/browse/IGNITE-4395
> Project: Ignite
> Issue Type: Improvement
> Components: cache, compute
> Affects Versions: 1.7
> Reporter: Dmitry Karachentsev
> Assignee: Dmitry Karachentsev
> Fix For: 2.0
>
>
> 1) Start two data nodes with some cache.
> 2) From one node, asynchronously submit a large number of jobs to the other.
> Those jobs perform some cache operations.
> 3) The grid hangs almost immediately: all threads are sleeping except the
> public ones, which are waiting for responses.
> This happens because all cache and job messages are queued on communication
> and limited by the default number (1024). It looks like the jobs are waiting
> for cache responses that cannot be received because of this limit.
> The proper solution is to have communication backpressure per policy
> (SYSTEM or PUBLIC) rather than at a single point as it is now. This could be
> achieved by having two queues per communication session or (which looks a bit
> easier to implement) by having separate connections.
> [PR#1331|https://github.com/apache/ignite/pull/1331] contains a test that
> leads to a grid hang.


