[ 
https://issues.apache.org/jira/browse/IGNITE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616263#comment-15616263
 ] 

Andrew Mashenkov commented on IGNITE-4106:
------------------------------------------

I've implemented 2 prototypes. In both I try to speed up the SQL query Map 
phase with a multi-threaded approach.
Compared scenarios: 1 node splitting the query into 4 threads vs. 4 nodes 
without splitting.

1) The first one. MapQuery message processing is split across several threads. 
Each thread runs the same query over a certain subset of the cache's local 
partitions. When all threads have finished, the results are merged and returned 
to the Reducer. This approach shows a significant speedup, but throughput is 
10-15% lower than if we just add more nodes to the grid. The code is far from 
ideal; I believe we can fix this 10-15% slowdown.
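
A minimal sketch of the idea behind the first prototype, not the prototype 
code itself. The class and method names (LocalMapSplit, splitPartitions, 
runMap) and the round-robin grouping are assumptions made for illustration, and 
the query is modeled as a plain function over a list of partition ids rather 
than any Ignite API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: not part of Ignite.
final class LocalMapSplit {
    /** Distributes partition ids round-robin into at most 'parallelism' groups. */
    static List<List<Integer>> splitPartitions(List<Integer> parts, int parallelism) {
        int n = Math.min(parallelism, parts.size());
        List<List<Integer>> groups = new ArrayList<>();
        for (int i = 0; i < n; i++)
            groups.add(new ArrayList<>());
        for (int i = 0; i < parts.size(); i++)
            groups.get(i % n).add(parts.get(i));
        return groups;
    }

    /**
     * Runs the same query once per partition group, each on its own thread,
     * then merges the partial results in group order.
     */
    static <R> List<R> runMap(List<Integer> localParts, int parallelism,
                              Function<List<Integer>, List<R>> queryOverParts) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<List<R>>> futures = new ArrayList<>();
            for (List<Integer> group : splitPartitions(localParts, parallelism)) {
                Callable<List<R>> task = () -> queryOverParts.apply(group);
                futures.add(pool.submit(task));
            }
            // Wait for every thread, then merge before replying to the reducer.
            List<R> merged = new ArrayList<>();
            for (Future<List<R>> f : futures)
                merged.addAll(f.get());
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Map query failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the merge step waits for the slowest thread, so unevenly sized 
partition groups would directly limit the speedup.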

2) The second. I split queries by sending more Map query messages from the 
query initiator node. Each message specifies a subset of the target node's 
primary partitions, so remote nodes process these messages in parallel. This 
approach gives worse results: throughput is 50% lower than if we just add more 
nodes to the grid.
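
A rough sketch of how the initiator side of the second prototype could build 
its messages, under stated assumptions: the MapRequest type, the buildRequests 
method, and the contiguous chunking are hypothetical and are not taken from 
the prototype or from Ignite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of Ignite.
final class InitiatorSplit {
    /** One Map query message: a target node plus the partitions it should scan. */
    static final class MapRequest {
        final String nodeId;
        final List<Integer> partitions;

        MapRequest(String nodeId, List<Integer> partitions) {
            this.nodeId = nodeId;
            this.partitions = partitions;
        }
    }

    /**
     * Builds up to 'splitsPerNode' requests for each node, each covering a
     * contiguous chunk of that node's primary partitions.
     */
    static List<MapRequest> buildRequests(Map<String, List<Integer>> primaryPartsByNode,
                                          int splitsPerNode) {
        List<MapRequest> requests = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : primaryPartsByNode.entrySet()) {
            List<Integer> parts = e.getValue();
            int size = parts.size();
            int n = Math.min(splitsPerNode, size);
            for (int i = 0; i < n; i++) {
                int from = i * size / n;
                int to = (i + 1) * size / n;
                // Copy the chunk so each request owns its partition list.
                requests.add(new MapRequest(e.getKey(), new ArrayList<>(parts.subList(from, to))));
            }
        }
        return requests;
    }
}
```

With this layout the reducer receives one reply per request instead of one per 
node, so the number of messages grows with the split factor.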

> SQL: parallelize sql queries over cache local partitions
> --------------------------------------------------------
>
>                 Key: IGNITE-4106
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4106
>             Project: Ignite
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6, 1.7
>            Reporter: Andrew Mashenkov
>            Assignee: Andrew Mashenkov
>              Labels: performance
>
> If we run an SQL query on a cache partitioned over several cluster nodes, it 
> will be split into several queries running in parallel. But in practice we 
> will have only one thread per query on each node.
> So, for now, to improve SQL query performance we need to run more Ignite 
> instances or split caches manually.
> It seems better to split local SQL queries over cache partitions, so that we 
> can parallelize an SQL query on every single node and utilize the CPU more 
> efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
