[jira] [Updated] (IGNITE-6089) SQL: Improve query parallelism architecture

2018-03-21 Thread Vladimir Ozerov (JIRA)

 [ 
https://issues.apache.org/jira/browse/IGNITE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Ozerov updated IGNITE-6089:

Labels: performance sql-engine  (was: performance)

> SQL: Improve query parallelism architecture
> ---
>
>                 Key: IGNITE-6089
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6089
>             Project: Ignite
>          Issue Type: Task
>          Components: sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>            Priority: Major
>              Labels: performance, sql-engine
>
> Currently, query parallelism is implemented as a static split of all indexes 
> (including the PK) for a cache. This approach has several major disadvantages:
> 1) It improves scans, but slows down index and range lookups
> 2) Tables with different DOP (degree of parallelism) cannot be used in the 
> same query
> We need to fix that. Proposed plan:
> 1) No more index splits, ever - there is always one and only one index
> 2) Use the preliminary execution plan, statistics (IGNITE-6079), CPU core 
> count and CPU load to estimate whether the query will benefit from parallelism.
> 3) If yes - split the node's single map query into several independent pieces.
> Splitting can be achieved in one of the following ways:
> 1) Partition-based: e.g. if a node owns partitions A, B, C and D, then we can 
> split the query in two - one over (A, B), another over (C, D). This could be 
> useful for pure scans (e.g. DWH)
> 2) Histogram-based: e.g. if we have a query {{SELECT ... WHERE salary > 50}}, 
> and we know the salary distribution, we can split it into {{WHERE salary > 50 
> AND salary <= 200}} and {{WHERE salary > 200}}
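
The two splitting strategies described in the issue can be illustrated with a small sketch. This is not Ignite code: the function names split_partitions and split_range, and the idea of passing histogram cut points as a plain list, are assumptions made only for illustration. The sketch shows how one node-local map query could be divided into independent pieces, either by grouping the partitions the node owns or by turning one open range predicate into disjoint ranges.

```python
from typing import List, Sequence


def split_partitions(partitions: Sequence[str], pieces: int) -> List[List[str]]:
    """Partition-based split (hypothetical helper).

    Divides the partitions owned by a node into `pieces` contiguous groups
    whose sizes differ by at most one. Each group would back one sub-query.
    """
    if pieces < 1:
        raise ValueError("pieces must be >= 1")
    # Never create more groups than there are partitions.
    pieces = min(pieces, len(partitions)) or 1
    base, extra = divmod(len(partitions), pieces)
    groups, start = [], 0
    for i in range(pieces):
        size = base + (1 if i < extra else 0)
        groups.append(list(partitions[start:start + size]))
        start += size
    return groups


def split_range(column: str, lower: int, boundaries: Sequence[int]) -> List[str]:
    """Histogram-based split (hypothetical helper).

    Rewrites the predicate `column > lower` as disjoint range predicates.
    `boundaries` stands in for cut points taken from the column histogram;
    how such cut points would be chosen is outside this sketch.
    """
    cuts = sorted(b for b in set(boundaries) if b > lower)
    predicates, low = [], lower
    for cut in cuts:
        predicates.append(f"{column} > {low} AND {column} <= {cut}")
        low = cut
    # The last piece stays open-ended, like the original predicate.
    predicates.append(f"{column} > {low}")
    return predicates


if __name__ == "__main__":
    print(split_partitions(["A", "B", "C", "D"], 2))
    print(split_range("salary", 50, [200]))
```

With the inputs from the issue text, the first call groups the partitions as (A, B) and (C, D), and the second yields the predicates salary > 50 AND salary <= 200 and salary > 200. Together the pieces cover exactly the rows of the original query, with no overlap, so their results can be merged without deduplication.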



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-6089) SQL: Improve query parallelism architecture

2017-08-16 Thread Vladimir Ozerov (JIRA)

 [ 
https://issues.apache.org/jira/browse/IGNITE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Ozerov updated IGNITE-6089:

Labels: performance  (was: )

> SQL: Improve query parallelism architecture
> ---
>
>                 Key: IGNITE-6089
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6089
>             Project: Ignite
>          Issue Type: Task
>          Components: sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>              Labels: performance


