[ https://issues.apache.org/jira/browse/IGNITE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Ozerov updated IGNITE-6089: ------------------------------------ Labels: performance (was: ) > SQL: Improve query parallelism architecture > ------------------------------------------- > > Key: IGNITE-6089 > URL: https://issues.apache.org/jira/browse/IGNITE-6089 > Project: Ignite > Issue Type: Task > Components: sql > Affects Versions: 2.1 > Reporter: Vladimir Ozerov > Labels: performance > > Currently query parallelism implement with static split of all indexes > (including PK) for cache. This approach has several major disadvantages: > 1) It improves scans, but slows down index and range lookups > 2) Tables with different DOP cannot be used in the same query > We need to fix that. Proposed plan: > 1) No more index splits, ever - there is one and only one index always > 2) Use preliminary execution plan, statistics (IGNITE-6079), CPU cores count > and CPU load to estimate whether query will benefit from parallelism. > 3) if yes - split node-s single map query into several independent pieces. > Splitting can be achieved in one of the following ways: > 1) Partition-based: e.g. if node owns partitions A, B, C and D, then we can > split it to two queries - one over (A, B), another over (C, D). This could be > useful for pure scans (e.g. DWH) > 2) Histogram-based: e.g. if we have a query {{SELECT ... WHERE salary > 50}}, > and we know salary distribution, we can split it into {{WHERE salary > 50 AND > salary <= 200}} and {{WHERE salary > 200}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)