[ https://issues.apache.org/jira/browse/IGNITE-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Ozerov updated IGNITE-6057:
------------------------------------
    Labels: performance  (was: iep-1 performance)

> SQL: Full scan should be performed through data pages bypassing primary index
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-6057
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6057
>             Project: Ignite
>          Issue Type: Task
>          Components: persistence, sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>              Labels: performance
>
> Currently both SQL full scan and {{CREATE INDEX}} commands iterate through
> the primary index to get all existing values. Suppose we have 10 entries
> per data page on average. In this case we will have to read the same data
> page 10 times, because its keys are reached in different parts of the index
> tree. This could be very inefficient on certain workloads.
>
> We should iterate over data pages directly instead. This way a page with 10
> entries will be accessed only once. However, we should take cache groups
> into account: if there are too many entries from other logical caches, this
> approach could make the situation even worse, unless we have a mechanism to
> skip unnecessary entries (or whole pages!) efficiently.
>
> We should probably develop a cost-based model that takes the following
> statistics into account:
> 1) Average entry size. The larger the entry, the smaller the benefit,
> especially if overflow pages are used frequently.
> 2) Cache groups. Ideally, we should estimate the number of entries from all
> logical caches. The more entries from other caches, the smaller the benefit.
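
The cost comparison described in the issue can be illustrated with a rough sketch. This is not Ignite code: the class name, method names, and the model itself are hypothetical. It assumes the index scan reads one data page per entry in the worst case (as in the "10 entries per page, 10 reads" example), that a direct scan reads every data page of the cache group exactly once, and it ignores overflow pages and the cost of the index pages themselves.

```java
// Hypothetical sketch of the cost model discussed in the issue; not part of Ignite.
final class ScanCostSketch {
    /** Worst-case data page reads when walking the primary index: one read per entry. */
    public static long indexScanReads(long targetEntries) {
        return targetEntries;
    }

    /**
     * Data page reads when scanning pages directly. Every page of the cache group
     * is read once, including pages filled with entries from other logical caches.
     */
    public static long pageScanReads(long targetEntries, long otherCacheEntries, double avgEntriesPerPage) {
        long totalEntries = targetEntries + otherCacheEntries;
        return (long) Math.ceil(totalEntries / avgEntriesPerPage);
    }

    /** True when the direct data page scan is expected to read fewer pages. */
    public static boolean preferPageScan(long targetEntries, long otherCacheEntries, double avgEntriesPerPage) {
        return pageScanReads(targetEntries, otherCacheEntries, avgEntriesPerPage) < indexScanReads(targetEntries);
    }

    public static void main(String[] args) {
        // Dedicated cache group, 10 entries per page: 100000 reads vs 1000000.
        System.out.println(preferPageScan(1_000_000, 0, 10.0));          // true
        // Shared cache group dominated by other caches: 2100000 reads vs 1000000.
        System.out.println(preferPageScan(1_000_000, 20_000_000, 10.0)); // false
        // Large entries, about one per page: no benefit over the index scan.
        System.out.println(preferPageScan(1_000_000, 0, 1.0));           // false
    }
}
```

Under these assumptions the two statistics from the issue map directly onto the inputs: a larger average entry size lowers avgEntriesPerPage, and a more crowded cache group raises otherCacheEntries. Both push the decision back toward the index scan.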