Igniters,

I'd like to start a discussion about a new storage format for Ignite. Our
current approach is so-called *heap-organized* storage with a secondary index
per partition. It has a number of drawbacks:
1) Slow scans (joins, OLAP workloads) - data is written in arbitrary order,
so iteration over the base index leads to multiple page reads and page locks
2) Slow writes in OLTP workloads - every update touches multiple
index and free-list pages (a kind of write amplification)
3) Duplicated PK index when SQL is enabled - our base index cannot be used
for lookups or range scans. This makes write amplification effects even
worse.

Many mature RDBMSs employ an alternative format by default -
*index-organized* storage. In this case the primary index leaf pages are the
data pages. Rows are sorted inside data pages. This gives:
- Blazingly fast scans (no dereference, fewer page reads, fewer evictions,
fewer locks)
- Fast writes in OLTP workloads when the PK index column (e.g. ID) grows
monotonically (you need to *update only one page* if there are no splits)
- Slower random writes than with heap storage, due to index fragmentation
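
To make the scan argument a bit more concrete, here is a toy model in Java.
It is only a sketch, not Ignite code: the class and method names are made up
for illustration, rows are modelled as bare integer keys, and a "page visit"
is counted every time a key-ordered scan has to switch to a different data
page. It ignores index inner pages, caching and locking entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Toy model (hypothetical, not Ignite code): data page visits for a key-ordered scan. */
final class StorageLayoutSketch {
    /**
     * Index-organized: rows live in the leaf pages in key order,
     * so a full scan visits each leaf page exactly once.
     */
    static int indexOrganizedScanPageVisits(int rowCount, int rowsPerPage) {
        return (rowCount + rowsPerPage - 1) / rowsPerPage; // ceil(rowCount / rowsPerPage)
    }

    /**
     * Heap-organized: rows land in data pages in arbitrary (here: shuffled) order,
     * and the index only maps key -> data page. A key-ordered scan dereferences
     * every entry and may jump to a different data page on each row.
     */
    static int heapScanPageVisits(int rowCount, int rowsPerPage, long seed) {
        List<Integer> insertOrder = new ArrayList<>();
        for (int key = 0; key < rowCount; key++)
            insertOrder.add(key);
        Collections.shuffle(insertOrder, new Random(seed)); // assumption: arbitrary write order

        // The row written into slot i ends up in data page i / rowsPerPage.
        int[] pageOfKey = new int[rowCount];
        for (int slot = 0; slot < rowCount; slot++)
            pageOfKey[insertOrder.get(slot)] = slot / rowsPerPage;

        // Walk the index in key order and count switches between data pages.
        int visits = 0;
        int currentPage = -1;
        for (int key = 0; key < rowCount; key++) {
            if (pageOfKey[key] != currentPage) {
                currentPage = pageOfKey[key];
                visits++;
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        int rows = 1000, rowsPerPage = 10;
        System.out.println("index-organized: " + indexOrganizedScanPageVisits(rows, rowsPerPage));
        System.out.println("heap-organized:  " + heapScanPageVisits(rows, rowsPerPage, 42L));
    }
}
```

In this model the index-organized scan always visits ceil(rows / rowsPerPage)
pages, while the heap scan can never do better than that and, with a shuffled
write order, will typically land close to one page visit per row.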

I propose to adopt this approach in two phases:
1) Optionally add data to leaf pages [1]. This should improve our ScanQuery
dramatically
2) Optionally have a single primary index instead of per-partition indexes [2].
This should improve our updates and SQL scans at the cost of harder
rebalance and recovery.

Thoughts?

[1] https://issues.apache.org/jira/browse/IGNITE-7026
[2] https://issues.apache.org/jira/browse/IGNITE-7027
