Vladimir, my comments are inline...

On Sat, May 5, 2018 at 6:12 AM, Vladimir Ozerov <voze...@gridgain.com> wrote:

> In general I do not support this initiative. There are two serious reasons
> for that:
> 1) Our indexes are slow on updates due to architectural flaws. First, every
> index entry must be of fixed size. For this reason we cannot inline full
> values in the general case and suffer from data page lookups [1]. Second,
> final comparisons always compare primary keys, so another lookup is
> needed [2]. Third, our indexes are fat because we are lacking prefix
> compression [3].
>

These all seem like great optimizations and we should definitely do them. However, I am of the strong opinion that even after these optimizations, data ingestion will be much slower with persistence turned on. Am I wrong?

> 2) Some vendors do have memory-only indexes - SQL Server, Couchbase,
> MemSQL, to name a few. But they are memory optimized - no pages, no BTrees.
> A lock-free skiplist is used instead. This is the correct design, which is
> really fast. But we are very far from it at the moment.
>

I have not heard complaints about our BTree indexes being slow in memory. I only hear complaints about the slowdowns whenever persistence is turned on and users are ingesting large amounts of data.

> Taking this into account, I would not consider memory-only BTree indexes in
> the nearest future. Instead, we should focus on performance. When the
> mentioned things are fixed/implemented, our indexes will be both
> memory-efficient and very fast to update.
>

I would agree with you only if there were no performance boost in the short term. So far, disabling persistence for indexes seems like a very simple change that could yield a significant performance boost.

>
> [1] https://issues.apache.org/jira/browse/IGNITE-8385
> [2] https://issues.apache.org/jira/browse/IGNITE-8384
> [3] https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite#IEP-20:DataCompressioninIgnite-IndexPrefixCompression
>
> Sat, May 5, 2018 at 3:46, Dmitriy Setrakyan <dsetrak...@apache.org>:
>
> > Igniters,
> >
> > One of the main complaints I hear from users is that whenever
> > persistence is turned on, index creation can really slow down
> > performance, because of massive amounts of writes to disk. The reason
> > Ignite is writing indexes to disk is to support fast restarts - nothing
> > needs to be rebuilt on startup, and Ignite can become operational right
> > away.
> >
> > However, as far as I can tell, most users care about faster operations
> > after the system is started and much less about the startup speed. What
> > if we added a mode where we do not persist indexes at all? This way data
> > ingestion and overall throughput will significantly increase (of course,
> > at the cost of startup time getting longer because we have to rebuild
> > the indexes).
> >
> > There are 2 ways to achieve this in Ignite. The simplest way is to not
> > mark index pages dirty in memory, so they will never participate in the
> > checkpointing process. We also have to make sure that index pages never
> > get evicted from memory. This can be done fairly quickly. The
> > disadvantage of this approach is that if indexes fill up most of the
> > memory, then it will be very difficult to find a page to evict, which
> > may hurt performance.
> >
> > The other way is to have a separate in-memory off-heap region for
> > indexes. This region should never be persisted. It may be a somewhat
> > bigger refactoring, as we currently do not separate index and data
> > pages. However, the advantage of this approach is that this region can
> > be flushed to disk practically as is during a graceful shutdown of the
> > node, and hence shorten the restart time.
> >
> > I think we should start with the 1st approach and then think about the
> > 2nd one. What do you think?
> >
> > D.
> >
>