Vladimir, my comments are inline...

On Sat, May 5, 2018 at 6:12 AM, Vladimir Ozerov <voze...@gridgain.com>
wrote:

> In general I do not support this initiative. There are two serious reasons
> for that:
> 1) Our indexes are slow on updates due to architectural flaws. First, every
> index entry must be of fixed size. For this reason we cannot inline full
> values in the general case and suffer from data page lookups [1]. Second,
> final comparisons always compare primary keys, so another lookup is needed
> [2]. Third, our indexes are fat because we lack prefix compression [3].
>

These all seem like great optimizations, and we should definitely do them.
However, I am of the strong opinion that even after these optimizations,
data ingestion will still be much slower with persistence turned on. Am I
wrong?


> 2) Some vendors do have memory-only indexes - SQL Server, Couchbase,
> MemSQL, to name a few. But they are memory optimized - no pages, no BTrees.
> Lock-free skiplist is used instead. This is the correct design, which is
> really fast. But we are very far from it at the moment.
>

I have not heard complaints about our BTree indexes being slow in memory. I
only hear complaints about the slow-downs whenever the persistence is
turned on and users are ingesting large amounts of data.
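
For context, the lock-free skiplist design mentioned above can be tried out
in plain Java with the JDK's ConcurrentSkipListMap. The sketch below is only
an illustration of a memory-only ordered index; the class and method names
are hypothetical, the index is assumed to be unique for simplicity, and it
says nothing about how the vendors listed above or Ignite actually implement
their indexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical memory-only index: indexed column value -> primary key. */
final class SkipListIndexSketch {
    // The JDK skip list map keeps keys sorted and supports concurrent updates.
    private final ConcurrentSkipListMap<Integer, Long> idx = new ConcurrentSkipListMap<>();

    /** Adds or replaces the entry for an indexed value (unique index assumed). */
    void put(int value, long primaryKey) {
        idx.put(value, primaryKey);
    }

    /** Returns primary keys for values in [fromInclusive, toExclusive), in value order. */
    List<Long> range(int fromInclusive, int toExclusive) {
        return new ArrayList<>(idx.subMap(fromInclusive, toExclusive).values());
    }
}
```

There are no pages and no dirty tracking here, which is the point of that
design, but it also means the whole structure has to be rebuilt from the data
on restart.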


> Taking this into account, I would not consider memory-only BTree indexes
> in the near future. Instead, we should focus on performance. When the
> things mentioned above are fixed/implemented, our indexes will be both
> memory-efficient and very fast to update.
>

I would agree with you only if there were no performance boost in the short
term. So far, disabling persistence for indexes seems like a very simple
change that could yield a significant performance boost.
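
To make the idea concrete, here is a toy model of the first approach from my
original proposal (quoted below): index pages are simply never marked dirty,
so a checkpoint never writes them. All names here (ToyPageMemory, PageType,
onPageUpdated, checkpoint) are made up for illustration and are not Ignite
classes or APIs; the real change would live in the page memory and
checkpointing code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: none of these names exist in Ignite. */
final class ToyPageMemory {
    enum PageType { DATA, INDEX }

    private final boolean persistIndexes;
    private final Set<Long> dirtyPages = new HashSet<>();

    ToyPageMemory(boolean persistIndexes) {
        this.persistIndexes = persistIndexes;
    }

    /** Records a page update; index pages are tracked only when they are persisted. */
    void onPageUpdated(long pageId, PageType type) {
        if (type == PageType.INDEX && !persistIndexes)
            return; // never marked dirty, so never part of a checkpoint

        dirtyPages.add(pageId);
    }

    /** Returns the pages a checkpoint would write and clears the dirty set. */
    List<Long> checkpoint() {
        List<Long> written = new ArrayList<>(dirtyPages);
        dirtyPages.clear();
        return written;
    }
}
```

The sketch leaves out the second half of that approach, namely making sure
index pages are never evicted from memory, as well as the index rebuild on
startup.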


>
> [1]
> https://issues.apache.org/jira/browse/IGNITE-8385
> [2]
> https://issues.apache.org/jira/browse/IGNITE-8384
> [3]
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite#IEP-20:DataCompressioninIgnite-IndexPrefixCompression
>
> On Sat, May 5, 2018 at 3:46, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
>
> > Igniters,
> >
> > One of the main complaints I hear from users is that whenever the
> > persistence is turned on, index creation can really slow down the
> > performance, because of massive amounts of writes to disk. The reason
> > Ignite is writing indexes to disk is to support fast restarts - nothing
> > needs to be rebuilt on startup, and Ignite can become operational right
> > away.
> >
> > However, as far as I can tell, most users care about faster operations
> > after the system is started and much less about the startup speed. What
> > if we added a mode where we do not persist indexes at all? This way data
> > ingestion and overall throughput will significantly increase (of course,
> > at the cost of startup time getting longer because we have to rebuild
> > the indexes).
> >
> > There are 2 ways to achieve this in Ignite. The simplest way is to not
> > mark index pages dirty in memory, so they will never participate in the
> > checkpointing process. We also have to make sure that index pages never
> > get evicted from memory. This can be done fairly quickly. The
> > disadvantage of this approach is that if indexes fill up most of the
> > memory, then it will be very difficult to find a page to evict, which
> > may hurt performance.
> >
> > The other way is to have a separate in-memory off-heap region for
> > indexes. This region should never be persisted. It may be a somewhat
> > bigger refactoring, as we currently do not separate index and data
> > pages. However, the advantage of this approach is that this region can
> > be flushed to disk practically as is during a graceful shutdown of the
> > node, and hence shorten the restart time.
> >
> > I think we should start from the 1st approach and then think about the
> > 2nd one. What do you think?
> >
> > D.
> >
>
