Hi Alex, I believe most of your comments have to do with disk-based functionality, especially in regard to backups, snapshots, etc. However, Ignite is currently an in-memory system, at least for the near future. Let me know if I misunderstood something.
D.

On Tue, Jul 12, 2016 at 9:44 PM, Alexandre Boudnik <alexander.boud...@gmail.com> wrote:

> Dmitriy, thank you for your time and questions, which helped me
> realize what I forgot to mention! See my answers inline; later I'll
> combine everything together to help the next readers :)
>
> I put together some implementation ideas in the Apache Ignite JIRA, as
> promised: https://issues.apache.org/jira/browse/IGNITE-3457. I see
> this facility as another CacheStore implementation, so it wouldn't
> interfere with the base principles of the Ignite platform.
>
> On Mon, Jul 11, 2016 at 1:15 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> > My answers are inline…
> >
> > On Sat, Jul 9, 2016 at 3:04 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >> Thanks Sasha!
> >>
> >> Resending to the dev list.
> >>
> >> D.
> >>
> >> On Fri, Jul 8, 2016 at 2:02 PM, Alexandre Boudnik <alexan...@boudnik.org> wrote:
> >>> Apache Ignite is a great platform, but it lacks certain capabilities
> >>> that are common in the RDBMS world, such as:
> >>>
> >>> - Consistent online backup of the data on the entire cluster (or of
> >>>   a specified set of caches)
> >
> > I think you mean data center replication here. It is not an easy
> > feature to implement, and so far it has been handled by commercial
> > vendors of Ignite, e.g. GridGain.
>
> Actually, no. Here I meant exactly what I said: a full or incremental
> backup of all (or selected) caches in a consistent state, so that it
> can be used to restore them in case of data loss or corruption. One
> important use case is OLAP systems (say, in banking) built on the
> Apache Ignite platform.
>
> And you are right, data center replication can easily be implemented
> on top of log/snapshot shipment.
>
> >>> - Hierarchical snapshots for a specified set of caches
> >
> > What do you mean by hierarchical?
>
> In this particular case the notion of hierarchical snapshots is very
> similar to the one used in SAN appliances, VirtualBox, or VMware.
> Using the concept of snapshots we can do all of these things:
> - full and incremental backup
> - restore
> - rollback to a checkpoint
> - roll forward
> much more easily, with minimal memory and I/O overhead.
>
> >>> - Transaction log
> >
> > Why does Ignite need it for in-memory transactions?
>
> At a minimum it is required for roll-forward functionality, where you
> restore the state of the cache from a checkpoint (the cache state
> before the snapshot was made) and then reapply transactions one by
> one.
>
> >>> - Restore cluster state as of a certain point in time
> >
> > Given that such restorability may introduce lots of memory overhead,
> > does it really make sense for an in-memory cache?
>
> Actually, it will not consume any memory. It will use external
> storage, such as HDD/SSD space, instead. And yes, I think this
> functionality makes complete sense for our users in real life, who
> will love it.
>
> >>> - Rolling forward from a snapshot with the ability to filter/modify
> >>>   transactions
> >
> > Same as above.
>
> Same as above: my customers in the trenches are begging for that
> feature.
>
> >>> - Asynchronous replication, based either on log shipment or
> >>>   snapshot shipment
> >>>   -- Between clusters
> >
> > This is the same as data center replication, no?
> Including, but not limited to: log shipment or snapshot shipment could
> also be used to implement the so-called "better-than-lambda
> architecture" for BI and OLAP, where data is replicated to a queryable
> data source, say Oracle, as soon as it is produced by the OLTP system.
> We can use an RDBMS API such as Oracle Streams (going to be
> discontinued, sadly) or GoldenGate to filter changes from
> logs/snapshots and then apply them. That approach allows us to save
> tons of legacy reports and BI dashboards.
>
> >>>   -- Continuous data export to, say, an RDBMS
> >
> > Don't we already support this with our write-through feature to a
> > database?
>
> When write-through is used for non-local caches, it may cause data
> corruption in the RDBMS; I opened this issue a few weeks ago:
> https://issues.apache.org/jira/browse/IGNITE-3321
>
> >>> It is also a necessity for reducing cold start time for huge
> >>> clusters with strict SLAs.
> >
> > What part are you trying to speed up here? Are you talking about
> > loading data from databases?
>
> I'm talking about the initial load from a persistent store when the
> cluster has been cold-started (like from GridGain's Local Recoverable
> Store).
>
> >>> I'll put some implementation ideas in JIRA later on. I believe that
> >>> this list is far from complete, but I want the community to discuss
> >>> the above use cases.
> >>>
> >>> --Sasha
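
For concreteness, below is a minimal sketch of the kind of CacheStore hook
the log-shipment idea above could build on. Only the CacheStoreAdapter API
is Ignite's; the class name, log path, and record format are hypothetical
placeholders, and a real implementation would also need log segmentation,
checksumming, fsync, and a replay path for roll-forward.

    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.cache.Cache;
    import org.apache.ignite.cache.store.CacheStoreAdapter;

    /** Hypothetical store that appends every write/delete to a local change log. */
    public class LogShippingCacheStore extends CacheStoreAdapter<String, String> {
        private final PrintWriter log;

        public LogShippingCacheStore() {
            try {
                // Placeholder path and plain-text format; a real store would
                // use segmented, checksummed, fsync'ed log files.
                log = new PrintWriter(new FileWriter("ignite-changes.log", true), true);
            }
            catch (IOException e) {
                throw new IllegalStateException("Could not open change log", e);
            }
        }

        /** Reads stay in memory; this store is only a logging sink. */
        @Override public String load(String key) {
            return null;
        }

        /** Called on every put when write-through is enabled; record it for later shipment/replay. */
        @Override public void write(Cache.Entry<? extends String, ? extends String> entry) {
            log.println("PUT\t" + entry.getKey() + "\t" + entry.getValue());
        }

        /** Record removals as well so that a roll-forward can replay them. */
        @Override public void delete(Object key) {
            log.println("DEL\t" + key);
        }
    }

Such a store would be plugged in through the usual configuration path,
e.g. CacheConfiguration.setCacheStoreFactory(FactoryBuilder.factoryOf(
LogShippingCacheStore.class)) together with setWriteThrough(true), so the
base principles of the platform stay untouched, as the thread suggests.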