Dmitriy,

It looks like Konstantin is talking about a specific case, when you have specified readThrough/writeThrough mode for your caches. In such a mode all your WRITE operations and some portion of your READ operations are inevitably disk-based.
Thus all the suggested enhancements are about readThrough/writeThrough mode only.

Igor Rudyak

On Thu, Jul 14, 2016 at 3:00 PM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:

> On Thu, Jul 14, 2016 at 9:07 PM, Konstantin Boudnik <c...@apache.org> wrote:
>
> > On Wed, Jul 13, 2016 at 05:30AM, Dmitriy Setrakyan wrote:
> > > Hi Alex,
> > >
> > > I believe most of your comments have to do with disk-based functionality,
> > > especially in regard to backups, snapshots, etc. However, Ignite is
> > > currently an in-memory system, at least for the nearest future. Let me know
> > > if I misunderstood something.
> >
> > And the nearest future is defined by....? This is a collaborative project, as
> > you all learned during the incubation, and the statements like "the X only
> > does bar for now" should be consensual. If there's a will to work on the new
> > functionality which is demanded by the users, and the said functionality is
> > expected to expand the applicability of the technology - I don't really see
> > why and how it could be put to hold.
> >
> > Fortunately, there are a number of ways this development could be put through,
> > and it doesn't really require much of the moving parts (in fact it is done all
> > the time in the same way right now): let's put the new development on a
> > branch, and start moving. There's JIRA and there's the CI to help to validate
> > and coordinate the work. Sounds like an easy decision to me.
>
> Cos, the nearest future is defined by the community, of course. Take a look
> at the Ignite 2.0 discussion which is taking place on another thread [1].
>
> Any disk-based functionality will require some significant memory-model
> rearchitecture, which is already planned for Ignite 2.0 as part of
> IGNITE-3477 [2] and IGNITE-3478 [3]. I believe Alexey G. has already
> started making significant progress on it. Note that in-memory snapshots
> are already defined as a part of this work.
>
> If the community decides to add disk-based features, I am all for it. We
> can start a discussion on it now, but the implementation should come after
> the Ignite 2.0, to avoid any conflicts in architecture, design, or code.
> Just my 0.02 cents.
>
> [1] - http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-2-0-tasks-roadmap-td9585.html
> [2] - https://issues.apache.org/jira/browse/IGNITE-3477
> [3] - https://issues.apache.org/jira/browse/IGNITE-3478
>
> > Cos
> >
> > > On Tue, Jul 12, 2016 at 9:44 PM, Alexandre Boudnik <alexander.boud...@gmail.com> wrote:
> > >
> > > > Dmitriy, thank you for your time and questions, which helped me to
> > > > realize what I forgot to mention!
> > > > See my answers inline; later I'll combine everything together to help
> > > > the next readers :)
> > > >
> > > > I put together some implementation ideas in Apache Ignite JIRA, as
> > > > promised: https://issues.apache.org/jira/browse/IGNITE-3457. I see
> > > > this facility as another CacheStore implementation, so it wouldn't
> > > > interfere with the base principles of the Ignite platform.
> > > >
> > > > On Mon, Jul 11, 2016 at 1:15 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> > > > > My answers are inline…
> > > > >
> > > > > On Sat, Jul 9, 2016 at 3:04 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> > > > >
> > > > >> Thanks Sasha!
> > > > >>
> > > > >> Resending to the dev list.
> > > > >>
> > > > >> D.
> > > > >>
> > > > >> On Fri, Jul 8, 2016 at 2:02 PM, Alexandre Boudnik <alexan...@boudnik.org> wrote:
> > > > >>
> > > > >>> Apache Ignite is a great platform, but it lacks certain capabilities
> > > > >>> which are common in the RDBMS world, such as:
> > > > >>> - Consistent on-line backup for data on the entire cluster (or for
> > > > >>> a specified set of caches)
> > > > >>>
> > > > >>
> > > > > I think you mean data center replication here.
> > > > > It is not an easy feature to
> > > > > implement, and so far has been handled by commercial vendors of Ignite,
> > > > > e.g. GridGain.
> > > > >
> > > > Actually not. Right here I meant exactly what I said: full or
> > > > incremental backup of all/selected caches in a consistent state, so it
> > > > can be used to restore them in case of data loss or data corruption.
> > > > One of the important use cases is OLAP systems (let's say for banking)
> > > > that have been built on the Apache Ignite platform.
> > > >
> > > > And you are right, data center replication can be easily implemented based
> > > > on log/snapshot shipment.
> > > >
> > > > >> - Hierarchical snapshots for a specified set of caches
> > > > >>>
> > > > >>
> > > > > What do you mean by hierarchical?
> > > > >
> > > > In this particular case the notion of hierarchical snapshots is very
> > > > similar to the same notion used in SAN appliances or by VirtualBox or
> > > > VMware. Using the concept of snapshots we can do all these amazing things:
> > > > - full and incremental backup
> > > > - restore
> > > > - rollback to checkpoint
> > > > - roll forward
> > > > much easier, with minimal memory and I/O overhead.
> > > >
> > > > >> - Transaction log
> > > > >>>
> > > > >>
> > > > > Why does Ignite need it for in-memory transactions?
> > > > >
> > > > At least it is required to provide roll-forward functionality, when
> > > > you restore the state of the cache from a checkpoint (the cache state
> > > > before the snapshot was made) and then reapply transactions one by
> > > > one.
> > > >
> > > > >> - Restore cluster state as of certain point in time
> > > > >>>
> > > > >>
> > > > > Given that such restorability may introduce lots of memory overhead, does
> > > > > it really make sense for an in-memory cache?
> > > > >
> > > > Actually, it will not consume any memory.
> > > > It will use external memory,
> > > > such as HDD/SSD space, instead. And yes, I think that this
> > > > functionality makes complete sense for our users IRL, who will love
> > > > it.
> > > >
> > > > >> - Rolling forward from snapshot with ability to filter/modify transactions
> > > > >>>
> > > > >>
> > > > > Same as above
> > > > >
> > > > The same as above: my customers in the trenches are begging for that feature.
> > > >
> > > > >> - Asynchronous replication based either on log shipment or snapshot
> > > > >>> shipment
> > > > >>> -- Between clusters
> > > > >>>
> > > > >>
> > > > > This is the same as data center replication, no?
> > > > Including but not limited to: log shipment or snapshot shipment also
> > > > could be used to implement a so-called "better-than-lambda-architecture"
> > > > for BI and OLAP, when data is replicated to a queryable data source,
> > > > let's say Oracle, as soon as it is produced by the OLTP system. We can use
> > > > an RDBMS API such as Oracle Streams (going to be discontinued - sad) or
> > > > GoldenGate to filter changes from logs/snapshots and then apply them.
> > > > That approach allows us to save tons of legacy reports and BI
> > > > dashboards.
> > > >
> > > > >> -- Continuous data export to, let's say, an RDBMS
> > > > >>>
> > > > >>
> > > > > Don’t we already support it with our write-through feature to a database?
> > > > >
> > > > When write-through is used for non-local caches it may cause data
> > > > corruption in the RDBMS: I opened this issue a few weeks ago:
> > > > https://issues.apache.org/jira/browse/IGNITE-3321
> > > >
> > > > >> It is also a necessity to reduce cold start time for huge clusters
> > > > >>> with strict SLAs.
> > > > >>>
> > > > >>
> > > > > What part are you trying to speed up here? Are you talking about loading
> > > > > data from databases?
> > > > >
> > > > I'm talking about the initial load from the Persistent Store when the cluster
> > > > has been cold-started (like from GridGain's Local Recoverable Store).
> > > >
> > > > >>
> > > > >>> I'll put some implementation ideas in JIRA later on. I believe that
> > > > >>> this list is far from being complete, but I want the community to
> > > > >>> discuss these above-mentioned use cases.
> > > > >>>
> > > > >>> --Sasha
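
For readers following the thread, the roll-forward idea discussed above (restore a cache from a checkpoint, then reapply logged transactions one by one, optionally filtering or modifying them) can be sketched roughly as follows. This is only an illustration in plain Python, not Ignite API and not the design proposed in IGNITE-3457; `LoggedTx`, `roll_forward`, `up_to_tx` and `tx_filter` are hypothetical names, and the snapshot and log are assumed to be simple in-memory structures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class LoggedTx:
    """Hypothetical transaction-log record: an id plus the keys it wrote."""
    tx_id: int
    writes: Dict[str, Optional[str]]  # a value of None means "remove this key"


# A filter may return the transaction unchanged, a modified copy, or None to skip it.
TxFilter = Callable[[LoggedTx], Optional[LoggedTx]]


def roll_forward(
    snapshot: Dict[str, str],
    log: Iterable[LoggedTx],
    up_to_tx: Optional[int] = None,
    tx_filter: Optional[TxFilter] = None,
) -> Dict[str, str]:
    """Rebuild cache state from a snapshot by replaying logged transactions in id order."""
    state = dict(snapshot)  # never mutate the stored snapshot itself
    for tx in sorted(log, key=lambda t: t.tx_id):
        if up_to_tx is not None and tx.tx_id > up_to_tx:
            break  # point-in-time restore: stop at the requested transaction
        if tx_filter is not None:
            tx = tx_filter(tx)
            if tx is None:
                continue  # the filter dropped this transaction
        for key, value in tx.writes.items():
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
    return state


snapshot = {"acct:1": "100", "acct:2": "50"}
log = [
    LoggedTx(1, {"acct:1": "90", "acct:2": "60"}),
    LoggedTx(2, {"acct:3": "10"}),
    LoggedTx(3, {"acct:2": None}),
]

# Restore the state as of transaction 2.
as_of_tx2 = roll_forward(snapshot, log, up_to_tx=2)

# Replay the whole log but skip transaction 2.
without_tx2 = roll_forward(snapshot, log, tx_filter=lambda tx: None if tx.tx_id == 2 else tx)
```

In this sketch the snapshot lives outside the cache (for example on HDD/SSD) and is copied before the replay, so restoring never modifies the checkpoint, and the same checkpoint can serve several restore points.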