Raul,

Actually, SQL indexes are already snapshotable. I'm not sure it makes sense to
make the whole cache (with full cache API support) snapshotable, but I like
your idea of running multiple SQL statements against the same snapshot.

Also, I don't think it is a good idea to keep snapshots for a long time,
so I'd prefer a typical AutoCloseable API like:

try (Snapshot s = ...) {
    s.query(...);
    s.query(...);
    s.query(...);
}

Though I'm not sure when we will be able to get around to this.
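
To make the intended shape a bit more concrete, here is a minimal,
self-contained sketch of the try-with-resources pattern. It is not an Ignite
API: the Snapshot class, its get() and isClosed() methods, and the naive
copy-on-create strategy are all assumptions made purely for illustration. A
real implementation would more likely rely on versioning or copy-on-write
than on a full copy, and this copy is not atomic under concurrent writes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SnapshotSketch {

    /** Hypothetical read-only, point-in-time view of a map. Not an Ignite class. */
    static final class Snapshot<K, V> implements AutoCloseable {
        private Map<K, V> view;

        Snapshot(Map<K, V> source) {
            // Naive full copy, for illustration only.
            this.view = new HashMap<>(source);
        }

        /** Reads a value as it was when the snapshot was taken. */
        V get(K key) {
            if (view == null) {
                throw new IllegalStateException("snapshot is closed");
            }
            return view.get(key);
        }

        boolean isClosed() {
            return view == null;
        }

        /** Releases the retained data so the snapshot cannot outlive its scope. */
        @Override
        public void close() {
            view = null;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = new ConcurrentHashMap<>();
        cache.put("a", 1);

        try (Snapshot<String, Integer> s = new Snapshot<>(cache)) {
            cache.put("a", 2); // a later update to the live data
            System.out.println("snapshot: " + s.get("a"));
            System.out.println("live:     " + cache.get("a"));
        } // close() runs here, even if a read throws
    }
}
```

The point of the AutoCloseable shape is that the retained data is released
as soon as the block exits, so nobody can hold a snapshot for two weeks by
accident.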

Sergi

2015-10-21 12:06 GMT+03:00 Raul Kripalani <ra...@apache.org>:

> Hey guys,
>
> LevelDB has a feature called Snapshots, which provides a consistent
> read-only view of the DB at a given point in time, against which queries
> can be executed.
>
> To my knowledge, this functionality doesn't exist in the world of open
> source In-Memory Computing. Ignite could be an innovator here.
>
> Ignite Snapshots would allow queries, distributed closures, map-reduce
> jobs, etc. It could be useful for Spark RDDs to avoid data shift while the
> computation is taking place (not sure if there's already some form of
> snapshotting, though). Same for IGFS.
>
> Example usage:
>
>     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().create();
>
>     // all three queries are executed against a view of the cache at the
>     // point in time where it was snapshotted
>     snapshot.query("select ...");
>     snapshot.query("select ...");
>     snapshot.query("select ...");
>
> In fact, it would be awesome to be able to logically save this snapshot
> with a name so that later jobs, queries, etc. can run on top of it, e.g.:
>
>     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().create("abc");
>
>     // ...
>     // in another module of a distributed system, or in another thread in
>     // parallel, use the saved snapshot
>     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().get("abc");
>     ....
>
> Named snapshotting can be dangerous due to data retention, e.g. imagine
> keeping a snapshot for 2 weeks! So we should force the user to specify a
> TTL:
>
>     IgniteCacheSnapshot snapshot = ignite.cache("mycache").snapshots().create("abc", 2, TimeUnit.HOURS);
>
> Such functionality would allow for "reporting checkpoints" and "time
> travel", for example, where you want users to be able to query the data as
> it stood 1 hour ago, 2 hours ago, etc.
>
> What do you think?
>
> P.S.: We do have some form of snapshotting in the Compute checkpointing
> functionality – but my proposal is to generalise the notion.
>
> Regards,
>
> *Raúl Kripalani*
> PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> Messaging Engineer
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> http://blog.raulkr.net | twitter: @raulvk
>
