Ignite in-memory + other SQL store without fully loading all data into Ignite

Courtney Robinson Sat, 26 Dec 2020 12:39:33 -0800

We've been using Ignite in production for almost 3 years and we love the
platform but there are some increasingly frustrating points we run into.
Before the holidays a few of our engineers started looking around and have
now presented a serious case for migrating from Ignite. We would end up
using at least 3 technologies they've identified to cover the Ignite
features we rely on, but they have presented good cases for why managing
these individually would be a more flexible solution we could grow with.


I am not keen on the idea as it presents a major refactor that would likely
take 6 months to get to production but I understand and agree with the
points they've made. I'm trying to find a middle ground as this seems like
the nuclear option to me.

Off the top of my head, some things they've raised are:

   1. Lack of tooling
      1. Inability to change a column type, and no support in the schema
      migration tools we've found (even after deleting a column we can't
      reuse its name)
      2. We had to build our own backup solution, and even now that backup
      has landed in 2.9 we can't use it directly because we have implemented
      relatively granular backup so we can push to S3-compatible APIs (Ceph
      in our case) and restore partially, to per-hour granularity. Whilst
      we've done it, it took some serious engineering effort and time. We
      considered open sourcing it but it was done in a way that's tightly
      coupled to our internal stack and APIs.
   2. Inconsistency between various Ignite APIs
      1. Transactions on KV, none on SQL (or now in beta, but work seems
      to have paused?)
      2. SQL limitations - SELECT queries never read through data from the
      external database
      3. Even if we implemented a CacheStore we have to load all data into
      Ignite to use SELECT
      4. No referential integrity enforcement
   3. It is incredibly easy to corrupt the data in Ignite persistence.
   We've gotten better due to operational experience but upgrades (in k8s)
   still on occasion lead to one or two nodes being corrupt when their pod
   is stopped.
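
To make the SELECT and CacheStore points above concrete, here is a minimal
sketch in plain Java of the behaviour we keep hitting. It is not Ignite code:
ReadThroughSketch, ReadThroughCache, get and select are hypothetical names
invented for this illustration, and the HashMap stands in for the external
database. The assumption it encodes is the one described above: a key lookup
can fall through to the store, while a query only scans what is already in
memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical illustration only; none of these names are Ignite APIs. */
public class ReadThroughSketch {

    /** A tiny cache that loads missing keys from an external store on get(). */
    static class ReadThroughCache<K, V> {
        private final Map<K, V> memory = new HashMap<>();
        // Stands in for a CacheStore-style "load one key" callback.
        private final Function<K, V> externalLoader;

        ReadThroughCache(Function<K, V> externalLoader) {
            this.externalLoader = externalLoader;
        }

        /** Key lookup: a miss falls through to the external store. */
        V get(K key) {
            V value = memory.get(key);
            if (value == null) {
                value = externalLoader.apply(key);
                if (value != null) {
                    memory.put(key, value); // keep it for later lookups
                }
            }
            return value;
        }

        /** Query by predicate: only sees rows that are already in memory. */
        List<V> select(Predicate<V> where) {
            List<V> out = new ArrayList<>();
            for (V v : memory.values()) {
                if (where.test(v)) {
                    out.add(v);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> externalDb = new HashMap<>();
        externalDb.put(1, "alice");
        externalDb.put(2, "bob");

        ReadThroughCache<Integer, String> cache =
            new ReadThroughCache<>(externalDb::get);

        // Nothing is loaded yet, so the query finds no rows
        // even though the external store holds two.
        System.out.println(cache.select(v -> true).size()); // prints 0

        // A key lookup reads through and caches the row.
        System.out.println(cache.get(1)); // prints alice

        // The query now sees only the row that happened to be loaded.
        System.out.println(cache.select(v -> true)); // prints [alice]
    }
}
```

What we would like instead is for the select path to be able to consult the
external store as well, rather than requiring everything to be preloaded.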

I'll stop there, but the point is that after 3 years in production the team
feels like they're always running up against a wall and that Ignite has
created that wall.

My goal in writing this is to find out more about why the limitation around
CacheStore exists. How does Ignite persistence manage to keep only part of
the data in memory and pull from disk when the data is not in memory, and
why can't the same apply to a CacheStore?

What would it take to make it so that Ignite's SQL operations could be
pushed down to a CacheStore implementation?
Ignite's a relatively large code base, so hints about which
classes/interfaces to investigate if we're looking to replace Ignite
persistence would be incredibly useful. My idea at the moment is to have
Ignite as the in-memory SQL layer with a SQL MPP providing persistence.

To me right now the path forward is for us to put the work into removing
these Ignite limitations if possible. We have a mixture of on-premise
clients for our product as well as a multi-tenant SaaS version - some of
these on-prem clients depend on Ignite's in-memory capabilities and so we
can't easily take this away.

FYI it doesn't have to be CacheStore; I realise this just inherits the
JCache interface. More generally, can something like CacheStore be
implemented to replace the integration that Ignite persistence provides?

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: +44 208 123 2413 (GMT+0) <https://hypi.io>
