Re: IGNITE-2294 implementation details

Sergi Vladykin Thu, 21 Jul 2016 04:07:44 -0700

No, this does not make sense.

There is no upsert mode in databases. There are operations: INSERT, UPDATE,
DELETE, MERGE.


I want to have a clear understanding of how they have to behave in SQL
databases and how they will actually behave in Ignite in different
scenarios. I also want to have a clear understanding of the performance
implications of each decision here.

Anything wrong with that?

Sergi
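
For reference, the mapping proposed in the quoted messages below (INSERT via
putIfAbsent, UPDATE via replace, MERGE via put) can be sketched roughly as
follows. This is only an illustration under stated assumptions: a plain
java.util.concurrent.ConcurrentMap stands in for an Ignite cache, and the
class and method names are hypothetical, not part of any Ignite API. It says
nothing about TX vs. Atomic cache behavior or about performance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: a ConcurrentMap stands in for an Ignite cache. The class and
// method names are hypothetical and exist only for this illustration.
final class DmlSemanticsSketch {
    /** INSERT: writes only if the key is absent; returns false on key conflict. */
    static <K, V> boolean insert(ConcurrentMap<K, V> cache, K key, V val) {
        // putIfAbsent returns the existing value, or null if it stored ours.
        return cache.putIfAbsent(key, val) == null;
    }

    /** UPDATE: writes only if the key is present; returns false if it is absent. */
    static <K, V> boolean update(ConcurrentMap<K, V> cache, K key, V val) {
        // replace returns the previous value, or null if there was no mapping.
        return cache.replace(key, val) != null;
    }

    /** MERGE (upsert): always writes, whether or not the key exists. */
    static <K, V> void merge(ConcurrentMap<K, V> cache, K key, V val) {
        cache.put(key, val);
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, String> cache = new ConcurrentHashMap<>();

        System.out.println(insert(cache, 1, "a")); // true: key 1 was absent
        System.out.println(insert(cache, 1, "b")); // false: key conflict
        System.out.println(cache.get(1));          // a: value left unchanged

        System.out.println(update(cache, 2, "c")); // false: key 2 is absent
        merge(cache, 2, "c");                      // insert-or-update in one call
        System.out.println(update(cache, 2, "d")); // true: key 2 now exists
    }
}
```

Whether a failed insert or update should surface as a boolean result, an
update count, or an SQL error is exactly the kind of question raised above;
the sketch picks a boolean result only for brevity.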

On Thu, Jul 21, 2016 at 1:04 PM, Dmitriy Setrakyan <dsetrak...@apache.org>
wrote:

> Serj, are you asking what will happen as of today? Then the answer to all
> your questions is that duplicate keys are not an issue, and Ignite always
> operates in **upsert** mode (which is essentially a *“put(…)”* method).
>
> However, the *“insert”* suggested by Alex would delegate to
> *“putIfAbsent(…)”*, which makes more sense in the database world. In
> this case, the *“update”* syntax should delegate to *“replace(…)”*, as
> an update should fail if the key is absent.
>
> Considering the above, a notion of an *“upsert”* or *“merge”* operation is
> very much needed, as it will give a user an option to perform
> “insert-or-update” in one call.
>
> Does this make sense?
>
> D.
>
> On Wed, Jul 20, 2016 at 9:39 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> wrote:
>
> > I'd prefer to do the MERGE operation last because in H2 it is not standard
> > ANSI SQL MERGE. Or maybe not implement it at all, or maybe contribute an
> > ANSI-correct version to H2 and then implement it in Ignite. We need to
> > investigate the semantics more deeply before making any decisions here.
> >
> > Let's start with simple scenarios for INSERT, go through all the possible
> > cases, and answer the questions:
> > - What will happen on key conflict in TX cache?
> > - What will happen on key conflict in Atomic cache?
> > - What will happen with the previous two if we use DataLoader?
> > - How to make these operations efficient (it will be simple enough to
> > implement them with separate put/putIfAbsent operations but probably we
> > will need some batching like putAllIfAbsent for efficiency)?
> >
> > As for the API, we will still need a single entry point for all SQL
> > queries/commands so that any console can work with it transparently. It
> > would be great if we could come up with something consistent with this
> > idea in the public API.
> >
> > Sergi
> >
> > On Wed, Jul 20, 2016 at 2:23 PM, Dmitriy Setrakyan <
> > dsetrak...@gridgain.com>
> > wrote:
> >
> > > Like the idea of merge and insert. I need more time to think about the
> > > API changes.
> > >
> > > Sergi, what do you think?
> > >
> > > Dmitriy
> > >
> > >
> > >
> > > On Jul 20, 2016, at 12:36 PM, Alexander Paschenko <
> > > alexander.a.pasche...@gmail.com> wrote:
> > >
> > > >> Thus, I suggest that we implement MERGE as a separate operation
> > > >> backed by putIfAbsent operation, while INSERT will be implemented via put.
> > > >
> > > > Sorry, of course I meant that MERGE has to be put-based, while INSERT
> > > > has to be putIfAbsent-based.
> > > >
> > > > 2016-07-20 12:30 GMT+03:00 Alexander Paschenko
> > > > <alexander.a.pasche...@gmail.com>:
> > > > >> Hello Igniters,
> > > >>
> > > > >> In this thread I would like to share and discuss some thoughts on the
> > > > >> implementation of DML operations, so let's start and keep it here.
> > > > >> Everyone is of course welcome to share their suggestions.
> > > >>
> > > > >> For starters, I was thinking about the semantics of INSERT. In
> > > > >> traditional RDBMSs, INSERT works only for records whose primary keys
> > > > >> don't conflict with those of records that are already persistent: you
> > > > >> can't insert the same key more than once because you'll get an error.
> > > > >> However, the semantics of cache put are obviously different: it has no
> > > > >> notion of duplicate keys and just quietly updates the value when a key
> > > > >> is duplicated. Still, the cache has a putIfAbsent operation that is
> > > > >> closer to the traditional notion of INSERT, and H2's SQL dialect has a
> > > > >> MERGE operation which corresponds to the semantics of cache put. Thus, I
> > > > >> suggest that we implement MERGE as a separate operation backed by
> > > > >> putIfAbsent operation, while INSERT will be implemented via put.
> > > >>
> > > > >> And one more, probably more important thing: I suggest that we create
> > > > >> a separate class Update and a corresponding operation update() in
> > > > >> IgniteCache. The reasons are as follows:
> > > >>
> > > >> - Query bears some flags that are clearly redundant for Update (page
> > > >> size, locality)
> > > > >> - query() method in IgniteCache (one that accepts Query) and query()
> > > > >> methods in GridQueryIndexing return iterators. So, if we strive to
> > > > >> leave interfaces unchanged, we will still introduce some design
> > > > >> ugliness like query methods returning empty iterators for certain
> > > > >> queries, and/or query flags that indicate whether it's an update
> > > > >> query or not, etc.
> > > > >> - If some Queries are update queries, then continuous queries can't
> > > > >> be based on them, which means more ugly design-wise checks and stuff
> > > > >> like that.
> > > >> - I'm pretty sure there's more I don't know about.
> > > >>
> > > > >> Comments and suggestions are welcome. Sergi Vladykin, Dmitriy
> > > > >> Setrakyan, your opinions are of particular interest, please advise.
> > > >>
> > > >> Regards,
> > > >> Alex
> > >
> >
>
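
The batching question raised in the thread (something like putAllIfAbsent)
could have a contract along these lines. This is a hypothetical sketch: no
such method is confirmed to exist in Ignite, a ConcurrentMap again stands in
for a cache, and the loop is not atomic across the batch. It illustrates only
what a batched INSERT might report on key conflicts, not how to make it
efficient in a distributed cache.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: putAllIfAbsent is a hypothetical helper, not an Ignite API.
final class BatchInsertSketch {
    /**
     * Stores every entry whose key is absent and returns the keys that
     * conflicted. Conflicting entries are left unchanged in the cache.
     * Each key is handled atomically, but the batch as a whole is not.
     */
    static <K, V> Set<K> putAllIfAbsent(ConcurrentMap<K, V> cache, Map<K, V> batch) {
        Set<K> conflicts = new LinkedHashSet<>();
        for (Map.Entry<K, V> e : batch.entrySet()) {
            // A non-null result means a value was already mapped to this key.
            if (cache.putIfAbsent(e.getKey(), e.getValue()) != null)
                conflicts.add(e.getKey());
        }
        return conflicts;
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, String> cache = new ConcurrentHashMap<>();
        cache.put(1, "old");

        Map<Integer, String> batch = new LinkedHashMap<>();
        batch.put(1, "new");
        batch.put(2, "two");

        System.out.println(putAllIfAbsent(cache, batch)); // [1]
        System.out.println(cache.get(1));                 // old
        System.out.println(cache.get(2));                 // two
    }
}
```

A caller implementing INSERT on top of such a helper could treat a non-empty
conflict set as a duplicate-key error for the whole statement.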
