Re: IGNITE-2294 implementation details

Dmitriy Setrakyan Tue, 26 Jul 2016 23:21:24 -0700

What is the ticket number for this. Is the new API documented there?

On Tue, Jul 26, 2016 at 11:36 AM, Sergi Vladykin <sergi.vlady...@gmail.com>
wrote:


> I don't see anything ugly in empty iterator, sorry if I insulted your taste
> of beauty.
>
> If you will take a look at Jdbc, you will see that Statement.executeUpdate
> method returns number of updated rows, I'd prefer to return the same
> information, so it will not be empty (beauty is restored!).
>
> Sergi
>
>
>
> 2016-07-26 18:24 GMT+03:00 Alexander Paschenko <
> alexander.a.pasche...@gmail.com>:
>
> > I see your point. But what about my concerns from initial post?
> > Particularly about signatures of existing methods? I personally don't
> > like an option of query() method always returning an empty iterator
> > for any non-select query, it seems ugly design wise.
> >
> > - Alex
> >
> > 2016-07-26 18:15 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
> > > BTW, the simplest way to solve this issue is to allow running SQL
> > commands
> > > inside of SqlFieldsQuery.
> > >
> > > We may add some additional convenience API for updates if we want, but
> > JDBC
> > > client will always call it like this:
> > >
> > > cache.query(new SqlFieldsQuery("INSERT INTO MY_TABLE
> > > VALUES(?,?)").setArgs(1,2));
> > >
> > > This will resolve any ambiguity.
> > >
> > > Sergi
> > >
> > > 2016-07-26 17:56 GMT+03:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
> > >
> > >> I don't like any pre-parsing, especially with some libraries other
> than
> > >> H2. H2 itself has enough quirks to multiply it on quirks of another
> > library.
> > >>
> > >> This is exactly what I was talking about - we need some single entry
> > point
> > >> on API for all the SQL commands and queries. Thats why I suggested
> > >> SqlUpdate to extend Query. To me its is the cleanest approach. May be
> we
> > >> need to change in some backward compatible way this Query hierarchy to
> > get
> > >> rid of extra methods but the idea is still the same.
> > >>
> > >> Sergi
> > >>
> > >> 2016-07-26 14:34 GMT+03:00 Alexander Paschenko <
> > >> alexander.a.pasche...@gmail.com>:
> > >>
> > >>> Guys,
> > >>>
> > >>> I would like to advance the discussion further. There's one quite
> > >>> important question that arose based on current state of work on this
> > >>> issue. If we use some kind of interactive console, like Visor, then
> > >>> how should it know whether SQL query it is requested to execute
> > >>> returns a result set or not? In JDBC world, solution is quite simple
> -
> > >>> there's base interface called Statement that all commands implement,
> > >>> and it has magic isResultSet method that tells whether statement is a
> > >>> query or an update command. The API proposed now has separate Query
> > >>> and Update operations which I believe to be a right thing by the
> > >>> reasons I outlined in the beginning of this thread. However, their
> > >>> lack of common ancestor prevents possible console clients from
> running
> > >>> text SQL commands in a fully transparent manner - like
> > >>> IgniteCache.execute(String sql). Therefore I see two possible ways of
> > >>> solving this:
> > >>>
> > >>> - we change API so that it includes new class or interface parenting
> > >>> both Query and Update, and clients use it to communicate with cache
> > >>> - we let (or make :) ) the client determine command type
> independently
> > >>> and behave accordingly - for it to work it will have some kind of
> > >>> command parsing by itself just to determine its type. Visor console
> > >>> may use simple library like JSqlParser
> > >>> (https://github.com/JSQLParser/JSqlParser; dual LGPL 2.1/ASF 2.0
> > >>> licensed) to determine request type in terms of JDBC, and behave
> > >>> accordingly.
> > >>>
> > >>> Personally, I think that the second approach is better - and here's
> > why.
> > >>>
> > >>> First, it does not seem wise to change API simply to make console (or
> > >>> any other) clients simpler. Programmatic APIs should be concise and
> > >>> short for programmatic use, console clients should be easy to use
> from
> > >>> console - and that's it: after all, console client exists to free a
> > >>> user from burden of doing things programmatically, so its aim is to
> > >>> adapt API to console or whatever UI.
> > >>> Second, possible complications in client implied by such approach
> > >>> certainly won't be dramatic - I don't think that additional single
> > >>> query parsing operation in client code will make it much harder to
> > >>> develop.
> > >>> Third, as I see it now, adding a new "synthetic" entity and new
> method
> > >>> would take more effort to adapting the client to new API.
> > >>>
> > >>> Dmitry, Sergi, I would like to hear what you think about it all.
> > Thanks.
> > >>>
> > >>> - Alex
> > >>>
> > >>> 2016-07-21 21:17 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org
> >:
> > >>> > OK, then using your analogy, the current behavior in Ignite is
> MERGE
> > for
> > >>> > the most part.
> > >>> >
> > >>> > My preference is that Ignite SQL should work no different from
> > >>> traditional
> > >>> > databases, which means:
> > >>> >
> > >>> > - INSERT is translated into *putIfAbsent()* call in Ignite
> > >>> > - UPDATE is translated into *replace()* call in Ignite
> > >>> > - MERGE is translated into *put()* call in Ignite
> > >>> > - For SQL BATCH calls we should delegate to Ignite batch
> operations,
> > >>> e.g.
> > >>> > *putAll()*
> > >>> >
> > >>> > The above should hold true for atomic and transactional put/putAll
> > >>> calls,
> > >>> > as well as for the data streamer.
> > >>> >
> > >>> > Does this make sense?
> > >>> >
> > >>> > D.
> > >>> >
> > >>> > On Thu, Jul 21, 2016 at 4:06 AM, Sergi Vladykin <
> > >>> sergi.vlady...@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> >> No, this does not make sense.
> > >>> >>
> > >>> >> There is no upsert mode in databases. There are operations:
> INSERT,
> > >>> UPDATE,
> > >>> >> DELETE, MERGE.
> > >>> >>
> > >>> >> I want to have clear understanding of how they have to behave in
> SQL
> > >>> >> databases and how they will actually behave in Ignite in different
> > >>> >> scenarios. Also I want to have clear understanding of performance
> > >>> >> implications of each decision here.
> > >>> >>
> > >>> >> Anything wrong with that?
> > >>> >>
> > >>> >> Sergi
> > >>> >>
> > >>> >> On Thu, Jul 21, 2016 at 1:04 PM, Dmitriy Setrakyan <
> > >>> dsetrak...@apache.org>
> > >>> >> wrote:
> > >>> >>
> > >>> >> > Serj, are you asking what will happen as of today? Then the
> answer
> > >>> to all
> > >>> >> > your questions is that duplicate keys are not an issue, and
> Ignite
> > >>> always
> > >>> >> > operates in **upsert** mode (which is essentially a *“put(…)”
> > >>> *method).
> > >>> >> >
> > >>> >> > However, the *“insert”* that is suggested by Alex would delegate
> > to
> > >>> >> > *“putIfAbsent(…)”*, which in database world makes more sense.
> > >>> However, in
> > >>> >> > this case, the *“update”* syntax should delegate to
> > *“replace(…)”*,
> > >>> as
> > >>> >> > update should fail in case if a key is absent.
> > >>> >> >
> > >>> >> > Considering the above, a notion of “*upsert”* or “*merge”
> > *operation
> > >>> is
> > >>> >> > very much needed, as it will give a user an option to perform
> > >>> >> > “insert-or-update” in 1 call.
> > >>> >> >
> > >>> >> > Does this make sense?
> > >>> >> >
> > >>> >> > D.
> > >>> >> >
> > >>> >> > On Wed, Jul 20, 2016 at 9:39 PM, Sergi Vladykin <
> > >>> >> sergi.vlady...@gmail.com>
> > >>> >> > wrote:
> > >>> >> >
> > >>> >> > > I'd prefer to do MERGE operation last because in H2 it is not
> > >>> standard
> > >>> >> > ANSI
> > >>> >> > > SQL MERGE. Or may be not implement it at all, or may be
> > contribute
> > >>> ANSI
> > >>> >> > > correct version to H2, then implement it on Ignite. Need to
> > >>> investigate
> > >>> >> > the
> > >>> >> > > semantics deeper before making any decisions here.
> > >>> >> > >
> > >>> >> > > Lets start with simple scenarios for INSERT and go through all
> > the
> > >>> >> > possible
> > >>> >> > > cases and answer the questions:
> > >>> >> > > - What will happen on key conflict in TX cache?
> > >>> >> > > - What will happen on key conflict in Atomic cache?
> > >>> >> > > - What will happen with the previous two if we use DataLoader?
> > >>> >> > > - How to make these operations efficient (it will be simple
> > enough
> > >>> to
> > >>> >> > > implement them with separate put/putIfAbsent operations but
> > >>> probably we
> > >>> >> > > will need some batching like putAllIfAbsent for efficiency)?
> > >>> >> > >
> > >>> >> > > As for API, we still will need to have a single entry point
> for
> > >>> all SQL
> > >>> >> > > queries/commands to allow any console work with it
> > transparently.
> > >>> It
> > >>> >> > would
> > >>> >> > > be great if we will be able to come up with something
> consistent
> > >>> with
> > >>> >> > this
> > >>> >> > > idea on public API.
> > >>> >> > >
> > >>> >> > > Sergi
> > >>> >> > >
> > >>> >> > >
> > >>> >> > >
> > >>> >> > >
> > >>> >> > >
> > >>> >> > >
> > >>> >> > >
> > >>> >> > >
> > >>> >> > > On Wed, Jul 20, 2016 at 2:23 PM, Dmitriy Setrakyan <
> > >>> >> > > dsetrak...@gridgain.com>
> > >>> >> > > wrote:
> > >>> >> > >
> > >>> >> > > > Like the idea of merge and insert. I need more time to think
> > >>> about
> > >>> >> the
> > >>> >> > > API
> > >>> >> > > > changes.
> > >>> >> > > >
> > >>> >> > > > Sergi, what do you think?
> > >>> >> > > >
> > >>> >> > > > Dmitriy
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > > On Jul 20, 2016, at 12:36 PM, Alexander Paschenko <
> > >>> >> > > > alexander.a.pasche...@gmail.com> wrote:
> > >>> >> > > >
> > >>> >> > > > >> Thus, I suggest that we implement MERGE as a separate
> > >>> operation
> > >>> >> > backed
> > >>> >> > > > by putIfAbsent operation, while INSERT will be implemented
> via
> > >>> put.
> > >>> >> > > > >
> > >>> >> > > > > Sorry, of course I meant that MERGE has to be put-based,
> > while
> > >>> >> INSERT
> > >>> >> > > > > has to be putIfAbsent-based.
> > >>> >> > > > >
> > >>> >> > > > > 2016-07-20 12:30 GMT+03:00 Alexander Paschenko
> > >>> >> > > > > <alexander.a.pasche...@gmail.com>:
> > >>> >> > > > >> Hell Igniters,
> > >>> >> > > > >>
> > >>> >> > > > >> In this thread I would like to share and discuss some
> > >>> thoughts on
> > >>> >> > DML
> > >>> >> > > > >> operations' implementation, so let's start and keep it
> > here.
> > >>> >> > Everyone
> > >>> >> > > > >> is of course welcome to share their suggestions.
> > >>> >> > > > >>
> > >>> >> > > > >> For starters, I was thinking about semantics of INSERT.
> In
> > >>> >> > traditional
> > >>> >> > > > >> RDBMSs, INSERT works only for records whose primary keys
> > don't
> > >>> >> > > > >> conflict with those of records that are already
> persistent
> > -
> > >>> you
> > >>> >> > can't
> > >>> >> > > > >> try to insert the same key more than once because you'll
> > get
> > >>> an
> > >>> >> > error.
> > >>> >> > > > >> However, semantics of cache put is obviously different -
> it
> > >>> does
> > >>> >> not
> > >>> >> > > > >> have anything about duplicate keys, it just quietly
> updates
> > >>> values
> > >>> >> > in
> > >>> >> > > > >> case of keys' duplication. Still, cache has putIfAbsent
> > >>> operation
> > >>> >> > that
> > >>> >> > > > >> is closer to traditional notion of INSERT, and H2's SQL
> > >>> dialect
> > >>> >> has
> > >>> >> > > > >> MERGE operation which corresponds to semantics of cache
> > put.
> > >>> >> Thus, I
> > >>> >> > > > >> suggest that we implement MERGE as a separate operation
> > >>> backed by
> > >>> >> > > > >> putIfAbsent operation, while INSERT will be implemented
> via
> > >>> put.
> > >>> >> > > > >>
> > >>> >> > > > >> And one more, probably more important thing: I suggest
> > that we
> > >>> >> > create
> > >>> >> > > > >> separate class Update and corresponding operation
> update()
> > in
> > >>> >> > > > >> IgniteCache. The reasons are as follows:
> > >>> >> > > > >>
> > >>> >> > > > >> - Query bears some flags that are clearly redundant for
> > Update
> > >>> >> (page
> > >>> >> > > > >> size, locality)
> > >>> >> > > > >> - query() method in IgniteCache (one that accepts Query)
> > and
> > >>> >> query()
> > >>> >> > > > >> methods in GridQueryIndexing return iterators. So, if we
> > >>> strive to
> > >>> >> > > > >> leave interfaces unchanged, we still will introduce some
> > >>> design
> > >>> >> > > > >> ugliness like query methods returning empty iterators for
> > >>> certain
> > >>> >> > > > >> queries, and/or query flags that indicate whether it's an
> > >>> update
> > >>> >> > query
> > >>> >> > > > >> or not, etc.
> > >>> >> > > > >> - If some Queries are update queries, then continuous
> > queries
> > >>> >> can't
> > >>> >> > be
> > >>> >> > > > >> based on them - more design-wise ugly checks and stuff
> like
> > >>> that.
> > >>> >> > > > >> - I'm pretty sure there's more I don't know about.
> > >>> >> > > > >>
> > >>> >> > > > >> Comments and suggestions are welcome. Sergi Vladykin,
> > Dmitry
> > >>> >> > > > >> Setrakyan, your opinions are of particular interest,
> please
> > >>> >> advise.
> > >>> >> > > > >>
> > >>> >> > > > >> Regards,
> > >>> >> > > > >> Alex
> > >>> >> > > >
> > >>> >> > >
> > >>> >> >
> > >>> >>
> > >>>
> > >>
> > >>
> >
>

Re: IGNITE-2294 implementation details

Reply via email to