Sergi, But current signature of query() method returns not just some iterator, but rather iterator of R which is type param of Query - i.e., we won't be able to return an int inside a QueryCursor<R>. At least without API change (signature of query() method will have to be changed to drop genericness, or in some other weird way). Is this what we really want? Or am I missing something in your point?
- Alex 2016-07-27 12:51 GMT+03:00 Sergi Vladykin <[email protected]>: > Exactly. This will allow our Jdbc driver to work transparently. > > Sergi > > 2016-07-27 12:40 GMT+03:00 Alexander Paschenko < > [email protected]>: > >> Sergi, >> >> You wrote: >> > I'd prefer to return the same information, so it will not be empty >> >> Do you mean return iterator with single element that denotes number of >> rows? >> >> Dmitriy, >> >> You wrote: >> > What is the ticket number for this. Is the new API documented there? >> >> Overall issue number is 2294. There's no particular issue on API >> changes, but creating one seems to be a good idea, I will do it. >> >> - Alex >> >> 2016-07-27 9:20 GMT+03:00 Dmitriy Setrakyan <[email protected]>: >> > What is the ticket number for this. Is the new API documented there? >> > >> > On Tue, Jul 26, 2016 at 11:36 AM, Sergi Vladykin < >> [email protected]> >> > wrote: >> > >> >> I don't see anything ugly in empty iterator, sorry if I insulted your >> taste >> >> of beauty. >> >> >> >> If you will take a look at Jdbc, you will see that >> Statement.executeUpdate >> >> method returns number of updated rows, I'd prefer to return the same >> >> information, so it will not be empty (beauty is restored!). >> >> >> >> Sergi >> >> >> >> >> >> >> >> 2016-07-26 18:24 GMT+03:00 Alexander Paschenko < >> >> [email protected]>: >> >> >> >> > I see your point. But what about my concerns from initial post? >> >> > Particularly about signatures of existing methods? I personally don't >> >> > like an option of query() method always returning an empty iterator >> >> > for any non-select query, it seems ugly design wise. >> >> > >> >> > - Alex >> >> > >> >> > 2016-07-26 18:15 GMT+03:00 Sergi Vladykin <[email protected]>: >> >> > > BTW, the simplest way to solve this issue is to allow running SQL >> >> > commands >> >> > > inside of SqlFieldsQuery. >> >> > > >> >> > > We may add some additional convenience API for updates if we want, >> but >> >> > JDBC >> >> > > client will always call it like this: >> >> > > >> >> > > cache.query(new SqlFieldsQuery("INSERT INTO MY_TABLE >> >> > > VALUES(?,?)").setArgs(1,2)); >> >> > > >> >> > > This will resolve any ambiguity. >> >> > > >> >> > > Sergi >> >> > > >> >> > > 2016-07-26 17:56 GMT+03:00 Sergi Vladykin <[email protected] >> >: >> >> > > >> >> > >> I don't like any pre-parsing, especially with some libraries other >> >> than >> >> > >> H2. H2 itself has enough quirks to multiply it on quirks of another >> >> > library. >> >> > >> >> >> > >> This is exactly what I was talking about - we need some single >> entry >> >> > point >> >> > >> on API for all the SQL commands and queries. Thats why I suggested >> >> > >> SqlUpdate to extend Query. To me its is the cleanest approach. May >> be >> >> we >> >> > >> need to change in some backward compatible way this Query >> hierarchy to >> >> > get >> >> > >> rid of extra methods but the idea is still the same. >> >> > >> >> >> > >> Sergi >> >> > >> >> >> > >> 2016-07-26 14:34 GMT+03:00 Alexander Paschenko < >> >> > >> [email protected]>: >> >> > >> >> >> > >>> Guys, >> >> > >>> >> >> > >>> I would like to advance the discussion further. There's one quite >> >> > >>> important question that arose based on current state of work on >> this >> >> > >>> issue. If we use some kind of interactive console, like Visor, >> then >> >> > >>> how should it know whether SQL query it is requested to execute >> >> > >>> returns a result set or not? In JDBC world, solution is quite >> simple >> >> - >> >> > >>> there's base interface called Statement that all commands >> implement, >> >> > >>> and it has magic isResultSet method that tells whether statement >> is a >> >> > >>> query or an update command. The API proposed now has separate >> Query >> >> > >>> and Update operations which I believe to be a right thing by the >> >> > >>> reasons I outlined in the beginning of this thread. However, their >> >> > >>> lack of common ancestor prevents possible console clients from >> >> running >> >> > >>> text SQL commands in a fully transparent manner - like >> >> > >>> IgniteCache.execute(String sql). Therefore I see two possible >> ways of >> >> > >>> solving this: >> >> > >>> >> >> > >>> - we change API so that it includes new class or interface >> parenting >> >> > >>> both Query and Update, and clients use it to communicate with >> cache >> >> > >>> - we let (or make :) ) the client determine command type >> >> independently >> >> > >>> and behave accordingly - for it to work it will have some kind of >> >> > >>> command parsing by itself just to determine its type. Visor >> console >> >> > >>> may use simple library like JSqlParser >> >> > >>> (https://github.com/JSQLParser/JSqlParser; dual LGPL 2.1/ASF 2.0 >> >> > >>> licensed) to determine request type in terms of JDBC, and behave >> >> > >>> accordingly. >> >> > >>> >> >> > >>> Personally, I think that the second approach is better - and >> here's >> >> > why. >> >> > >>> >> >> > >>> First, it does not seem wise to change API simply to make console >> (or >> >> > >>> any other) clients simpler. Programmatic APIs should be concise >> and >> >> > >>> short for programmatic use, console clients should be easy to use >> >> from >> >> > >>> console - and that's it: after all, console client exists to free >> a >> >> > >>> user from burden of doing things programmatically, so its aim is >> to >> >> > >>> adapt API to console or whatever UI. >> >> > >>> Second, possible complications in client implied by such approach >> >> > >>> certainly won't be dramatic - I don't think that additional single >> >> > >>> query parsing operation in client code will make it much harder to >> >> > >>> develop. >> >> > >>> Third, as I see it now, adding a new "synthetic" entity and new >> >> method >> >> > >>> would take more effort to adapting the client to new API. >> >> > >>> >> >> > >>> Dmitry, Sergi, I would like to hear what you think about it all. >> >> > Thanks. >> >> > >>> >> >> > >>> - Alex >> >> > >>> >> >> > >>> 2016-07-21 21:17 GMT+03:00 Dmitriy Setrakyan < >> [email protected] >> >> >: >> >> > >>> > OK, then using your analogy, the current behavior in Ignite is >> >> MERGE >> >> > for >> >> > >>> > the most part. >> >> > >>> > >> >> > >>> > My preference is that Ignite SQL should work no different from >> >> > >>> traditional >> >> > >>> > databases, which means: >> >> > >>> > >> >> > >>> > - INSERT is translated into *putIfAbsent()* call in Ignite >> >> > >>> > - UPDATE is translated into *replace()* call in Ignite >> >> > >>> > - MERGE is translated into *put()* call in Ignite >> >> > >>> > - For SQL BATCH calls we should delegate to Ignite batch >> >> operations, >> >> > >>> e.g. >> >> > >>> > *putAll()* >> >> > >>> > >> >> > >>> > The above should hold true for atomic and transactional >> put/putAll >> >> > >>> calls, >> >> > >>> > as well as for the data streamer. >> >> > >>> > >> >> > >>> > Does this make sense? >> >> > >>> > >> >> > >>> > D. >> >> > >>> > >> >> > >>> > On Thu, Jul 21, 2016 at 4:06 AM, Sergi Vladykin < >> >> > >>> [email protected]> >> >> > >>> > wrote: >> >> > >>> > >> >> > >>> >> No, this does not make sense. >> >> > >>> >> >> >> > >>> >> There is no upsert mode in databases. There are operations: >> >> INSERT, >> >> > >>> UPDATE, >> >> > >>> >> DELETE, MERGE. >> >> > >>> >> >> >> > >>> >> I want to have clear understanding of how they have to behave >> in >> >> SQL >> >> > >>> >> databases and how they will actually behave in Ignite in >> different >> >> > >>> >> scenarios. Also I want to have clear understanding of >> performance >> >> > >>> >> implications of each decision here. >> >> > >>> >> >> >> > >>> >> Anything wrong with that? >> >> > >>> >> >> >> > >>> >> Sergi >> >> > >>> >> >> >> > >>> >> On Thu, Jul 21, 2016 at 1:04 PM, Dmitriy Setrakyan < >> >> > >>> [email protected]> >> >> > >>> >> wrote: >> >> > >>> >> >> >> > >>> >> > Serj, are you asking what will happen as of today? Then the >> >> answer >> >> > >>> to all >> >> > >>> >> > your questions is that duplicate keys are not an issue, and >> >> Ignite >> >> > >>> always >> >> > >>> >> > operates in **upsert** mode (which is essentially a *“put(…)” >> >> > >>> *method). >> >> > >>> >> > >> >> > >>> >> > However, the *“insert”* that is suggested by Alex would >> delegate >> >> > to >> >> > >>> >> > *“putIfAbsent(…)”*, which in database world makes more sense. >> >> > >>> However, in >> >> > >>> >> > this case, the *“update”* syntax should delegate to >> >> > *“replace(…)”*, >> >> > >>> as >> >> > >>> >> > update should fail in case if a key is absent. >> >> > >>> >> > >> >> > >>> >> > Considering the above, a notion of “*upsert”* or “*merge” >> >> > *operation >> >> > >>> is >> >> > >>> >> > very much needed, as it will give a user an option to perform >> >> > >>> >> > “insert-or-update” in 1 call. >> >> > >>> >> > >> >> > >>> >> > Does this make sense? >> >> > >>> >> > >> >> > >>> >> > D. >> >> > >>> >> > >> >> > >>> >> > On Wed, Jul 20, 2016 at 9:39 PM, Sergi Vladykin < >> >> > >>> >> [email protected]> >> >> > >>> >> > wrote: >> >> > >>> >> > >> >> > >>> >> > > I'd prefer to do MERGE operation last because in H2 it is >> not >> >> > >>> standard >> >> > >>> >> > ANSI >> >> > >>> >> > > SQL MERGE. Or may be not implement it at all, or may be >> >> > contribute >> >> > >>> ANSI >> >> > >>> >> > > correct version to H2, then implement it on Ignite. Need to >> >> > >>> investigate >> >> > >>> >> > the >> >> > >>> >> > > semantics deeper before making any decisions here. >> >> > >>> >> > > >> >> > >>> >> > > Lets start with simple scenarios for INSERT and go through >> all >> >> > the >> >> > >>> >> > possible >> >> > >>> >> > > cases and answer the questions: >> >> > >>> >> > > - What will happen on key conflict in TX cache? >> >> > >>> >> > > - What will happen on key conflict in Atomic cache? >> >> > >>> >> > > - What will happen with the previous two if we use >> DataLoader? >> >> > >>> >> > > - How to make these operations efficient (it will be simple >> >> > enough >> >> > >>> to >> >> > >>> >> > > implement them with separate put/putIfAbsent operations but >> >> > >>> probably we >> >> > >>> >> > > will need some batching like putAllIfAbsent for >> efficiency)? >> >> > >>> >> > > >> >> > >>> >> > > As for API, we still will need to have a single entry point >> >> for >> >> > >>> all SQL >> >> > >>> >> > > queries/commands to allow any console work with it >> >> > transparently. >> >> > >>> It >> >> > >>> >> > would >> >> > >>> >> > > be great if we will be able to come up with something >> >> consistent >> >> > >>> with >> >> > >>> >> > this >> >> > >>> >> > > idea on public API. >> >> > >>> >> > > >> >> > >>> >> > > Sergi >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > >> >> > >>> >> > > On Wed, Jul 20, 2016 at 2:23 PM, Dmitriy Setrakyan < >> >> > >>> >> > > [email protected]> >> >> > >>> >> > > wrote: >> >> > >>> >> > > >> >> > >>> >> > > > Like the idea of merge and insert. I need more time to >> think >> >> > >>> about >> >> > >>> >> the >> >> > >>> >> > > API >> >> > >>> >> > > > changes. >> >> > >>> >> > > > >> >> > >>> >> > > > Sergi, what do you think? >> >> > >>> >> > > > >> >> > >>> >> > > > Dmitriy >> >> > >>> >> > > > >> >> > >>> >> > > > >> >> > >>> >> > > > >> >> > >>> >> > > > On Jul 20, 2016, at 12:36 PM, Alexander Paschenko < >> >> > >>> >> > > > [email protected]> wrote: >> >> > >>> >> > > > >> >> > >>> >> > > > >> Thus, I suggest that we implement MERGE as a separate >> >> > >>> operation >> >> > >>> >> > backed >> >> > >>> >> > > > by putIfAbsent operation, while INSERT will be >> implemented >> >> via >> >> > >>> put. >> >> > >>> >> > > > > >> >> > >>> >> > > > > Sorry, of course I meant that MERGE has to be >> put-based, >> >> > while >> >> > >>> >> INSERT >> >> > >>> >> > > > > has to be putIfAbsent-based. >> >> > >>> >> > > > > >> >> > >>> >> > > > > 2016-07-20 12:30 GMT+03:00 Alexander Paschenko >> >> > >>> >> > > > > <[email protected]>: >> >> > >>> >> > > > >> Hell Igniters, >> >> > >>> >> > > > >> >> >> > >>> >> > > > >> In this thread I would like to share and discuss some >> >> > >>> thoughts on >> >> > >>> >> > DML >> >> > >>> >> > > > >> operations' implementation, so let's start and keep it >> >> > here. >> >> > >>> >> > Everyone >> >> > >>> >> > > > >> is of course welcome to share their suggestions. >> >> > >>> >> > > > >> >> >> > >>> >> > > > >> For starters, I was thinking about semantics of >> INSERT. >> >> In >> >> > >>> >> > traditional >> >> > >>> >> > > > >> RDBMSs, INSERT works only for records whose primary >> keys >> >> > don't >> >> > >>> >> > > > >> conflict with those of records that are already >> >> persistent >> >> > - >> >> > >>> you >> >> > >>> >> > can't >> >> > >>> >> > > > >> try to insert the same key more than once because >> you'll >> >> > get >> >> > >>> an >> >> > >>> >> > error. >> >> > >>> >> > > > >> However, semantics of cache put is obviously >> different - >> >> it >> >> > >>> does >> >> > >>> >> not >> >> > >>> >> > > > >> have anything about duplicate keys, it just quietly >> >> updates >> >> > >>> values >> >> > >>> >> > in >> >> > >>> >> > > > >> case of keys' duplication. Still, cache has >> putIfAbsent >> >> > >>> operation >> >> > >>> >> > that >> >> > >>> >> > > > >> is closer to traditional notion of INSERT, and H2's >> SQL >> >> > >>> dialect >> >> > >>> >> has >> >> > >>> >> > > > >> MERGE operation which corresponds to semantics of >> cache >> >> > put. >> >> > >>> >> Thus, I >> >> > >>> >> > > > >> suggest that we implement MERGE as a separate >> operation >> >> > >>> backed by >> >> > >>> >> > > > >> putIfAbsent operation, while INSERT will be >> implemented >> >> via >> >> > >>> put. >> >> > >>> >> > > > >> >> >> > >>> >> > > > >> And one more, probably more important thing: I suggest >> >> > that we >> >> > >>> >> > create >> >> > >>> >> > > > >> separate class Update and corresponding operation >> >> update() >> >> > in >> >> > >>> >> > > > >> IgniteCache. The reasons are as follows: >> >> > >>> >> > > > >> >> >> > >>> >> > > > >> - Query bears some flags that are clearly redundant >> for >> >> > Update >> >> > >>> >> (page >> >> > >>> >> > > > >> size, locality) >> >> > >>> >> > > > >> - query() method in IgniteCache (one that accepts >> Query) >> >> > and >> >> > >>> >> query() >> >> > >>> >> > > > >> methods in GridQueryIndexing return iterators. So, if >> we >> >> > >>> strive to >> >> > >>> >> > > > >> leave interfaces unchanged, we still will introduce >> some >> >> > >>> design >> >> > >>> >> > > > >> ugliness like query methods returning empty iterators >> for >> >> > >>> certain >> >> > >>> >> > > > >> queries, and/or query flags that indicate whether >> it's an >> >> > >>> update >> >> > >>> >> > query >> >> > >>> >> > > > >> or not, etc. >> >> > >>> >> > > > >> - If some Queries are update queries, then continuous >> >> > queries >> >> > >>> >> can't >> >> > >>> >> > be >> >> > >>> >> > > > >> based on them - more design-wise ugly checks and stuff >> >> like >> >> > >>> that. >> >> > >>> >> > > > >> - I'm pretty sure there's more I don't know about. >> >> > >>> >> > > > >> >> >> > >>> >> > > > >> Comments and suggestions are welcome. Sergi Vladykin, >> >> > Dmitry >> >> > >>> >> > > > >> Setrakyan, your opinions are of particular interest, >> >> please >> >> > >>> >> advise. >> >> > >>> >> > > > >> >> >> > >>> >> > > > >> Regards, >> >> > >>> >> > > > >> Alex >> >> > >>> >> > > > >> >> > >>> >> > > >> >> > >>> >> > >> >> > >>> >> >> >> > >>> >> >> > >> >> >> > >> >> >> > >> >> >>
