Sounds good to me. On Tue, Dec 4, 2018 at 12:06 PM Юрий <jury.gerzhedow...@gmail.com> wrote:
> Hey Igniters! > > I continue working on IEP-29 > < > https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring > > > in > part related to expose view with currently running queries. > > I checked how we track running queries right now and see that it's > complicated and spread by few classes process and we don't track all > running queries. Also there are internal queries which tracked within users > queries and can't be distinguished of user queries , e.g. map queries for > map-reduce queries or DML operation which required first step as select to > modify data. > > My proposal is extract logic for working with running queries information > to separate class, like RunningQueryManager. The class will track running > queries and will be single point to retrieve information about running > queries. Currently to keep information about running queries uses > GridRunningQueryInfo class. As of now it can't provide useful information > to distinguish internal queries and users query, so the class need to > extend to keep information about type of query and id of initial user query > for internal query to be able identify it. > > New RunningQueryManager should be used for all place where currently we > have tracking of running queries and added for all places which not covered > yet, it's mostly DDL and DML operations. > > After implement the proposed change we can simple expose SQL view for all > running queries on a local node. > Collecting information from all node in a cluster currently out of scope > the change and will be described for discussing later. > > Is there any objections for described proposal? > > чт, 29 нояб. 2018 г. в 09:03, Юрий <jury.gerzhedow...@gmail.com>: > > > Hi Alex, > > > > I've just started implement of the view. Thanks for the your efforts! > > > > ср, 28 нояб. 2018 г. в 19:00, Alex Plehanov <plehanov.a...@gmail.com>: > > > >> Yuriy, > >> > >> If you have plans to implement running queries view in the nearest > future, > >> I already have implemented draft for local node queries some time ago > [1]. > >> Maybe it will help to implement a view for whole cluster queries. > >> > >> [1]: > >> > >> > https://github.com/alex-plekhanov/ignite/commit/6231668646a2b0f848373eb4e9dc38d127603e43 > >> > >> > >> ср, 28 нояб. 2018 г. в 17:34, Vladimir Ozerov <voze...@gridgain.com>: > >> > >> > Denis > >> > > >> > I would wait for running queries view first. > >> > > >> > ср, 28 нояб. 2018 г. в 1:57, Denis Magda <dma...@apache.org>: > >> > > >> > > Vladimir, > >> > > > >> > > Please see inline > >> > > > >> > > On Mon, Nov 19, 2018 at 8:23 AM Vladimir Ozerov < > voze...@gridgain.com > >> > > >> > > wrote: > >> > > > >> > > > Denis, > >> > > > > >> > > > I partially agree with you. But there are several problem with > >> syntax > >> > > > proposed by you: > >> > > > 1) This is harder to implement technically - more parsing logic to > >> > > > implement. Ok, this is our internal problem, users do not care > >> about it > >> > > > 2) User will have to consult to docs in any case > >> > > > > >> > > > >> > > Two of these are not a big deal. We just need to invest more time > for > >> > > development and during the design phase so that people need to > consult > >> > the > >> > > docs rarely. > >> > > > >> > > > >> > > > 3) "nodeId" is not really node ID. For Ignite users node ID is > >> UUID. In > >> > > our > >> > > > case this is node order, and we intentionally avoided any naming > >> here. > >> > > > > >> > > > >> > > Let's use a more loose name such as "node". > >> > > > >> > > > >> > > > Query is just identified by a string, no more than that > >> > > > 4) Proposed syntax is more verbose and open ways for misuse. E.g. > >> what > >> > is > >> > > > "KILL QUERY WHERE queryId=1234"? > >> > > > > >> > > > I am not 100% satisfied with both variants, but the first one > looks > >> > > simpler > >> > > > to me. Remember, that user will not guess query ID. Instead, he > will > >> > get > >> > > > the list of running queries with some other syntax. What we need > to > >> > > > understand for now is how this syntax will look like. I think that > >> we > >> > > > should implement getting list of running queries, and only then > >> start > >> > > > working on cancellation. > >> > > > > >> > > > >> > > That's a good point. Syntax of both running and killing queires > >> commands > >> > > should be tightly coupled. We're going to name a column if running > >> > queries > >> > > IDs somehow anyway and that name might be resued in the WHERE clause > >> of > >> > > KILL. > >> > > > >> > > Should we discuss the syntax in a separate thread? > >> > > > >> > > -- > >> > > Denis > >> > > > >> > > > > >> > > > Vladimir. > >> > > > > >> > > > > >> > > > On Mon, Nov 19, 2018 at 7:02 PM Denis Mekhanikov < > >> > dmekhani...@gmail.com> > >> > > > wrote: > >> > > > > >> > > > > Guys, > >> > > > > > >> > > > > Syntax like *KILL QUERY '25.1234'* look a bit cryptic to me. > >> > > > > I'm going to look up in documentation, which parameter goes > first > >> in > >> > > this > >> > > > > query every time I use it. > >> > > > > I like the syntax, that Igor suggested more. > >> > > > > Will it be better if we make *nodeId* and *queryId *named > >> properties? > >> > > > > > >> > > > > Something like this: > >> > > > > KILL QUERY WHERE nodeId=25 and queryId=1234 > >> > > > > > >> > > > > Denis > >> > > > > > >> > > > > пт, 16 нояб. 2018 г. в 14:12, Юрий <jury.gerzhedow...@gmail.com > >: > >> > > > > > >> > > > > > I fully agree with last sentences and can start to implement > >> this > >> > > part. > >> > > > > > > >> > > > > > Guys, thanks for your productive participate at discussion. > >> > > > > > > >> > > > > > пт, 16 нояб. 2018 г. в 2:53, Denis Magda <dma...@apache.org>: > >> > > > > > > >> > > > > > > Vladimir, > >> > > > > > > > >> > > > > > > Thanks, make perfect sense to me. > >> > > > > > > > >> > > > > > > > >> > > > > > > On Thu, Nov 15, 2018 at 12:18 AM Vladimir Ozerov < > >> > > > voze...@gridgain.com > >> > > > > > > >> > > > > > > wrote: > >> > > > > > > > >> > > > > > > > Denis, > >> > > > > > > > > >> > > > > > > > The idea is that QueryDetailMetrics will be exposed > through > >> > > > separate > >> > > > > > > > "historical" SQL view in addition to current API. So we > are > >> on > >> > > the > >> > > > > same > >> > > > > > > > page here. > >> > > > > > > > > >> > > > > > > > As far as query ID I do not see any easy way to operate > on a > >> > > single > >> > > > > > > integer > >> > > > > > > > value (even 64bit). This is distributed system - we do not > >> want > >> > > to > >> > > > > have > >> > > > > > > > coordination between nodes to get query ID. And > >> coordination is > >> > > the > >> > > > > > only > >> > > > > > > > possible way to get sexy "long". Instead, I would propose > to > >> > form > >> > > > ID > >> > > > > > from > >> > > > > > > > node order and query counter within node. This will be > (int, > >> > > long) > >> > > > > > pair. > >> > > > > > > > For use convenience we may convert it to a single string, > >> e.g. > >> > > > > > > > "[node_order].[query_counter]". Then the syntax would be: > >> > > > > > > > > >> > > > > > > > KILL QUERY '25.1234'; // Kill query 1234 on node 25 > >> > > > > > > > KILL QUERY '25.*; // Kill all queries on the node 25 > >> > > > > > > > > >> > > > > > > > Makes sense? > >> > > > > > > > > >> > > > > > > > Vladimir. > >> > > > > > > > > >> > > > > > > > On Wed, Nov 14, 2018 at 1:25 PM Denis Magda < > >> dma...@apache.org > >> > > > >> > > > > wrote: > >> > > > > > > > > >> > > > > > > > > Yury, > >> > > > > > > > > > >> > > > > > > > > As I understand you mean that the view should contains > >> both > >> > > > running > >> > > > > > and > >> > > > > > > > > > finished queries. If be honest for the view I was > going > >> to > >> > > use > >> > > > > just > >> > > > > > > > > queries > >> > > > > > > > > > running right now. For finished queries I thought > about > >> > > another > >> > > > > > view > >> > > > > > > > with > >> > > > > > > > > > another set of fields which should include I/O related > >> > ones. > >> > > Is > >> > > > > it > >> > > > > > > > works? > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Got you, so if only running queries are there then your > >> > initial > >> > > > > > > proposal > >> > > > > > > > > makes total sense. Not sure we need a view of the > finished > >> > > > queries. > >> > > > > > It > >> > > > > > > > will > >> > > > > > > > > be possible to analyze them through the updated > >> > DetailedMetrics > >> > > > > > > approach, > >> > > > > > > > > won't it? > >> > > > > > > > > > >> > > > > > > > > For "KILL QUERY node_id query_id" node_id required as > >> part > >> > of > >> > > > > unique > >> > > > > > > key > >> > > > > > > > > > of query and help understand Ignite which node start > the > >> > > > > > distributed > >> > > > > > > > > query. > >> > > > > > > > > > Use both parameters will allow cheap generate unique > key > >> > > across > >> > > > > all > >> > > > > > > > > nodes. > >> > > > > > > > > > Node which started a query can cancel it on all nodes > >> > > > participate > >> > > > > > > > nodes. > >> > > > > > > > > > So, to stop any queries initially we need just send > the > >> > > cancel > >> > > > > > > request > >> > > > > > > > to > >> > > > > > > > > > node who started the query. This mechanism is already > in > >> > > > Ignite. > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Can we locate node_id behind the scenes if the user > >> supplies > >> > > > > query_id > >> > > > > > > > only? > >> > > > > > > > > A query record in the view already contains query_id and > >> > > node_id > >> > > > > and > >> > > > > > it > >> > > > > > > > > sounds like an extra work for the user to fill in all > the > >> > > details > >> > > > > for > >> > > > > > > us. > >> > > > > > > > > Embed node_id into query_id if you'd like to avoid extra > >> > > network > >> > > > > hops > >> > > > > > > for > >> > > > > > > > > query_id to node_id mapping. > >> > > > > > > > > > >> > > > > > > > > -- > >> > > > > > > > > Denis > >> > > > > > > > > > >> > > > > > > > > On Wed, Nov 14, 2018 at 1:04 AM Юрий < > >> > > > jury.gerzhedow...@gmail.com> > >> > > > > > > > wrote: > >> > > > > > > > > > >> > > > > > > > > > Denis, > >> > > > > > > > > > > >> > > > > > > > > > Under the hood 'time' will be as startTime, but for > >> system > >> > > > view I > >> > > > > > > > planned > >> > > > > > > > > > use duration which will be simple calculated as now - > >> > > > startTime. > >> > > > > > So, > >> > > > > > > > > there > >> > > > > > > > > > is't a performance issue. > >> > > > > > > > > > As I understand you mean that the view should contains > >> both > >> > > > > running > >> > > > > > > and > >> > > > > > > > > > finished queries. If be honest for the view I was > going > >> to > >> > > use > >> > > > > just > >> > > > > > > > > queries > >> > > > > > > > > > running right now. For finished queries I thought > about > >> > > another > >> > > > > > view > >> > > > > > > > with > >> > > > > > > > > > another set of fields which should include I/O related > >> > ones. > >> > > Is > >> > > > > it > >> > > > > > > > works? > >> > > > > > > > > > > >> > > > > > > > > > For "KILL QUERY node_id query_id" node_id required as > >> part > >> > > of > >> > > > > > unique > >> > > > > > > > key > >> > > > > > > > > > of query and help understand Ignite which node start > the > >> > > > > > distributed > >> > > > > > > > > query. > >> > > > > > > > > > Use both parameters will allow cheap generate unique > key > >> > > across > >> > > > > all > >> > > > > > > > > nodes. > >> > > > > > > > > > Node which started a query can cancel it on all nodes > >> > > > participate > >> > > > > > > > nodes. > >> > > > > > > > > > So, to stop any queries initially we need just send > the > >> > > cancel > >> > > > > > > request > >> > > > > > > > to > >> > > > > > > > > > node who started the query. This mechanism is already > in > >> > > > Ignite. > >> > > > > > > > > > > >> > > > > > > > > > Native SQL APIs will automatically support the futures > >> > after > >> > > > > > > > implementing > >> > > > > > > > > > for thin clients. So we are good here. > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > вт, 13 нояб. 2018 г. в 18:52, Denis Magda < > >> > dma...@apache.org > >> > > >: > >> > > > > > > > > > > >> > > > > > > > > > > Yury, > >> > > > > > > > > > > > >> > > > > > > > > > > Please consider the following: > >> > > > > > > > > > > > >> > > > > > > > > > > - If we record the duration instead of startTime, > >> then > >> > > the > >> > > > > > > former > >> > > > > > > > > has > >> > > > > > > > > > to > >> > > > > > > > > > > be updated frequently - sounds like a performance > >> red > >> > > > flag. > >> > > > > > > Should > >> > > > > > > > > we > >> > > > > > > > > > > store > >> > > > > > > > > > > startTime and endTime instead? This way a query > >> record > >> > > > will > >> > > > > be > >> > > > > > > > > updated > >> > > > > > > > > > > twice - when the query is started and terminated. > >> > > > > > > > > > > - In the IEP you've mentioned I/O related fields > >> that > >> > > > should > >> > > > > > > help > >> > > > > > > > to > >> > > > > > > > > > > grasp why a query runs that slow. Should they be > >> > stored > >> > > in > >> > > > > > this > >> > > > > > > > > view? > >> > > > > > > > > > > - "KILL QUERY query_id" is more than enough. > Let's > >> not > >> > > add > >> > > > > > > > "node_id" > >> > > > > > > > > > > unless it's absolutely required. Our queries are > >> > > > distributed > >> > > > > > and > >> > > > > > > > > > > executed > >> > > > > > > > > > > across several nodes that's why the node_id > >> parameter > >> > is > >> > > > > > > > redundant. > >> > > > > > > > > > > - This API needs to be supported across all our > >> > > > interfaces. > >> > > > > We > >> > > > > > > can > >> > > > > > > > > > start > >> > > > > > > > > > > with JDBC/ODBC and thin clients and then support > >> for > >> > the > >> > > > > > native > >> > > > > > > > SQL > >> > > > > > > > > > APIs > >> > > > > > > > > > > (Java, Net, C++) > >> > > > > > > > > > > - Please share examples of SELECTs in the IEP > that > >> > would > >> > > > > show > >> > > > > > > how > >> > > > > > > > to > >> > > > > > > > > > > find long running queries, queries that cause a > >> lot of > >> > > I/O > >> > > > > > > > troubles. > >> > > > > > > > > > > > >> > > > > > > > > > > -- > >> > > > > > > > > > > Denis > >> > > > > > > > > > > > >> > > > > > > > > > > On Tue, Nov 13, 2018 at 1:15 AM Юрий < > >> > > > > > jury.gerzhedow...@gmail.com> > >> > > > > > > > > > wrote: > >> > > > > > > > > > > > >> > > > > > > > > > > > Igniters, > >> > > > > > > > > > > > > >> > > > > > > > > > > > Some comments for my original email's. > >> > > > > > > > > > > > > >> > > > > > > > > > > > The proposal related to part of IEP-29 > >> > > > > > > > > > > > < > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring > >> > > > > > > > > > > > > > >> > > > > > > > > > > > . > >> > > > > > > > > > > > > >> > > > > > > > > > > > What purpose are we pursuing of the proposal? > >> > > > > > > > > > > > We want to be able check which queries running > right > >> > now > >> > > > > > through > >> > > > > > > > thin > >> > > > > > > > > > > > clients. Get some information related to the > queries > >> > and > >> > > be > >> > > > > > able > >> > > > > > > to > >> > > > > > > > > > > cancel > >> > > > > > > > > > > > a query if it required for some reasons. > >> > > > > > > > > > > > So, we need interface to get a running queries. > For > >> the > >> > > > goal > >> > > > > we > >> > > > > > > > > propose > >> > > > > > > > > > > > running_queries system view. The view contains > >> unique > >> > > query > >> > > > > > > > > identifier > >> > > > > > > > > > > > which need to pass to kill query command to cancel > >> the > >> > > > query. > >> > > > > > > > > > > > > >> > > > > > > > > > > > What do you think about fields of the running > >> queries > >> > > view? > >> > > > > May > >> > > > > > > be > >> > > > > > > > > some > >> > > > > > > > > > > > useful fields we could easy add to the view. > >> > > > > > > > > > > > > >> > > > > > > > > > > > Also let's discuss syntax of cancellation of > query. > >> I > >> > > > propose > >> > > > > > to > >> > > > > > > > use > >> > > > > > > > > > > MySQL > >> > > > > > > > > > > > like syntax as easy to understand and shorter then > >> > Oracle > >> > > > and > >> > > > > > > > > Postgres > >> > > > > > > > > > > > syntax ( detailed information in IEP-29 > >> > > > > > > > > > > > < > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring > >> > > > > > > > > > > > > > >> > > > > > > > > > > > ). > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > пн, 12 нояб. 2018 г. в 19:28, Юрий < > >> > > > > > jury.gerzhedow...@gmail.com > >> > > > > > > >: > >> > > > > > > > > > > > > >> > > > > > > > > > > > > Igniters, > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Below is a proposed design for thin client SQL > >> > > management > >> > > > > and > >> > > > > > > > > > > monitoring > >> > > > > > > > > > > > > to cancel a queries. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > 1) Ignite expose system SQL view with name > >> > > > > *running_queries* > >> > > > > > > > > > > > > proposed columns: *node_id, query_id, sql, > >> > schema_name, > >> > > > > > > > > > connection_id, > >> > > > > > > > > > > > > duration*. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > node_id - initial node of request > >> > > > > > > > > > > > > query_id - unique id of query on node > >> > > > > > > > > > > > > sql - text of query > >> > > > > > > > > > > > > schema name - name of sql schema > >> > > > > > > > > > > > > connection_id - id of client connection from > >> > > > > > > > > > > > ClientListenerConnectionContext > >> > > > > > > > > > > > > class > >> > > > > > > > > > > > > duration - duration in millisecond from start of > >> > query > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Ignite will gather info about running queries > from > >> > each > >> > > > of > >> > > > > > > nodes > >> > > > > > > > > and > >> > > > > > > > > > > > > collect it during user query. We already have > >> most of > >> > > the > >> > > > > > > > > information > >> > > > > > > > > > > at > >> > > > > > > > > > > > GridRunningQueryInfo > >> > > > > > > > > > > > > on each of nodes. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Instead of duration we can use start_time, but I > >> > think > >> > > > > > duration > >> > > > > > > > > will > >> > > > > > > > > > be > >> > > > > > > > > > > > > simple to use due to it not depend on a > timezone. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > 2) Propose to use following syntax to kill a > >> running > >> > > > query: > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > *KILL QUERY node_Id query_id* > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Both parameters node_id and query_id can be get > >> > through > >> > > > > > > > > > running_queries > >> > > > > > > > > > > > > system view. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > When a node receive such request it can be run > >> > locally > >> > > in > >> > > > > > case > >> > > > > > > > node > >> > > > > > > > > > > have > >> > > > > > > > > > > > > given node_id or send message to node with given > >> id. > >> > > > > Because > >> > > > > > > node > >> > > > > > > > > > have > >> > > > > > > > > > > > > information about local running queries then can > >> > cancel > >> > > > it > >> > > > > - > >> > > > > > it > >> > > > > > > > > > already > >> > > > > > > > > > > > > implemented in > >> > > > GridReduceQueryExecutor.cancelQueries(qryId) > >> > > > > > > > method. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Comments are welcome. > >> > > > > > > > > > > > > -- > >> > > > > > > > > > > > > Живи с улыбкой! :D > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > -- > >> > > > > > > > > > > > Живи с улыбкой! :D > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > -- > >> > > > > > > > > > Живи с улыбкой! :D > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > -- > >> > > > > > Живи с улыбкой! :D > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > > > > > > -- > > Живи с улыбкой! :D > > > > > -- > Живи с улыбкой! :D >