Hi Paul, I'm still doubting the keyword for the SQL applications. SHOW QUERIES could imply that this will actually show the query, but we're returning IDs of the running application. At first I was also not very much in favour of SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax. Best regards, Martijn [1] https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary Op za 4 jun. 2022 om 10:38 schreef Paul Lam <[email protected]>: > Hi Godfrey, > > Sorry for the late reply, I was on vacation. > > It looks like we have a variety of preferences on the syntax, how about we > choose the most acceptable one? > > WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs > would be: > > - SHOW JOBS > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and > `table.job.stop-with-drain`) > > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR > JOB`: > > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job > manager remembers) > - DROP SAVEPOINT <savepoint_path> > > cc @Jark @ShengKai @Martijn @Timo . > > Best, > Paul Lam > > > godfrey he <[email protected]> 于2022年5月23日周一 21:34写道: > >> Hi Paul, >> >> Thanks for the update. >> >> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs >> (DataStream or SQL) or >> clients (SQL client or CLI). >> >> Is DataStream job a QUERY? I think not. >> For a QUERY, the most important concept is the statement. But the >> result does not contain this info. >> If we need to contain all jobs in the cluster, I think the name should >> be JOB or PIPELINE. >> I learn to SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id. >> >> > SHOW SAVEPOINTS >> To list the savepoint for a specific job, we need to specify a >> specific pipeline, >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id >> >> Best, >> Godfrey >> >> Paul Lam <[email protected]> 于2022年5月20日周五 11:25写道: >> > >> > Hi Jark, >> > >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the >> > beginning. The downside of it is that it’s not ANSI-SQL compatible. >> > >> > Another question is, what should be the syntax for ungracefully >> > canceling a query? As ShengKai pointed out in a offline discussion, >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users. >> > Flink CLI has both stop and cancel, mostly due to historical problems. >> > >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is >> > that savepoints are owned by users and beyond the lifecycle of a Flink >> > cluster. For example, a user might take a savepoint at a custom path >> > that’s different than the default savepoint path, I think jobmanager >> would >> > not remember that, not to mention the jobmanager may be a fresh new >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's >> > probably a best-effort one. >> > >> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint. >> > Savepoints are alias for nested transactions in DB area[1], and there’s >> > correspondingly global transactions. If we consider Flink jobs as >> > global transactions and Flink checkpoints as nested transactions, >> > then the savepoint semantics are close, thus I think savepoint syntax >> > in SQL-standard could be considered. But again, I’m don’t have very >> > strong preference. >> > >> > Ping @Timo to get more inputs. >> > >> > [1] https://en.wikipedia.org/wiki/Nested_transaction < >> https://en.wikipedia.org/wiki/Nested_transaction> >> > >> > Best, >> > Paul Lam >> > >> > > 2022年5月18日 17:48,Jark Wu <[email protected]> 写道: >> > > >> > > Hi Paul, >> > > >> > > 1) SHOW QUERIES >> > > +1 to add finished time, but it would be better to call it "end_time" >> to >> > > keep aligned with names in Web UI. >> > > >> > > 2) DROP QUERY >> > > I think we shouldn't throw exceptions for batch jobs, otherwise, how >> to >> > > stop batch queries? >> > > At present, I don't think "DROP" is a suitable keyword for this >> statement. >> > > From the perspective of users, "DROP" sounds like the query should be >> > > removed from the >> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is >> more >> > > suitable and >> > > compliant with commands of Flink CLI. >> > > >> > > 3) SHOW SAVEPOINTS >> > > I think this statement is needed, otherwise, savepoints are lost >> after the >> > > SAVEPOINT >> > > command is executed. Savepoints can be retrieved from REST API >> > > "/jobs/:jobid/checkpoints" >> > > with filtering "checkpoint_type"="savepoint". It's also worth >> considering >> > > providing "SHOW CHECKPOINTS" >> > > to list all checkpoints. >> > > >> > > 4) SAVEPOINT & RELEASE SAVEPOINT >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT >> statements >> > > now. >> > > In the vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are >> both >> > > the same savepoint id. >> > > However, in our syntax, the first one is query id, and the second one >> is >> > > savepoint path, which is confusing and >> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe >> they >> > > should be in the same syntax set. >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT >> > > <sp_path>. >> > > That means we don't follow the majority of vendors in SAVEPOINT >> commands. I >> > > would say the purpose is different in Flink. >> > > What other's opinion on this? >> > > >> > > Best, >> > > Jark >> > > >> > > [1]: >> > > >> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints >> > > >> > > >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <[email protected]> wrote: >> > > >> > >> Hi Godfrey, >> > >> >> > >> Thanks a lot for your inputs! >> > >> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs >> (DataStream >> > >> or SQL) or >> > >> clients (SQL client or CLI). Under the hook, it’s based on >> > >> ClusterClient#listJobs, the >> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs listed >> in SQL >> > >> client, because >> > >> these jobs can be managed via SQL client too. >> > >> >> > >> WRT finished time, I think you’re right. Adding it to the FLIP. But >> I’m a >> > >> bit afraid that the >> > >> rows would be too long. >> > >> >> > >> WRT ‘DROP QUERY’, >> > >>> What's the behavior for batch jobs and the non-running jobs? >> > >> >> > >> >> > >> In general, the behavior would be aligned with Flink CLI. Triggering >> a >> > >> savepoint for >> > >> a non-running job would cause errors, and the error message would be >> > >> printed to >> > >> the SQL client. Triggering a savepoint for batch(unbounded) jobs in >> > >> streaming >> > >> execution mode would be the same with streaming jobs. However, for >> batch >> > >> jobs in >> > >> batch execution mode, I think there would be an error, because batch >> > >> execution >> > >> doesn’t support checkpoints currently (please correct me if I’m >> wrong). >> > >> >> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink >> clusterClient/ >> > >> jobClient doesn’t have such a functionality at the moment, neither do >> > >> Flink CLI. >> > >> Maybe we could make it a follow-up FLIP, which includes the >> modifications >> > >> to >> > >> clusterClient/jobClient and Flink CLI. WDYT? >> > >> >> > >> Best, >> > >> Paul Lam >> > >> >> > >>> 2022年5月17日 20:34,godfrey he <[email protected]> 写道: >> > >>> >> > >>> Godfrey >> > >> >> > >> >> > >> >
