Hi Paul,

I think this FLIP is already in good shape. I just left some additional
thoughts:

*1) the display of savepoint_path*
Could the displayed savepoint_path include the scheme part?
E.g. `hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab`
IIUC, the scheme part is omitted when it's a local filesystem.
But the behavior would be clearer if the scheme part were included in the
design doc.

*2) Please make a decision where the FLIP lists multiple options.*
Otherwise it might give the impression that we will support all of them.

*3) +1 SAVEPOINT and RELEASE SAVEPOINT*
Personally, I also prefer "SAVEPOINT <query_id>" and "RELEASE SAVEPOINT
<savepoint_path>"
to "CREATE/DROP SAVEPOINT", as they have been used in mature databases.
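To make the intended flow concrete, here is a quick sketch (treating ids and
paths as string literals; the exact syntax is of course still open):

```sql
-- Trigger a savepoint for a running query, identified by its query id
SAVEPOINT '0f6413c33757fbe0277897dd94485f04';

-- Dispose of the savepoint once it is no longer needed
RELEASE SAVEPOINT 'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
```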

*4) +1 SHOW QUERIES*
Btw, the displayed column "address" is a little confusing to me.
At first glance, I'm not sure which address it is: the JM RPC address, the
JM REST address, or the Gateway address?
If this is a link to the job's web UI, how about calling it "web_url"
and displaying it in
"http://<hostname>:<port>" format?
Besides, how about displaying "startTime" or "uptime" as well?
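For example, the result could then look something like this (the web_url and
uptime columns are only my suggestion, not the FLIP's current design):

```
SHOW QUERIES;

+----------------------------------+-------------+---------+--------------------------+--------+
|             query_id             | query_name  | status  |         web_url          | uptime |
+----------------------------------+-------------+---------+--------------------------+--------+
| 0f6413c33757fbe0277897dd94485f04 | insert-sink | RUNNING | http://<hostname>:<port> | 2h 15m |
+----------------------------------+-------------+---------+--------------------------+--------+
```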

*5) STOP/CANCEL QUERY vs DROP QUERY*
I'm +1 to DROP, because it's more compliant with SQL standard naming, i.e.,
"SHOW/CREATE/DROP".
Separating STOP and CANCEL would confuse users about the differences
between them.
I'm +1 to adding the "PURGE" keyword to the DROP QUERY statement, which
indicates stopping the query without a savepoint.
Note that PURGE doesn't mean stopping with the --drain flag. The drain flag
flushes all the registered timers
and windows, which could lead to incorrect results when the job is resumed.
I think the drain flag is rarely used
(please correct me if I'm wrong), therefore I suggest moving this feature
into future work until the needs are clear.
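To illustrate the proposed semantics (a sketch only, assuming DROP QUERY is
the chosen statement):

```sql
-- Stop the query gracefully, taking a savepoint first
DROP QUERY '0f6413c33757fbe0277897dd94485f04';

-- Stop the query immediately, without a savepoint (and without --drain)
DROP QUERY '0f6413c33757fbe0277897dd94485f04' PURGE;
```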

*6) Table API*
I think it makes sense to support the new statements in Table API.
We should keep the Gateway and CLI simple: they just forward
statements to the underlying TableEnvironment.
JAR statements are being re-implemented in Table API as well, see
FLIP-214[1].

*7) <query_id> and <savepoint_path> should be quoted*
All the <query_id> and <savepoint_path> parameters should be string
literals; otherwise they are hard to parse.
For example, STOP QUERY '<query_id>'.

*8) Examples*
Could you add an example that covers all the statements, to show how to
manage the full lifecycle of a query?
It should include showing queries, creating a savepoint, removing a
savepoint, stopping a query with a savepoint, and restarting a query from
a savepoint.
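For instance, a walkthrough along these lines (statement names follow my
preferences above and are not final; `execution.savepoint.path` is the
existing option for restoring from a savepoint, and sink_table/source_table
are placeholders):

```sql
-- 1. List the running queries to find the target query id
SHOW QUERIES;

-- 2. Trigger a savepoint for the query
SAVEPOINT '0f6413c33757fbe0277897dd94485f04';

-- 3. Stop the query without taking another savepoint
DROP QUERY '0f6413c33757fbe0277897dd94485f04' PURGE;

-- 4. Restart the query from the savepoint
SET 'execution.savepoint.path' =
  'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
INSERT INTO sink_table SELECT * FROM source_table;

-- 5. Remove the savepoint once it is no longer needed
RELEASE SAVEPOINT 'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
```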

Best,
Jark

[1]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL?src=contextnavpagetreemode


On Fri, 6 May 2022 at 19:13, Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Paul,
>
> Great that you could find something in the SQL standard! I'll try to read
> the FLIP once more completely next week to see if I have any more concerns.
>
> Best regards,
>
> Martijn
>
> On Fri, 6 May 2022 at 08:21, Paul Lam <paullin3...@gmail.com> wrote:
>
>> I had a look at SQL:2016 that Martijn mentioned, and found that
>> maybe we could follow the transaction savepoint syntax.
>>
>>    - SAVEPOINT <savepoint specifier>
>>    - RELEASE SAVEPOINT <savepoint specifier>
>>
>> These savepoint statements are supported in lots of databases, like
>> Oracle[1], PG[2], MariaDB[3].
>>
>> They’re usually used in the middle of a SQL transaction, so the target
>> would be the current transaction. But if used in a Flink SQL session, we
>> need to add a JOB/QUERY id when creating a savepoint, thus the syntax
>> would be:
>>
>>    - SAVEPOINT <job/query id> <savepoint path>
>>    - RELEASE SAVEPOINT <savepoint path>
>>
>> I’m adding it as an alternative in the FLIP.
>>
>> [1]
>> https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm
>> [2] https://www.postgresql.org/docs/current/sql-savepoint.html
>> [3] https://mariadb.com/kb/en/savepoint/
>>
>> Best,
>> Paul Lam
>>
>> On May 4, 2022, at 16:42, Paul Lam <paullin3...@gmail.com> wrote:
>>
>> Hi Shengkai,
>>
>> Thanks a lot for your input!
>>
>> > I just wonder how the users can get the web UI in application mode.
>> > Therefore, it'd be better if we could list the Web UI using the SHOW
>> > statement. WDYT?
>>
>> I think it's a valid approach. I'm adding it to the FLIP.
>>
>> > After the investigation, I am fine with the QUERY but the keyword JOB is
>> also okay to me.
>>
>> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
>> where the former shows the actively running queries and the latter shows
>> background tasks like schema changes. FYI.
>>
>> WRT the questions:
>>
>> > 1. Could you add some details about the behaviour with the different
>> execution.target, e.g. session, application mode?
>>
>> IMHO, the difference between the `execution.target` values is mostly about
>> cluster startup, which has little to do with the proposed statements.
>> These statements rely on the current ClusterClient/JobClient API,
>> which is deployment mode agnostic. Canceling a job in an application
>> cluster is the same as in a session cluster.
>>
>> BTW, application mode is still under development ATM [3].
>>
>> > 2. Considering the SQL Client/Gateway is not limited to submitting the
>> job
>> to the specified cluster, is it able to list jobs in the other clusters?
>>
>> I think multi-cluster support in SQL Client/Gateway should be aligned with
>> CLI, at least in the early phase. We may use SET to set a cluster id for a
>> session, then we have access to the cluster. However, every SHOW
>> statement would only involve one cluster.
>>
>> Best,
>> Paul Lam
>>
>> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html
>> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs
>> [3] https://issues.apache.org/jira/browse/FLINK-26541
>>
>> Shengkai Fang <fskm...@gmail.com> wrote on Fri, Apr 29, 2022 at 15:36:
>>
>>> Hi.
>>>
>>> Thanks for Paul's update.
>>>
>>> > It would be better if we could also get the info about the cluster
>>> > where the job is running through the DESCRIBE statement.
>>>
>>> I just wonder how the users can get the web UI in application mode.
>>> Therefore, it'd be better if we could list the Web UI using the SHOW
>>> statement. WDYT?
>>>
>>>
>>> > QUERY or other keywords.
>>>
>>> I list the statements used to manage the lifecycle of queries/DMLs in
>>> other systems:
>>>
>>> MySQL[1] allows users to run SHOW [FULL] PROCESSLIST and use the KILL
>>> command to kill a query.
>>>
>>> ```
>>> mysql> SHOW PROCESSLIST;
>>>
>>> mysql> KILL 27;
>>> ```
>>>
>>>
>>> Postgres uses the following statements to kill queries.
>>>
>>> ```
>>> SELECT pg_cancel_backend(<pid of the process>)
>>>
>>> SELECT pg_terminate_backend(<pid of the process>)
>>> ```
>>>
>>> KSQL uses the following commands to control the query lifecycle[4].
>>>
>>> ```
>>> SHOW QUERIES;
>>>
>>> TERMINATE <query id>;
>>>
>>> ```
>>>
>>> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html
>>> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/
>>> [3]
>>>
>>> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql
>>> [4]
>>>
>>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>> [5]
>>>
>>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/
>>>
>>> After the investigation, I am fine with QUERY, but the keyword JOB is
>>> also okay to me.
>>>
>>> We also have two questions here.
>>>
>>> 1. Could you add some details about the behaviour with the different
>>> execution.target, e.g. session, application mode?
>>>
>>> 2. Considering the SQL Client/Gateway is not limited to submitting the
>>> job
>>> to the specified cluster, is it able to list jobs in the other clusters?
>>>
>>>
>>> Best,
>>> Shengkai
>>>
>>> Paul Lam <paullin3...@gmail.com> wrote on Thu, Apr 28, 2022 at 17:17:
>>>
>>> > Hi Martijn,
>>> >
>>> > Thanks a lot for your reply! I agree that the scope may be a bit
>>> > confusing; please let me clarify.
>>> >
>>> > The FLIP aims to add new SQL statements that are supported only in
>>> > sql-client, similar to
>>> > jar statements [1]. Jar statements can be parsed into jar operations,
>>> > which are used only in
>>> > CliClient in the sql-client module and cannot be executed by
>>> > TableEnvironment (so they are not available in the
>>> > Table API programs that contain SQL, which you mentioned).
>>> >
>>> > WRT the unchanged CLI client, I mean CliClient instead of the
>>> > sql-client module, which
>>> > currently contains the gateway code (e.g. Executor). The FLIP mainly
>>> > extends
>>> > the gateway part, and barely touches CliClient and the REST server (REST
>>> > endpoint in FLIP-91).
>>> >
>>> > WRT the syntax, I don't have much experience with SQL standards, and
>>> > I'd like to hear
>>> > more opinions from the community. I prefer Hive-style syntax because I
>>> > think many users
>>> > are familiar with Hive, and there are ongoing efforts to improve
>>> > Flink-Hive integration [2][3].
>>> > But my preference is not strong; I'm okay with other options too. Do
>>> > you think JOB/TASK is
>>> > a good choice, or do you have other preferred keywords?
>>> >
>>> > [1]
>>> >
>>> >
>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/
>>> > [2]
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
>>> > [3]
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint
>>> >
>>> > Best,
>>> > Paul Lam
>>> >
>>> > Martijn Visser <martijnvis...@apache.org> wrote on Tue, Apr 26, 2022 at 20:14:
>>> >
>>> > > Hi Paul,
>>> > >
>>> > > Thanks for creating the FLIP and opening the discussion. I did get a
>>> bit
>>> > > confused about the title, being "query lifecycle statements in SQL
>>> > client".
>>> > > This sounds like you want to adapt the SQL client, but you want to
>>> > > expand
>>> > > the SQL syntax with lifecycle statements, which could be used from
>>> the
>>> > SQL
>>> > > client, but of course also in a Table API program that contains SQL.
>>> > Given
>>> > > that you're highlighting the CLI client as unchanged, this adds to
>>> more
>>> > > confusion.
>>> > >
>>> > > I am interested if there's anything listed in the SQL 2016 standard
>>> on
>>> > > these types of lifecycle statements. I did a quick scan for "SHOW
>>> > QUERIES"
>>> > > but couldn't find it. It would be great if we could stay as close as
>>> > > possible to such syntax. Overall I'm not in favour of using QUERIES
>>> as a
>>> > > keyword. I think Flink applications are not queries, but short- or
>>> > > long-running applications. Why should we follow Hive's setup and indeed
>>> not
>>> > > others such as Snowflake, but also Postgres or MySQL?
>>> > >
>>> > > Best regards,
>>> > >
>>> > > Martijn Visser
>>> > > https://twitter.com/MartijnVisser82
>>> > > https://github.com/MartijnVisser
>>> > >
>>> > >
>>> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <paullin3...@gmail.com>
>>> wrote:
>>> > >
>>> > > > Hi Shengkai,
>>> > > >
>>> > > > Thanks a lot for your opinions!
>>> > > >
>>> > > > > 1. I think the keyword QUERY may confuse users because the
>>> > > > > statement also
>>> > > > > works for DML statements.
>>> > > >
>>> > > > I slightly lean to QUERY, because:
>>> > > >
>>> > > > Hive calls DMLs queries. We could be better aligned with Hive using
>>> > > > QUERY,
>>> > > > especially given that we plan to introduce a Hive endpoint.
>>> > > > QUERY is a more SQL-like concept and friendly to SQL users.
>>> > > >
>>> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB,
>>> but
>>> > not
>>> > > > very good with TASK, as it conflicts with the task concept in Flink
>>> > > runtime.
>>> > > >
>>> > > > We could wait for more feedback from the community.
>>> > > >
>>> > > > > 2. STOP/CANCEL is not a very straightforward way for SQL users to
>>> > > > > terminate their jobs.
>>> > > >
>>> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
>>> > > > might be an alternative.
>>> > > >
>>> > > > > 3. I think the CREATE/DROP SAVEPOINT statements are more SQL-like.
>>> > > >
>>> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax
>>> on
>>> > the
>>> > > > FLIP.
>>> > > >
>>> > > > > 4. SHOW TASKS can just list the job id and use DESCRIBE to get
>>> > > > > more detailed job info.
>>> > > >
>>> > > > That is a more SQL-like approach, I think. But considering the
>>> > > > ClusterClient APIs, we can fetch the names and the statuses in one
>>> > > > request,
>>> > > > thus it may be more user-friendly to return them all in the SHOW
>>> > > > statement?
>>> > > >
>>> > > > > It would be better if we could also get the info about the
>>> > > > > cluster where the job is running through the DESCRIBE statement.
>>> > > >
>>> > > > I think cluster info could be part of session properties instead.
>>> WDYT?
>>> > > >
>>> > > > Best,
>>> > > > Paul Lam
>>> > > >
>>> > > > > On Apr 22, 2022, at 11:14, Shengkai Fang <fskm...@gmail.com> wrote:
>>> > > > >
>>> > > > > Hi Paul
>>> > > > >
>>> > > > > Sorry for the late response. I propose my thoughts here.
>>> > > > >
>>> > > > > 1. I think the keyword QUERY may confuse users because the
>>> > > > > statement also
>>> > > > > works for DML statements. I find that Snowflake[1] supports
>>> > > > >
>>> > > > > - CREATE TASK
>>> > > > > - DROP TASK
>>> > > > > - ALTER TASK
>>> > > > > - SHOW TASKS
>>> > > > > - DESCRIBE TASK
>>> > > > >
>>> > > > > I think we can follow Snowflake and use `TASK` as the keyword, or
>>> > > > > use the keyword `JOB`?
>>> > > > >
>>> > > > > 2. STOP/CANCEL is not a very straightforward way for SQL users to
>>> > > > > terminate their jobs.
>>> > > > >
>>> > > > > ```
>>> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcibly stop the job
>>> > > > > with drain
>>> > > > >
>>> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
>>> > > > > ```
>>> > > > >
>>> > > > > Oracle[2] uses PURGE to clean up the table so that users cannot
>>> > > > > recover it.
>>> > > > > I think it also works for us to terminate the job permanently.
>>> > > > >
>>> > > > > 3. I think the CREATE/DROP SAVEPOINT statements are more SQL-like.
>>> > > > > Users can use the
>>> > > > >
>>> > > > > ```
>>> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
>>> > > > >  SET 'state.savepoints.format' = 'native';
>>> > > > >  CREATE SAVEPOINT <job id>;
>>> > > > >
>>> > > > >  DROP SAVEPOINT <path_to_savepoint>;
>>> > > > > ```
>>> > > > >
>>> > > > > 4. SHOW TASKS can just list the job id and use DESCRIBE to get
>>> > > > > more detailed job info.
>>> > > > >
>>> > > > > ```
>>> > > > >
>>> > > > > SHOW TASKS;
>>> > > > >
>>> > > > >
>>> > > > > +----------------------------------+
>>> > > > > |            job_id                |
>>> > > > > +----------------------------------+
>>> > > > > | 0f6413c33757fbe0277897dd94485f04 |
>>> > > > > +----------------------------------+
>>> > > > >
>>> > > > > DESCRIBE TASK <job id>;
>>> > > > >
>>> > > > > +------------------------
>>> > > > > |  job name   | status  |
>>> > > > > +------------------------
>>> > > > > | insert-sink | running |
>>> > > > > +------------------------
>>> > > > >
>>> > > > > ```
>>> > > > > It would be better if we could also get the info about the
>>> > > > > cluster where the job is running through the DESCRIBE statement.
>>> > > > >
>>> > > > >
>>> > > > > [1]
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management
>>> > > > >
>>> > > > > [2]
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806
>>> > > > >
>>> > > > >
>>> > > > > Paul Lam <paullin3...@gmail.com> wrote on Thu, Apr 21, 2022 at 10:36:
>>> > > > >
>>> > > > >> ping @Timo @Jark @Shengkai
>>> > > > >>
>>> > > > >> Best,
>>> > > > >> Paul Lam
>>> > > > >>
>>> > > > >>> On Apr 18, 2022, at 17:12, Paul Lam <paullin3...@gmail.com> wrote:
>>> > > > >>>
>>> > > > >>> Hi team,
>>> > > > >>>
>>> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds
>>> query
>>> > > > >> lifecycle
>>> > > > >>> statements to SQL client.
>>> > > > >>>
>>> > > > >>> Currently, SQL client supports submitting queries (queries in a
>>> > broad
>>> > > > >> sense,
>>> > > > >>> including DQLs and DMLs) but no further lifecycle statements,
>>> like
>>> > > > >> canceling
>>> > > > >>> a query or triggering a savepoint. That makes SQL users have to
>>> > rely
>>> > > on
>>> > > > >>> CLI or REST API to manage their queries.
>>> > > > >>>
>>> > > > >>> Thus, I propose to introduce the following statements to fill
>>> the
>>> > > gap.
>>> > > > >>> SHOW QUERIES
>>> > > > >>> STOP QUERY <query_id>
>>> > > > >>> CANCEL QUERY <query_id>
>>> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
>>> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
>>> > > > >>> These statements would align SQL client with CLI, providing the
>>> > > > >>> full lifecycle
>>> > > > >>> management for queries/jobs.
>>> > > > >>>
>>> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
>>> > > > >>> (For reference, the previous discussion thread see [2].)
>>> > > > >>>
>>> > > > >>> [1]
>>> > > > >>
>>> > > >
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client
>>> > > > >>>
>>> > > > >>> [2]
>>> > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
>>> > > > >>>
>>> > > > >>> Best,
>>> > > > >>> Paul Lam
>>> > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>
