Hi Ingo,

We have supported both the WITH syntax and the SET command since v1.9 [1][2] and have never received complaints about this, so I think the difference is fine.
Besides, the TBLPROPERTIES clause of CREATE TABLE in Hive also requires string literal keys [3], and Hive's SET <key>=<value> doesn't allow quoted keys [4].

Best,
Jark

[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/connect.html
[2]: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sqlClient.html#running-sql-queries
[3]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
[4]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli (search "set mapred.reduce.tasks=32")

On Thu, 4 Feb 2021 at 17:09, Ingo Bürk <i...@ververica.com> wrote:

> Hi,
>
> regarding the (un-)quoted question, compatibility is of course an important argument, but in terms of consistency I'd find it a bit surprising that WITH handles it differently than SET, and I wonder if that could cause friction for developers when writing their SQL.
>
> Regards
> Ingo
>
> On Thu, Feb 4, 2021 at 9:38 AM Jark Wu <imj...@gmail.com> wrote:
>
> > Hi all,
> >
> > Regarding "One Parser", I think it's not possible for now because the Calcite parser can't parse special characters (e.g. "-") unless they are quoted as string literals. That's why the WITH option keys are string literals, not identifiers.
> >
> > SET table.exec.mini-batch.enabled = true and ADD JAR /local/my-home/test.jar have the same problem. That's why we propose two parsers: a light-weight one that splits lines into multiple statements and matches the special commands through regex, and the Calcite parser, to which all other statements are delegated.
> >
> > Note: we should stick to the unquoted SET table.exec.mini-batch.enabled = true syntax, both for backward compatibility and for ease of use; none of the other systems quote the key either.
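The light-weight client parser described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the SQL Client's actual implementation: the class `ClientCommandSketch`, its methods, and the two regular expressions are hypothetical, and the splitter naively breaks on every `;`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical light-weight client parser; an illustration, not Flink's SQL Client code. */
final class ClientCommandSketch {

    // SET <key> = <value>: the key is unquoted and may contain '.', '-' and '_'.
    private static final Pattern SET =
            Pattern.compile("SET\\s+([\\w.-]+)\\s*=\\s*(\\S.*)", Pattern.CASE_INSENSITIVE);

    // ADD JAR <path>: the path is taken verbatim and must not contain whitespace.
    private static final Pattern ADD_JAR =
            Pattern.compile("ADD\\s+JAR\\s+(\\S+)", Pattern.CASE_INSENSITIVE);

    /** Splits a script on ';'. Simplification: a ';' inside a literal or comment also splits. */
    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    /** Describes how the client would route a single, already trimmed statement. */
    static String route(String statement) {
        Matcher set = SET.matcher(statement);
        if (set.matches()) {
            return "SET " + set.group(1) + "=" + set.group(2).trim();
        }
        Matcher addJar = ADD_JAR.matcher(statement);
        if (addJar.matches()) {
            return "ADD JAR " + addJar.group(1);
        }
        // Everything else would be handed to the full SQL parser.
        return "DELEGATE";
    }

    public static void main(String[] args) {
        String script = "SET table.exec.mini-batch.enabled = true;\n"
                + "ADD JAR /local/my-home/test.jar;\n"
                + "INSERT INTO my_sink SELECT * FROM my_table;";
        for (String statement : split(script)) {
            System.out.println(route(statement) + "  <-  " + statement);
        }
    }
}
```

Because the key and the path are captured by the regex rather than tokenized as SQL identifiers, characters such as "-" and "/" need no quoting here, which is the point of keeping these commands out of the Calcite grammar.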
> > > > > > Regarding "table.planner" vs "sql-client.planner", > > if we want to use "table.planner", I think we should explain clearly > what's > > the scope it can be used in documentation. > > Otherwise, there will be users complaining why the planner doesn't change > > when setting the configuration on TableEnv. > > Would be better throwing an exception to indicate users it's now allowed > to > > change planner after TableEnv is initialized. > > However, it seems not easy to implement. > > > > Best, > > Jark > > > > On Thu, 4 Feb 2021 at 15:49, godfrey he <godfre...@gmail.com> wrote: > > > > > Hi everyone, > > > > > > Regarding "table.planner" and "table.execution-mode" > > > If we define that those two options are just used to initialize the > > > TableEnvironment, +1 for introducing table options instead of > sql-client > > > options. > > > > > > Regarding "the sql client, we will maintain two parsers", I want to > give > > > more inputs: > > > We want to introduce sql-gateway into the Flink project (see FLIP-24 & > > > FLIP-91 for more info [1] [2]). In the "gateway" mode, the CLI client > and > > > the gateway service will communicate through Rest API. The " ADD JAR > > > /local/path/jar " will be executed in the CLI client machine. So when > we > > > submit a sql file which contains multiple statements, the CLI client > > needs > > > to pick out the "ADD JAR" line, and also statements need to be > submitted > > or > > > executed one by one to make sure the result is correct. 
> > > The sql file may look like:
> > >
> > > SET xxx=yyy;
> > > create table my_table ...;
> > > create table my_sink ...;
> > > ADD JAR /local/path/jar1;
> > > create function my_udf as com....MyUdf;
> > > insert into my_sink select ..., my_udf(xx) from ...;
> > > REMOVE JAR /local/path/jar1;
> > > drop function my_udf;
> > > ADD JAR /local/path/jar2;
> > > create function my_udf as com....MyUdf2;
> > > insert into my_sink select ..., my_udf(xx) from ...;
> > >
> > > The lines need to be split into multiple statements first in the CLI client. There are two approaches:
> > >
> > > 1. The CLI client depends on the sql-parser: the sql-parser splits the lines and tells which lines are "ADD JAR".
> > > pro: there is only one parser.
> > > cons: it's a little heavy for the CLI client to depend on the sql-parser, because the CLI client is just a simple tool which receives the user commands and displays the result. The non-"ADD JAR" commands will be parsed twice.
> > >
> > > 2. The CLI client splits the lines into multiple statements and finds the ADD JAR command through regex matching.
> > > pro: the CLI client is very light-weight.
> > > cons: there are two parsers.
> > >
> > > (personally, I prefer the second option)
> > >
> > > Regarding "SHOW or LIST JARS", I think we can support them both. For the default dialect, we support SHOW JARS, but if we switch to the hive dialect, LIST JARS is also supported.
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client
> > > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
> > >
> > > Best,
> > > Godfrey
> > >
> > > On Thu, Feb 4, 2021 at 10:40 AM, Rui Li <lirui.fu...@gmail.com> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > Regarding #3 and #4, I agree SHOW JARS is more consistent with other commands than LIST JARS. I don't have a strong opinion about REMOVE vs DELETE though.
> > > > > > > > While flink doesn't need to follow hive syntax, as far as I know, > most > > > > users who are requesting these features are previously hive users. > So I > > > > wonder whether we can support both LIST/SHOW JARS and REMOVE/DELETE > > JARS > > > > as synonyms? It's just like lots of systems accept both EXIT and QUIT > > as > > > > the command to terminate the program. So if that's not hard to > achieve, > > > and > > > > will make users happier, I don't see a reason why we must choose one > > over > > > > the other. > > > > > > > > On Wed, Feb 3, 2021 at 10:33 PM Timo Walther <twal...@apache.org> > > wrote: > > > > > > > > > Hi everyone, > > > > > > > > > > some feedback regarding the open questions. Maybe we can discuss > the > > > > > `TableEnvironment.executeMultiSql` story offline to determine how > we > > > > > proceed with this in the near future. > > > > > > > > > > 1) "whether the table environment has the ability to update itself" > > > > > > > > > > Maybe there was some misunderstanding. I don't think that we should > > > > > support `tEnv.getConfig.getConfiguration.setString("table.planner", > > > > > "old")`. Instead I'm proposing to support > > > > > `TableEnvironment.create(Configuration)` where planner and > execution > > > > > mode are read immediately and a subsequent changes to these options > > > will > > > > > have no effect. We are doing it similar in `new > > > > > StreamExecutionEnvironment(Configuration)`. These two > ConfigOption's > > > > > must not be SQL Client specific but can be part of the core table > > code > > > > > base. Many users would like to get a 100% preconfigured environment > > > from > > > > > just Configuration. And this is not possible right now. We can > solve > > > > > both use cases in one change. 
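The read-once semantics proposed in 1) can be illustrated without any Flink dependency. The sketch below is hypothetical: `SnapshotEnvironmentSketch` merely stands in for a table environment, a plain `Map` stands in for `Configuration`, and the fallback values "blink" and "streaming" are assumptions made for the example. Only the option names `table.planner` and `table.execution-mode` come from this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a table environment; this is not Flink's TableEnvironment. */
final class SnapshotEnvironmentSketch {

    private final Map<String, String> config;
    private final String planner;
    private final String executionMode;

    private SnapshotEnvironmentSketch(Map<String, String> config) {
        this.config = config;
        // Planner and execution mode are read exactly once, at creation time.
        // The fallback values are assumptions made for this example.
        this.planner = config.getOrDefault("table.planner", "blink");
        this.executionMode = config.getOrDefault("table.execution-mode", "streaming");
    }

    static SnapshotEnvironmentSketch create(Map<String, String> config) {
        return new SnapshotEnvironmentSketch(config);
    }

    String planner() {
        return planner;
    }

    String executionMode() {
        return executionMode;
    }

    /** All other options stay live and can still be changed after creation. */
    Map<String, String> config() {
        return config;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("table.planner", "old");
        SnapshotEnvironmentSketch env = create(conf);

        // A later change is stored in the configuration but does not affect the environment.
        env.config().put("table.planner", "blink");
        System.out.println("planner in use: " + env.planner());
    }
}
```

A stricter variant could reject later writes to these two keys with an exception instead of silently ignoring them, which is the behavior Jark asks for elsewhere in the thread.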
> > > > > > > > > > 2) "the sql client, we will maintain two parsers" > > > > > > > > > > I remember we had some discussion about this and decided that we > > would > > > > > like to maintain only one parser. In the end it is "One Flink SQL" > > > where > > > > > commands influence each other also with respect to keywords. It > > should > > > > > be fine to include the SQL Client commands in the Flink parser. Of > > > > > cource the table environment would not be able to handle the > > > `Operation` > > > > > instance that would be the result but we can introduce hooks to > > handle > > > > > those `Operation`s. Or we introduce parser extensions. > > > > > > > > > > Can we skip `table.job.async` in the first version? We should > further > > > > > discuss whether we introduce a special SQL clause for wrapping > async > > > > > behavior or if we use a config option? Esp. for streaming queries > we > > > > > need to be careful and should force users to either "one INSERT > INTO" > > > or > > > > > "one STATEMENT SET". > > > > > > > > > > 3) 4) "HIVE also uses these commands" > > > > > > > > > > In general, Hive is not a good reference. Aligning the commands > more > > > > > with the remaining commands should be our goal. We just had a > MODULE > > > > > discussion where we selected SHOW instead of LIST. But it is true > > that > > > > > JARs are not part of the catalog which is why I would not use > > > > > CREATE/DROP. ADD/REMOVE are commonly siblings in the English > > language. > > > > > Take a look at the Java collection API as another example. > > > > > > > > > > 6) "Most of the commands should belong to the table environment" > > > > > > > > > > Thanks for updating the FLIP this makes things easier to > understand. > > It > > > > > is good to see that most commends will be available in > > > TableEnvironment. > > > > > However, I would also support SET and RESET for consistency. 
Again, > > > from > > > > > an architectural point of view, if we would allow some kind of > > > > > `Operation` hook in table environment, we could check for SQL > Client > > > > > specific options and forward to regular > > `TableConfig.getConfiguration` > > > > > otherwise. What do you think? > > > > > > > > > > Regards, > > > > > Timo > > > > > > > > > > > > > > > On 03.02.21 08:58, Jark Wu wrote: > > > > > > Hi Timo, > > > > > > > > > > > > I will respond some of the questions: > > > > > > > > > > > > 1) SQL client specific options > > > > > > > > > > > > Whether it starts with "table" or "sql-client" depends on where > the > > > > > > configuration takes effect. > > > > > > If it is a table configuration, we should make clear what's the > > > > behavior > > > > > > when users change > > > > > > the configuration in the lifecycle of TableEnvironment. > > > > > > > > > > > > I agree with Shengkai `sql-client.planner` and > > > > > `sql-client.execution.mode` > > > > > > are something special > > > > > > that can't be changed after TableEnvironment has been > initialized. > > > You > > > > > can > > > > > > see > > > > > > `StreamExecutionEnvironment` provides `configure()` method to > > > override > > > > > > configuration after > > > > > > StreamExecutionEnvironment has been initialized. > > > > > > > > > > > > Therefore, I think it would be better to still use > > > > `sql-client.planner` > > > > > > and `sql-client.execution.mode`. > > > > > > > > > > > > 2) Execution file > > > > > > > > > > > >>From my point of view, there is a big difference between > > > > > > `sql-client.job.detach` and > > > > > > `TableEnvironment.executeMultiSql()` that `sql-client.job.detach` > > > will > > > > > > affect every single DML statement > > > > > > in the terminal, not only the statements in SQL files. 
> > > > > > I think the single DML statement in the interactive terminal is something like tEnv#executeSql() instead of tEnv#executeMultiSql. So I don't like the "multi" and "sql" keywords in `table.multi-sql-async`. I just found that the runtime provides a configuration called "execution.attached" [1], which is false by default and specifies whether the pipeline is submitted in attached or detached mode. It provides exactly the same functionality as `sql-client.job.detach`. What do you think about using this option?
> > > > > >
> > > > > > If we also want to support this config in TableEnvironment, I think it should also affect the DML execution of `tEnv#executeSql()`, not only DMLs in `tEnv#executeMultiSql()`. Therefore, the behavior may look like this:
> > > > > >
> > > > > > val tableResult = tEnv.executeSql("INSERT INTO ...")  ==> async by default
> > > > > > tableResult.await()  ==> manually block until finish
> > > > > > tEnv.getConfig().getConfiguration().setString("execution.attached", "true")
> > > > > > val tableResult2 = tEnv.executeSql("INSERT INTO ...")  ==> sync, don't need to wait on the TableResult
> > > > > > tEnv.executeMultiSql(
> > > > > > """
> > > > > > CREATE TABLE ....  ==> always sync
> > > > > > INSERT INTO ...  => sync, because we set configuration above
> > > > > > SET execution.attached = false;
> > > > > > INSERT INTO ...  => async
> > > > > > """)
> > > > > >
> > > > > > On the other hand, I think `sql-client.job.detach` and `TableEnvironment.executeMultiSql()` should be two separate topics. As Shengkai mentioned above, the SQL CLI only depends on `TableEnvironment#executeSql()` to support multi-line statements.
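The attached/detached behavior in the listing above can be mimicked with `CompletableFuture`. This is a sketch under assumptions, not Flink code: the class `AttachedModeSketch` and its `executeDml` method are hypothetical, and a `Runnable` stands in for the submitted job. Only the option name `execution.attached` and its default of false are taken from the message.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of attached vs. detached DML submission; not a Flink API. */
final class AttachedModeSketch {

    private final Map<String, String> config = new HashMap<>();

    void set(String key, String value) {
        config.put(key, value);
    }

    /** Starts the job; blocks until it finishes only when execution.attached is true. */
    CompletableFuture<Void> executeDml(Runnable job) {
        CompletableFuture<Void> result = CompletableFuture.runAsync(job);
        boolean attached =
                Boolean.parseBoolean(config.getOrDefault("execution.attached", "false"));
        if (attached) {
            // Attached: behave synchronously, the caller does not need to wait again.
            result.join();
        }
        // Detached: return immediately, the caller may call join() on the result later.
        return result;
    }

    public static void main(String[] args) {
        AttachedModeSketch env = new AttachedModeSketch();

        CompletableFuture<Void> first = env.executeDml(() -> System.out.println("job 1 running"));
        first.join(); // detached by default, so wait manually

        env.set("execution.attached", "true");
        CompletableFuture<Void> second = env.executeDml(() -> System.out.println("job 2 running"));
        System.out.println("job 2 finished before return: " + second.isDone());
    }
}
```

Reading the option on every call, rather than once at creation, is what lets a `SET execution.attached = false` in the middle of a script change the behavior of the statements that follow it.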
> > > > > > I'm fine with making `executeMultiSql()` clear but don't want it to block this FLIP; maybe we can discuss this in another thread.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > [1]: https://ci.apache.org/projects/flink/flink-docs-master/deployment/config.html#execution-attached
> > > > > >
> > > > > > On Wed, 3 Feb 2021 at 15:33, Shengkai Fang <fskm...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi, Timo.
> > > > > >> Thanks for your detailed feedback. I have some thoughts about your feedback.
> > > > > >>
> > > > > >> *Regarding #1*: I think the main problem is whether the table environment has the ability to update itself. Let's take a simple program as an example.
> > > > > >>
> > > > > >> ```
> > > > > >> TableEnvironment tEnv = TableEnvironment.create(...);
> > > > > >>
> > > > > >> tEnv.getConfig().getConfiguration().setString("table.planner", "old");
> > > > > >>
> > > > > >> tEnv.executeSql("...");
> > > > > >> ```
> > > > > >>
> > > > > >> If we regard this option as a table option, users don't have to create another table environment manually. In that case, tEnv needs to check whether the current mode and planner are the same as before on every executeSql or explainSql call. I don't think that's easy work for the table environment, especially if users have a StreamExecutionEnvironment but set the old planner and batch mode. But when we make this option a sql client option, users only use the SET command to change the setting. We can rebuild a new table environment when the SET succeeds.
> > > > > >> > > > > > >> > > > > > >> *Regarding #2*: I think we need to discuss the implementation > > before > > > > > >> continuing this topic. In the sql client, we will maintain two > > > > parsers. > > > > > The > > > > > >> first parser(client parser) will only match the sql client > > commands. > > > > If > > > > > the > > > > > >> client parser can't parse the statement, we will leverage the > > power > > > of > > > > > the > > > > > >> table environment to execute. According to our blueprint, > > > > > >> TableEnvironment#executeSql is enough for the sql client. > > Therefore, > > > > > >> TableEnvironment#executeMultiSql is out-of-scope for this FLIP. > > > > > >> > > > > > >> But if we need to introduce the > `TableEnvironment.executeMultiSql` > > > in > > > > > the > > > > > >> future, I think it's OK to use the option > `table.multi-sql-async` > > > > rather > > > > > >> than option `sql-client.job.detach`. But we think the name is > not > > > > > suitable > > > > > >> because the name is confusing for others. When setting the > option > > > > > false, we > > > > > >> just mean it will block the execution of the INSERT INTO > > statement, > > > > not > > > > > DDL > > > > > >> or others(other sql statements are always executed > synchronously). > > > So > > > > > how > > > > > >> about `table.job.async`? It only works for the sql-client and > the > > > > > >> executeMultiSql. If we set this value false, the table > environment > > > > will > > > > > >> return the result until the job finishes. > > > > > >> > > > > > >> > > > > > >> *Regarding #3, #4*: I still think we should use DELETE JAR and > > LIST > > > > JAR > > > > > >> because HIVE also uses these commands to add the jar into the > > > > classpath > > > > > or > > > > > >> delete the jar. If we use such commands, it can reduce our work > > for > > > > > hive > > > > > >> compatibility. 
> > > > > >> > > > > > >> For SHOW JAR, I think the main concern is the jars are not > > > maintained > > > > by > > > > > >> the Catalog. If we really needs to keep consistent with SQL > > grammar, > > > > > maybe > > > > > >> we should use > > > > > >> > > > > > >> `ADD JAR` -> `CREATE JAR`, > > > > > >> `DELETE JAR` -> `DROP JAR`, > > > > > >> `LIST JAR` -> `SHOW JAR`. > > > > > >> > > > > > >> *Regarding #5*: I agree with you that we'd better keep > consistent. > > > > > >> > > > > > >> *Regarding #6*: Yes. Most of the commands should belong to the > > table > > > > > >> environment. In the Summary section, I use the <NOTE> tag to > > > identify > > > > > which > > > > > >> commands should belong to the sql client and which commands > should > > > > > belong > > > > > >> to the table environment. I also add a new section about > > > > implementation > > > > > >> details in the FLIP. > > > > > >> > > > > > >> Best, > > > > > >> Shengkai > > > > > >> > > > > > >> Timo Walther <twal...@apache.org> 于2021年2月2日周二 下午6:43写道: > > > > > >> > > > > > >>> Thanks for this great proposal Shengkai. This will give the SQL > > > > Client > > > > > a > > > > > >>> very good update and make it production ready. > > > > > >>> > > > > > >>> Here is some feedback from my side: > > > > > >>> > > > > > >>> 1) SQL client specific options > > > > > >>> > > > > > >>> I don't think that `sql-client.planner` and > > > > `sql-client.execution.mode` > > > > > >>> are SQL Client specific. Similar to > `StreamExecutionEnvironment` > > > and > > > > > >>> `ExecutionConfig#configure` that have been added recently, we > > > should > > > > > >>> offer a possibility for TableEnvironment. How about we offer > > > > > >>> `TableEnvironment.create(ReadableConfig)` and add a > > `table.planner` > > > > and > > > > > >>> `table.execution-mode` to > > > > > >>> `org.apache.flink.table.api.config.TableConfigOptions`? 
> > > > > >>> > > > > > >>> 2) Execution file > > > > > >>> > > > > > >>> Did you have a look at the Appendix of FLIP-84 [1] including > the > > > > > mailing > > > > > >>> list thread at that time? Could you further elaborate how the > > > > > >>> multi-statement execution should work for a unified > > batch/streaming > > > > > >>> story? According to our past discussions, each line in an > > execution > > > > > file > > > > > >>> should be executed blocking which means a streaming query > needs a > > > > > >>> statement set to execute multiple INSERT INTO statement, > correct? > > > We > > > > > >>> should also offer this functionality in > > > > > >>> `TableEnvironment.executeMultiSql()`. Whether > > > `sql-client.job.detach` > > > > > is > > > > > >>> SQL Client specific needs to be determined, it could also be a > > > > general > > > > > >>> `table.multi-sql-async` option? > > > > > >>> > > > > > >>> 3) DELETE JAR > > > > > >>> > > > > > >>> Shouldn't the opposite of "ADD" be "REMOVE"? "DELETE" sounds > like > > > one > > > > > is > > > > > >>> actively deleting the JAR in the corresponding path. > > > > > >>> > > > > > >>> 4) LIST JAR > > > > > >>> > > > > > >>> This should be `SHOW JARS` according to other SQL commands such > > as > > > > > `SHOW > > > > > >>> CATALOGS`, `SHOW TABLES`, etc. [2]. > > > > > >>> > > > > > >>> 5) EXPLAIN [ExplainDetail[, ExplainDetail]*] > > > > > >>> > > > > > >>> We should keep the details in sync with > > > > > >>> `org.apache.flink.table.api.ExplainDetail` and avoid confusion > > > about > > > > > >>> differently named ExplainDetails. I would vote for > > `ESTIMATED_COST` > > > > > >>> instead of `COST`. I'm sure the original author had a reason > why > > to > > > > > call > > > > > >>> it that way. > > > > > >>> > > > > > >>> 6) Implementation details > > > > > >>> > > > > > >>> It would be nice to understand how we plan to implement the > given > > > > > >>> features. 
Most of the commands and config options should go > into > > > > > >>> TableEnvironment and SqlParser directly, correct? This way > users > > > > have a > > > > > >>> unified way of using Flink SQL. TableEnvironment would provide > a > > > > > similar > > > > > >>> user experience in notebooks or interactive programs than the > SQL > > > > > Client. > > > > > >>> > > > > > >>> [1] > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878 > > > > > >>> [2] > > > > > >>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/show.html > > > > > >>> > > > > > >>> Regards, > > > > > >>> Timo > > > > > >>> > > > > > >>> > > > > > >>> On 02.02.21 10:13, Shengkai Fang wrote: > > > > > >>>> Sorry for the typo. I mean `RESET` is much better rather than > > > > `UNSET`. > > > > > >>>> > > > > > >>>> Shengkai Fang <fskm...@gmail.com> 于2021年2月2日周二 下午4:44写道: > > > > > >>>> > > > > > >>>>> Hi, Jingsong. > > > > > >>>>> > > > > > >>>>> Thanks for your reply. I think `UNSET` is much better. > > > > > >>>>> > > > > > >>>>> 1. We don't need to introduce another command `UNSET`. > `RESET` > > is > > > > > >>>>> supported in the current sql client now. Our proposal just > > > extends > > > > > its > > > > > >>>>> grammar and allow users to reset the specified keys. > > > > > >>>>> 2. Hive beeline also uses `RESET` to set the key to the > default > > > > > >>> value[1]. > > > > > >>>>> I think it is more friendly for batch users. > > > > > >>>>> > > > > > >>>>> Best, > > > > > >>>>> Shengkai > > > > > >>>>> > > > > > >>>>> [1] > > > > > >>> > > > https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients > > > > > >>>>> > > > > > >>>>> Jingsong Li <jingsongl...@gmail.com> 于2021年2月2日周二 下午1:56写道: > > > > > >>>>> > > > > > >>>>>> Thanks for the proposal, yes, sql-client is too outdated. +1 > > for > > > > > >>>>>> improving it. 
> > > > > >>>>>> > > > > > >>>>>> About "SET" and "RESET", Why not be "SET" and "UNSET"? > > > > > >>>>>> > > > > > >>>>>> Best, > > > > > >>>>>> Jingsong > > > > > >>>>>> > > > > > >>>>>> On Mon, Feb 1, 2021 at 2:46 PM Rui Li < > lirui.fu...@gmail.com> > > > > > wrote: > > > > > >>>>>> > > > > > >>>>>>> Thanks Shengkai for the update! The proposed changes look > > good > > > to > > > > > >> me. > > > > > >>>>>>> > > > > > >>>>>>> On Fri, Jan 29, 2021 at 8:26 PM Shengkai Fang < > > > fskm...@gmail.com > > > > > > > > > > >>> wrote: > > > > > >>>>>>> > > > > > >>>>>>>> Hi, Rui. > > > > > >>>>>>>> You are right. I have already modified the FLIP. > > > > > >>>>>>>> > > > > > >>>>>>>> The main changes: > > > > > >>>>>>>> > > > > > >>>>>>>> # -f parameter has no restriction about the statement > type. > > > > > >>>>>>>> Sometimes, users use the pipe to redirect the result of > > > queries > > > > to > > > > > >>>>>>> debug > > > > > >>>>>>>> when submitting job by -f parameter. It's much convenient > > > > > comparing > > > > > >>> to > > > > > >>>>>>>> writing INSERT INTO statements. > > > > > >>>>>>>> > > > > > >>>>>>>> # Add a new sql client option `sql-client.job.detach` . > > > > > >>>>>>>> Users prefer to execute jobs one by one in the batch mode. > > > Users > > > > > >> can > > > > > >>>>>>> set > > > > > >>>>>>>> this option false and the client will process the next job > > > until > > > > > >> the > > > > > >>>>>>>> current job finishes. The default value of this option is > > > false, > > > > > >>> which > > > > > >>>>>>>> means the client will execute the next job when the > current > > > job > > > > is > > > > > >>>>>>>> submitted. 
> > > > > >>>>>>>> > > > > > >>>>>>>> Best, > > > > > >>>>>>>> Shengkai > > > > > >>>>>>>> > > > > > >>>>>>>> > > > > > >>>>>>>> > > > > > >>>>>>>> Rui Li <lirui.fu...@gmail.com> 于2021年1月29日周五 下午4:52写道: > > > > > >>>>>>>> > > > > > >>>>>>>>> Hi Shengkai, > > > > > >>>>>>>>> > > > > > >>>>>>>>> Regarding #2, maybe the -f options in flink and hive have > > > > > >> different > > > > > >>>>>>>>> implications, and we should clarify the behavior. For > > > example, > > > > if > > > > > >>> the > > > > > >>>>>>>>> client just submits the job and exits, what happens if > the > > > file > > > > > >>>>>>> contains > > > > > >>>>>>>>> two INSERT statements? I don't think we should treat them > > as > > > a > > > > > >>>>>>> statement > > > > > >>>>>>>>> set, because users should explicitly write BEGIN > STATEMENT > > > SET > > > > in > > > > > >>> that > > > > > >>>>>>>>> case. And the client shouldn't asynchronously submit the > > two > > > > > jobs, > > > > > >>>>>>> because > > > > > >>>>>>>>> the 2nd may depend on the 1st, right? > > > > > >>>>>>>>> > > > > > >>>>>>>>> On Fri, Jan 29, 2021 at 4:30 PM Shengkai Fang < > > > > fskm...@gmail.com > > > > > > > > > > > >>>>>>> wrote: > > > > > >>>>>>>>> > > > > > >>>>>>>>>> Hi Rui, > > > > > >>>>>>>>>> Thanks for your feedback. I agree with your suggestions. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> For the suggestion 1: Yes. we are plan to strengthen the > > set > > > > > >>>>>>> command. In > > > > > >>>>>>>>>> the implementation, it will just put the key-value into > > the > > > > > >>>>>>>>>> `Configuration`, which will be used to generate the > table > > > > > config. > > > > > >>> If > > > > > >>>>>>> hive > > > > > >>>>>>>>>> supports to read the setting from the table config, > users > > > are > > > > > >> able > > > > > >>>>>>> to set > > > > > >>>>>>>>>> the hive-related settings. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> For the suggestion 2: The -f parameter will submit the > job > > > and > > > > > >>> exit. 
> > > > > >>>>>>> If > > > > > >>>>>>>>>> the queries never end, users have to cancel the job by > > > > > >> themselves, > > > > > >>>>>>> which is > > > > > >>>>>>>>>> not reliable(people may forget their jobs). In most > case, > > > > > queries > > > > > >>>>>>> are used > > > > > >>>>>>>>>> to analyze the data. Users should use queries in the > > > > interactive > > > > > >>>>>>> mode. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> Best, > > > > > >>>>>>>>>> Shengkai > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> Rui Li <lirui.fu...@gmail.com> 于2021年1月29日周五 下午3:18写道: > > > > > >>>>>>>>>> > > > > > >>>>>>>>>>> Thanks Shengkai for bringing up this discussion. I > think > > it > > > > > >>> covers a > > > > > >>>>>>>>>>> lot of useful features which will dramatically improve > > the > > > > > >>>>>>> usability of our > > > > > >>>>>>>>>>> SQL Client. I have two questions regarding the FLIP. > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> 1. Do you think we can let users set arbitrary > > > configurations > > > > > >> via > > > > > >>>>>>> the > > > > > >>>>>>>>>>> SET command? A connector may have its own > configurations > > > and > > > > we > > > > > >>>>>>> don't have > > > > > >>>>>>>>>>> a way to dynamically change such configurations in SQL > > > > Client. > > > > > >> For > > > > > >>>>>>> example, > > > > > >>>>>>>>>>> users may want to be able to change hive conf when > using > > > hive > > > > > >>>>>>> connector [1]. > > > > > >>>>>>>>>>> 2. Any reason why we have to forbid queries in SQL > files > > > > > >> specified > > > > > >>>>>>> with > > > > > >>>>>>>>>>> the -f option? Hive supports a similar -f option but > > allows > > > > > >>> queries > > > > > >>>>>>> in the > > > > > >>>>>>>>>>> file. And a common use case is to run some query and > > > redirect > > > > > >> the > > > > > >>>>>>> results > > > > > >>>>>>>>>>> to a file. So I think maybe flink users would like to > do > > > the > > > > > >> same, > > > > > >>>>>>>>>>> especially in batch scenarios. 
> > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-20590 > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> On Fri, Jan 29, 2021 at 10:46 AM Sebastian Liu < > > > > > >>>>>>> liuyang0...@gmail.com> > > > > > >>>>>>>>>>> wrote: > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>>> Hi Shengkai, > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> Glad to see this improvement. And I have some > additional > > > > > >>>>>>> suggestions: > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> #1. Unify the TableEnvironment in ExecutionContext to > > > > > >>>>>>>>>>>> StreamTableEnvironment for both streaming and batch > sql. > > > > > >>>>>>>>>>>> #2. Improve the way of results retrieval: sql client > > > collect > > > > > >> the > > > > > >>>>>>>>>>>> results > > > > > >>>>>>>>>>>> locally all at once using accumulators at present, > > > > > >>>>>>>>>>>> which may have memory issues in JM or Local > for > > > the > > > > > big > > > > > >>> query > > > > > >>>>>>>>>>>> result. > > > > > >>>>>>>>>>>> Accumulator is only suitable for testing purpose. > > > > > >>>>>>>>>>>> We may change to use SelectTableSink, which is > > > based > > > > > >>>>>>>>>>>> on CollectSinkOperatorCoordinator. > > > > > >>>>>>>>>>>> #3. Do we need to consider Flink SQL gateway which is > in > > > > > >> FLIP-91. > > > > > >>>>>>> Seems > > > > > >>>>>>>>>>>> that this FLIP has not moved forward for a long time. > > > > > >>>>>>>>>>>> Provide a long running service out of the box > to > > > > > >>> facilitate > > > > > >>>>>>> the > > > > > >>>>>>>>>>>> sql > > > > > >>>>>>>>>>>> submission is necessary. > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> What do you think of these? 
> > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> [1] > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> > > > > > >>>>>>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> Shengkai Fang <fskm...@gmail.com> 于2021年1月28日周四 > > 下午8:54写道: > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Hi devs, > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Jark and I want to start a discussion about > > FLIP-163:SQL > > > > > >> Client > > > > > >>>>>>>>>>>>> Improvements. > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Many users have complained about the problems of the > > sql > > > > > >> client. > > > > > >>>>>>> For > > > > > >>>>>>>>>>>>> example, users can not register the table proposed by > > > > > FLIP-95. > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> The main changes in this FLIP: > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> - use -i parameter to specify the sql file to > > initialize > > > > the > > > > > >>>>>>> table > > > > > >>>>>>>>>>>>> environment and deprecated YAML file; > > > > > >>>>>>>>>>>>> - add -f to submit sql file and deprecated '-u' > > > parameter; > > > > > >>>>>>>>>>>>> - add more interactive commands, e.g ADD JAR; > > > > > >>>>>>>>>>>>> - support statement set syntax; > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> For more detailed changes, please refer to > FLIP-163[1]. > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Look forward to your feedback. 
> > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Best, > > > > > >>>>>>>>>>>>> Shengkai > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> [1] > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>> > > > > > >>>>>>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-163%3A+SQL+Client+Improvements > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> -- > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> *With kind regards > > > > > >>>>>>>>>>>> > > > ------------------------------------------------------------ > > > > > >>>>>>>>>>>> Sebastian Liu 刘洋 > > > > > >>>>>>>>>>>> Institute of Computing Technology, Chinese Academy of > > > > Science > > > > > >>>>>>>>>>>> Mobile\WeChat: +86—15201613655 > > > > > >>>>>>>>>>>> E-mail: liuyang0...@gmail.com <liuyang0...@gmail.com> > > > > > >>>>>>>>>>>> QQ: 3239559* > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> -- > > > > > >>>>>>>>>>> Best regards! > > > > > >>>>>>>>>>> Rui Li > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>> > > > > > >>>>>>>>> > > > > > >>>>>>>>> -- > > > > > >>>>>>>>> Best regards! > > > > > >>>>>>>>> Rui Li > > > > > >>>>>>>>> > > > > > >>>>>>>> > > > > > >>>>>>> > > > > > >>>>>>> -- > > > > > >>>>>>> Best regards! > > > > > >>>>>>> Rui Li > > > > > >>>>>>> > > > > > >>>>>> > > > > > >>>>>> > > > > > >>>>>> -- > > > > > >>>>>> Best, Jingsong Lee > > > > > >>>>>> > > > > > >>>>> > > > > > >>>> > > > > > >>> > > > > > >>> > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Best regards! > > > > Rui Li > > > > > > > > > >