[
https://issues.apache.org/jira/browse/FLINK-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152773#comment-16152773
]
ASF GitHub Bot commented on FLINK-6442:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/3829#discussion_r136850229
--- Diff:
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/BatchTableEnvironment.scala
---
@@ -106,6 +106,43 @@ abstract class BatchTableEnvironment(
}
/**
+ * Registers an external [[TableSink]] in this [[TableEnvironment]]'s
catalog.
+ * Registered sink tables can be referenced in SQL DML clause.
+ *
+ * Examples:
+ *
+ * - predefine a table sink with schema
--- End diff --
I proposed this API for the following reason:
I would like pre-registered and on-demand table sinks to share a common API and
common preconditions, so that every table sink (including existing ones) can be
used in both use cases.
There is already the `configure()` method, whose exact purpose is to set the
field names and types. Currently, this method is only called internally during
the translation process. Of course, users could implement a `TableSink` that
sets field names and types in the constructor (as shown in the Scala docs of
this PR), but I think this would circumvent the current API and might lead to
`TableSink` implementations that can be used in only one of the two settings,
either pre-registered or on-demand. Hence, I think it would be better to
"enforce" the use of the `configure()` method by designing the API such that
`configure()` is always called internally and is therefore mandatory. That way
we can guarantee that each `TableSink` can be used in both cases, because both
use cases require `configure()` and use it in the same way.
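To make the idea concrete, here is a minimal sketch of what "always calling
`configure()` internally" could look like. It does not use Flink's classes:
`SimpleTableSink`, `PrintSink` and `SinkRegistry` are hypothetical stand-ins,
the method signatures are assumptions, and field types are plain strings
instead of `TypeInformation`.

```scala
// Simplified stand-ins for illustration only; these are NOT Flink's classes.
trait SimpleTableSink {
  def fieldNames: Option[Seq[String]]
  def fieldTypes: Option[Seq[String]]
  // Returns a configured copy; the original sink stays unconfigured.
  def configure(names: Seq[String], types: Seq[String]): SimpleTableSink
}

// A sink that takes no schema in its constructor (hypothetical example).
final case class PrintSink(
    fieldNames: Option[Seq[String]] = None,
    fieldTypes: Option[Seq[String]] = None) extends SimpleTableSink {

  override def configure(names: Seq[String], types: Seq[String]): SimpleTableSink = {
    require(names.length == types.length, "field names and types must have the same length")
    copy(fieldNames = Some(names), fieldTypes = Some(types))
  }
}

// Hypothetical environment: both entry points go through configure().
final class SinkRegistry {
  private val sinks = scala.collection.mutable.Map.empty[String, SimpleTableSink]

  // Pre-registered use case: the schema is passed at registration time
  // and configure() is called internally.
  def registerTableSink(
      name: String,
      names: Seq[String],
      types: Seq[String],
      sink: SimpleTableSink): Unit = {
    require(!sinks.contains(name), s"Sink '$name' is already registered")
    sinks(name) = sink.configure(names, types)
  }

  // On-demand use case: the schema comes from the query result,
  // and configure() is called internally in the same way.
  def writeToSink(
      queryNames: Seq[String],
      queryTypes: Seq[String],
      sink: SimpleTableSink): SimpleTableSink =
    sink.configure(queryNames, queryTypes)

  def lookup(name: String): Option[SimpleTableSink] = sinks.get(name)
}
```

With this shape, the same unconfigured `PrintSink()` can be passed to either
`registerTableSink` or `writeToSink`, and the user never has to call
`configure()` by hand.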
We could either ask users to call the `configure()` method before registering
the table sink and fail with an error message otherwise (as @wuchong proposed),
or enforce this through the API. I think the second approach is better because
users would never run into an exception. In my opinion, we should at least
"encourage" the use of the `configure()` method by not giving an example that
sets field names and types in the constructor.
What do you think, @lincoln-lil and @wuchong?
> Extend TableAPI Support Sink Table Registration and ‘insert into’ Clause in
> SQL
> -------------------------------------------------------------------------------
>
> Key: FLINK-6442
> URL: https://issues.apache.org/jira/browse/FLINK-6442
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: lincoln.lee
> Assignee: lincoln.lee
> Priority: Minor
>
> Currently the Table API only has registration methods for source tables.
> When we write a streaming job in SQL, we have to add an extra step for the
> sink, as the Table API does:
> {code}
> val sqlQuery = "SELECT * FROM MyTable WHERE _1 = 3"
> val t = StreamTestData.getSmall3TupleDataStream(env)
> tEnv.registerDataStream("MyTable", t)
> // one way: invoke tableAPI’s writeToSink method directly
> val result = tEnv.sql(sqlQuery)
> result.writeToSink(new YourStreamSink)
> // another way: convert to datastream first and then invoke addSink
> val stream = tEnv.sql(sqlQuery).toDataStream[Row]
> stream.addSink(new StreamITCase.StringSink)
> {code}
> From the API we can see that the sink table is always a derived table,
> because its 'schema' is inferred from the result type of the upstream query.
> In a traditional RDBMS that supports DML syntax, by comparison, a query with
> a target output could be written like this:
> {code}
> insert into table target_table_name
> [(column_name [ ,...n ])]
> query
> {code}
> The equivalent form of the example above is as follows:
> {code}
> tEnv.registerTableSink("targetTable", new YourSink)
> val sql = "INSERT INTO targetTable SELECT a, b, c FROM sourceTable"
> val result = tEnv.sql(sql)
> {code}
> It is supported by Calcite’s grammar:
> {code}
> insert:( INSERT | UPSERT ) INTO tablePrimary
> [ '(' column [, column ]* ')' ]
> query
> {code}
> I'd like to extend the Flink Table API to support this feature. See the design doc:
> https://goo.gl/n3phK5