Re: Table API thrift support

2022-11-26 Thread Chen Qin
Hi Martijn,

"shading Thrift libraries from the Hive connector"
Hive Metastore is foundational software running in many companies, used by
Spark, Flink, etc. Upgrading the Hive Metastore touches many pieces of
data engineering. If a user updates a Flink job jar dependency to the
latest 0.17, there is no guarantee that both HMS and the jar would work
properly. And yes, 0.5-p6 is unfortunate internal tech debt that we will
work on outside of this FLIP.

"KafkaSource and KafkaSink"
Sounds good; this part seems outdated.

"explain how a Thrift schema can be compiled/used in a SQL"
I see. Our approach requires extra schema generation and jar loading
compared to the Protobuf implementation. Our internal implementation
contains a schema inference patch that was moved out of this FLIP document.
I agree it might be worth removing the compile requirement for ease of use.
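For illustration, a Thrift format declared in DDL could mirror the existing
Protobuf format (which uses 'protobuf.message-class-name'). The option names
below ('format' = 'thrift', 'thrift.class') are a hypothetical sketch, not
the final design:

```sql
-- Hypothetical DDL sketch: a Kafka table decoded via a pre-compiled Thrift
-- stub class, modeled on how the Protobuf format is configured today.
CREATE TABLE UserEvents (
  user_id BIGINT,
  event_type STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user-events',
  'format' = 'thrift',                      -- hypothetical format identifier
  'thrift.class' = 'com.example.UserEvent'  -- hypothetical: generated stub on the classpath
);
```

Removing the compile requirement would mean deriving the table schema from
the Thrift IDL directly instead of requiring the generated stub jar.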

Chen


On Wed, Nov 23, 2022 at 6:42 AM Martijn Visser 
wrote:

> Hi Chen,
>
> I'm a bit skeptical of shading Thrift libraries from the Hive connector,
> especially with the plans to externalize connectors (including Hive). Have
> we considered getting the versions in sync to avoid the need of any
> shading?


> The FLIP also shows a version of Thrift (0.5.0-p6) that I don't see in
> Maven central, but the latest version there is 0.17.0. We should support
> the latest version. Do you know when Thrift expects to reach a major
> version? I'm not too fond of not having any major version/compatibility
> guarantees.
>
> The FLIP mentions FlinkKafkaConsumer and FlinkKafkaProducer; these are
> deprecated and should not be implemented, only KafkaSource and KafkaSink.
>
> Can you explain how a Thrift schema can be compiled/used in a SQL
> application, like also is done for Protobuf?
>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/table/formats/protobuf/
>
> Best regards,
>
> Martijn
>
> On Tue, Nov 22, 2022 at 6:44 PM Chen Qin  wrote:
>
> > Hi Yuxia, Martijin,
> >
> > Thanks for your feedback on FLIP-237!
> > My understanding is that FLIP-237 is now better focused on Thrift
> > encoding/decoding in the DataStream API, Table API, and PyFlink.
> > To address the feedback, I made the following changes to the FLIP-237
> > <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-237%3A+Thrift+Format+Support
> >
> >  doc
> >
> >    - removed the table schema inference section, as Flink doesn't have
> >    built-in support yet
> >    - removed partial ser/deser, given it fits better as a Kafka table
> >    source optimization that applies to various encoding formats
> >    - aligned the implementation with the Protobuf Flink support to keep
> >    the code consistent
> >
> > Please take another pass and let me know if you have any questions.
> >
> > Chen
> >
> > On Mon, May 30, 2022 at 6:34 PM Chen Qin  wrote:
> >
> >>
> >>
> >> On Mon, May 30, 2022 at 7:35 AM Martijn Visser <
> martijnvis...@apache.org>
> >> wrote:
> >>
> >>> Hi Chen,
> >>>
> >>> I think the best starting point would be to create a FLIP [1]. One of
> the
> >>> important topics from my point of view is to make sure that such
> changes
> >>> are not only available for SQL users, but are also being considered for
> >>> Table API, DataStream and/or Python. There might be reasons why not to
> do
> >>> that, but then those considerations should also be captured in the
> FLIP.
> >>>
> >>> > Thanks for the pointer; working on FLIP-237, stay tuned.
> >>
> >>> Another thing that would be interesting is how Thrift translates into
> >>> Flink
> >>> connectors & Flink formats. Or is your Thrift implementation only a
> >>> connector?
> >>>
> >> > It's a Flink format for the most part; hope it can help with PyFlink,
> not sure.
> >>
> >>>
> >>> Best regards,
> >>>
> >>> Martijn
> >>>
> >>> [1]
> >>>
> >>>
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> >>>
> >>> Op zo 29 mei 2022 om 19:06 schreef Chen Qin :
> >>>
> >>> > Hi there,
> >>> >
> >>> > We would like to discuss and potentially upstream our Thrift support
> >>> > patches to Flink.
> >>> >
> >>> > For some context, we have internally patched Flink 1.11.2 to support
> >>> > FlinkSQL jobs that read/write Thrift-encoded Kafka sources/sinks. Over
> >>> > the course of the last 12 months, those patches have supported a few
> >>> > features not available in open source master, including
> >>> >
> >>> >- allow a user-defined inference Thrift stub class name in table DDL,
> >>> >Thrift binary <-> Row
> >>> >- dynamically overwrite schema type information loaded from
> >>> >HiveCatalog (Table only)
> >>> >- forward compatible when a Kafka topic is encoded with a new schema
> >>> >(adding a new field)
> >>> >- backward compatible when a job with a new schema handles input or
> >>> >state with an old schema
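> >>> >
> >>> > The forward/backward compatibility above follows from Thrift's
> >>> > field-id rules; a minimal IDL sketch (the struct below is
> >>> > hypothetical, for illustration only):
> >>> >
> >>> > ```thrift
> >>> > // Readers skip unknown field ids, so an old reader can decode
> >>> > // records written with v2, and a new reader treats the missing
> >>> > // field 3 in old records as unset.
> >>> > struct UserEvent {
> >>> >   1: required i64 userId,
> >>> >   2: optional string eventType,
> >>> >   3: optional i64 timestampMs // added in v2: new field id, optional
> >>> > }
> >>> > ```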
> >>> >
> >>> > With more FlinkSQL jobs in production, we expect the cost of
> >>> > maintaining divergent feature sets to increase over the next 6-12
> >>> > months, specifically challenges around
> >>> >
> >>> >- lack of a systematic way to support inference

Re: [VOTE] FLIP-271: Autoscaling

2022-11-26 Thread Zheng Yu Chen
+1(no-binding)

Maximilian Michels  于 2022年11月24日周四 上午12:25写道:

> Hi everyone,
>
> I'd like to start a vote for FLIP-271 [1] which we previously discussed on
> the dev mailing list [2].
>
> I'm planning to keep the vote open for at least until Tuesday, Nov 29.
>
> -Max
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
> [2] https://lists.apache.org/thread/pvfb3fw99mj8r1x8zzyxgvk4dcppwssz
>


[jira] [Created] (FLINK-30221) Fix the bug of sum(try_cast(string as bigint)) return null when partial elements can't convert to bigint

2022-11-26 Thread dalongliu (Jira)
dalongliu created FLINK-30221:
-

 Summary: Fix the bug of sum(try_cast(string as bigint)) return 
null when partial elements can't convert to bigint
 Key: FLINK-30221
 URL: https://issues.apache.org/jira/browse/FLINK-30221
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API, Table SQL / Runtime
Affects Versions: 1.17.0
Reporter: dalongliu






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30220) Hide user credentials in Flink SQL JDBC connector

2022-11-26 Thread Jun Qin (Jira)
Jun Qin created FLINK-30220:
---

 Summary: Hide user credentials in Flink SQL JDBC connector
 Key: FLINK-30220
 URL: https://issues.apache.org/jira/browse/FLINK-30220
 Project: Flink
  Issue Type: Improvement
Reporter: Jun Qin


Similar to FLINK-28028, when using Flink SQL JDBC connector, we should also 
have a way to secure the username and the password used in the DDL:
{code:java}
CREATE TABLE MyUserTable (
  id BIGINT,
  name STRING,
  age INT,
  status BOOLEAN,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydatabase',
  'table-name' = 'users',
  'username' = 'a-username',
  'password' = 'a-password'
);
{code}
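As a hedged illustration of one possible mitigation (not a proposal from this
ticket): Flink's dynamic table options hint allows credentials to be supplied
per query instead of being persisted in the catalog DDL. This still passes
them in plain text at query time, so it only narrows the exposure:

{code:sql}
-- Sketch: register the table without credentials in the catalog DDL
CREATE TABLE MyUserTable (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydatabase',
  'table-name' = 'users'
);

-- Supply credentials per query via the dynamic table options hint
-- (requires table.dynamic-table-options.enabled = true)
SELECT * FROM MyUserTable
  /*+ OPTIONS('username' = 'a-username', 'password' = 'a-password') */;
{code}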


