Re: Flink Kafka EXACTLY_ONCE causing KafkaException ByteArraySerializer is not an instance of Serializer

2020-07-30 Thread Qingsheng Ren
from: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ > -- Best Regards, *Qingsheng Ren* Electrical and Computer Engineering Carnegie Mellon University Email: renqs...@gmail.com

Re: Flink Kafka EXACTLY_ONCE causing KafkaException ByteArraySerializer is not an instance of Serializer

2020-08-02 Thread Qingsheng Ren
t; 1.10 at the moment. Are there any known work arounds? > > On Fri, Jul 31, 2020 at 02:42 Qingsheng Ren wrote: > >> Hi Vikash, >> >> It's a bug about classloader used in `abortTransaction()` method in >> `FlinkKafkaProducer`, Flink version 1.10.0. I think it has

Re: KafkaSource metrics

2021-05-25 Thread Qingsheng Ren
Hi Oscar, Thanks for raising this problem! Currently metrics of KafkaConsumer are not registered in Flink as in FlinkKafkaConsumer. A ticket has been created on JIRA, and hopefully we can fix it in next release. https://issues.apache.org/jira/browse/FLINK-22766 -- Best Regards, Qingsheng Ren

Re: KafkaFetcher [] - Committing offsets to Kafka failed.

2021-08-27 Thread Qingsheng Ren
be helpful to set logging level of org.apache.kafka.clients.consumer to DEBUG or TRACE,  which can provide more information about why offset commit is failed. Hope this can help you~ [1] https://kafka.apache.org/documentation/#consumer_monitoring -- Best Regards, Qingsheng Ren Email: renqs

Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Qingsheng Ren
expansion of Flink ecosystem. Very excited to see the progress! Best, Qingsheng Ren On Oct 15, 2021, 8:47 PM +0800, Arvid Heise , wrote: > Dear community, > Today I would like to kickstart a series of discussions around creating an > external connector repository. The main idea is to d

Re: Problem with Flink job and Kafka.

2021-10-18 Thread Qingsheng Ren
aste part of your code (on DataStream API) or SQL (on Table & SQL API). -- Best Regards, Qingsheng Ren Email: renqs...@gmail.com On Oct 19, 2021, 9:28 AM +0800, Marco Villalobos , wrote: > I have the simplest Flink job that simply deques off of a kafka topic and > writes to another ka

Re: SplitFetcherManager custom error handler

2021-10-18 Thread Qingsheng Ren
, Qingsheng Ren Email: renqs...@gmail.com On Oct 19, 2021, 8:31 AM +0800, Mason Chen , wrote: > Hi all, > > I am implementing a Kafka connector with some custom error handling that is  > aligned with our internal infrastructure. `SplitFetcherManager` has a > hardcoded error

Re: Flink 1.14 doesn’t suppport kafka consummer 0.11 or lower?

2021-10-19 Thread Qingsheng Ren
n try to switch to a higher version Kafka client and it should work. [1] https://kafka.apache.org/protocol.html#protocol_compatibility -- Best Regards, Qingsheng Ren Email: renqs...@gmail.com On Oct 20, 2021, 11:18 AM +0800, Jary Zhen , wrote: > Hi, everyone > > I'm using Flink 1.14

Re: Kafka Source Recovery Behavior

2021-11-16 Thread Qingsheng Ren
ooking forward to your inspiring ideas! -- Best Regards, Qingsheng Ren Email: renqs...@gmail.com On Nov 10, 2021, 11:32 PM +0800, Mason Chen , wrote: > > there was no logic to filter/remove splits

Re: Require help regarding possible issue/bug I'm facing while using Flink

2022-03-06 Thread Qingsheng Ren
. Best regards, Qingsheng Ren > On Mar 7, 2022, at 09:16, Chia De Xun . wrote: > > Greetings, > > I'm facing a difficult issue/bug while working with Flink. Would definitely > appreciate some official expert help on this issue. I have posted my problem > on StackOve

Re: [External] Require help regarding possible issue/bug I'm facing while using Flink

2022-03-07 Thread Qingsheng Ren
for now. Also answered on your StackOverflow post. Hope this would be helpful! [1] https://issues.apache.org/jira/browse/FLINK-17782 Best Regards, Qingsheng > On Mar 7, 2022, at 18:48, De Xun Chia wrote: > > Hi Qingsheng Ren, > > > Thank you for the help! It worked and

Re: Kafka source with multiple partitions loses data during savepoint recovery

2022-03-18 Thread Qingsheng Ren
Hi Sharon, Could you check the log after starting the job with savepoint? If you have INFO log enabled you will get an entry “Consumer subtask {} will start reading {} partitions with offsets in restored state: {}” [1] in the log, which shows the starting offset of partitions. This might be he

Re: how to set kafka sink ssl properties

2022-03-18 Thread Qingsheng Ren
Hi, Your usage looks good to me, but could you provide the exception (if any) or the unexpected behavior you met after starting the job? It’s difficult to debug with only these configurations. Best regards, Qingsheng > On Mar 18, 2022, at 01:04, HG wrote: > > Hi Matthias, > > It should

Re: Flink kafka consumer disconnection, application processing stays behind

2022-03-24 Thread Qingsheng Ren
Hi Isidoros, I’m not sure in which kind of way the timeout and the high back pressure are related, but I think we can try to resolve the request timeout issue first. You can take a look at the request log on Kafka broker and see if the request was received by broker, and how long it takes for b

Re: Flink kafka consumer disconnection, application processing stays behind

2022-03-25 Thread Qingsheng Ren
. I don't know actually if I have helped, but is there any > chance that it would be a problem of how we have configured the watermarks? > > Στις Πέμ 24 Μαρ 2022 στις 10:27 π.μ., ο/η Qingsheng Ren > έγραψε: > Hi Isidoros, > > I’m not sure in which kind of way the timeou

Re: Datetime format

2022-03-28 Thread Qingsheng Ren
Hi, File system table sink doesn’t provide APIs for changing the prefix or suffix of the generated filename. Maybe you can consider trying DataStream connector and set OutputFileConfig manually to specify prefix and suffix of generating filenames. Best, Qingsheng > On Mar 28, 2022, at 13:10

Re: Where is the "Partitioned All Cache" doc?

2022-03-28 Thread Qingsheng Ren
Hi, The optimization you mentioned is only applicable for the product provided by Alibaba Cloud. In open-source Apache Flink there isn’t a unique caching abstraction for all lookup tables, and each connector has there own cache implementation. For example JDBC uses Guava cache and FileSystem u

Re: Could you please give me a hand about json object in flink sql

2022-04-01 Thread Qingsheng Ren
Hi, I’m afraid you have to use a UDTF to parse the content and construct the final json string manually. The key problem is that the field “content” is actually a STRING, although it looks like a json object. Currently the json format provided by Flink could not handle this kind of field defin

Re: Unbalanced distribution of keyed stream to downstream parallel operators

2022-04-01 Thread Qingsheng Ren
Hi Isidoros, Two assumptions in my mind: 1. Records are not evenly distributed across different keys, e.g. some accountId just has more events than others. If the record distribution is predicable, you can try to combine other fields or include more information into the key field to help bala

Re: Could you please give me a hand about json object in flink sql

2022-04-02 Thread Qingsheng Ren
.. > ); > > And in input json also might contains json array, like: > {"schema": "schema_infos", "payload": {"id": "1", "content": > "{\"color\":\"Red\",\"BackgroundColor\":

Re: Could you please give me a hand about json object in flink sql

2022-04-02 Thread Qingsheng Ren
ort single physical column, > and in our project reqiurement, there are more than one hundred columns in > sink table. So I need combine those columns into one string in a single UDF? > > Thanks && Regards, > Hunk > > > > > > > > At 2022-04

Re: FlinkKafkaProducer - Avro - Schema Registry

2022-04-07 Thread Qingsheng Ren
Hi Dan, In FlinkKafkaProducer, records are serialized by the SerializationSchema specified in the constructor, which is the “schema” (ConfluentRegistryAvroSerializationSchema.forSpecific(AvroObject.class)) in your case, instead of the serializer specified in producer properties. The default se

Re: Is there any way to get the ExecutionConfigurations in Dynamic factory class

2022-04-07 Thread Qingsheng Ren
Hi Anitha, AFAIK DynamicTableSourceFactory doesn’t expose interface for getting parallelism. Could you elaborate on why you need parallelism in table factory? Maybe we could find other ways to fulfill your requirement. Best regards, Qingsheng > On Apr 7, 2022, at 16:11, Anitha Thankappan

Re: Weird Flink Kafka source watermark behavior

2022-04-08 Thread Qingsheng Ren
Hi Jin, If you are using new FLIP-27 sources like KafkaSource, per-partition watermark (or per-split watermark) is a default feature integrated in SourceOperator. You might hit the bug described in FLINK-26018 [1], which happens during the first fetch of the source that records in the first spl

Re: Discuss making KafkaSubscriber Public

2022-04-13 Thread Qingsheng Ren
Thanks for the proposal Mason! I think exposing `KafkaSubscriber` as public API is helpful for users to implement more complex subscription logics. +1 (non-binding) Cheers, Qingsheng > On Apr 12, 2022, at 11:46, Mason Chen wrote: > > Hi Flink Devs, > > I was looking to contribute to > ht

Re: Is there any way to get the ExecutionConfigurations in Dynamic factory class

2022-04-13 Thread Qingsheng Ren
Thanks and Regards, > Anitha Thankappan > > > On Fri, Apr 8, 2022 at 11:53 AM Qingsheng Ren wrote: > Hi Anitha, > > AFAIK DynamicTableSourceFactory doesn’t expose interface for getting > parallelism. Could you elaborate on why you need parallelism in table > factory? Mayb

Re: Weird Flink Kafka source watermark behavior

2022-04-13 Thread Qingsheng Ren
. is there a way to easily test > this fix locally? based on the threads, should i just move back to > FlinkKafkaConsumer until 1.14.5? > > On Fri, Apr 8, 2022 at 1:34 AM Qingsheng Ren wrote: > Hi Jin, > > If you are using new FLIP-27 sources like KafkaSource, pe

Re: Weird Flink Kafka source watermark behavior

2022-04-13 Thread Qingsheng Ren
Another solution would be setting the parallelism = #partitions, so that one parallelism would be responsible for reading exactly one partition. Qingsheng > On Apr 13, 2022, at 17:52, Qingsheng Ren wrote: > > Hi Jin, > > Unfortunately I don’t have any quick bypass in mind ex

Re: How can I set job parameter in flink sql

2022-05-24 Thread Qingsheng Ren
Hi, You can take use of the configuration “pipeline.global-job-parameters” [1] to pass your custom configs all the way into the UDF. For example you can execute this in SQL client: SET pipeline.global-job-parameters=black_list_path:/root/list.properties; Then you can get the value “/root/list.

Re: Json Deserialize in DataStream API with array length not fixed

2022-05-24 Thread Qingsheng Ren
Hi Zain, I assume you are using DataStream API as described in the subject of your email, so I think you can define any functions/transformations to parse the json value, even the schema is changing. It looks like the value of field “array_coordinates” is a an escaped json-formatted STRING in

Re: Source vs SourceFunction and testing

2022-05-24 Thread Qingsheng Ren
Hi Piotr, I’d like to share my understanding about this. Source and SourceFunction are both interfaces to data sources. SourceFunction was designed and introduced earlier and as the project evolved, many shortcomings emerged. Therefore, the community re-designed the source interface and introdu

Re: Source vs SourceFunction and testing

2022-05-25 Thread Qingsheng Ren
nvironment.getExecutionEnvironment() multiple times and get > the same environment, was wrong. I had env.addSource and env.fromSource calls > using one instance of the environment, but then called env.execute() on > another instance :facepalm: > > On Wed, May 25, 2022 at 6

Re: Can we use CheckpointedFunction with the new Source api?

2022-05-30 Thread Qingsheng Ren
Hi Qing, I’m afraid CheckpointedFunction cannot be applied to the new source API, but could you share the abstractions of your source implementation, like which component a split maps to etc.? Maybe we can try to do some workarounds. Best, Qingsheng > On May 30, 2022, at 20:09, Qing Lim wr

Re: FileSource SourceReader failure scenario

2022-05-31 Thread Qingsheng Ren
Hi Meghajit, Good question! To make a short answer: splits won’t be returned back to enumerator by reader once they are assigned and *checkpointed*. As described by the JavaDoc of SplitEnumerator#addSplitsBack [1]: > Add a split back to the split enumerator. It will only happen when a > Sourc

Re: Can we use CheckpointedFunction with the new Source api?

2022-05-31 Thread Qingsheng Ren
> something reusable within our organization, which is why I asked the original > question, but I have now solved it by implementing a stateful Map function > instead, it is a bit less ergonomic, but acceptable on my end. So if you have > an alternative, please share with me, tha

Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-01 Thread Qingsheng Ren
Hi ChangZhuo, I assume it’s a classloading issue but I can’t track down to the root cause in code. Would you mind sharing the entire exception stack and some JM/TM logs related to file system? Best regards, Qingsheng > On Jun 2, 2022, at 09:08, ChangZhuo Chen (陳昌倬) wrote: > > Hi, > > We u

Re: Cannot cast GoogleHadoopFileSystem to hadoop.fs.FileSystem to list file in Flink 1.15

2022-06-02 Thread Qingsheng Ren
wrote: > > On Thu, Jun 02, 2022 at 11:17:19AM +0800, Qingsheng Ren wrote: > > Hi ChangZhuo, > > > > I assume it’s a classloading issue but I can’t track down to the root cause > > in code. Would you mind sharing the entire exception stack and some JM/TM > >

Re: filesink part files roll over

2022-06-06 Thread Qingsheng Ren
Hi Sucheth, Please see https://issues.apache.org/jira/browse/FLINK-27910 Best, Qingsheng > On Jun 5, 2022, at 23:21, Sucheth S wrote: > > Hi, > > Can someone please help me with this please - > https://stackoverflow.com/q/72496963/9125940 ? > > Regards, > Sucheth Shivakumar > website : htt

Re: [External] Re: Source vs SourceFunction and testing

2022-06-09 Thread Qingsheng Ren
t is planned to implement the FromElementsSource we'd rather prefer to wait > for it. > > Thanks! > Carlos > > [1] > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/testing/#junit-rule-miniclusterwithclientresource > > -Original Mes

Fwd: Apache flink doesn't work with avro kafka topic with multiple event types

2022-06-13 Thread Qingsheng Ren
Hi Sucheth, If you are referring to Table / SQL API, I'm afraid it doesn't support schema evolution or different types from one Kafka table. An alternative way is to consume the topic with raw format [1] and do deserialization with a UDTF. If you are using the DataStream API, you can implement the

Re: Flink Shaded dependencies and extending Flink APIs

2022-06-13 Thread Qingsheng Ren
Hi Andrew, This is indeed a tricky case since Flink doesn't provide non-shaded JAR for flink-json. One hacky solution in my mind is like: 1. Create a module let's say "wikimedia-event-utilities-shaded" that relocates Jackson in the same way and uses the same Jackson version as flink-shaded-jackso

Re: Kafka Consumer commit error

2022-06-15 Thread Qingsheng Ren
Hi, Thanks for reporting the issue and the demo provided by Christian! I traced the code and think it's a bug in KafkaConsumer (see KAFKA-13563 [1]). We probably need to bump the Kafka client to 3.1 to fix it but we should check the compatilibity issue first because it’s crossing major version

Re: [ANNOUNCE] Apache Flink 1.14.5 released

2022-06-26 Thread Qingsheng Ren
Thanks Xingbo for driving this release! Best, Qingsheng > On Jun 22, 2022, at 11:50, Xingbo Huang wrote: > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.14.5, which is the fourth bugfix release for the Apache Flink 1.14 > series. > > Apache Flink® is

Re: Synchronizing streams in coprocessfunction

2022-06-27 Thread Qingsheng Ren
Hi Gopi, What about using a window with a custom trigger? The window is doing nothing but aggregating your input to a collection. The trigger accepts metadata from the low input stream so it can fire and purge the window (emit all elements in the window to downstream) on arrival of metadata.

[DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-08 Thread Qingsheng Ren
Hi devs and users, I’d like to start a discussion about reverting a breaking change about sink  metrics made in 1.15 by FLINK-26126 [1] and FLINK-26492 [2]. TL;DR All sink metrics with name “numXXXOut” defined in FLIP-33 are replace by  “numXXXSend” in FLINK-26126 and FLINK-26492. Considering me

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-10 Thread Qingsheng Ren
t; > >> +1 for reverting sink metric name. > > > >> > > > >> We often forget that metric is also one of the important APIs. > > > >> > > > >> +1 for releasing 1.15.3 to fix this. > > > >> > > > >> Best, &g

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-10 Thread Qingsheng Ren
unted both (which > is completely wrong). > > A new metric was always required; otherwise you inevitably end up breaking > some semantic. > Adding a new metric for what the sink writes to the external system is, for > better or worse, more consistent with how these metric

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-12 Thread Qingsheng Ren
d suggest is to stick with what we got (although I despise the > name numRecordsSend), and alias the numRecordsOut metric for all > non-TwoPhaseCommittingSink. > > On 11/10/2022 05:54, Qingsheng Ren wrote: > > Thanks for the details Chesnay! > > By “alias” I mean to respec

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-12 Thread Qingsheng Ren
output of sink is a metric that users care a lot about. Thanks, Qingsheng On Wed, Oct 12, 2022 at 6:20 PM Qingsheng Ren wrote: > > Thanks Chesnay for the reply. +1 for making a unified and clearer > metric definition distinguishing internal and external data transfers. > As you describe

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Qingsheng Ren
db30d001a95de95b3b9993eeb06f558f6c/flink-metrics/flink-metrics-core/src/main/java/org/apache/flink/metrics/groups/SinkWriterMetricGroup.java#L48 > > > On Wed, Oct 12, 2022 at 12:48 PM Qingsheng Ren wrote: > >> As a supplement, considering it could be a big reconstruction >>

Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-13 Thread Qingsheng Ren
internal traffic in the future. > > Best regards, > Jing > > > On Thu, Oct 13, 2022 at 3:08 PM Qingsheng Ren wrote: > >> Hi Jing, >> >> Thanks for the reply! >> >> Let me rephrase my proposal: we’d like to use numXXXOut registered on >> SinkWri

[DISCUSS] Planning Flink 1.17

2022-10-20 Thread Qingsheng Ren
lease! Best regards, Qingsheng Ren and Leonard Xu Ververica (Alibaba) [1] https://cwiki.apache.org/confluence/display/FLINK/Release+Management+and+Feature+Plan

[SUMMARY] Flink 1.17 Release Sync 11/1/2022

2022-11-01 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights about the 1.17 release sync on 11/1/2022. - The release sync is scheduled for biweekly Tuesdays at 9am (Central European Standard Time) / 4pm (China Standard Time). - Retrospective for 1.16: a discussion [1] is opened by Matthias for collectin

[SUMMARY] Flink 1.17 Release Sync 11/29/2022

2022-11-29 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from the release sync on 11/29/2022. 1. @Contributors please update your progress on the release 1.17 wiki page [1] before the sync meeting so that everyone could track it. 2. We have new CI stability tickets and owners should have been pinged

[SUMMARY] Flink 1.17 Release Sync 1/10/2023

2023-01-10 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from the release sync on 1/10/2023. - CI stabilities: owners of blocker issues should have been pinged offline. - Priorities of the test instabilities: test instabilities are prioritized as Critical and become blocker as soon as we notice that

[SUMMARY] Flink 1.17 Release Sync 1/17/2023

2023-01-17 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from the release sync on 1/17/2023. - CI & Performance: totally 5 blocker issues. Owners should have been pinged. - FLIP-272 [1] has finished and there will be a blog post before 1.17 release. - PR about publishing SBOM [2] has been merged, a

[SUMMARY] Flink 1.17 Release Sync 2/14/2023

2023-02-14 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from Flink 1.17 release sync on 2/14/2023. Release testing: - The original deadline of cross-team testing is Feb 21, 2023 (next Tuesday). We will monitor the status throughout the week and hopefully conclude everything before the deadline. - P

[SUMMARY] Flink 1.17 Release Sync 2/28/2023

2023-02-28 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from Flink 1.17 release sync on 2/28/2023. Release testing: - All release testing tasks have finished in the last week. Big thanks to our contributors and volunteers for the effort on this! 1.17 Blockers: There are 2 blockers currently: FLINK-

Re: [ANNOUNCE] Apache Flink 1.17.0 released

2023-03-24 Thread Qingsheng Ren
I'd like to say thank you to all contributors of Flink 1.17. Your support and great work together make this giant step forward! Also like Matthias mentioned, feel free to leave us any suggestions and let's improve the releasing procedure together. Cheers, Qingsheng On Fri, Mar 24, 2023 at 5:00 P

[ANNOUNCE] Starting with Flink 1.18 Release Sync

2023-04-03 Thread Qingsheng Ren
Hi everyone, As a fresh start of the Flink release 1.18, I'm happy to share with you that the first release sync meeting of 1.18 will happen tomorrow on Tuesday, April 4th at 10am (UTC+2) / 4pm (UTC+8). Welcome and feel free to join us and share your ideas about the new release cycle! Details of

[SUMMARY] Flink 1.18 Release Sync 05/30/2023

2023-05-30 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from the release sync of 1.18 on May 30. 1. @developers please update the progress of your features on 1.18 release wiki page [1] ! That will help us a lot to have an overview of the entire release cycle. 2. We found a JIRA issue (FLINK-18356)

Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Qingsheng Ren
Congratulations and big THANK YOU to everyone helping with this release! Best, Qingsheng On Fri, Oct 27, 2023 at 10:18 AM Benchao Li wrote: > Great work, thanks everyone involved! > > Rui Fan <1996fan...@gmail.com> 于2023年10月27日周五 10:16写道: > > > > Thanks for the great work! > > > > Best, > > Rui

[ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Qingsheng Ren
ease possible! Regards, Qingsheng Ren

[ANNOUNCE] Apache Flink CDC 3.1.1 released

2024-06-18 Thread Qingsheng Ren
ease possible! Regards, Qingsheng Ren