Re: FlinkKafkaConsumer and FlinkKafkaProducer and Kafka Cluster Migration

2022-01-17 Thread Martijn Visser
Hi Alexey, Just so you know, this feature most likely won't make it to 1.15 unfortunately. Best regards, Martijn On Mon, 17 Jan 2022 at 22:47, Alexey Trenikhun wrote: > Thank you Fabian. > > We are waiting for FLINK-18450 > (watermark

Re: Flink 1.14.2 - Log4j2 -Dlog4j.configurationFile is ignored and falls back to default /opt/flink/conf/log4j-console.properties

2022-01-17 Thread Yang Wang
I think you are right. Before 1.13.0, if the log configuration file does not exist, the logging properties would not be added to the start command. That is why it could work in 1.12.2. However, from 1.13.0, we are not using "kubernetes.container-start-command-template" to generate the JM/TM start

Flink mysql CDC 进程正常,但发现数据丢失了

2022-01-17 Thread Fei Han
@all: 大家好,Flink Mysql CDC实时同步数据,发现mysql和下游StarRocks的数据量对不上。 StarRocks用的是primary key模型, 版本: Flink1.13.3 Flink CDC 2.1.1 报错如下: Caused by: java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records at

Re: Examples / Documentation for Flink ML 2

2022-01-17 Thread Dong Lin
Hi Bonino, Thanks for your interest! Flink ML is currently ready for experienced algorithm developers to try it out because we have setup the basic APIs and infrastructure to develop algorithms. Five algorithms (i.e. kmeans, naive bays, knn, logistic regression and one-hot encoder) has been

Flink Kinesis connector - EFO connection error with http proxy settings

2022-01-17 Thread Gnanamoorthy, Saravanan
Hello, We are using Flink kinesis connector for processing the streaming data from kinesis. We are running the application behind the proxy. After the proxyhost and proxyport settings, the Connector works with default publisher type(Polling) but it doesn’t work when we enable the publisher type

Re: FlinkKafkaConsumer and FlinkKafkaProducer and Kafka Cluster Migration

2022-01-17 Thread Alexey Trenikhun
Thank you Fabian. We are waiting for FLINK-18450 (watermark alignment) before switching to KafkaSource, currently we use extra logic on top of FlinkKafkaConsumer to support watermark alignment. Thanks, Alexey [FLINK-18450] Add watermark

Re: Flink 1.14.2 - Log4j2 -Dlog4j.configurationFile is ignored and falls back to default /opt/flink/conf/log4j-console.properties

2022-01-17 Thread Tamir Sagi
Hey Yang, thanks for answering, TL;DR Assuming I have not missed anything , the way TM and JM are created is different between these 2 versions, but it does look like flink-console.sh gets called eventually with the same exec command. in 1.12.2 if

Re: Examples / Documentation for Flink ML 2

2022-01-17 Thread Dawid Wysakowicz
I am adding a couple of people who worked on it. Hopefully, they will be able to answer you. On 17/01/2022 13:39, Bonino Dario wrote: > > Dear List, > > We are in the process of evaluating Flink ML version 2.0 in the > context of some ML task mainly concerned with classification and > clustering.

Re: [statefun] upgrade path - shared cluster use

2022-01-17 Thread Dawid Wysakowicz
I am pretty confident the goal is to be able to run on the newest Flink version. However, as the release cycle is decoupled for both modules it might take a bit. I added Igal to the conversation, who I hope will be able to give you an idea when you can expect that to happen. Best, Dawid On

Examples / Documentation for Flink ML 2

2022-01-17 Thread Bonino Dario
Dear List, We are in the process of evaluating Flink ML version 2.0 in the context of some ML task mainly concerned with classification and clustering. While algorithms for this 2 domains are already present, although in a limited form (perhaps) in the latest release of Flink ML, we did not

Re: Flink per-job cluster HbaseSinkFunction fails before starting - Configuration issue

2022-01-17 Thread Dawid Wysakowicz
Hey Kamil, Have you followed this guide to setup kerberos authentication[1]? Best, Dawid [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/security/security-kerberos/ On 14/01/2022 17:09, Kamil ty wrote: > Hello all, > I have a flink job that is using the

Re: Flink native k8s integration vs. operator

2022-01-17 Thread Gyula Fóra
Hi Yang! Thanks for the input! I agree with you on both points that you made. Even if we might support both standalone and native modes in the long run, we should probably build the first version on top of the native integration. This I feel will result in a much simpler, minimalistic first

Re: [E] Re: Orphaned job files in HDFS

2022-01-17 Thread Yang Wang
The clean-up of the staging directory is best effort. If the JobManager crashed and killed externally, then it does not have any chance to do the staging directory clean-up. AFAIK, we do not have such Flink options to guarantee the clean-up. Best, Yang David Clutter 于2022年1月11日周二 22:59写道: >

Re: Flink 1.14.2 - Log4j2 -Dlog4j.configurationFile is ignored and falls back to default /opt/flink/conf/log4j-console.properties

2022-01-17 Thread Yang Wang
I think the root cause is that we are using "flink-console.sh" to start the JobManager/TaskManager process for native K8s integration after FLINK-21128[1]. So it forces the log4j configuration name to be "log4j-console.properties". [1]. https://issues.apache.org/jira/browse/FLINK-21128 Best,

[statefun] upgrade path - shared cluster use

2022-01-17 Thread Filip Karnicki
Hi, we're currently using statefun 3.1.1 on a shared cloudera cluster, which is going to be updated to 1.14.x We think this update might break our jobs, since 3.1.1 is not explicitly compatible with 1.14.x ( https://flink.apache.org/downloads.html#apache-flink-stateful-functions-311) Is there

Disable S3 HTTPS host name verification

2022-01-17 Thread Tim Eckhardt
Hi, we are trying to get the S3 integration to work for storing checkpoints and savepoints. Our S3 endpoint is self-hosted so we’re not using AWS in particular. The problem I’m facing is that I’m connecting via HTTPS (443) but I have to disable host name verification for now. But I

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-01-17 Thread wangqinghuan
we are using Datahub to address table-level lineage and column-level lineage for Flink SQL. 在 2022/1/13 23:27, Martijn Visser 写道: Hi everyone, I'm currently checking out different metadata platforms, such as Amundsen [1] and Datahub [2]. In short, these types of tools try to address

How to accelerate state processor with a large savepoint

2022-01-17 Thread Hua Wei Chen
Hi team, We want to try to use state processor APIs[1] to clean up some legacy states. Here are our steps: 1. Create a new savepoint (~= 1.5TB) 2. Submit state processor jobs 3. Write results to a new savepoint We create 8 task managers with 120 slots to execute it. Here are the related

Re: Flink native k8s integration vs. operator

2022-01-17 Thread Yang Wang
Glad to see that the interest of this thread keeps going. And thanks Thomas, Gyula, and Marton for driving this effort. I want to share my two cents about the Flink K8s operator. > Standalone deployment VS native K8s integration There is already some feature requirement issue[1] for the