Re: recover from svaepoint

2021-06-02 Thread Till Rohrmann
Forwarding 周瑞's message to a duplicate thread: After our analysis, we found a bug in the org.apache.flink.streaming.connectors.kafka.internals.FlinkKafkaInternalProducer.resumeTransaction method The analysis process is as follows: org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkF

Re: How to check is Kafka partition "idle" in emitRecordsWithTimestamps

2021-06-02 Thread Till Rohrmann
Hi Alexey, I think the current idleness detection works based on timeouts. You need a special watermark generator that periodically emits the watermarks. If no event has been emitted for so and so long, then it is marked as idle. Yes, I was referring to FLINK-18450. At the moment nobody is activ

Re: Prometheus Reporter Enhancement

2021-06-02 Thread Chesnay Schepler
Let's move this to the ticket then. :) On 6/2/2021 8:45 PM, Mason Chen wrote: Hi Chesnay, I would like to take on https://issues.apache.org/jira/browse/FLINK-17495  as a contribution to OSS, if that’s alright with the team. We can discuss im

In native k8s application mode, how can I know whether the job is failed or finished?

2021-06-02 Thread 刘逍
Hi, We are currently using Flink 1.6 standalone mode, but the lack of isolation is a headache for us. At present, I am trying application mode of Flink 1.13.0 on native K8s. I found that as soon as the job ends, whether it ends normally or abnormally, the jobmanager can no longer be accessed, so

Re: SourceFunction cannot run in Batch Mode

2021-06-02 Thread Ingo Bürk
Hi Oscar, I think you'll find your answers in [1], have a look at Yun's response a couple emails down. Basically, SourceFunction is the legacy source stack, and ideally you'd instead implement your source using the FLIP-27 stack[2] where you can directly define the boundedness, but he also mention

Re: SourceFunction cannot run in Batch Mode

2021-06-02 Thread 陳樺威
Sorry, there are some typos that may be misleading. The SourceFunction will be detected as* Streaming Mode.* 陳樺威 於 2021年6月3日 週四 下午1:29寫道: > Hi, > > Currently, we want to use batch execution mode [0] to consume historical > data and rebuild states for our streaming application. > The Flink app w

SourceFunction cannot run in Batch Mode

2021-06-02 Thread 陳樺威
Hi, Currently, we want to use batch execution mode [0] to consume historical data and rebuild states for our streaming application. The Flink app will be run on-demand and close after complete all the file processing. We implement a SourceFuntion [1] to consume bounded parquet files from GCS. Howe

Flink stream processing issue

2021-06-02 Thread Qihua Yang
Hi, I have a question. We have two data streams that may contain duplicate data. We are using keyedCoProcessFunction to process stream data. I defined the same keySelector for both streams. Our flink application has multiple replicas. We want the same data to be processed by the same replica. Can

Corrupted unaligned checkpoints in Flink 1.11.1

2021-06-02 Thread Alexander Filipchik
Hi, Trying to figure out what happened with our Flink job. We use flink 1.11.1 and run a job with unaligned checkpoints and Rocks Db backend. The whole state is around 300Gb judging by the size of savepoints. The job ran ok. At some point we tried to deploy new code, but we couldn't take a save p

Re: Parquet reading/writing

2021-06-02 Thread Taras Moisiuk
Thank you, looks like shuffle() works -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Prometheus Reporter Enhancement

2021-06-02 Thread Mason Chen
Hi Chesnay, I would like to take on https://issues.apache.org/jira/browse/FLINK-17495 as a contribution to OSS, if that’s alright with the team. We can discuss implementation details here or in the ticket, but I was thinking about modifying th

Flink kafka consumers stopped consuming messages

2021-06-02 Thread Ilya Karpov
Hi there, today I've observed strange behaviour of a flink streaming application (flink 1.6.1, per-job cluster deployment, yarn): 3 task managers (2 slots each) are running but only 1 slot is actually consuming messages from kafka (v0.11.0.2), others were idling (currentOutputWatermark was stuc

Re: avro-confluent supports authentication enabled schema registry

2021-06-02 Thread tao xiao
JIRA created https://issues.apache.org/jira/browse/FLINK-22858 but I cannot assign it to myself. Can you pls assign it to me? On Wed, Jun 2, 2021 at 11:00 PM Fabian Paul wrote: > Hi Tao, > > I was browsing the code a bit and I think this is currently not support > but it seems to be not too > di

Re: avro-confluent supports authentication enabled schema registry

2021-06-02 Thread Fabian Paul
Hi Tao, I was browsing the code a bit and I think this is currently not support but it seems to be not too difficult to implement. You would need to allow a map of configurations and finally pass it to [1] Can you create a ticket in our JIRA? Would you be willing to contribute this feature? B

Re: Parquet reading/writing

2021-06-02 Thread Taras Moisiuk
Hi Fabian, Thank you for your answer. I've updated the flink version to 1.12.4 but unfortunately the problem still persists. I'm running this job in local mode, so I have only following log: WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.a

Re: recover from svaepoint

2021-06-02 Thread Piotr Nowojski
Hi, I think there is no generic way. If this error has happened indeed after starting a second job from the same savepoint, or something like that, user can change Sink's operator UID. If this is an issue of intentional recovery from an earlier checkpoint/savepoint, maybe `FlinkKafkaProducer#setL

Re: Parquet reading/writing

2021-06-02 Thread Fabian Paul
Hi Taras, On first glance this looks like a bug to me. Can you try the latest 1.12 version (1.12.4)? If the bug still persists can you share the full job manager and task manager logs to further debug this problem. Best, Fabian > On 2. Jun 2021, at 13:22, Taras Moisiuk wrote: > > Update: >

Re: avro-confluent supports authentication enabled schema registry

2021-06-02 Thread tao xiao
Hi Fabian, Unfortunately this will not work in our environment where we implement our own io.confluent.kafka.schemaregistry.client.security.bearerauth.BearerAuthCredentialProvider which does the login and supplies the JWT to authorization HTTP header. The only way it will work is to pass the sche

Re: DSL for Flink CEP

2021-06-02 Thread Fabian Paul
Hi Dipanjan, I am afraid there are no foreseeable efforts planned but if you find a nice addition, you can start a discussion in the community about this feature. Best, Fabian > On 2. Jun 2021, at 12:10, Dipanjan Mazumder wrote: > > Hi Fabian, > > Understood but is there any plan to gro

Re: Parquet reading/writing

2021-06-02 Thread Taras Moisiuk
Update: The job is working correctly if add an additional identity mapping step: env.createInput(parquetInputFormat) .map(record => record) .sinkTo(FileSink.forBulkFormat...) -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Parquet reading/writing

2021-06-02 Thread Taras Moisiuk
Hi, I'm trying to read parquet file with Flink 1.12.0 Scala API and save it as another parquet file. Now it's working correctly with ParquetRowInputFormat: val inputPath: String = ... val messageType: MessageType = ... val parquetInputFormat = new ParquetRowInputFormat(new Path(inputPath), mess

Re: DSL for Flink CEP

2021-06-02 Thread Dipanjan Mazumder
Hi Fabian,      Understood but is there any plan to grow the flink  CEP and build a friendly DSL around flink CEP by any chance. RegardsDipanjan On Wednesday, June 2, 2021, 03:22:46 PM GMT+5:30, Fabian Paul wrote: Hi Dipanjan, Unfortunately, I have no experience with Siddhi but I am no

Re: DSL for Flink CEP

2021-06-02 Thread Fabian Paul
Hi Dipanjan, Unfortunately, I have no experience with Siddhi but I am not aware of any official joined efforts between Flink and Siddhi. I can imagine that not all Siddhi CEP expressions are compatible with Flink’s CEP. At the moment there is also no active development for Flink’s CEP. I think

Re: avro-confluent supports authentication enabled schema registry

2021-06-02 Thread Fabian Paul
Hi Tao, Thanks for reaching out. Have you tried the following 'value.avro-confluent.schema-registry.url' = 'https://{SR_API_KEY}:{SR_API_SECRET}@psrc-lzvd0.ap-southeast-2.aws.confluent.cloud', It may be possible to provide basic HTTP authentication by adding your username and password to t

avro-confluent supports authentication enabled schema registry

2021-06-02 Thread tao xiao
Hi team, Confluent schema registry supports HTTP basic authentication[1] but I don't find the corresponding configs in Flink documentation[2]. Is this achievable in Flink avro-confluent? [1] https://docs.confluent.io/platform/current/confluent-security-plugins/schema-registry/install.html#authen