Re: Question

2021-02-12 Thread Matthias Pohl
Checkpoints are stored in some DFS storage. The location can be specified using state.checkpoints.dir configuration property [1]. You can access the state of a savepoint or checkpoint using the State Processor API [2]. Best, Matthias [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/

Re: Question

2021-02-12 Thread Abu Bakar Siddiqur Rahman Rocky
Thank you for your reply. Another Question: After Checkpointing, we save our snapshot to a storage. How can we access the storage? is this the source code: https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpointStore.java If

Re: Exception when writing part file to S3

2021-02-12 Thread Robert Metzger
Hey, Could it be this problem: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/What-S3-Permissions-does-StreamingFileSink-need-td31426.html ? is this problem reproducible, or just happening every now and then (not saying that this makes it less worse). Have you tried the prest

Re: CDC for MS SQL Server

2021-02-12 Thread Robert Metzger
Hey John, I haven't worked with the flink-cdc-connectors [1] myself. But if you take the MySQL one as a template, it shouldn't be too difficult to adjust it to MS SQL server. If you are doing that work, it would be nice if you would contribute it back to the repo ;) I don't think that you need to

Re: How to debug flink serialization error?

2021-02-12 Thread Robert Metzger
Thanks for reaching out to the Flink ML. It reports getMetricStoreProgramHelper as a non-serializable field, even though it looks a lot like a method. The only recommendation I have for you is carefully reading the full error message + stack trace. Your approach of using tagging fields as "transi

Re: State Access Beyond RichCoFlatMapFunction

2021-02-12 Thread Sandeep khanzode
Oh okay. Got it. I will check. Thanks. > On 12-Feb-2021, at 3:14 PM, Kezhu Wang wrote: > > Hi Sandeep, > > I must mislead you by inaccurate words. I did not mean using > CoGroupedStreams, but only CoGroupedStreams.apply as reference for how to > union streams together and keyBy them. This way

Re: Optimizing Flink joins

2021-02-12 Thread Dan Hill
Flink SQL is missing a reliable, commonly-used way of evolving a job (e.g. savepointing and checkpointing). The other concepts I heard about are not shared publicly enough to rely on (e.g. roll off). I wasn't able to find anything useful on this. On Fri, Feb 12, 2021, 02:05 Timo Walther wrote:

Re: Optimizing Flink joins

2021-02-12 Thread Timo Walther
Hi Dan, thanks for letting us know. Could you give us some feedback what is missing in SQL for this use case? Are you looking for some broadcast joining or which kind of algorithm would help you? Regards, Timo On 11.02.21 20:32, Dan Hill wrote: Hi Timo!  I'm moving away from SQL to DataStre

Re: State Access Beyond RichCoFlatMapFunction

2021-02-12 Thread Kezhu Wang
Hi Sandeep, I must mislead you by inaccurate words. I did not mean using CoGroupedStreams, but only CoGroupedStreams.apply as reference for how to union streams together and keyBy them. This way you can have all three streams’ states in downstream without duplication. Best, Kezhu Wang On Februar

Re: Question

2021-02-12 Thread Matthias Pohl
Hi Abu Bakar Siddiqur Rahman, Have you had a look at the Flink documentation [1]? It provides step-by-step instructions on how to run a job (the Flink binaries provide example jobs under ./examples) using a local standalone cluster. This should also work on a Mac. You would just need to start the F