Async code inside Flink Sink

2024-04-17 Thread Jacob Rollings
Hello, I have a use case where I need to do a cache file deletion after a successful sunk operation(writing to db). My Flink pipeline is built using Java. I am contemplating using Java completableFuture.runasync() to perform the file deletion activity. I am wondering what issues this might cause i

Re: Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Zhanghao Chen
If you have some experience before, I'd recommend setting a good parallelism and TM resource spec first, to give the autotuner a good starting point. Usually, the autoscaler can tune your jobs well within a few attempts (<=3). As for `pekko.ask.timeout`, the default value should be sufficient i

Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Maxim Senin via user
Hi. Does it make sense to specify `parallelism` for task managers or the `job`, and, similarly, to specify memory amount for the task managers, or it’s better to leave it to autoscaler and autotuner to pick the best values? How many times would the autoscaler need to restart task managers befor

Re: Understanding default firings in case of allowed lateness

2024-04-17 Thread Sachin Mittal
Hi Xuyang, So if I check the side output way then my pipeline would be something like this: final OutputTag lateOutputTag = new OutputTag("late-data"){}; SingleOutputStreamOperator reducedDataStream = dataStream .keyBy(new MyKeySelector()) .window(TumblingEventTimeWindows.of(Time.secon

Re:Understanding default firings in case of allowed lateness

2024-04-17 Thread Xuyang
Hi, Sachin. IIUC, it is in the second situation you listed, that is: [ reduceData (d1, d2, d3), reducedData(ld1, d2, d3, late d4, late d5, late d6) ]. However, because of `table.exec.emit.late-fire.delay`, it could also be such as [ reduceData (d1, d2, d3), reducedData(ld1, d2, d3, late d4, la

Understanding default firings in case of allowed lateness

2024-04-17 Thread Sachin Mittal
Hi, Suppose my pipeline is: data .keyBy(new MyKeySelector()) .window(TumblingEventTimeWindows.of(Time.seconds(60))) .allowedLateness(Time.seconds(180)) .reduce(new MyDataReducer()) So I wanted to know if the final output stream would contain reduced data at the end of the window

Re: Elasticsearch8 example

2024-04-17 Thread Hang Ruan
Hi Tauseef. I see that the support of Elasticsearch 8[1] will be released in elasticsearch-3.1.0. So there is no docs for the elasticsearch8 by now. We could learn to use it by some tests[2] before the docs is ready. Best, Hang [1] https://issues.apache.org/jira/browse/FLINK-26088 [2] https://gi

Re: Table Source from Parquet Bug

2024-04-17 Thread Hang Ruan
Hi, David. Have you added the parquet format[1] dependency in your dependencies? It seems that the class ParquetColumnarRowInputFormat cannot be found. Best, Hang [1] https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/table/formats/parquet/ Sohil Shah 于2024年4月17日周三 04:4