Re: Unable to read a (text)file in FileProcessing.PROCESS_CONTINUOS mode

2021-09-06 Thread David Morávek
Hi Samir, I'd second Roman here. Can you please verify that the directory you're trying to read from is accessible in all of your docker containers (TMs / JMs)? Ideally if you could list the directory inside the docker running container and post the output in this thread. Also please check that t

Re: When using the batch api, the sink task is always in the created state.

2021-09-06 Thread Caizhi Weng
My previous mail intends to answer what is needed for all subtasks in a batch job to run simultaneously. To just run a batch job the number of task slots can be as small as 1. In this case each parallelism of each subtask will run one by one. Also the scheduling of the subtasks depends on the shuf

Re: When using the batch api, the sink task is always in the created state.

2021-09-06 Thread Caizhi Weng
Hi! If you mean batch SQL then you'll need to prepare enough task slots for all subtasks. The number of task slots needed is the sum of parallelism of all subtasks as there is no slot reusing in batch jobs. lec ssmi 于2021年9月7日周二 下午2:13写道: > And My flink version is 1.11.0 > > lec ssmi 于2021年9月7

What is the event time of an element produced in a timer?

2021-09-06 Thread Marco Villalobos
If an event time timer is registered to fire exactly every 15 minutes, starting from exactly at the top of the hour (exactly 00:00, 00:15, 00:30, 00:45 for example), and within that timer it produces an element in the stream, what event time will that element have, and what window will it belong to

Re: When using the batch api, the sink task is always in the created state.

2021-09-06 Thread lec ssmi
And My flink version is 1.11.0 lec ssmi 于2021年9月7日周二 下午2:11写道: > Hi: >I'm not familar with batch api .And I write a program just like > "insert into tab_a select * from tab_b". >From the picture, there are only two tasks, one is the source task > which is in RUNNING state. And the oth

When using the batch api, the sink task is always in the created state.

2021-09-06 Thread lec ssmi
Hi: I'm not familar with batch api .And I write a program just like "insert into tab_a select * from tab_b". From the picture, there are only two tasks, one is the source task which is in RUNNING state. And the other one is sink task which is always in CREATE state. According to logs, I

Re: Broadcast data to all keyed streams

2021-09-06 Thread Caizhi Weng
Hi! Your RefDataPriceJoiner should implement KeyedBroadcastProcessFunction instead of KeyedCoProcessFunction. See the Java docs of DataStream#connect. What's your Flink version by the way? James Sandys-Lumsdaine 于2021年9月7日周二 上午12:02写道: > Hello, > > I have a Flink workflow which is partitioned o

Re: checkpoints/.../shared cleanup

2021-09-06 Thread Roman Khachatryan
I tried to reproduce the issue and I see that the folder grows (because of the underlying FS) but the files under shared/ are removed. With large state, it takes quite some time though. Do you see any errors/warnings in the logs while stopping the job? Could you please share: - the commands or API

Re: Flink jobs stuck in "RESTARTING" State, Jobmanager thrashing on "Could not fulfill resource requirements"

2021-09-06 Thread Jordan Hurwich
Thanks for the help Arvid. Going to make these changes now and expect this will address the issue we've had. On Mon, Sep 6, 2021 at 12:03 PM Arvid Heise wrote: > Sorry for not immediately writing that: yes you should just throw an > exception. Flink will then try to close everything and restarts

Re: Flink jobs stuck in "RESTARTING" State, Jobmanager thrashing on "Could not fulfill resource requirements"

2021-09-06 Thread Arvid Heise
Sorry for not immediately writing that: yes you should just throw an exception. Flink will then try to close everything and restarts the job if the job is configured for restarts. [1] [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/execution/task_failure_recovery/ On Mon

Re: Flink jobs stuck in "RESTARTING" State, Jobmanager thrashing on "Could not fulfill resource requirements"

2021-09-06 Thread Jordan Hurwich
Thanks very much for the quick response, Arvid. I'm removing our calls to Runtime.getRuntime().exit(1) now. If something goes wrong when setting up the task, what's the preferred way to graceful shutdown so the Taskmanager can restart the job successfully? Should I just allow an Exception to propa

Re: Dynamic Cluster/Index Routing for Elasticsearch Sink

2021-09-06 Thread Arvid Heise
Hi Rion, Thank you very much for the contribution. We are currently still busy with the 1.14 release but can pick up the review after that. David, if you have capacity, we would be grateful for any help. In general, we plan to port ES from SinkFunction to the new Sink interface in 1.15 but I don'

Re: Flink jobs stuck in "RESTARTING" State, Jobmanager thrashing on "Could not fulfill resource requirements"

2021-09-06 Thread Arvid Heise
Hi Jordan, Please never call Runtime.getRuntime().exit(1). This will rob Flink of any way of graceful shutdown and recovery. Since you mentioned EC2 (and not a managed service such as ECS or K8S), I'm also assuming that the task manager process is never restarted and that's why you always end up w

Re: Seeing Exception ClassNotFoundException: __wrapper while running in Kubernetes Cluster

2021-09-06 Thread Arvid Heise
This looks like you hold a reference to some outer class or non-serializable class. Make sure that your user function is minimal (e.g. static or top-level class) and all fields really need to be serialized (that excludes all caches). On Thu, Sep 2, 2021 at 3:13 PM Nicolaus Weidner < nicolaus.weid.

Re: Flink Performance Issue

2021-09-06 Thread Arvid Heise
Hi Mohammed, something is definitely wrong in your setup. You can safely say that you can process 1k records per second and core with Kafka and light processing, so you shouldn't even need to go distributed in your case. Do you perform any heavy computation? What is your flatMap doing? Are you em

Re: De/Serialization API to tear-down user code

2021-09-06 Thread Arvid Heise
I think it's a valid request to have a tear-down method whenever we have a setup. I guess in this situation we should simply use 2 different signatures for close (e.g. different names or parameters). Would you mind filing a feature request ticket? I can also take care of that otherwise. On Thu, S

Re: High availability - leader election not working?

2021-09-06 Thread Arvid Heise
Hi Jonas, I immediately see some network issues 2021-08-31 14:45:54,113 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp:// flink@100.116.0.10:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp:/

Re: [ANNOUNCE] Apache Flink Stateful Functions 3.1.0 released

2021-09-06 Thread Arvid Heise
Congratulations! New features look awesome. On Wed, Sep 1, 2021 at 9:10 AM Till Rohrmann wrote: > Great news! Thanks a lot for all your work on the new release :-) > > Cheers, > Till > > On Wed, Sep 1, 2021 at 9:07 AM Johannes Moser wrote: > >> Congratulations, great job. 🎉 >> >> On 31.08.2021,

Broadcast data to all keyed streams

2021-09-06 Thread James Sandys-Lumsdaine
Hello, I have a Flink workflow which is partitioned on a key common to all the stream objects and a key that is best suited to the high volume of data I am processing. I now want to add in a new stream of prices that I want to make available to all partitioned streams - however, this new stream

Re: [ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread JING ZHANG
Thanks Leonard for driving this. The information is helpful. Best, JING ZHANG Jark Wu 于2021年9月6日周一 下午4:59写道: > Thanks Leonard, > > I have seen many users complaining that the Flink mailing list doesn't > work (they were using Nabble). > I think this information would be very helpful. > > Best,

Re: Job manager crash

2021-09-06 Thread houssem
hello, I have three jobs running on my kubernetes cluster and each job has his own cluster id. On 2021/09/06 03:28:10, Yangze Guo wrote: > Hi, > > The root cause is not "java.lang.NoClassDefFound". The job has been > running but could not edit the config map > "myJob-0

Re: Triggers for windowed aggregations in Table API

2021-09-06 Thread Guowei Ma
Hi, John If you want to trigger the aggregation calculation of the window “earlier” I think you might need to define your customized window trigger in DataStream yourself. I think you could find more detailed information from the doc [1] If you think that the customize the trigger is a little har

Re: [ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread Jark Wu
Thanks Leonard, I have seen many users complaining that the Flink mailing list doesn't work (they were using Nabble). I think this information would be very helpful. Best, Jark On Mon, 6 Sept 2021 at 16:39, Leonard Xu wrote: > Hi, all > > The mailing list archive service Nabble Archive was bro

[ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread Leonard Xu
Hi, all The mailing list archive service Nabble Archive was broken at the end of June, the Flink community has migrated the mailing lists archives[1] to Apache Archive service by commit[2], you can refer [3] to know more mailing lists archives of Flink. Apache Archive service is maintained by

Re: Dynamic Cluster/Index Routing for Elasticsearch Sink

2021-09-06 Thread David Morávek
Hi Rion, thanks for opening the PR. I'll take a look at it this week. I'd also pull Arvid into this topic to see whether he has any comments. Best, D. On Sat, Sep 4, 2021 at 9:10 PM Rion Williams wrote: > Hi again David et al, > > I managed to push an initial pull request for the implementatio