Scaling Flink for batch jobs

2021-08-13 Thread Gorjan Todorovski
Hi! I want to implement a Flink cluster as a native Kubernetes session cluster, with the intention of executing Apache Beam jobs that will process only batch data, but I am not sure I understand how I would scale the cluster if I need to process large datasets. My understanding is that to be able to

Running Beam on a native Kubernetes Flink cluster

2021-08-14 Thread Gorjan Todorovski
Hi! I need help implementing a native Kubernetes Flink cluster that needs to run batch jobs (run by TensorFlow Extended), but I am not sure I am configuring it right as I have issues running jobs on more than one task manager, while jobs run fine if there is only one TM. I use the following param

Re: Running Beam on a native Kubernetes Flink cluster

2021-08-15 Thread Gorjan Todorovski
> worker_pool, which should be just fine. > > Best, > > Jan > > [1] > https://github.com/apache/beam/blob/af59192ac95a564ca3d29b6385a6d1448f5a407d/sdks/python/apache_beam/runners/worker/worker_pool_main.py#L81 > On 8/14/21 4:37 PM, Gorjan Todorovski wrote: > > Hi!

Re: Scaling Flink for batch jobs

2021-08-16 Thread Gorjan Todorovski
longer time to process the data? >> > It depends on a lot of aspects, such as the type of source you are using, > the type of operators you are running, etc. Ideally we hope it will just > take longer but for some specific operators or connectors it might fail. > This is where user

Flink task stuck - MapPartition WAITING on java.util.concurrent.CompletableFuture$Signaller

2022-05-31 Thread Gorjan Todorovski
Hi, I am running a TensorFlow Extended (TFX) pipeline which uses Apache Beam for data processing, which in turn uses the Flink Runner (basically a batch job on a Flink Session Cluster on Kubernetes), version 1.13.6, but the job (for gathering stats) gets stuck. There is nothing significant in the Job

Re: Flink task stuck - MapPartition WAITING on java.util.concurrent.CompletableFuture$Signaller

2022-06-01 Thread Gorjan Todorovski
id=3-2', '--logging_endpoint=localhost:35391', '--artifact_endpoint=localhost:46571', '--provision_endpoint=localhost:44073', '--control_endpoint=localhost:44133'] Starting work... On Wed, Jun 1, 2022 at 11:21 AM Jan Lukavský wrote: > Hi Gorjan, >

Re: Flink task stuck - MapPartition WAITING on java.util.concurrent.CompletableFuture$Signaller

2022-06-03 Thread Gorjan Todorovski
progressing? Does that happen at any specific instant (e.g. > end of window or end of window + allowed lateness)? > On 6/1/22 15:43, Gorjan Todorovski wrote: > > Hi Jan, > > I have not checked the harness log. I have now checked it (Apache Beam > worker log) and found this, but current