Re: flink cluster startup time

2022-03-30 Thread Gyula Fóra
Hi Frank!

Thank you for your interest.

As the others said, the flink-kubernetes-operator will give you quicker
job/cluster startup times, together with full support for application
mode.

Production readiness is always relative. If I had to build a new production
use case, I would not hesitate to use the flink-kubernetes-operator, simply
because its internal architecture is much simpler and it uses the Flink
Java client APIs to interact with the cluster in a Flink-native manner. As
a Java developer, I also find it much easier to debug and fix any issues
that might come up.

One important thing to note here is that the flink-kubernetes-operator is
much less intrusive on the running cluster than the Spotify operator. It
relies internally on Flink's native Kubernetes integration (including HA),
and once the job is submitted it lets Flink take care of the rest.

I hope you will have some time to test our preview release and share some
feedback!

Cheers,
Gyula



Re: flink cluster startup time

2022-03-30 Thread Yang Wang
@Gyula Fóra is trying to prepare the preview release (0.1) of
flink-kubernetes-operator. It is now fully functional for application mode.
You could give it a try and share more feedback with the community.

Release 1.0 aims to be production-ready, and we are still missing some
important pieces (e.g. FlinkSessionJob, SQL jobs, observability improvements).

Best,
Yang



Re: flink cluster startup time

2022-03-30 Thread Frank Dekervel
Hello David,

Thanks for the information! So the two main takeaways from your email are:

   - Move to something supporting application mode. Is
   https://github.com/apache/flink-kubernetes-operator already mature
   enough for production deployments?
   - Wait for Flink 1.15.

Thanks!
Frank



-- 
Frank Dekervel
+32 473 94 34 21
www.kapernikov.com


Re: flink cluster startup time

2022-03-28 Thread David Morávek
Hi Frank,

I'm not really familiar with the internal workings of Spotify's
operator, but here are a few general notes:

- You only need the JM process for the REST API to become available (TMs
can join in asynchronously). I'd personally aim for < 1 min for this step;
if it takes longer, it could signal a problem with your infrastructure
(e.g. images taking a long time to pull, an incorrect liveness/readiness
probe setup, not enough resources).
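To make that "< 1 min" target measurable, you can poll the JobManager's REST API (Flink serves `/overview` once the JM is up) until it answers. Below is a minimal sketch; the URL is a placeholder for your deployment, and the probe is injectable so the waiting logic can be reused or tested without a live cluster:

```python
import time
import urllib.error
import urllib.request


def rest_is_up(url: str) -> bool:
    """Probe the JobManager REST API; /overview answers once the JM is ready."""
    try:
        with urllib.request.urlopen(url + "/overview", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_jobmanager(url: str, timeout_s: float = 60.0,
                        interval_s: float = 2.0, probe=rest_is_up) -> float:
    """Poll until the REST API answers and return the elapsed seconds.

    Raises TimeoutError if the JM is not reachable within timeout_s, which
    (per the note above) may indicate an infrastructure problem.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        if probe(url):
            return time.monotonic() - start
        time.sleep(interval_s)
    raise TimeoutError(f"JobManager REST API at {url} not up after {timeout_s}s")
```

Logging the returned duration on every deployment gives you a baseline, so a regression in image pulls or probe configuration shows up as a number rather than a feeling.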

The job is packaged as a fat jar, but it is already baked in the docker
> images we use (so technically there would be no need to "submit" it from a
> separate pod).
>

That's where application mode comes in. Please note that this might also
be one of the reasons for the previous steps taking too long (as all pods
are pulling an image with your fat jar that might not be cached).

Then the application needs to start up and load its state from the latest
> savepoint, which again takes a couple of minutes
>

This really depends on the state size, the state backend (e.g. a RocksDB
restore might take longer) and the object store's throughput / rate
limits. The native-savepoint feature that will come out with 1.15 might
help shave off some time here, as there is no conversion into the state
backend's structures.
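A rough way to sanity-check this step: restore time is bounded below by state size divided by effective download throughput, times any backend conversion overhead. A back-of-envelope sketch (the numbers in the example are made-up assumptions, not measurements of any real cluster):

```python
def estimated_restore_seconds(state_bytes: float,
                              download_mib_per_s: float,
                              backend_overhead_factor: float = 1.0) -> float:
    """Lower-bound estimate of savepoint restore time.

    download time from the object store, scaled by a backend-specific
    factor (> 1.0 for restores that must rebuild internal structures,
    e.g. converting a savepoint into the state backend's format).
    """
    download_s = state_bytes / (download_mib_per_s * 1024 * 1024)
    return download_s * backend_overhead_factor


# Example: 50 GiB of state at 200 MiB/s with a 2x conversion overhead
# -> 512 s, i.e. roughly 8.5 minutes just to load state.
eta = estimated_restore_seconds(50 * 1024**3, 200, backend_overhead_factor=2.0)
```

If the observed restore time is far above such an estimate, the bottleneck is likely rate limiting or conversion rather than raw state size.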

Best,
D.



flink cluster startup time

2022-03-25 Thread Frank Dekervel

Hello,

We run flink using the spotify flink Kubernetes operator (job cluster 
mode). Everything works fine, including upgrades and crash recovery. We 
do not run the job manager in HA mode.


One of the problems we have is that upon upgrades (or during testing), 
starting the Flink cluster takes a very long time:


 * First the operator needs to create the cluster (JM+TM) and wait for
   it to respond to API requests. This already takes a couple of minutes.
 * Then the operator creates a job-submitter pod that submits the job
   to the cluster. The job is packaged as a fat jar, but it is already
   baked into the Docker images we use (so technically there would be no
   need to "submit" it from a separate pod). The submission itself goes
   rather fast, though (the time between the job submitter seeing that
   the cluster is online and the "hello" log from the main program is
   < 1 min).
 * Then the application needs to start up and load its state from the
   latest savepoint, which again takes a couple of minutes.

All steps take quite some time, and we are looking to reduce the startup 
time to allow for easier testing but also less downtime during upgrades. 
So I have some questions:


 * I wonder if the situation is the same for all Kubernetes operators.
   I really need some kind of operator, because otherwise I have to set
   which savepoint to load from myself at every startup.
 * What cluster startup time is considered acceptable / best practice?
 * If there are other tricks to reduce startup time, I would be very
   interested in knowing them :-)
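For reference, the manual bookkeeping an operator automates here, picking the most recent savepoint to resume from, can be sketched as below. This assumes savepoints land in one directory using Flink's usual `savepoint-*` naming; on S3/GCS you would list objects instead of a local path:

```python
from pathlib import Path
from typing import Optional


def latest_savepoint(savepoint_root: str) -> Optional[str]:
    """Return the most recently modified savepoint directory under
    savepoint_root, or None when no savepoint exists yet (e.g. on the
    very first deployment)."""
    root = Path(savepoint_root)
    candidates = [p for p in root.glob("savepoint-*") if p.is_dir()]
    if not candidates:
        return None
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return str(newest)
```

The returned path would then be passed to Flink when (re)starting the job (e.g. via `flink run -s <path>`); an operator does exactly this selection and hand-off for you on every upgrade.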

There is also a discussion ongoing about running Flink on spot nodes. I 
guess the startup time is relevant there too.


Thanks already
Frank