Re: REST API in an HA setup - must the leading JM be called?

2021-08-18 Thread Juha Mynttinen
Thank you, answers my questions.

--
Regards,
Juha



Re: REST API in an HA setup - must the leading JM be called?

2021-08-18 Thread Chesnay Schepler

You've pretty much answered the question yourself. *thumbs up*

For the vast majority of cases you can call any JobManager.
The exceptions are jar operations (because uploaded jars are persisted in
the JM-local filesystem, and other JMs don't know about them) and
triggering savepoints (because metadata for ongoing savepoint operations,
i.e. the information returned when querying the savepoint operation
status, is also kept locally in the JM).


This does indeed imply that on JM failover all this information is lost.

There are ideas to solve this, but no concrete timeline. See
https://issues.apache.org/jira/browse/FLINK-18312
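
The rule above (any JobManager will do, except for jar operations and savepoint trigger/status requests) could be expressed as a small routing check for a client or proxy sitting in front of several JobManagers. This is only a sketch: the function name `needs_same_jobmanager` is hypothetical, and the path patterns assume the usual REST layout (`/jars/...` and `/jobs/:jobid/savepoints/...`, optionally behind a version prefix such as `/v1`).

```python
import re

# Paths whose state lives on a single JobManager, per the reply above:
# jar operations and savepoint trigger/status requests.
# Assumption: these patterns mirror the REST paths of the Flink version in use.
_STICKY_PATTERNS = [
    re.compile(r"^/jars(/.*)?$"),
    re.compile(r"^/jobs/[^/]+/savepoints(/.*)?$"),
]


def needs_same_jobmanager(path: str) -> bool:
    """Return True if requests to `path` should be pinned to one JobManager."""
    # Drop any query string and an optional API version prefix such as /v1.
    path = path.split("?", 1)[0]
    path = re.sub(r"^/v\d+(?=/)", "", path)
    return any(pattern.match(path) for pattern in _STICKY_PATTERNS)
```

Requests for which this returns False can go through a load balancer; the others would have to be sent to the JobManager that handled the earlier upload or savepoint trigger.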







REST API in an HA setup - must the leading JM be called?

2021-08-18 Thread Juha Mynttinen
I have questions related to the REST API in the case of ZooKeeper HA and a
standalone cluster. But I think the questions apply to other setups too,
such as YARN.

Let's assume a standalone cluster with multiple JobManagers. The
JobManagers elect the leader among themselves and register it in
ZooKeeper. When using the Flink command line, AFAIK the code will go to
ZooKeeper to find the host and port of the leading JobManager and send HTTP
requests there.

My question is: when accessing the REST API directly (e.g. curl) does one
need to call the leading JobManager or will any up and running JobManager
do? And if the leader needs to be called, why is it so?

Behind the scenes the REST API will connect to the leading "JobManager"
over RPC, making it irrelevant which JobManager receives the HTTP request.

By experimenting, I found the Web UI works fine if all the JobManagers are
behind a load balancer, so that both leading and standby JobManagers are
called. The only issue I found was that when a jar is submitted
(/jars/upload), it is stored on the local disk of the JobManager that
happens to handle that request. As a consequence, creating a job from that
jar only succeeds if the HTTP request hits the JobManager that has the file.
There might be a "hack" to overcome this limitation: set web.upload.dir to
be in S3 / GCS or elsewhere accessible by all JobManagers. I didn't try
this. Or, in the case of uploading jars and creating jobs, ensure the same
JobManager is called (bypassing the load balancer).
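
The second workaround (send the upload and the job creation to the same JobManager) might look roughly like the sketch below. It assumes that the upload endpoint accepts a multipart form field named `jarfile`, that its JSON response carries a `filename` field whose last path segment is the jar id, and that `POST /jars/<jarid>/run` starts the job. `base_url` and the helper names are hypothetical; `base_url` must point at one specific JobManager, not at the load balancer.

```python
import json
import posixpath
import urllib.parse
import urllib.request
import uuid


def jar_id_from_upload(response_body: bytes) -> str:
    """Extract the jar id from a /jars/upload response.

    Assumes the JSON has a "filename" field holding the path under which the
    JobManager stored the jar, and that the last path segment is the jar id.
    """
    filename = json.loads(response_body)["filename"]
    return posixpath.basename(filename)


def upload_and_run(base_url: str, jar_path: str) -> bytes:
    """Upload a jar and start a job, sending both requests to `base_url`."""
    boundary = uuid.uuid4().hex
    with open(jar_path, "rb") as f:
        jar_bytes = f.read()
    # Hand-built multipart body with a single "jarfile" part.
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="jarfile"; filename="job.jar"\r\n'
        "Content-Type: application/x-java-archive\r\n\r\n"
    ).encode()
    body = head + jar_bytes + f"\r\n--{boundary}--\r\n".encode()
    upload = urllib.request.Request(
        f"{base_url}/jars/upload",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
    with urllib.request.urlopen(upload) as resp:
        jar_id = jar_id_from_upload(resp.read())
    # Same base_url again, so the request hits the JobManager holding the jar.
    run = urllib.request.Request(
        f"{base_url}/jars/{urllib.parse.quote(jar_id)}/run",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(run) as resp:
        return resp.read()
```

This only avoids the load balancer problem; it does not help if the chosen JobManager goes away between the two requests.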

But I wonder if there's any other reason why the leading JM should be called.

A follow-up question arises. If the jars are stored only on the leading
JobManager, doesn't that mean that if the leader changes, the new leader is
not aware of the jars uploaded to the old leader? From the REST API's
perspective this means that even in the JobManager HA setup, and when
always calling the leader, a simple "upload a jar and deploy a job" cycle
is not guaranteed to work if the leader happens to change between the
requests. Did I miss something?
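
One way a client might cope with a leader change between the two requests is to treat "upload, then deploy" as a single unit and repeat the whole unit on failure. This is an assumption about client-side handling, not something Flink provides; `run_cycle_with_retry` and its `upload` / `run` callables are hypothetical placeholders for the two HTTP calls.

```python
from typing import Callable


def run_cycle_with_retry(
    upload: Callable[[], str],
    run: Callable[[str], str],
    attempts: int = 3,
) -> str:
    """Repeat the whole upload-and-run cycle until it succeeds.

    `upload` returns a jar id; `run` takes that id and returns a job id.
    A failed `run` triggers a fresh upload, because the JobManager now
    answering may not have the previously uploaded jar.
    """
    last_error = None
    for _ in range(attempts):
        try:
            jar_id = upload()
            return run(jar_id)
        except OSError as err:  # urllib's URLError/HTTPError derive from OSError
            last_error = err
    raise RuntimeError(f"upload-and-run failed after {attempts} attempts") from last_error
```

Note that a retry can submit the job twice if the first run request actually succeeded but its response was lost, so a caller would want to check the running jobs before retrying.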

--
Regards,
Juha