FYI there is already a corresponding issue https://issues.apache.org/jira/browse/FLINK-13660
Best,
tison.

Till Rohrmann <trohrm...@apache.org> wrote on Fri, Oct 18, 2019 at 9:42 PM:

> Hi Martin,
>
> Flink's web UI based job submission is not well suited to run behind a
> load balancer at the moment. The problem is that web-based job
> submission is actually a two-phase operation: uploading the jars and then
> starting the job. Since Flink's RestServer stores the uploaded files
> locally, the web submission must be executed on the same RestServer to
> which you uploaded the files. Note, however, that CLI client job
> submission is not affected by this, since the job graph upload and
> submission happen in one request.
>
> A workaround to make the uploads accessible to all RestServers is to
> configure a DFS for the `web.upload.dir` as Ravi suggested, or to use
> Flink's CLI to submit jobs instead.
>
> A quick note about the old behaviour with the redirects: the redirects
> actually defeated the purpose of load balancers because all requests were
> redirected to a single RestServer instance. Hence, running it with or w/o
> a load balancer should not have made a big difference.
>
> Cheers,
> Till
>
> On Wed, Oct 16, 2019 at 5:58 PM Martin, Nick J [US] (IS) <
> nick.mar...@ngc.com> wrote:
>
>> Yeah, I’ll do that if I have to. I’m hoping there’s a ‘right’ way to do
>> it that’s easier. If I have to implement the ZooKeeper lookups in my load
>> balancer myself, that feels like a definite step backwards from the pre-1.5
>> days when the cluster would give 307 redirects to the current leader.
>>
>> *From:* Ravi Bhushan Ratnakar [mailto:ravibhushanratna...@gmail.com]
>> *Sent:* Tuesday, October 15, 2019 10:35 PM
>> *To:* Martin, Nick J [US] (IS) <nick.mar...@ngc.com>
>> *Cc:* user <user@flink.apache.org>
>> *Subject:* EXT :Re: Jar Uploads in High Availability (Flink 1.7.2)
>>
>> Hi,
>>
>> I was also experiencing similar behavior.
>> I adopted the following approach:
>>
>>    - Used a distributed file system (in my case AWS EFS) and set the
>>    attribute "web.upload.dir"; this way both JobManagers have the same
>>    upload location.
>>    - On the load balancer side (AWS ELB), I used a "readiness probe" based
>>    on the ZooKeeper entry for the active JobManager address. This way the
>>    ELB always points to the active JobManager, and if the active
>>    JobManager changes, it automatically points to the new one. Since both
>>    use the same location via the distributed file system, the new active
>>    JobManager is able to find the same jar.
>>
>> Regards,
>>
>> Ravi
>>
>> On Wed, Oct 16, 2019 at 1:15 AM Martin, Nick J [US] (IS) <
>> nick.mar...@ngc.com> wrote:
>>
>> I’m seeing that when I upload a jar through the REST API, it looks like
>> only the JobManager that received the upload request is aware of the newly
>> uploaded jar. That worked fine for me in older versions where all clients
>> were redirected to connect to the leader, but now that each JobManager
>> accepts requests, if I send a jar upload request, it could end up on any
>> one (and only one) of the JobManagers, not necessarily the leader. Further,
>> each JobManager responds to a GET request on the /jars endpoint with its
>> own local list of jars. If I try to use one of the jar IDs from that
>> response, my next request may not go to the same JobManager (requests are
>> going through Docker and being load-balanced), and so the jar ID isn’t
>> found on the JobManager handling that request.
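
For anyone who has to keep using REST-based submission, the two-phase flow described in this thread (upload the jar, then start the job) can be pinned to a single RestServer on the client side. The following is only a minimal sketch, not an official client: the class and function names and the host `jobmanager-1:8081` are hypothetical, and it assumes the upload response is a JSON object with `filename` and `status` fields and that the jar ID is the last path component of `filename`. It only builds the URLs and extracts the jar ID; the HTTP calls themselves are left out.

```python
import posixpath
from urllib.parse import quote


class PinnedRestTarget:
    """Keeps one RestServer base URL so that upload and run hit the same instance.

    Hypothetical helper: bypass the load balancer and address one JobManager.
    """

    def __init__(self, base_url: str):
        # Normalize so that joining paths never yields a double slash.
        self.base_url = base_url.rstrip("/")

    def upload_url(self) -> str:
        # Phase 1: the jar is POSTed here as multipart form data.
        return f"{self.base_url}/jars/upload"

    def run_url(self, jar_id: str) -> str:
        # Phase 2: must target the same server that stored the jar locally.
        return f"{self.base_url}/jars/{quote(jar_id, safe='')}/run"


def jar_id_from_upload_response(body: dict) -> str:
    """Extract the jar ID from an upload response (assumed shape, see above)."""
    if body.get("status") != "success":
        raise ValueError(f"upload did not succeed: {body!r}")
    # Assumption: the ID is the file name portion of the stored path.
    return posixpath.basename(body["filename"])


if __name__ == "__main__":
    target = PinnedRestTarget("http://jobmanager-1:8081/")  # hypothetical host
    fake_response = {"filename": "/tmp/flink-web-upload/1234_job.jar",
                     "status": "success"}  # illustrative payload, not real output
    jar_id = jar_id_from_upload_response(fake_response)
    print(target.upload_url())
    print(target.run_url(jar_id))
```

With a shared `web.upload.dir` on a DFS, as Ravi suggests, this pinning should no longer be needed, because every RestServer would see the same uploaded files.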