I'm looking into integrating AWS Flow Framework
<https://aws.amazon.com/swf/details/flow/> with Flink. One of the
requirements is to provide a service API to initiate workflows and query
their state. I envision that this API could be integrated as a specialized
source that takes web requests and a sink that returns the replies to the
calling web front end.
The ideal solution is to run JobManager inside a web server and pin a
single *WebFrontEndSource and WebFrontEndSync *to each web server instance.
This way Flink handles all interprocess communication and the webserver cod
calls into the source and sink directly.
>From the Flink side the missing feature is support for different JobManager
roles and allowing to specify that given operator has to be placed only
into a JobManager of a specific role. Such feature would be useful in other
scenarios like having *high memory* *slot *role vs *high CPU slot
*role*. *Another
requirement would be support for explicit placement of subtasks of
operators that have the same parallelism into the same slot ensuring that
their indexes are the same.
So in my case I would define "WebService" JobManager role with a single
slot per instance and put pair of *WebFrontEndSource *and
*WebFrontEndSync *into
slot on each webserver.

If such feature is not feasible I would like to find a way to lookup
addresses of hosts that run a specific sink operator. Then each frontend
would query Flink for hosts that run *WebFrontEndSources *and forward
requests to them.

Is there a better way to support my use case?

Thanks,

Maxim.

Reply via email to