Hello!

> to also run on AWS
> A spark cluster on EKS seems the closest analog

There's another way of running Beam apps in AWS - https://aws.amazon.com/kinesis/data-analytics/ - which is basically "serverless" Flink. It says Kinesis, but you can run any Flink / Beam job there; you don't have to use Kinesis streams. I have used KDA in multiple projects so far, and it works OK. The FlinkRunner also seems to have more docs, as far as I can see.

Here's a pom.xml example: https://github.com/aws-samples/amazon-kinesis-data-analytics-examples/blob/master/Beam/pom.xml

Best Regards,
Pavel Solomin
Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin
<https://www.linkedin.com/in/pavelsolomin>

On Wed, 21 Jun 2023 at 16:31, Jon Molle via user <user@beam.apache.org> wrote:

> Hi,
>
> I've been looking at the Spark Portable Runner docs, specifically the Java
> ones where possible, and I'm a little confused about the organization. The
> docs seem to say that the JobService both submits the code to the linked
> Spark cluster (described by the master URL) and requires you to run a
> spark-submit command afterwards on whatever artifacts it builds.
>
> Unfortunately I'm not that familiar with Spark generally, so I'm probably
> misunderstanding more here, but the job server images either totally lack
> documentation or just repeat the Spark runner page in the main docs.
>
> For context, I'm trying to port some code that we're currently running on
> a Dataflow runner (on GCP) to also run on AWS. A Spark cluster on EKS
> (either self-managed or potentially through EMR, but likely not, based on
> what I am reading in the docs and some brief testing) seems the closest
> analog.
>
> The new Tour does the same thing, in addition to only really having
> examples for Python, plus a few more typos. I haven't found any existing
> questions like this elsewhere, so I assume that I'm just missing something
> that should be obvious.
>
> Thanks for your time.
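P.S. For illustration, here is a minimal sketch of the dependencies such a Beam-on-Flink pom.xml typically declares. The version numbers below are assumptions for illustration only - check the linked aws-samples pom for the combination that actually works with your KDA / Flink version:

```xml
<!-- Sketch only; versions are illustrative assumptions, not tested against KDA -->
<dependencies>
  <!-- Beam Java SDK core -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-core</artifactId>
    <version>2.48.0</version>
  </dependency>
  <!-- FlinkRunner; the artifact suffix must match the Flink major.minor
       version that the KDA application is configured to run -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-flink-1.15</artifactId>
    <version>2.48.0</version>
  </dependency>
  <!-- KDA runtime library, e.g. for reading application properties -->
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-kinesisanalytics-runtime</artifactId>
    <version>1.2.0</version>
  </dependency>
</dependencies>
```

You then build a fat jar (e.g. with the Maven Shade plugin, as the linked example does) and upload it to KDA; the pipeline selects the FlinkRunner via `--runner=FlinkRunner` in its pipeline options.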