Hi Amir,

The Beam SparkRunner essentially translates the Beam pipeline into a Spark job, so it's not much different from an ordinary Spark job. I can personally say that I'm running it both in standalone mode (mostly for testing) and on YARN. I don't have much experience with Spark over Mesos, though.
As for running over YARN, you can simply use the "spark-submit" script supplied with the Spark installation, and the runner will pick up the necessary (Spark) configurations, such as "--master yarn".

The SparkRunner README is not up to date right now; I will patch it soon. I'm also working on some improvements and new features for the runner, so stay tuned!

Thanks,
Amit

On Wed, Sep 28, 2016 at 10:46 PM amir bahmanyari <[email protected]> wrote:
> Hi Colleagues,
> I am in the process of setting up a Spark cluster for running Beam
> SparkRunner apps.
> The objective is to collect performance metrics via benchmarking
> techniques.
> The Spark docs suggest the following clustering types.
> Which one is the most appropriate type when it comes to performance
> testing Beam SparkRunner?
> Thanks+regards
> Amir
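For example, an invocation could look something like this (the jar name, main class, and paths here are just placeholders for illustration, not from an actual setup):

```shell
# Submit a Beam pipeline to a YARN cluster via Spark's spark-submit script.
# The runner picks up the Spark configuration (e.g. --master yarn) from here.
# Jar and class names below are placeholders -- substitute your own.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class com.example.MyBeamPipeline \
  my-beam-pipeline-bundled.jar \
  --runner=SparkRunner
```

The `--runner=SparkRunner` flag is a Beam pipeline option, while the flags before the jar are consumed by spark-submit itself.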
