Sure... Thanks, Amit.
So basically: Standalone for testing & YARN for production?
Yes, the README for SparkRunner is way outdated. The FlinkRunner version
is very informative.
While the README is in progress, could you give me some helpful details
so I can do the perf testing in the right context, please?
Have a great day,
Amir
From: Amit Sela <[email protected]>
To: amir bahmanyari <[email protected]>; "[email protected]"
<[email protected]>
Sent: Wednesday, September 28, 2016 1:13 PM
Subject: Re: Appropriate Spark Cluster Mode for running Beam SparkRunner apps
Hi Amir,
The Beam SparkRunner basically translates the Beam pipeline into a Spark job,
so it's not much different than a common Spark job. I can personally say that
I'm running both in Standalone (mostly testing) and YARN. I don't have much
experience with Spark over Mesos in general though.
As for running over YARN, you can simply use the "spark-submit" script supplied
with the Spark installation, and the runner will pick up the necessary (Spark)
configurations, such as "--master yarn".
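As a rough illustration of the spark-submit approach described above, here is a minimal Python sketch that only assembles the command line for a YARN run and for a Standalone run. The flags `--master` and `--class` are standard spark-submit options and `--runner=SparkRunner` is the usual Beam pipeline option, but the main class, the jar name, and the Standalone master host are hypothetical placeholders, not values from this thread.

```python
import shlex


def build_spark_submit(master, main_class, app_jar, app_args=()):
    """Return the argv list for a spark-submit invocation.

    master     -- Spark master, e.g. "yarn" or "spark://host:7077"
    main_class -- fully qualified main class of the pipeline
    app_jar    -- path to the bundled application jar
    app_args   -- arguments passed through to the pipeline itself
    """
    return [
        "spark-submit",
        "--master", master,
        "--class", main_class,
        app_jar,
        *app_args,
    ]


# Hypothetical placeholders: substitute your own class, jar, and host.
MAIN_CLASS = "com.example.MyBeamPipeline"
APP_JAR = "my-beam-pipeline-bundled.jar"
BEAM_ARGS = ["--runner=SparkRunner"]

# YARN: the master is simply "yarn"; cluster details come from the Hadoop config.
yarn_cmd = build_spark_submit("yarn", MAIN_CLASS, APP_JAR, BEAM_ARGS)

# Standalone: the master is a spark:// URL (7077 is the default master port).
standalone_cmd = build_spark_submit(
    "spark://master-host:7077", MAIN_CLASS, APP_JAR, BEAM_ARGS
)

# Print shell-ready commands; nothing is executed here.
print(shlex.join(yarn_cmd))
print(shlex.join(standalone_cmd))
```

Only the `--master` value differs between the two, so the same pipeline jar can be benchmarked on both cluster modes without code changes.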
The SparkRunner README is not up to date right now, and I will patch it up
soon. I'm also working on some improvements and new features for the runner,
so stay tuned!
Thanks,
Amit
On Wed, Sep 28, 2016 at 10:46 PM amir bahmanyari <[email protected]> wrote:
Hi Colleagues,
I am in the process of setting up a Spark cluster for running Beam
SparkRunner apps. The objective is to collect performance metrics via
benchmarking techniques. The Spark docs suggest the following cluster
types. Which one is the most appropriate type when it comes to performance
testing Beam SparkRunner?
Thanks + regards,
Amir