There are known performance issues with Beam on Spark that are being worked
on, e.g. https://issues.apache.org/jira/browse/BEAM-5036 . It's possible
you're hitting something different, but it would be worth investigating. See
also
https://lists.apache.org/list.html?dev@beam.apache.org:lte=1M:Performance%20of%20write

On Tue, Sep 18, 2018 at 8:39 AM devinduan(段丁瑞) <devind...@tencent.com>
wrote:

> Hi,
>     I'm testing Beam on Spark.
>     The Spark example WordCount processes a 1 GB data file in 1 minute.
>     However, the Beam example WordCount takes 30 minutes to process the
> same file.
>     My Spark parameters are: --deploy-mode client --executor-memory 1g
> --num-executors 1 --driver-memory 1g
>     My Spark version is 2.3.1 and my Beam version is 2.5.
>     Is there any way to optimize this?
> Thank you.
>
>
>
