[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn
    [ https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024769#comment-17024769 ]

Kyle Weaver edited comment on BEAM-8970 at 1/27/20 11:54 PM:
-------------------------------------------------------------

Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark REST API along with YARN, because normally the Spark REST API is started along with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

[--runner=SparkRunner, --output_executable_path=~/path/to/output.jar]

(Without using the spark_submit_uber_jar option.)

Also, note that this will require YARN nodes to have installed or otherwise be able to access Beam worker code. [~angoenka] might know more.


was (Author: ibzib):
Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark REST API along with YARN, because normally the Spark REST API is started along with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

{{ ['--runner=SparkRunner', --output_executable_path "$OUTPUT_JAR"] }}

(Without using the spark_submit_uber_jar option.)

Also, note that this will require YARN nodes to have installed or otherwise be able to access Beam worker code. [~angoenka] might know more.

> Spark portable runner supports Yarn
> -----------------------------------
>
>                 Key: BEAM-8970
>                 URL: https://issues.apache.org/jira/browse/BEAM-8970
>             Project: Beam
>          Issue Type: Wish
>          Components: runner-spark
>            Reporter: Kyle Weaver
>            Assignee: Kyle Weaver
>            Priority: Major
>              Labels: portability-spark
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn
    [ https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024769#comment-17024769 ]

Kyle Weaver edited comment on BEAM-8970 at 1/27/20 11:55 PM:
-------------------------------------------------------------

Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark REST API with YARN, because the Spark REST API is normally started along with the Spark master.

You should be able to spark-submit portable jars. To create portable jars, use these options (without the spark_submit_uber_jar option):

['--runner=SparkRunner', '--output_executable_path=~/path/to/output.jar']

Also, note that this requires the YARN nodes to have the Beam worker code installed, or otherwise be able to access it. [~angoenka] might know more.


was (Author: ibzib):
Hi Enis, thanks for the feedback. I'm not sure it's possible to use the Spark REST API along with YARN, because normally the Spark REST API is started along with the Spark master.

You should be able to spark-submit portable jars. To create portable jars:

[--runner=SparkRunner, --output_executable_path=~/path/to/output.jar]

(Without using the spark_submit_uber_jar option.)

Also, note that this will require YARN nodes to have installed or otherwise be able to access Beam worker code. [~angoenka] might know more.
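
The two steps described in the comment above (write a portable jar, then hand it to spark-submit) can be sketched roughly as follows. This is only an illustration: the helper names `portable_jar_options` and `spark_submit_command` are made up here, and the sketch assumes the "~" in the output path is not expanded for you when the option is given as a Python string. The thread does not say whether extra spark-submit flags (a main class, a deploy mode) are needed, so none are added.

```python
import os
import shlex


def portable_jar_options(output_jar):
    """Pipeline options that write a portable jar instead of submitting a job.

    Hypothetical helper. The two flags are the ones quoted in the comment.
    """
    # A "~" inside a Python string is never seen by a shell, so expand it here
    # (assumption: nothing downstream expands it).
    jar_path = os.path.abspath(os.path.expanduser(output_jar))
    return [
        "--runner=SparkRunner",
        "--output_executable_path=" + jar_path,
    ]


def spark_submit_command(jar_path, master="yarn"):
    """Argument list for spark-submit; the caller decides whether to run it."""
    return ["spark-submit", "--master", master, jar_path]


options = portable_jar_options("~/path/to/output.jar")
jar = options[1].split("=", 1)[1]

# Pass `options` to the pipeline first, then submit the jar it produced.
print(options)
print(shlex.join(spark_submit_command(jar)))
```

The command is only printed, not executed, so the sketch can be checked on a machine without Spark installed.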
[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn
    [ https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023657#comment-17023657 ]

Enis Nazif edited comment on BEAM-8970 at 1/25/20 10:12 PM:
------------------------------------------------------------

Looking at this issue: to run a pipeline on a YARN-backed Spark cluster, a user should be able to specify runner options such as

{code:java}
['--runner=SparkRunner', '--spark_submit_uber_jar', '--spark_rest_url=http://spark-rest-api:6066', '--spark_master_url=yarn']
{code}

As it stands, 'spark_master_url' isn't being passed into the request created in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]. It seems that passing it is necessary to support YARN.

Failing this, an alternative may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly spark-submit the portable jars that are created.


was (Author: enazif):
looking at this issue, to run a pipeline on YARN backed sparked, a user should be able to specify runner options of

{code:java}
['--runner=SparkRunner', '--spark_submit_uber_jar', '--spark_rest_url=http://spark-rest-api:6066', '--spark_master_url='yarn']
{code}

As it stands, the 'spark_master_url' isn't being passed into the request created in in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly spark-submit the portable jars that are created.
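
To make the point about the missing master URL concrete, here is a rough sketch of what forwarding it into a REST submission body could look like. It is not the code in spark_uber_jar_job_server.py: `build_submission_request` is a hypothetical helper, the jar location, main class, application name and Spark version are placeholders, and the field names are an assumption based on the JSON format used by Spark's standalone REST submission endpoint. Whether that endpoint would honor a `yarn` master at all is a separate question.

```python
import json


def build_submission_request(jar_url, main_class, master_url, spark_version):
    """Hypothetical helper: JSON body for a REST job submission.

    The layout is an assumption, not a copy of what Beam sends.
    """
    return {
        "action": "CreateSubmissionRequest",
        "appResource": jar_url,
        "appArgs": [],
        "mainClass": main_class,
        "clientSparkVersion": spark_version,
        "environmentVariables": {},
        "sparkProperties": {
            "spark.jars": jar_url,
            "spark.app.name": "beam-portable-pipeline",  # placeholder name
            "spark.submit.deployMode": "cluster",
            # The value given as --spark_master_url would be forwarded here.
            "spark.master": master_url,
        },
    }


body = build_submission_request(
    jar_url="file:/path/to/output.jar",  # placeholder path
    main_class="example.Main",           # placeholder, not a real Beam class
    master_url="yarn",
    spark_version="2.4.4",               # placeholder version
)
print(json.dumps(body, indent=2))
```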
[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn
    [ https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023657#comment-17023657 ]

Enis Nazif edited comment on BEAM-8970 at 1/25/20 10:12 PM:
------------------------------------------------------------

looking at this issue, to run a pipeline on YARN backed sparked, a user should be able to specify runner options of

{code:java}
['--runner=SparkRunner', '--spark_submit_uber_jar', '--spark_rest_url=http://spark-rest-api:6066', '--spark_master_url='yarn']
{code}

As it stands, the 'spark_master_url' isn't being passed into the request created in in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly spark-submit the portable jars that are created.


was (Author: enazif):
looking at this issue, to run a pipeline on YARN backed sparked, a user should be able to specify runner options of

{code:java}
['--runner=SparkRunner', '--spark_submit_uber_jar', '--spark_rest_url=http://spark-rest-api:6066', '--spark_master_url='yarn']
{code}

As it stands, the 'spark_master_url' isn't being passed into the request created in in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly {noformat} spark-submit{noformat} spark-submit` the portable jars that are created.
[jira] [Comment Edited] (BEAM-8970) Spark portable runner supports Yarn
    [ https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023657#comment-17023657 ]

Enis Nazif edited comment on BEAM-8970 at 1/25/20 10:11 PM:
------------------------------------------------------------

looking at this issue, to run a pipeline on YARN backed sparked, a user should be able to specify runner options of

{code:java}
['--runner=SparkRunner', '--spark_submit_uber_jar', '--spark_rest_url=http://spark-rest-api:6066', '--spark_master_url='yarn']
{code}

As it stands, the 'spark_master_url' isn't being passed into the request created in in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly {noformat} spark-submit{noformat} spark-submit` the portable jars that are created.


was (Author: enazif):
looking at this issue, to run a pipeline on YARN backed sparked, a user should be able to specify runner options of

{code:java}
['--runner=SparkRunner', '--spark_submit_uber_jar', '--spark_rest_url=http://spark-rest-api:6066', '--spark_master_url='yarn']
{code}

As it stands, the `spark_master_url` isn't being passed into the request created in in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

It seems that this is necessary to support YARN

Failing this, an alternative way may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly `spark-submit` the portable jars that are created.