[ 
https://issues.apache.org/jira/browse/SPARK-29465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952647#comment-16952647
 ] 

Vishwas Nalka edited comment on SPARK-29465 at 10/16/19 9:18 AM:
-----------------------------------------------------------------

I need a small clarification about restricting ports.

I was able to configure values for all port types using their respective 
properties; it was only spark.ui.port that was being overridden by the 
ApplicationMaster and set to 0. Once the Spark job was launched, I could see 
that all ports of the job (for both the driver process and the executor 
processes) were assigned as configured, except the UI port, which was bound to 
a random port. From the logs of the Spark app I could also verify that 
spark.driver.port, spark.executor.port and the other port types were bound 
within the range _(port_mentioned to port_mentioned + spark.port.maxRetries)_, 
except the UI, which started on a random port.
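The range behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Spark's actual code; `find_port` is a hypothetical helper mimicking what `spark.port.maxRetries` does, under the assumption that Spark tries `port, port+1, ..., port+maxRetries` and that a port of 0 delegates the choice to the OS:

```python
import socket

def find_port(start_port: int, max_retries: int) -> int:
    """Try start_port, start_port+1, ..., start_port+max_retries and return
    the first port that can be bound (mimics spark.port.maxRetries)."""
    if start_port == 0:
        # This is effectively what happens to the UI when the
        # ApplicationMaster forces -Dspark.ui.port=0: the range logic is
        # bypassed and the OS picks a random ephemeral port.
        with socket.socket() as s:
            s.bind(("", 0))
            return s.getsockname()[1]
    for offset in range(max_retries + 1):
        candidate = start_port + offset
        try:
            with socket.socket() as s:
                s.bind(("", candidate))
                return candidate
        except OSError:
            continue  # port in use, try the next one in the window
    raise OSError(f"No free port in [{start_port}, {start_port + max_retries}]")
```

So with spark.driver.port=9902 and spark.port.maxRetries=20, the driver can only end up on 9902–9922, whereas a forced 0 can land anywhere.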

As shared in the description, from the Spark app logs:

command:LD_LIBRARY_PATH="/usr/hdp/2.6.4.0-91/hadoop/lib/native:$LD_LIBRARY_PATH"
 JAVA_HOME/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms4096m 
-Xmx4096m -Djava.io.tmpdir=PWD/tmp '-Dspark.blockManager.port=9900' 
'-Dspark.driver.port=9902' '-Dspark.fileserver.port=9903' 
'-Dspark.broadcast.port=9904' '-Dspark.port.maxRetries=20' 
*_'-Dspark.ui.port=0'_* '-Dspark.executor.port=9905'

_You can see that the UI port is being set to 0, i.e. random port selection, 
even though *I configured it to a different value.*_

Instead of adding new support for restricting Spark's port range, I felt this 
could be achieved with the right combination of _spark.port.maxRetries_ and 
explicit values for the port properties. However, with respect to the UI port, 
the ApplicationMaster overrides it via a JVM property just before launch, 
leaving the user no way to restrict or set the UI port. I believe it is only 
the UI port that causes the issue.
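For reference, the combination described above would look something like the sketch below (the jar path and port values are placeholders, following the spark-submit form quoted in the description):

```shell
# Restrict every configurable port type to a 21-port window (base .. base+20).
# spark.ui.port is included, but as noted above, in yarn-cluster mode the
# ApplicationMaster currently overrides it to 0 before launch.
./bin/spark-submit \
  --master yarn --deploy-mode cluster \
  --conf spark.port.maxRetries=20 \
  --conf spark.driver.port=9902 \
  --conf spark.blockManager.port=9900 \
  --conf spark.ui.port=12345 \
  --class org.apache.spark.examples.SparkPi \
  examples/jars/spark-examples_2.11-2.4.4.jar 10
```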

Please share your suggestion.



> Unable to configure SPARK UI (spark.ui.port) in spark yarn cluster mode. 
> -------------------------------------------------------------------------
>
>                 Key: SPARK-29465
>                 URL: https://issues.apache.org/jira/browse/SPARK-29465
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Submit, YARN
>    Affects Versions: 3.0.0
>            Reporter: Vishwas Nalka
>            Priority: Major
>
>  I'm trying to restrict the ports used by spark app which is launched in yarn 
> cluster mode. All ports (viz. driver, executor, blockmanager) could be 
> specified using the respective properties except the ui port. The spark app 
> is launched using JAVA code and setting the property spark.ui.port in 
> sparkConf doesn't seem to help. Even setting the JVM option 
> -Dspark.ui.port="some_port" does not spawn the UI on the required port. 
> From the logs of the spark app, *_the property spark.ui.port is overridden 
> and the JVM property '-Dspark.ui.port=0' is set_* even though it is never set 
> to 0. 
> _(Run in Spark 1.6.2) From the logs ->_
> _command:LD_LIBRARY_PATH="/usr/hdp/2.6.4.0-91/hadoop/lib/native:$LD_LIBRARY_PATH"
>  {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms4096m 
> -Xmx4096m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.blockManager.port=9900' 
> '-Dspark.driver.port=9902' '-Dspark.fileserver.port=9903' 
> '-Dspark.broadcast.port=9904' '-Dspark.port.maxRetries=20' 
> '-Dspark.ui.port=0' '-Dspark.executor.port=9905'_
> _19/10/14 16:39:59 INFO Utils: Successfully started service 'SparkUI' on port 
> 35167._
> _19/10/14 16:39:59 INFO SparkUI: Started SparkUI at_ 
> [_http://10.65.170.98:35167_|http://10.65.170.98:35167/]
> I even tried using a *spark-submit command with --conf spark.ui.port*, which 
> does not spawn the UI on the required port either.
> {color:#172b4d}_(Run in Spark 2.4.4)_{color}
>  {color:#172b4d}_./bin/spark-submit --class org.apache.spark.examples.SparkPi 
> --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g 
> --executor-cores 1 --conf spark.ui.port=12345 --conf spark.driver.port=12340 
> --queue default examples/jars/spark-examples_2.11-2.4.4.jar 10_{color}
> _From the logs::_
>  _19/10/15 00:04:05 INFO ui.SparkUI: Stopped Spark web UI at 
> [http://invrh74ace005.informatica.com:46622|http://invrh74ace005.informatica.com:46622/]_
> _command:{{JAVA_HOME}}/bin/java -server -Xmx2048m 
> -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.ui.port=0'  'Dspark.driver.port=12340' 
> -Dspark.yarn.app.container.log.dir=<LOG_DIR> -XX:OnOutOfMemoryError='kill %p' 
> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url 
> spark://coarsegrainedschedu...@invrh74ace005.informatica.com:12340 
> --executor-id <executorId> --hostname <hostname> --cores 1 --app-id 
> application_1570992022035_0089 --user-class-path 
> [file:$PWD/__app__.jar1|file://%24pwd/__app__.jar1]><LOG_DIR>/stdout2><LOG_DIR>/stderr_
>  
> Looks like the ApplicationMaster overrides this and sets a JVM property 
> before launch, resulting in a random UI port even though spark.ui.port is set 
> by the user.
> In these links
>  # 
> [https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 214)
>  # 
> [https://github.com/cloudera/spark/blob/master/yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala]
>  (line 75)
> I can see that the _*run()*_ method in the above files sets the system 
> properties _*UI_PORT*_ and _*spark.ui.port*_, respectively.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
