Philipp Dallig created ZEPPELIN-5897:
----------------------------------------

             Summary: Spark-Interpreter context change
                 Key: ZEPPELIN-5897
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5897
             Project: Zeppelin
          Issue Type: Bug
          Components: spark
            Reporter: Philipp Dallig


I have encountered some strange behaviour in the Spark interpreter. The problem occurs when several cron jobs are started in parallel.

The interpreter launch command itself looks correct:
{code:java}
[INFO] Interpreter launch command: /opt/conda/lib/python3.9/site-packages/pyspark/bin/spark-submit --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer --driver-class-path /usr/share/java/*:/tmp/local-repo/spark_8g_8g/*:/opt/zeppelin/interpreter/spark/*:::/opt/zeppelin/interpreter/zeppelin-interpreter-shaded-0.11.0-SNAPSHOT.jar:/opt/zeppelin/interpreter/spark/spark-interpreter-0.11.0-SNAPSHOT.jar --driver-java-options   -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///opt/zeppelin/conf/log4j.properties -Dlog4j.configurationFile=file:///opt/zeppelin/conf/log4j2.properties -Dzeppelin.log.file=/opt/zeppelin/logs/zeppelin-interpreter-spark_8g_8g-isolated-2G8V2J18D-2023-04-11_00-00-00--spark8g8g-isolated-2g8v2j18d-2023-04-1100-00-00-upuren.log --conf spark.driver.maxResultSize=8g --conf spark.kubernetes.executor.request.cores=0. --conf spark.network.timeout=1800 --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog --verbose --conf spark.jars.ivySettings=/opt/spark/ivysettings.xml --proxy-user ejavaheri --conf spark.master=k8s://https://kubernetes.default.svc --conf spark.driver.memory=8g --conf spark.driver.cores=2 --conf spark.app.name=spark_8g_8g --conf spark.driver.host=spark8g8g-isolated-2g8v2j18d-2023-04-1100-00-00-upuren.spark.svc --conf spark.kubernetes.memoryOverheadFactor=0.4 --conf spark.webui.yarn.useProxy=false --conf spark.blockManager.port=22322 --conf spark.driver.port=22321 --conf spark.driver.bindAddress=0.0.0.0 --conf spark.kubernetes.namespace=spark --conf spark.kubernetes.driver.request.cores=200m --conf spark.kubernetes.driver.pod.name=spark8g8g-isolated-2g8v2j18d-2023-04-1100-00-00-upuren --conf spark.executor.instances=1 --conf spark.executor.memory=8g --conf spark.executor.cores=4 --conf spark.submit.deployMode=client --conf spark.kubernetes.container.image=harbor.mycompany.com/dap/zeppelin-executor:3.3 /opt/zeppelin/interpreter/spark/spark-interpreter-0.11.0-SNAPSHOT.jar zeppelin-server.spark.svc 12320 spark_8g_8g-isolated-2G8V2J18D-2023-04-11_00-00-00 12321:12321{code}
 

As you can see, the config value `spark.driver.host` is `spark8g8g-isolated-2g8v2j18d-2023-04-1100-00-00-upuren.spark.svc`, which is correct.
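
To rule out the submitted value simply being ignored, the host the running SparkContext actually resolved can be checked from inside the note. A minimal sketch (run in a %spark paragraph; `sc` is the SparkContext Zeppelin provides):
{code:scala}
// Print the driver host the live SparkContext is actually using.
println(sc.getConf.get("spark.driver.host"))

// For comparison, dump every effective spark.driver.* setting.
sc.getConf.getAll
  .filter { case (k, _) => k.startsWith("spark.driver.") }
  .foreach { case (k, v) => println(s"$k = $v") }
{code}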

During start-up, however, the driver host seems to change. The new name:
{code:java}
spark2g4g-isolated-2d8reueys-2023-04-1100-00-00-fbvrgw.spark.svc {code}

The new name is the hostname of the other cron job running in parallel. How is it possible for the Spark driver host to change? Is Zeppelin even capable of doing this?
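
My suspicion (unverified) is shared mutable launch state on the server side: Zeppelin also pushes interpreter properties into the SparkConf when the interpreter opens, so if two parallel launches share one properties object, the second launch can overwrite `spark.driver.host` before the first SparkContext reads it. A purely illustrative sketch of that kind of race, not actual Zeppelin code (`sharedLaunchContext` is hypothetical):
{code:scala}
import java.util.Properties

// Hypothetical shared launch context -- NOT actual Zeppelin code.
object DriverHostRace {
  val sharedLaunchContext = new Properties()

  def launch(host: String): Thread = new Thread(() => {
    sharedLaunchContext.setProperty("spark.driver.host", host)
    Thread.sleep(50) // simulate startup work before the conf is read
    // Whichever launch wrote last is what BOTH drivers now see.
    println(s"launch($host) reads: " + sharedLaunchContext.getProperty("spark.driver.host"))
  })

  def main(args: Array[String]): Unit = {
    val a = launch("spark8g8g-isolated-2g8v2j18d-2023-04-1100-00-00-upuren.spark.svc") // cron job A
    val b = launch("spark2g4g-isolated-2d8reueys-2023-04-1100-00-00-fbvrgw.spark.svc") // cron job B
    a.start(); b.start(); a.join(); b.join()
  }
}
{code}

The start-up log of the affected interpreter process follows; the wrong host first shows up in the `Added JAR` line near the end: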

{code:java}
INFO [2023-04-11 00:00:04,288] ({RegisterThread} RemoteInterpreterServer.java[run]:620) - Start registration
INFO [2023-04-11 00:00:04,288] ({RemoteInterpreterServer-Thread} RemoteInterpreterServer.java[run]:200) - Launching ThriftServer at 10.129.4.191:12321
INFO [2023-04-11 00:00:05,409] ({RegisterThread} RemoteInterpreterServer.java[run]:634) - Registering interpreter process
INFO [2023-04-11 00:00:05,433] ({RegisterThread} RemoteInterpreterServer.java[run]:636) - Registered interpreter process
INFO [2023-04-11 00:00:05,433] ({RegisterThread} RemoteInterpreterServer.java[run]:657) - Registration finished
WARN [2023-04-11 00:00:05,517] ({pool-3-thread-1} ZeppelinConfiguration.java[<init>]:87) - Failed to load XML configuration, proceeding with a default,for a stacktrace activate the debug log
INFO [2023-04-11 00:00:05,522] ({pool-3-thread-1} ZeppelinConfiguration.java[create]:137) - Server Host: 127.0.0.1
INFO [2023-04-11 00:00:05,523] ({pool-3-thread-1} ZeppelinConfiguration.java[create]:144) - Zeppelin Version: 0.11.0-SNAPSHOT
INFO [2023-04-11 00:00:05,522] ({pool-3-thread-1} ZeppelinConfiguration.java[create]:141) - Server Port: 8080
INFO [2023-04-11 00:00:05,523] ({pool-3-thread-1} ZeppelinConfiguration.java[create]:143) - Context Path: /
INFO [2023-04-11 00:00:05,531] ({pool-3-thread-1} RemoteInterpreterServer.java[createLifecycleManager]:293) - Creating interpreter lifecycle manager: org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager
INFO [2023-04-11 00:00:05,535] ({pool-3-thread-1} RemoteInterpreterServer.java[init]:236) - Creating RemoteInterpreterEventClient with connection pool size: 100
INFO [2023-04-11 00:00:05,535] ({pool-3-thread-1} TimeoutLifecycleManager.java[onInterpreterProcessStarted]:73) - Interpreter process: spark_8g_8g-isolated-2G8V2J18D-2023-04-11_00-00-00 is started
INFO [2023-04-11 00:00:05,535] ({pool-3-thread-1} TimeoutLifecycleManager.java[<init>]:67) - TimeoutLifecycleManager is started with checkInterval: 60000, timeoutThreshold: 3600000
INFO [2023-04-11 00:00:05,627] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.SparkInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,635] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.SparkSqlInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,645] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.PySparkInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,655] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.IPySparkInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,663] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.SparkRInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,670] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.SparkIRInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,679] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.SparkShinyInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,753] ({pool-3-thread-1} RemoteInterpreterServer.java[createInterpreter]:406) - Instantiate interpreter org.apache.zeppelin.spark.KotlinSparkInterpreter, isForceShutdown: true
INFO [2023-04-11 00:00:05,806] ({pool-3-thread-1} SchedulerFactory.java[createOrGetFIFOScheduler]:76) - Create FIFOScheduler: interpreter_688737023
INFO [2023-04-11 00:00:05,806] ({pool-3-thread-1} SchedulerFactory.java[<init>]:56) - Scheduler Thread Pool Size: 100
INFO [2023-04-11 00:00:05,810] ({FIFOScheduler-interpreter_688737023-Worker-1} AbstractScheduler.java[runJob]:127) - Job 20210622-101638_112853005 started by scheduler interpreter_688737023
INFO [2023-04-11 00:00:05,818] ({pool-3-thread-2} SchedulerFactory.java[createOrGetFIFOScheduler]:76) - Create FIFOScheduler: interpreter_839216362
INFO [2023-04-11 00:00:05,818] ({pool-3-thread-2} SchedulerFactory.java[createOrGetParallelScheduler]:88) - Create ParallelScheduler: org.apache.zeppelin.spark.SparkSqlInterpreter1135593921 with maxConcurrency: 10
INFO [2023-04-11 00:00:05,857] ({FIFOScheduler-interpreter_688737023-Worker-1} SparkInterpreter.java[extractScalaVersion]:279) - Using Scala: version 2.12.15
INFO [2023-04-11 00:00:05,881] ({FIFOScheduler-interpreter_688737023-Worker-1} SparkScala212Interpreter.scala[createSparkILoop]:182) - Scala shell repl output dir: /tmp/spark16004603505225443508
INFO [2023-04-11 00:00:06,113] ({FIFOScheduler-interpreter_688737023-Worker-1} SparkScala212Interpreter.scala[createSparkILoop]:191) - UserJars: file:/opt/zeppelin/interpreter/spark/spark-interpreter-0.11.0-SNAPSHOT.jar:/opt/zeppelin/interpreter/spark/scala-2.12/spark-scala-2.12-0.11.0-SNAPSHOT.jar
INFO [2023-04-11 00:00:11,260] ({FIFOScheduler-interpreter_688737023-Worker-1} HiveConf.java[findConfigFile]:187) - Found configuration file file:/opt/conda/lib/python3.9/site-packages/pyspark/conf/hive-site.xml
INFO [2023-04-11 00:00:11,438] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Running Spark version 3.3.0
INFO [2023-04-11 00:00:11,472] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - No custom resources configured for spark.driver.
INFO [2023-04-11 00:00:11,472] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - ==============================================================
INFO [2023-04-11 00:00:11,471] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - ==============================================================
INFO [2023-04-11 00:00:11,473] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Submitted application: spark_8g_8g
INFO [2023-04-11 00:00:11,500] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 4, script: , vendor: , memory -> name: memory, amount: 8192, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
INFO [2023-04-11 00:00:11,512] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Limiting resource is cpus at 4 tasks per executor
INFO [2023-04-11 00:00:11,515] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Added ResourceProfile id: 0
INFO [2023-04-11 00:00:11,580] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Changing view acls to: zeppelin,ejavaheri
INFO [2023-04-11 00:00:11,580] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Changing modify acls to: zeppelin,ejavaheri
INFO [2023-04-11 00:00:11,581] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(zeppelin, ejavaheri); groups with view permissions: Set(); users  with modify permissions: Set(zeppelin, ejavaheri); groups with modify permissions: Set()
INFO [2023-04-11 00:00:11,581] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Changing modify acls groups to:
INFO [2023-04-11 00:00:11,581] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Changing view acls groups to:
INFO [2023-04-11 00:00:11,852] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Successfully started service 'sparkDriver' on port 22321.
INFO [2023-04-11 00:00:11,880] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Registering MapOutputTracker
INFO [2023-04-11 00:00:11,912] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Registering BlockManagerMaster
INFO [2023-04-11 00:00:11,946] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
INFO [2023-04-11 00:00:11,947] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - BlockManagerMasterEndpoint up
INFO [2023-04-11 00:00:11,950] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Registering BlockManagerMasterHeartbeat
INFO [2023-04-11 00:00:11,975] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Created local directory at /tmp/blockmgr-1903d257-be01-4cb7-954f-9a5c13ab0598
INFO [2023-04-11 00:00:11,993] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - MemoryStore started with capacity 4.6 GiB
INFO [2023-04-11 00:00:12,010] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Registering OutputCommitCoordinator
INFO [2023-04-11 00:00:12,079] ({FIFOScheduler-interpreter_688737023-Worker-1} Log.java[initialized]:170) - Logging initialized @9839ms to org.sparkproject.jetty.util.log.Slf4jLog
INFO [2023-04-11 00:00:12,193] ({FIFOScheduler-interpreter_688737023-Worker-1} Server.java[doStart]:375) - jetty-9.4.46.v20220331; built: 2022-03-31T16:38:08.030Z; git: bc17a0369a11ecf40bb92c839b9ef0a8ac50ea18; jvm 11.0.17+8-post-Ubuntu-1ubuntu220.04
INFO [2023-04-11 00:00:12,223] ({FIFOScheduler-interpreter_688737023-Worker-1} Server.java[doStart]:415) - Started @9983ms
INFO [2023-04-11 00:00:12,273] ({FIFOScheduler-interpreter_688737023-Worker-1} AbstractConnector.java[doStart]:333) - Started ServerConnector@325be8be{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
INFO [2023-04-11 00:00:12,274] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Successfully started service 'SparkUI' on port 4040.
INFO [2023-04-11 00:00:12,310] ({FIFOScheduler-interpreter_688737023-Worker-1} ContextHandler.java[doStart]:921) - Started o.s.j.s.ServletContextHandler@47745fce{/,null,AVAILABLE,@Spark}
INFO [2023-04-11 00:00:12,342] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Added JAR file:/opt/zeppelin/interpreter/spark/spark-interpreter-0.11.0-SNAPSHOT.jar at spark://spark2g4g-isolated-2d8reueys-2023-04-1100-00-00-fbvrgw.spark.svc:22321/jars/spark-interpreter-0.11.0-SNAPSHOT.jar with timestamp 1681164011433
INFO [2023-04-11 00:00:12,413] ({FIFOScheduler-interpreter_688737023-Worker-1} Logging.scala[logInfo]:61) - Auto-configuring K8S client using current context from users K8S config file {code}
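
Note the `Added JAR ... at spark://spark2g4g-...` line above: as far as I know, Spark builds this file-server URL from `spark.driver.host` and `spark.driver.port`, so the SparkConf already carried the other job's host when the SparkContext started, while the port (22321) is still the one this job was given. A small sketch that reconstructs the URL for comparison (again for a %spark paragraph):
{code:scala}
// Rebuild the JAR-serving URL from the live conf; if the host differs
// from the --conf spark.driver.host passed to spark-submit, the conf
// was changed before SparkContext initialization.
val host = sc.getConf.get("spark.driver.host")
val port = sc.getConf.get("spark.driver.port")
println(s"spark://$host:$port/jars/<jar-name>")
{code}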


