[GitHub] [spark] roczei commented on pull request #38828: [SPARK-35084][CORE] Spark 3: supporting --packages in k8s cluster mode
roczei commented on PR #38828:
URL: https://github.com/apache/spark/pull/38828#issuecomment-1373446915

Thanks @holdenk for the review! When do you plan to merge it?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] roczei commented on pull request #38828: [SPARK-35084][CORE] Spark 3: supporting --packages in k8s cluster mode
roczei commented on PR #38828:
URL: https://github.com/apache/spark/pull/38828#issuecomment-1354933514

@holdenk, @HyukjinKwon, @dongjoon-hyun Could you please take a look when you have some time?

This fixes a k8s --packages issue that has been present in Spark 3 since 3.0.0. It would be nice to solve it. The "if" / "else if" / "else" conditions in the old branch-3.0 are the same as those in the latest master:

https://github.com/apache/spark/blob/branch-3.0/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L316-L328
vs.
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L317-L330

@ocworld restructured these conditions a bit. I have already posted my test results earlier in this PR, and the fix works.

This K8S PR is a follow-up to #32397, which was closed by github-actions because it had not been updated in a while and had no unit test. The requested unit test has now been added, so we just need a Spark committer to review it again.

Thanks!
[GitHub] [spark] roczei commented on pull request #38828: [SPARK-35084][CORE] Spark 3: supporting --packages in k8s cluster mode
roczei commented on PR #38828:
URL: https://github.com/apache/spark/pull/38828#issuecomment-1339222364

Thanks @ocworld for the uploaded unit test! It works perfectly and catches the issue.

Good:

```
- SPARK-35084: includes jars passed in through --packages in k8s client driver mode
```

Bad:

```
- SPARK-35084: includes jars passed in through --packages in k8s client driver mode *** FAILED ***
  false was not equal to true (SparkSubmitSuite.scala:513)
```

For the record, here is the test case that I validated manually in my environment. I am using a spark-shell in a k8s Docker container. Without your fix, it fails with the following error:

/tmp/spark.properties:

```
spark.kubernetes.submitInDriver=true
spark.kubernetes.authenticate.driver.serviceAccountName=spark
spark.kubernetes.namespace=default
spark.driver.blockManager.port=7079
spark.driver.port=7078
spark.blockManager.port=7079
spark.kubernetes.executor.label.name=executor
spark.kubernetes.driver.label.name=driver
spark.locality.wait=0
spark.executor.instances=1
spark.kubernetes.container.image=spark:spark-35084-upstream-no-fix
spark.master=k8s\://https\://kubernetes.default.svc.cluster.local\:443
spark.jars.packages=com.github.music-of-the-ainur\:almaren-framework_2.12\:0.9.4-3.2,com.github.music-of-the-ainur\:http-almaren_2.12\:1.2.4-3.2
spark.driver.host=172.17.0.4
spark.kubernetes.driver.pod.name=spark-submitter-spark-35084-upstream-no-fix-rxpjs
```

```
spark-shell --properties-file /tmp/spark.properties
...
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.4.0-SNAPSHOT
      /_/

Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 17.0.5)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
<console>:22: error: object music is not a member of package com.github
       import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
                         ^

scala> import com.github.music.of.the.ainur.almaren.Almaren
<console>:22: error: object music is not a member of package com.github
       import com.github.music.of.the.ainur.almaren.Almaren
                         ^

scala> import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.DataFrame

scala> val almaren = Almaren("App Name")
<console>:26: error: not found: value Almaren
       val almaren = Almaren("App Name")
                     ^

scala> $intp.isettings.maxPrintString = 0
$intp.isettings.maxPrintString: Int = 0

scala> spark.conf.get("spark.jars")
res0: String = ""
```

And this is the good run, which includes your fix:

/tmp/spark.properties:

```
spark.kubernetes.submitInDriver=true
spark.kubernetes.authenticate.driver.serviceAccountName=spark
spark.kubernetes.namespace=default
spark.driver.blockManager.port=7079
spark.driver.port=7078
spark.blockManager.port=7079
spark.kubernetes.executor.label.name=executor
spark.kubernetes.driver.label.name=driver
spark.locality.wait=0
spark.executor.instances=1
spark.kubernetes.container.image=spark:spark-35084-upstream-with-fix
spark.master=k8s\://https\://kubernetes.default.svc.cluster.local\:443
spark.jars.packages=com.github.music-of-the-ainur\:almaren-framework_2.12\:0.9.4-3.2,com.github.music-of-the-ainur\:http-almaren_2.12\:1.2.4-3.2
spark.driver.host=172.17.0.3
spark.kubernetes.driver.pod.name=spark-submitter-spark-35084-upstream-with-fix-whxzl
```

```
spark-shell --properties-file /tmp/spark.properties
...
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.4.0-SNAPSHOT
      /_/

Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 17.0.5)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.github.music.of.the.ainur.almaren.builder.Core.Implicit
import com.github.music.of.the.ainur.almaren.builder.Core.Implicit

scala> import com.github.music.of.the.ainur.almaren.Almaren
import com.github.music.of.the.ainur.almaren.Almaren

scala> import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.DataFrame

scala> val almaren = Almaren("App Name")
almaren: com.github.music.of.the.ainur.almaren.Almaren.type = com.github.music.of.the.ainur.almaren.Almaren$@4c2f971

scala> $intp.isettings.maxPrintString = 0
$intp.isettings.maxPrintString: Int = 0

scala> spark.conf.get("spark.jars")
res0: String = file:///home/sparkuser/.ivy2/jars/com.github.music-of-the-ainur_almaren-framework_2.12-0.9.4-3.2.jar,file:///h
```
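The manual check above comes down to one question: does every coordinate in spark.jars.packages have a matching jar in spark.jars? Here is a minimal sketch of that comparison, assuming the resolved jars are named group_artifact-version.jar as in the spark.jars output above. `PackagesCheck` and its methods are hypothetical helpers, not part of Spark, and the coordinates are expected unescaped (without the `\:` used in the properties file).

```scala
// Hypothetical helper, not part of Spark: compares a spark.jars value
// with the coordinates listed in spark.jars.packages.
object PackagesCheck {

  /** Jar file name expected for a "group:artifact:version" coordinate.
    * Assumption: jars are named group_artifact-version.jar. */
  def expectedJarName(coordinate: String): String =
    coordinate.split(":") match {
      case Array(group, artifact, version) => s"${group}_${artifact}-${version}.jar"
      case _ =>
        throw new IllegalArgumentException(
          s"Expected group:artifact:version, got: $coordinate")
    }

  /** Coordinates whose expected jar does not appear in the
    * comma-separated spark.jars value. */
  def missingPackages(sparkJars: String, packages: String): List[String] = {
    // Keep only the file name of each jar path or URI.
    val jarNames = sparkJars.split(",").map(_.trim).filter(_.nonEmpty)
      .map(path => path.substring(path.lastIndexOf('/') + 1))
      .toSet
    packages.split(",").map(_.trim).filter(_.nonEmpty).toList
      .filterNot(coordinate => jarNames.contains(expectedJarName(coordinate)))
  }

  def main(args: Array[String]): Unit = {
    val packages = "com.github.music-of-the-ainur:almaren-framework_2.12:0.9.4-3.2"

    // Without the fix spark.jars was empty, so the package is reported as missing.
    println(missingPackages("", packages))

    // With the fix the resolved jar is listed, so nothing is reported as missing.
    val withFix = "file:///home/sparkuser/.ivy2/jars/" +
      "com.github.music-of-the-ainur_almaren-framework_2.12-0.9.4-3.2.jar"
    println(missingPackages(withFix, packages))
  }
}
```

In a spark-shell session the same check could be run as `PackagesCheck.missingPackages(spark.conf.get("spark.jars"), spark.conf.get("spark.jars.packages"))`, assuming both keys are set in the session configuration.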
[GitHub] [spark] roczei commented on pull request #38828: [SPARK-35084][CORE] Spark 3: supporting --packages in k8s cluster mode
roczei commented on PR #38828:
URL: https://github.com/apache/spark/pull/38828#issuecomment-1332308308

Hi @ocworld,

Thanks a lot for this fix! I have tested it and it works for me as well.

Do you plan to add unit tests? Have you found a solution to the problem you mentioned in the previous pull request?

https://github.com/apache/spark/pull/32397#issuecomment-838655508