[jira] [Created] (SPARK-30160) Extend ExecutorPodAllocator for arbitrary CRDs
Ilan Filonenko created SPARK-30160: -- Summary: Extend ExecutorPodAllocator for arbitrary CRDs Key: SPARK-30160 URL: https://issues.apache.org/jira/browse/SPARK-30160 Project: Spark Issue Type: New Feature Components: Kubernetes Affects Versions: 2.4.4, 3.0.0 Reporter: Ilan Filonenko There are internal use-cases at companies that require executor creation to be done by privileged resources (such as admin controllers or operators that receive events or updates) which would create the Kubernetes executor pods on behalf of the user. In essence, the assumption that the driver has the appropriate service account to create pods is not a guarantee in various organizations. This proposed JIRA looks to create a pluggable interface for pod creation and deletion, allowing customizable allocation libraries to exist, e.g. CRDs that scale up and down.
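For illustration, such a pluggable allocator could be a small interface the driver loads via configuration. The sketch below is hypothetical, loosely modeled on the existing ExecutorPodsAllocator; none of the names are final:
{code:java}
// Hypothetical interface; a sketch only, loosely modeled on ExecutorPodsAllocator.
trait ExecutorPodAllocator {
  // Called once the driver knows its application ID.
  def start(applicationId: String): Unit

  // Invoked when the desired executor count changes. A CRD-based
  // implementation could patch a custom resource here (which an admin
  // controller/operator then reconciles into pods), instead of creating
  // executor pods directly with the driver's service account.
  def setTotalExpectedExecutors(total: Int): Unit

  // Tear down any executors or resources owned by this allocator.
  def stop(applicationId: String): Unit
}
{code}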
[jira] [Updated] (SPARK-30160) Extend ExecutorPodAllocator for arbitrary CRDs
[ https://issues.apache.org/jira/browse/SPARK-30160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-30160: --- Priority: Minor (was: Major) > Extend ExecutorPodAllocator for arbitrary CRDs > -- > > Key: SPARK-30160 > URL: https://issues.apache.org/jira/browse/SPARK-30160 > Project: Spark > Issue Type: New Feature > Components: Kubernetes >Affects Versions: 2.4.4, 3.0.0 >Reporter: Ilan Filonenko >Priority: Minor > > There are internal use-cases at companies that require executor creation > to be done by privileged resources (such as admin controllers or operators > that receive events or updates) which would create the Kubernetes executor > pods on behalf of the user. In essence, the assumption that the driver has > the appropriate service account to create pods is not a guarantee in various > organizations. > This proposed JIRA looks to create a pluggable interface for pod creation and > deletion, allowing customizable allocation libraries to exist, e.g. CRDs > that scale up and down
[jira] [Commented] (SPARK-30111) spark R dockerfile fails to build
[ https://issues.apache.org/jira/browse/SPARK-30111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987372#comment-16987372 ] Ilan Filonenko commented on SPARK-30111: The error seems to be from: {code:yaml} Step 6/12 : RUN apt install -y python python-pip && apt install -y python3 python3-pip && rm -r /usr/lib/python*/ensurepip && pip install --upgrade pip setuptools && rm -r /root/.cache && rm -rf /var/cache/apt/* ---> Running in f3d520c3435b {code} so this relates to the PySpark Dockerfile. The underlying cause is: *404 Not Found [IP: 151.101.188.204 80]* > spark R dockerfile fails to build > - > > Key: SPARK-30111 > URL: https://issues.apache.org/jira/browse/SPARK-30111 > Project: Spark > Issue Type: Bug > Components: Build, jenkins, Kubernetes >Affects Versions: 3.0.0 >Reporter: Shane Knapp >Priority: Major > > all recent k8s builds have been failing when trying to build the sparkR > dockerfile: > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/19565/console] > [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s/426/console|https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s/] > [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-jdk11/76/console|https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-jdk11/] > [~ifilonenko]
[jira] [Updated] (SPARK-27812) Kubernetes client imports non-daemon thread which blocks JVM exit
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-27812: --- Affects Version/s: 2.4.4 > Kubernetes client imports non-daemon thread which blocks JVM exit > > > Key: SPARK-27812 > URL: https://issues.apache.org/jira/browse/SPARK-27812 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.3, 2.4.4 >Reporter: Henry Yu >Priority: Major > > I tried spark-submit to K8s in cluster mode. The driver pod failed to exit > due to an OkHttp WebSocket non-daemon thread.
[jira] [Created] (SPARK-26238) Set SPARK_CONF_DIR to be ${SPARK_HOME}/conf for K8S
Ilan Filonenko created SPARK-26238: -- Summary: Set SPARK_CONF_DIR to be ${SPARK_HOME}/conf for K8S Key: SPARK-26238 URL: https://issues.apache.org/jira/browse/SPARK-26238 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 2.4.0, 3.0.0 Reporter: Ilan Filonenko Set SPARK_CONF_DIR to point to ${SPARK_HOME}/conf as opposed to /opt/spark/conf, which is hard-coded into the Constants. This is the expected behavior according to the Spark docs.
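A minimal sketch of the proposed change, assuming the value lives alongside the other K8s constants (the name below is hypothetical):
{code:java}
// Derive the default conf dir from SPARK_HOME instead of hard-coding it,
// falling back to the current hard-coded value when SPARK_HOME is unset.
def defaultSparkConfDir: String =
  sys.env.get("SPARK_HOME").map(_ + "/conf").getOrElse("/opt/spark/conf")
{code}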
[jira] [Commented] (SPARK-25590) kubernetes-model-2.0.0.jar masks default Spark logging config
[ https://issues.apache.org/jira/browse/SPARK-25590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667640#comment-16667640 ] Ilan Filonenko commented on SPARK-25590: Is this consistent with `kubernetes-model-4.1.0.jar`, which is what is now being packaged as a result of the versioning refactor? (Note that this version bump targets 3.0.0, per SPARK-25828.) > kubernetes-model-2.0.0.jar masks default Spark logging config > - > > Key: SPARK-25590 > URL: https://issues.apache.org/jira/browse/SPARK-25590 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Marcelo Vanzin >Priority: Major > > That jar file, which is packaged when the k8s profile is enabled, has a log4j > configuration embedded in it: > {noformat} > $ jar tf /path/to/kubernetes-model-2.0.0.jar | grep log4j > log4j.properties > {noformat} > What this causes is that Spark will always use that log4j configuration > instead of its own default (log4j-defaults.properties), unless the user > overrides it by somehow adding their own in the classpath before the > kubernetes one. > You can see that by running spark-shell. With the k8s jar in: > {noformat} > $ ./bin/spark-shell > ... > Setting default log level to "WARN" > {noformat} > Removing the k8s jar: > {noformat} > $ ./bin/spark-shell > ... > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > Setting default log level to "WARN". > {noformat} > The proper fix would be for the k8s jar to not ship that file, and then just > upgrade the dependency in Spark, but if there's something easy we can do in > the meantime...
[jira] [Updated] (SPARK-25828) Bumping Version of kubernetes.client to latest version
[ https://issues.apache.org/jira/browse/SPARK-25828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-25828: --- Description: Upgrade the Kubernetes client version to at least [4.0.0|https://mvnrepository.com/artifact/io.fabric8/kubernetes-client/4.0.0] as we are falling behind on fabric8 updates. This will be an update to both kubernetes/core and kubernetes/integration-tests was: Upgrade the Kubernetes client version to at least [4.0.0|https://mvnrepository.com/artifact/io.fabric8/kubernetes-client/4.0.0] as we are falling behind on fabric8 updates. This will be an update to both in kubernetes/core and kubernetes/integration-tests > Bumping Version of kubernetes.client to latest version > -- > > Key: SPARK-25828 > URL: https://issues.apache.org/jira/browse/SPARK-25828 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.0.0 >Reporter: Ilan Filonenko >Priority: Minor > > Upgrade the Kubernetes client version to at least > [4.0.0|https://mvnrepository.com/artifact/io.fabric8/kubernetes-client/4.0.0] > as we are falling behind on fabric8 updates. This will be an update to both > kubernetes/core and kubernetes/integration-tests
[jira] [Created] (SPARK-25828) Bumping Version of kubernetes.client to latest version
Ilan Filonenko created SPARK-25828: -- Summary: Bumping Version of kubernetes.client to latest version Key: SPARK-25828 URL: https://issues.apache.org/jira/browse/SPARK-25828 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.0.0 Reporter: Ilan Filonenko Upgrade the Kubernetes client version to at least [4.0.0|https://mvnrepository.com/artifact/io.fabric8/kubernetes-client/4.0.0] as we are falling behind on fabric8 updates. This will be an update to both kubernetes/core and kubernetes/integration-tests
[jira] [Updated] (SPARK-25826) Kerberos Support in Kubernetes resource manager
[ https://issues.apache.org/jira/browse/SPARK-25826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-25826: --- Description: This is the umbrella issue for all Kerberos related tasks with relation to Spark on Kubernetes (was: This is the umbrella issue for all Kerberos related tasks) Summary: Kerberos Support in Kubernetes resource manager (was: Kerberos Support in Kubernetes) > Kerberos Support in Kubernetes resource manager > --- > > Key: SPARK-25826 > URL: https://issues.apache.org/jira/browse/SPARK-25826 > Project: Spark > Issue Type: Umbrella > Components: Kubernetes >Affects Versions: 3.0.0 >Reporter: Ilan Filonenko >Priority: Major > > This is the umbrella issue for all Kerberos related tasks with relation to > Spark on Kubernetes
[jira] [Created] (SPARK-25826) Kerberos Support in Kubernetes
Ilan Filonenko created SPARK-25826: -- Summary: Kerberos Support in Kubernetes Key: SPARK-25826 URL: https://issues.apache.org/jira/browse/SPARK-25826 Project: Spark Issue Type: Umbrella Components: Kubernetes Affects Versions: 3.0.0 Reporter: Ilan Filonenko This is the umbrella issue for all Kerberos related tasks
[jira] [Created] (SPARK-25825) Kerberos Support for Long Running Jobs in Kubernetes
Ilan Filonenko created SPARK-25825: -- Summary: Kerberos Support for Long Running Jobs in Kubernetes Key: SPARK-25825 URL: https://issues.apache.org/jira/browse/SPARK-25825 Project: Spark Issue Type: New Feature Components: Kubernetes Affects Versions: 3.0.0 Reporter: Ilan Filonenko When provided with a --keytab and --principal combination, there is an expectation that Kubernetes would leverage the Driver to spin up a renewal thread to handle token renewal. However, in the case that --keytab and --principal are not provided and instead a secretName and secretItemKey are provided, there should be an option to specify a config indicating that an external renewal service exists. The driver should, therefore, be responsible for discovering changes to the secret and sending the updated token data to the executors with the UpdateDelegationTokens message, thereby enabling token renewal given just a secret, in addition to the traditional use-case via --keytab and --principal.
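A rough sketch of how the driver could discover secret updates with the fabric8 client and push refreshed tokens out; the wiring via a sendToExecutors callback is hypothetical, while UpdateDelegationTokens refers to the existing scheduler RPC message:
{code:java}
import io.fabric8.kubernetes.api.model.Secret
import io.fabric8.kubernetes.client.{KubernetesClient, KubernetesClientException, Watcher}

// Sketch only: watch the configured secret and forward refreshed token data.
class DelegationTokenSecretWatcher(
    client: KubernetesClient,
    namespace: String,
    secretName: String,
    secretItemKey: String,
    sendToExecutors: Array[Byte] => Unit) extends Watcher[Secret] {

  def watchSecret(): Unit =
    client.secrets().inNamespace(namespace).withName(secretName).watch(this)

  override def eventReceived(action: Watcher.Action, secret: Secret): Unit = {
    if (action == Watcher.Action.MODIFIED) {
      // Secret data is base64-encoded; decode the refreshed token bytes and
      // hand them off, e.g. wrapped in an UpdateDelegationTokens message.
      val tokens = java.util.Base64.getDecoder.decode(secret.getData.get(secretItemKey))
      sendToExecutors(tokens)
    }
  }

  override def onClose(cause: KubernetesClientException): Unit = ()
}
{code}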
[jira] [Created] (SPARK-25815) Kerberos Support in Kubernetes resource manager (Client Mode)
Ilan Filonenko created SPARK-25815: -- Summary: Kerberos Support in Kubernetes resource manager (Client Mode) Key: SPARK-25815 URL: https://issues.apache.org/jira/browse/SPARK-25815 Project: Spark Issue Type: New Feature Components: Kubernetes Affects Versions: 3.0.0 Reporter: Ilan Filonenko Include Kerberos support for Spark on K8S jobs running in client mode.
[jira] [Updated] (SPARK-23257) Kerberos Support in Kubernetes resource manager (Cluster Mode)
[ https://issues.apache.org/jira/browse/SPARK-23257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-23257: --- Summary: Kerberos Support in Kubernetes resource manager (Cluster Mode) (was: Implement Kerberos Support in Kubernetes resource manager) > Kerberos Support in Kubernetes resource manager (Cluster Mode) > -- > > Key: SPARK-23257 > URL: https://issues.apache.org/jira/browse/SPARK-23257 > Project: Spark > Issue Type: Wish > Components: Kubernetes >Affects Versions: 2.3.0 >Reporter: Rob Keevil >Assignee: Ilan Filonenko >Priority: Major > Fix For: 3.0.0 > > > On the forked k8s branch of Spark at > [https://github.com/apache-spark-on-k8s/spark/pull/540], Kerberos support > has been added to the Kubernetes resource manager. The Kubernetes code > between these two repositories appears to have diverged, so this commit > cannot be merged in easily. Are there any plans to re-implement this work on > the main Spark repository? > > [ifilonenko|https://github.com/ifilonenko] [~liyinan926] I am happy to help > with the development and testing of this, but I wanted to confirm that this > isn't already in progress - I could not find any discussion about this > specific topic online.
[jira] [Commented] (SPARK-25678) SPIP: Adding support in Spark for HPC cluster manager (PBS Professional)
[ https://issues.apache.org/jira/browse/SPARK-25678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659756#comment-16659756 ] Ilan Filonenko commented on SPARK-25678: Would recommend looking at https://jira.apache.org/jira/browse/SPARK-19700, as it seems related to your approach of enabling pluggable scheduler implementations in Spark. > SPIP: Adding support in Spark for HPC cluster manager (PBS Professional) > > > Key: SPARK-25678 > URL: https://issues.apache.org/jira/browse/SPARK-25678 > Project: Spark > Issue Type: New Feature > Components: Scheduler >Affects Versions: 3.0.0 >Reporter: Utkarsh Maheshwari >Priority: Major > > I sent an email on the dev mailing list but got no response, hence filing a > JIRA ticket. > > PBS (Portable Batch System) Professional is an open-sourced workload > management system for HPC clusters. Many organizations using PBS for managing > their cluster also use Spark for Big Data, but they are forced to divide the > cluster into a Spark cluster and a PBS cluster, either by physically dividing > the cluster nodes into two groups or by starting the Spark Standalone cluster > manager's Master and Slaves as PBS jobs, leading to underutilization of > resources. > > I am trying to add support in Spark to use PBS as a pluggable cluster > manager. Going through the Spark codebase and looking at the Mesos and > Kubernetes integrations, I found that we can get this working as follows: > > - Extend `ExternalClusterManager`. > - Extend `CoarseGrainedSchedulerBackend` > - This class can start `Executors` as PBS jobs. > - The initial number of `Executors` are started `onStart`. > - More `Executors` can be started as and when required using > `doRequestTotalExecutors`. > - `Executors` can be killed using `doKillExecutors`. > - Extend `SparkApplication` to start the `Driver` as a PBS job in cluster > deploy mode. > - This extended class can submit the Spark application again as a PBS job > with deploy mode = client, so that the application driver is started on > a node in the cluster. > > I have a couple of questions: > - Does this seem like a good idea or should we look at other options? > - What are the expectations from the initial prototype? > - If this works, would Spark maintainers look forward to merging this or > would they want it to be maintained as a fork?
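For reference, the extension points listed in the issue above map onto Spark's ExternalClusterManager trait (loaded via a META-INF/services registration; note the trait is private[spark]). A minimal skeleton, with the PBS-specific backend left as a hypothetical stub:
{code:java}
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend, TaskScheduler, TaskSchedulerImpl}

private[spark] class PBSClusterManager extends ExternalClusterManager {
  // Claim master URLs of the form "pbs://..." (URL scheme is hypothetical).
  override def canCreate(masterURL: String): Boolean = masterURL.startsWith("pbs")

  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new TaskSchedulerImpl(sc)

  // Would return a CoarseGrainedSchedulerBackend subclass that starts
  // executors as PBS jobs (doRequestTotalExecutors, doKillExecutors, ...);
  // elided here since the PBS integration does not exist yet.
  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend = ???

  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}
{code}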
[jira] [Updated] (SPARK-25750) Integration Testing for Kerberos Support for Spark on Kubernetes
[ https://issues.apache.org/jira/browse/SPARK-25750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-25750: --- Summary: Integration Testing for Kerberos Support for Spark on Kubernetes (was: Secure HDFS Integration Testing) > Integration Testing for Kerberos Support for Spark on Kubernetes > > > Key: SPARK-25750 > URL: https://issues.apache.org/jira/browse/SPARK-25750 > Project: Spark > Issue Type: Test > Components: Kubernetes >Affects Versions: 3.0.0 >Reporter: Ilan Filonenko >Priority: Major > > Integration testing for Secure HDFS interaction for Spark on Kubernetes.
[jira] [Created] (SPARK-25751) Unit Testing for Kerberos Support for Spark on Kubernetes
Ilan Filonenko created SPARK-25751: -- Summary: Unit Testing for Kerberos Support for Spark on Kubernetes Key: SPARK-25751 URL: https://issues.apache.org/jira/browse/SPARK-25751 Project: Spark Issue Type: Test Components: Kubernetes Affects Versions: 3.0.0 Reporter: Ilan Filonenko Unit tests for Kerberos Support within Spark on Kubernetes.
[jira] [Created] (SPARK-25750) Secure HDFS Integration Testing
Ilan Filonenko created SPARK-25750: -- Summary: Secure HDFS Integration Testing Key: SPARK-25750 URL: https://issues.apache.org/jira/browse/SPARK-25750 Project: Spark Issue Type: Test Components: Kubernetes Affects Versions: 3.0.0 Reporter: Ilan Filonenko Integration testing for Secure HDFS interaction for Spark on Kubernetes.
[jira] [Created] (SPARK-25681) Delegation Tokens fetched twice upon HadoopFSDelegationTokenProvider creation
Ilan Filonenko created SPARK-25681: -- Summary: Delegation Tokens fetched twice upon HadoopFSDelegationTokenProvider creation Key: SPARK-25681 URL: https://issues.apache.org/jira/browse/SPARK-25681 Project: Spark Issue Type: Improvement Components: Kubernetes, Mesos, YARN Affects Versions: 2.5.0 Reporter: Ilan Filonenko Looking for a refactor to {{HadoopFSDelegationTokenProvider}}. Within the function {{obtainDelegationTokens()}}, this code block: {code:java} val fetchCreds = fetchDelegationTokens(getTokenRenewer(hadoopConf),...) // Get the token renewal interval if it is not set. It will only be called once. if (tokenRenewalInterval == null) { tokenRenewalInterval = getTokenRenewalInterval(...) }{code} calls {{fetchDelegationTokens()}} twice, since {{tokenRenewalInterval}} will always be null upon creation of the {{TokenManager}}, which I think is unnecessary in the case of Kubernetes (as you are creating two DTs when only one is needed). Could this possibly be refactored to only call {{fetchDelegationTokens()}} once upon startup, or to have a param to specify {{tokenRenewalInterval}}?
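An illustrative shape of the requested refactor (signatures simplified and hypothetical; the real getTokenRenewalInterval currently fetches fresh tokens internally):
{code:java}
// Sketch: fetch once at startup, then derive the renewal interval from the
// credentials we already have instead of fetching a second set of DTs.
val fetchCreds = fetchDelegationTokens(getTokenRenewer(hadoopConf), fileSystems, creds)
if (tokenRenewalInterval == null) {
  // Hypothetical variant that inspects the already-fetched tokens.
  tokenRenewalInterval = getTokenRenewalInterval(sparkConf, fetchCreds)
}
{code}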
[jira] [Updated] (SPARK-25681) Delegation Tokens fetched twice upon HadoopFSDelegationTokenProvider creation
[ https://issues.apache.org/jira/browse/SPARK-25681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-25681: --- Labels: Hadoop Kerberos (was: ) > Delegation Tokens fetched twice upon HadoopFSDelegationTokenProvider creation > - > > Key: SPARK-25681 > URL: https://issues.apache.org/jira/browse/SPARK-25681 > Project: Spark > Issue Type: Improvement > Components: Kubernetes, Mesos, YARN >Affects Versions: 2.5.0 >Reporter: Ilan Filonenko >Priority: Major > Labels: Hadoop, Kerberos > > Looking for a refactor to {{HadoopFSDelegationTokenProvider}}. Within the > function {{obtainDelegationTokens()}}, this code block: > {code:java} > val fetchCreds = fetchDelegationTokens(getTokenRenewer(hadoopConf),...) > // Get the token renewal interval if it is not set. It will only be > called once. > if (tokenRenewalInterval == null) { > tokenRenewalInterval = getTokenRenewalInterval(...) > }{code} > calls {{fetchDelegationTokens()}} twice, since {{tokenRenewalInterval}} > will always be null upon creation of the {{TokenManager}}, which I think is > unnecessary in the case of Kubernetes (as you are creating two DTs when only > one is needed). Could this possibly be refactored to only call > {{fetchDelegationTokens()}} once upon startup, or to have a param to specify > {{tokenRenewalInterval}}?
[jira] [Commented] (SPARK-25291) Flakiness of tests in terms of executor memory (SecretsTestSuite)
[ https://issues.apache.org/jira/browse/SPARK-25291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612911#comment-16612911 ] Ilan Filonenko commented on SPARK-25291: [~skonto] the problem stems from the executors not being created in time for the reading of logs, which causes the tests to fail. It is therefore required that we block on executor creation via a Watcher and only read the logs when executors are up. In essence:
{code:java}
val executorPods = kubernetesTestComponents.kubernetesClient
  .pods()
  .withLabel("spark-app-locator", appLocator)
  .withLabel("spark-role", "executor")
  .list()
  .getItems
executorPods.asScala.foreach { pod => executorPodChecker(pod) }
{code}
runs after the spark-submit command, and it takes an arbitrary period of time for the executors to get spun up. If the k8s client that reads the executor logs returns 0 pods, the checks are skipped; the flakiness occurs because the checks do not always run against the executor pods that are created. In terms of the flakiness of the PySpark tests, it seems that the executor pods are setting .set("spark.executor.memory", "500m") in the SparkConf and as such are expecting 884 instead of 1408, so that error seems to be related to the PySpark test framework. > Flakiness of tests in terms of executor memory (SecretsTestSuite) > - > > Key: SPARK-25291 > URL: https://issues.apache.org/jira/browse/SPARK-25291 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Ilan Filonenko >Priority: Major > > SecretsTestSuite shows flakiness in terms of correct setting of executor > memory: > Run SparkPi with env and mount secrets. *** FAILED *** > "[884]Mi" did not equal "[1408]Mi" (KubernetesSuite.scala:272) > When run with default settings
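A sketch of the blocking approach using a fabric8 Watcher and a latch; the labels match the test snippet above, while the latch-based wiring and timeout are illustrative:
{code:java}
import java.util.concurrent.{CountDownLatch, TimeUnit}
import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.{KubernetesClientException, Watcher}

// Block until at least one executor pod is running before reading logs.
val executorReady = new CountDownLatch(1)
val watch = kubernetesTestComponents.kubernetesClient
  .pods()
  .withLabel("spark-app-locator", appLocator)
  .withLabel("spark-role", "executor")
  .watch(new Watcher[Pod] {
    override def eventReceived(action: Watcher.Action, pod: Pod): Unit = {
      if (pod.getStatus.getPhase == "Running") executorReady.countDown()
    }
    override def onClose(cause: KubernetesClientException): Unit = ()
  })
try {
  executorReady.await(3, TimeUnit.MINUTES) // then list pods and run executorPodChecker
} finally {
  watch.close()
}
{code}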
[jira] [Created] (SPARK-25372) Deprecate YARN-specific configs regarding keytab login for SparkSubmit
Ilan Filonenko created SPARK-25372: -- Summary: Deprecate YARN-specific configs regarding keytab login for SparkSubmit Key: SPARK-25372 URL: https://issues.apache.org/jira/browse/SPARK-25372 Project: Spark Issue Type: Bug Components: Kubernetes, YARN Affects Versions: 2.4.0 Reporter: Ilan Filonenko {{SparkSubmit}} already logs in the user if a keytab is provided; the only issue is that it uses the existing configs, which have "yarn" in their name. As such, we should use a common name for the principal and keytab configs, and deprecate the YARN-specific ones. cc [~vanzin]
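One possible shape for this, assuming Spark's internal ConfigBuilder API and its withAlternative mechanism for deprecated keys (the new key names below are a proposal, not final):
{code:java}
import org.apache.spark.internal.config.ConfigBuilder

object KerberosConfigs {
  // Resource-manager-agnostic keys, with the YARN-specific names kept as
  // deprecated alternatives for backwards compatibility.
  val PRINCIPAL = ConfigBuilder("spark.kerberos.principal")
    .withAlternative("spark.yarn.principal")
    .stringConf
    .createOptional

  val KEYTAB = ConfigBuilder("spark.kerberos.keytab")
    .withAlternative("spark.yarn.keytab")
    .stringConf
    .createOptional
}
{code}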
[jira] [Created] (SPARK-25291) Flakiness of tests in terms of executor memory (SecretsTestSuite)
Ilan Filonenko created SPARK-25291: -- Summary: Flakiness of tests in terms of executor memory (SecretsTestSuite) Key: SPARK-25291 URL: https://issues.apache.org/jira/browse/SPARK-25291 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 2.4.0 Reporter: Ilan Filonenko SecretsTestSuite shows flakiness in terms of correct setting of executor memory: - Run SparkPi with env and mount secrets. *** FAILED *** "[884]Mi" did not equal "[1408]Mi" (KubernetesSuite.scala:272) When run with default settings
[jira] [Updated] (SPARK-25291) Flakiness of tests in terms of executor memory (SecretsTestSuite)
[ https://issues.apache.org/jira/browse/SPARK-25291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-25291: --- Description: SecretsTestSuite shows flakiness in terms of correct setting of executor memory: Run SparkPi with env and mount secrets. *** FAILED *** "[884]Mi" did not equal "[1408]Mi" (KubernetesSuite.scala:272) When run with default settings was: SecretsTestSuite shows flakiness in terms of correct setting of executor memory: - Run SparkPi with env and mount secrets. *** FAILED *** "[884]Mi" did not equal "[1408]Mi" (KubernetesSuite.scala:272) When run with default settings > Flakiness of tests in terms of executor memory (SecretsTestSuite) > - > > Key: SPARK-25291 > URL: https://issues.apache.org/jira/browse/SPARK-25291 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 2.4.0 >Reporter: Ilan Filonenko >Priority: Major > > SecretsTestSuite shows flakiness in terms of correct setting of executor > memory: > Run SparkPi with env and mount secrets. *** FAILED *** > "[884]Mi" did not equal "[1408]Mi" (KubernetesSuite.scala:272) > When run with default settings
[jira] [Updated] (SPARK-25264) Fix comma-delineated arguments passed into PythonRunner and RRunner
[ https://issues.apache.org/jira/browse/SPARK-25264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-25264: --- Description: The arguments passed into the PythonRunner and RRunner are comma-delineated. Because the Runners do an arg.slice(2,...), the delineation in the entrypoint needs to be a space, as that is what the Runner arguments expect. This issue was logged here: [https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/273] was: The arguments passed into the PythonRunner and RRunner are comma-delineated. This issue was logged here: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/273 > Fix comma-delineated arguments passed into PythonRunner and RRunner > --- > > Key: SPARK-25264 > URL: https://issues.apache.org/jira/browse/SPARK-25264 > Project: Spark > Issue Type: Bug > Components: Kubernetes, PySpark >Affects Versions: 2.4.0 >Reporter: Ilan Filonenko >Priority: Major > > The arguments passed into the PythonRunner and RRunner are comma-delineated. > Because the Runners do an arg.slice(2,...), the delineation in the entrypoint > needs to be a space, as that is what the Runner arguments expect. > This issue was logged here: > [https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/273]
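A simplified illustration of why the delimiter matters: the Runners assume user application arguments arrive as separate argv entries after the first two positional arguments (paraphrased from PythonRunner.main, not the exact source):
{code:java}
// args(0) = primary python file, args(1) = py-files; everything from index 2
// on must be individual application arguments. If the container entrypoint
// joins them with commas, they arrive as one element and the slice mis-parses.
val pythonFile = args(0)
val pyFiles = args(1)
val appArgs = args.slice(2, args.length) // requires space-separated argv entries
{code}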
[jira] [Created] (SPARK-25264) Fix comma-delineated arguments passed into PythonRunner and RRunner
Ilan Filonenko created SPARK-25264: -- Summary: Fix comma-delineated arguments passed into PythonRunner and RRunner Key: SPARK-25264 URL: https://issues.apache.org/jira/browse/SPARK-25264 Project: Spark Issue Type: Bug Components: Kubernetes, PySpark Affects Versions: 2.4.0 Reporter: Ilan Filonenko The arguments passed into the PythonRunner and RRunner are comma-delineated. This issue was logged here: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/273
[jira] [Commented] (SPARK-24736) --py-files not functional for non-local URLs. It appears to pass non-local URLs into PYTHONPATH directly.
[ https://issues.apache.org/jira/browse/SPARK-24736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578882#comment-16578882 ] Ilan Filonenko commented on SPARK-24736: Until a resource-staging-server is set up, the URL will be unable to resolve to the file location unless you use {{SparkFiles.get(file_name)}} in your application. As such, a URL used in --py-files will remain unresolved. Thus, remote dependencies won't be supported by --py-files just yet, but we can support local files. > --py-files not functional for non-local URLs. It appears to pass non-local > URLs into PYTHONPATH directly. > -- > > Key: SPARK-24736 > URL: https://issues.apache.org/jira/browse/SPARK-24736 > Project: Spark > Issue Type: Bug > Components: Kubernetes, PySpark >Affects Versions: 2.4.0 > Environment: Recent 2.4.0 from master branch, submitted on Linux to a > KOPS Kubernetes cluster created on AWS. > >Reporter: Jonathan A Weaver >Priority: Minor > > My spark-submit > bin/spark-submit \ > --master > k8s://https://internal-api-test-k8s-local-7afed8-796273878.us-east-1.elb.amazonaws.com \ > --deploy-mode cluster \ > --name pytest \ > --conf > spark.kubernetes.container.image=412834075398.dkr.ecr.us-east-1.amazonaws.com/fids/pyspark-k8s:latest \ > --conf > spark.kubernetes.driver.pod.name=spark-pi-driver \ > --conf > spark.kubernetes.authenticate.submission.caCertFile=cluster.ca \ > --conf spark.kubernetes.authenticate.submission.oauthToken=$TOK \ > --conf spark.kubernetes.authenticate.driver.oauthToken=$TOK \ > --py-files "https://s3.amazonaws.com/maxar-ids-fids/screw.zip" \ > https://s3.amazonaws.com/maxar-ids-fids/it.py > > *screw.zip is successfully downloaded and placed in SparkFiles.getRootPath()* > 2018-07-01 07:33:43 INFO SparkContext:54 - Added file > https://s3.amazonaws.com/maxar-ids-fids/screw.zip at > https://s3.amazonaws.com/maxar-ids-fids/screw.zip with timestamp > 1530430423297 > 2018-07-01 07:33:43 INFO Utils:54 - Fetching > https://s3.amazonaws.com/maxar-ids-fids/screw.zip to > /var/data/spark-7aba748d-2bba-4015-b388-c2ba9adba81e/spark-0ed5a100-6efa-45ca-ad4c-d1e57af76ffd/userFiles-a053206e-33d9-4245-b587-f8ac26d4c240/fetchFileTemp1549645948768432992.tmp > *I print out the PYTHONPATH and PYSPARK_FILES environment variables from the > driver script:* > PYTHONPATH > /opt/spark/python/lib/pyspark.zip:/opt/spark/python/lib/py4j-0.10.7-src.zip:/opt/spark/jars/spark-core_2.11-2.4.0-SNAPSHOT.jar:/opt/spark/python/lib/pyspark.zip:/opt/spark/python/lib/py4j-*.zip:*https://s3.amazonaws.com/maxar-ids-fids/screw.zip* > PYSPARK_FILES https://s3.amazonaws.com/maxar-ids-fids/screw.zip > > *I print out sys.path* > ['/tmp/spark-fec3684b-8b63-4f43-91a4-2f2fa41a1914', > u'/var/data/spark-7aba748d-2bba-4015-b388-c2ba9adba81e/spark-0ed5a100-6efa-45ca-ad4c-d1e57af76ffd/userFiles-a053206e-33d9-4245-b587-f8ac26d4c240', > '/opt/spark/python/lib/pyspark.zip', > '/opt/spark/python/lib/py4j-0.10.7-src.zip', > '/opt/spark/jars/spark-core_2.11-2.4.0-SNAPSHOT.jar', > '/opt/spark/python/lib/py4j-*.zip', *'/opt/spark/work-dir/https', > '//s3.amazonaws.com/maxar-ids-fids/screw.zip',* > '/usr/lib/python27.zip', '/usr/lib/python2.7', > '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk', > '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', > '/usr/lib/python2.7/site-packages'] > > *URL from PYTHONFILES gets placed in sys.path verbatim with obvious results.* > > *Dump of spark config from container.* Spark config dumped: > [(u'spark.master', > u'k8s://https://internal-api-test-k8s-local-7afed8-796273878.us-east-1.elb.amazonaws.com'), > (u'spark.kubernetes.authenticate.submission.oauthToken', > u''), (u'spark.kubernetes.authenticate.driver.oauthToken', > u''), (u'spark.kubernetes.executor.podNamePrefix', > u'pytest-1530430411996'), (u'spark.kubernetes.memoryOverheadFactor', u'0.4'), > (u'spark.driver.blockManager.port', u'7079'), > (u'spark.app.id', u'spark-application-1530430424433'), > (u'spark.app.name', u'pytest'), > (u'spark.executor.id', u'driver'), > (u'spark.driver.host', u'pytest-1530430411996-driver-svc.default.svc'), > (u'spark.kubernetes.container.
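For reference, the workaround [~ifilonenko] describes in the comment above, shown with the Scala API (pyspark.SparkFiles mirrors it):
{code:java}
import org.apache.spark.SparkFiles

// Files shipped via spark-submit are staged under a local root on each
// node; resolve them at runtime instead of using the original URL.
val localZip = SparkFiles.get("screw.zip") // absolute local path on the node
{code}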
[jira] [Closed] (SPARK-23984) PySpark Bindings for K8S
[ https://issues.apache.org/jira/browse/SPARK-23984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko closed SPARK-23984. -- > PySpark Bindings for K8S > > > Key: SPARK-23984 > URL: https://issues.apache.org/jira/browse/SPARK-23984 > Project: Spark > Issue Type: New Feature > Components: Kubernetes, PySpark >Affects Versions: 2.3.0 >Reporter: Ilan Filonenko >Priority: Major > Fix For: 2.4.0 > > > This ticket is tracking the ongoing work of moving the upstream work from > [https://github.com/apache-spark-on-k8s/spark], specifically regarding Python > bindings for Spark on Kubernetes. > The points of focus are: dependency management, increased non-JVM memory > overhead default values, and modified Docker images to include Python > support.
[jira] [Updated] (SPARK-23984) PySpark Bindings for K8S
[ https://issues.apache.org/jira/browse/SPARK-23984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-23984: --- Summary: PySpark Bindings for K8S (was: PySpark Bindings) > PySpark Bindings for K8S > > > Key: SPARK-23984 > URL: https://issues.apache.org/jira/browse/SPARK-23984 > Project: Spark > Issue Type: New Feature > Components: Kubernetes, PySpark >Affects Versions: 2.3.0 >Reporter: Ilan Filonenko >Priority: Major > > This ticket is tracking the ongoing work of moving the upstream work from > [https://github.com/apache-spark-on-k8s/spark], specifically regarding Python > bindings for Spark on Kubernetes. > The points of focus are: dependency management, increased non-JVM memory > overhead default values, and modified Docker images to include Python > support.
[jira] [Updated] (SPARK-23984) PySpark Bindings
[ https://issues.apache.org/jira/browse/SPARK-23984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilan Filonenko updated SPARK-23984: --- Shepherd: (was: Holden Karau) > PySpark Bindings > > > Key: SPARK-23984 > URL: https://issues.apache.org/jira/browse/SPARK-23984 > Project: Spark > Issue Type: New Feature > Components: Kubernetes, PySpark >Affects Versions: 2.3.0 >Reporter: Ilan Filonenko >Priority: Major > > This ticket is tracking the ongoing work of moving the upstream work from > [https://github.com/apache-spark-on-k8s/spark], specifically regarding Python > bindings for Spark on Kubernetes. > The points of focus are: dependency management, increased non-JVM memory > overhead default values, and modified Docker images to include Python > support.
[jira] [Created] (SPARK-23984) PySpark Bindings
Ilan Filonenko created SPARK-23984: -- Summary: PySpark Bindings Key: SPARK-23984 URL: https://issues.apache.org/jira/browse/SPARK-23984 Project: Spark Issue Type: New Feature Components: Kubernetes, PySpark Affects Versions: 2.3.0 Reporter: Ilan Filonenko This ticket is tracking the ongoing work of moving the upstream work from [https://github.com/apache-spark-on-k8s/spark], specifically regarding Python bindings for Spark on Kubernetes. The points of focus are: dependency management, increased non-JVM memory overhead default values, and modified Docker images to include Python support.
[jira] [Commented] (SPARK-23257) Implement Kerberos Support in Kubernetes resource manager
[ https://issues.apache.org/jira/browse/SPARK-23257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347303#comment-16347303 ] Ilan Filonenko commented on SPARK-23257: Hi [~RJKeevil], after the above PR is handled and my HDFS test for Kerberos is in the integration testing repo, I will begin upstreaming the work here. Would love another pair of eyes on the review of this upstreaming process, as I will be opening up a PR soon. > Implement Kerberos Support in Kubernetes resource manager > - > > Key: SPARK-23257 > URL: https://issues.apache.org/jira/browse/SPARK-23257 > Project: Spark > Issue Type: Wish > Components: Kubernetes >Affects Versions: 2.3.0 >Reporter: Rob Keevil >Priority: Major > > On the forked k8s branch of Spark at > [https://github.com/apache-spark-on-k8s/spark/pull/540], Kerberos support > has been added to the Kubernetes resource manager. The Kubernetes code > between these two repositories appears to have diverged, so this commit > cannot be merged in easily. Are there any plans to re-implement this work on > the main Spark repository? > > [ifilonenko|https://github.com/ifilonenko] [~liyinan926] I am happy to help > with the development and testing of this, but I wanted to confirm that this > isn't already in progress - I could not find any discussion about this > specific topic online.