[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins
[ https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317418#comment-17317418 ] Shane Knapp edited comment on SPARK-34738 at 4/8/21, 6:40 PM:
--

{noformat}
Given path (/opt/spark/pv-tests/tmp5813668424419880732.txt) does not exist
{noformat}

this is supposed to be mounted in the minikube cluster, not on the bare metal.

> But sth is mounted to there:
{noformat}
Mounts:
  /opt/spark/conf from spark-conf-volume-driver (rw)
  /opt/spark/pv-tests from data (rw)
  ...
{noformat}
> But my guess is this is not the one which connects the host path
> "PVC_TESTS_HOST_PATH" with the internal minikube mounted path (the one which
> goes further to the driver/executor). This is why the locally created file is
> missing.

it's properly mounting the local (bare metal) filesystem, as it is able to create the file.

see: https://issues.apache.org/jira/browse/SPARK-34738?focusedCommentId=17312548&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17312548


> Upgrade Minikube and kubernetes cluster version on Jenkins
> ----------------------------------------------------------
>
>                 Key: SPARK-34738
>                 URL: https://issues.apache.org/jira/browse/SPARK-34738
>             Project: Spark
>          Issue Type: Task
>          Components: jenkins, Kubernetes
>    Affects Versions: 3.2.0
>            Reporter: Attila Zsolt Piros
>            Assignee: Shane Knapp
>            Priority: Major
>         Attachments: integration-tests.log
>
> [~shaneknapp] as we discussed [on the mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/minikube-and-kubernetes-cluster-versions-for-integration-testing-td30856.html]
> Minikube can be upgraded to the latest (v1.18.1) and the kubernetes version
> should be v1.17.3 (`minikube config set kubernetes-version v1.17.3`).
> [Here|https://github.com/apache/spark/pull/31829] is my PR which uses a new
> method to configure the kubernetes client. Thanks in advance for using it for
> testing on Jenkins after the Minikube version is updated.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins
[ https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317381#comment-17317381 ] Attila Zsolt Piros edited comment on SPARK-34738 at 4/8/21, 6:02 PM:
--

No worries. I have a guess. Check "resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala"; there is a match expression to select the node for the mount:

{noformat}
.withMatchExpressions(new NodeSelectorRequirementBuilder()
  .withKey("kubernetes.io/hostname")
  .withOperator("In")
  .withValues("minikube", "m01", "docker-for-desktop", "docker-desktop")
  .build())
{noformat}

This is very suspicious. I mean, seeing docker-desktop and docker-for-desktop listed there, which I assume is behind the docker driver on Mac, but on those servers I doubt Docker Desktop is what provides the Docker daemon.
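The suspicion above can be sketched without a cluster: a PV whose nodeAffinity carries this match expression only binds on nodes whose kubernetes.io/hostname label is one of the hard-coded values, so a Jenkins worker with any other hostname label would never match. This is a minimal, hypothetical simulation of the "In" operator, not the real Kubernetes scheduler; the Jenkins hostname used below is made up for illustration:

```python
# Minimal simulation of the nodeAffinity "In" match expression used by
# PVTestsSuite's PV definition. Hostnames below are illustrative only.

ALLOWED_HOSTNAMES = ["minikube", "m01", "docker-for-desktop", "docker-desktop"]

def node_matches(node_labels: dict) -> bool:
    """True iff the node's kubernetes.io/hostname label is in the allowed list."""
    return node_labels.get("kubernetes.io/hostname") in ALLOWED_HOSTNAMES

# A minikube node matches, so the PV can bind there...
print(node_matches({"kubernetes.io/hostname": "minikube"}))  # True
# ...but a bare-metal Jenkins worker with its own hostname label does not.
print(node_matches({"kubernetes.io/hostname": "jenkins-worker-01"}))  # False
```

If the worker's hostname label is not in that list, the PV simply never becomes schedulable on it, which would explain the test-only failure.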
[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins
[ https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317334#comment-17317334 ] Shane Knapp edited comment on SPARK-34738 at 4/8/21, 4:45 PM:
--

done (attached to the issue)

also, it's been so long since i've had to debug this stuff that i'd forgotten about those logs... :facepalm: :)
[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins
[ https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312696#comment-17312696 ] Shane Knapp edited comment on SPARK-34738 at 3/31/21, 8:38 PM:
---

managed to snag the logs from the pod when it errored out:

{code:java}
++ id -u
+ myuid=185
++ id -g
+ mygid=0
+ set +e
++ getent passwd 185
+ uidentry=
+ set -e
+ '[' -z '' ']'
+ '[' -w /etc/passwd ']'
+ echo '185:x:185:0:anonymous uid:/opt/spark:/bin/false'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=172.17.0.3 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.DFSReadWriteTest local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar /opt/spark/pv-tests/tmp4595937990978494271.txt /opt/spark/pv-tests
21/03/31 20:26:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Given path (/opt/spark/pv-tests/tmp4595937990978494271.txt) does not exist
DFS Read-Write Test
Usage: localFile dfsDir
localFile - (string) local file to use in test
dfsDir - (string) DFS directory for read/write tests
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{code}

this def caught my eye:

{noformat}
Given path (/opt/spark/pv-tests/tmp4595937990978494271.txt) does not exist
{noformat}

i sshed in to the cluster and was able (again) to confirm that minikube was able to mount the PVC test dir on that worker in /tmp, and that the file tmp4595937990978494271.txt was visible and readable from within minikube... however, /opt/spark/pv-tests/ wasn't visible within the minikube cluster.
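The symptom above can be reproduced in miniature without minikube: the test file has to survive two mount hops (host PVC_TESTS_HOST_PATH into the minikube VM, then the VM path into the pod at /opt/spark/pv-tests), and if the second hop is backed by something other than the host directory, the pod sees an empty directory even though the first mount works. A hedged stand-in using symlinks as fake mounts (all paths and filenames below are hypothetical):

```python
import os
import tempfile

# Simulate the two mount hops of the PV test with symlinks standing in for
# real mounts: host dir -> minikube VM path -> pod path (/opt/spark/pv-tests).
root = tempfile.mkdtemp()
host_path = os.path.join(root, "host-pv-tests")  # PVC_TESTS_HOST_PATH stand-in
vm_path = os.path.join(root, "vm-pv-tests")      # PVC_TESTS_VM_PATH stand-in
pod_path = os.path.join(root, "pod-pv-tests")    # pod's /opt/spark/pv-tests stand-in

os.makedirs(host_path)
with open(os.path.join(host_path, "tmp-test-file.txt"), "w") as f:
    f.write("test data")

# Hop 1 works (the `minikube mount`): the VM path is backed by the host dir,
# so the file is visible "inside minikube" -- matching what was observed.
os.symlink(host_path, vm_path)
print(os.path.exists(os.path.join(vm_path, "tmp-test-file.txt")))  # True

# Hop 2 is broken: the pod's volume resolves to some other directory rather
# than the VM mount point, so the driver sees "path does not exist".
other_dir = os.path.join(root, "unrelated-hostpath")
os.makedirs(other_dir)
os.symlink(other_dir, pod_path)
print(os.path.exists(os.path.join(pod_path, "tmp-test-file.txt")))  # False
```

This matches the observation in the comment: the file is readable inside minikube (hop 1), yet /opt/spark/pv-tests in the pod does not contain it (hop 2).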
[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins
[ https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312517#comment-17312517 ] Shane Knapp edited comment on SPARK-34738 at 3/31/21, 4:43 PM:
---

alright, sometimes these things go smoothly, sometimes not. this is firmly in the 'not' camp.

after upgrading minikube and k8s, i was unable to mount a persistent volume when using the kvm2 driver. much debugging ensued. no progress was made, and the error reported was that the minikube pod was unable to connect to the localhost and mount (Connection refused).

so, i decided to randomly try the docker minikube driver. voila! i'm now able to happily mount persistent volumes.

however, when running the k8s integration test, everything passes *except* the PVs w/local storage. from [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-clone:]

{code:java}
- PVs with local storage *** FAILED ***
  The code passed to eventually never returned normally. Attempted 179 times over 3.00242447046 minutes. Last failure message: container not found ("spark-kubernetes-driver"). (PVTestsSuite.scala:117)
{code}

i've never seen this error before, and apparently there aren't many things

here's how we launch minikube and create the mount:

{code:java}
minikube --vm-driver=docker start --memory 6000 --cpus 8
minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L &
{code}

we're using ZFS on the bare metal, and minikube is complaining:

{code:java}
! docker is currently using the zfs storage driver, consider switching to overlay2 for better performance
{code}

i'll continue to dig in to this today, but i'm currently blocked...