[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins

2021-04-08 Thread Shane Knapp (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317418#comment-17317418
 ] 

Shane Knapp edited comment on SPARK-34738 at 4/8/21, 6:40 PM:
--

{noformat}
Given path (/opt/spark/pv-tests/tmp5813668424419880732.txt) does not exist
{noformat}
this is supposed to be mounted in the minikube cluster, not on the bare metal.

> But something is mounted there:
{noformat}
 Mounts:
  /opt/spark/conf from spark-conf-volume-driver (rw)
  /opt/spark/pv-tests from data (rw)
...
{noformat}
> But my guess is that this is not the one which connects the host path 
> "PVC_TESTS_HOST_PATH" with the internal minikube mounted path (the one which 
> goes further to the driver/executor). This is why the locally created file is 
> missing.

 

it's properly mounting the local (bare metal) filesystem, as it is able to 
create the file.  see:  
https://issues.apache.org/jira/browse/SPARK-34738?focusedCommentId=17312548&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17312548
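
One way to compare the two layers discussed above is sketched below. It is only an illustration: it assumes the minikube CLI is on the PATH, that PVC_TESTS_HOST_PATH and PVC_TESTS_VM_PATH are exported as in the Jenkins job, and that "minikube ssh" passes back the exit status of the remote command. The function name where_is_file is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report on which layer a PV test file is visible.
# Assumes PVC_TESTS_HOST_PATH (bare metal) and PVC_TESTS_VM_PATH (minikube
# node) are exported, as in the job that runs "minikube mount".
where_is_file() {
  local name="$1"
  local result=""

  # Layer 1: the bare-metal directory that is the source of the mount.
  if [ -e "${PVC_TESTS_HOST_PATH}/${name}" ]; then
    result="host"
  fi

  # Layer 2: the minikube node, where the mount should expose the same file.
  # Assumption: "minikube ssh" returns the remote command's exit status.
  if minikube ssh "test -e '${PVC_TESTS_VM_PATH}/${name}'" >/dev/null 2>&1; then
    result="${result:+${result},}minikube"
  fi

  # Prints "host", "minikube", "host,minikube" or "nowhere".
  echo "${result:-nowhere}"
}
```

Calling where_is_file with the temporary file name from the failing run would show whether the file stops being visible between the bare metal and the minikube node. It does not cover the third layer, the path inside the driver pod.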


was (Author: shaneknapp):
{noformat}
Given path (/opt/spark/pv-tests/tmp5813668424419880732.txt) does not exist
{noformat}
this is supposed to be mounted in the minikube cluster, not on the bare metal.

> But something is mounted there:
{noformat}
 Mounts:
  /opt/spark/conf from spark-conf-volume-driver (rw)
  /opt/spark/pv-tests from data (rw)
...
{noformat}
> But my guess is that this is not the one which connects the host path 
> "PVC_TESTS_HOST_PATH" with the internal minikube mounted path (the one which 
> goes further to the driver/executor). This is why the locally created file is 
> missing.

 

it's properly mounting the local (bare metal) filesystem.  see:  
https://issues.apache.org/jira/browse/SPARK-34738?focusedCommentId=17312548&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17312548

> Upgrade Minikube and kubernetes cluster version on Jenkins
> --
>
> Key: SPARK-34738
> URL: https://issues.apache.org/jira/browse/SPARK-34738
> Project: Spark
>  Issue Type: Task
>  Components: jenkins, Kubernetes
>Affects Versions: 3.2.0
>Reporter: Attila Zsolt Piros
>Assignee: Shane Knapp
>Priority: Major
> Attachments: integration-tests.log
>
>
> [~shaneknapp] as we discussed [on the mailing 
> list|http://apache-spark-developers-list.1001551.n3.nabble.com/minikube-and-kubernetes-cluster-versions-for-integration-testing-td30856.html]
>  Minikube can be upgraded to the latest (v1.18.1) and kubernetes version 
> should be v1.17.3 (`minikube config set kubernetes-version v1.17.3`).
> [Here|https://github.com/apache/spark/pull/31829] is my PR which uses a new 
> method to configure the kubernetes client. Thanks in advance to use it for 
> testing on the Jenkins after the Minikube version is updated.
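
A small sketch of how the versions requested above could be checked on a worker. It assumes the minikube CLI is installed and that kubernetes-version has been set with "minikube config set" (otherwise "minikube config get" reports an error). The function name check_versions is hypothetical.

```shell
#!/usr/bin/env bash
# Versions named in the issue description above.
EXPECTED_MINIKUBE="v1.18.1"
EXPECTED_K8S="v1.17.3"

# Hypothetical helper: compare what minikube reports with the expected values.
# Prints one line per check and returns non-zero if any check fails.
check_versions() {
  local actual_mk actual_k8s status=0

  # "--short" prints only the version string.
  actual_mk="$(minikube version --short)"
  # Reads the value stored by "minikube config set kubernetes-version ...".
  actual_k8s="$(minikube config get kubernetes-version)"

  if [ "$actual_mk" = "$EXPECTED_MINIKUBE" ]; then
    echo "minikube ${actual_mk}: ok"
  else
    echo "minikube ${actual_mk}: expected ${EXPECTED_MINIKUBE}"
    status=1
  fi

  if [ "$actual_k8s" = "$EXPECTED_K8S" ]; then
    echo "kubernetes ${actual_k8s}: ok"
  else
    echo "kubernetes ${actual_k8s}: expected ${EXPECTED_K8S}"
    status=1
  fi

  return "$status"
}
```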



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins

2021-04-08 Thread Attila Zsolt Piros (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317381#comment-17317381
 ] 

Attila Zsolt Piros edited comment on SPARK-34738 at 4/8/21, 6:02 PM:
-

No worries. 

I have a guess. Check 
"resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala"
there is a match expression to select the node for mount:

{noformat}
  .withMatchExpressions(new NodeSelectorRequirementBuilder()
    .withKey("kubernetes.io/hostname")
    .withOperator("In")
    .withValues("minikube", "m01", "docker-for-desktop", "docker-desktop")
    .build())
{noformat}

This is very suspicious: docker-desktop and docker-for-desktop are listed 
there, which I assume are the node names behind the docker driver on Mac, but 
on those servers I doubt Docker Desktop is what provides docker. 
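
To test this guess, the hostname label of each node could be compared with the values in the selector. The sketch below assumes kubectl points at the minikube cluster under test. The list of names is copied from the snippet above, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Hostnames accepted by the node selector quoted above.
ALLOWED_HOSTNAMES=("minikube" "m01" "docker-for-desktop" "docker-desktop")

# Succeeds if the given hostname label would satisfy the "In" requirement.
hostname_is_allowed() {
  local candidate="$1" allowed
  for allowed in "${ALLOWED_HOSTNAMES[@]}"; do
    if [ "$candidate" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Prints each node's kubernetes.io/hostname label with a verdict.
# The dots inside the label key are escaped for kubectl's jsonpath syntax.
check_node_hostnames() {
  local label
  for label in $(kubectl get nodes \
      -o jsonpath='{.items[*].metadata.labels.kubernetes\.io/hostname}'); do
    if hostname_is_allowed "$label"; then
      echo "${label}: matches"
    else
      echo "${label}: does not match"
    fi
  done
}
```

If check_node_hostnames reports "does not match" for the only node, the persistent volume would have no node to bind to, which would be consistent with the missing mount.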


was (Author: attilapiros):
No worries. 

I have guess. Check 
"resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala"
there is a match expression to select the node for mount:

{noformat}
  .withMatchExpressions(new NodeSelectorRequirementBuilder()
    .withKey("kubernetes.io/hostname")
    .withOperator("In")
    .withValues("minikube", "m01", "docker-for-desktop", "docker-desktop")
    .build())
{noformat}

This is very suspicious. 







[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins

2021-04-08 Thread Shane Knapp (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17317334#comment-17317334
 ] 

Shane Knapp edited comment on SPARK-34738 at 4/8/21, 4:45 PM:
--

done (attached to the issue)

also, it's been so long since i've had to debug this stuff that i'd forgotten 
about those logs...  :facepalm:  :)


was (Author: shaneknapp):
done (attached to the issue)







[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins

2021-03-31 Thread Shane Knapp (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312696#comment-17312696
 ] 

Shane Knapp edited comment on SPARK-34738 at 3/31/21, 8:38 PM:
---

managed to snag the logs from the pod when it errored out:
{code:java}
++ id -u
+ myuid=185
++ id -g
+ mygid=0
+ set +e
++ getent passwd 185
+ uidentry=
+ set -e
+ '[' -z '' ']'
+ '[' -w /etc/passwd ']'
+ echo '185:x:185:0:anonymous uid:/opt/spark:/bin/false'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=172.17.0.3 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.DFSReadWriteTest local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar /opt/spark/pv-tests/tmp4595937990978494271.txt /opt/spark/pv-tests
21/03/31 20:26:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Given path (/opt/spark/pv-tests/tmp4595937990978494271.txt) does not exist
DFS Read-Write Test
Usage: localFile dfsDir
localFile - (string) local file to use in test
dfsDir - (string) DFS directory for read/write tests
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.{code}
this def caught my eye:  Given path 
(/opt/spark/pv-tests/tmp4595937990978494271.txt) does not exist

i sshed in to the cluster, was able (again) to confirm that mk was able to 
mount the PVC test dir on that worker in /tmp, and that the file 
tmp4595937990978494271.txt was visible and readable from within mk...  however 
/opt/spark/pv-tests/ wasn't visible within the mk cluster.
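
For repeated runs, the missing path can be pulled out of the driver log instead of being read by eye. This is a minimal sketch that relies only on the exact "Given path (...) does not exist" wording shown in the log above. The function name missing_paths_from_log is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a driver log on stdin and print every path that
# DFSReadWriteTest reported as missing, one per line.
missing_paths_from_log() {
  # Basic regular expression: the outer parentheses are literal, and the
  # escaped pair \( \) captures the path between them.
  sed -n 's/^Given path (\(.*\)) does not exist$/\1/p'
}
```

A possible use is to pipe the output of "kubectl logs" for the driver pod into missing_paths_from_log and then look for the printed path on the minikube node.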


was (Author: shaneknapp):
managed to snag the logs from the pod when it errored out:
{code:java}
++ id -u
+ myuid=185
++ id -g
+ mygid=0
+ set +e
++ getent passwd 185
+ uidentry=
+ set -e
+ '[' -z '' ']'
+ '[' -w /etc/passwd ']'
+ echo '185:x:185:0:anonymous uid:/opt/spark:/bin/false'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=172.17.0.3 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.DFSReadWriteTest local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar /opt/spark/pv-tests/tmp4595937990978494271.txt /opt/spark/pv-tests
21/03/31 20:26:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Given path (/opt/spark/pv-tests/tmp4595937990978494271.txt) does not exist
DFS Read-Write Test
Usage: localFile dfsDir
localFile - (string) local file to use in test
dfsDir - (string) DFS directory for read/write tests
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.{code}
this def caught my eye:  Given path 
(/opt/spark/pv-tests/tmp4595937990978494271.txt) does not exist

i sshed in to the cluster, was able (again) to confirm that mk was able to 
mount the PVC test dir on that worker, and that the file 
tmp4595937990978494271.txt was visible and readable from within mk...


[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins

2021-03-31 Thread Shane Knapp (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312517#comment-17312517
 ] 

Shane Knapp edited comment on SPARK-34738 at 3/31/21, 4:43 PM:
---

alright, sometimes these things go smoothly, sometimes not.

this is firmly in the 'not' camp.

after upgrading minikube and k8s, i was unable to mount a persistent volume 
when using the kvm2 driver.  much debugging ensued.  no progress was made and 
the error reported was that the minikube pod was unable to connect to the 
localhost and mount (Connection refused).

so, i decided to randomly try the docker minikube driver.  voila!  i'm now able 
to happily mount persistent volumes.

however, when running the k8s integration test, everything passes *except* the 
PVs w/local storage.

from [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-clone]:
{code:java}
- PVs with local storage *** FAILED ***
 The code passed to eventually never returned normally. Attempted 179 times 
over 3.00242447046 minutes. Last failure message: container not found 
("spark-kubernetes-driver"). (PVTestsSuite.scala:117){code}
i've never seen this error before, and apparently there aren't many things 

here's how we launch minikube and create the mount:
{code:java}
minikube --vm-driver=docker start --memory 6000 --cpus 8
minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L &
{code}
we're using ZFS on the bare metal, and minikube is complaining:
{code:java}
! docker is currently using the zfs storage driver, consider switching to overlay2 for better performance{code}
i'll continue to dig in to this today, but i'm currently blocked...
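
Because "minikube mount" is started in the background above, the tests may begin before the mount is in place. The sketch below polls for it. It assumes "minikube ssh" returns the remote command's exit status, that the mount shows up in /proc/mounts on the node with filesystem type 9p, and that the target path is given without a trailing slash. The function name wait_for_mount is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until the target appears as a 9p mount on the
# minikube node, or give up after a number of attempts.
# Usage: wait_for_mount TARGET [ATTEMPTS] [DELAY_SECONDS]
wait_for_mount() {
  local target="$1" attempts="${2:-30}" delay="${3:-1}" i

  for ((i = 1; i <= attempts; i++)); do
    # /proc/mounts fields: device, mount point, filesystem type, options...
    if minikube ssh "grep -qs ' ${target} 9p ' /proc/mounts" >/dev/null 2>&1; then
      echo "mounted after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done

  echo "not mounted after ${attempts} attempt(s)" >&2
  return 1
}
```

For example, running wait_for_mount "${PVC_TESTS_VM_PATH}" right after the "minikube mount ... &" line would hold the job until the mount is visible, or fail it early if the mount never appears.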


was (Author: shaneknapp):
alright, sometimes these things go smoothly, sometimes not.

this is firmly in the 'not' camp.

after upgrading minikube and k8s, i was unable to mount a persistent volume 
when using the kvm2 driver.  much debugging ensued.  no progress was made.

so, i decided to randomly try the docker minikube driver.  voila!  i'm now able 
to happily mount persistent volumes.

however, when running the k8s integration test, everything passes *except* the 
PVs w/local storage.

from [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-clone]:
{code:java}
- PVs with local storage *** FAILED ***
 The code passed to eventually never returned normally. Attempted 179 times 
over 3.00242447046 minutes. Last failure message: container not found 
("spark-kubernetes-driver"). (PVTestsSuite.scala:117){code}
i've never seen this error before, and apparently there aren't many things 

here's how we launch minikube and create the mount:
{code:java}
minikube --vm-driver=docker start --memory 6000 --cpus 8
minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L &
{code}
we're using ZFS on the bare metal, and minikube is complaining:
{code:java}
! docker is currently using the zfs storage driver, consider switching to overlay2 for better performance{code}
i'll continue to dig in to this today, but i'm currently blocked...







[jira] [Comment Edited] (SPARK-34738) Upgrade Minikube and kubernetes cluster version on Jenkins

2021-03-31 Thread Shane Knapp (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312517#comment-17312517
 ] 

Shane Knapp edited comment on SPARK-34738 at 3/31/21, 4:20 PM:
---

alright, sometimes these things go smoothly, sometimes not.

this is firmly in the 'not' camp.

after upgrading minikube and k8s, i was unable to mount a persistent volume 
when using the kvm2 driver.  much debugging ensued.  no progress was made.

so, i decided to randomly try the docker minikube driver.  voila!  i'm now able 
to happily mount persistent volumes.

however, when running the k8s integration test, everything passes *except* the 
PVs w/local storage.

from [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-clone]:
{code:java}
- PVs with local storage *** FAILED ***
 The code passed to eventually never returned normally. Attempted 179 times 
over 3.00242447046 minutes. Last failure message: container not found 
("spark-kubernetes-driver"). (PVTestsSuite.scala:117){code}
i've never seen this error before, and apparently there aren't many things 

here's how we launch minikube and create the mount:
{code:java}
minikube --vm-driver=docker start --memory 6000 --cpus 8
minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L &
{code}
we're using ZFS on the bare metal, and minikube is complaining:
{code:java}
! docker is currently using the zfs storage driver, consider switching to overlay2 for better performance{code}
i'll continue to dig in to this today, but i'm currently blocked...


was (Author: shaneknapp):
alright, sometimes these things go smoothly, sometimes not.

this is firmly in the 'not' camp.

after upgrading minikube and k8s, i was unable to mount a persistent volume 
when using the kvm2 driver.  much debugging ensued.  no progress was made.

so, i decided to randomly try the docker minikube driver.  voila!  i'm now able 
to happily mount persistent volumes.

however, when running the k8s integration test, everything passes *except* the 
PVs w/local storage.

from https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-k8s-clone:
{code:java}
- PVs with local storage *** FAILED ***
 The code passed to eventually never returned normally. Attempted 179 times 
over 3.00242447046 minutes. Last failure message: container not found 
("spark-kubernetes-driver"). (PVTestsSuite.scala:117){code}
i've never seen this error before, and apparently there aren't many things 

here's how we launch minikube and create the mount:

 
{code:java}
minikube --vm-driver=docker start --memory 6000 --cpus 8
minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L &
{code}
i'll continue to dig in to this today, but i'm currently blocked...



