[spark] branch master updated (d5563f3 -> 4e1e33b)

2021-10-25 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d5563f3  [SPARK-36956][MLLIB] model prediction in .mllib avoid 
conversion to breeze vector
 add 4e1e33b  [SPARK-37011][PYTHON][BUILD] update flake8 on jenkins workers

No new revisions were added by this update.

Summary of changes:
 .../files/python_environments/py36.txt | 49 --
 .../files/python_environments/spark-py36-spec.txt  | 18 
 dev/lint-python|  3 +-
 3 files changed, 10 insertions(+), 60 deletions(-)
 delete mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/py36.txt

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark-website] branch asf-site updated: updating local k8s/minikube testing instructions

2021-08-02 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new d60611b  updating local k8s/minikube testing instructions
d60611b is described below

commit d60611b5c464db0fb99ca072a9d8b55e824ca7c2
Author: shane knapp 
AuthorDate: Mon Aug 2 10:15:46 2021 -0700

updating local k8s/minikube testing instructions

a small update to the k8s/minikube integration test instructions

Author: shane knapp 

Closes #351 from shaneknapp/updating-k8s-docs.
---
 developer-tools.md| 9 ++---
 site/developer-tools.html | 9 ++---
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/developer-tools.md b/developer-tools.md
index 9551533..bf3ee6e 100644
--- a/developer-tools.md
+++ b/developer-tools.md
@@ -169,8 +169,8 @@ Please check other available options via 
`python/run-tests[-with-coverage] --hel
 
 If you have made changes to the K8S bindings in Apache Spark, it would behoove 
you to test locally before submitting a PR.  This is relatively simple to do, 
but it will require a local (to you) installation of 
[minikube](https://kubernetes.io/docs/setup/minikube/).  Due to how minikube 
interacts with the host system, please be sure to set things up as follows:
 
-- minikube version v1.7.3 (or greater)
-- You must use a VM driver!  Running minikube with the `--vm-driver=none` 
option requires that the user launching minikube/k8s have root access.  Our 
Jenkins workers use the [kvm2](https://minikube.sigs.k8s.io/docs/drivers/kvm2/) 
drivers.  More details [here](https://minikube.sigs.k8s.io/docs/drivers/).
+- minikube version v1.18.1 (or greater)
+- You must use a VM driver!  Running minikube with the `--vm-driver=none` 
option requires that the user launching minikube/k8s have root access, which 
could impact how the tests are run.  Our Jenkins workers use the default 
[docker](https://minikube.sigs.k8s.io/docs/drivers/docker/) drivers.  More 
details [here](https://minikube.sigs.k8s.io/docs/drivers/).
 - kubernetes version v1.17.3 (can be set by executing `minikube config set 
kubernetes-version v1.17.3`)
 - the current kubernetes context must be minikube's default context (called 
'minikube'). This can be selected by `minikube kubectl -- config use-context 
minikube`. This is only needed when after minikube is started another 
kubernetes context is selected.
 
@@ -196,8 +196,11 @@ PVC_TMP_DIR=$(mktemp -d)
 export PVC_TESTS_HOST_PATH=$PVC_TMP_DIR
 export PVC_TESTS_VM_PATH=$PVC_TMP_DIR
 
-minikube --vm-driver= start --memory 6000 --cpus 8
 minikube config set kubernetes-version v1.17.3
+minikube --vm-driver= start --memory 6000 --cpus 8
+
+# for macos only (see https://github.com/apache/spark/pull/32793):
+# minikube ssh "sudo useradd spark -u 185 -g 0 -m -s /bin/bash"
 
 minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} 
--9p-version=9p2000.L --gid=0 --uid=185 &; MOUNT_PID=$!
 
diff --git a/site/developer-tools.html b/site/developer-tools.html
index 5558945..81d64c7 100644
--- a/site/developer-tools.html
+++ b/site/developer-tools.html
@@ -348,8 +348,8 @@ Generating HTML files for PySpark coverage under 
/.../spark/python/test_coverage
 If you have made changes to the K8S bindings in Apache Spark, it would 
behoove you to test locally before submitting a PR.  This is relatively simple 
to do, but it will require a local (to you) installation of <a href="https://kubernetes.io/docs/setup/minikube/">minikube</a>.  Due to how 
minikube interacts with the host system, please be sure to set things up as 
follows:
 
 
-  minikube version v1.7.3 (or greater)
-  You must use a VM driver!  Running minikube with the --vm-driver=none option 
requires that the user launching minikube/k8s have root access.  Our Jenkins 
workers use the <a href="https://minikube.sigs.k8s.io/docs/drivers/kvm2/">kvm2</a> drivers.  More 
details <a href="https://minikube.sigs.k8s.io/docs/drivers/">here</a>.
+  minikube version v1.18.1 (or greater)
+  You must use a VM driver!  Running minikube with the --vm-driver=none option 
requires that the user launching minikube/k8s have root access, which could 
impact how the tests are run.  Our Jenkins workers use the default <a href="https://minikube.sigs.k8s.io/docs/drivers/docker/">docker</a> drivers.  
More details <a href="https://minikube.sigs.k8s.io/docs/drivers/">here</a>.
   kubernetes version v1.17.3 (can be set by executing minikube config set 
kubernetes-version v1.17.3)
   the current kubernetes context must be minikube's default context 
(called minikube). This can be selected by minikube kubectl -- config 
use-context minikube. This is only needed when after minikube is started 
another kubernetes context is selected.
 
@@ -374,8 +374,11 @@ export 
TARBALL_TO_TEST=($(pwd)/spark-*${DATE}-${REVISION}.tgz)
 export PVC_TESTS_HOST_PATH=$PVC_TMP_DIR
 export PVC_TESTS_VM_PATH=

[spark] branch master updated: [SPARK-35430][K8S] Switch on "PVs with local storage" integration test on Docker driver

2021-08-02 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7b90fd2  [SPARK-35430][K8S] Switch on "PVs with local storage" 
integration test on Docker driver
7b90fd2 is described below

commit 7b90fd2ca79b9a1fec5fca0bdcc169c7962ad880
Author: attilapiros 
AuthorDate: Mon Aug 2 09:17:29 2021 -0700

[SPARK-35430][K8S] Switch on "PVs with local storage" integration test on 
Docker driver

### What changes were proposed in this pull request?

Switching the "PVs with local storage" integration test back on for the Docker 
driver.

I have analyzed why this test was failing on my machine (hopefully the root 
cause is OS-agnostic).
It failed because of the mounting of the host directory into the Minikube 
node using `--uid=185` (the Spark user's uid):

```
$ minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} 
--9p-version=9p2000.L --gid=0 --uid=185 &; MOUNT_PID=$!
```

That uid refers to a nonexistent user. See the number of occurrences of 185 
in `/etc/passwd`:

```
$ minikube ssh "grep -c 185 /etc/passwd"
0
```
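
The same existence check can be written with Python's stdlib `pwd` module; this 
is a standalone sketch (to reproduce the finding it would have to run on the 
Minikube node itself, not the host):

```python
import pwd

def uid_exists(uid: int) -> bool:
    """Return True if the uid has an entry in the local passwd database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False

print(uid_exists(0))    # True: root always exists
print(uid_exists(185))  # False on a stock minikube node, per the grep above
```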

This leads to a permission-denied error. Skipping `--uid=185` won't help, 
although the path will be listable before the test execution:

```
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
╰─$   Mounting host path 
/var/folders/t_/fr_vqcyx23vftk81ftz1k5hwgn/T/tmp.k9X4Gecv into VM as 
/var/folders/t_/fr_vqcyx23vftk81ftz1k5hwgn/T/tmp.k9X4Gecv ...
▪ Mount type:
▪ User ID:  docker
▪ Group ID: 0
▪ Version:  9p2000.L
▪ Message Size: 262144
▪ Permissions:  755 (-rwxr-xr-x)
▪ Options:  map[]
▪ Bind Address: 127.0.0.1:51740
  Userspace file server: ufs starting

╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
╰─$ minikube ssh "ls 
/var/folders/t_/fr_vqcyx23vftk81ftz1k5hwgn/T/tmp.k9X4Gecv"
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
╰─$
```

But the test will still fail, and after its execution `dmesg` shows the 
following error:
```
[13670.493359] bpfilter: Loaded bpfilter_umh pid 66153
[13670.493363] bpfilter: write fail -32
[13670.530737] bpfilter: Loaded bpfilter_umh pid 66155
...
```

This `bpfilter` is a firewall module, and we are back to a permission-denied 
error when we try to list the mounted directory.

The solution is to add a `spark` user with uid 185 once minikube is 
started.

**So this must be added to the Jenkins job (and the mount should use `--gid=0 
--uid=185`)**:

```
$ minikube ssh "sudo useradd spark -u 185 -g 0 -m -s /bin/bash"
```
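
Taken together, a local reproduction of the fixed setup would look roughly 
like the following sketch; it assumes a working minikube install, and the 
memory/CPU sizes and mount path are illustrative:

```shell
# Start minikube, then create a user matching the Spark image's uid (185)
# and gid (0) so the 9p mount's --uid/--gid map to a real account in the node.
minikube start --memory 6000 --cpus 8
minikube ssh "sudo useradd spark -u 185 -g 0 -m -s /bin/bash"

# Mount the host path used by the PV tests with the matching uid/gid.
PVC_TMP_DIR=$(mktemp -d)
minikube mount "${PVC_TMP_DIR}:${PVC_TMP_DIR}" \
  --9p-version=9p2000.L --gid=0 --uid=185 &
MOUNT_PID=$!
```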

### Why are the changes needed?

This integration test is needed to validate the PVs feature.

### Does this PR introduce _any_ user-facing change?

No. It is just testing.

### How was this patch tested?

Running the test locally:
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided 
SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
```

The "PVs with local storage" test was successful, but during the next test, 
`Launcher client dependencies`, minio stops the test execution on Mac (only on Mac):
```
21/06/29 04:33:32.449 ScalaTest-main-running-KubernetesSuite INFO 
ProcessUtils:   Starting tunnel for service minio-s3.
21/06/29 04:33:33.425 ScalaTest-main-running-KubernetesSuite INFO 
ProcessUtils: 
|--|--|-||
21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO 
ProcessUtils: |NAMESPACE |   NAME   | TARGET PORT | 
 URL   |
21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO 
ProcessUtils: 
|--|--|-||

[spark] branch master updated (d506815 -> ad528a0)

2021-07-21 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d506815  [SPARK-36188][PYTHON] Add categories setter to 
CategoricalAccessor and CategoricalIndex
 add ad528a0  [SPARK-32797][SPARK-32391][SPARK-33242][SPARK-32666][ANSIBLE] 
updating a bunch of python packages

No new revisions were added by this update.

Summary of changes:
 .../files/python_environments/spark-py36-spec.txt  | 64 +-
 1 file changed, 61 insertions(+), 3 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (733e85f1 -> 2c94fbc)

2021-06-30 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 733e85f1 [SPARK-35953][SQL] Support extracting date fields from 
timestamp without time zone
 add 2c94fbc  initial commit for skeleton ansible for jenkins worker config

No new revisions were added by this update.

Summary of changes:
 dev/.rat-excludes  |   1 +
 dev/ansible-for-test-node/README.md|  25 +++
 .../deploy-jenkins-worker.yml  |   8 +
 dev/ansible-for-test-node/roles/common/README.md   |   4 +
 .../roles/common/tasks/main.yml|   4 +
 .../roles/common/tasks/setup_local_userspace.yml   |   8 +
 .../roles/common/tasks/system_packages.yml |  73 
 .../roles/jenkins-worker/README.md |  15 ++
 .../roles/jenkins-worker/defaults/main.yml |  30 
 .../files/python_environments/base-py3-pip.txt |   3 +
 .../files/python_environments/base-py3-spec.txt|  21 +++
 .../files/python_environments/py36.txt |  49 ++
 .../files/python_environments/spark-py2-pip.txt|   8 +
 .../files/python_environments/spark-py36-spec.txt  |  61 +++
 .../files/python_environments/spark-py3k-spec.txt  |  42 +
 .../files/scripts/jenkins-gitcache-cron|   7 +
 .../files/util_scripts/kill_zinc_nailgun.py|  60 +++
 .../files/util_scripts/post_github_pr_comment.py   |  81 +
 .../files/util_scripts/session_lock_resource.py| 152 +
 .../roles/jenkins-worker/files/worker-limits.conf  |   5 +
 .../roles/jenkins-worker/tasks/cleanup.yml |  12 ++
 .../jenkins-worker/tasks/install_anaconda.yml  |  79 +
 .../tasks/install_build_packages.yml   |  21 +++
 .../roles/jenkins-worker/tasks/install_docker.yml  |  33 
 .../jenkins-worker/tasks/install_minikube.yml  |  16 ++
 .../tasks/install_spark_build_packages.yml | 183 +
 .../jenkins-worker/tasks/jenkins_userspace.yml | 119 ++
 .../roles/jenkins-worker/tasks/main.yml|  22 +++
 .../roles/jenkins-worker/vars/main.yml |   9 +
 dev/tox.ini|   4 +-
 30 files changed, 1153 insertions(+), 2 deletions(-)
 create mode 100644 dev/ansible-for-test-node/README.md
 create mode 100644 dev/ansible-for-test-node/deploy-jenkins-worker.yml
 create mode 100644 dev/ansible-for-test-node/roles/common/README.md
 create mode 100644 dev/ansible-for-test-node/roles/common/tasks/main.yml
 create mode 100644 
dev/ansible-for-test-node/roles/common/tasks/setup_local_userspace.yml
 create mode 100644 
dev/ansible-for-test-node/roles/common/tasks/system_packages.yml
 create mode 100644 dev/ansible-for-test-node/roles/jenkins-worker/README.md
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/defaults/main.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/base-py3-pip.txt
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/base-py3-spec.txt
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/py36.txt
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/spark-py2-pip.txt
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/spark-py36-spec.txt
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/python_environments/spark-py3k-spec.txt
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/scripts/jenkins-gitcache-cron
 create mode 100755 
dev/ansible-for-test-node/roles/jenkins-worker/files/util_scripts/kill_zinc_nailgun.py
 create mode 100755 
dev/ansible-for-test-node/roles/jenkins-worker/files/util_scripts/post_github_pr_comment.py
 create mode 100755 
dev/ansible-for-test-node/roles/jenkins-worker/files/util_scripts/session_lock_resource.py
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/files/worker-limits.conf
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/cleanup.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/install_anaconda.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/install_build_packages.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/install_docker.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/install_minikube.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/install_spark_build_packages.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/jenkins_userspace.yml
 create mode 100644 
dev/ansible-for-test-node/roles/jenkins-worker/tasks/main.yml
 create mode 100644 dev/ansible-for-test-node

[spark] branch branch-3.0 updated (efae8b6 -> 8eedc41)

2020-11-25 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.


from efae8b6  [SPARK-33535][INFRA][TESTS] Export LANG to en_US.UTF-8 in 
run-tests-jenkins script
 add 8eedc41  [SPARK-33565][PYTHON][BUILD][3.0] Remove py38 spark3

No new revisions were added by this update.

Summary of changes:
 python/run-tests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (1de3fc4 -> c529426)

2020-11-25 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 1de3fc4  [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2
 add c529426  [SPARK-33565][BUILD][PYTHON] remove python3.8 and fix breakage

No new revisions were added by this update.

Summary of changes:
 python/run-tests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated (8bde6ed -> 7f522d5)

2020-05-28 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8bde6ed  [SPARK-31839][TESTS] Delete duplicate code in castsuit
 add 7f522d5  [BUILD][INFRA] bump the timeout to match the jenkins PRB

No new revisions were added by this update.

Summary of changes:
 dev/run-tests-jenkins.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [BUILD][INFRA] bump the timeout to match the jenkins PRB

2020-05-28 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 9580266  [BUILD][INFRA] bump the timeout to match the jenkins PRB
9580266 is described below

commit 9580266c8ee910dbf9a37dd6ff8baa9a94bca38e
Author: shane knapp 
AuthorDate: Thu May 28 14:25:49 2020 -0700

[BUILD][INFRA] bump the timeout to match the jenkins PRB

### What changes were proposed in this pull request?

bump the timeout to match what's set in jenkins

### Why are the changes needed?

tests be timing out!

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

via jenkins

Closes #28666 from shaneknapp/increase-jenkins-timeout.

Authored-by: shane knapp 
Signed-off-by: shane knapp 
(cherry picked from commit 9e68affd13a3875b92f0700b8ab7c9d902f1a08c)
Signed-off-by: shane knapp 
---
 dev/run-tests-jenkins.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/run-tests-jenkins.py b/dev/run-tests-jenkins.py
index 72e32d4..13be959 100755
--- a/dev/run-tests-jenkins.py
+++ b/dev/run-tests-jenkins.py
@@ -198,7 +198,7 @@ def main():
 # format: http://linux.die.net/man/1/timeout
 # must be less than the timeout configured on Jenkins. Usually Jenkins's 
timeout is higher
 # then this. Please consult with the build manager or a committer when it 
should be increased.
-tests_timeout = "400m"
+tests_timeout = "500m"
 
 # Array to capture all test names to run on the pull request. These tests 
are represented
 # by their file equivalents in the dev/tests/ directory.
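
For reference, `tests_timeout` is handed to the coreutils `timeout` command 
linked in the comment above; when the limit is hit, `timeout` kills the wrapped 
process and exits with status 124. A standalone illustration (not part of the 
Spark scripts):

```shell
# `timeout LIMIT CMD` runs CMD and kills it once LIMIT elapses, exiting 124.
# This is why tests_timeout must stay below Jenkins's own timeout: the inner
# timeout should fire first so the failure is reported cleanly.
timeout 1s sleep 3
echo "exit status: $?"   # prints "exit status: 124"
```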


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (8bbb666 -> 9e68aff)

2020-05-28 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8bbb666  [SPARK-25351][PYTHON][TEST][FOLLOWUP] Fix test assertions to 
be consistent
 add 9e68aff  [BUILD][INFRA] bump the timeout to match the jenkins PRB

No new revisions were added by this update.

Summary of changes:
 dev/run-tests-jenkins.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (c0e9f9f -> 4d23938)

2020-01-09 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from c0e9f9f  [SPARK-30459][SQL] Fix ignoreMissingFiles/ignoreCorruptFiles 
in data source v2
 add 4d23938  [MINOR][SQL][TEST-HIVE1.2] Fix scalastyle error due to length 
line in hive-1.2 profile

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala| 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5496e98 -> 708cf16)

2019-12-03 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5496e98  [SPARK-30109][ML] PCA use BLAS.gemv for sparse vectors
 add 708cf16  [SPARK-30111][K8S] Apt-get update to fix debian issues

No new revisions were added by this update.

Summary of changes:
 .../kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile | 2 +-
 .../docker/src/main/dockerfiles/spark/bindings/python/Dockerfile   | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (e46e487 -> 04e99c1)

2019-11-14 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e46e487  [SPARK-29682][SQL] Resolve conflicting attributes in Expand 
correctly
 add 04e99c1  [SPARK-29672][PYSPARK] update spark testing framework to use 
python3

No new revisions were added by this update.

Summary of changes:
 dev/pip-sanity-check.py|  2 --
 dev/run-pip-tests  | 23 +--
 dev/run-tests  |  6 +++---
 dev/run-tests-jenkins  |  8 +---
 dev/run-tests-jenkins.py   |  3 +--
 dev/run-tests.py   |  5 ++---
 dev/sparktestsupport/shellutils.py |  6 ++
 python/pyspark/context.py  |  2 --
 python/pyspark/version.py  |  2 +-
 python/run-tests   |  8 +++-
 python/run-tests.py| 17 ++---
 python/setup.py|  7 +++
 12 files changed, 43 insertions(+), 46 deletions(-)
 mode change 100644 => 100755 python/setup.py


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (e696c36 -> 52186af)

2019-10-14 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e696c36  [SPARK-29442][SQL] Set `default` mode should override the 
existing mode
 add 52186af  [SPARK-25152][K8S] Enable SparkR Integration Tests for 
Kubernetes

No new revisions were added by this update.

Summary of changes:
 .../integration-tests/scripts/setup-integration-test-env.sh   | 4 ++--
 .../org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-28701][INFRA][FOLLOWUP] Fix the key error when looking in os.environ

2019-08-26 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 84d4f94  [SPARK-28701][INFRA][FOLLOWUP] Fix the key error when looking 
in os.environ
84d4f94 is described below

commit 84d4f945969e199a5d3fb658864e494b88d15f3c
Author: shane knapp 
AuthorDate: Mon Aug 26 12:40:31 2019 -0700

[SPARK-28701][INFRA][FOLLOWUP] Fix the key error when looking in os.environ

### What changes were proposed in this pull request?

i broke run-tests.py for non-PRB builds in this PR:
https://github.com/apache/spark/pull/25423

### Why are the changes needed?

to fix what i broke

### Does this PR introduce any user-facing change?
no

### How was this patch tested?
the build system will test this

Closes #25585 from shaneknapp/fix-run-tests.

Authored-by: shane knapp 
Signed-off-by: shane knapp 
---
 dev/run-tests.py | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/dev/run-tests.py b/dev/run-tests.py
index a338667..ea51570 100755
--- a/dev/run-tests.py
+++ b/dev/run-tests.py
@@ -405,10 +405,11 @@ def run_scala_tests(build_tool, hadoop_version, 
test_modules, excluded_tags):
 test_profiles += ['-Dtest.exclude.tags=' + ",".join(excluded_tags)]
 
 # set up java11 env if this is a pull request build with 'test-java11' in 
the title
-if "test-java11" in os.environ["ghprbPullTitle"].lower():
-os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1"
-os.environ["PATH"] = "%s/bin:%s" % (os.environ["JAVA_HOME"], 
os.environ["PATH"])
-test_profiles += ['-Djava.version=11']
+if "ghprbPullTitle" in os.environ:
+if "test-java11" in os.environ["ghprbPullTitle"].lower():
+os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1"
+os.environ["PATH"] = "%s/bin:%s" % (os.environ["JAVA_HOME"], 
os.environ["PATH"])
+test_profiles += ['-Djava.version=11']
 
 if build_tool == "maven":
 run_scala_tests_maven(test_profiles)




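The fix above boils down to never indexing `os.environ` with a key that only pull-request-builder (PRB) jobs set. As a hedged, stand-alone sketch of that pattern (not the actual `run-tests.py` code; the function name and sample title are illustrative), `dict.get` with a default avoids the `KeyError` entirely:

```python
import os

def java11_test_profiles(env=None):
    """Return extra Maven profiles when a PR title asks for Java 11.

    Illustrative sketch of the guarded lookup: .get() makes a missing
    ghprbPullTitle (any non-PRB build) fall back to "" instead of
    raising KeyError.
    """
    env = os.environ if env is None else env
    profiles = []
    if "test-java11" in env.get("ghprbPullTitle", "").lower():
        profiles.append("-Djava.version=11")
    return profiles

print(java11_test_profiles({"ghprbPullTitle": "[WIP] test-java11 please"}))
# → ['-Djava.version=11']
```

The same result could be had with an explicit `if "ghprbPullTitle" in os.environ:` guard, as the committed patch does; `.get` just collapses the two checks into one expression.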

[spark] branch branch-2.3 updated: [SPARK-25079][PYTHON][BRANCH-2.3] update python3 executable to 3.6.x

2019-04-19 Thread shaneknapp

shaneknapp pushed a commit to branch branch-2.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.3 by this push:
 new a85ab12  [SPARK-25079][PYTHON][BRANCH-2.3] update python3 executable 
to 3.6.x
a85ab12 is described below

commit a85ab120e3d29a323e6d28aa307d4c20ee5f2c6c
Author: shane knapp 
AuthorDate: Fri Apr 19 09:45:40 2019 -0700

[SPARK-25079][PYTHON][BRANCH-2.3] update python3 executable to 3.6.x

## What changes were proposed in this pull request?

have jenkins test against python3.6 (instead of 3.4).

## How was this patch tested?

extensive testing on both the centos and ubuntu jenkins workers revealed 
that 2.3 probably doesn't like python 3.6... :(

NOTE: this is just for branch-2.3

PLEASE DO NOT MERGE

Author: shane knapp 

Closes #24380 from shaneknapp/update-python-executable-2.3.
---
 python/run-tests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/run-tests.py b/python/run-tests.py
index 3539c76..d571855 100755
--- a/python/run-tests.py
+++ b/python/run-tests.py
@@ -114,7 +114,7 @@ def run_individual_python_test(test_name, pyspark_python):
 
 
 def get_default_python_executables():
-python_execs = [x for x in ["python2.7", "python3.4", "pypy"] if which(x)]
+python_execs = [x for x in ["python2.7", "python3.6", "pypy"] if which(x)]
 if "python2.7" not in python_execs:
 LOGGER.warning("Not testing against `python2.7` because it could not 
be found; falling"
" back to `python` instead")




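The one-line change above swaps `python3.4` for `python3.6` in the candidate list that `get_default_python_executables()` filters through `which()`. A simplified, hedged sketch of that discovery pattern using the standard library's `shutil.which` (the real script only falls back when `python2.7` specifically is missing; this version generalizes that for illustration):

```python
import shutil

def default_python_executables(candidates=("python2.7", "python3.6", "pypy")):
    """Simplified sketch of run-tests.py's interpreter discovery."""
    # shutil.which plays the role of the script's which() helper:
    # keep only interpreters actually present on PATH.
    found = [name for name in candidates if shutil.which(name)]
    if not found:
        # Nothing matched; fall back to the unversioned interpreter,
        # much as the script warns it will do.
        found = ["python"]
    return found
```

Bumping the tested minor version is then just a change to the `candidates` tuple, which is exactly the shape of the committed diff.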

[spark] branch branch-2.4 updated: [SPARK-25079][PYTHON][BRANCH-2.4] update python3 executable to 3.6.x

2019-04-19 Thread shaneknapp

shaneknapp pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new eaa88ae  [SPARK-25079][PYTHON][BRANCH-2.4] update python3 executable 
to 3.6.x
eaa88ae is described below

commit eaa88ae5237b23fb7497838f3897a64641efe383
Author: shane knapp 
AuthorDate: Fri Apr 19 09:44:06 2019 -0700

[SPARK-25079][PYTHON][BRANCH-2.4] update python3 executable to 3.6.x

## What changes were proposed in this pull request?

have jenkins test against python3.6 (instead of 3.4).

## How was this patch tested?

extensive testing on both the centos and ubuntu jenkins workers revealed 
that 2.4 doesn't like python 3.6...  :(

NOTE: this is just for branch-2.4

PLEASE DO NOT MERGE

Closes #24379 from shaneknapp/update-python-executable.

Authored-by: shane knapp 
Signed-off-by: shane knapp 
---
 python/run-tests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/run-tests.py b/python/run-tests.py
index ccbdfac..921fdc9 100755
--- a/python/run-tests.py
+++ b/python/run-tests.py
@@ -162,7 +162,7 @@ def run_individual_python_test(target_dir, test_name, 
pyspark_python):
 
 
 def get_default_python_executables():
-python_execs = [x for x in ["python2.7", "python3.4", "pypy"] if which(x)]
+python_execs = [x for x in ["python2.7", "python3.6", "pypy"] if which(x)]
 if "python2.7" not in python_execs:
 LOGGER.warning("Not testing against `python2.7` because it could not 
be found; falling"
" back to `python` instead")





[spark-website] branch asf-site updated: testing how-to for k8s changes

2019-03-28 Thread shaneknapp

shaneknapp pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 5f4985d  testing how-to for k8s changes
5f4985d is described below

commit 5f4985db6340efa0107ebe34bb285ab8ceb74f65
Author: shane knapp 
AuthorDate: Thu Mar 28 11:03:09 2019 -0700

testing how-to for k8s changes

i think that this will be quite useful.  :)

Author: shane knapp 

Closes #186 from shaneknapp/add-k8s-testing-instructions.
---
 developer-tools.md| 48 ++
 site/developer-tools.html | 49 +++
 2 files changed, 97 insertions(+)

diff --git a/developer-tools.md b/developer-tools.md
index 43ad445..00d57cd 100644
--- a/developer-tools.md
+++ b/developer-tools.md
@@ -175,6 +175,54 @@ You can check the coverage report visually by HTMLs under 
`/.../spark/python/tes
 
 Please check other available options via `python/run-tests[-with-coverage] 
--help`.
 
+Testing K8S
+
+If you have made changes to the K8S bindings in Apache Spark, it would behoove 
you to test locally before submitting a PR.  This is relatively simple to do, 
but it will require a local (to you) installation of 
[minikube](https://kubernetes.io/docs/setup/minikube/).  Due to how minikube 
interacts with the host system, please be sure to set things up as follows:
+
+- minikube version v0.34.1 (or greater, but backwards-compatibility between 
versions is spotty)
+- You must use a VM driver!  Running minikube with the `--vm-driver=none` 
option requires that the user launching minikube/k8s have root access.  Our 
Jenkins workers use the 
[kvm2](https://github.com/kubernetes/minikube/blob/master/docs/drivers.md#kvm2-driver)
 drivers.  More details 
[here](https://github.com/kubernetes/minikube/blob/master/docs/drivers.md).
+- kubernetes version v1.13.3 (can be set by executing `minikube config set 
kubernetes-version v1.13.3`)
+
+Once you have minikube properly set up, and have successfully completed the 
[quick start](https://kubernetes.io/docs/setup/minikube/#quickstart), you can 
test your changes locally.  All subsequent commands should be run from your 
root spark/ repo directory:
+
+1) Build a tarball to test against:
+
+```
+export DATE=`date "+%Y%m%d"`
+export REVISION=`git rev-parse --short HEAD`
+export ZINC_PORT=$(python -S -c "import random; print 
random.randrange(3030,4030)")
+
+./dev/make-distribution.sh --name ${DATE}-${REVISION} --pip --tgz 
-DzincPort=${ZINC_PORT} \
+ -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
+```
+
+2) Use that tarball and run the K8S integration tests:
+
+```
+PVC_TMP_DIR=$(mktemp -d)
+export PVC_TESTS_HOST_PATH=$PVC_TMP_DIR
+export PVC_TESTS_VM_PATH=$PVC_TMP_DIR
+
+minikube --vm-driver=<your vm driver> start --memory 6000 --cpus 8
+
+minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} 
--9p-version=9p2000.L --gid=0 --uid=185 &
+
+MOUNT_PID=$(jobs -rp)
+
+kubectl create clusterrolebinding serviceaccounts-cluster-admin 
--clusterrole=cluster-admin --group=system:serviceaccounts || true
+
+./resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh
 \
+--spark-tgz ${WORKSPACE}/spark-*.tgz
+
+kill -9 $MOUNT_PID
+minikube stop
+```
+
+After the run is completed, the integration test logs are saved here:  
`./resource-managers/kubernetes/integration-tests/target/integration-tests.log`
+
+Getting logs from the pods and containers directly is an exercise left to the 
reader.
+
+Kubernetes, and more importantly, minikube have rapid release cycles, and 
point releases have been found to be buggy and/or break older and existing 
functionality.  If you are having trouble getting tests to pass on Jenkins, but 
locally things work, don't hesitate to file a Jira issue.
 
 ScalaTest Issues
 
diff --git a/site/developer-tools.html b/site/developer-tools.html
index e676d7b..17da11c 100644
--- a/site/developer-tools.html
+++ b/site/developer-tools.html
@@ -353,6 +353,55 @@ Generating HTML files for PySpark coverage under 
/.../spark/python/test_coverage
 
 Please check other available options via 
python/run-tests[-with-coverage] --help.
 
+Testing K8S
+
+If you have made changes to the K8S bindings in Apache Spark, it would 
behoove you to test locally before submitting a PR.  This is relatively simple 
to do, but it will require a local (to you) installation of 
<a href="https://kubernetes.io/docs/setup/minikube/">minikube</a>.  Due to how 
minikube interacts with the host system, please be sure to set things up as 
follows:
+
+
+  minikube version v0.34.1 (or greater, but backwards-compatibility 
between versions is spotty)
+  You must use a VM driver!  Running minikube with the 
<code>--vm-driver=none</code> option requires that the user launching 
minikube/k8s have root access.  Our Je

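A small note on the tarball-build step in the instructions above: the `ZINC_PORT` one-liner uses Python 2's statement form of `print` and fails under Python 3. As an editorial sketch (not part of the committed docs), the port selection it performs is simply:

```python
import random

# Pick a port for the zinc compile server from the same
# ephemeral range the docs use (3030 inclusive to 4030 exclusive).
zinc_port = random.randrange(3030, 4030)
print(zinc_port)
```

From the shell, a form that works under both Python 2 and 3 would be `export ZINC_PORT=$(python -c "import random; print(random.randrange(3030, 4030))")`.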
[spark] branch master updated: [SPARK-24902][K8S] Add PV integration tests

2019-03-27 Thread shaneknapp

shaneknapp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 39577a2  [SPARK-24902][K8S] Add PV integration tests
39577a2 is described below

commit 39577a27a0b58fd75b41d24b10012447748b7ee9
Author: Stavros Kontopoulos 
AuthorDate: Wed Mar 27 13:00:56 2019 -0700

[SPARK-24902][K8S] Add PV integration tests

## What changes were proposed in this pull request?

- Adds persistent volume integration tests
- Adds a custom tag to the test to exclude it if it is run against a cloud 
backend.
- Assumes default fs type for the host, AFAIK that is ext4.

## How was this patch tested?
Manually run the tests against minikube as usual:
```
[INFO] --- scalatest-maven-plugin:1.0:test (integration-test)  
spark-kubernetes-integration-tests_2.12 ---
Discovery starting.
Discovery completed in 192 milliseconds.
Run starting. Expected test count is: 16
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- Test PVs with local storage
```

Closes #23514 from skonto/pvctests.

Authored-by: Stavros Kontopoulos 
Signed-off-by: shane knapp 
---
 .../apache/spark/examples/DFSReadWriteTest.scala   |  12 +-
 .../k8s/integrationtest/KubernetesSuite.scala  |  35 +++-
 .../integrationtest/KubernetesTestComponents.scala |   3 +-
 .../deploy/k8s/integrationtest/PVTestsSuite.scala  | 189 +
 .../k8s/integrationtest/SecretsTestsSuite.scala|  27 +--
 .../spark/deploy/k8s/integrationtest/Utils.scala   |  22 +++
 6 files changed, 260 insertions(+), 28 deletions(-)

diff --git 
a/examples/src/main/scala/org/apache/spark/examples/DFSReadWriteTest.scala 
b/examples/src/main/scala/org/apache/spark/examples/DFSReadWriteTest.scala
index 1a77971..a738598 100644
--- a/examples/src/main/scala/org/apache/spark/examples/DFSReadWriteTest.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/DFSReadWriteTest.scala
@@ -22,6 +22,9 @@ import java.io.File
 
 import scala.io.Source._
 
+import org.apache.hadoop.fs.FileSystem
+import org.apache.hadoop.fs.Path
+
 import org.apache.spark.sql.SparkSession
 
 /**
@@ -107,6 +110,13 @@ object DFSReadWriteTest {
 
 println("Writing local file to DFS")
 val dfsFilename = s"$dfsDirPath/dfs_read_write_test"
+
+// delete file if exists
+val fs = FileSystem.get(spark.sessionState.newHadoopConf())
+if (fs.exists(new Path(dfsFilename))) {
+fs.delete(new Path(dfsFilename), true)
+}
+
 val fileRDD = spark.sparkContext.parallelize(fileContents)
 fileRDD.saveAsTextFile(dfsFilename)
 
@@ -123,7 +133,6 @@ object DFSReadWriteTest {
   .sum
 
 spark.stop()
-
 if (localWordCount == dfsWordCount) {
   println(s"Success! Local Word Count $localWordCount and " +
 s"DFS Word Count $dfsWordCount agree.")
@@ -131,7 +140,6 @@ object DFSReadWriteTest {
   println(s"Failure! Local Word Count $localWordCount " +
 s"and DFS Word Count $dfsWordCount disagree.")
 }
-
   }
 }
 // scalastyle:on println
diff --git 
a/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
 
b/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
index 91419e8..bc0bb20 100644
--- 
a/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
+++ 
b/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
@@ -40,7 +40,7 @@ import org.apache.spark.internal.config._
 
 class KubernetesSuite extends SparkFunSuite
   with BeforeAndAfterAll with BeforeAndAfter with BasicTestsSuite with 
SecretsTestsSuite
-  with PythonTestsSuite with ClientModeTestsSuite with PodTemplateSuite
+  with PythonTestsSuite with ClientModeTestsSuite with PodTemplateSuite with 
PVTestsSuite
   with Logging with Eventually with Matchers {
 
   import KubernetesSuite._
@@ -178,6 +178,29 @@ class KubernetesSuite

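The `DFSReadWriteTest.scala` hunk above makes the example idempotent: it deletes a leftover output path before writing, so a rerun does not fail on an already-existing file. The same guard, sketched in Python against the local filesystem (names and layout here are illustrative, not the Spark code; PySpark's `saveAsTextFile` fails the same way if its output directory already exists):

```python
import shutil
from pathlib import Path

def write_output(output_dir, lines):
    """Remove any leftover output directory, then write fresh results."""
    out = Path(output_dir)
    if out.exists():        # mirrors fs.exists(new Path(dfsFilename))
        shutil.rmtree(out)  # mirrors fs.delete(path, recursive = true)
    out.mkdir(parents=True)
    (out / "part-00000").write_text("\n".join(lines) + "\n")
    return out
```

Running the writer twice against the same path now succeeds, which is precisely what repeated integration-test runs need.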
[spark] branch master updated: [SPARK-27178][K8S] add nss to the spark/k8s Dockerfile

2019-03-18 Thread shaneknapp

shaneknapp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5564fe5  [SPARK-27178][K8S] add nss to the spark/k8s Dockerfile
5564fe5 is described below

commit 5564fe51513f725d2526dbf9e25a2f2c40d19afc
Author: shane knapp 
AuthorDate: Mon Mar 18 16:38:42 2019 -0700

[SPARK-27178][K8S] add nss to the spark/k8s Dockerfile

## What changes were proposed in this pull request?

while performing some tests on our existing minikube and k8s 
infrastructure, i noticed that the integration tests were failing. i dug in and 
discovered the following message buried at the end of the stacktrace:

```
  Caused by: java.io.FileNotFoundException: /usr/lib/libnss3.so
at sun.security.pkcs11.Secmod.initialize(Secmod.java:193)
at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
... 81 more
```
after i added the `nss` package to 
`resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile`, 
everything worked.

this is also impacting current builds.  see:  
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8959/console

## How was this patch tested?

i tested locally before pushing, and the build system will test the rest.

Closes #24111 from shaneknapp/add-nss-package-to-dockerfile.

Authored-by: shane knapp 
Signed-off-by: shane knapp 
---
 .../kubernetes/docker/src/main/dockerfiles/spark/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile 
b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
index 1d8ac3c..871d34b 100644
--- a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
+++ b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
@@ -29,7 +29,7 @@ ARG spark_uid=185
 RUN set -ex && \
 apk upgrade --no-cache && \
 ln -s /lib /lib64 && \
-apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs && \
+apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs nss && \
 mkdir -p /opt/spark && \
 mkdir -p /opt/spark/examples && \
 mkdir -p /opt/spark/work-dir && \



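The debugging story above hinges on one fact: the JDK's SunPKCS11 provider dlopens `libnss3.so`, and the Alpine-based image didn't ship it until the `nss` package was added. As a quick illustrative probe (not part of the commit; directory names are common defaults, not guaranteed), you can check for the library from Python before blaming the JVM:

```python
from pathlib import Path

def nss_library_present(libdirs=("/usr/lib", "/usr/lib64",
                                 "/usr/lib/x86_64-linux-gnu")):
    """True if libnss3.so (what SunPKCS11 tries to load) exists
    in any of the given library directories."""
    return any(any(Path(d).glob("libnss3.so*")) for d in libdirs)
```

On the pre-fix image this would report `False`, matching the `FileNotFoundException: /usr/lib/libnss3.so` in the stack trace.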


spark git commit: [BUILD] refactor dev/lint-python in to something readable

2018-11-20 Thread shaneknapp
Repository: spark
Updated Branches:
  refs/heads/master db136d360 -> 42c48387c


[BUILD] refactor dev/lint-python in to something readable

## What changes were proposed in this pull request?

`dev/lint-python` is a mess of nearly unreadable bash.  i would like to fix 
that as best as i can.

## How was this patch tested?

the build system will test this.

Closes #22994 from shaneknapp/lint-python-refactor.

Authored-by: shane knapp 
Signed-off-by: shane knapp 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/42c48387
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/42c48387
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/42c48387

Branch: refs/heads/master
Commit: 42c48387c047d96154bcfeb95fcb816a43e60d7c
Parents: db136d3
Author: shane knapp 
Authored: Tue Nov 20 12:38:40 2018 -0800
Committer: shane knapp 
Committed: Tue Nov 20 12:38:40 2018 -0800

--
 dev/lint-python | 359 +++
 1 file changed, 220 insertions(+), 139 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/42c48387/dev/lint-python
--
diff --git a/dev/lint-python b/dev/lint-python
index 27d87f6..0681693 100755
--- a/dev/lint-python
+++ b/dev/lint-python
@@ -1,5 +1,4 @@
 #!/usr/bin/env bash
-
 #
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
@@ -16,160 +15,242 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
+# define test binaries + versions
+PYDOCSTYLE_BUILD="pydocstyle"
+MINIMUM_PYDOCSTYLE="3.0.0"
 
-SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
-SPARK_ROOT_DIR="$(dirname "$SCRIPT_DIR")"
-# Exclude auto-generated configuration file.
-PATHS_TO_CHECK="$( cd "$SPARK_ROOT_DIR" && find . -name "*.py" )"
-DOC_PATHS_TO_CHECK="$( cd "$SPARK_ROOT_DIR" && find . -name "*.py" | grep -vF 
'functions.py' )"
-PYCODESTYLE_REPORT_PATH="$SPARK_ROOT_DIR/dev/pycodestyle-report.txt"
-PYDOCSTYLE_REPORT_PATH="$SPARK_ROOT_DIR/dev/pydocstyle-report.txt"
-PYLINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/pylint-report.txt"
-PYLINT_INSTALL_INFO="$SPARK_ROOT_DIR/dev/pylint-info.txt"
-
-PYDOCSTYLEBUILD="pydocstyle"
-MINIMUM_PYDOCSTYLEVERSION="3.0.0"
-
-FLAKE8BUILD="flake8"
+FLAKE8_BUILD="flake8"
 MINIMUM_FLAKE8="3.5.0"
 
-SPHINXBUILD=${SPHINXBUILD:=sphinx-build}
-SPHINX_REPORT_PATH="$SPARK_ROOT_DIR/dev/sphinx-report.txt"
+PYCODESTYLE_BUILD="pycodestyle"
+MINIMUM_PYCODESTYLE="2.4.0"
 
-cd "$SPARK_ROOT_DIR"
+SPHINX_BUILD="sphinx-build"
 
-# compileall: https://docs.python.org/2/library/compileall.html
-python -B -m compileall -q -l $PATHS_TO_CHECK > "$PYCODESTYLE_REPORT_PATH"
-compile_status="${PIPESTATUS[0]}"
+function compile_python_test {
+local COMPILE_STATUS=
+local COMPILE_REPORT=
+
+if [[ ! "$1" ]]; then
+echo "No python files found!  Something is very wrong -- exiting."
+exit 1;
+fi
 
-# Get pycodestyle at runtime so that we don't rely on it being installed on 
the build server.
-# See: https://github.com/apache/spark/pull/1744#issuecomment-50982162
-# Updated to the latest official version of pep8. pep8 is formally renamed to 
pycodestyle.
-PYCODESTYLE_VERSION="2.4.0"
-PYCODESTYLE_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pycodestyle-$PYCODESTYLE_VERSION.py"
-PYCODESTYLE_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/PyCQA/pycodestyle/$PYCODESTYLE_VERSION/pycodestyle.py;
+# compileall: https://docs.python.org/2/library/compileall.html
+echo "starting python compilation test..."
+COMPILE_REPORT=$( (python -B -mcompileall -q -l $1) 2>&1)
+COMPILE_STATUS=$?
+
+if [ $COMPILE_STATUS -ne 0 ]; then
+echo "Python compilation failed with the following errors:"
+echo "$COMPILE_REPORT"
+echo "$COMPILE_STATUS"
+exit "$COMPILE_STATUS"
+else
+echo "python compilation succeeded."
+echo
+fi
+}
 
-if [ ! -e "$PYCODESTYLE_SCRIPT_PATH" ]; then
-curl --silent -o "$PYCODESTYLE_SCRIPT_PATH" 
"$PYCODESTYLE_SCRIPT_REMOTE_PATH"
-curl_status="$?"
+function pycodestyle_test {
+local PYCODESTYLE_STATUS=
+local PYCODESTYLE_REPORT=
+local RUN_LOCAL_PYCODESTYLE=
+local VERSION=
+local EXPECTED_PYCODE
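The refactor above pairs each linter binary with a `MINIMUM_*` version it must meet (`pydocstyle` >= 3.0.0, `flake8` >= 3.5.0, `pycodestyle` >= 2.4.0). The script does the comparison in bash; as a hedged sketch of the same gate in Python (function name illustrative, and note that plain string comparison would get `"3.10" < "3.9"` wrong, which is why the components are compared numerically):

```python
def meets_minimum(installed, minimum):
    """True if dotted version string `installed` is at least `minimum`."""
    def parts(version):
        # Compare numerically, component by component.
        return [int(p) for p in version.split(".")]
    return parts(installed) >= parts(minimum)

print(meets_minimum("2.3.0", "2.4.0"))
# → False: a pycodestyle 2.3.0 install would fail the MINIMUM_PYCODESTYLE gate
```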