[jira] [Comment Edited] (SPARK-29884) spark-submit to Kubernetes cannot parse valid CA certificate

2019-11-14 Thread Jeremy (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974621#comment-16974621
 ] 

Jeremy edited comment on SPARK-29884 at 11/14/19 9:34 PM:
--

After doing some debugging it seems like this might be in the fabric8 k8s 
client. It tries to use .kube/config even if it gets all the parameters it 
needs from arguments.
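
A quick way to test that hypothesis (a minimal sketch, not Spark's actual 
submission code: it assumes the fabric8 kubernetes-client that Spark bundles, 
and uses `kubernetes.auth.tryKubeConfig`, the client's standard switch for 
disabling the kubeconfig fallback):
{code:scala}
import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}

object CaCertProbe {
  def main(args: Array[String]): Unit = {
    // Ask the fabric8 client not to fall back to ~/.kube/config.
    System.setProperty("kubernetes.auth.tryKubeConfig", "false")

    // Build a config from the same values passed to spark-submit.
    val config = new ConfigBuilder()
      .withMasterUrl("https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com")
      .withOauthToken(sys.env("TOKEN"))
      .withCaCertFile("/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt")
      .build()

    // If the CA cert parses, listing pods should succeed instead of
    // failing with the "Empty input" CertificateException above.
    val client = new DefaultKubernetesClient(config)
    try {
      println(client.pods().inNamespace("here-olp-3dds-sit").list().getItems.size())
    } finally {
      client.close()
    }
  }
}
{code}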


was (Author: jeremyjjbrown):
After doing some debugging it seams like this might be in fabric k8s client. I 
tries to use .kube/config even if it gets all the parameters is needs from 
arguments.

> spark-submit to Kubernetes cannot parse valid CA certificate
> -
>
> Key: SPARK-29884
> URL: https://issues.apache.org/jira/browse/SPARK-29884
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.4.4
> Environment: A Kubernetes cluster that has been in use for over 2 
> years and handles large amounts of production payloads.
>Reporter: Jeremy
>Priority: Major
>
> spark-submit cannot be used to schedule to Kubernetes with an OAuth token 
> and CA cert
> {code:java}
> spark-submit \
> --deploy-mode cluster \
> --class org.apache.spark.examples.SparkPi \
> --master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
> --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
> --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
> --conf 
> spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
>  \
> --conf spark.kubernetes.namespace=here-olp-3dds-sit \
> --conf spark.executor.instances=1 \
> --conf spark.app.name=spark-pi \
> --conf 
> spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0
>  \
> --conf 
> spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0
>  \
> local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
> {code}
> returns
> {code:java}
> log4j:WARN No appenders could be found for logger 
> (io.fabric8.kubernetes.client.Config).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> Exception in thread "main" 
> io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
>   at 
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
>   at 
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53)
>   at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183)
>   at 
> org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
>   at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
>   at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>   at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>   at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.security.cert.CertificateException: Could not parse 
> certificate: java.io.IOException: Empty input
>   at 
> sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110)
>   at 
> java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339)
>   at 
> io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104)
>   at 
> io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197)
>   at 
> io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128)
>   at 
> io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122)
>   at 
> 

[jira] [Commented] (SPARK-29884) spark-submit to Kubernetes cannot parse valid CA certificate

2019-11-14 Thread Jeremy (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974621#comment-16974621
 ] 

Jeremy commented on SPARK-29884:


After doing some debugging it seems like this might be in the fabric8 k8s 
client. It tries to use .kube/config even if it gets all the parameters it 
needs from arguments.
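
One way to see what the client resolves on its own (a sketch; 
`Config.autoConfigure` is the fabric8 entry point that reads `~/.kube/config`, 
though its exact signature has changed across client versions):
{code:scala}
import io.fabric8.kubernetes.client.Config

// If these values come from ~/.kube/config even though spark-submit
// passed explicit settings, that would explain why the client ends up
// trying to parse an empty or unexpected certificate.
val auto: Config = Config.autoConfigure(null)
println(s"masterUrl      = ${auto.getMasterUrl}")
println(s"caCertFile     = ${auto.getCaCertFile}")
println(s"clientCertFile = ${auto.getClientCertFile}")
println(s"clientKeyFile  = ${auto.getClientKeyFile}")
{code}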

> spark-submit to Kubernetes cannot parse valid CA certificate
> -
>
> Key: SPARK-29884
> URL: https://issues.apache.org/jira/browse/SPARK-29884
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.4.4
> Environment: A Kubernetes cluster that has been in use for over 2 
> years and handles large amounts of production payloads.
>Reporter: Jeremy
>Priority: Major
>
> spark-submit cannot be used to schedule to Kubernetes with an OAuth token 
> and CA cert
> {code:java}
> spark-submit \
> --deploy-mode cluster \
> --class org.apache.spark.examples.SparkPi \
> --master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
> --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
> --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
> --conf 
> spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
>  \
> --conf spark.kubernetes.namespace=here-olp-3dds-sit \
> --conf spark.executor.instances=1 \
> --conf spark.app.name=spark-pi \
> --conf 
> spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0
>  \
> --conf 
> spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0
>  \
> local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
> {code}
> returns
> {code:java}
> log4j:WARN No appenders could be found for logger 
> (io.fabric8.kubernetes.client.Config).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> Exception in thread "main" 
> io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
>   at 
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
>   at 
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53)
>   at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183)
>   at 
> org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
>   at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
>   at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>   at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>   at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.security.cert.CertificateException: Could not parse 
> certificate: java.io.IOException: Empty input
>   at 
> sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110)
>   at 
> java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339)
>   at 
> io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104)
>   at 
> io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197)
>   at 
> io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128)
>   at 
> io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122)
>   at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:78)
>   ... 13 more
> Caused by: java.io.IOException: Empty input
>   at 
> sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:106)
>   ... 19 more
> {code}
> The cacert and 

[jira] [Updated] (SPARK-29884) spark-submit to Kubernetes cannot parse valid CA certificate

2019-11-13 Thread Jeremy (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy updated SPARK-29884:
---
Summary: spark-submit to Kubernetes cannot parse valid CA certificate  
(was: spark-Submit to Kubernetes cannot parse valid CA certificate)

> spark-submit to Kubernetes cannot parse valid CA certificate
> -
>
> Key: SPARK-29884
> URL: https://issues.apache.org/jira/browse/SPARK-29884
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.4.4
> Environment: A Kubernetes cluster that has been in use for over 2 
> years and handles large amounts of production payloads.
>Reporter: Jeremy
>Priority: Major
>
> spark-submit cannot be used to schedule to Kubernetes with an OAuth token 
> and CA cert
> {code:java}
> spark-submit \
> --deploy-mode cluster \
> --class org.apache.spark.examples.SparkPi \
> --master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
> --conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
> --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
> --conf 
> spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
>  \
> --conf spark.kubernetes.namespace=here-olp-3dds-sit \
> --conf spark.executor.instances=1 \
> --conf spark.app.name=spark-pi \
> --conf 
> spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0
>  \
> --conf 
> spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0
>  \
> local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
> {code}
> returns
> {code:java}
> log4j:WARN No appenders could be found for logger 
> (io.fabric8.kubernetes.client.Config).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> Exception in thread "main" 
> io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
>   at 
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
>   at 
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53)
>   at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183)
>   at 
> org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
>   at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
>   at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
>   at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>   at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>   at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.security.cert.CertificateException: Could not parse 
> certificate: java.io.IOException: Empty input
>   at 
> sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110)
>   at 
> java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339)
>   at 
> io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104)
>   at 
> io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197)
>   at 
> io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128)
>   at 
> io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122)
>   at 
> io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:78)
>   ... 13 more
> Caused by: java.io.IOException: Empty input
>   at 
> sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:106)
>   ... 19 more
> {code}
> The CA cert and token are both valid and work with curl
> {code:java}
> 

[jira] [Updated] (SPARK-29884) spark-Submit to Kubernetes cannot parse valid CA certificate

2019-11-13 Thread Jeremy (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy updated SPARK-29884:
---
Description: 
spark-submit cannot be used to schedule to Kubernetes with an OAuth token and 
CA cert
{code:java}
spark-submit \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
--conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf 
spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
 \
--conf spark.kubernetes.namespace=here-olp-3dds-sit \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf 
spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0
 \
--conf 
spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0
 \
local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
{code}
returns
{code:java}
log4j:WARN No appenders could be found for logger 
(io.fabric8.kubernetes.client.Config).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Exception in thread "main" 
io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
at 
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at 
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53)
at 
io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183)
at 
org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.security.cert.CertificateException: Could not parse 
certificate: java.io.IOException: Empty input
at 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110)
at 
java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339)
at 
io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104)
at 
io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197)
at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128)
at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122)
at 
io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:78)
... 13 more
Caused by: java.io.IOException: Empty input
at 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:106)
... 19 more
{code}
The CA cert and token are both valid and work with curl
{code:java}
curl --cacert /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt -H 
"Authorization: bearer $TOKEN" -v 
https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com/api/v1/namespaces/here-olp-3dds-sit/pods
 -o out
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying 10.117.233.37:443...
* TCP_NODELAY set
* Connected to api.borg-dev-1-aws-eu-west-1.k8s.in.here.com (10.117.233.37) 
port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
  CApath: none
} [5 bytes data]
* TLSv1.3 (OUT), TLS 

[jira] [Updated] (SPARK-29884) spark-Submit to Kubernetes cannot parse valid CA certificate

2019-11-13 Thread Jeremy (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy updated SPARK-29884:
---
Description: 
spark-submit cannot be used to schedule to Kubernetes with an OAuth token and 
CA cert
{code:java}
spark-submit \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
--conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf 
spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
 \
--conf spark.kubernetes.namespace=here-olp-3dds-sit \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf 
spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0
 \
--conf 
spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0
 \
local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
{code}
returns
{code:java}
log4j:WARN No appenders could be found for logger 
(io.fabric8.kubernetes.client.Config).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Exception in thread "main" 
io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
at 
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at 
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53)
at 
io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183)
at 
org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.security.cert.CertificateException: Could not parse 
certificate: java.io.IOException: Empty input
at 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110)
at 
java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339)
at 
io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104)
at 
io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197)
at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128)
at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122)
at 
io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:78)
... 13 more
Caused by: java.io.IOException: Empty input
at 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:106)
... 19 more
{code}
The CA cert and token are both valid and work with curl
{code:java}
curl --cacert /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt -H 
"Authorization: bearer $TOKEN" -v 
https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com/api/v1/namespaces/here-olp-3dds-sit/pods
 -o out
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying 10.117.233.37:443...
* TCP_NODELAY set
* Connected to api.borg-dev-1-aws-eu-west-1.k8s.in.here.com (10.117.233.37) 
port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
  CApath: none
} [5 bytes data]
* TLSv1.3 (OUT), TLS 

[jira] [Created] (SPARK-29884) spark-Submit to Kubernetes cannot parse valid CA certificate

2019-11-13 Thread Jeremy (Jira)
Jeremy created SPARK-29884:
--

 Summary: spark-Submit to Kubernetes cannot parse valid CA certificate
 Key: SPARK-29884
 URL: https://issues.apache.org/jira/browse/SPARK-29884
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 2.4.4
 Environment: A Kubernetes cluster that has been in use for over 2 
years and handles large amounts of production payloads.
Reporter: Jeremy


spark-submit cannot be used to schedule to Kubernetes with an OAuth token and 
CA cert
{code:java}
spark-submit \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--master k8s://https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com \
--conf spark.kubernetes.authenticate.submission.oauthToken=$TOKEN \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf 
spark.kubernetes.authenticate.submission.caCertFile=/home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt
 \
--conf spark.kubernetes.namespace=here-olp-3dds-sit \
--conf spark.executor.instances=1 \
--conf spark.app.name=spark-pi \
--conf 
spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0
 \
--conf 
spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0
 \
local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
{code}
returns
{code:java}
log4j:WARN No appenders could be found for logger 
(io.fabric8.kubernetes.client.Config).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Exception in thread "main" 
io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
at 
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at 
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:53)
at 
io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:183)
at 
org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:84)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$4.apply(KubernetesClientApplication.scala:235)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2542)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:241)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:204)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.security.cert.CertificateException: Could not parse 
certificate: java.io.IOException: Empty input
at 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:110)
at 
java.security.cert.CertificateFactory.generateCertificate(CertificateFactory.java:339)
at 
io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:104)
at 
io.fabric8.kubernetes.client.internal.CertUtils.createKeyStore(CertUtils.java:197)
at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:128)
at 
io.fabric8.kubernetes.client.internal.SSLUtils.keyManagers(SSLUtils.java:122)
at 
io.fabric8.kubernetes.client.utils.HttpClientUtils.createHttpClient(HttpClientUtils.java:78)
... 13 more
Caused by: java.io.IOException: Empty input
at 
sun.security.provider.X509Factory.engineGenerateCertificate(X509Factory.java:106)
... 19 more
{code}
The CA cert and token are both valid and work with curl
{code:java}
curl --cacert /home/jeremybr/.kube/borg-dev-1-aws-eu-west-1.crt -H 
"Authorization: bearer $TOKEN" -v 
https://api.borg-dev-1-aws-eu-west-1.k8s.in.here.com/api/v1/namespaces/here-olp-3dds-sit/pods
 -o out
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying 10.117.233.37:443...
* 

[jira] [Commented] (SPARK-10408) Autoencoder

2017-05-11 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007535#comment-16007535
 ] 

Jeremy commented on SPARK-10408:


I mentioned this in the PR, but I also want to mention here that I'll do a code 
review within the next few days. 

> Autoencoder
> ---
>
> Key: SPARK-10408
> URL: https://issues.apache.org/jira/browse/SPARK-10408
> Project: Spark
>  Issue Type: Improvement
>  Components: ML
>Affects Versions: 1.5.0
>Reporter: Alexander Ulanov
>Assignee: Alexander Ulanov
>
> Goal: Implement various types of autoencoders 
> Requirements:
> 1) Basic (deep) autoencoder that supports different types of inputs: binary, 
> real in [0..1], real in [-inf, +inf] (see the sketch after the references) 
> 2) Sparse autoencoder, i.e. L1 regularization. It should be added as a feature 
> to the MLP and then used here 
> 3) Denoising autoencoder 
> 4) Stacked autoencoder for pre-training of deep networks. It should support 
> arbitrary network layers
> References: 
> 1. Vincent, Pascal, et al. "Extracting and composing robust features with 
> denoising autoencoders." Proceedings of the 25th international conference on 
> Machine learning. ACM, 2008. 
> http://www.iro.umontreal.ca/~vincentp/Publications/denoising_autoencoders_tr1316.pdf
>  
> 2. 
> http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Rifai_455.pdf, 
> 3. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A. 
> (2010). Stacked denoising autoencoders: Learning useful representations in a 
> deep network with a local denoising criterion. Journal of Machine Learning 
> Research, 11, 3371–3408. 
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.297.3484&rep=rep1&type=pdf
> 4, 5, 6. Bengio, Yoshua, et al. "Greedy layer-wise training of deep 
> networks." Advances in neural information processing systems 19 (2007): 153. 
> http://www.iro.umontreal.ca/~lisa/pointeurs/dbn_supervised_tr1282.pdf
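
As a rough illustration of what requirement 1 amounts to, here is a toy 
single-hidden-layer autoencoder forward pass in plain Scala (illustrative 
only: random tied weights, sigmoid activations, squared reconstruction error, 
and no training loop):
{code:scala}
object AutoencoderSketch {
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def main(args: Array[String]): Unit = {
    val rng = new scala.util.Random(42)
    val (nIn, nHidden) = (4, 2)
    // One weight matrix, shared (tied) between encoder and decoder.
    val w = Array.fill(nHidden, nIn)(rng.nextGaussian() * 0.1)

    val x = Array(0.9, 0.1, 0.8, 0.2) // input in [0..1]
    // Encode: h = sigmoid(W x)
    val h = w.map(row => sigmoid(row.zip(x).map { case (a, b) => a * b }.sum))
    // Decode with tied weights: xHat = sigmoid(W^T h)
    val xHat = Array.tabulate(nIn)(j => sigmoid(w.indices.map(i => w(i)(j) * h(i)).sum))
    // Squared reconstruction error -- the objective training would minimize.
    val loss = x.zip(xHat).map { case (a, b) => (a - b) * (a - b) }.sum / 2
    println(s"reconstruction loss: $loss")
  }
}
{code}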






[jira] [Commented] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation

2016-04-27 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260328#comment-15260328
 ] 

Jeremy commented on SPARK-11834:


The upside to this change is that users who set a threshold won't silently 
clear it by also calling thresholds. Users should never call thresholds anyway, 
as they'll only be doing binary classification - it's unlikely that anybody has 
thresholds in their current usage pattern. The main downside that I see is that 
it would need to be changed back once multiclass support is added. If that 
event is far away, this change may save a few users some confusion, but if it's 
imminent then the change would have to be undone quite soon. Regardless, the 
change is only consequential in a pretty isolated case that involves using the 
function incorrectly - quite minor either way.

> Ignore thresholds in LogisticRegression and update documentation
> 
>
> Key: SPARK-11834
> URL: https://issues.apache.org/jira/browse/SPARK-11834
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, ML
>Affects Versions: 1.6.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> ml.LogisticRegression does not support multiclass yet. So we should ignore 
> `thresholds` and update the documentation. In the next release, we can do 
> SPARK-11543.






[jira] [Commented] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation

2016-04-14 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241583#comment-15241583
 ] 

Jeremy commented on SPARK-11834:


To follow up, both setThreshold() and setThresholds() clear any value in the 
other param, and so checkThresholdConsistency() will never be called. So 
thresholds is being successfully ignored (though if set, it will clear the 
value for threshold, and the user will not see this).
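
A minimal illustration of that clearing behavior (a sketch against the public 
ml.LogisticRegression setters; the printed threshold is derived from the 
thresholds array, not from the earlier setThreshold call):
{code:scala}
import org.apache.spark.ml.classification.LogisticRegression

val lr = new LogisticRegression()
lr.setThreshold(0.8)              // sets `threshold`, clears `thresholds`
lr.setThresholds(Array(0.5, 0.5)) // sets `thresholds`, clears `threshold`

// The 0.8 above is silently gone: getThreshold is now derived from the
// thresholds array, so the two params are never both set and
// checkThresholdConsistency has nothing to reject.
println(lr.getThreshold)
{code}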

> Ignore thresholds in LogisticRegression and update documentation
> 
>
> Key: SPARK-11834
> URL: https://issues.apache.org/jira/browse/SPARK-11834
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, ML
>Affects Versions: 1.6.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> ml.LogisticRegression does not support multiclass yet. So we should ignore 
> `thresholds` and update the documentation. In the next release, we can do 
> SPARK-11543.






[jira] [Comment Edited] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation

2016-04-14 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241511#comment-15241511
 ] 

Jeremy edited comment on SPARK-11834 at 4/14/16 5:16 PM:
-

Looking into this JIRA, checkThresholdConsistency() should be called in 
getThreshold() in ml.classification.LogisticRegression.scala, but I can set 
inconsistent values with setThreshold() and setThresholds() and get predictions 
that run cleanly and that are consistent with the value for setThreshold(). 
Checking the documentation, it has been updated to reflect that only binary 
classes are supported.  Pictures here: https://goo.gl/wCpJzx


was (Author: jeremynixon):
Looking into this JIRA, checkThresholdConsistency() and validateParams() are 
not called in ml.classification.LogisticRegression.scala, and so I can set 
inconsistent values with setThreshold() and setThresholds() and get predictions 
that run cleanly and that are consistent with the value for setThreshold(). 
Checking the documentation, it has been updated to reflect that only binary 
classes are supported.  Pictures here: https://goo.gl/wCpJzx

> Ignore thresholds in LogisticRegression and update documentation
> 
>
> Key: SPARK-11834
> URL: https://issues.apache.org/jira/browse/SPARK-11834
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, ML
>Affects Versions: 1.6.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> ml.LogisticRegression does not support multiclass yet. So we should ignore 
> `thresholds` and update the documentation. In the next release, we can do 
> SPARK-11543.






[jira] [Comment Edited] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation

2016-04-14 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241511#comment-15241511
 ] 

Jeremy edited comment on SPARK-11834 at 4/14/16 5:10 PM:
-

Looking into this JIRA, checkThresholdConsistency() and validateParams() are 
not called in ml.classification.LogisticRegression.scala, and so I can set 
inconsistent values with setThreshold() and setThresholds() and get predictions 
that run cleanly and that are consistent with the value for setThreshold(). 
Checking the documentation, it has been updated to reflect that only binary 
classes are supported.  Pictures here: https://goo.gl/wCpJzx


was (Author: jeremynixon):
Looking into this JIRA, checkThresholdConsistency() and validateParams() are 
not called in ml.classification.LogisticRegression.scala, and so I can set 
inconsistent values with setThreshold() and setThresholds() and get predictions 
that run cleanly and that are consistent with the value for setThreshold(). 
Checking the documentation, it has been updated to reflect that only binary 
classes are supported. 

> Ignore thresholds in LogisticRegression and update documentation
> 
>
> Key: SPARK-11834
> URL: https://issues.apache.org/jira/browse/SPARK-11834
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, ML
>Affects Versions: 1.6.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> ml.LogisticRegression does not support multiclass yet. So we should ignore 
> `thresholds` and update the documentation. In the next release, we can do 
> SPARK-11543.






[jira] [Comment Edited] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation

2016-04-14 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241511#comment-15241511
 ] 

Jeremy edited comment on SPARK-11834 at 4/14/16 5:03 PM:
-

Looking into this JIRA, checkThresholdConsistency() and validateParams() are 
not called in ml.classification.LogisticRegression.scala, and so I can set 
inconsistent values with setThreshold() and setThresholds() and get predictions 
that run cleanly and that are consistent with the value for setThreshold(). 
Checking the documentation, it has been updated to reflect that only binary 
classes are supported. 


was (Author: jeremynixon):
Looking into this JIRA, checkThresholdConsistency() and validateParams() are 
not called in ml.classification.LogisticRegression.scala, and so I can set 

> Ignore thresholds in LogisticRegression and update documentation
> 
>
> Key: SPARK-11834
> URL: https://issues.apache.org/jira/browse/SPARK-11834
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, ML
>Affects Versions: 1.6.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> ml.LogisticRegression does not support multiclass yet. So we should ignore 
> `thresholds` and update the documentation. In the next release, we can do 
> SPARK-11543.






[jira] [Commented] (SPARK-11834) Ignore thresholds in LogisticRegression and update documentation

2016-04-14 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241511#comment-15241511
 ] 

Jeremy commented on SPARK-11834:


Looking into this JIRA, checkThresholdConsistency() and validateParams() are 
not called in ml.classification.LogisticRegression.scala, and so I can set 

> Ignore thresholds in LogisticRegression and update documentation
> 
>
> Key: SPARK-11834
> URL: https://issues.apache.org/jira/browse/SPARK-11834
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation, ML
>Affects Versions: 1.6.0
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> ml.LogisticRegression does not support multiclass yet. So we should ignore 
> `thresholds` and update the documentation. In the next release, we can do 
> SPARK-11543.






[jira] [Updated] (SPARK-13706) Python Example for Train Validation Split Missing

2016-03-06 Thread Jeremy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy updated SPARK-13706:
---
Description: An example of how to use TrainValidationSplit in pyspark needs 
to be added. Should be consistent with the current examples. I'll submit a PR.  
(was: And example of how to use TrainValidationSplit in pyspark needs to be 
added. Should be consistent with the current examples. I'll submit a PR.)

> Python Example for Train Validation Split Missing
> -
>
> Key: SPARK-13706
> URL: https://issues.apache.org/jira/browse/SPARK-13706
> Project: Spark
>  Issue Type: Bug
>  Components: ML, MLlib, PySpark
>Reporter: Jeremy
>Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> An example of how to use TrainValidationSplit in pyspark needs to be added. 
> Should be consistent with the current examples. I'll submit a PR.
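
For reference, a minimal sketch of the Scala usage the pyspark example would 
mirror (assumptions: `spark` is an active SparkSession, and the libsvm dataset 
path, LinearRegression estimator, and parameter grid are placeholders):
{code:scala}
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}

// Placeholder dataset with "features" and "label" columns.
val data = spark.read.format("libsvm").load("data/mllib/sample_linear_regression_data.txt")
val lr = new LinearRegression()

// Candidate parameter combinations to search over.
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.1, 0.01))
  .addGrid(lr.fitIntercept)
  .build()

// Hold out 20% of the data for validation instead of k-fold CV.
val tvs = new TrainValidationSplit()
  .setEstimator(lr)
  .setEvaluator(new RegressionEvaluator())
  .setEstimatorParamMaps(paramGrid)
  .setTrainRatio(0.8)

val model = tvs.fit(data) // picks the best params by validation metric
{code}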






[jira] [Created] (SPARK-13706) Python Example for Train Validation Split Missing

2016-03-06 Thread Jeremy (JIRA)
Jeremy created SPARK-13706:
--

 Summary: Python Example for Train Validation Split Missing
 Key: SPARK-13706
 URL: https://issues.apache.org/jira/browse/SPARK-13706
 Project: Spark
  Issue Type: Bug
  Components: ML, MLlib, PySpark
Reporter: Jeremy
Priority: Minor


And example of how to use TrainValidationSplit in pyspark needs to be added. 
Should be consistent with the current examples. I'll submit a PR.






[jira] [Commented] (SPARK-12877) TrainValidationSplit is missing in pyspark.ml.tuning

2016-03-03 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178867#comment-15178867
 ] 

Jeremy commented on SPARK-12877:


Hi Xiangrui, 

Commenting!

> TrainValidationSplit is missing in pyspark.ml.tuning
> 
>
> Key: SPARK-12877
> URL: https://issues.apache.org/jira/browse/SPARK-12877
> Project: Spark
>  Issue Type: New Feature
>  Components: PySpark
>Affects Versions: 1.6.0
>Reporter: Wojciech Jurczyk
> Fix For: 2.0.0
>
>
> I was investigating progress in SPARK-10759 and I noticed that there is no 
> TrainValidationSplit class in the pyspark.ml.tuning module.
> The Java/Scala examples for SPARK-10759 use 
> org.apache.spark.ml.tuning.TrainValidationSplit, which is not available from 
> Python, and this blocks SPARK-10759.
> Does the class have a different name in PySpark, maybe? Also, I couldn't find 
> any JIRA task saying it needs to be implemented. Is it by design that the 
> TrainValidationSplit estimator is not ported to PySpark? If not, that is, if 
> the estimator needs porting, then I would like to contribute.






[jira] [Issue Comment Deleted] (SPARK-10759) Missing Python code example in ML Programming guide

2016-02-16 Thread Jeremy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy updated SPARK-10759:
---
Comment: was deleted

(was: Cannot add example for code that doesn't exist.)

> Missing Python code example in ML Programming guide
> ---
>
> Key: SPARK-10759
> URL: https://issues.apache.org/jira/browse/SPARK-10759
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: Raela Wang
>Assignee: Apache Spark
>Priority: Minor
>  Labels: starter
>
> http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-cross-validation
> http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-train-validation-split






[jira] [Commented] (SPARK-10759) Missing Python code example in ML Programming guide

2016-02-16 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149266#comment-15149266
 ] 

Jeremy commented on SPARK-10759:


Cannot add example for code that doesn't exist.

> Missing Python code example in ML Programming guide
> ---
>
> Key: SPARK-10759
> URL: https://issues.apache.org/jira/browse/SPARK-10759
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: Raela Wang
>Assignee: Apache Spark
>Priority: Minor
>  Labels: starter
>
> http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-cross-validation
> http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-train-validation-split






[jira] [Created] (SPARK-13312) ML Model Selection via Train Validation Split example uses incorrect data

2016-02-13 Thread Jeremy (JIRA)
Jeremy created SPARK-13312:
--

 Summary: ML Model Selection via Train Validation Split example 
uses incorrect data
 Key: SPARK-13312
 URL: https://issues.apache.org/jira/browse/SPARK-13312
 Project: Spark
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.6.0
Reporter: Jeremy
Priority: Minor


The Model Selection via Train Validation Split example uses classification data 
for a regression problem, and so it fails with the corresponding errors when run.






[jira] [Commented] (SPARK-10759) Missing Python code example in ML Programming guide

2016-02-13 Thread Jeremy (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146336#comment-15146336
 ] 

Jeremy commented on SPARK-10759:


It's been a few months, so I've begun to work on this here: 
https://github.com/apache/spark/compare/master...JeremyNixon:add_py_ex_ml-guide

It appears from the documentation that pyspark doesn't have an implementation 
of train-validation-split, or at least that it's not found in the tuning module 
like it is in the java and scala docs. Let me know if that's not the case and 
I'll pull an example from that function into the branch as well. If it doesn't 
exist, I can create and work on a JIRA requesting it.

> Missing Python code example in ML Programming guide
> ---
>
> Key: SPARK-10759
> URL: https://issues.apache.org/jira/browse/SPARK-10759
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: Raela Wang
>Assignee: Lauren Moos
>Priority: Minor
>  Labels: starter
>
> http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-cross-validation
> http://spark.apache.org/docs/latest/ml-guide.html#example-model-selection-via-train-validation-split


