Re: k8s orchestrating Spark service

2019-07-01 Thread Matt Cheah
then the ML server process has to create a SparkContext object parameterized against the Kubernetes server in question. I hope this helps! -Matt Cheah From: Pat Ferrel Date: Monday, July 1, 2019 at 5:05 PM To: "user@spark.apache.org" , Matt Cheah Subject: Re: k8s orchestrating Spar
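
As a rough illustration of what "parameterized against the Kubernetes server" can look like, the sketch below assembles settings that a long-running process might hand to Spark when it creates its SparkContext. The `k8s://` master prefix and `spark.kubernetes.container.image` are documented Spark settings. The helper name, API server address, image name, and application name are hypothetical placeholders, not values from this thread.

```python
def k8s_spark_conf(api_server: str, image: str, app_name: str) -> dict:
    """Build Spark settings that target a Kubernetes API server.

    All three arguments are placeholders supplied by the caller.
    """
    # Spark expects a Kubernetes master in the form k8s://<api-server-url>.
    if api_server.startswith("k8s://"):
        master = api_server
    else:
        master = "k8s://" + api_server
    return {
        "spark.master": master,
        "spark.app.name": app_name,
        "spark.kubernetes.container.image": image,
    }


# Hypothetical values, for illustration only.
conf = k8s_spark_conf(
    "https://kubernetes.example.com:6443",
    "example/spark:2.4.3",
    "ml-server",
)
for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```

With PySpark installed, these pairs could be passed to `SparkConf().setAll(conf.items())` before the SparkContext is constructed.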

Re: k8s orchestrating Spark service

2019-07-01 Thread Matt Cheah
integration with Kubernetes here. Deploying on Spark standalone mode in Kubernetes is, to my understanding, meant to be superseded by the native integration introduced in Spark 2.4. From: Pat Ferrel Date: Monday, July 1, 2019 at 4:40 PM To: "user@spark.apache.org" , Matt Cheah S

Re: k8s orchestrating Spark service

2019-07-01 Thread Matt Cheah
/running-on-kubernetes.html I would think that building Helm around this architecture of running Spark applications would be easier than running a Spark standalone cluster. But admittedly I’m not very familiar with the Helm technology – we just use spark-submit. -Matt Cheah From: Pat Ferrel

[PSA] Sharing our Experiences With Kubernetes

2019-05-17 Thread Matt Cheah
for you. -Matt Cheah

Re: Spark on k8s - map persistentStorage for data spilling

2019-03-01 Thread Matt Cheah
and faster? -Matt Cheah From: Tomasz Krol Date: Friday, March 1, 2019 at 10:53 AM To: Matt Cheah Cc: "user@spark.apache.org" Subject: Re: Spark on k8s - map persistentStorage for data spilling Hi Matt, Thanks for coming back to me. Yeah that doesn't work. Basically in the pr

Re: Spark on k8s - map persistentStorage for data spilling

2019-02-28 Thread Matt Cheah
I think we want to change the value of spark.local.dir to point to where your PVC is mounted. Can you give that a try and let us know if that moves the spills as expected? -Matt Cheah From: Tomasz Krol Date: Wednesday, February 27, 2019 at 3:41 AM To: "user@spark.apache.org"
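
A minimal sketch of the suggestion above, assuming the PVC is mounted into the executors through Spark's documented `spark.kubernetes.executor.volumes.persistentVolumeClaim.*` settings (Spark 2.4 and later). The volume name, claim name, and mount path are hypothetical. Note that the follow-up reply in this thread (2019-03-01) reports that the suggestion did not work in the poster's setup, so treat this as an illustration of the idea and not as a confirmed fix.

```python
def local_dir_on_pvc(volume_name: str, claim_name: str, mount_path: str) -> dict:
    """Settings that mount a PVC into executors and point spark.local.dir at it."""
    prefix = (
        "spark.kubernetes.executor.volumes.persistentVolumeClaim." + volume_name
    )
    return {
        prefix + ".options.claimName": claim_name,
        prefix + ".mount.path": mount_path,
        prefix + ".mount.readOnly": "false",
        # Spill and shuffle scratch files go under spark.local.dir.
        "spark.local.dir": mount_path,
    }


def to_submit_args(conf: dict) -> list:
    """Render a settings dict as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical names, for illustration only.
pvc_conf = local_dir_on_pvc("spill", "spark-spill-claim", "/mnt/spill")
print(" ".join(to_submit_args(pvc_conf)))
```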

Re: Problem running Spark on Kubernetes: Certificate error

2018-12-13 Thread Matt Cheah
it’s using for that communication. If there’s a fix that needs to happen in Spark, feel free to indicate as such. -Matt Cheah From: Steven Stetzler Date: Thursday, December 13, 2018 at 1:49 PM To: "user@spark.apache.org" Subject: Problem running Spark on Kubernetes: Certifi

Re: External shuffle service on K8S

2018-10-26 Thread Matt Cheah
Hi there, Please see https://issues.apache.org/jira/browse/SPARK-25299 for more discussion around this matter. -Matt Cheah From: Li Gao Date: Friday, October 26, 2018 at 9:10 AM To: "vincent.gromakow...@gmail.com" Cc: "caolijun1...@gmail.com" , "user@spark.

Re: [Spark for kubernetes] Azure Blob Storage credentials issue

2018-10-24 Thread Matt Cheah
Hi there, Can you check if HADOOP_CONF_DIR is being set on the executors to /opt/spark/conf? One should set an executor environment variable for that. A kubectl describe pod output for the executors would be helpful here. -Matt Cheah From: Oscar Bonilla Date: Friday, October 19
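
One way to set such an executor environment variable is Spark's documented `spark.executorEnv.[Name]` setting. The sketch below only builds the setting; the helper name is hypothetical, and the `/opt/spark/conf` path is the one mentioned in the message above.

```python
def executor_env_conf(env: dict) -> dict:
    """Map environment variable names to spark.executorEnv.* settings."""
    return {f"spark.executorEnv.{name}": value for name, value in env.items()}


env_conf = executor_env_conf({"HADOOP_CONF_DIR": "/opt/spark/conf"})
for key, value in env_conf.items():
    # Each pair can be passed to spark-submit as: --conf key=value
    print(f"--conf {key}={value}")
```

Whether the variable actually reached the executors can then be checked in the environment section of the `kubectl describe pod` output for an executor pod.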

Re: Spark on Kubernetes: Kubernetes killing executors because of overallocation of memory

2018-08-02 Thread Matt Cheah
by offheap storage from Spark that won’t be accounted for in just the heap size. Hope this helps, -Matt Cheah From: Jayesh Lalwani Date: Thursday, August 2, 2018 at 12:35 PM To: "user@spark.apache.org" Subject: Spark on Kubernetes: Kubernetes killing executo
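
To make the heap versus off-heap point concrete, here is a small sketch of how an executor pod's memory request can be estimated as heap plus overhead. It assumes the defaults described in the Spark documentation for JVM jobs (overhead of 10% of executor memory, with a 384 MiB floor, overridable through `spark.executor.memoryOverhead`). Check the documentation for your Spark version before relying on these numbers; the function name is hypothetical.

```python
MIN_OVERHEAD_MIB = 384  # assumed default floor for the overhead


def pod_memory_mib(heap_mib, overhead_factor=0.10, explicit_overhead_mib=None):
    """Estimate executor pod memory: JVM heap plus non-heap overhead."""
    if explicit_overhead_mib is not None:
        # Mirrors setting spark.executor.memoryOverhead explicitly.
        overhead = explicit_overhead_mib
    else:
        overhead = max(int(heap_mib * overhead_factor), MIN_OVERHEAD_MIB)
    return heap_mib + overhead


print(pod_memory_mib(8192))                              # 8192 + 819
print(pod_memory_mib(2048))                              # 2048 + 384 (floor applies)
print(pod_memory_mib(8192, explicit_overhead_mib=2048))  # explicit override
```

If off-heap usage exceeds the overhead allowance, the container can go over its limit even though the heap itself is within bounds, which is when raising the overhead is worth trying.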

Re: Structured Streaming on Kubernetes

2018-04-13 Thread Matt Cheah
directories. However, I’m unaware of any specific use of streaming with the Spark on Kubernetes integration right now. Would be curious to get feedback on the failover behavior right now. -Matt Cheah From: Tathagata Das <t...@databricks.com> Date: Friday, April 13, 2018 at 1

Re: Spark on K8s resource staging server timeout

2018-03-29 Thread Matt Cheah
the submission of local files in the official release we should probably create a mechanism that’s more resilient. Using a single HTTP server isn’t ideal – would ideally like something that’s highly available, replicated, etc. -Matt Cheah From: Jenna Hoole <jenna.ho...@gmail.com> Date: Th

Re: UnresolvedAddressException in Kubernetes Cluster

2017-10-12 Thread Matt Cheah
? -Matt Cheah From: Suman Somasundar <suman.somasun...@oracle.com> Sent: Monday, October 9, 2017 3:42:37 PM To: user@spark.apache.org Subject: UnresolvedAddressException in Kubernetes Cluster Hi, I am trying to deploy a Spark app in a Kubernetes C

Re: spark 2.0 issue with yarn?

2016-05-09 Thread Matt Cheah
at all. However jersey-client looks relatively harmless since it does not bundle in JAX-RS classes, nor does it appear to have anything weird in its META-INF folder. -Matt Cheah On 5/9/16, 3:10 PM, "Marcelo Vanzin" <van...@cloudera.com> wrote: >Hi Jesse, > >On Mo

Specify number of partitions with which to run DataFrame.join?

2015-06-18 Thread Matt Cheah
the partitioning for a DataFrame join operation as it is being computed? Or do I have to compute the join and repartition separately after? Thanks, -Matt Cheah

Cross-compatibility of YARN shuffle service

2015-03-25 Thread Matt Cheah
Manager, will the service work properly with all of the Spark applications, regardless of the specific versions of the applications? Or, is it the case that, if I want to use the external shuffle service, I need to have all of my applications using the same version of Spark? Thanks, -Matt Cheah

Reading from Kerberos Secured HDFS in Spark?

2014-12-02 Thread Matt Cheah
HDFS. What configurations needed to be set in spark-env.sh? What am I missing? Also, will I have an issue if I try to access HDFS in distributed mode, using a standalone setup? Thanks, -Matt Cheah