Re: Can’t write to PVC in K8S

2021-08-31 Thread Bjørn Jørgensen
Hi and thanks for all the good help. I will build jupyter on top of spark to be able to run jupyter in local mode with the new koalas library. The new koalas library can be imported as "from pyspark import pandas as ps". Then you can run spark on K8S the same way that you use pandas in a

memory_and_disk persistence level algorithm

2021-08-31 Thread Zilvinas Saltys
I'm curious if someone could provide a bit deeper insight into how the memory_and_disk_ser persistence level works. I've noticed that if my cluster has 2.2 TB of memory and I set the persistence level to memory_only_ser, Spark will use about 2 TB and the storage tab shows 97-99% fraction cached

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-08-31 Thread Stelios Philippou
Yes, you are right. I am using Spring Boot for this. The same does work for the event that does not involve any Kafka events. But again, I am not sending out extra jars there, so nothing is replaced and we are using the default ones. If I do not use the userClassPathFirst which will force the

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-08-31 Thread Jacek Laskowski
Hi Stelios, I've never seen this error before, but a couple of things caught my attention that I would look at more closely to chase the root cause of the issue. "org.springframework.context.annotation.AnnotationConfigApplicationContext:" and "21/08/31 07:28:42 ERROR

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-08-31 Thread Mich Talebzadeh
Hm, I had similar issues. I built the Docker image with Java 8 and that worked in k8. Have you tried building your Docker image with Java 8? HTH

RE: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

2021-08-31 Thread Sharma, Prakash (Nokia - IN/Bangalore)
Yes, we are using the Spark 3.0.2 submit and we are not accessing the cloud buckets. Actually, the TPC-DS data is stored on HDFS and not in any cloud storage. From this TPC-DS data, external tables are created and we are running some queries on these tables; basically, these queries are SELECT queries.

Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-08-31 Thread Stelios Philippou
Hello, I have been facing the current issue for some time now and I was wondering if someone might have some insight on how I can resolve the following. The code (Java 11) is working correctly on my local machine, but whenever I try to launch the following on K8 I am getting the following error.