KafkaUtils module not found on spark 3 pyspark

2021-02-16 Thread aupres
I use Hadoop 3.3.0 and spark-3.0.1-bin-hadoop3.2, and my Python IDE is Eclipse version 2020-12. I am trying to develop a Python application with the KafkaUtils PySpark module. My configuration reference for PySpark and Eclipse is this site

Re: Introducing Gallia: a Scala+Spark library for data manipulation

2021-02-16 Thread galliaproject
I posted a quick update on the Scala mailing list, which mostly discusses Scala 2.13 support, additional examples, and licensing. -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

Re: Using Custom Scala Spark ML Estimator in PySpark

2021-02-16 Thread HARSH TAKKAR
Hello Sean, Thanks for the advice. Can you please point me to an example where I can find a custom wrapper for Python? Kind Regards, Harsh Takkar On Tue, 16 Feb, 2021, 8:25 pm Sean Owen wrote: > You won't be able to use it in Python if it is implemented in Java - needs > a Python wrapper

Re: Using Custom Scala Spark ML Estimator in PySpark

2021-02-16 Thread Sean Owen
You won't be able to use it in Python if it is implemented in Java - it needs a Python wrapper too. On Mon, Feb 15, 2021, 11:29 PM HARSH TAKKAR wrote: > Hi, > > I have created a custom Estimator in Scala, which I can use successfully > by creating a pipeline model in Java and Scala, but when I

Re: vm.swappiness value for Spark on Kubernetes

2021-02-16 Thread Sean Owen
You probably don't want swapping in any environment. Some tasks will grind to a halt under memory pressure rather than just fail quickly. You would want to simply provision more memory. On Tue, Feb 16, 2021, 7:57 AM Jahar Tyagi wrote: > Hi, > > We have recently migrated from Spark 2.4.4 to Spark

Using DataFrame to Read Avro files

2021-02-16 Thread VenkateshDurai
While using Spark 2.4.7 to read an Avro file, I get the error below. java.lang.NoClassDefFoundError: org/apache/spark/sql/connector/catalog/TableProvider at java.lang.ClassLoader.defineClass1(Native Method) Code: val dataDF = spark.read.format("avro").load("E:\\Avro\\12.avro") Please provide

vm.swappiness value for Spark on Kubernetes

2021-02-16 Thread Jahar Tyagi
Hi, We have recently migrated from Spark 2.4.4 to Spark 3.0.1 and use Spark on virtual machines/bare metal as a standalone deployment and as a Kubernetes deployment as well. There is a kernel parameter named 'vm.swappiness', and we keep its value at '1' in the standard deployment. Now since we are
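For reference, a sketch of how this setting is typically pinned on a worker host; note that on Kubernetes `vm.swappiness` is a node-level kernel parameter set on the host (or via node tuning), not something a pod spec controls directly:

```
# /etc/sysctl.d/99-spark.conf (hypothetical file name)
# Strongly prefer reclaiming page cache over swapping out anonymous memory.
vm.swappiness = 1
```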

Re: Using Custom Scala Spark ML Estimator in PySpark

2021-02-16 Thread Mich Talebzadeh
Hi, Specifically, is this a runtime or compilation error? I gather by classpath you mean something like below: spark-submit --master yarn --deploy-mode client --driver-class-path --jars .. HTH
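A sketch of the distinction the two flags draw, with a hypothetical JAR path: `--jars` ships the JAR to the driver and executors, while `--driver-class-path` only prepends entries to the driver's classpath, so custom ML stages generally need both (or `--jars` alone in client mode):

```
spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-class-path /opt/libs/my-estimator.jar \
  --jars /opt/libs/my-estimator.jar \
  app.py
```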