Re: How to change a DataFrame column from nullable to not nullable in PySpark

2021-10-14 Thread Sonal Goyal
I see some nice answers at https://stackoverflow.com/questions/46072411/can-i-change-the-nullability-of-a-column-in-my-spark-dataframe
On Thu, 14 Oct 2021 at 5:21 PM, ashok34...@yahoo.com.INVALID wrote:
> Gurus,
>
> I have an RDD in PySpark that I can convert to DF through
>
> df = rdd.toDF()
>
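For readers landing on this thread: one commonly used approach is to rebuild the DataFrame from its underlying RDD with an edited schema. The sketch below is only an illustration under stated assumptions, not the list's or Spark's recommended method. The helper names `set_nullability` and `rebuild_with_nullability` are hypothetical; the Spark calls used (`df.schema.jsonValue()`, `StructType.fromJson`, `spark.createDataFrame(rdd, schema)`) are existing PySpark APIs. It assumes top-level columns only and that the affected columns really contain no nulls.

```python
import copy


def set_nullability(schema_json, columns, nullable=False):
    """Return a copy of a schema dict (the shape produced by
    df.schema.jsonValue()) with `nullable` changed for the named
    top-level columns. Hypothetical helper, not part of PySpark."""
    wanted = set(columns)
    known = {field["name"] for field in schema_json["fields"]}
    missing = wanted - known
    if missing:
        raise KeyError(f"unknown columns: {sorted(missing)}")
    out = copy.deepcopy(schema_json)  # leave the caller's dict untouched
    for field in out["fields"]:
        if field["name"] in wanted:
            field["nullable"] = nullable
    return out


def rebuild_with_nullability(spark, df, columns, nullable=False):
    """Rebuild `df` with the edited schema. Requires a running
    SparkSession; pyspark is imported lazily so the helper above
    can be used and tested without Spark installed."""
    from pyspark.sql.types import StructType

    new_schema = StructType.fromJson(
        set_nullability(df.schema.jsonValue(), columns, nullable)
    )
    # Assumption: the data has no nulls in `columns`; otherwise the
    # job is expected to fail when the rows are evaluated.
    return spark.createDataFrame(df.rdd, new_schema)
```

With a live session, usage would look like `df2 = rebuild_with_nullability(spark, df, ["COL-1"])` followed by `df2.printSchema()`. Going through `df.rdd` forces a round trip out of the optimized DataFrame representation, so this may be slow on large data.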

unsubscribe

2021-10-14 Thread Luis Mateos
unsubscribe

Unsubscribe

2021-10-14 Thread 676366545
unsubscribe

Unsubscribe

2021-10-14 Thread Jesús Vásquez
I want to unsubscribe

Re: Spark for Image Processing Acceleration

2021-10-14 Thread Sean Owen
(The suggestion here is to use TensorFlow with Spark, which has been doable for a long time with things like Horovod. Spark handles the image processing just fine.)
On Thu, Oct 14, 2021 at 10:17 AM Artemis User wrote:
> Spark is good with SQL type of structured data, not image data. Unless
> you

Re: Spark for Image Processing Acceleration

2021-10-14 Thread Artemis User
Spark is good with SQL-type structured data, not image data, unless your algorithms don't require dealing with image data directly. I guess your best option would be to go with TensorFlow, since it has image classification models built in and can integrate with NVIDIA GPUs out of the box.

Re: apache-spark

2021-10-14 Thread Mich Talebzadeh
Also, have you tried to see what is going on within the k8s driver?
    DRIVER_POD_NAME=`kubectl get pods -n $NAMESPACE |grep driver|awk '{print $1}'`
    kubectl describe pod $DRIVER_POD_NAME -n $NAMESPACE
    kubectl logs $DRIVER_POD_NAME -n $NAMESPACE

Re: apache-spark

2021-10-14 Thread Mich Talebzadeh
Hi, Airflow is nothing but a newer version of cron on Linux with DAG dependencies. Which Airflow operator are you using to run spark-submit, for example BashOperator? Can you actually run the command outside of Airflow by submitting spark-submit directly to the K8s cluster? Is that GKE cluster or
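To try the job outside Airflow as suggested, it can help to assemble the exact spark-submit command in one place and run it by hand first. The following is a rough sketch only: the function name `build_spark_submit`, the API server URL, the namespace, the image name and the application path are all hypothetical placeholders, not values from this thread. The flags and configuration keys used (`--master k8s://...`, `--deploy-mode cluster`, `spark.kubernetes.namespace`, `spark.kubernetes.container.image`) are standard spark-submit options for Kubernetes.

```python
import shlex


def build_spark_submit(master_url, app_path, namespace, image,
                       app_name="example-app", extra_conf=None):
    """Assemble a spark-submit argv list for a Kubernetes cluster.
    Hypothetical helper; every argument value is supplied by the caller."""
    conf = {
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.container.image": image,
    }
    conf.update(extra_conf or {})
    argv = [
        "spark-submit",
        "--master", f"k8s://{master_url}",
        "--deploy-mode", "cluster",
        "--name", app_name,
    ]
    for key, value in conf.items():
        argv += ["--conf", f"{key}={value}"]
    argv.append(app_path)  # the application file goes last
    return argv


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    argv = build_spark_submit(
        master_url="https://example-apiserver:6443",
        app_path="local:///opt/spark/app/job.py",
        namespace="spark",
        image="example-registry/spark-py:latest",
    )
    # Print a shell-safe command line to paste into a terminal.
    print(shlex.join(argv))
```

If the printed command works from a shell but fails under Airflow, the difference is likely in the environment Airflow runs it in (credentials, PATH, kube config) rather than in Spark itself.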

apache-spark

2021-10-14 Thread Nick Shivhare
Hi All, We are facing an issue and would be thankful if anyone could help us with it. Environment: Spark, Kubernetes and Airflow. Airflow is used to schedule Spark jobs on Kubernetes. We are using a bash script that uses the spark-submit command to submit the Spark jobs. Issue: We are

How to change a DataFrame column from nullable to not nullable in PySpark

2021-10-14 Thread ashok34...@yahoo.com.INVALID
Gurus, I have an RDD in PySpark that I can convert to DF through
    df = rdd.toDF()
However, when I do df.printSchema() I see the columns as nullable = true by default:
    root
     |-- COL-1: long (nullable = true)
     |-- COl-2: double (nullable = true)
     |-- COl-3: string (nullable = true)
What would be the