Re: API Problem

2022-06-09 Thread Sean Owen
That repartition seems to do nothing? But yes, the key point is to use col(). On Thu, Jun 9, 2022, 9:41 PM Stelios Philippou wrote: > Perhaps finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn("status_for_batch > to

Spark streaming / confluent Kafka- messages are empty

2022-06-09 Thread KhajaAsmath Mohammed
Hi, I am trying to read data from Confluent Kafka using the Avro schema registry. Messages are always empty and the stream always shows empty records. Any suggestions on this, please? Thanks, Asmath
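
A minimal PySpark sketch of one common cause of this symptom (not confirmed to be the issue in this thread): records serialized through the Confluent schema registry carry a 5-byte wire-format header (magic byte plus schema id) that Spark's plain from_avro() cannot decode, so it has to be stripped first. The topic name, bootstrap servers and reader schema below are placeholder assumptions, and the Kafka and spark-avro connector packages must be on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, expr
    from pyspark.sql.avro.functions import from_avro

    spark = SparkSession.builder.appName("confluent-avro-demo").getOrCreate()

    # Placeholder reader schema matching the topic's Avro records.
    avro_schema = """{"type": "record", "name": "Event",
                      "fields": [{"name": "id", "type": "string"},
                                 {"name": "amount", "type": "double"}]}"""

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")
           .option("subscribe", "events")
           .load())

    decoded = (raw
               # Skip the 5-byte Confluent wire-format header (magic byte + schema id)
               # before handing the payload to from_avro().
               .withColumn("payload", expr("substring(value, 6, length(value) - 5)"))
               .withColumn("record", from_avro(col("payload"), avro_schema))
               .select("record.*"))

    query = decoded.writeStream.format("console").outputMode("append").start()
    query.awaitTermination()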

Re: API Problem

2022-06-09 Thread Stelios Philippou
Perhaps finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn("status_for_batch to finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn(col("status_for_batch") On Thu, 9 Jun 2022, 22:32 Sid wrote: > Hi Experts, > I am facing one problem while passing a column to the
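
A minimal PySpark sketch of the point made in this thread (the DataFrame, column names and JSON payload below are illustrative assumptions; the actual API call from the Stack Overflow question is not reproduced): withColumn() takes a new column name plus a Column expression, and column arguments passed to functions should be built with col().

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, to_json, struct

    spark = SparkSession.builder.appName("col-demo").getOrCreate()

    # Illustrative data; the real DataFrame comes from the original question.
    df = spark.createDataFrame(
        [(1, "pending"), (2, "done")],
        ["id", "status_for_batch"],
    )

    # withColumn(name, Column): the second argument must be a Column expression,
    # e.g. one built with col(), not a bare string.
    payload_df = df.withColumn(
        "json_payload",
        to_json(struct(col("id"), col("status_for_batch"))),
    )

    payload_df.show(truncate=False)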

API Problem

2022-06-09 Thread Sid
Hi Experts, I am facing a problem while passing a column to a method. The problem is described in detail here: https://stackoverflow.com/questions/72565095/how-to-pass-columns-as-a-json-record-to-the-api-method-using-pyspark TIA, Sid

Re: to find Difference of locations in Spark Dataframe rows

2022-06-09 Thread Bjørn Jørgensen
If KM is kilometres, then you must replace val distance = atan2(sqrt(a), sqrt(-a + 1)) * 2 * 6371 with val distance = atan2(sqrt(a), sqrt(-a + 1)) * 2 * 12742 Have a look at this gist: Spherical distance calculation based on latitude and longitude with Apache Spark
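
A minimal PySpark sketch of the haversine great-circle distance discussed here, computed between consecutive DataFrame rows (the coordinates and column names are illustrative assumptions). The standard form uses Earth's mean radius R ≈ 6371 km, i.e. distance = atan2(sqrt(a), sqrt(-a + 1)) * 2 * 6371; note that 2 * 6371 equals 12742, Earth's diameter in km.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import radians, sin, cos, atan2, sqrt, col, lag
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("haversine-demo").getOrCreate()

    # Illustrative rows: (sequence number, latitude, longitude).
    df = spark.createDataFrame(
        [(1, 52.5200, 13.4050), (2, 48.8566, 2.3522)],
        ["seq", "lat", "lon"],
    )

    # Pair each row with the previous row's coordinates.
    w = Window.orderBy("seq")
    paired = (df
              .withColumn("prev_lat", lag("lat").over(w))
              .withColumn("prev_lon", lag("lon").over(w)))

    # Haversine terms.
    dlat = radians(col("lat")) - radians(col("prev_lat"))
    dlon = radians(col("lon")) - radians(col("prev_lon"))
    a = (sin(dlat / 2) * sin(dlat / 2)
         + cos(radians(col("prev_lat"))) * cos(radians(col("lat")))
         * sin(dlon / 2) * sin(dlon / 2))

    # Great-circle distance in kilometres (the first row has no predecessor, so null).
    result = paired.withColumn("distance_km", atan2(sqrt(a), sqrt(-a + 1)) * 2 * 6371)
    result.show()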

Re: Retrieve the count of spark nodes

2022-06-09 Thread Poorna Murali
Thanks Stephen! I will try this out. On Thu, 9 Jun, 2022, 6:02 am Stephen Coy wrote: > Hi there, > We use something like: > /* Force Spark to initialise the defaultParallelism by executing a dummy parallel operation and then return the resulting defaultParallelism. */
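
A minimal PySpark sketch of the approach quoted above (the quoted code is cut off in the digest, so this reconstruction is an assumption rather than Stephen Coy's exact snippet): run a throwaway parallel job so the context has registered its executors, then read defaultParallelism.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("parallelism-probe").getOrCreate()
    sc = spark.sparkContext

    # Force Spark to initialise the defaultParallelism by executing a dummy
    # parallel operation, then read the resulting value.
    sc.parallelize(range(100)).count()
    print("defaultParallelism:", sc.defaultParallelism)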