unsubscribe

2024-05-01 Thread Yoel Benharrous
unsubscribe


How to group rows without a shuffle

2023-11-09 Thread Yoel Benharrous
Hi all,

I'm trying to group every X rows into a single row without shuffling the data.

I was thinking of doing something like this:
val myDF = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11).toDF("myColumn")
myDF.withColumn("myColumn", expr("sliding(myColumn, 3)"))

expected result:
myColumn
[1,2,3]
[4,5,6]
[7,8,9]
[10, 11]

Any insight on how to implement this?
I saw a SlidingRDD in MLlib, but I wanted to stay at the DataFrame level of abstraction.
https://spark.apache.org/docs/1.3.1/api/java/org/apache/spark/mllib/rdd/SlidingRDD.html

Thanks
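
For illustration, here is a minimal sketch of one possible way to get the grouping asked about above without a shuffle. It does not use a `sliding` SQL function; instead it swaps in `Dataset.mapPartitions` together with Scala's `Iterator.grouped`, which is a narrow (per-partition) transformation. The object name `GroupWithoutShuffle`, the helper `chunk`, and the `local[1]` master are assumptions made for the example. Note that groups are formed inside each partition, so a short group can appear at every partition boundary, not only at the very end.

```scala
import org.apache.spark.sql.SparkSession

object GroupWithoutShuffle {
  // Hypothetical helper: cut one partition's rows into consecutive
  // chunks of at most n elements, preserving their order.
  def chunk(rows: Iterator[Int], n: Int): Iterator[Seq[Int]] =
    rows.grouped(n).map(_.toSeq)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session with one core, for demonstration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("group-rows-sketch")
      .getOrCreate()
    import spark.implicits._

    val myDF = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11).toDF("myColumn")

    // mapPartitions works on each partition independently, so no data
    // moves between executors; the result is a column of arrays.
    val grouped = myDF
      .as[Int]
      .mapPartitions((rows: Iterator[Int]) => chunk(rows, 3))
      .toDF("myColumn")

    grouped.show()
    spark.stop()
  }
}
```

If exact groups of X across the whole dataset are required, the data would first have to sit in a single partition or be repartitioned, and that brings back the data movement this sketch tries to avoid.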


How to upgrade a Spark Structured Streaming application

2023-02-07 Thread Yoel Benharrous
Hi all,

I would like to ask how you upgrade a running Spark Structured Streaming
application. I didn't find any built-in solution.
I found that some people write a marker file to the file system and poll
for it periodically in order to stop the running query.

Thanks,

Yoel
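
The marker-file approach mentioned above could look roughly like the following sketch. It is only an illustration under stated assumptions: the marker path, the checkpoint directory, the `rate` source, the `console` sink, the 10-second poll interval, and the names `StopOnMarker` and `shouldStop` are all hypothetical, and the marker is checked on the driver's local file system (a marker on HDFS or object storage would need the corresponding file system API instead).

```scala
import java.nio.file.{Files, Path, Paths}
import org.apache.spark.sql.SparkSession

object StopOnMarker {
  // Hypothetical helper: the query should stop once the marker file exists.
  def shouldStop(marker: Path): Boolean = Files.exists(marker)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("stop-on-marker-sketch")
      .getOrCreate()

    // Assumed locations, chosen only for the example.
    val marker = Paths.get("/tmp/stop-my-query")
    val checkpointDir = "/tmp/my-query-checkpoint"

    // Placeholder pipeline: built-in rate source to console sink.
    val query = spark.readStream
      .format("rate")
      .load()
      .writeStream
      .format("console")
      .option("checkpointLocation", checkpointDir)
      .start()

    val pollIntervalMs = 10000L
    // awaitTermination(timeoutMs) returns true once the query has
    // terminated, so the loop polls the marker between waits.
    while (!query.awaitTermination(pollIntervalMs)) {
      if (shouldStop(marker)) {
        query.stop()
      }
    }
    spark.stop()
  }
}
```

With this shape, an upgrade would amount to creating the marker file, waiting for the application to exit, removing the marker, and starting the new version against the same checkpoint location. Whether the new version can resume from the old checkpoint depends on what changed in the query, so that part needs to be checked separately.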