unsubscribe

2023-11-09 Thread Duflot Patrick
unsubscribe


Pass xmx values to SparkLauncher launched Java process

2023-11-09 Thread Deepthi Sathia Raj
Hi,

We have a use case where we are submitting multiple Spark jobs using
SparkLauncher from a Java class.
We are currently in a memory crunch on our edge node, where we see
that each Java process spawned by the launcher takes around 1 GB.
Is there a way to pass JVM parameters (such as -Xmx) to the launcher JVM
process that is launched for spark-submit?

Thanks
Deepthi
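
One possible direction, sketched here under assumptions that are not confirmed anywhere in this thread: SparkLauncher has a constructor that takes a map of environment variables for the child process, and the spark-submit command builder appends the contents of SPARK_SUBMIT_OPTS to the JVM options of the process it starts. The helper name, the master, the deploy mode, and the heap sizes below are placeholders for illustration only.

```scala
import java.util.{HashMap => JHashMap}
import org.apache.spark.launcher.SparkLauncher

object LauncherMemorySketch {
  // Hypothetical helper: configures a launcher but does not start it.
  def buildLauncher(appJar: String, mainClass: String): SparkLauncher = {
    val env = new JHashMap[String, String]()
    // Assumption: JVM options for the spark-submit process are read from
    // SPARK_SUBMIT_OPTS; 256m is an arbitrary example value.
    env.put("SPARK_SUBMIT_OPTS", "-Xmx256m")

    new SparkLauncher(env)
      .setAppResource(appJar)
      .setMainClass(mainClass)
      .setMaster("yarn")        // placeholder; the thread does not name the cluster manager
      .setDeployMode("cluster") // placeholder; the thread does not state the deploy mode
      // In client mode the driver runs inside the launched JVM, so its heap
      // would follow spark.driver.memory instead, for example:
      // .setConf(SparkLauncher.DRIVER_MEMORY, "512m")
  }
}
```

Whether this actually lowers the footprint on the edge node depends on the deploy mode: in cluster mode the launched process is only a submission client, while in client mode it hosts the driver and needs enough heap for it.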


How to group rows without a shuffle

2023-11-09 Thread Yoel Benharrous
Hi all,

I'm trying to group every X rows into a single row without shuffling the data.

I was thinking of doing something like this:
val myDF = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11).toDF("myColumn")
myDF.withColumn("myColumn", expr("sliding(myColumn, 3)"))

expected result:
myColumn
[1,2,3]
[4,5,6]
[7,8,9]
[10,11]

Any insight on how to implement this?
I saw a SlidingRDD in MLlib, but I wanted to stay at the DataFrame abstraction:
https://spark.apache.org/docs/1.3.1/api/java/org/apache/spark/mllib/rdd/SlidingRDD.html

Thanks
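
For illustration, the grouping asked about above can be sketched as a per-partition batching function. This is only a minimal sketch: `ChunkSketch` and `chunk` are hypothetical names, and the code is plain Scala with no Spark dependency, so it shows the logic that would run inside a single partition rather than a full DataFrame solution.

```scala
object ChunkSketch {
  // Groups consecutive elements of one partition's iterator into batches of
  // `size` elements. The last batch may be shorter than `size`.
  def chunk[T](rows: Iterator[T], size: Int): Iterator[Seq[T]] =
    rows.grouped(size).map(_.toSeq)

  def main(args: Array[String]): Unit = {
    // Stand-in for the rows of a single partition.
    val onePartition = Iterator(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
    chunk(onePartition, 3).foreach(batch => println(batch.mkString("[", ",", "]")))
  }
}
```

Run on its own, this prints the four batches from the expected result. On a typed Dataset it could presumably be plugged in as `myDF.as[Int].mapPartitions(it => ChunkSketch.chunk(it, 3))` with `spark.implicits._` in scope. Since mapPartitions works on each partition independently, no shuffle is needed, but batches never span partition boundaries: each partition may end with a short batch, and the exact grouping depends on how the data is partitioned.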


help needed with SPARK-45598 and SPARK-45769

2023-11-09 Thread Maksym M
Greetings,

tl;dr there must have been a regression in spark *connect*'s ability to 
retrieve data, more details in linked issues

https://issues.apache.org/jira/browse/SPARK-45598
https://issues.apache.org/jira/browse/SPARK-45769

we have projects that depend on spark connect 3.5 and we'd appreciate any 
suggestions on what could be wrong and how to resolve it.

happy to contribute!

best regards,
maksym
