Hi,
I am getting `java.lang.OutOfMemoryError: Java heap space`. I have increased both my driver memory and executor memory, but I am still facing this issue. I am using r4 instances for the driver and core nodes (16). How can we see which step is failing, or whether it is related to GC? Can we pinpoint it to a single place in the code?
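One common way to narrow this down is to enable verbose GC logging on the executors and driver and then check the per-stage metrics and executor stderr in the Spark UI. A hedged sketch (the main class and jar name are placeholders; the JVM flags assume a Java 8 runtime):

```shell
# Enable verbose GC logging; the output appears in each executor's
# stderr log, which is linked from the Executors tab of the Spark UI.
spark-submit \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --conf "spark.driver.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails" \
  --class com.example.MyApp \
  my-app.jar
```

The Stages tab of the Spark UI also shows "GC Time" per task, which usually makes it clear whether one stage is spending most of its time in garbage collection.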
I have a use case where I am using collect().toMap (grouping by a certain column, finding the count, and creating a map keyed on that column) and then using that map to enable some further calculations.
I am getting out-of-memory errors. Is there any alternative to .collect() for creating a Map-like structure?
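If it is the collect itself that runs out of memory, one alternative is to keep the counts distributed and join them back instead of materializing a driver-side map. A minimal sketch under assumptions (the column names `key`/`value` and the input data are made up; this is not your actual pipeline):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("avoid-collect").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("key", "value")

// Option 1: never bring the map to the driver; join the aggregated
// counts back to the data. broadcast() is a hint for a map-side join
// when the aggregated side is small.
val counts = df.groupBy("key").agg(count("*").as("cnt"))
val enriched = df.join(broadcast(counts), "key")

// Option 2: if driver-side lookup is genuinely needed, collect only the
// (small) aggregated result, not the raw rows, then broadcast it to
// executors instead of capturing it in a closure.
val countMap: Map[String, Long] = counts.as[(String, Long)].collect().toMap
val bcMap = spark.sparkContext.broadcast(countMap)
```

The key question is whether the aggregated result is small: if even the grouped counts are too large for the driver, only the join-based approach (Option 1) avoids the OOM.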
Hi,
I am trying to use a range-between window function, but I keep getting the error below:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Window Frame specifiedwindowframe(RangeFrame, currentrow$(), 5) must match the required frame specified

I need to check the next consecutive 5-second interval.
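This error commonly appears when `rangeBetween` is used without an `orderBy` on a single numeric column (a range frame needs numeric ordering to compute "current value + 5"). A sketch of the usual fix, casting the timestamp to epoch seconds first (the `deviceId`/`eventTime` column names and the `count` aggregate are assumptions):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// rangeBetween requires ordering by one numeric column, so cast the
// timestamp to epoch seconds; the frame then means "current row's
// second .. current row's second + 5".
val w = Window
  .partitionBy("deviceId")
  .orderBy(col("eventTime").cast("long"))
  .rangeBetween(Window.currentRow, 5)

val result = df.withColumn("cntNext5s", count("*").over(w))
```

If the window is defined with `orderBy` on a raw timestamp or string column, Spark rejects the range frame with exactly this kind of AnalysisException.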
Can we avoid multiple group bys? I have a million records, and this is a performance concern.
Below is my query. Even with window functions I suspect there is a performance hit; can you please advise if there is a better alternative?
I need to get the max number of equipments for that house for a list of
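Since the actual query is not shown, here is only a generic pattern: several aggregates over the same grouping key can usually be computed in a single `groupBy`, which shuffles the data once instead of once per aggregate (the `readings` DataFrame and `houseId`/`equipmentId` columns are hypothetical):

```scala
import org.apache.spark.sql.functions._

// One shuffle for all per-house metrics instead of one groupBy per metric.
val perHouse = readings
  .groupBy("houseId")
  .agg(
    countDistinct("equipmentId").as("numEquipments"),
    max("reading").as("maxReading")
  )
```

Whether this beats a window-function formulation depends on the query: a `groupBy` reduces the data before shuffling, while a window keeps every row, so for pure per-group maxima the single `groupBy` is usually cheaper.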
Hello, I need a design recommendation.
I need to perform a couple of calculations with minimal shuffling and better performance. I have a nested structure where a class has n students, similar to this:

{
  classId: String,
  StudentId: String,
  Score: Int,
  AreaCode: String
}
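Without knowing the exact calculations, one common way to keep shuffling minimal is to express all per-class aggregates in a single `groupBy`, so the records are shuffled once for every metric. A sketch under assumptions (the sample rows and the particular aggregates are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("per-class-aggs").getOrCreate()
import spark.implicits._

case class Record(classId: String, studentId: String, score: Int, areaCode: String)

val ds = Seq(
  Record("c1", "s1", 80, "A1"),
  Record("c1", "s2", 90, "A2"),
  Record("c2", "s3", 70, "A1")
).toDS()

// All per-class metrics in one aggregation: one shuffle total.
val perClass = ds
  .groupBy("classId")
  .agg(
    avg("score").as("avgScore"),
    max("score").as("maxScore"),
    countDistinct("areaCode").as("numAreas")
  )
```

If some calculations need per-student detail alongside class-level values, a window partitioned by `classId` can reuse the same partitioning, which also avoids extra shuffles.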