No, because you shuffle again each time (each window partitions by a
different set of columns).
Instead, could you choose a single window (w3, which has more columns and is
therefore the most fine-grained) and then filter out records to achieve the
same result?
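
One way to read this suggestion: aggregate once at the finest key
(col1, col2) and derive the coarser per-col1 sum from those partials. Below is
a minimal plain-Python sketch of that arithmetic only. It is not Spark code and
says nothing about the actual shuffle behaviour; the sample rows and the
function name nested_sums are made up for illustration, and the column names
are taken from the original post.

```python
from collections import defaultdict

# Made-up sample rows in the shape (col1, col2, val).
rows = [
    ("a", "x", 1),
    ("a", "x", 2),
    ("a", "y", 3),
    ("b", "x", 4),
]


def nested_sums(rows):
    """Return each row extended with sum1 (per col1) and sum2 (per col1, col2)."""
    # Single pass over the data: partial sums at the finest key (col1, col2).
    sum2 = defaultdict(int)
    for col1, col2, val in rows:
        sum2[(col1, col2)] += val

    # Roll the fine-grained partials up to the coarser key (col1).
    # This touches only the partials, not the raw rows.
    sum1 = defaultdict(int)
    for (col1, _), partial in sum2.items():
        sum1[col1] += partial

    # Attach both sums to every original row, as a window function would.
    return [(c1, c2, v, sum1[c1], sum2[(c1, c2)]) for c1, c2, v in rows]


result = nested_sums(rows)
```

This only works for aggregates that can be combined from partials, such as
sum, count, min and max.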
Or instead:
df.groupBy(a, b, c).agg(sort_array(collect_list(struct(foo, bar, baz))))
and then
Hi All,
Any suggestions?
Thanks,
-Rishi
On Sun, Oct 20, 2019 at 12:56 AM Rishi Shah
wrote:
> Hi All,
>
> I have a use case where I need to perform nested windowing functions on a
> data frame to get final set of columns. Example:
>
> w1 = Window.partitionBy('col1')
> df =
Hi All,
I have a use case where I need to perform nested windowing functions on a
data frame to get final set of columns. Example:
w1 = Window.partitionBy('col1')
df = df.withColumn('sum1', F.sum('val').over(w1))
w2 = Window.partitionBy('col1', 'col2')
df = df.withColumn('sum2', F.sum('val').over(w2))
w3 =