Re: Creating a memory-efficient AggregateFunction to calculate Median

2021-12-15 Thread Pol Santamaria
with the median aggregate, I have some ideas on how to implement it for the Spark data source of Qbeast Format in an efficient way.

[Qbeast-spark] https://github.com/Qbeast-io/qbeast-spark
[Microsoft Hyperspace] https://github.com/microsoft/hyperspace

Bests,
Pol Santamaria

On Tue, Dec 14, 2021 at 4:42 AM

Re: [Spark Core]: Adding support for size based partition coalescing

2021-03-30 Thread Pol Santamaria
to judge the complexity or how feasible it is though.

Bests,
Pol Santamaria

On Tue, Mar 30, 2021 at 1:30 PM mhawes wrote:
> Hi all, Sending this first before creating a jira issue in an effort to start
> a discussion :)
>
> Problem:
>
> We have a situation where we end wit