Hi!
Nice technical post.
If I'm not mistaken, we use a similar trick in a few other functions as well
(like count-distinct-up-to).
I think there's a redundant sort in the window function example.
Maybe a graph would show the data better than the table.

Ohad.

On Wed, Jan 21, 2026, 13:33 Eyal Allweil <[email protected]> wrote:

> Alon, thank you for your comment, I've added it to the draft. I also added
> a diagram of how the code runs - the latest version is in the same GitHub
> link here:
> https://github.com/eyala/datafu/blob/blog/site/source/blog/publish-date-here-collectNumberOrderedElements.markdown
>
> Question - do you think this sentence is good for the final paragraph?
>
> Even if it isn't useful to you today, the basic technique - using
> DeclarativeAggregate to allow Spark to optimize more effectively - may be.
> If you've done something similar, or created any useful general-purpose API
> in Spark, don't hesitate to contribute it to DataFu! We are always glad to
> review new contributions.
>
> Eyal
>
> On 2026/01/15 09:10:13 Alon Hartanu wrote:
> > Hi everyone,
> >
> > I read the blog, it looks great.
> >
> > I think you can also mention the possible memory overflow that this
> > function can help prevent when using collect_list on large data.
> >
> > I have a use case for this function in one of my applications, I'll try it
> > out in a few weeks and let you know how it goes.
> >
> > Thanks, Alon
> >
>
