I'm not thinking of it as a method that applies filtering multiple times. Instead, it would act as a semi-action rather than just a transformation. Suppose we had something like mapPartitions that accepts multiple lambdas, each of which collects the rows for its own dataset (or something like that). Is that possible?
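To make the idea concrete, here is a rough sketch in plain Python. It is not Spark code: `split_in_one_pass`, the bucket names, and the sample rows are all made up for illustration. It only shows the shape of what I mean, where the source is iterated once and each lambda collects the rows for its own output.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def split_in_one_pass(
    rows: Iterable[T],
    predicates: Dict[str, Callable[[T], bool]],
) -> Dict[str, List[T]]:
    """Hypothetical helper: route each row to every bucket whose predicate
    accepts it, reading the input exactly once."""
    buckets: Dict[str, List[T]] = {name: [] for name in predicates}
    for row in rows:  # single iteration over the source
        for name, accepts in predicates.items():
            if accepts(row):
                buckets[name].append(row)
    return buckets


# Illustrative data; iter(...) makes the input single-use, so a second
# pass over it would be impossible.
rows = [
    {"id": 1, "amount": 5},
    {"id": 2, "amount": 50},
    {"id": 3, "amount": 500},
]
result = split_in_one_pass(
    iter(rows),
    {
        "small": lambda r: r["amount"] < 10,
        "large": lambda r: r["amount"] >= 100,
    },
)
```

Note that all outputs are materialized together here, which is why I call it a semi-action: in a lazy engine, recomputing one output would mean recomputing all of them, as Sean pointed out.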
On Sat, Feb 2, 2019 at 5:59 PM Sean Owen <sro...@gmail.com> wrote:

> I think the problem is that can't produce multiple Datasets from one
> source in one operation - consider that reproducing one of them would mean
> reproducing all of them. You can write a method that would do the filtering
> multiple times but it wouldn't be faster. What do you have in mind that's
> different?
>
> On Sat, Feb 2, 2019 at 12:19 AM Moein Hosseini <moein...@gmail.com> wrote:
>
>> I've seen many application need to split dataset to multiple datasets
>> based on some conditions. As there is no method to do it in one place,
>> developers use *filter* method multiple times. I think it can be useful
>> to have method to split dataset based on condition in one iteration,
>> something like *partition* method of scala (of-course scala partition
>> just split list into two list, but something more general can be more
>> useful).
>> If you think it can be helpful, I can create Jira issue and work on it to
>> send PR.
>>
>> Best Regards
>> Moein
>>
>> --
>>
>> Moein Hosseini
>> Data Engineer
>> mobile: +98 912 468 1859
>> site: www.moein.xyz
>> email: moein...@gmail.com
>> linkedin: https://www.linkedin.com/in/moeinhm
>> twitter: https://twitter.com/moein7tl

--
Moein Hosseini