Dataset.count() returns one value directly?

On Thu, Apr 7, 2022 at 11:25 PM sam smith <qustacksm2123...@gmail.com> wrote:
> My bad, yes, of course that! Still, I don't like the
> select("count(myCol)") part in my line. Is there any replacement for that?
>
> On Fri, Apr 8, 2022 at 06:13, Sean Owen <sro...@gmail.com> wrote:
>
>> Just do an average then? Most of my point is that filtering to one group
>> and then grouping is pointless.
>>
>> On Thu, Apr 7, 2022, 11:10 PM sam smith <qustacksm2123...@gmail.com>
>> wrote:
>>
>>> What if I do avg instead of count?
>>>
>>> On Fri, Apr 8, 2022 at 05:32, Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> Wait, why groupBy at all? After the filter, only rows with myCol equal
>>>> to your target are left. There is only one group. Don't group, just count
>>>> after the filter?
>>>>
>>>> On Thu, Apr 7, 2022, 10:27 PM sam smith <qustacksm2123...@gmail.com>
>>>> wrote:
>>>>
>>>>> I want to aggregate a column by counting the number of rows having the
>>>>> value "myTargetValue" and return the result.
>>>>> I am doing it like the following, in Java:
>>>>>
>>>>>> long result =
>>>>>> dataset.filter(dataset.col("myCol").equalTo("myTargetVal")).groupBy(col("myCol")).agg(count(dataset.col("myCol"))).select("count(myCol)").first().getLong(0);
>>>>>
>>>>> Is that the right way? If not, what is a more optimized way to do that
>>>>> (always in Java)?
>>>>> Thanks for the help.
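
For reference, the two suggestions in this thread (count directly after the filter, or aggregate without a groupBy when an average is wanted) might look roughly like the sketch below. It is only an illustration under assumptions: the class name, the helper methods, the sample rows and the numeric column "myVal" are made up here, since the thread does not say which column would be averaged. The alias on the aggregate is one way to avoid referring to a generated column name such as "count(myCol)".

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;

// Hypothetical class and method names, used only for illustration.
public class CountAfterFilter {

    // Counts the rows whose `column` equals `target`.
    // Dataset.count() returns a long directly, so no groupBy/agg/select is needed.
    static long countMatching(Dataset<Row> dataset, String column, String target) {
        return dataset.filter(col(column).equalTo(target)).count();
    }

    // Averages `valueColumn` over the rows whose `column` equals `target`.
    // agg() without groupBy aggregates over the whole (filtered) dataset.
    // Note: if no row matches, avg is null and getDouble(0) will throw.
    static double averageWhere(Dataset<Row> dataset, String column,
                               String target, String valueColumn) {
        return dataset.filter(col(column).equalTo(target))
                .agg(avg(col(valueColumn)).alias("result"))
                .first()
                .getDouble(0);
    }

    // Builds a tiny in-memory dataset; "myVal" is an assumed numeric column.
    static Dataset<Row> sampleData(SparkSession spark) {
        StructType schema = new StructType()
                .add("myCol", DataTypes.StringType)
                .add("myVal", DataTypes.DoubleType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("myTargetVal", 1.0),
                RowFactory.create("other", 10.0),
                RowFactory.create("myTargetVal", 3.0));
        return spark.createDataFrame(rows, schema);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[*]")
                .appName("count-after-filter")
                .getOrCreate();
        Dataset<Row> dataset = sampleData(spark);
        System.out.println(countMatching(dataset, "myCol", "myTargetVal"));
        System.out.println(averageWhere(dataset, "myCol", "myTargetVal", "myVal"));
        spark.stop();
    }
}
```

With the original variable names, the count case reduces to dataset.filter(dataset.col("myCol").equalTo("myTargetVal")).count().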