Re: Is there a difference between these aggregations
Is there any difference between using agg and select to do the aggregations?

On Mon, Jul 24, 2017 at 5:08 PM, yohann jardin wrote:
> Seen directly in the code:
>
>     /**
>      * Aggregate function: returns the average of the values in a group.
>      * Alias for avg.
>      *
>      * @group agg_funcs
>      * @since 1.4.0
>      */
>     def mean(e: Column): Column = avg(e)
>
> That's the same when the argument is the column name.
>
> So no difference between mean and avg functions.
>
> --
> From: Aseem Bansal
> Sent: Monday, July 24, 2017 13:34
> To: user
> Subject: Is there a difference between these aggregations
>
> If I want to aggregate mean and subtract from my column I can do either of
> the following in Spark 2.1.0 Java API. Is there any difference between
> these? Couldn't find anything from reading the docs.
>
>     dataset.select(mean("mycol"))
>     dataset.agg(mean("mycol"))
>
>     dataset.select(avg("mycol"))
>     dataset.agg(avg("mycol"))
RE: Is there a difference between these aggregations
Seen directly in the code:

    /**
     * Aggregate function: returns the average of the values in a group.
     * Alias for avg.
     *
     * @group agg_funcs
     * @since 1.4.0
     */
    def mean(e: Column): Column = avg(e)

That's the same when the argument is the column name.

So no difference between mean and avg functions.

From: Aseem Bansal
Sent: Monday, July 24, 2017 13:34
To: user
Subject: Is there a difference between these aggregations

> If I want to aggregate mean and subtract from my column I can do either of
> the following in Spark 2.1.0 Java API. Is there any difference between
> these? Couldn't find anything from reading the docs.
>
>     dataset.select(mean("mycol"))
>     dataset.agg(mean("mycol"))
>
>     dataset.select(avg("mycol"))
>     dataset.agg(avg("mycol"))
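For readers who want to see the alias pattern in isolation, here is a minimal plain-Java sketch of it. It is not Spark code: the class `MeanAliasSketch`, both method signatures, and the sample values are hypothetical stand-ins, and it works on a plain list rather than a Column. It only illustrates why a function defined by delegating to another cannot return a different result.

```java
import java.util.List;

public class MeanAliasSketch {
    // Hypothetical stand-in for an "avg" aggregate: arithmetic mean of one group.
    static double avg(List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("cannot average an empty group");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Alias by delegation, mirroring the quoted `def mean(e: Column): Column = avg(e)`.
    static double mean(List<Double> values) {
        return avg(values);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the "mycol" column.
        List<Double> mycol = List.of(1.0, 2.0, 3.0, 6.0);
        // Both names run the same code path, so the comparison is always true.
        System.out.println(mean(mycol) == avg(mycol)); // prints true
    }
}
```

Because `mean` contains no logic of its own, any behavioural difference would have to come from `avg` itself, which is the point made above.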
Is there a difference between these aggregations
If I want to aggregate mean and subtract from my column I can do either of the following in Spark 2.1.0 Java API. Is there any difference between these? Couldn't find anything from reading the docs.

    dataset.select(mean("mycol"))
    dataset.agg(mean("mycol"))

    dataset.select(avg("mycol"))
    dataset.agg(avg("mycol"))