The OP is not calling stddev though, so I still don't see that this is the question at hand.
But while we're off on the topic: I certainly agree that stddev is mapped to the sample standard deviation in DBs, but it doesn't actually make much sense as a default. What you get back is not the standard deviation (as in, sqrt of the second central moment) of the values in the grouping or table, which I presume is what people think they're getting. You're getting an estimate of the standard deviation of a population from which the values are theoretically a random sample, but that's rarely true.

I disagree that this is the general use case, so I have always thought this was just a historical practice in RDBMSes that was actually not a good decision. Maybe that's why Hive defined it differently, but even I would prefer consistency in this regard.

On Thu, Jul 7, 2016 at 9:41 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> stddev is mapped to stddev_samp. That is the general use case or rather
> common use of standard deviation.
>
> Dr Mich Talebzadeh
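
P.S. To make the distinction above concrete, here is a minimal sketch in plain Python rather than Spark. The data values are made up for illustration, and the variable names are mine. It computes both quantities by hand and checks them against the standard library's statistics.pstdev (population) and statistics.stdev (sample), which I'd expect to correspond to what stddev_pop and stddev_samp compute in SQL.

```python
import math
import statistics

# Made-up values, purely for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(values)
mean = sum(values) / n
sum_sq_dev = sum((x - mean) ** 2 for x in values)

# Population standard deviation: sqrt of the second central moment
# of exactly these values (divide by n).
pop_sd = math.sqrt(sum_sq_dev / n)

# Sample standard deviation: estimate for a larger population from which
# these values are assumed to be a random sample (divide by n - 1).
samp_sd = math.sqrt(sum_sq_dev / (n - 1))

# Cross-check the hand-rolled formulas against the standard library.
assert math.isclose(pop_sd, statistics.pstdev(values))
assert math.isclose(samp_sd, statistics.stdev(values))

print("population:", pop_sd)
print("sample:    ", samp_sd)
```

The sample figure is always the larger of the two for n > 1, and the gap shrinks as n grows, so the choice of default matters most for small groups.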