hagerf commented on issue #26762: [SPARK-30131] add array_median function
URL: https://github.com/apache/spark/pull/26762#issuecomment-562107475
@srowen OK, I see. If it's really that restrictive, then users can use other
functions for this, even though I think it could be a popular additio
hagerf commented on issue #26762: [SPARK-30131] add array_median function
URL: https://github.com/apache/spark/pull/26762#issuecomment-562097725
Yes, of course. But we have the prefix `approx` because calculating an exact
median over a whole dataset is difficult to do efficiently. So users wh
hagerf commented on issue #26762: [SPARK-30131] add array_median function
URL: https://github.com/apache/spark/pull/26762#issuecomment-562059403
@HyukjinKwon I added some links; I think they should be relevant.
We already have `approxQuantile` but then this would be an exact function,
hagerf commented on issue #26762: [SPARK-30131] add array_median function
URL: https://github.com/apache/spark/pull/26762#issuecomment-561898965
@srowen From a quick search, I see it in AWS Redshift and in IBM DB2 as
aggregate functions. I've seen several tickets in Spark requesting medi