hagerf commented on issue #26762: [SPARK-30131] add array_median function
URL: https://github.com/apache/spark/pull/26762#issuecomment-562097725

Yes, of course. But we have the prefix `approx` because calculating an exact median over a whole dataset is difficult to do efficiently. So users who want an exact median are forced to use RDDs, or a UDF on arrays when the data fits in an array. My point was: there is no exact median or percentile functionality at all in Spark. This function would help with some subset of those use cases.
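
To make the "UDF on arrays" workaround concrete, here is a minimal sketch of an exact median over an in-memory array, in plain Python. The function name `exact_median` and its handling of nulls and empty arrays are assumptions for illustration only; they are not taken from the PR and may differ from what `array_median` would do.

```python
from typing import Optional, Sequence


def exact_median(values: Sequence[Optional[float]]) -> Optional[float]:
    """Exact median of an array that fits in memory.

    Assumptions (hypothetical, not from the PR): None entries are
    ignored, and an empty input yields None.
    """
    # Sorting is O(n log n), which is fine for a single array but is
    # the step that becomes expensive over a whole distributed dataset.
    cleaned = sorted(v for v in values if v is not None)
    n = len(cleaned)
    if n == 0:
        return None
    mid = n // 2
    if n % 2 == 1:
        # Odd length: the middle element is the median.
        return float(cleaned[mid])
    # Even length: average the two middle elements.
    return (cleaned[mid - 1] + cleaned[mid]) / 2.0
```

A function like this could be wrapped with `pyspark.sql.functions.udf` and applied to an array column, which is roughly the workaround described above. A built-in function would avoid the serialization overhead of a Python UDF.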