When I looked at this last fall, the only approach that seemed to be available was to convert my data into SchemaRDDs, register them as tables, and then compute the percentiles through Hive, using its built-in percentile UDFs that were added in 1.2.
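A minimal sketch of that approach in PySpark, assuming the Spark 1.2-era API (`HiveContext`, `inferSchema`, `registerTempTable`) and a Spark build with Hive support. The helper names `percentile_query` and `column_percentiles`, the table name `data`, and the choice of Hive's `percentile_approx` UDAF are illustrative assumptions, not something taken from the original setup. Hive's exact `percentile` UDAF only accepts integral columns, which is why the approximate variant is used here.

```python
def percentile_query(table, columns, percentiles=(0.25, 0.5, 0.75)):
    """Build a HiveQL query with one percentile_approx call per column and percentile."""
    exprs = [
        "percentile_approx({c}, {p}) AS {c}_p{n}".format(
            c=col, p=p, n=int(round(p * 100)))
        for col in columns
        for p in percentiles
    ]
    return "SELECT {} FROM {}".format(", ".join(exprs), table)


def column_percentiles(sc, rows, columns, table="data",
                       percentiles=(0.25, 0.5, 0.75)):
    """Compute per-column percentiles through Hive.

    sc      -- an existing SparkContext
    rows    -- an RDD of pyspark.sql.Row objects with numeric fields (assumed)
    columns -- names of the fields to summarise
    """
    # Imported here so that percentile_query can be used without Spark installed.
    from pyspark.sql import HiveContext

    hive_ctx = HiveContext(sc)
    schema_rdd = hive_ctx.inferSchema(rows)  # RDD of Rows -> SchemaRDD
    schema_rdd.registerTempTable(table)      # make it visible to SQL queries
    query = percentile_query(table, columns, percentiles)
    return hive_ctx.sql(query).collect()     # a single Row holding every percentile
```

Called as `column_percentiles(sc, rows, ["a", "b"])`, this would run one query over the registered table and return a single row with fields such as `a_p25`, `a_p50`, and `b_p75`.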

Curt

________________________________
From: kundan kumar <iitr.kun...@gmail.com>
Sent: Wednesday, January 28, 2015 8:13 AM
To: user@spark.apache.org
Subject: Percentile Calculation

Is there any inbuilt function for calculating percentile over a dataset?

I want to calculate the percentiles for each column in my data.

Regards,
Kundan