[ https://issues.apache.org/jira/browse/SPARK-21100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Ray updated SPARK-21100:
-------------------------------
    Summary: Add summary method as alternative to describe that gives quartiles similar to Pandas  (was: describe should give quartiles similar to Pandas)

> Add summary method as alternative to describe that gives quartiles similar to Pandas
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-21100
>                 URL: https://issues.apache.org/jira/browse/SPARK-21100
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.1
>            Reporter: Andrew Ray
>            Priority: Minor
>
> The DataFrame describe method should also include quartiles (25th, 50th, and 75th percentiles) like Pandas.
> Example pandas output:
> {code}
> In [4]: df.describe()
> Out[4]:
>        Unnamed: 0       displ         year         cyl         cty         hwy
> count  234.000000  234.000000   234.000000  234.000000  234.000000  234.000000
> mean   117.500000    3.471795  2003.500000    5.888889   16.858974   23.440171
> std     67.694165    1.291959     4.509646    1.611534    4.255946    5.954643
> min      1.000000    1.600000  1999.000000    4.000000    9.000000   12.000000
> 25%     59.250000    2.400000  1999.000000    4.000000   14.000000   18.000000
> 50%    117.500000    3.300000  2003.500000    6.000000   17.000000   24.000000
> 75%    175.750000    4.600000  2008.000000    8.000000   19.000000   27.000000
> max    234.000000    7.000000  2008.000000    8.000000   35.000000   44.000000
> {code}
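
For readers unfamiliar with how the 25%, 50%, and 75% rows in the pandas output are derived, the following is a minimal sketch of quartiles computed by linear interpolation between neighbouring sorted values, which is the pandas default. The helper name quartiles is hypothetical and not part of pandas or Spark, and the sample data assumes the "Unnamed: 0" column is simply the row numbers 1 through 234, which is consistent with its count, min, max, and mean above. This is illustrative only and says nothing about how Spark would compute these values.

```python
import math


def quartiles(values):
    """Return the 25th, 50th, and 75th percentiles using linear interpolation.

    Hypothetical helper for illustration; not a pandas or Spark API.
    """
    data = sorted(values)
    if not data:
        raise ValueError("quartiles() needs at least one value")

    def percentile(q):
        # Fractional position of the q-th quantile within the sorted data.
        pos = q * (len(data) - 1)
        lo = math.floor(pos)
        hi = min(lo + 1, len(data) - 1)
        # Interpolate between the two neighbouring observations.
        return data[lo] + (pos - lo) * (data[hi] - data[lo])

    return percentile(0.25), percentile(0.50), percentile(0.75)


# Assumption: the "Unnamed: 0" column holds the row numbers 1..234.
row_numbers = range(1, 235)
print(quartiles(row_numbers))  # (59.25, 117.5, 175.75)
```

Under that assumption the result agrees with the 25%, 50%, and 75% entries of the "Unnamed: 0" column in the example output.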