Hi all,
I see that pivot functionality is being added to Spark DataFrames from 1.6
onward.
I am interested to know whether there is also a Spark SQL syntax available
for pivoting. For example, slide 11 of [1] shows:
pandas (Python) - pivot_table(df, values='D', index=['A', 'B'], columns=['C'], aggfunc=np.sum)
reshape2 (R) - dcast(df, A + B ~ C, sum)
Oracle 11g - SELECT * FROM df PIVOT (sum(D) FOR C IN ('small', 'large')) p
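
To make the expected result concrete, here is a minimal plain-Python sketch of the reshaping I have in mind: group by A and B, turn each distinct value of C into its own column, and sum D. The sample rows and the pivot_sum helper are made up for illustration; they are not part of Spark, pandas, or reshape2.

```python
from collections import defaultdict

# Hypothetical sample rows using the column names A, B, C, D from the examples above.
rows = [
    {"A": "foo", "B": "one", "C": "small", "D": 1},
    {"A": "foo", "B": "one", "C": "large", "D": 2},
    {"A": "foo", "B": "one", "C": "large", "D": 2},
    {"A": "bar", "B": "two", "C": "small", "D": 5},
]


def pivot_sum(rows, index, column, value):
    """Group rows by the `index` columns and sum `value` into one
    output cell per distinct value found in `column`."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key = tuple(row[name] for name in index)
        table[key][row[column]] += row[value]
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(cells) for key, cells in table.items()}


pivoted = pivot_sum(rows, index=["A", "B"], column="C", value="D")
for key, cells in sorted(pivoted.items()):
    print(key, cells)
```

Groups that have no rows for a given C value simply lack that cell here, whereas the tools above would normally show a null or NA in its place.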
Best,
[1] http://www.slideshare.net/SparkSummit/pivoting-data-with-sparksql-by-andrew-ray
--
Niranda Perera
@n1r44 <https://twitter.com/N1R44>
+94 71 554 8430
https://www.linkedin.com/in/niranda
https://pythagoreanscript.wordpress.com/