Re: take the difference between two columns of a dataframe in pyspark

2017-05-08 Thread Gourav Sengupta
Hi, convert them to a temporary table and write a SQL query; that will also work.

Regards,
Gourav

On Sun, May 7, 2017 at 2:49 AM, Zeming Yu wrote:
> Say I have the following dataframe with two numeric columns A and B,
> what's the best way to add a column showing the difference

Re: take the difference between two columns of a dataframe in pyspark

2017-05-06 Thread Zeming Yu
OK, I've worked it out (col comes from pyspark.sql.functions):

df.withColumn('diff', col('A') - col('B'))

On Sun, May 7, 2017 at 11:49 AM, Zeming Yu wrote:
> Say I have the following dataframe with two numeric columns A and B,
> what's the best way to add a column showing the difference between the two
> columns?

take the difference between two columns of a dataframe in pyspark

2017-05-06 Thread Zeming Yu
Say I have the following dataframe with two numeric columns A and B, what's the best way to add a column showing the difference between the two columns?

+---------+------+
|        A|     B|
+---------+------+
|786.31999|786.12|
|   786.12|