Re: pyspark: calculating row deltas

2016-01-10 Thread Femi Anthony
Can you clarify what you mean with an actual example? For example, if your data frame looks like this:

ID  Year  Value
1   2012  100
2   2013  101
3   2014  102

What's your desired output?

Femi

On Sat, Jan 9, 2016 at 4:55 PM, Franc Carter wrote:
> Hi,

Re: pyspark: calculating row deltas

2016-01-10 Thread Franc Carter
Sure, for a dataframe that looks like this

ID  Year  Value
1   2012  100
1   2013  102
1   2014  106
2   2012  110
2   2013  118
2   2014  128

I'd like to get back

ID  Year  Value
1   2013  2
1   2014  4
2   2013  8
2   2014  10

i.e. the Value for an ID,Year combination is the Value for the
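To pin down the arithmetic in the example above, here is a minimal plain-Python sketch (no Spark involved). It assumes the intended rule is "Value for a given ID and Year minus the Value for the same ID in the previous Year", which is what the sample output suggests; the function name `row_deltas` and the tuple layout are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the message above, as (ID, Year, Value) tuples.
rows = [
    (1, 2012, 100), (1, 2013, 102), (1, 2014, 106),
    (2, 2012, 110), (2, 2013, 118), (2, 2014, 128),
]

def row_deltas(rows):
    """Return (ID, Year, Value - previous year's Value) for each ID."""
    out = []
    # Sort by ID, then Year, so consecutive rows within an ID are adjacent.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for _id, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        for prev, cur in zip(group, group[1:]):
            # Only emit a delta when the previous row really is Year - 1.
            if cur[1] == prev[1] + 1:
                out.append((cur[0], cur[1], cur[2] - prev[2]))
    return out

deltas = row_deltas(rows)
```

The first year of each ID has no predecessor, so it does not appear in the result, matching the desired output listed above.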

Re: pyspark: calculating row deltas

2016-01-10 Thread Blaž Šnuderl
This can be done using spark.sql and window functions. Take a look at
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html

On Sun, Jan 10, 2016 at 11:07 AM, Franc Carter wrote:
> Sure, for a dataframe that looks like this
>
> ID Year
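The window-function suggestion in this reply could be sketched roughly as follows. This is not code from the thread: the function name `value_deltas` and the intermediate column name `Delta` are hypothetical, and the sketch assumes a Spark version where `lag` and `Window` are available through the DataFrame API. Note that `lag` takes the previous row in Year order within each ID, so it only equals "Year - 1" when no years are missing.

```python
def value_deltas(df):
    """Return per-ID deltas of Value relative to the previous Year's row.

    `df` is assumed to be a Spark DataFrame with columns ID, Year, Value.
    """
    # Imported here so the sketch can be loaded without a Spark installation.
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    # One window per ID, ordered by Year.
    w = Window.partitionBy("ID").orderBy("Year")

    # Value of the previous row in the window (null for the first Year).
    previous_value = F.lag("Value", 1).over(w)

    return (
        df.withColumn("Delta", F.col("Value") - previous_value)
          # Drop the first Year of each ID, which has no predecessor.
          .where(F.col("Delta").isNotNull())
          .select("ID", "Year", F.col("Delta").alias("Value"))
    )
```

Compared with a self-join, this keeps a single DataFrame and lets Spark handle the pairing of rows inside each ID partition.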

Re: pyspark: calculating row deltas

2016-01-10 Thread Franc Carter
Thanks

cheers

On 10 January 2016 at 22:35, Blaž Šnuderl wrote:
> This can be done using spark.sql and window functions. Take a look at
> https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
>
> On Sun, Jan 10, 2016 at 11:07 AM, Franc Carter

pyspark: calculating row deltas

2016-01-09 Thread Franc Carter
Hi,

I have a DataFrame with the columns ID,Year,Value

I'd like to create a new Column that is Value2-Value1 where the corresponding Year2=Year-1

At the moment I am creating a new DataFrame with renamed columns and doing DF.join(DF2, . . . .)

This looks cumbersome to me, is there
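For comparison, the join-based approach described in this message might look something like the sketch below. The original join condition is elided in the message, so everything here is an assumption: the function name `value_deltas_by_join` and the renamed columns `prev_ID`, `next_Year` and `prev_Value` are hypothetical, and the pairing rule assumed is "same ID, previous Year".

```python
def value_deltas_by_join(df):
    """Self-join sketch: pair each row with the same ID's row from Year - 1.

    `df` is assumed to be a Spark DataFrame with columns ID, Year, Value.
    """
    # Imported here so the sketch can be loaded without a Spark installation.
    from pyspark.sql import functions as F

    # Second copy of the data with renamed columns; Year is shifted forward
    # by one so it lines up with the following year's row.
    prev = df.select(
        F.col("ID").alias("prev_ID"),
        (F.col("Year") + 1).alias("next_Year"),
        F.col("Value").alias("prev_Value"),
    )

    # Inner join: rows without a previous Year are dropped.
    joined = df.join(
        prev,
        (F.col("ID") == F.col("prev_ID")) & (F.col("Year") == F.col("next_Year")),
    )

    return joined.select(
        "ID",
        "Year",
        (F.col("Value") - F.col("prev_Value")).alias("Value"),
    )
```

Unlike an approach based on row order, this join matches strictly on Year - 1, so an ID with a missing year produces no delta for the year after the gap.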