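This can be done with Spark SQL window functions, without the self-join. Partition by ID, order by Year, and use lag to fetch the previous row's Value; the difference is then a simple subtraction. Take a look at https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html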
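
A minimal sketch of the window-function approach, using the sample data from the thread below. The query is plain SQL built on LAG. So that it runs without a cluster, it is executed here through Python's built-in sqlite3 module, which is a stand-in for Spark and not part of it. The table name df and the helper year_over_year are hypothetical, and the query assumes the years are consecutive within each ID.

```python
import sqlite3

# Sample rows copied from the example in this thread: (ID, Year, Value).
ROWS = [
    (1, 2012, 100), (1, 2013, 102), (1, 2014, 106),
    (2, 2012, 110), (2, 2013, 118), (2, 2014, 128),
]

# LAG(Value) returns the Value of the previous row within the same ID,
# ordered by Year. The first year of each ID has no previous row, so its
# difference is NULL and the outer WHERE clause drops it.
# Assumption: years are consecutive within each ID; a gap would still be
# differenced against the nearest earlier year.
QUERY = """
SELECT ID, Year, diff AS Value
FROM (
    SELECT ID, Year,
           Value - LAG(Value) OVER (PARTITION BY ID ORDER BY Year) AS diff
    FROM df
) AS t
WHERE diff IS NOT NULL
ORDER BY ID, Year
"""


def year_over_year(rows):
    """Load rows into an in-memory table named df and run QUERY on it.

    Hypothetical helper for illustration. SQLite is used only as a
    stand-in engine; window functions need SQLite 3.25 or newer.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE df (ID INTEGER, Year INTEGER, Value INTEGER)")
        conn.executemany("INSERT INTO df VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in year_over_year(ROWS):
        print(row)
```

With Spark, the same query string should work through the SQL interface once the DataFrame is registered as a temporary table named df. That has not been run against a cluster here, so treat it as a starting point.
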

On Sun, Jan 10, 2016 at 11:07 AM, Franc Carter <franc.car...@gmail.com> wrote:

> Sure, for a dataframe that looks like this
>
> ID  Year  Value
>  1  2012    100
>  1  2013    102
>  1  2014    106
>  2  2012    110
>  2  2013    118
>  2  2014    128
>
> I'd like to get back
>
> ID  Year  Value
>  1  2013      2
>  1  2014      4
>  2  2013      8
>  2  2014     10
>
> i.e. the Value for an ID,Year combination is the Value for the ID,Year
> minus the Value for the ID,Year-1
>
> thanks
>
> On 10 January 2016 at 20:51, Femi Anthony <femib...@gmail.com> wrote:
>
>> Can you clarify what you mean with an actual example?
>>
>> For example, if your data frame looks like this:
>>
>> ID  Year  Value
>>  1  2012    100
>>  2  2013    101
>>  3  2014    102
>>
>> What's your desired output?
>>
>> Femi
>>
>> On Sat, Jan 9, 2016 at 4:55 PM, Franc Carter <franc.car...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a DataFrame with the columns
>>>
>>> ID,Year,Value
>>>
>>> I'd like to create a new Column that is Value2-Value1 where the
>>> corresponding Year2=Year-1
>>>
>>> At the moment I am creating a new DataFrame with renamed columns and
>>> doing
>>>
>>> DF.join(DF2, . . . .)
>>>
>>> This looks cumbersome to me, is there a better way?
>>>
>>> thanks
>>>
>>> --
>>> Franc
>>
>> --
>> http://www.femibyte.com/twiki5/bin/view/Tech/
>> http://www.nextmatrix.com
>> "Great spirits have always encountered violent opposition from mediocre
>> minds." - Albert Einstein.
>
> --
> Franc