val test = sc.textFile(file)
  .keyBy(x => x.split("\\~")(0))                 // key by account
  .map(x => x._2.split("\\~"))
  .map(x => (x(0), x(1), x(2)))                  // account, date, amount
  .map { case (account, datevalue, amount) => ((account, datevalue), amount.toDouble) }
  .toArray                                       // collects to the driver; collect() is the non-deprecated spelling
  .sliding(2, 1)                                 // windows of two consecutive rows
  .map(x => (x(0)._1,                            // (account, date) of the first row in the window
    x(1)._2,                                     // amount of the second row
    x.foldLeft(0.0)(_ + _._2 / x.size),          // window average
    x.foldLeft(0.0)(_ + _._2)))                  // window sum
  .foreach(println)
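A per-key variant (a sketch, not from the thread): the version above pulls everything to the driver with toArray, so the windowing itself is not distributed, and a second account in the file would bleed into the first account's windows. Grouping by account first keeps the sliding window inside each key; this assumes the same ~-delimited layout and that one account's rows fit in memory on a single executor:

  // Sketch: per-account two-row moving average and window sum.
  // Assumes the input format account~date~amount used in the thread.
  val perAccount = sc.textFile(file)
    .map(_.split("\\~"))
    .map(a => (a(0), (a(1), a(2).toDouble)))     // (account, (date, amount))
    .groupByKey()
    .flatMapValues { rows =>
      rows.toList.sortBy(_._1)                   // order by date within the account
        .sliding(2, 1)
        .map(w => (w.last._1, w.map(_._2).sum / w.size, w.map(_._2).sum))
    }
  perAccount.foreach(println)                    // (account,(date,movingAvg,windowSum))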
On Sun, Jul 31, 2016 at 12:15 PM, sri hari kali charan Tummala <kali.tumm...@gmail.com> wrote:

> Hi All,
>
> I already solved it using DataFrames and Spark SQL. I was wondering how to
> solve it with a plain Scala RDD; I just got the answer and need to check my
> results against the Spark SQL version. Thanks, all, for your time.
>
> I am trying to compute a moving average with a Scala RDD, grouped by key.
>
> Input:
>
> -987~20150728~100
> -987~20150729~50
> -987~20150730~-100
> -987~20150804~200
> -987~20150807~-300
> -987~20150916~100
>
> val test = sc.textFile(file)
>   .keyBy(x => x.split("\\~")(0))
>   .map(x => x._2.split("\\~"))
>   .map(x => (x(0), x(1), x(2)))
>   .map { case (account, datevalue, amount) => ((account, datevalue), amount.toDouble) }
>   .toArray
>   .sliding(2, 1)
>   .map(x => (x(0)._1, x(1)._2, x.foldLeft(0.0)(_ + _._2 / x.size), x.foldLeft(0.0)(_ + _._2)))
>   .foreach(println)
>
> Output (accountkey, date, balance_of_account, daily_average, sum_based_on_window):
>
> ((-987,20150728),50.0,75.0,150.0)
> ((-987,20150729),-100.0,-25.0,-50.0)
> ((-987,20150730),200.0,50.0,100.0)
> ((-987,20150804),-300.0,-50.0,-100.0)
> ((-987,20150807),100.0,-100.0,-200.0)
>
> The book below is written for Hadoop MapReduce; it has a solution for the
> moving average, but it's in Java:
>
> https://www.safaribooksonline.com/library/view/data-algorithms/9781491906170/ch06.html
>
> SQL:
>
> SELECT DATE, balance,
>        SUM(balance) OVER (ORDER BY DATE ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) daily_balance
> FROM table
>
> Thanks
> Sri
>
> On Sun, Jul 31, 2016 at 11:54 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Check also this:
>> https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> On 31 July 2016 at 19:49, sri hari kali charan Tummala <kali.tumm...@gmail.com> wrote:
>>
>>> Tuple
>>>
>>> [Lscala.Tuple2;@65e4cb84
>>>
>>> On Sun, Jul 31, 2016 at 1:00 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>
>>>> Hi,
>>>>
>>>> What's the result type of sliding(2,1)?
>>>>
>>>> Jacek
>>>>
>>>> On Sun, Jul 31, 2016 at 9:23 AM, sri hari kali charan Tummala
>>>> <kali.tumm...@gmail.com> wrote:
>>>> > Tried this, no luck. What is the non-empty iterator here?
>>>> >
>>>> > Output:
>>>> >
>>>> > (-987,non-empty iterator)
>>>> > (-987,non-empty iterator)
>>>> > (-987,non-empty iterator)
>>>> > (-987,non-empty iterator)
>>>> > (-987,non-empty iterator)
>>>> >
>>>> > sc.textFile(file).keyBy(x => x.split("\\~")(0))
>>>> >   .map(x => x._2.split("\\~"))
>>>> >   .map(x => (x(0), x(2)))
>>>> >   .map { case (key, value) =>
>>>> >     (key, value.toArray.toSeq.sliding(2, 1).map(x => x.sum / x.size)) }
>>>> >   .foreach(println)
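The "non-empty iterator" in that output is the clue: sliding(2, 1) returns a plain Scala Iterator, and (in the pre-2.13 Scala of that era) an Iterator's toString reports only whether it has elements, never its contents. A minimal standalone check, illustrative only, using the thread's sample amounts:

  // sliding returns an Iterator; println shows its state, not its values
  val xs = Seq(100.0, 50.0, -100.0)
  println(xs.sliding(2, 1))                                 // non-empty iterator
  println(xs.sliding(2, 1).map(w => w.sum / w.size).toList) // List(75.0, -25.0)

Forcing the iterator with toList (or mkString) before printing is enough to see the window averages.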
>>>> > On Sun, Jul 31, 2016 at 12:03 AM, sri hari kali charan Tummala
>>>> > <kali.tumm...@gmail.com> wrote:
>>>> >>
>>>> >> Hi All,
>>>> >>
>>>> >> I managed to write it using the sliding function, but can I get the key
>>>> >> as well in my output?
>>>> >>
>>>> >> sc.textFile(file).keyBy(x => x.split("\\~")(0))
>>>> >>   .map(x => x._2.split("\\~"))
>>>> >>   .map(x => x(2).toDouble)
>>>> >>   .toArray().sliding(2, 1)
>>>> >>   .map(x => (x, x.size)).foreach(println)
>>>> >>
>>>> >> At the moment my output is:
>>>> >>
>>>> >> 75.0
>>>> >> -25.0
>>>> >> 50.0
>>>> >> -50.0
>>>> >> -100.0
>>>> >>
>>>> >> I want it with the key; how do I get the moving-average output based on key?
>>>> >>
>>>> >> 987,75.0
>>>> >> 987,-25
>>>> >> 987,50.0
>>>> >>
>>>> >> Thanks
>>>> >> Sri
>>>> >>
>>>> >> On Sat, Jul 30, 2016 at 11:40 AM, sri hari kali charan Tummala
>>>> >> <kali.tumm...@gmail.com> wrote:
>>>> >>>
>>>> >>> Just for knowledge, wondering how to write it in Scala or a Spark RDD.
>>>> >>>
>>>> >>> Thanks
>>>> >>> Sri
>>>> >>>
>>>> >>> On Sat, Jul 30, 2016 at 11:24 AM, Jacek Laskowski <ja...@japila.pl> wrote:
>>>> >>>>
>>>> >>>> Why?
>>>> >>>>
>>>> >>>> Jacek
>>>> >>>>
>>>> >>>> On Sat, Jul 30, 2016 at 4:42 AM, kali.tumm...@gmail.com
>>>> >>>> <kali.tumm...@gmail.com> wrote:
>>>> >>>> > Hi All,
>>>> >>>> >
>>>> >>>> > I managed to write the business requirement in Spark SQL and Hive. I am
>>>> >>>> > still learning Scala; how would the SQL below be written using a Spark
>>>> >>>> > RDD, not Spark DataFrames?
>>>> >>>> >
>>>> >>>> > SELECT DATE, balance,
>>>> >>>> >        SUM(balance) OVER (ORDER BY DATE ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) daily_balance
>>>> >>>> > FROM table

--
Thanks & Regards
Sri Tummala
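As a closing reference, the original running-balance SQL (SUM ... ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) can also be sketched on an RDD without DataFrames. This is an illustrative variant using scanLeft, with the same per-key in-memory assumption as the grouped example above:

  // Sketch of the SQL running balance, computed per account.
  val runningBalance = sc.textFile(file)
    .map(_.split("\\~"))
    .map(a => (a(0), (a(1), a(2).toDouble)))          // (account, (date, balance))
    .groupByKey()
    .flatMapValues { rows =>
      val byDate = rows.toList.sortBy(_._1)           // ORDER BY date
      byDate.zip(byDate.scanLeft(0.0)(_ + _._2).tail) // pair each row with its running sum
        .map { case ((date, balance), cum) => (date, balance, cum) }
    }
  runningBalance.foreach(println)                     // (account,(date,balance,daily_balance))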