Hi All,

Any inputs on the actual problem statement?

Regards,
Satish
On Fri, Aug 21, 2015 at 5:57 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> Yong, thanks for your reply.
>
> I tried spark-shell -i <script-file> and it works fine for me. I am not sure
> how that differs from
> dse spark --master local --jars postgresql-9.4-1201.jar -i <ScriptFile>
>
> On Fri, Aug 21, 2015 at 7:01 PM, java8964 <java8...@hotmail.com> wrote:
>
>> I believe "spark-shell -i scriptFile" is there. We also use it, at least
>> in Spark 1.3.1.
>>
>> "dse spark" just wraps the "spark-shell" command; underneath it is simply
>> invoking "spark-shell".
>>
>> I don't know too much about the original problem though.
>>
>> Yong
>>
>> ------------------------------
>> Date: Fri, 21 Aug 2015 18:19:49 +0800
>> Subject: Re: Transformation not happening for reduceByKey or GroupByKey
>> From: zjf...@gmail.com
>> To: jsatishchan...@gmail.com
>> CC: robin.e...@xense.co.uk; user@spark.apache.org
>>
>> Hi Satish,
>>
>> I don't see where Spark supports "-i", so I suspect it is provided by DSE.
>> In that case, it might be a bug in DSE.
>>
>> On Fri, Aug 21, 2015 at 6:02 PM, satish chandra j
>> <jsatishchan...@gmail.com> wrote:
>>
>> Hi Robin,
>> Yes, it is DSE, but the issue is related to Spark only.
>>
>> Regards,
>> Satish Chandra
>>
>> On Fri, Aug 21, 2015 at 3:06 PM, Robin East <robin.e...@xense.co.uk>
>> wrote:
>>
>> Not sure, never used dse - it’s part of DataStax Enterprise, right?
>> On 21 Aug 2015, at 10:07, satish chandra j <jsatishchan...@gmail.com>
>> wrote:
>>
>> Hi Robin,
>> Yes, the piece of code mentioned below works fine in the Spark shell, but
>> when the same code is placed in a script file and executed with
>> -i <file name>, it creates an empty RDD:
>>
>> scala> val pairs = sc.makeRDD(Seq((0,1),(0,2),(1,20),(1,30),(2,40)))
>> pairs: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[77]
>> at makeRDD at <console>:28
>>
>> scala> pairs.reduceByKey((x,y) => x + y).collect
>> res43: Array[(Int, Int)] = Array((0,3), (1,50), (2,40))
>>
>> Command:
>>
>> dse spark --master local --jars postgresql-9.4-1201.jar -i <ScriptFile>
>>
>> I understand I am missing something here, due to which my final RDD does
>> not have the required output.
>>
>> Regards,
>> Satish Chandra
>>
>> On Thu, Aug 20, 2015 at 8:23 PM, Robin East <robin.e...@xense.co.uk>
>> wrote:
>>
>> This works for me:
>>
>> scala> val pairs = sc.makeRDD(Seq((0,1),(0,2),(1,20),(1,30),(2,40)))
>> pairs: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[77]
>> at makeRDD at <console>:28
>>
>> scala> pairs.reduceByKey((x,y) => x + y).collect
>> res43: Array[(Int, Int)] = Array((0,3), (1,50), (2,40))
>>
>> On 20 Aug 2015, at 11:05, satish chandra j <jsatishchan...@gmail.com>
>> wrote:
>>
>> Hi All,
>> I have data in an RDD as mentioned below:
>>
>> RDD : Array[(Int, Int)] = Array((0,1), (0,2), (1,20), (1,30), (2,40))
>>
>> I am expecting the output Array((0,3), (1,50), (2,40)), i.e. just a sum
>> over the values for each key.
>>
>> Code:
>> RDD.reduceByKey((x,y) => x+y)
>> RDD.take(3)
>>
>> Result in console:
>> RDD: org.apache.spark.rdd.RDD[(Int,Int)] = ShuffledRDD[1] at reduceByKey
>> at <console>:73
>> res: Array[(Int,Int)] = Array()
>>
>> Command as mentioned:
>>
>> dse spark --master local --jars postgresql-9.4-1201.jar -i <ScriptFile>
>>
>> Please let me know what is missing in my code, as my resultant Array is
>> empty.
>>
>> Regards,
>> Satish
>> --
>> Best Regards
>>
>> Jeff Zhang
>
> --
> Best Regards
>
> Jeff Zhang
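The empty result in the original post is consistent with reduceByKey being a transformation: it returns a new RDD and leaves the original unchanged, so a script must capture the returned RDD and run an action on it. A minimal sketch of such a script file, assuming it is launched with spark-shell -i so that sc is already in scope (the sums name, the sortBy, and the println are mine, added for illustration):

```scala
// sumByKey.scala -- sketch intended for: spark-shell -i sumByKey.scala
// Assumes `sc` (the SparkContext) is provided by the shell session.
val pairs = sc.makeRDD(Seq((0, 1), (0, 2), (1, 20), (1, 30), (2, 40)))

// reduceByKey is a transformation: it returns a *new* RDD and does not
// modify `pairs`, so the result must be captured in its own val.
val sums = pairs.reduceByKey(_ + _)

// collect is the action that actually triggers the computation; sortBy
// makes the printed order deterministic.
println(sums.collect().sortBy(_._1).mkString(", "))
// prints: (0,3), (1,50), (2,40)
```

If the script instead calls RDD.reduceByKey(...) on one line and then takes from the original RDD, the aggregated values are discarded, which would match the empty Array() reported in the thread.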