Connecting SparkR through YARN

2015-11-08 Thread Amit Behera
Hi All, Spark Version = 1.5.1, Hadoop Version = 2.6.0. I set up the cluster on Amazon EC2 machines (1+5). I am able to create a SparkContext object using the *init* method from *RStudio*, but I do not know how to create a SparkContext object in *yarn mode*. I found the below link for running on YARN, but in
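As a hedged sketch of what "yarn mode" looks like from Spark's Scala API (the thread itself is about SparkR, where the analogous step is passing a YARN master to `sparkR.init`): the app name below is illustrative, and the code assumes `HADOOP_CONF_DIR`/`YARN_CONF_DIR` point at the cluster's Hadoop configuration so Spark can find the ResourceManager.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object YarnClientExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("yarn-client-example") // illustrative app name
      .setMaster("yarn-client")          // driver runs locally, executors on YARN (Spark 1.x style)
    val sc = new SparkContext(conf)
    println(sc.master)                   // prints "yarn-client"
    sc.stop()
  }
}
```

This requires a live YARN cluster and the Hadoop client configuration on the driver machine; it is environment configuration rather than portable code.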

How can I read a file from HDFS in SparkR from RStudio

2015-10-08 Thread Amit Behera
Hi All, I am very new to SparkR. I am able to run the sample code from the example given at this link: http://www.r-bloggers.com/installing-and-starting-sparkr-locally-on-windows-os-and-rstudio/ Then I tried to read a file from HDFS in RStudio, but was unable to. Below is my code.
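For comparison, a hedged sketch of the same read in Spark's Scala API (the thread asks about SparkR; the namenode host, port, and file path below are placeholders, not taken from the message):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HdfsReadExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("hdfs-read").setMaster("local[*]"))
    // A fully qualified HDFS URI: hdfs://<namenode-host>:<port>/<path>
    val lines = sc.textFile("hdfs://namenode:8020/user/data/input.txt") // placeholder URI
    println(lines.count())
    sc.stop()
  }
}
```

The usual failure mode when the read "does not work" is a relative or local path being resolved against the wrong filesystem; a fully qualified `hdfs://` URI avoids that. In SparkR 1.5 the supported route is the DataFrame API (`read.df` with a fully qualified `hdfs://` path).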

Re: groupByKey is not working

2015-01-30 Thread Amit Behera
org.apache.spark.SparkContext._ On Fri, Jan 30, 2015 at 3:21:45 PM, Amit Behera amit.bd...@gmail.com wrote: hi all, my sbt file is like this: name := "Spark" version := "1.0" scalaVersion := "2.10.4" libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0" libraryDependencies += "net.sf.opencsv" % "opencsv" % "2.3"

Re: groupByKey is not working

2015-01-30 Thread Amit Behera
. It includes implicits that IntelliJ will not know about otherwise. 2015-01-30 12:44 GMT-08:00 Amit Behera amit.bd...@gmail.com: I am sorry, Sean. I am developing code in IntelliJ IDEA, so with the above dependencies I am not able to find *groupByKey* when searching with Ctrl+Space. On Sat

Re: groupByKey is not working

2015-01-30 Thread Amit Behera
*really* need to say what that means. On Fri, Jan 30, 2015 at 8:20 PM, Amit Behera amit.bd...@gmail.com wrote: hi all, my sbt file is like this: name := "Spark" version := "1.0" scalaVersion := "2.10.4" libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"

groupByKey is not working

2015-01-30 Thread Amit Behera
hi all, my sbt file is like this:

name := "Spark"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"
libraryDependencies += "net.sf.opencsv" % "opencsv" % "2.3"

*code:*

object SparkJob { def pLines(lines: Iterator[String]) = { val parser = new
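The replies in this thread point at the missing `import org.apache.spark.SparkContext._`, which in spark-core 1.1.0 brings the implicit conversion to `PairRDDFunctions` (where `groupByKey` is defined) into scope. A hedged sketch of a minimal compiling example under that assumption:

```scala
import org.apache.spark.{SparkConf, SparkContext}
// With spark-core 1.1.x this import is required for groupByKey to resolve:
// it pulls in the implicit conversion RDD[(K, V)] => PairRDDFunctions[K, V].
// (From Spark 1.3 on the implicits live on the RDD companion object, so
// newer versions compile without it, but IDEs on 1.1.x need it.)
import org.apache.spark.SparkContext._

object GroupByKeyExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("groupByKey-example").setMaster("local[*]"))
    val pairs = sc.parallelize(Seq(("k1", 1), ("k1", 2), ("k2", 3)))
    val grouped = pairs.groupByKey() // RDD[(String, Iterable[Int])]
    grouped.collect().foreach { case (k, vs) => println(s"$k -> ${vs.mkString(",")}") }
    sc.stop()
  }
}
```

With the import in place, IntelliJ's completion (Ctrl+Space) also finds `groupByKey`, since the IDE resolves the same implicits the compiler does.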

Re: unable to check whether an item is present in RDD

2014-12-28 Thread Amit Behera
method On Sun, Dec 28, 2014 at 1:54 AM, Amit Behera amit.bd...@gmail.com wrote: Hi All, I want to check whether an item is present in an RDD of Iterable[Int] using Scala, something like what we do in Java: *list.contains(item)*, and the statement returns true if the item is present, otherwise

Re: unable to check whether an item is present in RDD

2014-12-28 Thread Amit Behera
Hi Sean, I have an RDD like *theItems: org.apache.spark.rdd.RDD[Iterable[Int]]*. I did *val items = theItems.collect* //to get it as an array: items: Array[Iterable[Int]], then *val check = items.contains(item)*. Thanks, Amit. On Sun, Dec 28, 2014 at 1:58 PM, Amit Behera amit.bd...@gmail.com wrote

Re: unable to check whether an item is present in RDD

2014-12-28 Thread Amit Behera
find the element you want by doing something like: rdd.filter(i => i.contains(target)).collect(). Where target is the Int you are looking for. Nick. On Sun, Dec 28, 2014 at 3:28:45 AM, Amit Behera amit.bd...@gmail.com wrote: Hi Nicholas, The RDD contains only one Iterable[Int]. Pankaj, I

Re: unable to check whether an item is present in RDD

2014-12-28 Thread Amit Behera
Hi Sean and Nicholas, Thank you very much, the *exists* method works here :) On Sun, Dec 28, 2014 at 2:27 PM, Sean Owen so...@cloudera.com wrote: Try instead i.exists(_ == target). On Dec 28, 2014 8:46 AM, Amit Behera amit.bd...@gmail.com wrote: Hi Nicholas, I am getting the error: value contains
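Why `exists` works where `contains` fails: `Iterable[Int]` does not define `contains` (that method lives on `Seq` and `Set`), hence the "value contains is not a member" error, while `exists` takes a predicate and is defined on every collection. A small sketch of the distinction, independent of Spark:

```scala
object ExistsVsContains {
  def main(args: Array[String]): Unit = {
    val items: Iterable[Int] = List(10, 20, 30)
    val target = 20

    // items.contains(target)             // does not compile: no `contains` on Iterable
    val found = items.exists(_ == target) // predicate-based membership test
    println(found)                        // true

    // On an RDD[Iterable[Int]] the same predicate goes inside a transformation:
    //   rdd.filter(i => i.exists(_ == target)).count() > 0
  }
}
```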

unable to check whether an item is present in RDD

2014-12-27 Thread Amit Behera
Hi All, I want to check whether an item is present in an RDD of Iterable[Int] using Scala, something like what we do in Java: *list.contains(item)*, where the statement returns true if the item is present and false otherwise. Please help me find the solution. Thanks, Amit

Re: unable to do group by with 1st column

2014-12-26 Thread Amit Behera
* String call(String i1, String i2) { *return* i1 + "," + i2; } }); *From:* Tobias Pfeiffer [mailto:t...@preferred.jp] *Sent:* Friday, December 26, 2014 6:35 AM *To:* Amit Behera *Cc:* u

unable to do group by with 1st column

2014-12-25 Thread Amit Behera
Hi Users, I am reading a CSV file and my data format is like:

key1,value1
key1,value2
key1,value1
key1,value3
key2,value1
key2,value5
key2,value5
key2,value4
key1,value4
key1,value4
key3,value1
key3,value1
key3,value2

required output: key1: [value1, value2, value1, value3, value4, value4]
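A hedged sketch of the requested grouping using plain Scala collections (the Spark version has the same shape, with `sc.textFile` producing the rows and `groupByKey` doing the grouping); the sample rows are taken from the message:

```scala
object GroupByFirstColumn {
  def main(args: Array[String]): Unit = {
    val rows = List(
      "key1,value1", "key1,value2", "key1,value1", "key1,value3",
      "key2,value1", "key2,value5", "key2,value5", "key2,value4",
      "key1,value4", "key1,value4",
      "key3,value1", "key3,value1", "key3,value2")

    val grouped: Map[String, List[String]] =
      rows.map(_.split(",", 2))        // split each line into key and value once
          .map(a => (a(0), a(1)))      // build (key, value) pairs
          .groupBy(_._1)               // group on the first column, preserving order
          .mapValues(_.map(_._2))      // keep only the values in each group
          .toMap

    println(grouped("key1")) // List(value1, value2, value1, value3, value4, value4)
  }
}
```

In Spark the equivalent is roughly `sc.textFile(path).map(_.split(",", 2)).map(a => (a(0), a(1))).groupByKey()`, which yields an `RDD[(String, Iterable[String])]`.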