Re: FlatMapValues

2015-01-05 Thread Sean Owen
, Dec 31, 2014 at 3:46 PM, Sean Owen so...@cloudera.com wrote: From the clarification below, the problem is that you are calling flatMapValues, which is only available on an RDD of key-value tuples. Your map function returns a tuple in one case but a String in the other, so your RDD is a bunch
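
A minimal sketch of the typing issue described above, using plain Scala collections rather than an RDD so it runs without Spark. The sample lines, the three-field layout, and the names `PairBranchesSketch` and `toPair` are hypothetical; the thread's own code checks for 11 fields. When one branch returns a tuple and the other a String, the element type widens to a common supertype (reported as RDD[Serializable] elsewhere in this thread), so pair-only methods are not available. Returning an Option from every branch and flattening keeps the element type a pair:

```scala
object PairBranchesSketch {
  // Every branch yields the same type: Some(pair) for a usable row, None otherwise.
  def toPair(fields: Array[String]): Option[(String, String)] =
    if (fields.length >= 3 && !fields(0).contains("VAERS_ID"))
      Some((fields(0), fields(1) + "\t" + fields(2)))
    else
      None

  // Hypothetical sample input: one data row, one header row, one short row.
  val lines: List[String] = List("100,fever,rash", "VAERS_ID,SYM1,SYM2", "short")

  // flatMap drops the None cases, so the element type is exactly (String, String).
  val pairs: List[(String, String)] =
    lines.map(_.split(',')).flatMap(fields => toPair(fields))
}
```

The same pattern should carry over to RDD.flatMap, since the typing rule is a property of the Scala compiler and not of Spark.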

Re: FlatMapValues

2015-01-05 Thread Sanjay Subramanian
cool let me adapt that. thanks a ton regards sanjay From: Sean Owen so...@cloudera.com To: Sanjay Subramanian sanjaysubraman...@yahoo.com Cc: user@spark.apache.org user@spark.apache.org Sent: Monday, January 5, 2015 3:19 AM Subject: Re: FlatMapValues For the record, the solution I

Re: FlatMapValues

2015-01-02 Thread Sanjay Subramanian
PM Subject: Re: FlatMapValues thanks let me try that out From: Hitesh Khamesra hiteshk...@gmail.com To: Sanjay Subramanian sanjaysubraman...@yahoo.com Cc: Kapil Malik kma...@adobe.com; Sean Owen so...@cloudera.com; user@spark.apache.org user@spark.apache.org Sent: Thursday, January

Re: FlatMapValues

2015-01-01 Thread Hitesh Khamesra
, Canadian sidefx data and vaccines sidefx data. @Kapil, sorry but flatMapValues is being reported as undefined. To give u a complete picture of the code (it's inside IntelliJ but that's only for testing; the real code runs on spark-shell on my cluster) https://github.com/sanjaysubramanian

Re: FlatMapValues

2015-01-01 Thread Sanjay Subramanian
: FlatMapValues How about this: apply flatMap per line, and in that function parse each line and return all the columns as per your need. On Wed, Dec 31, 2014 at 10:16 AM, Sanjay Subramanian sanjaysubraman...@yahoo.com.invalid wrote: hey guys Some of u may care :-) but this is just give u

FlatMapValues

2014-12-31 Thread Sanjay Subramanian
is as follows but the flatMapValues does not work even after I have created the pair RDD. reacRdd.map(line => line.split(',')).map(fields => { if (fields.length >= 11 && !fields(0).contains("VAERS_ID")) { (fields(0),(fields(1)+"\t"+fields

Re: FlatMapValues

2014-12-31 Thread Raghavendra Pandey
but the flatMapValues does not work even after I have created the pair RDD. reacRdd.map(line => line.split(',')).map(fields => { if (fields.length >= 11 && !fields(0).contains("VAERS_ID")) { (fields(0),(fields(1)+"\t"

RE: FlatMapValues

2014-12-31 Thread Kapil Malik
)) { (fields(0),(fields(1)+"\t"+fields(3)+"\t"+fields(5)+"\t"+fields(7)+"\t"+fields(9))) // Returns a pair (String, String), good } else { // Returns a String, bad } }) // RDD[Serializable] – PROBLEM I was not even able to apply flatMapValues since the filtered RDD passed to it is RDD

Re: FlatMapValues

2014-12-31 Thread Fernando O.
))) // Returns a pair (String, String), good } else { // Returns a String, bad } }) // RDD[Serializable] – PROBLEM I was not even able to apply flatMapValues since the filtered RDD passed to it is RDD[Serializable] and not a pair RDD. I am surprised how your code compiled

Re: FlatMapValues

2014-12-31 Thread Sanjay Subramanian
my code DID NOT compile saying that flatMapValues is not defined. In fact when I used your snippet, the code still does not compile. Error:(36, 57) value flatMapValues is not a member of org.apache.spark.rdd.RDD[(String, String)] }).filter(pair => pair._1.length() > 0).flatMapValues

RE: FlatMapValues

2014-12-31 Thread Kapil Malik
Hi Sanjay, Oh yes .. on flatMapValues, it's defined in PairRDDFunctions, and you need to import org.apache.spark.SparkContext._ to use it (http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions) @Sean, yes indeed flatMap / flatMapValues both can
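
A hedged sketch of the import in context, assuming Spark is on the classpath. As far as I know, on Spark releases before 1.3 the implicit conversion from RDD[(K, V)] to PairRDDFunctions lives in the SparkContext companion object, so the wildcard import below is what makes flatMapValues resolve; later releases should find the conversion without it. The names `FlatMapValuesSketch` and `explode`, and the choice to split each value on tabs, are illustrative assumptions and not taken from the thread:

```scala
import org.apache.spark.rdd.RDD
// Brings the implicit RDD[(K, V)] => PairRDDFunctions conversion into scope
// on older Spark releases; it should be unnecessary on newer ones.
import org.apache.spark.SparkContext._

object FlatMapValuesSketch {
  // Emits one (key, piece) record for each tab-separated piece of the value.
  def explode(pairs: RDD[(String, String)]): RDD[(String, String)] =
    pairs.flatMapValues(value => value.split('\t').toList)
}
```

If the compiler still reports that flatMapValues is not a member of RDD[(String, String)], a missing import is a likely cause, since the element type is already a pair.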

Re: Type problem in Java when using flatMapValues

2014-10-03 Thread Robin Keunen
), [from, to, value]). ean and key are String; from and to are DateTime; value is a Double. JavaPairRDDStringString, ListSerializable eanKeyTsParameters = javaRDD.mapToPair( ... ); Then I try to do flatMapValues to apply the GenerateTimeSeries Function, it takes the from, to and values to generate

Type problem in Java when using flatMapValues

2014-10-02 Thread Robin Keunen
is a Double. JavaPairRDDStringString, ListSerializable eanKeyTsParameters = javaRDD.mapToPair( ... ); Then I try to do flatMapValues to apply the GenerateTimeSeries Function, it takes the from, to and values to generate a ListLongDouble. Here is the error I get when compiling: error