Just a thought... Are you trying to use the RDD as a Map?
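
If what you actually want at the end is a local Scala Map on the driver, the groupByKey() route Doris describes below gets you most of the way there. A rough, untested sketch (assuming a spark-shell where sc is already defined, the same comma-separated file as below, and pathToFile as a placeholder for your HDFS path):

    import org.apache.spark.rdd.RDD

    // Parse each "key, value" line into a pair; split on the first comma only.
    // pathToFile is a placeholder for your actual HDFS path.
    val pairs: RDD[(String, String)] =
      sc.textFile(pathToFile).map { line =>
        val Array(key, value) = line.split(",", 2)
        (key.trim, value.trim)
      }

    // The distributed "multimap": RDD[(String, Iterable[String])]
    val grouped = pairs.groupByKey()

    // Only if the grouped data is small enough to fit on the driver:
    val localMap: Map[String, List[String]] =
      grouped.mapValues(_.toList).collectAsMap().toMap

Keep in mind that collectAsMap() pulls everything back to the driver, so it only makes sense for small results. If the data is large, keep it as a pair RDD and use lookup(key) instead of collecting it.
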
On 3 June 2014 23:14, Doris Xin <doris.s....@gmail.com> wrote:

> Hey Amit,
>
> You might want to check out PairRDDFunctions
> <http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions>.
> For your use case in particular, you can load the file as an RDD[(String, String)]
> and then use the groupByKey() function in PairRDDFunctions to get an
> RDD[(String, Iterable[String])].
>
> Doris
>
>
> On Tue, Jun 3, 2014 at 2:56 PM, Amit Kumar <kumarami...@gmail.com> wrote:
>
>> Hi Folks,
>>
>> I am new to Spark, and this is probably a basic question.
>>
>> I have a file on HDFS:
>>
>> 1, one
>> 1, uno
>> 2, two
>> 2, dos
>>
>> I want to create a multimap RDD, RDD[Map[String, List[String]]]:
>>
>> {"1" -> ["one", "uno"], "2" -> ["two", "dos"]}
>>
>> First I read the file:
>>
>> val identityData: RDD[String] = sc.textFile($path_to_the_file, 2).cache()
>>
>> val identityDataList: RDD[List[String]] =
>>   identityData.map { line =>
>>     line.split(",").toList
>>   }
>>
>> Then I group them by the first element:
>>
>> val grouped: RDD[(String, Iterable[List[String]])] =
>>   identityDataList.groupBy { element =>
>>     element(0)
>>   }
>>
>> Then I do the equivalent of mapValues from Scala collections to get rid of
>> the first element:
>>
>> val groupedWithValues: RDD[(String, List[String])] =
>>   grouped.flatMap[(String, List[String])] { case (key, list) =>
>>     List((key, list.map(element => element(1)).toList))
>>   }
>>
>> For this to actually materialize I do collect:
>>
>> val groupedAndCollected = groupedWithValues.collect()
>>
>> I get an Array[(String, List[String])].
>>
>> I am trying to figure out if there is a way for me to get a
>> Map[String, List[String]] (a multimap), or to create an
>> RDD[Map[String, List[String]]].
>>
>> I am sure there is something simpler; I would appreciate advice.
>>
>> Many thanks,
>> Amit
>>

--
Kind regards,
Oleg