Just a thought... Are you trying to use the RDD as a Map?

> You might want to check out PairRDDFunctions
> <http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions>.
> For your use case in particular, you can load the file as an RDD[(String,
> String)] and then use the groupByKey() function in PairRDDFunctions to get
> an RDD[(String, Iterable[String])].
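
For what it's worth, here is a minimal sketch of the groupByKey() suggestion above, not a definitive implementation. The names MultiMapSketch and toMultiMap are made up for illustration, and the local master plus sc.parallelize(...) only stand in for reading the real file with sc.textFile. On older Spark releases (before 1.3, as far as I recall) you may also need `import org.apache.spark.SparkContext._` to bring the pair-RDD implicits into scope.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, for illustration only.
object MultiMapSketch {

  // Split each "key, value" line once on the first comma, keep only
  // well-formed pairs, then gather all values that share a key.
  def toMultiMap(lines: RDD[String]): RDD[(String, List[String])] =
    lines
      .map(_.split(",", 2))
      .collect { case Array(k, v) => (k.trim, v.trim) } // RDD.collect(PartialFunction): filter + map
      .groupByKey()                                     // RDD[(String, Iterable[String])]
      .mapValues(_.toList)                              // RDD[(String, List[String])]

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with in-memory sample data instead of HDFS.
    val conf = new SparkConf().setAppName("multimap-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val lines = sc.parallelize(Seq("1, one", "1, uno", "2, two", "2, dos"))
      val grouped = toMultiMap(lines)

      // collectAsMap() pulls everything to the driver as a Map, so it is
      // only reasonable when the grouped data fits in driver memory.
      val asMap: scala.collection.Map[String, List[String]] = grouped.collectAsMap()
      println(asMap)
    } finally {
      sc.stop()
    }
  }
}
```

Note that groupByKey() does not guarantee the order of values within a group. If you only need the values for a few keys, grouped.lookup("1") avoids collecting the whole RDD to the driver.
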
>> I am new to Spark, and this is probably a basic question.
>> I have a file on HDFS:
>> 1, one
>> 1, uno
>> 2, two
>> 2, dos
>> I want to create a multimap RDD, RDD[Map[String,List[String]]]:
>> {"1"->["one","uno"], "2"->["two","dos"]}
>> First I read the file
>> val identityData:RDD[String] = sc.textFile($path_to_the_file, 2).cache()
>> val identityDataList:RDD[List[String]]=
>>        identityData.map{ line =>
>>         val splits= line.split(",")
>>         splits.toList
>>     }
>> Then I group them by the first element
>>  val grouped:RDD[(String,Iterable[List[String]])]=
>>     identityDataList.groupBy{
>>       element =>{
>>         element(0)
>>       }
>>     }
>> Then I do the equivalent of mapValues of scala collections to get rid of
>> the first element
>>  val groupedWithValues:RDD[(String,List[String])] =
>>     grouped.flatMap[(String,List[String])]{ case (key,list)=>{
>>       List((key,list.map{element => {
>>         element(1)
>>       }}.toList))
>>     }
>>     }
>> For this to actually materialize, I call collect:
>>  val groupedAndCollected=groupedWithValues.collect()
>> I get an Array[(String,List[String])].
>> I am trying to figure out if there is a way for me to get
>> Map[String,List[String]] (a multimap), or to create an
>> RDD[Map[String,List[String]]].
>> I am sure there is something simpler; I would appreciate advice.
