On 08/28/2014 07:20 AM, marylucy wrote:
fileA = 1 2 3 4, one number per line, saved in /sparktest/1/
fileB = 3 4 5 6, one number per line, saved in /sparktest/2/
I want to get 3 and 4.

var a = sc.textFile("/sparktest/1/").map((_,1))
var b = sc.textFile("/sparktest/2/").map((_,1))

a.filter(param=>{b.lookup(param._1).length>0}).map(_._1).foreach(println)

Error thrown:
Scala.MatchError:Null
PairRDDFunctions.lookup...

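The issue is that the b RDD is used inside a transformation of the a RDD. Spark does not support nesting RDD operations like this: an RDD can only be referenced from the driver, not from inside a closure that runs on the executors.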
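
If you want to keep the filter-style logic, one way around the nesting is to pull b's values to the driver and broadcast them. This is only a rough sketch, not tested against your files: it assumes a spark-shell session (so `sc` already exists) and that b is small enough to fit in memory. The `parallelize` calls are stand-ins for your two `textFile` calls, and the names `bValues` and `common` are just for illustration.

```scala
// Assumes a spark-shell session, where `sc` (the SparkContext) is predefined.
// parallelize stands in for sc.textFile("/sparktest/1/") and sc.textFile("/sparktest/2/").
val a = sc.parallelize(Seq("1", "2", "3", "4"))
val b = sc.parallelize(Seq("3", "4", "5", "6"))

// Collect b on the driver and ship it to the executors as a read-only set.
// Assumption: b is small enough to hold in memory.
val bValues = sc.broadcast(b.collect().toSet)

// The closure now references only the broadcast value, never the RDD b.
val common = a.filter(v => bValues.value.contains(v)).collect()
common.foreach(println)
```

Unlike intersection, this keeps repeated values from a.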

Consider using intersection, it's more idiomatic. Since a and b hold (value, 1) pairs, map back to the value before printing:

a.intersection(b).map(_._1).foreach(println)

But note that intersection will remove duplicates.
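
To illustrate the duplicate behaviour, here is a small sketch with made-up data (again assuming spark-shell, where `sc` is predefined). The names `left`, `right` and `keepDuplicates` are hypothetical, and the cogroup variant is just one possible way to keep repeated values, not the only one.

```scala
// Assumes a spark-shell session, where `sc` is predefined.
// Hypothetical data: "3" appears twice in the left-hand RDD.
val left  = sc.parallelize(Seq("1", "3", "3", "4"))
val right = sc.parallelize(Seq("3", "4", "5"))

// intersection returns each common element once.
val deduped = left.intersection(right).collect().sorted   // Array(3, 4)

// To keep every occurrence from `left`, cogroup on the value and keep
// the left-hand copies only where the right-hand side is non-empty.
val keepDuplicates = left.map(v => (v, ()))
  .cogroup(right.map(v => (v, ())))
  .filter { case (_, (_, rs)) => rs.nonEmpty }
  .flatMap { case (value, (ls, _)) => ls.map(_ => value) }
  .collect()
  .sorted                                                  // Array(3, 3, 4)
```

If duplicates don't matter for your case, plain intersection is the simpler choice.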

best,


matt
