On 08/28/2014 07:20 AM, marylucy wrote:
fileA = 1 2 3 4, one number per line, saved in /sparktest/1/
fileB = 3 4 5 6, one number per line, saved in /sparktest/2/
I want to get 3 and 4.
var a = sc.textFile("/sparktest/1/").map((_,1))
var b = sc.textFile("/sparktest/2/").map((_,1))
a.filter(param=>{b.lookup(param._1).length>0}).map(_._1).foreach(println)
This throws an error:
Scala.MatchError:Null
PairRDDFunctions.lookup...
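the issue is that b.lookup is called inside a transformation (the filter) on the a rdd. spark does not support using one rdd inside a transformation of another: the closure runs on the executors, while rdd operations such as lookup can only be invoked from the driver.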
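consider using intersection, it's more idiomatic. since a and b here are (value, 1) pairs, intersect their keys to print just the numbers:
a.keys.intersection(b.keys).foreach(println)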
but note that intersection will remove duplicates
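if the duplicates matter, here is a minimal sketch of both options side by side. it is an illustration under stated assumptions, not the only way to do it: the object and method names (CommonLines, commonDistinct, commonKeepingDuplicates) are hypothetical, sc.parallelize stands in for sc.textFile on /sparktest/1/ and /sparktest/2/, and the second option assumes the distinct values of b are small enough to collect on the driver and broadcast.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, for illustration only.
object CommonLines {

  // Distinct values present in both RDDs (intersection drops duplicates).
  def commonDistinct(a: RDD[String], b: RDD[String]): RDD[String] =
    a.intersection(b)

  // Values of `a` that also occur in `b`, keeping duplicates from `a`.
  // Assumption: the distinct values of `b` fit in driver memory.
  def commonKeepingDuplicates(a: RDD[String], b: RDD[String]): RDD[String] = {
    // Collect b on the driver and ship it to the executors as a broadcast
    // variable, so the filter closure never references the RDD `b` itself.
    val bValues = a.sparkContext.broadcast(b.distinct().collect().toSet)
    a.filter(line => bValues.value.contains(line))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("common-lines").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Stand-ins for sc.textFile("/sparktest/1/") and sc.textFile("/sparktest/2/").
      val a = sc.parallelize(Seq("1", "2", "3", "4"))
      val b = sc.parallelize(Seq("3", "4", "5", "6"))

      // Collect to the driver before printing so the output appears locally.
      commonDistinct(a, b).collect().sorted.foreach(println) // prints 3 and 4
      commonKeepingDuplicates(a, b).collect().sorted.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

if b is too large to broadcast, a join on the (value, 1) pairs from the original code is the usual alternative, at the cost of a shuffle.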
best,
matt