Hi All,

 In Spark, each action results in launching a job. Let's say my Spark app
looks like this:

val baseRDD = sc.parallelize(Array(1, 2, 3, 4, 5), 2)
val rdd1 = baseRDD.map(x => x + 2)
val rdd2 = rdd1.filter(x => x % 2 == 0)
val count = rdd2.count
val firstElement = rdd2.first

println("Count is " + count)
println("First is " + firstElement)

Now, rdd2.count launches job 0 (one task per partition, so 2 tasks here) and
rdd2.first launches job 1 (typically a single task scanning the first
partition). In job 1, when computing rdd2.first, is the entire lineage
computed again, or is rdd2 reused since job 0 has already computed it?
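
For reference, here is a sketch of how reuse can be made explicit with the
standard cache()/persist() API, in case it helps frame what I mean by
"reused" (this assumes caching is the intended mechanism; whether an
*uncached* RDD is recomputed per action is exactly the question):

rdd2.cache()            // same as persist(StorageLevel.MEMORY_ONLY)
val count = rdd2.count  // job 0: evaluates the lineage, populates the cache
val first = rdd2.first  // job 1: can read cached partitions instead of
                        // recomputing map/filter from baseRDD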

Thanks,
Padma Ch
