If I write code like this:

    val rdd = input.map(_.value)
    val f1 = rdd.filter(_ == 1)
    val f2 = rdd.filter(_ == 2)
    ...
Then the DAG of the execution may look like this:

          -> Filter -> ...
    Map
          -> Filter -> ...

But the two filters operate on the same RDD, which means both results could be produced by scanning that RDD just once. Does Spark currently have this kind of optimization?
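To make concrete what a fused single scan would mean, here is a plain-Scala sketch (no Spark involved; the `splitOnesTwos` helper and the sample data are made up for illustration) that builds both filter results in one pass over the data:

```scala
object SingleScan {
  // One scan over the input produces both "filter" outputs at once,
  // instead of two separate passes (one per filter).
  def splitOnesTwos(values: Seq[Int]): (Vector[Int], Vector[Int]) =
    values.foldLeft((Vector.empty[Int], Vector.empty[Int])) {
      case ((ones, twos), 1) => (ones :+ 1, twos)
      case ((ones, twos), 2) => (ones, twos :+ 2)
      case (acc, _)          => acc
    }

  def main(args: Array[String]): Unit = {
    val (f1, f2) = splitOnesTwos(Seq(1, 2, 3, 1, 2))
    println(f1) // Vector(1, 1)
    println(f2) // Vector(2, 2)
  }
}
```

This is the kind of sharing the question is asking about: both outputs come from a single traversal of the shared input, rather than one traversal per filter.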