What if I use both rdd1 and rdd2 later?
Raghavendra Pandey raghavendra.pan...@gmail.com wrote on Thu, Jul 16, 2015 at 4:08 PM:
If you cache the rdd it will save some recomputation. But filter is a lazy
operation anyway, so what actually runs depends on what you do later with rdd1
and rdd2...
Raghavendra
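To make the caching suggestion concrete, here is a minimal sketch of caching before the branch point. It assumes a live SparkContext and an `input` RDD as in the snippet quoted below; none of this code is from the thread itself:

```scala
// Sketch only: assumes `input` is an existing RDD[SomeRecord] with a
// `value` field, as in the quoted snippet.
val rdd = input.map(_.value)
rdd.cache()                      // mark for caching; materialized on first action

val rdd1 = rdd.filter(_ == 1)    // reads the cached map output
val rdd2 = rdd.filter(_ == 2)    // ditto; the map stage is not re-run

// Each action still triggers its own job, but the shared map stage
// is served from cache after the first one:
val c1 = rdd1.count()
val c2 = rdd2.count()
```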
On Jul 16, 2015 1:33 PM, Bin Wang wbi...@gmail.com wrote:
If I write code like this:
val rdd = input.map(_.value)
val f1 = rdd.filter(_ == 1)
val f2 = rdd.filter(_ == 2)
...
Then the DAG of the execution may look like this:

      / Filter (_ == 1) - ...
Map
      \ Filter (_ == 2) - ...

But the two filters operate on the same RDD, which means the map could in
principle be computed once and shared.
Depending on what you do with them, they will be computed separately, because
you may have a long DAG in each branch. Spark runs all the transformation
functions within a branch together, rather than trying to optimize things
across branches.
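The recomputation described above can be observed with an accumulator that counts how often the map function runs. This is an illustrative sketch, not code from the thread; it assumes a SparkContext `sc`, and the names (`mapCalls`, `base`) are hypothetical:

```scala
// Count how many times the shared map function executes.
val mapCalls = sc.longAccumulator("mapCalls")
val base = sc.parallelize(Seq(1, 2, 1, 3))

val mapped = base.map { x => mapCalls.add(1); x }
val f1 = mapped.filter(_ == 1)
val f2 = mapped.filter(_ == 2)

f1.count()   // first job: runs the map over all 4 elements
f2.count()   // second job: runs the same map again

// Without mapped.cache(), mapCalls.value is typically 8 here
// (4 elements x 2 jobs): each action re-executes the shared map
// stage in its own branch. Adding mapped.cache() before the first
// action brings it down to 4.
```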
On Jul 16, 2015 1:40 PM, Bin Wang wbi...@gmail.com wrote: