What if I use both rdd1 and rdd2 later?
Raghavendra Pandey raghavendra.pan...@gmail.com wrote on Thu, Jul 16, 2015 at 4:08 PM:
If you cache the rdd it will save some recomputation. But filter is a lazy
operation anyway, so what actually runs depends on what you do later with rdd1
and rdd2...
Raghavendra
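To make the caching suggestion concrete, here is a minimal sketch of caching before the branch point. It assumes a live SparkContext and an `input` RDD as in the snippet quoted below; none of this code is from the thread itself:

```scala
// Sketch only: assumes `input` is an existing RDD[SomeRecord] with a
// `value` field, as in the quoted snippet.
val rdd = input.map(_.value)
rdd.cache()                      // mark for caching; materialized on first action

val rdd1 = rdd.filter(_ == 1)    // reads the cached map output
val rdd2 = rdd.filter(_ == 2)    // ditto; the map stage is not re-run

// Each action still triggers its own job, but the shared map stage
// is served from cache after the first one:
val c1 = rdd1.count()
val c2 = rdd2.count()
```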
On Jul 16, 2015 1:33 PM, Bin Wang wbi...@gmail.com wrote:
If I write code like this:
val rdd = input.map(_.value)
val f1 = rdd.filter(_ == 1)
val f2 = rdd.filter(_ == 2)
...
Then the DAG of the execution may look like this:

      / Filter (_ == 1) - ...
Map
      \ Filter (_ == 2) - ...

But the two filters operate on the same RDD, which means the map could in
principle be computed once and shared.
Depending on what you do with them, they will be computed separately, because
you may have a long DAG in each branch. Spark runs all the transformation
functions within a branch together, rather than trying to optimize things
across branches.
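The recomputation described above can be observed with an accumulator that counts how often the map function runs. This is an illustrative sketch, not code from the thread; it assumes a SparkContext `sc`, and the names (`mapCalls`, `base`) are hypothetical:

```scala
// Count how many times the shared map function executes.
val mapCalls = sc.longAccumulator("mapCalls")
val base = sc.parallelize(Seq(1, 2, 1, 3))

val mapped = base.map { x => mapCalls.add(1); x }
val f1 = mapped.filter(_ == 1)
val f2 = mapped.filter(_ == 2)

f1.count()   // first job: runs the map over all 4 elements
f2.count()   // second job: runs the same map again

// Without mapped.cache(), mapCalls.value is typically 8 here
// (4 elements x 2 jobs): each action re-executes the shared map
// stage in its own branch. Adding mapped.cache() before the first
// action brings it down to 4.
```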
On Jul 16, 2015 1:40 PM, Bin Wang wbi...@gmail.com wrote: