There is a

*PartitionPruningRDD*

:: DeveloperApi :: A RDD used to prune RDD partitions/partitions so we can
avoid launching tasks on all partitions. An example use case: If we know
the RDD is partitioned by range, and the execution DAG has a filter on the
key, we can avoid launching tasks on partitions that don't have the range
covering the key.

It seems made exactly for this case, but it's marked as DeveloperApi. Does anyone
know how to use it?
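
For reference, here is a rough sketch of how it might be used, going by the companion object's `PartitionPruningRDD.create(rdd, partitionFilterFunc)` method. The sample data, the key bounds `lo`/`hi`, the partition count, and the object name `PruneSketch` are all made up for illustration, and it assumes a Spark version where the pair-RDD implicits are in scope without extra imports (older releases may also need `import org.apache.spark.SparkContext._`). Since the class is a DeveloperApi, the signature may change between releases.

```scala
import org.apache.spark.{RangePartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.PartitionPruningRDD

object PruneSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are placeholders for this sketch.
    val conf = new SparkConf().setAppName("prune-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Hypothetical sample data: integer keys with string values.
    val pairs = sc.parallelize(1 to 1000).map(i => (i, s"value-$i"))

    // Range-partition by key so each partition covers a contiguous key range.
    val partitioner = new RangePartitioner(8, pairs)
    val ranged = pairs.partitionBy(partitioner).cache()

    // Key range we want to keep (illustrative values).
    val lo = 100
    val hi = 200

    // Ask the partitioner which partitions can contain keys in [lo, hi].
    // With the default ascending order, first <= last.
    val first = partitioner.getPartition(lo)
    val last = partitioner.getPartition(hi)

    // Keep only those partitions; no tasks are launched for the others.
    val pruned = PartitionPruningRDD.create(ranged, idx => idx >= first && idx <= last)

    // The boundary partitions may still hold keys outside the range,
    // so the ordinary filter is still required for correctness.
    val result = pruned.filter { case (k, _) => k >= lo && k <= hi }.collect()
    println(s"kept ${result.length} records from ${pruned.partitions.length} partitions")

    sc.stop()
  }
}
```

The pruning only decides which partitions get tasks; the filter that follows is what actually guarantees that only keys in the range come back.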



On Mon, Dec 8, 2014 at 11:31 AM, nsareen <nsar...@gmail.com> wrote:

> @Sowen, I would appreciate it if you could explain how Spark SQL would help in my
> scenario.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Does-filter-on-an-RDD-scan-every-data-item-tp20170p20571.html
