[ https://issues.apache.org/jira/browse/SPARK-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-2032:
-----------------------------
    Target Version/s:   (was: 1.1.0)

> Add an RDD.samplePartitions method for partition-level sampling
> ---------------------------------------------------------------
>
>                 Key: SPARK-2032
>                 URL: https://issues.apache.org/jira/browse/SPARK-2032
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Matei Zaharia
>            Assignee: Prashant Sharma
>            Priority: Minor
>
> This would allow us to sample a percent of the partitions and not have to
> materialize all of them. It's less uniform but much faster and may be useful
> for quickly exploring data.
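
To illustrate the idea in the description, here is a minimal sketch in plain Python rather than Spark code. It models each partition as a function that produces its records only when called, picks a fraction of the partitions at random, and computes only those. The name sample_partitions, its fraction and seed parameters, and the make_partition helper are all hypothetical; the issue does not specify a signature or an implementation.

```python
import random
from typing import Callable, Iterator, List, Optional, Sequence

# A partition is modelled as a zero-argument function that produces its
# records only when called, standing in for a lazily computed RDD partition.
# (Assumption: this is an illustration, not how Spark represents partitions.)
Partition = Callable[[], Iterator[int]]


def sample_partitions(
    partitions: Sequence[Partition],
    fraction: float,
    seed: Optional[int] = None,
) -> List[Partition]:
    """Pick a random subset of whole partitions without computing any of them.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    rng = random.Random(seed)
    # Number of partitions to keep, rounded to the nearest integer.
    count = round(len(partitions) * fraction)
    # Sample indices without replacement; sort to preserve partition order.
    chosen = sorted(rng.sample(range(len(partitions)), count))
    return [partitions[i] for i in chosen]


# Track which partitions actually get computed.
materialized: List[int] = []


def make_partition(index: int) -> Partition:
    """Build a fake partition holding three consecutive integers."""
    def compute() -> Iterator[int]:
        materialized.append(index)
        return iter(range(index * 3, index * 3 + 3))
    return compute


partitions = [make_partition(i) for i in range(8)]
sampled = sample_partitions(partitions, fraction=0.25, seed=42)

# Only the sampled partitions are computed here; the other six are never touched.
records = [record for compute in sampled for record in compute()]
```

The trade-off matches the one in the description: records come from whole partitions, so the sample is less uniform than per-record sampling, but the unselected partitions are never materialized.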