Josh Rosen created SPARK-17283:
----------------------------------

    Summary: Cancel job in RDD.take() as soon as enough output is received
    Key: SPARK-17283
    URL: https://issues.apache.org/jira/browse/SPARK-17283
    Project: Spark
    Issue Type: Improvement
    Components: Spark Core
    Reporter: Josh Rosen
    Assignee: Josh Rosen

The current implementation of RDD.take() waits until all partitions of each job have been computed before checking whether enough rows have been received. If take() instead performed this check on the fly, as individual partitions complete, it could stop early, offering large speedups for certain interactive queries.
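
To illustrate the idea, here is a minimal sketch in plain Python rather than Spark code. It assumes each partition can be modeled as a zero-argument callable that returns a list of rows, and it covers a single job only. The name take_early and its parameters are hypothetical and do not correspond to anything in Spark's scheduler. The check runs each time a partition finishes, and the remaining work is cancelled once the contiguous prefix of finished partitions holds enough rows.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def take_early(partitions, n, max_workers=4):
    """Return the first n rows in partition order, stopping early when possible.

    `partitions` is a list of zero-argument callables, each returning a list
    of rows. This is a hypothetical stand-in for the tasks of one Spark job.
    """
    if n <= 0:
        return []
    results = {}  # partition index -> rows, filled in as tasks complete
    executor = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = {executor.submit(p): i for i, p in enumerate(partitions)}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
            # Check on the fly: only the contiguous prefix of finished
            # partitions (0, 1, 2, ...) can contribute to the answer.
            rows = []
            i = 0
            while i in results and len(rows) < n:
                rows.extend(results[i])
                i += 1
            if len(rows) >= n:
                return rows[:n]
        # Every partition finished but fewer than n rows exist in total.
        return [row for i in range(len(partitions)) for row in results[i]]
    finally:
        # Cancel tasks that have not started yet (needs Python 3.9+).
        # Tasks that are already running are not interrupted.
        executor.shutdown(wait=False, cancel_futures=True)
```

For example, take_early([lambda: [1, 2, 3], lambda: [4, 5]], 2) can return [1, 2] as soon as the first partition finishes, without waiting for the second. Unlike this sketch, a real implementation would also need to stop tasks that are already running and preserve the behavior of take() across its successive jobs.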