Gengliang Wang created SPARK-23005: -------------------------------------- Summary: Improve RDD.take on small number of partitions Key: SPARK-23005 URL: https://issues.apache.org/jira/browse/SPARK-23005 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 2.2.1 Reporter: Gengliang Wang
In current implementation of RDD.take, we overestimate the number of partitions we need to try by 50%: `(1.5 * num * partsScanned / buf.size).toInt` However, when the number is small, the result of `.toInt` is not what we want. E.g, 2.9 will become 2, which should be 3. Use math.Ceil fix the problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org