Robert Kruszewski created SPARK-16984: -----------------------------------------
Summary: executeTake tries all partitions if first parition is empty Key: SPARK-16984 URL: https://issues.apache.org/jira/browse/SPARK-16984 Project: Spark Issue Type: Bug Affects Versions: 2.0.0 Reporter: Robert Kruszewski in executeTake if the number of rows returned by first partition is 0 we try all partitions next time. This can lead to pathological cases where your first partition is empty and rest have data. This unfortunately can happen with skewed data. Empirically observed it's better to make few roundtrips instead of potentially killing driver with big collect -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org