Nong Li created SPARK-14259:
-------------------------------

    Summary: Add config to control maximum number of files when coalescing partitions
    Key: SPARK-14259
    URL: https://issues.apache.org/jira/browse/SPARK-14259
    Project: Spark
    Issue Type: Improvement
    Components: SQL
    Reporter: Nong Li
    Priority: Minor

The FileSourceStrategy currently has a config to control the maximum byte size of coalesced partitions. It would be helpful to also have a config to control the maximum number of files per partition, since even small files carry a non-trivial fixed cost. The current packing can put many small files into a single partition, which causes straggler tasks.
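
To illustrate the proposal, here is a minimal sketch of greedy file packing with both a byte cap and a file-count cap. It is not Spark's actual implementation: the function `pack_files` and the parameters `max_partition_bytes` and `max_files_per_partition` are hypothetical names chosen for this example, and the issue does not state what the real config key would be called.

```python
from typing import List


def pack_files(
    file_sizes: List[int],
    max_partition_bytes: int,
    max_files_per_partition: int,
) -> List[List[int]]:
    """Greedily pack files (given by size in bytes) into partitions.

    A partition is closed when adding the next file would exceed the
    byte cap OR when it already holds the maximum number of files.
    Both parameter names are hypothetical, not Spark config keys.
    """
    partitions: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0

    # Largest files first, so small files fill the remaining space.
    for size in sorted(file_sizes, reverse=True):
        too_big = current_bytes + size > max_partition_bytes
        too_many = len(current) >= max_files_per_partition
        if current and (too_big or too_many):
            partitions.append(current)
            current = []
            current_bytes = 0
        current.append(size)
        current_bytes += size

    if current:
        partitions.append(current)
    return partitions
```

With only the byte cap in effect, ten 10-byte files under a 1000-byte limit all land in one partition. Setting the hypothetical file cap to 4 spreads them over three partitions of 4, 4, and 2 files, which is the kind of limit the issue asks for to avoid a single task paying the per-file cost many times over.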