[ https://issues.apache.org/jira/browse/SPARK-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301114#comment-15301114 ]
koert kuipers commented on SPARK-13184:
---------------------------------------

agreed, there should be a general way for data sources to override the default options.

> Support minPartitions parameter for JSON and CSV datasources as options
> -----------------------------------------------------------------------
>
>                 Key: SPARK-13184
>                 URL: https://issues.apache.org/jira/browse/SPARK-13184
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Hyukjin Kwon
>            Priority: Minor
>
> After looking through the pull requests and issue below for the Spark CSV datasource,
> https://github.com/databricks/spark-csv/pull/256
> https://github.com/databricks/spark-csv/issues/141
> https://github.com/databricks/spark-csv/pull/186
> it looks like Spark might need to be able to set {{minPartitions}}.
> {{repartition()}} or {{coalesce()}} can be alternatives, but it looks like they
> need to shuffle the data in most cases.
> Although I am still not sure whether this is needed, I am opening this ticket
> just for discussion.
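
For context on why a read-time {{minPartitions}} differs from calling {{repartition()}} afterwards, here is a rough sketch of how a minimum partition count can shape input splits without moving any data. It assumes the split-size rule used by Hadoop's old-API FileInputFormat (goal size = total size / requested splits, capped by the block size), which is what {{SparkContext.textFile(path, minPartitions)}} is understood to rely on. The function names and the example sizes are hypothetical, and the sketch ignores details such as the split slop factor and multi-file inputs.

```python
MB = 1024 * 1024


def split_size(total_size, block_size, min_partitions, min_size=1):
    """Approximate split size: max(min_size, min(goal_size, block_size)).

    Assumption: mirrors the rule in Hadoop's old-API FileInputFormat,
    simplified for a single file.
    """
    goal_size = total_size // max(min_partitions, 1)
    return max(min_size, min(goal_size, block_size))


def num_splits(file_size, block_size, min_partitions):
    """Number of splits (and hence partitions) for a single splittable file."""
    size = split_size(file_size, block_size, min_partitions)
    return -(-file_size // size)  # ceiling division


# Hypothetical 1 GiB file stored in 128 MiB blocks.
few = num_splits(1024 * MB, 128 * MB, min_partitions=2)
many = num_splits(1024 * MB, 128 * MB, min_partitions=32)
print(few, many)
```

Under these assumptions, a small minimum leaves the block size in charge of the partition count, while a larger minimum shrinks the splits so the file is read directly into more partitions. Raising the partition count later with {{repartition()}} would instead redistribute rows that have already been read, which is the shuffle cost the ticket description is concerned about.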