[ https://issues.apache.org/jira/browse/SPARK-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247378#comment-15247378 ]
Hyukjin Kwon edited comment on SPARK-14726 at 3/26/17 2:09 PM:
---------------------------------------------------------------

This is currently not supported. I will work on this if it is decided to be supported. [~rxin]


was (Author: hyukjin.kwon):
This is currently not supported. I can work on this, but I am a bit hesitant because I believe the CSV data source was ported mainly for the "small data world". Still, I believe there are a lot of users dealing with large CSV files. I will work on this if it is decided to be supported. [~rxin]


> Support for sampling when inferring schema in CSV data source
> -------------------------------------------------------------
>
>                 Key: SPARK-14726
>                 URL: https://issues.apache.org/jira/browse/SPARK-14726
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Bomi Kim
>
> Currently, I am using the CSV data source and trying to get used to Spark 2.0
> because it has a built-in CSV data source.
> I realized that the CSV data source infers the schema from all the data. The JSON
> data source supports a sampling ratio option.
> It would be great if the CSV data source had this option too (or is this
> supported already?).
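
For readers unfamiliar with the request, the idea of inferring a schema from a sample of rows rather than from all the data can be sketched in plain Python. This is only an illustration under stated assumptions, not Spark's implementation: the names `infer_type`, `merge_types`, `infer_schema` and the `sampling_ratio` parameter are hypothetical, and only three types are considered.

```python
import csv
import io
import random

# Ordering from narrowest to widest type; a column takes the widest type seen.
_WIDTH = {"int": 0, "float": 1, "string": 2}


def infer_type(value):
    """Return the narrowest type name whose parser accepts the value."""
    for name, parser in (("int", int), ("float", float)):
        try:
            parser(value)
            return name
        except ValueError:
            pass
    return "string"


def merge_types(current, new):
    """Combine two observed types, keeping the wider one."""
    if current is None:
        return new
    return current if _WIDTH[current] >= _WIDTH[new] else new


def infer_schema(csv_text, sampling_ratio=1.0, seed=0):
    """Infer column types from roughly sampling_ratio of the data rows.

    Hypothetical helper: each row is kept with probability sampling_ratio,
    so a ratio of 1.0 scans every row and a ratio of 0.0 scans none.
    """
    rng = random.Random(seed)
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    schema = {name: None for name in header}
    for row in reader:
        if rng.random() >= sampling_ratio:
            continue  # row not selected for the sample
        for name, value in zip(header, row):
            schema[name] = merge_types(schema[name], infer_type(value))
    # Columns with no sampled values fall back to the widest type.
    return {name: t or "string" for name, t in schema.items()}


if __name__ == "__main__":
    data = "id,price,name\n1,2.5,a\n2,3,b\n"
    print(infer_schema(data))
```

The trade-off is the one the issue implies: a smaller ratio reads fewer rows during inference, but a value that would widen a column's type may fall outside the sample, so the inferred type can be too narrow for the full file.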