[ https://issues.apache.org/jira/browse/SPARK-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303690#comment-15303690 ]
Takeshi Yamamuro commented on SPARK-15585:
------------------------------------------

We cannot pass `null` as `quote` to the univocity parsers because the argument type is `char`, so I think `CSVOptions#getChar` cannot return `null`. On the other hand, spark-csv uses Commons CSV, which can set null as `quote` (see: https://github.com/databricks/spark-csv/blob/master/src/main/scala/com/databricks/spark/csv/CsvRelation.scala#L82). It seems we can get the same behaviour as spark-csv if we set '\u0000' as the quote when `null` is passed:

https://github.com/maropu/spark/compare/master...SPARK-15585

Also, do we need to fix readwriter.py for this issue (a non-null default quote is explicitly set there)? AFAIK there is no way for pyspark to pass `null` into `CSVOptions#getChar`.

https://github.com/apache/spark/blob/master/python/pyspark/sql/readwriter.py#L375

> Don't use null in data source options to indicate default value
> ---------------------------------------------------------------
>
>                 Key: SPARK-15585
>                 URL: https://issues.apache.org/jira/browse/SPARK-15585
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Reynold Xin
>            Priority: Critical
>
> See email:
> http://apache-spark-developers-list.1001551.n3.nabble.com/changed-behavior-for-csv-datasource-and-quoting-in-spark-2-0-0-SNAPSHOT-td17704.html
>
> We'd need to change DataFrameReader/DataFrameWriter in Python's
> csv/json/parquet/... functions to put the actual default option values as
> function parameters, rather than setting them to None. We can then make
> CSVOptions.getChar (and JSONOptions, etc.) actually return null if the
> value is null, rather than setting it to the default value.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
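A minimal sketch of the workaround discussed above, in Python for illustration (the function name `get_char` and the options dict are hypothetical, not Spark's actual API): since a char-typed parser setting cannot hold `null`, a missing or `None` quote option is mapped to the sentinel character '\u0000', which effectively disables quoting.

```python
def get_char(options, key, default):
    """Return a single-character option value, falling back to `default`.

    A None value is mapped to '\u0000' so that a char-typed parser
    setting can still represent "no quote character".
    """
    value = options.get(key, default)
    if value is None:
        return "\u0000"
    if len(value) != 1:
        raise ValueError("%s must be a single character, got %r" % (key, value))
    return value

# An explicit null quote becomes the '\u0000' sentinel:
print(repr(get_char({"quote": None}, "quote", '"')))  # '\x00'
# An absent option falls back to the default quote character:
print(repr(get_char({}, "quote", '"')))               # '"'
```

Note the distinction this sketch preserves: an *absent* option means "use the default quote", while an explicit `None` means "no quoting at all", which is the behaviour the pyspark defaults in readwriter.py currently cannot express.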