Yaohua Zhao created SPARK-39689:
-----------------------------------

             Summary: Support 2-chars lineSep in CSV datasource
                 Key: SPARK-39689
                 URL: https://issues.apache.org/jira/browse/SPARK-39689
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Yaohua Zhao

The Univocity parser allows the line separator to be 1 or 2 characters long ([code|https://github.com/uniVocity/univocity-parsers/blob/master/src/main/java/com/univocity/parsers/common/Format.java#L103]), so the CSV options should not block this usage ([code|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala#L218]).

Due to the limitation around `normalizedNewLine` (https://github.com/uniVocity/univocity-parsers/issues/170), using a 2-character line separator could cause some weird or incorrect behavior. We should therefore probably leave this proposed fix as an undocumented feature and warn users about it. A more proper fix can be investigated in the future.
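
To make the proposal concrete, here is a minimal sketch of a relaxed length check that accepts a 1 or 2 character `lineSep`. It is plain Python for illustration only and is not Spark's actual `CSVOptions` code; the function name `validate_line_sep`, the `max_chars` parameter, and the error messages are all assumptions. It also does nothing about the `normalizedNewLine` limitation mentioned in this issue.

```python
def validate_line_sep(sep: str, max_chars: int = 2) -> str:
    """Hypothetical check: accept a line separator of 1..max_chars characters.

    Illustrative only; the name, parameter, and messages are assumptions
    and do not mirror Spark's CSVOptions implementation.
    """
    if not sep:
        # An empty separator can never split lines, so reject it outright.
        raise ValueError("'lineSep' cannot be an empty string.")
    if len(sep) > max_chars:
        # The relaxed upper bound: 2 characters instead of exactly 1.
        raise ValueError(
            f"'lineSep' can contain at most {max_chars} characters."
        )
    return sep


if __name__ == "__main__":
    # A 2-character separator passes the relaxed check.
    validate_line_sep("\r\n")
    # A 3-character separator is still rejected.
    try:
        validate_line_sep("abc")
    except ValueError as err:
        print(err)
```

With `max_chars=1` the same function reproduces a strict single-character rule, so the only behavioral change in this sketch is the upper bound.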