Maxim Gekk created SPARK-24329:
----------------------------------

             Summary: Remove comments filtering before parsing of CSV files
                 Key: SPARK-24329
                 URL: https://issues.apache.org/jira/browse/SPARK-24329
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Maxim Gekk

The uniVocity parser already filters comments and whitespace according to the parser settings:
https://github.com/apache/spark/blob/branch-2.3/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala#L178-L180

Doing the same filtering before parsing is therefore unnecessary. We need to inspect every place where the filterCommentAndEmpty method is called and remove the call wherever it duplicates the filtering done by the uniVocity parser.
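
To illustrate the kind of duplication described above, here is a minimal sketch in plain Scala. It does not use Spark or uniVocity: preFilter and parse are hypothetical stand-ins, assumed for illustration only, for a pre-parse filter and for a parser that is configured to skip comments and empty lines on its own. The point is only that when the parser already drops such lines, filtering them beforehand does not change the result.

```scala
object CommentFilterSketch {
  // Hypothetical stand-in for a pre-parse filter: drops blank lines and
  // lines that start with the comment character.
  def preFilter(lines: Iterator[String], comment: Char): Iterator[String] =
    lines.filter(line => line.trim.nonEmpty && !line.startsWith(comment.toString))

  // Hypothetical stand-in for a parser that skips comments and empty lines
  // itself, then splits each remaining line on commas and trims each field.
  def parse(lines: Iterator[String], comment: Char): List[List[String]] =
    lines
      .filter(line => line.trim.nonEmpty && !line.startsWith(comment.toString))
      .map(_.split(",", -1).map(_.trim).toList)
      .toList

  def main(args: Array[String]): Unit = {
    val input = List("# header comment", "a,b", "", "   ", "c,d")

    // Parse directly, relying on the parser's own filtering.
    val direct = parse(input.iterator, '#')
    // Parse after the extra pre-filter pass.
    val filteredFirst = parse(preFilter(input.iterator, '#'), '#')

    // The extra pass is redundant: both paths yield the same rows.
    assert(direct == filteredFirst)
    println(direct)
  }
}
```

Whether the real filterCommentAndEmpty and the uniVocity settings agree on every edge case (for example, a comment character preceded by whitespace) is exactly what the inspection proposed in this issue would need to confirm for each call site.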