Marcin Mejran created SPARK-27873:
-------------------------------------

             Summary: Csv reader, adding a corrupt record column causes error if enforceSchema=false
                 Key: SPARK-27873
                 URL: https://issues.apache.org/jira/browse/SPARK-27873
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.3
            Reporter: Marcin Mejran
In the Spark CSV reader, when using PERMISSIVE mode with a column for storing corrupt records, you must add an extra schema column matching columnNameOfCorruptRecord. However, if the file has a header row and enforceSchema=false, the schema-versus-header validation fails because the schema contains an extra column (the one corresponding to columnNameOfCorruptRecord) that the header does not. Since FAILFAST mode does not report which rows failed to parse, there is no way to track down broken rows other than setting a corrupt record column.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
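A minimal sketch of the setup described above (the file name, local SparkSession, and two-column schema are hypothetical; the reader options are real CSV options):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Hypothetical input "people.csv" with the header line "name,age".
// PERMISSIVE mode requires the schema to carry an extra column named after
// columnNameOfCorruptRecord, so the schema has three fields but the header
// has only two.
val schema = StructType(Seq(
  StructField("name", StringType),
  StructField("age", IntegerType),
  StructField("_corrupt_record", StringType)))

// With enforceSchema=false the reader validates the schema against the
// header column by column, and the extra _corrupt_record field makes the
// validation fail even though the data itself is fine.
val df = spark.read
  .option("header", "true")
  .option("enforceSchema", "false")
  .option("mode", "PERMISSIVE")
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .schema(schema)
  .csv("people.csv")
```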