HyukjinKwon commented on a change in pull request #33706: URL: https://github.com/apache/spark/pull/33706#discussion_r686696280
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
##########
@@ -68,13 +71,19 @@ private[sql] class JsonInferSchema(options: JSONOptions) extends Serializable {
           Some(inferField(parser))
         }
       } catch {
-        case e @ (_: RuntimeException | _: JsonProcessingException) => parseMode match {
-          case PermissiveMode =>
-            Some(StructType(Seq(StructField(columnNameOfCorruptRecord, StringType))))
-          case DropMalformedMode =>
+        case e @ (_: RuntimeException | _: JsonProcessingException | _: IOException) =>
+          if (ignoreCorruptFiles) {
+            logWarning(s"Skipped the corrupted file: $row", e)

Review comment:
Also, should we maybe exclude `RuntimeException` for now? I think we intentionally throw `RuntimeException` in some places, such as:

```
case token =>
  // We cannot parse this token based on the given data type. So, we throw a
  // RuntimeException and this exception will be caught by `parse` method.
  throw QueryExecutionErrors.failToParseValueForDataTypeError(parser, token, dataType)
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org