vladimirg-db commented on code in PR #46400: URL: https://github.com/apache/spark/pull/46400#discussion_r1591287417
########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/BadRecordException.scala: ########## @@ -67,16 +67,23 @@ case class PartialResultArrayException( extends Exception(cause) /** - * Exception thrown when the underlying parser meet a bad record and can't parse it. + * Exception thrown when the underlying parser meets a bad record and can't parse it. Used for + * control flow between wrapper and underlying parser without overhead of creating a full exception. * @param record a function to return the record that cause the parser to fail * @param partialResults a function that returns an row array, which is the partial results of * parsing this bad record. - * @param cause the actual exception about why the record is bad and can't be parsed. + * @param cause the actual exception about why the record is bad and can't be parsed. It's better + * to use lightweight exception here (e.g. without stacktrace), and throw an + * adequate (user-facing) exception only in case it's necessary after + * unwrapping `BadRecordException` */ case class BadRecordException( @transient record: () => UTF8String, @transient partialResults: () => Array[InternalRow] = () => Array.empty[InternalRow], - cause: Throwable) extends Exception(cause) + cause: Throwable) extends Exception(cause) { Review Comment: Oh I found one place: https://github.com/apache/spark/blob/7c728b2c2d6cbe2b8310cd8cf8aa4c2528b489c8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L720 It throws `QueryExecutionErrors.malformedRecordsDetectedInRecordParsingError`, but with a `BadRecordException` cause, which has a real cause embedded inside. I wonder if we would break this place by omitting the cause... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org