sandip-db commented on code in PR #45487: URL: https://github.com/apache/spark/pull/45487#discussion_r1556266125
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala:
##########
@@ -682,25 +684,25 @@ class XmlTokenizer(
           return false
         }
         val c = cOrEOF.toChar
-        if (c == commentEnd(i)) {
-          if (i >= commentEnd.length - 1) {
-            // Found comment close.
+        if (c == end(i)) {
+          i += 1
+          if (i >= end.length) {

Review Comment:
   Please add a test with two scenarios:
   - CDATA ends at the end of the file,
   - CDATA never ends.

   The latter is invalid XML. The goal is to make sure the parser doesn't crash and still returns the other valid records.

   Please add the same two tests for comments as well.
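
   To make the four requested scenarios concrete, here is a minimal stand-alone sketch. It is not the `XmlTokenizer` code from this PR: `TerminatorScanSketch` and `skipUntil` are hypothetical names, and the sliding-window match is a stand-in for the index-based loop in the diff. It assumes the scanner starts just after the opening `<![CDATA[` or `<!--` and should report `false`, rather than throw, when EOF arrives before the terminator.

```scala
import java.io.{Reader, StringReader}

// Hypothetical illustration only; not the implementation under review.
object TerminatorScanSketch {

  // Consumes characters from `reader` until `end` has been read in full.
  // Returns true if the terminator was found, false if EOF came first.
  def skipUntil(reader: Reader, end: String): Boolean = {
    // Holds at most the last `end.length` characters read so far.
    val window = new StringBuilder
    var cOrEOF = reader.read()
    while (cOrEOF != -1) {
      window.append(cOrEOF.toChar)
      if (window.length > end.length) window.deleteCharAt(0)
      if (window.toString == end) return true
      cOrEOF = reader.read()
    }
    false
  }

  def main(args: Array[String]): Unit = {
    val cdataEnd = "]]>"
    val commentEnd = "-->"

    // Scenario 1: CDATA ends exactly at the end of the input.
    val cdataAtEof = new StringReader("payload]]>")
    assert(skipUntil(cdataAtEof, cdataEnd))
    assert(cdataAtEof.read() == -1) // nothing left after the terminator

    // Scenario 2: CDATA never ends; the scan must stop cleanly at EOF.
    assert(!skipUntil(new StringReader("payload with no close"), cdataEnd))

    // Scenario 3: comment ends exactly at the end of the input.
    assert(skipUntil(new StringReader(" a note -->"), commentEnd))

    // Scenario 4: comment never ends.
    assert(!skipUntil(new StringReader(" a note --"), commentEnd))

    // Edge case: a repeated leading character must not hide the terminator.
    assert(skipUntil(new StringReader("a]]]>"), cdataEnd))
  }
}
```

   The tests in the PR would need to go through the XML data source end to end and check that the remaining valid records are still returned; the helper above only shows which inputs to cover and the expected found / not-found outcome for each.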