kbendick commented on a change in pull request #4396:
URL: https://github.com/apache/iceberg/pull/4396#discussion_r835695324
##########
File path:
spark/v3.2/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -176,7 +176,12 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
}
private def isIcebergCommand(sqlText: String): Boolean = {
- val normalized = sqlText.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ")
+ val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+ // Catch all SQL comments, including simple comments (that start with --), as well as multiline comments
+ .replaceAll("(?ms)/\\*.*?\\*/|--.*?\\n", " ")
+ // Replace any remaining newlines
+ .replaceAll("\\s+", " ")
+ .trim()
Review comment:
Assuming you mean to pass the result of `normalizeSqlText` to
`isIcebergCommand`, my concern is that we want to leave the actual removal of
comments to the parser.
I’m happy to move the normalization into a separate function, but I don’t
think we should manipulate the SQL text beyond what is needed to check whether
this is Iceberg-specific DML.
The parser is responsible for removing comments etc., since it knows how to
differentiate between inlined SQL hints and plain SQL comments that serve as
documentation.
I'll give it a shot and see how it looks.
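For reference, the comment-stripping normalization proposed in the diff can be sketched roughly as below. `SqlNormalizer` and `normalize` are hypothetical names for illustration, not the actual Iceberg code:

```scala
import java.util.Locale

object SqlNormalizer {
  // Sketch of the proposed normalization: lowercase, strip SQL comments,
  // then collapse all remaining whitespace (including newlines) to single spaces.
  def normalize(sqlText: String): String =
    sqlText.toLowerCase(Locale.ROOT).trim()
      // Strip multiline /* ... */ comments and single-line -- comments
      .replaceAll("(?ms)/\\*.*?\\*/|--.*?\\n", " ")
      // Collapse any remaining whitespace, including newlines
      .replaceAll("\\s+", " ")
      .trim()
}
```

Note that this regex would also strip `/*+ ... */` hint comments, which is exactly the concern above: only the parser can distinguish inlined SQL hints from documentation comments.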
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]