kbendick commented on a change in pull request #4396:
URL: https://github.com/apache/iceberg/pull/4396#discussion_r834863473
##########
File path:
spark/v3.2/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala
##########
@@ -176,7 +176,12 @@ class IcebergSparkSqlExtensionsParser(delegate: ParserInterface) extends ParserI
 }
 private def isIcebergCommand(sqlText: String): Boolean = {
- val normalized = sqlText.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ")
+ val normalized = sqlText.toLowerCase(Locale.ROOT).trim()
+ // Catch all SQL comments, including simple comments (that start with --), as well as multiline comments
+ .replaceAll("(?ms)/\\*.*?\\*/|--.*?\\n", " ")
Review comment:
I can also explain it if we'd like. The `(?ms)` prefix turns on both
multi-line mode and single-line (DOTALL) mode, and the pattern catches two
scenarios:
1) `/\\*.*?\\*/` is for catching comments of the form `/* ... */`. The `s`
flag (single-line, i.e. DOTALL, mode) allows the `.` to match a newline
character, so the comment can span multiple lines. The lazy `.*?` stops at
the first `*/`.
2) `--.*?\\n` is for catching simple SQL comments, which start with two
dashes and then consume the rest of the line they are on, like
`-- some line terminating comment`. The `.` can match newlines here too, but
the lazy `.*?` stops at the first newline character, which we search for
explicitly, so we don't risk a comment like that also swallowing the next
line (and there's a test for that case).
Those two alternatives are separated by an or, `|`. The `m` flag only changes
how `^` and `$` behave, and the pattern uses neither, so it has no effect
here.
But admittedly there's no chance I'd just sit down and craft a regex like
this without a lot of process. 🙂
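For anyone who wants to check the two cases by hand, here is a minimal sketch that applies the same pattern outside the parser. The `CommentStripSketch` object and its `stripComments` helper are hypothetical names used only for illustration and are not part of this PR; the SQL strings are made-up inputs.

```scala
object CommentStripSketch {
  // Same pattern as in the diff: a block comment, or a `--` comment ending in a newline.
  private val CommentPattern = "(?ms)/\\*.*?\\*/|--.*?\\n"

  // Hypothetical helper (not in the PR): replaces each matched comment with one space.
  def stripComments(sql: String): String = sql.replaceAll(CommentPattern, " ")

  def main(args: Array[String]): Unit = {
    // A block comment spanning two lines is removed as a single match.
    println(stripComments("/* multi\nline */CALL cat.system.proc()"))
    // A `--` comment stops at its own newline, so the following lines survive.
    println(stripComments("-- note\nCALL a\nCALL b"))
  }
}
```

In both cases the comment is replaced by a single space, and in the second case only the first line is consumed, which is the behavior the lazy `.*?` is meant to guarantee.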
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]