Stamatis Zampetakis created CALCITE-6915:
--------------------------------------------
Summary: Generalize terminology Linter to allow pattern based
checks in commit messages
Key: CALCITE-6915
URL: https://issues.apache.org/jira/browse/CALCITE-6915
Project: Calcite
Issue Type: Improvement
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
CALCITE-6493 added some checks for enforcing certain terminology (mostly
focused on database systems) in commit messages. However, there are still
various terms that the existing checks do not capture. Consider, for instance,
the ["snowflake"
term|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L378]
and the following messages:
# Add support for Snowflake dialect
# Add support for snowflake dialect
# Add support for snowFlake dialect
# Add support for SnowFlake dialect
Normally, only the first commit message should be valid since the accepted term
is "Snowflake". The check correctly flags case 2 as invalid but fails to
catch cases 3 and 4.
The current implementation is based on an exact-match word pattern, which would
require every single casing permutation of the word "snowflake" to be added to
the
[map|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L71].
This already happens to some extent for the
[MySQL|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L367]
term that appears twice in the map.
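
To illustrate the limitation, here is a minimal sketch of an exact-match check. It is not the actual LintTest code: the class name ExactMatchSketch, the TERMS map, and the check method are hypothetical, and I am assuming a map from each rejected spelling to the accepted term.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of an exact-match terminology check; not the Calcite code. */
class ExactMatchSketch {
  // Assumption: each key is one rejected spelling, the value is the accepted term.
  static final Map<String, String> TERMS = new LinkedHashMap<>();

  static {
    TERMS.put("snowflake", "Snowflake");
  }

  /** Returns one error per rejected spelling found as a whole word. */
  static List<String> check(String message) {
    List<String> errors = new ArrayList<>();
    for (Map.Entry<String, String> e : TERMS.entrySet()) {
      // Case-sensitive, whole-word match on the exact key only.
      Pattern p = Pattern.compile("\\b" + Pattern.quote(e.getKey()) + "\\b");
      if (p.matcher(message).find()) {
        errors.add("use '" + e.getValue() + "' instead of '" + e.getKey() + "'");
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    String[] messages = {
        "Add support for Snowflake dialect",
        "Add support for snowflake dialect",
        "Add support for snowFlake dialect",
        "Add support for SnowFlake dialect"
    };
    for (String m : messages) {
      System.out.println(m + " -> " + check(m));
    }
  }
}
```

With only the "snowflake" key present, the second message is the only one reported. Catching "snowFlake" and "SnowFlake" would need one extra map entry per spelling.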
In some cases, terminology rules may require more than just handling different
casings. For this reason, I propose generalizing the terminology linter to use a
pattern-based definition that can capture more than one variant of a word, and
extending the reference term to be a Set instead of a single entry.
Some secondary improvements from the proposed generalization are:
* the use of pre-compiled patterns that are instantiated only once
* the switch from Map to List as the container of the rules for faster
iteration and better readability
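
A rough sketch of what the generalized rules could look like is given below. The names TermRuleSketch, TermRule, RULES, and check are hypothetical, as are the two example regular expressions; the actual change may be structured differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of pattern-based terminology rules; not the Calcite code. */
class TermRuleSketch {
  /** A rule: a pattern capturing all variants, plus the set of accepted spellings. */
  static final class TermRule {
    final Pattern pattern;
    final Set<String> validTerms;

    TermRule(String regex, String... validTerms) {
      // Compiled once, when the rule is created.
      this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
      this.validTerms = new HashSet<>(Arrays.asList(validTerms));
    }
  }

  // A List instead of a Map: the rules are only ever iterated, never looked up.
  static final List<TermRule> RULES = Arrays.asList(
      new TermRule("\\bsnowflake\\b", "Snowflake"),
      new TermRule("\\bmy\\s?sql\\b", "MySQL"));

  /** Returns one error for every matched variant that is not an accepted spelling. */
  static List<String> check(String message) {
    List<String> errors = new ArrayList<>();
    for (TermRule rule : RULES) {
      Matcher m = rule.pattern.matcher(message);
      while (m.find()) {
        String found = m.group();
        if (!rule.validTerms.contains(found)) {
          errors.add("invalid term '" + found + "'; expected one of " + rule.validTerms);
        }
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    System.out.println(check("Add support for Snowflake dialect"));
    System.out.println(check("Add support for snowFlake dialect"));
    System.out.println(check("Fix My SQL dialect"));
  }
}
```

In this sketch a single case-insensitive pattern covers every casing of "snowflake", and the second rule shows a pattern that goes beyond casing by also matching an optional space inside "MySQL".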