This is an automated email from the ASF dual-hosted git repository.

mariofusco pushed a commit to branch dev-new-parser
in repository https://gitbox.apache.org/repos/asf/incubator-kie-drools.git

commit 03b35bc3b850bd30f5084252a50fb23f2ae3cd4d
Author: Toshiya Kobayashi <[email protected]>
AuthorDate: Fri Nov 17 17:16:34 2023 +0900

    [DROOLS-7577] drools-lsp : Add comments and explanations for drools-p… (#43)
    
    * [DROOLS-7577] drools-lsp : Add comments and explanations for drools-parser
    
    * removed syntax examples
---
 drools-drl/drools-drl10-parser/Developer_Notes.md  | 38 ++++++++++
 .../java/org/drools/parser/DRLErrorListener.java   |  3 +
 .../java/org/drools/parser/DRLParserError.java     |  3 +
 .../java/org/drools/parser/DRLParserHelper.java    | 13 ++++
 .../java/org/drools/parser/DRLParserWrapper.java   |  6 ++
 .../java/org/drools/parser/DRLVisitorImpl.java     | 86 ++++++++++++++++------
 .../java/org/drools/parser/ParserStringUtils.java  | 18 ++++-
 7 files changed, 142 insertions(+), 25 deletions(-)

diff --git a/drools-drl/drools-drl10-parser/Developer_Notes.md 
b/drools-drl/drools-drl10-parser/Developer_Notes.md
new file mode 100644
index 0000000000..c521ee6d14
--- /dev/null
+++ b/drools-drl/drools-drl10-parser/Developer_Notes.md
@@ -0,0 +1,38 @@
+## drools-parser
+
+This module is a reimplementation of the DRL (Drools Rule Language) parser 
based on ANTLR4.
+
+The current 
[DRL6Parser](https://github.com/apache/incubator-kie-drools/blob/main/drools-drl/drools-drl-parser/src/main/java/org/drools/drl/parser/lang/DRL6Parser.java)
 is based on ANTLR3 and contains a lot of custom modifications, which is hard 
to maintain. This new module should keep the separation between the parser 
syntax (`DRLParser.g4`) and the Descr generation (`DRLVisitorImpl.java`).
+
+This module started with a part of LSP to develop DRL editors, but it is not 
limited to that. This module will also replace DRL6Parser in the drools code 
base.
+
+### How is this developed?
+
+1. The starting point is 
[DRL6Parser](https://github.com/apache/incubator-kie-drools/blob/main/drools-drl/drools-drl-parser/src/main/java/org/drools/drl/parser/lang/DRL6Parser.java).
 While it contains lots of customizations, we can map its javadoc (e.g. 
`packageStatement := PACKAGE qualifiedIdentifier SEMICOLON?`) to `DRLParser.g4` 
(e.g. `packagedef : PACKAGE name=drlQualifiedName SEMI? ;`).
+2. `DRLLexer.g4` is written to define tokens for DRL.
+3. `DRLLexer.g4` imports `JavaLexer.g4` to reuse Java tokens. `DRLParser.g4` 
imports `JavaParser.g4` to reuse Java grammar. These Java parser files are 
distributed by ANTLR4 under BSD license.
+4. In `DRLLexer.g4`, basically define tokens with a prefix "DRL_" to clarify 
they are DRL keywords.
+5. In `DRLParser.g4`, define parser rules with a prefix "drl" if the rule name 
conflicts with `JavaParser.g4`. Sometimes we need to do that, because such a 
rule may contain DRL keywords.
+6. (As of 2023/10/31) this parser doesn't deeply parse rule RHS (just multiple 
`RHS_CHUNK`s), because Drools passes RHS text to drools-compiler as-is. In case 
of developing DRL editors, we may need to integrate another Java LSP to support 
RHS code completion, etc.
+7. LHS constraint (e.g. `age > 30`) is also handled as text. Further 
processing will be done in the later compiler phase.
+8. `DRLParser` processes a DRL text and produces an AST(abstract syntax tree). 
Then apply `DRLVisitorImpl` to generate PackageDescr following the visitor 
pattern. So the main work would be implementing `DRLParser.g4` and 
`DRLVisitorImpl`.
+9. Errors are handled by `DRLErrorListener`
+10. (As of 2023/10/31) We have 2 test classes. `DRLParserTest` is a very basic 
test to check if the parser can parse DRL. `MiscDRLParserTest` contains various 
DRL syntax to check if the parser generates correct Descr objects. 
`MiscDRLParserTest` was ported from 
[RuleParserTest](https://github.com/apache/incubator-kie-drools/blob/main/drools-test-coverage/test-compiler-integration/src/test/java/org/drools/mvel/compiler/lang/RuleParserTest.java)
 so that we can ensure the compatibility of g [...]
+11. As `DRL6Parser` contains hard-coded customizations, sometimes we need to 
read and understand the `DRL6Parser` source codes to meet the compatibility.
+12. (As of 2023/10/31) `MiscDRLParserTest` still has several test cases with 
`@Disabled` which are relatively lower priority or edge cases. They need to be 
resolved at some point in the future. To fix the issues, file a JIRA, remove 
the `@Disabled` annotation, and fix the implementation to pass the test case.
+
+### Next steps
+
+1. Create a feature branch in drools repo and replace `DRL6Parser` with this 
new parser.
+2. We will detect issues in the new parser by running the existing tests in 
drools repo. If we find any issues, we will fix them in the new parser and add 
new tests to cover them. Such tests would be more or less Descr comparison 
tests, so we would add a new test class which is similar to `MiscDRLParserTest`.
+
+### Refactoring candidates
+- `DRLParserHelper` and `DRLParserWrapper` have some duplicated code and 
purpose. We can merge them into one class.
+- `MiscDRLParserTest` can be cleaner and fixed to align with SonarLint 
suggestions.
+- Constraint related parser rules after `conditionalOrExpression` are written 
in antlr3 style. They could be refactored to antlr4 style (like 
`lhsExpression`).
+
+### Development tips
+- IntelliJ IDEA has an ANTLR4 plugin, which "ANTLR Preview" window displays a 
parse tree. It is very useful to debug the parser rules.
+
+### Resources
+[The Definitive ANTLR 4 
Reference](https://pragprog.com/titles/tpantlr2/the-definitive-antlr-4-reference/)
\ No newline at end of file
diff --git 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLErrorListener.java
 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLErrorListener.java
index a15d12addd..f7c6504df4 100644
--- 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLErrorListener.java
+++ 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLErrorListener.java
@@ -7,6 +7,9 @@ import org.antlr.v4.runtime.BaseErrorListener;
 import org.antlr.v4.runtime.RecognitionException;
 import org.antlr.v4.runtime.Recognizer;
 
+/**
+ * Collect errors while parsing DRL
+ */
 public class DRLErrorListener extends BaseErrorListener {
 
     private final List<DRLParserError> errors = new ArrayList<>();
diff --git 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserError.java
 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserError.java
index 6b9fe11694..ec3e41747d 100644
--- 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserError.java
+++ 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserError.java
@@ -1,5 +1,8 @@
 package org.drools.parser;
 
+/**
+ * Error information while parsing DRL
+ */
 public class DRLParserError {
 
     private int lineNumber;
diff --git 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserHelper.java
 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserHelper.java
index 3bd15b3f4f..bee7a6e96d 100644
--- 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserHelper.java
+++ 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserHelper.java
@@ -10,11 +10,18 @@ import org.antlr.v4.runtime.tree.ParseTree;
 import org.antlr.v4.runtime.tree.TerminalNode;
 import org.drools.drl.ast.descr.PackageDescr;
 
+/**
+ * Collection of static helper methods for DRLParser
+ */
 public class DRLParserHelper {
 
     private DRLParserHelper() {
     }
 
+    /**
+     * Entry point for parsing DRL.
+     * Unlike DRLParserWrapper.parse(), this method does not collect errors.
+     */
     public static PackageDescr parse(String drl) {
         DRLParser drlParser = createDrlParser(drl);
         return 
compilationUnitContext2PackageDescr(drlParser.compilationUnit(), 
drlParser.getTokenStream());
@@ -27,6 +34,9 @@ public class DRLParserHelper {
         return new DRLParser(commonTokenStream);
     }
 
+    /**
+     * DRLVisitorImpl visits a parse tree and creates a PackageDescr
+     */
     public static PackageDescr 
compilationUnitContext2PackageDescr(DRLParser.CompilationUnitContext ctx, 
TokenStream tokenStream) {
         DRLVisitorImpl visitor = new DRLVisitorImpl(tokenStream);
         Object descr = visitor.visit(ctx);
@@ -37,6 +47,9 @@ public class DRLParserHelper {
         }
     }
 
+    /**
+     * Given a row and column of the input DRL, return the index of the 
matched token
+     */
     public static Integer computeTokenIndex(DRLParser parser, int row, int 
col) {
         for (int i = 0; i < parser.getInputStream().size(); i++) {
             Token token = parser.getInputStream().get(i);
diff --git 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserWrapper.java
 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserWrapper.java
index 06992f0440..8f24bf1d3f 100644
--- 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserWrapper.java
+++ 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLParserWrapper.java
@@ -10,12 +10,18 @@ import org.slf4j.LoggerFactory;
 
 import static 
org.drools.parser.DRLParserHelper.compilationUnitContext2PackageDescr;
 
+/**
+ * Wrapper for DRLParser. Somewhat duplicated from DRLParserHelper, but this 
class is instantiated and holds errors.
+ */
 public class DRLParserWrapper {
 
     private static final Logger LOGGER = 
LoggerFactory.getLogger(DRLParserWrapper.class);
 
     private final List<DRLParserError> errors = new ArrayList<>();
 
+    /**
+     * Main entry point for parsing DRL
+     */
     public PackageDescr parse(String drl) {
         DRLParser drlParser = DRLParserHelper.createDrlParser(drl);
         DRLErrorListener errorListener = new DRLErrorListener();
diff --git 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLVisitorImpl.java
 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLVisitorImpl.java
index 8212ac32be..3f920cc13a 100644
--- 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLVisitorImpl.java
+++ 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/DRLVisitorImpl.java
@@ -48,6 +48,12 @@ import static 
org.drools.parser.ParserStringUtils.safeStripStringDelimiters;
 import static org.drools.parser.ParserStringUtils.trimThen;
 import static org.drools.util.StringUtils.unescapeJava;
 
+/**
+ * Visitor implementation for DRLParser.
+ * Basically, each visit method creates and returns a Descr object traversing 
the parse tree.
+ * Finally, visitCompilationUnit() returns a PackageDescr object.
+ * Try not to depend on DRLVisitorImpl's internal state for clean 
maintainability
+ */
 public class DRLVisitorImpl extends DRLParserBaseVisitor<Object> {
 
     private final TokenStream tokenStream;
@@ -56,6 +62,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         this.tokenStream = tokenStream;
     }
 
+    /**
+     * Main entry point for creating PackageDescr from a parser tree.
+     */
     @Override
     public PackageDescr visitCompilationUnit(DRLParser.CompilationUnitContext 
ctx) {
         PackageDescr packageDescr = new PackageDescr();
@@ -67,7 +76,11 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         return packageDescr;
     }
 
+    /**
+     * Add all children Descr to PackageDescr
+     */
     private void applyChildrenDescrs(PackageDescr packageDescr, 
List<BaseDescr> descrList) {
+        // This bunch of if-blocks will be refactored by DROOLS-7564
         descrList.forEach(descr -> {
             if (descr instanceof UnitDescr) {
                 packageDescr.setUnit((UnitDescr) descr);
@@ -147,6 +160,8 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
             functionDescr.setReturnType("void");
         }
         functionDescr.setName(ctx.IDENTIFIER().getText());
+
+        // add function parameters
         DRLParser.FormalParametersContext formalParametersContext = 
ctx.formalParameters();
         DRLParser.FormalParameterListContext formalParameterListContext = 
formalParametersContext.formalParameterList();
         if (formalParameterListContext != null) {
@@ -163,7 +178,7 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
 
     @Override
     public BaseDescr visitDeclaredef(DRLParser.DeclaredefContext ctx) {
-        return visitDescrChildren(ctx).get(0);
+        return visitDescrChildren(ctx).get(0); // only one child
     }
 
     @Override
@@ -202,6 +217,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         return windowDeclarationDescr;
     }
 
+    /**
+     * entry point for one rule
+     */
     @Override
     public RuleDescr visitRuledef(DRLParser.RuledefContext ctx) {
         RuleDescr ruleDescr = new 
RuleDescr(safeStripStringDelimiters(ctx.name.getText()));
@@ -228,16 +246,27 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
 
         if (ctx.rhs() != null) {
             ruleDescr.setConsequenceLocation(ctx.rhs().getStart().getLine(), 
ctx.rhs().getStart().getCharPositionInLine()); // location of "then"
-            
ruleDescr.setConsequence(trimThen(getTextPreservingWhitespace(ctx.rhs())));
+            
ruleDescr.setConsequence(trimThen(getTextPreservingWhitespace(ctx.rhs()))); // 
RHS is just a text
         }
 
         return ruleDescr;
     }
 
     private void slimLhsRootDescr(AndDescr root) {
+        // Root Descr is always AndDescr.
+        // For example, if there are nested AndDescr like
+        //  AndDescr
+        //  /\
+        // P  AndDescr
+        //     /\
+        //    P  P
+        // is slimmed down to
+        //  AndDescr
+        //  / | \
+        // P  P  P
         List<BaseDescr> descrList = new ArrayList<>(root.getDescrs());
         root.getDescrs().clear();
-        descrList.forEach(root::addOrMerge); // This slims down nested AndDescr
+        descrList.forEach(root::addOrMerge);
     }
 
     @Override
@@ -373,6 +402,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         return attributeDescr;
     }
 
+    /**
+     * entry point for LHS
+     */
     @Override
     public List<BaseDescr> visitLhs(DRLParser.LhsContext ctx) {
         if (ctx.lhsExpression() != null) {
@@ -394,10 +426,12 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
     }
 
     private PatternDescr getSinglePatternDescr(DRLParser.LhsPatternBindContext 
ctx) {
-        Optional<BaseDescr> optPatternDescr = visitFirstDescrChild(ctx);
-        PatternDescr patternDescr = 
optPatternDescr.filter(PatternDescr.class::isInstance)
-                .map(PatternDescr.class::cast)
-                .orElseThrow(() -> new IllegalStateException("lhsPatternBind 
must have at least one lhsPattern : " + ctx.getText()));
+        List<BaseDescr> patternDescrList = visitDescrChildren(ctx);
+        if (patternDescrList.isEmpty() || !(patternDescrList.get(0) instanceof 
PatternDescr)) {
+            throw new IllegalStateException("lhsPatternBind must have at least 
one lhsPattern : " + ctx.getText());
+        }
+        PatternDescr patternDescr = (PatternDescr)patternDescrList.get(0);
+
         if (ctx.label() != null) {
             patternDescr.setIdentifier(ctx.label().IDENTIFIER().getText());
         } else if (ctx.unif() != null) {
@@ -423,6 +457,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         return orDescr;
     }
 
+    /**
+     * entry point for a Pattern
+     */
     @Override
     public PatternDescr visitLhsPattern(DRLParser.LhsPatternContext ctx) {
         PatternDescr patternDescr = new PatternDescr(ctx.objectType.getText());
@@ -528,6 +565,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         return new 
EntryPointDescr(safeStripStringDelimiters(ctx.stringId().getText()));
     }
 
+    /**
+     * Collect constraints in a Pattern
+     */
     @Override
     public List<ExprConstraintDescr> 
visitConstraints(DRLParser.ConstraintsContext ctx) {
         List<ExprConstraintDescr> exprConstraintDescrList = new ArrayList<>();
@@ -535,6 +575,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         return exprConstraintDescrList;
     }
 
+    /**
+     * Collect constraints in a Pattern. Positional constraints comes first 
with semicolon.
+     */
     private List<ExprConstraintDescr> 
visitConstraints(DRLParser.PositionalConstraintsContext positionalCtx, 
DRLParser.ConstraintsContext ctx) {
         List<ExprConstraintDescr> exprConstraintDescrList = new ArrayList<>();
         populateExprConstraintDescrList(positionalCtx, 
exprConstraintDescrList);
@@ -557,6 +600,9 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         }
     }
 
+    /**
+     * Takes one constraint as String and create ExprConstraintDescr
+     */
     @Override
     public ExprConstraintDescr visitConstraint(DRLParser.ConstraintContext 
ctx) {
         String constraint = visitConstraintChildren(ctx);
@@ -618,6 +664,7 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
 
     @Override
     public BaseDescr 
visitLhsExpressionEnclosed(DRLParser.LhsExpressionEnclosedContext ctx) {
+        // enclosed expression is simply stripped because Descr itself is 
encapsulated
         return (BaseDescr) visit(ctx.lhsExpression());
     }
 
@@ -708,7 +755,7 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
 
     @Override
     public BaseDescr visitLhsUnary(DRLParser.LhsUnaryContext ctx) {
-        return visitDescrChildren(ctx).get(0);
+        return visitDescrChildren(ctx).get(0); // lhsUnary has only one child
     }
 
     private void populateStartEnd(BaseDescr descr, ParserRuleContext ctx) {
@@ -718,6 +765,10 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         descr.setEndCharacter(ctx.getStop().getStopIndex());
     }
 
+    /**
+     * This is a special version of visitChildren().
+     * This collects children BaseDescr objects and returns them as a list.
+     */
     private List<BaseDescr> visitDescrChildren(RuleNode node) {
         List<BaseDescr> aggregator = new ArrayList<>();
         int n = node.getChildCount();
@@ -768,21 +819,10 @@ public class DRLVisitorImpl extends 
DRLParserBaseVisitor<Object> {
         }
     }
 
-    // leaves of constraint concatenate return Strings
+    /**
+     * Return the text of constraint as-is
+     */
     private String visitConstraintChildren(ParserRuleContext ctx) {
         return getTokenTextPreservingWhitespace(ctx, tokenStream);
     }
-
-    private Optional<BaseDescr> visitFirstDescrChild(RuleNode node) {
-        int n = node.getChildCount();
-
-        for (int i = 0; i < n && this.shouldVisitNextChild(node, null); ++i) {
-            ParseTree c = node.getChild(i);
-            Object childResult = c.accept(this);
-            if (childResult instanceof BaseDescr) {
-                return Optional.of((BaseDescr) childResult);
-            }
-        }
-        return Optional.empty();
-    }
-}
+}
\ No newline at end of file
diff --git 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/ParserStringUtils.java
 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/ParserStringUtils.java
index 1f84eb6f9c..a294bf2a92 100644
--- 
a/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/ParserStringUtils.java
+++ 
b/drools-drl/drools-drl10-parser/src/main/java/org/drools/parser/ParserStringUtils.java
@@ -5,13 +5,17 @@ import org.antlr.v4.runtime.TokenStream;
 import org.antlr.v4.runtime.misc.Interval;
 
 /**
- * will be merged in drools-util
+ * Collection of String utilities used by DRLParser.
+ * This may be merged in drools-util
  */
 public class ParserStringUtils {
 
     private ParserStringUtils() {
     }
 
+    /**
+     * Strip string delimiters (e.g. "foo" -> foo)
+     */
     public static String safeStripStringDelimiters(String value) {
         if (value != null) {
             value = value.trim();
@@ -22,6 +26,9 @@ public class ParserStringUtils {
         return value;
     }
 
+    /**
+     * Get text from ParserRuleContext's CharStream without trimming whitespace
+     */
     public static String getTextPreservingWhitespace(ParserRuleContext ctx) {
         // Using raw CharStream
         int startIndex = ctx.start.getStartIndex();
@@ -34,11 +41,18 @@ public class ParserStringUtils {
         return ctx.start.getTokenSource().getInputStream().getText(interval);
     }
 
+    /**
+     * Get text from ParserRuleContext's CharStream without trimming whitespace
+     * tokenStream is required to get hidden channel token (e.g. whitespace).
+     * Unlike getTextPreservingWhitespace, this method reflects Lexer 
normalizeString
+     */
     public static String getTokenTextPreservingWhitespace(ParserRuleContext 
ctx, TokenStream tokenStream) {
-        // tokenStream is required to get hidden channel token (e.g. 
whitespace). Unlike getTextPreservingWhitespace, this method reflects Lexer 
normalizeString
         return tokenStream.getText(ctx.start, ctx.stop);
     }
 
+    /**
+     * Just remove leading "then"
+     */
     public static String trimThen(String rhs) {
         if (rhs.startsWith("then")) {
             return rhs.substring("then".length());


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to