wu-sheng opened a new pull request, #13723:
URL: https://github.com/apache/skywalking/pull/13723

   ### Replace Groovy DSL runtime with ANTLR4 + Javassist for MAL, LAL, and Hierarchy
   
   - [x] This is a non-trivial feature. Design doc: 
`docs/en/academy/dsl-compiler-design.md`
   - [x] Documentation updated to include this new feature.
   - [x] Tests (UT, IT, E2E) are added to verify the new feature.
   
   - [ ] If this pull request closes/resolves/fixes an existing issue, replace 
the issue number. Closes #<issue number>.
   - [x] Update the [`CHANGES` 
log](https://github.com/apache/skywalking/blob/master/docs/en/changes/changes.md).
   
   #### What this PR does
   
   Replaces the Groovy-based DSL runtime for **MAL** (Meter Analysis Language), 
**LAL** (Log Analysis Language), and **Hierarchy** matching rules with 
compile-time ANTLR4 parsing and Javassist bytecode generation — the same 
approach already used by the OAL v2 engine.
   
   All three DSL compilers follow the same pipeline:
   ```
   DSL string → ANTLR4 parse → Immutable AST → Javassist bytecode → Direct Java execution
   ```
   
   #### Why
   
   - **Remove Groovy runtime dependency** (~7MB) from the OAP server classpath
   - **Eliminate runtime interpretation** — generated bytecode uses direct 
method calls with zero reflection at runtime
   - **Thread-safe by design** — all generated instances are stateless 
singletons; per-request state passed as parameters (no ThreadLocal, no mutable 
wrappers)
   - **Fail-fast at boot** — DSL compilation errors are caught during startup 
with file/line/column reporting, not at first log/metric arrival
   - **Debugger-friendly** — all generated methods include LocalVariableTable 
(LVT) entries with named variables
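
The "stateless singleton, state passed as parameters" point can be illustrated with a minimal sketch. It is not taken from the PR: `Expression` and `DoubledCpu` are hypothetical stand-ins, and `Double` replaces `SampleFamily` to keep the example self-contained.

```java
import java.util.Map;

final class StatelessExpressionSketch {
    /** Hypothetical stand-in for a compiled expression interface. */
    interface Expression {
        double run(Map<String, Double> samples);
    }

    /** No instance fields: everything run() needs arrives through its parameter. */
    static final class DoubledCpu implements Expression {
        // One shared instance can serve all threads because run() mutates no state.
        static final DoubledCpu INSTANCE = new DoubledCpu();

        private DoubledCpu() { }

        @Override
        public double run(Map<String, Double> samples) {
            // Per-request data is read from the argument only.
            double value = samples.getOrDefault("instance_jvm_cpu", 0.0);
            return value * 2;
        }
    }
}
```

Because `run()` touches only its argument and local variables, concurrent callers with different maps cannot interfere with each other, which is why no `ThreadLocal` is needed.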
   
   #### Architecture
   
   | DSL | Compiled Interface | Runtime Signature | State Passing |
   |-----|-------------------|-------------------|---------------|
   | MAL | `MalExpression` | `SampleFamily run(Map<String, SampleFamily>)` | Parameter |
   | LAL | `LalExpression` | `void execute(FilterSpec, ExecutionContext)` | Parameter |
   | Hierarchy | `BiFunction<Service, Service, Boolean>` | `Boolean apply(Service, Service)` | Parameter |
   
   All v2 classes live under `.v2.` packages to avoid FQCN conflicts with v1 
(Groovy) classes, which remain in 
`test/script-cases/script-runtime-with-groovy/` for comparison testing.
   
   ---
   
   #### Compiled Code Examples
   
   ##### MAL Example 1 — Simple aggregation
   
   **DSL**: `instance_jvm_cpu.sum(['service', 'instance'])`
   
   **Generated `run()` method**:
   ```java
   public SampleFamily run(Map samples) {
     SampleFamily sf;
     sf = ((SampleFamily) samples.getOrDefault("instance_jvm_cpu",
         SampleFamily.EMPTY));
     sf = sf.sum(new String[]{"service", "instance"});
     return sf;
   }
   ```
   
   ##### MAL Example 2 — Tag closure (compiled as a separate class)
   
   **DSL**: `metric.tag({tags -> tags.service_name = 'APISIX::' + tags.skywalking_service})`
   
   **Generated main class `run()`**:
   ```java
   public SampleFamily run(Map samples) {
     SampleFamily sf;
     sf = ((SampleFamily) samples.getOrDefault("metric", SampleFamily.EMPTY));
     sf = sf.tag(this._tag);          // _tag field holds pre-compiled closure instance
     return sf;
   }
   ```
   
   **Generated closure class `apply()` (separate `MalExpr_N$_tag`)**:
   ```java
   public Map apply(Map tags) {
     tags.put("service_name", "APISIX::" + tags.get("skywalking_service"));
     return tags;
   }
   ```
   
   Each closure is compiled to its own class, because Javassist's source compiler does not support lambdas or anonymous classes. The closure instance is stored in a field on the main class, and that field is wired via reflection at compile time.
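
A minimal sketch of this wiring step, assuming plain hand-written classes in place of the Javassist-generated ones. `TagClosure`, `MainExpr`, and `wire()` are hypothetical names, and `Map<String, String>` stands in for the real tag and sample types.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.function.Function;

final class ClosureWiringSketch {
    /** Stand-in for the separately generated closure class. */
    public static final class TagClosure
            implements Function<Map<String, String>, Map<String, String>> {
        @Override
        public Map<String, String> apply(Map<String, String> tags) {
            tags.put("service_name", "APISIX::" + tags.get("skywalking_service"));
            return tags;
        }
    }

    /** Stand-in for the generated main class; _tag starts out unset. */
    public static final class MainExpr {
        public Function<Map<String, String>, Map<String, String>> _tag;

        public Map<String, String> run(Map<String, String> tags) {
            return _tag.apply(tags); // direct call, no reflection on the hot path
        }
    }

    /** One-time wiring: reflection is used only here, never per request. */
    static MainExpr wire() {
        try {
            MainExpr expr = new MainExpr();
            Field field = MainExpr.class.getField("_tag");
            field.set(expr, new TagClosure());
            return expr;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("closure wiring failed", e);
        }
    }
}
```

After `wire()` returns, every call to `run()` is an ordinary field read followed by an interface call.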
   
   ##### MAL Example 3 — Regex match with ternary
   
   **DSL**: `metric.tag({ tags -> def matcher = (tags.metrics_name =~ /\.ssl\.certificate\.([^.]+)\.expiration/); tags.secret_name = matcher ? matcher[0][1] : "unknown" })`
   
   **Generated closure class `apply()`**:
   ```java
   public Map apply(Map tags) {
     String[][] matcher = MalRuntimeHelper.regexMatch(
         (String) tags.get("metrics_name"),
         "\\.ssl\\.certificate\\.([^.]+)\\.expiration");
     tags.put("secret_name", (((Object)(matcher)) != null ? (matcher[0][1]) : ("unknown")));
     return tags;
   }
   ```
   
   The type of the `def` variable is inferred as `String[][]` from the `=~` regex match. The Groovy truthiness test in the ternary compiles to a Java ternary that null-checks the value after casting it to `Object`.
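
The generated code above implies that the helper returns `null` when nothing matches and a `String[][]` otherwise. The sketch below shows one way such a helper could behave. It is an assumption written with `java.util.regex`, not the actual `MalRuntimeHelper.regexMatch` implementation, and it captures only the first match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexMatchSketch {
    /**
     * Returns null when the input is null or nothing matches; otherwise a single
     * row holding the full match at index 0 followed by the capture groups.
     */
    static String[][] regexMatch(String input, String regex) {
        if (input == null) {
            return null;
        }
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null; // lets the caller's "matcher != null" ternary pick the fallback
        }
        String[] row = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            row[i] = m.group(i);
        }
        return new String[][] {row};
    }
}
```

With this shape, `matcher[0][1]` in the generated closure reads the first capture group of the first match, mirroring the Groovy expression.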
   
   ---
   
   ##### LAL Example 1 — JSON parser with extractor
   
   **DSL**:
   ```groovy
   filter {
     json {}
     extractor {
       service parsed.service as String
       instance parsed.instance as String
     }
     sink {}
   }
   ```
   
   **Generated class**:
   ```java
   public void execute(FilterSpec filterSpec, ExecutionContext ctx) {
     LalRuntimeHelper h = new LalRuntimeHelper(ctx);
     filterSpec.json(ctx);
     if (!ctx.shouldAbort()) { _extractor(filterSpec.extractor(), h); }
     filterSpec.sink(ctx);
   }
   
   private void _extractor(ExtractorSpec _e, LalRuntimeHelper h) {
     _e.service(h.ctx(), h.toStr(h.mapVal("service")));
     _e.instance(h.ctx(), h.toStr(h.mapVal("instance")));
   }
   ```
   
   A single class is generated, with no closure classes. `h.mapVal()` reads from the parsed JSON map. `h.toStr()` preserves `null` (unlike `String.valueOf()`, which returns `"null"`).
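
The difference between a null-preserving conversion and `String.valueOf()` is small but easy to get wrong. A minimal sketch, assuming a static `toStr` in a hypothetical class rather than the real `LalRuntimeHelper` method:

```java
final class NullPreservingToStr {
    /** Converts to String but keeps null as null instead of the text "null". */
    static String toStr(Object value) {
        return value == null ? null : value.toString();
    }

    /** For contrast: the JDK conversion turns a null reference into "null". */
    static String jdkValueOf(Object value) {
        return String.valueOf(value);
    }
}
```

Keeping `null` intact means a missing JSON field does not end up stored as the literal service name `"null"`.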
   
   ##### LAL Example 2 — Proto-based with extraLogType (Envoy ALS)
   
   **DSL**:
   ```groovy
   filter {
     if (parsed?.response?.responseCode?.value as Integer < 400) { abort {} }
     extractor {
       if (parsed?.response?.responseCode) {
         tag 'status.code': parsed?.response?.responseCode?.value
       }
       tag 'response.flag': parsed?.commonProperties?.responseFlags
     }
     sink {}
   }
   ```
   
   **Generated class** (with `extraLogType = HTTPAccessLogEntry`):
   ```java
   public void execute(FilterSpec filterSpec, ExecutionContext ctx) {
     LalRuntimeHelper h = new LalRuntimeHelper(ctx);
     // Cast once, reuse as _p
     HTTPAccessLogEntry _p = (HTTPAccessLogEntry) h.ctx().extraLog();
     // Safe-nav chain cached in local variables
     HTTPResponseProperties _t0 = _p == null ? null : _p.getResponse();
     UInt32Value _t1 = _t0 == null ? null : _t0.getResponseCode();
     if (_t1 != null && _t1.getValue() < 400) { filterSpec.abort(ctx); }
     if (!ctx.shouldAbort()) { _extractor(filterSpec.extractor(), h); }
     filterSpec.sink(ctx);
   }
   
   private void _extractor(ExtractorSpec _e, LalRuntimeHelper h) {
     HTTPAccessLogEntry _p = (HTTPAccessLogEntry) h.ctx().extraLog();
     HTTPResponseProperties _t0 = _p == null ? null : _p.getResponse();
     UInt32Value _t1 = _t0 == null ? null : _t0.getResponseCode();
     if (_t1 != null) {
       _e.tag(h.ctx(), "status.code", h.toStr(Integer.valueOf(_t1.getValue())));
     }
     AccessLogCommon _t2 = _p == null ? null : _p.getCommonProperties();
     _e.tag(h.ctx(), "response.flag",
         h.toStr(_t2 == null ? null : _t2.getResponseFlags()));
   }
   ```
   
   Proto getter chains are resolved via Java reflection **at compile time**, so at runtime they are direct method calls. `?.` safe navigation emits `== null ? null :` ternaries. Intermediate values are cached in `_tN` local variables for readability and deduplication.
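
The idea of resolving getters by reflection once and emitting plain source can be sketched as follows. This is a simplified illustration, not the PR's code generator: `Entry`, `Response`, and `Code` are hypothetical stand-ins for the protobuf types, and the sketch handles only getters that return object types.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class SafeNavEmitter {
    // Hypothetical stand-ins for generated protobuf message classes.
    public static final class Code { public int getValue() { return 200; } }
    public static final class Response { public Code getResponseCode() { return new Code(); } }
    public static final class Entry { public Response getResponse() { return new Response(); } }

    /**
     * Walks a dotted property path, looks up each getter by reflection, and
     * returns one null-safe Java statement per step, cached in _t0, _t1, ...
     */
    static List<String> emit(Class<?> rootType, String rootVar, String path) {
        List<String> lines = new ArrayList<>();
        Class<?> type = rootType;
        String prev = rootVar;
        int n = 0;
        for (String prop : path.split("\\.")) {
            String getter = "get" + Character.toUpperCase(prop.charAt(0)) + prop.substring(1);
            Method method;
            try {
                method = type.getMethod(getter); // fails here, at compile time, if the property is unknown
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException("no getter " + getter + " on " + type.getName(), e);
            }
            Class<?> ret = method.getReturnType();
            String var = "_t" + n++;
            lines.add(ret.getSimpleName() + " " + var + " = "
                + prev + " == null ? null : " + prev + "." + getter + "();");
            type = ret;
            prev = var;
        }
        return lines;
    }
}
```

Calling `SafeNavEmitter.emit(SafeNavEmitter.Entry.class, "_p", "response.responseCode")` yields two statements of the same shape as the `_t0` and `_t1` lines in the generated class above. A misspelled property fails during emission rather than when the first log arrives.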
   
   ---
   
   ##### Hierarchy Example 1 — Simple name match
   
   **DSL**: `{ (u, l) -> u.name == l.name }`
   
   **Generated class**:
   ```java
   public Object apply(Object arg0, Object arg1) {
     Service u = (Service) arg0;
     Service l = (Service) arg1;
     return Boolean.valueOf(java.util.Objects.equals(u.getName(), l.getName()));
   }
   ```
   
   ##### Hierarchy Example 2 — Block body with if/return
   
   **DSL**: `{ (u, l) -> { if (l.shortName.lastIndexOf('.') > 0) { return u.shortName == l.shortName.substring(0, l.shortName.lastIndexOf('.')); } return false; } }`
   
   **Generated class**:
   ```java
   public Object apply(Object arg0, Object arg1) {
     Service u = (Service) arg0;
     Service l = (Service) arg1;
     if (l.getShortName().lastIndexOf(".") > 0) {
       return Boolean.valueOf(java.util.Objects.equals(
           u.getShortName(),
           l.getShortName().substring(0, l.getShortName().lastIndexOf("."))));
     }
     return Boolean.valueOf(false);
   }
   ```
   
   Property access → getter methods. `==` → `Objects.equals()`. Numeric `>` → 
direct operator.
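
The `==` to `Objects.equals()` mapping matters because Groovy's `==` compares values and tolerates `null`, while Java's `==` on objects compares references. A small sketch of the difference (the class and method names are hypothetical):

```java
import java.util.Objects;

final class EqualityTranslationSketch {
    /** What a naive one-to-one translation of == would do in Java. */
    static boolean referenceEquals(String a, String b) {
        return a == b;
    }

    /** What the generated code does: null-safe value comparison. */
    static boolean valueEquals(String a, String b) {
        return Objects.equals(a, b);
    }
}
```

Two distinct `String` objects with the same characters are equal under `valueEquals` but not under `referenceEquals`, and `valueEquals(null, null)` returns `true` without throwing.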
   
   ---
   
   #### v1 vs v2 Cross-Verification
   
   The `test/script-cases/script-runtime-with-groovy/` module runs every production DSL expression through **both** Groovy v1 and Javassist v2, then asserts identical results. Both v1 and v2 must pass: a v1 failure also calls `fail()` instead of being silently skipped.
   
   ##### How it works
   
   ```
   For each DSL expression in production YAML configs:
     1. Compile with v1 (Groovy)    → run with mock data → collect output
     2. Compile with v2 (Javassist) → run with same data → collect output
     3. Assert v1 output == v2 output field by field
   ```
   
   v1 and v2 coexist in the same JVM via **package isolation** (`*.dsl.*` vs 
`*.v2.dsl.*`), each with its own `ModuleManager` mock.
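
The comparison loop can be sketched generically. This is an illustration of the approach, not the PR's test code: `mismatches` is a hypothetical helper, and `Function` stands in for a compiled v1 or v2 expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

final class CrossCheckSketch {
    /**
     * Runs both implementations on the same inputs and describes every
     * difference. Exceptions from either side propagate, so a failure in the
     * reference implementation is reported rather than skipped.
     */
    static <I, R> List<String> mismatches(Function<I, R> v1, Function<I, R> v2, List<I> inputs) {
        List<String> diffs = new ArrayList<>();
        for (I input : inputs) {
            R expected = v1.apply(input);
            R actual = v2.apply(input);
            if (!Objects.equals(expected, actual)) {
                diffs.add(input + ": v1=" + expected + ", v2=" + actual);
            }
        }
        return diffs;
    }
}
```

A test would then assert that the returned list is empty for every expression, printing the collected differences on failure.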
   
   ##### What is compared
   
   | DSL | Fields compared |
   |-----|----------------|
   | **MAL** (metadata) | `samples` (input metric names), `scopeType`, `downsampling`, `isHistogram`, `scopeLabels`, `aggregationLabels` |
   | **MAL** (runtime) | Output `Sample[]` arrays (label sets and values), after running both with identical `SampleFamily` input |
   | **LAL** | `shouldAbort`, `shouldSave`, `log.service`, `log.serviceInstance`, `log.endpoint`, `log.layer`, `log.timestamp`, `log.tags` |
   | **Hierarchy** | Boolean result for each `(upper, lower)` Service pair |
   
   ##### Test data
   
   - **MAL**: 73 companion `.data.yaml` files provide realistic mock 
`SampleFamily` input per YAML config. For `increase()`/`rate()` expressions, 
the checker primes the counter window with initial data before the comparison 
run. Expressions without companion data fall back to auto-generated mock data.
   - **LAL**: Mock `LogData` protobuf built with 
service/instance/endpoint/timestamp/traceContext. Rules with `extraLogType` 
(e.g., envoy-als proto) are skipped via `Assumptions.assumeTrue(false)` since 
they need typed protobuf mock data the generic test can't provide.
   - **Hierarchy**: Test pairs from `.data.yaml` with `(upper, lower, 
expected)` tuples per rule.
   
   ##### Verification counts
   
   | DSL | Expressions | Source |
   |-----|------------|--------|
   | MAL expressions | 1,229 | 73 YAML files across 4 directories |
   | MAL filter closures | 31 | Separate `MalFilterComparisonTest` |
   | LAL scripts | 29 (1 skipped) | oap-cases + feature-cases |
   | Hierarchy rules | 4 rules × test pairs | `test-hierarchy-definition.data.yaml` |
   | **Total** | **~1,290** | |
   
   ---
   
   #### Files changed
   
   537 files, +40,436 / -1,480 (mostly new test data files, POM version changes, and generated grammar sources)

