This is an automated email from the ASF dual-hosted git repository.
zwoop pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/trafficserver.git
The following commit(s) were added to refs/heads/master by this push:
new 9bcb9bcff8 hrw4u: add --error-format flag with pluggable formatters (#13147)
9bcb9bcff8 is described below
commit 9bcb9bcff874248619b4b1aabe816f0b81f33af4
Author: samtes <[email protected]>
AuthorDate: Thu May 14 20:19:02 2026 -0500
hrw4u: add --error-format flag with pluggable formatters (#13147)
* hrw4u: fail fast when the antlr generator is missing
Adds a check-antlr Make target that errors with an install hint
(brew/dnf/apt/bootstrap.sh) before attempting parser generation,
instead of surfacing a cryptic rule failure deep in gen-fwd on a
fresh checkout.
* hrw4u: extract error rendering into pluggable formatters
Introduces an ErrorFormatter ABC with PlainText, JSON, and Markdown
implementations. ErrorCollector now delegates rendering instead of
building the output string inline, with PlainText as the default so
existing callers see byte-for-byte identical output. The JSON schema
is versioned (v1) and emitted on a single line so bulk-mode runs
produce NDJSON on stderr.
* hrw4u: add --error-format flag for json and markdown output
Wires the formatter abstraction into the CLI with a new
--error-format {plain,json,markdown} option (default: plain, so
existing consumers are unaffected). Non-syntax errors (file I/O,
argument errors) now flow through the same formatter path via
new emit_fatal_message / emit_fatal_error helpers, so downstream
tools see one stable schema regardless of where the error came from.
* hrw4u: omit 'Found 1 error:' preamble for single-error output
When only one error is produced, the 'Found 1 error:' summary line
adds noise without adding information. Skip it in that case and keep
the 'Found N errors:' header for counts of two or more.
* hrw4u: reflow trailing-comma signatures so yapf can collapse them
A trailing comma on the last parameter forced yapf to keep one
argument per line, which clashes with the project's 132-column
horizontal style. Drop the trailing commas in the new formatter
signatures, call sites, and dict literals so yapf collapses them.
* hrw4u: extend antlr-on-PATH check to inverse generation
gen-fwd already gated on check-antlr, but make gen-inv (and any
target depending only on it) still failed with the cryptic original
error when the generator was missing.
* hrw4u: annotate _build_formatter return type
Restores the typed-signature convention used elsewhere in this module
so type-checkers and IDEs can follow the formatter plumbing.
* hrw4u: drop duplicated location prefix on parse-time fatal errors
The stop-on-error path embedded "{filename}:0:0 - " into the message
itself, so JSON/Markdown formatters double-encoded the location
(which already lives in the filename/line/column fields). Pass the
plain message and let the formatter own the location, matching the
collecting-errors branch above.
* hrw4u: move diagnostic gutter from note text into the formatter
The " | " gutter was baked into the "Did you mean" note at creation
time, which leaked into JSON output (as literal whitespace + pipe in
each notes entry) and Markdown output (as a stray "| " after the block
quote). Store the raw note text and let PlainTextFormatter prepend the
gutter, so each formatter owns its own decoration.
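
A minimal sketch of how a downstream tool might consume one line of the
v1 JSON output described above. It is not part of this commit. The field
names follow JSONFormatter in the diff below; parse_report, render, and
the sample payload are hypothetical names and data made up for
illustration.

```python
import json

# Assumption: mirrors JSON_SCHEMA_VERSION = 1 from formatters.py in this commit.
SUPPORTED_VERSION = 1


def parse_report(line: str) -> dict:
    """Parse one NDJSON line from stderr and reject unknown schema versions."""
    payload = json.loads(line)
    if payload.get("version") != SUPPORTED_VERSION:
        raise ValueError(f"unsupported schema version: {payload.get('version')!r}")
    return payload


def render(payload: dict) -> list[str]:
    """Flatten errors and warnings into 'file:line:col: severity: message' strings."""
    out: list[str] = []
    for diag in payload["errors"] + payload["warnings"]:
        out.append(f"{diag['filename']}:{diag['line']}:{diag['column']}: {diag['severity']}: {diag['message']}")
        # Notes arrive as raw text; any gutter or prefix is the consumer's choice.
        out.extend(f"  note: {note}" for note in diag["notes"])
    if payload["summary"]["truncated"]:
        out.append(f"(stopped after {payload['summary']['max_errors']} errors)")
    return out


# Hypothetical sample line, built with the same compact separators the formatter uses.
sample_line = json.dumps(
    {
        "version": 1,
        "errors": [
            {
                "filename": "f.hrw4u",
                "line": 3,
                "column": 4,
                "severity": "error",
                "message": "unexpected '('",
                "source_line": "if foo ( {",
                "notes": ["hint: try X"],
            }
        ],
        "warnings": [],
        "summary": {"error_count": 1, "warning_count": 0, "truncated": False, "max_errors": 5},
        "sandbox_message": None,
    },
    separators=(",", ":"))

for text in render(parse_report(sample_line)):
    print(text)
```

Checking the version field before reading anything else lets a consumer
fail loudly if the schema changes, which is the reason the payload is
versioned.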
---
tools/hrw4u/Makefile | 18 +++-
tools/hrw4u/src/common.py | 100 ++++++++++++------
tools/hrw4u/src/errors.py | 45 +++------
tools/hrw4u/src/formatters.py | 212 +++++++++++++++++++++++++++++++++++++++
tools/hrw4u/tests/test_cli.py | 62 ++++++++++++
tools/hrw4u/tests/test_errors.py | 167 +++++++++++++++++++++++++++++-
tools/hrw4u/tests/test_units.py | 2 +-
7 files changed, 541 insertions(+), 65 deletions(-)
diff --git a/tools/hrw4u/Makefile b/tools/hrw4u/Makefile
index ba25de02d9..24714279b7 100644
--- a/tools/hrw4u/Makefile
+++ b/tools/hrw4u/Makefile
@@ -36,6 +36,7 @@ SCRIPT_KG=scripts/hrw4u-kg
SHARED_FILES=src/common.py \
src/debugging.py \
src/errors.py \
+ src/formatters.py \
src/states.py \
src/tables.py \
src/types.py \
@@ -106,10 +107,21 @@ INIT_HRW4U=$(PKG_DIR_HRW4U)/__init__.py
INIT_U4WRH=$(PKG_DIR_U4WRH)/__init__.py
INIT_LSP=$(PKG_DIR_LSP)/__init__.py
-.PHONY: all gen gen-fwd gen-inv copy-src test clean build package env setup-deps activate update coverage coverage-open
+.PHONY: all gen gen-fwd gen-inv copy-src check-antlr test clean build package env setup-deps activate update coverage coverage-open
all: gen
+# Fail fast with a helpful message if the ANTLR generator is not on PATH.
+# Install is intentionally left to the user / bootstrap.sh — installers vary
+# by OS (brew on macOS, dnf/apt on Linux, CI images pin their own).
+check-antlr:
+ @command -v $(ANTLR) >/dev/null 2>&1 || { \
+ echo "Error: '$(ANTLR)' not found on PATH."; \
+ echo "Install it first (e.g. 'brew install antlr' on macOS),"; \
+ echo "or run ./bootstrap.sh which also sets up Python dependencies."; \
+ exit 1; \
+ }
+
# Orchestrate generation then copy sources and drop __main__.py in each package
gen: gen-fwd gen-inv copy-src $(MAIN_HRW4U) $(MAIN_U4WRH) $(MAIN_LSP) $(INIT_HRW4U) $(INIT_U4WRH) $(INIT_LSP)
@@ -137,7 +149,7 @@ $(INIT_LSP): | $(PKG_DIR_LSP)
touch $@
# Generate forward parser/lexer into build/hrw4u and build/hrw4u-lsp
-gen-fwd: $(ANTLR_FILES_FWD)
+gen-fwd: check-antlr $(ANTLR_FILES_FWD)
$(ANTLR_FILES_FWD): $(GRAMMAR_FWD)
@mkdir -p $(PKG_DIR_HRW4U)
@@ -146,7 +158,7 @@ $(ANTLR_FILES_FWD): $(GRAMMAR_FWD)
# LSP no longer generates its own ANTLR files - it imports from hrw4u
# Generate inverse parser/lexer into build/u4wrh
-gen-inv: $(ANTLR_FILES_INV)
+gen-inv: check-antlr $(ANTLR_FILES_INV)
$(ANTLR_FILES_INV): $(GRAMMAR_INV)
@mkdir -p $(PKG_DIR_U4WRH)
diff --git a/tools/hrw4u/src/common.py b/tools/hrw4u/src/common.py
index 680a4f9442..15f1885f4b 100644
--- a/tools/hrw4u/src/common.py
+++ b/tools/hrw4u/src/common.py
@@ -27,6 +27,7 @@ from antlr4.error.ErrorStrategy import BailErrorStrategy, DefaultErrorStrategy
from antlr4 import InputStream, CommonTokenStream
from hrw4u.errors import Hrw4uSyntaxError, ThrowingErrorListener, ErrorCollector, CollectingErrorListener
+from hrw4u.formatters import FORMATTERS, ErrorFormatter
from hrw4u.types import MagicStrings
@@ -112,6 +113,43 @@ def fatal(message: str) -> NoReturn:
sys.exit(1)
+def _build_formatter(error_format: str) -> ErrorFormatter:
+ """Instantiate the configured error formatter, falling back to plain."""
+ return FORMATTERS.get(error_format, FORMATTERS["plain"])()
+
+
+def emit_fatal_message(error_format: str, message: str, filename: str = SystemDefaults.DEFAULT_FILENAME) -> NoReturn:
+ """Emit a non-syntax error (I/O, argument) via the chosen formatter and exit.
+
+ Plain mode preserves the legacy bare-string output. Structured formats wrap
+ the message as a synthetic diagnostic so downstream consumers always see the
+ same schema regardless of where the error originated.
+ """
+ if error_format == 'plain':
+ print(message, file=sys.stderr)
+ else:
+ err = Hrw4uSyntaxError(filename, 0, 0, message, "")
+ collector = ErrorCollector(formatter=_build_formatter(error_format))
+ collector.add_error(err)
+ print(collector.get_error_summary(), file=sys.stderr)
+ sys.exit(1)
+
+
+def emit_fatal_error(error_format: str, error: Hrw4uSyntaxError) -> NoReturn:
+ """Emit a single Hrw4uSyntaxError via the chosen formatter and exit.
+
+ Plain mode keeps the legacy ``str(error)`` output (no ``Found 1 error:``
+ prefix) so existing CLI consumers see byte-identical output.
+ """
+ if error_format == 'plain':
+ print(str(error), file=sys.stderr)
+ else:
+ collector = ErrorCollector(formatter=_build_formatter(error_format))
+ collector.add_error(error)
+ print(collector.get_error_summary(), file=sys.stderr)
+ sys.exit(1)
+
+
def create_base_parser(description: str) -> tuple[argparse.ArgumentParser, argparse._MutuallyExclusiveGroup]:
"""Create base argument parser with common options."""
parser = argparse.ArgumentParser(description=description, formatter_class=argparse.RawDescriptionHelpFormatter)
@@ -147,13 +185,14 @@ def create_parse_tree(
parser_class: type[ParserProtocol],
error_prefix: str,
collect_errors: bool = True,
- max_errors: int = 5) -> tuple[Any, ParserProtocol, ErrorCollector | None]:
+ max_errors: int = 5,
+ error_format: str = "plain") -> tuple[Any, ParserProtocol, ErrorCollector | None]:
"""Create ANTLR parse tree from input content with optional error collection."""
input_stream = InputStream(content)
error_collector = None
if collect_errors:
- error_collector = ErrorCollector(max_errors=max_errors)
+ error_collector = ErrorCollector(max_errors=max_errors, formatter=_build_formatter(error_format))
error_listener = CollectingErrorListener(filename=filename, error_collector=error_collector)
else:
error_listener = ThrowingErrorListener(filename=filename)
@@ -181,7 +220,7 @@ def create_parse_tree(
error_collector.add_error(e)
return None, parser_obj, error_collector
else:
- fatal(str(e))
+ emit_fatal_error(error_format, e)
except Exception as e:
if collect_errors:
if error_collector:
@@ -189,7 +228,7 @@ def create_parse_tree(
error_collector.add_error(syntax_error)
return None, parser_obj, error_collector
else:
- fatal(f"{filename}:0:0 - {error_prefix} error: {e}")
+ emit_fatal_message(error_format, f"{error_prefix} error: {e}", filename=filename)
def generate_output(
@@ -233,7 +272,9 @@ def generate_output(
syntax_error.add_note(note)
error_collector.add_error(syntax_error)
else:
- fatal(str(e))
+ visitor_err = e if isinstance(e, Hrw4uSyntaxError) else Hrw4uSyntaxError(
+ filename, 0, 0, f"Visitor error: {e}", "")
+ emit_fatal_error(getattr(args, 'error_format', 'plain'), visitor_err)
if error_collector and (error_collector.has_errors() or error_collector.has_warnings()):
print(error_collector.get_error_summary(), file=sys.stderr)
@@ -289,6 +330,16 @@ def run_main(
default=5,
dest="max_errors",
help="Maximum number of errors to report before stopping (default: 5; ignored with --stop-on-error)")
+ parser.add_argument(
+ "--error-format",
+ choices=sorted(FORMATTERS.keys()),
+ default="plain",
+ dest="error_format",
+ help=(
+ "Format used for error and warning output on stderr (default: plain). "
+ "'json' emits one compact JSON object per input (NDJSON-friendly in bulk mode); "
+ "'markdown' emits a rendered report suitable for PR comments and chat. "
+ "Columns are always 0-based."))
if add_args is not None:
add_args(parser, output_group)
@@ -309,20 +360,18 @@ def run_main(
try:
content = pre_process(content, filename, args)
except Hrw4uSyntaxError as e:
- print(str(e), file=sys.stderr)
- sys.exit(1)
+ emit_fatal_error(args.error_format, e)
tree, parser_obj, error_collector = create_parse_tree(
- content, filename, lexer_class, parser_class, error_prefix, not args.stop_on_error, args.max_errors)
+ content, filename, lexer_class, parser_class, error_prefix, not args.stop_on_error, args.max_errors, args.error_format)
generate_output(tree, parser_obj, visitor_class, filename, args, error_collector, extra_kwargs)
return
if any(':' in f for f in args.files):
for pair in args.files:
if ':' not in pair:
- print(
- f"Error: Mixed formats not allowed. All files must use 'input:output' format for bulk compilation.",
- file=sys.stderr)
- sys.exit(1)
+ emit_fatal_message(
+ args.error_format,
+ "Error: Mixed formats not allowed. All files must use 'input:output' format for bulk compilation.")
input_path, output_path = pair.split(':', 1)
@@ -331,20 +380,18 @@ def run_main(
content = input_file.read()
filename = input_path
except FileNotFoundError:
- print(f"Error: Input file '{input_path}' not found", file=sys.stderr)
- sys.exit(1)
+ emit_fatal_message(args.error_format, f"Error: Input file '{input_path}' not found", filename=input_path)
except Exception as e:
- print(f"Error reading '{input_path}': {e}", file=sys.stderr)
- sys.exit(1)
+ emit_fatal_message(args.error_format, f"Error reading '{input_path}': {e}", filename=input_path)
if pre_process is not None:
try:
content = pre_process(content, filename, args)
except Hrw4uSyntaxError as e:
- print(str(e), file=sys.stderr)
- sys.exit(1)
+ emit_fatal_error(args.error_format, e)
tree, parser_obj, error_collector = create_parse_tree(
- content, filename, lexer_class, parser_class, error_prefix, not args.stop_on_error, args.max_errors)
+ content, filename, lexer_class, parser_class, error_prefix, not args.stop_on_error, args.max_errors,
+ args.error_format)
try:
with open(output_path, 'w', encoding='utf-8') as output_file:
@@ -355,8 +402,7 @@ def run_main(
finally:
sys.stdout = original_stdout
except Exception as e:
- print(f"Error writing to '{output_path}': {e}", file=sys.stderr)
- sys.exit(1)
+ emit_fatal_message(args.error_format, f"Error writing to '{output_path}': {e}", filename=output_path)
else:
for i, input_path in enumerate(args.files):
if i > 0:
@@ -367,19 +413,17 @@ def run_main(
content = input_file.read()
filename = input_path
except FileNotFoundError:
- print(f"Error: Input file '{input_path}' not found", file=sys.stderr)
- sys.exit(1)
+ emit_fatal_message(args.error_format, f"Error: Input file '{input_path}' not found", filename=input_path)
except Exception as e:
- print(f"Error reading '{input_path}': {e}", file=sys.stderr)
- sys.exit(1)
+ emit_fatal_message(args.error_format, f"Error reading '{input_path}': {e}", filename=input_path)
if pre_process is not None:
try:
content = pre_process(content, filename, args)
except Hrw4uSyntaxError as e:
- print(str(e), file=sys.stderr)
- sys.exit(1)
+ emit_fatal_error(args.error_format, e)
tree, parser_obj, error_collector = create_parse_tree(
- content, filename, lexer_class, parser_class, error_prefix, not args.stop_on_error, args.max_errors)
+ content, filename, lexer_class, parser_class, error_prefix, not args.stop_on_error, args.max_errors,
+ args.error_format)
generate_output(tree, parser_obj, visitor_class, filename, args, error_collector, extra_kwargs)
diff --git a/tools/hrw4u/src/errors.py b/tools/hrw4u/src/errors.py
index d4001b0b92..7980f70696 100644
--- a/tools/hrw4u/src/errors.py
+++ b/tools/hrw4u/src/errors.py
@@ -19,10 +19,13 @@ from __future__ import annotations
import re
from dataclasses import dataclass
-from typing import Final
+from typing import Final, TYPE_CHECKING
from antlr4.error.ErrorListener import ErrorListener
+if TYPE_CHECKING:
+ from hrw4u.formatters import ErrorFormatter
+
_TOKEN_NAMES: Final[dict[str, str]] = {
'QUALIFIED_IDENT': "qualified name (e.g. 'Namespace::Name')",
'IDENT': 'identifier',
@@ -111,6 +114,7 @@ class Hrw4uSyntaxError(Exception):
self.filename = filename
self.line = line
self.column = column
+ self.message = message
self.source_line = source_line
@@ -122,7 +126,7 @@ class SymbolResolutionError(Exception):
def add_symbol_suggestion(self, suggestions: list[str]) -> None:
if suggestions:
- self.add_note(f" | Did you mean: {suggestions[0]}?")
+ self.add_note(f"Did you mean: {suggestions[0]}?")
def hrw4u_error(filename: str, ctx: object, exc: Exception) -> Hrw4uSyntaxError:
@@ -166,11 +170,12 @@ class Warning:
class ErrorCollector:
"""Collects multiple syntax errors and warnings for comprehensive reporting."""
- def __init__(self, max_errors: int = 5) -> None:
+ def __init__(self, max_errors: int = 5, formatter: "ErrorFormatter | None" = None) -> None:
self.errors: list[Hrw4uSyntaxError] = []
self.max_errors = max_errors
self.warnings: list[Warning] = []
self._sandbox_message: str | None = None
+ self._formatter = formatter
def add_error(self, error: Hrw4uSyntaxError) -> None:
self.errors.append(error)
@@ -194,35 +199,11 @@ class ErrorCollector:
return bool(self.warnings)
def get_error_summary(self) -> str:
- if not self.errors and not self.warnings:
- return "No errors found."
-
- lines: list[str] = []
-
- if self.errors:
- count = len(self.errors)
- lines.append(f"Found {count} error{'s' if count > 1 else ''}:")
-
- for error in self.errors:
- lines.append(str(error))
- if hasattr(error, '__notes__') and error.__notes__:
- lines.extend(error.__notes__)
-
- if self.warnings:
- if self.errors:
- lines.append("")
- count = len(self.warnings)
- lines.append(f"{count} warning{'s' if count > 1 else ''}:")
- lines.extend(w.format() for w in self.warnings)
-
- if self.at_limit:
- lines.append(f"(stopped after {self.max_errors} errors)")
-
- if self._sandbox_message:
- lines.append("")
- lines.append(self._sandbox_message)
-
- return "\n".join(lines)
+ formatter = self._formatter
+ if formatter is None:
+ from hrw4u.formatters import PlainTextFormatter
+ formatter = PlainTextFormatter()
+ return formatter.format_errors(self.errors, self.warnings, self._sandbox_message, self.at_limit, self.max_errors)
class CollectingErrorListener(ErrorListener):
diff --git a/tools/hrw4u/src/formatters.py b/tools/hrw4u/src/formatters.py
new file mode 100644
index 0000000000..d666c16f31
--- /dev/null
+++ b/tools/hrw4u/src/formatters.py
@@ -0,0 +1,212 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Error/warning output formatters for hrw4u and u4wrh.
+
+Columns are 0-based across every format (matching the internal representation
+used by the ANTLR-driven listeners). The JSON schema is versioned so downstream
+consumers (UIs, CI tools) can guard against future changes.
+"""
+
+from __future__ import annotations
+
+import json
+from abc import ABC, abstractmethod
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+ from hrw4u.errors import Hrw4uSyntaxError, Warning
+
+JSON_SCHEMA_VERSION = 1
+
+
+class ErrorFormatter(ABC):
+ """Renders a collected batch of errors and warnings into a single string."""
+
+ @abstractmethod
+ def format_errors(
+ self, errors: list["Hrw4uSyntaxError"], warnings: list["Warning"], sandbox_message: str | None, at_limit: bool,
+ max_errors: int) -> str:
+ ...
+
+
+class PlainTextFormatter(ErrorFormatter):
+ """Current CLI output: human-readable diagnostics with caret pointers."""
+
+ def format_errors(
+ self, errors: list["Hrw4uSyntaxError"], warnings: list["Warning"], sandbox_message: str | None, at_limit: bool,
+ max_errors: int) -> str:
+ if not errors and not warnings:
+ return "No errors found."
+
+ lines: list[str] = []
+
+ if errors:
+ count = len(errors)
+ if count > 1:
+ lines.append(f"Found {count} errors:")
+
+ for error in errors:
+ lines.append(str(error))
+ notes = getattr(error, '__notes__', None)
+ if notes:
+ lines.extend(f" | {note}" for note in notes)
+
+ if warnings:
+ if errors:
+ lines.append("")
+ count = len(warnings)
+ lines.append(f"{count} warning{'s' if count > 1 else ''}:")
+ lines.extend(w.format() for w in warnings)
+
+ if at_limit:
+ lines.append(f"(stopped after {max_errors} errors)")
+
+ if sandbox_message:
+ lines.append("")
+ lines.append(sandbox_message)
+
+ return "\n".join(lines)
+
+
+class JSONFormatter(ErrorFormatter):
+ """Machine-readable output. Emits a single compact JSON object per call.
+
+ Suitable for NDJSON pipelines: in bulk mode each input file produces exactly
+ one object on one line of stderr.
+ """
+
+ def format_errors(
+ self, errors: list["Hrw4uSyntaxError"], warnings: list["Warning"], sandbox_message: str | None, at_limit: bool,
+ max_errors: int) -> str:
+ payload = {
+ "version": JSON_SCHEMA_VERSION,
+ "errors": [_diag_to_dict(e, "error") for e in errors],
+ "warnings": [_diag_to_dict(w, "warning") for w in warnings],
+ "summary": {
+ "error_count": len(errors),
+ "warning_count": len(warnings),
+ "truncated": at_limit,
+ "max_errors": max_errors
+ },
+ "sandbox_message": sandbox_message
+ }
+ return json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
+
+
+class MarkdownFormatter(ErrorFormatter):
+ """Markdown report suitable for PR comments, chat, and docs."""
+
+ def format_errors(
+ self, errors: list["Hrw4uSyntaxError"], warnings: list["Warning"], sandbox_message: str | None, at_limit: bool,
+ max_errors: int) -> str:
+ if not errors and not warnings:
+ return "_No errors found._"
+
+ parts: list[str] = []
+ parts.append(_markdown_heading(len(errors), len(warnings)))
+
+ for error in errors:
+ parts.append(
+ _markdown_diagnostic(
+ severity="Error",
+ filename=error.filename,
+ line=error.line,
+ column=error.column,
+ message=_extract_plain_message(error),
+ source_line=error.source_line,
+ notes=list(getattr(error, '__notes__', None) or [])))
+
+ for warning in warnings:
+ parts.append(
+ _markdown_diagnostic(
+ severity="Warning",
+ filename=warning.filename,
+ line=warning.line,
+ column=warning.column,
+ message=warning.message,
+ source_line=warning.source_line,
+ notes=[]))
+
+ if at_limit:
+ parts.append(f"> _Stopped after {max_errors} errors._")
+
+ if sandbox_message:
+ parts.append(f"> **Sandbox:** {sandbox_message}")
+
+ return "\n\n".join(parts)
+
+
+def _diag_to_dict(diag: "Hrw4uSyntaxError | Warning", severity: str) -> dict:
+ notes = list(getattr(diag, '__notes__', None) or [])
+ message = _extract_plain_message(diag)
+ return {
+ "filename": diag.filename,
+ "line": diag.line,
+ "column": diag.column,
+ "severity": severity,
+ "message": message,
+ "source_line": diag.source_line,
+ "notes": notes
+ }
+
+
+def _extract_plain_message(diag: "Hrw4uSyntaxError | Warning") -> str:
+ """Return just the message text, without the file:line:col: prefix or caret art.
+
+ ``Hrw4uSyntaxError`` pre-formats a full diagnostic into ``args[0]``; Warnings
+ carry the raw message on ``.message``.
+ """
+ message = getattr(diag, 'message', None)
+ if message is not None:
+ return message
+ raw = str(diag.args[0]) if diag.args else ""
+ header = raw.split("\n", 1)[0]
+ prefix = f"{diag.filename}:{diag.line}:{diag.column}: error: "
+ if header.startswith(prefix):
+ return header[len(prefix):]
+ return header
+
+
+def _markdown_heading(error_count: int, warning_count: int) -> str:
+ bits: list[str] = []
+ if error_count:
+ bits.append(f"{error_count} error{'s' if error_count != 1 else ''}")
+ if warning_count:
+ bits.append(f"{warning_count} warning{'s' if warning_count != 1 else ''}")
+ return f"## hrw4u: {', '.join(bits)}" if bits else "## hrw4u"
+
+
+def _markdown_diagnostic(
+ *, severity: str, filename: str, line: int, column: int, message: str, source_line: str, notes: list[str]) -> str:
+ lines = [f"### {severity} — `{filename}:{line}:{column}`", message]
+
+ if source_line:
+ pointer = f"{' ' * column}^"
+ code_block = f"```\n{line:4d} | {source_line}\n | {pointer}\n```"
+ lines.append(code_block)
+
+ for note in notes:
+ lines.append(f"> {note.strip()}")
+
+ return "\n\n".join(lines)
+
+
+FORMATTERS: dict[str, type[ErrorFormatter]] = {
+ "plain": PlainTextFormatter,
+ "json": JSONFormatter,
+ "markdown": MarkdownFormatter,
+}
diff --git a/tools/hrw4u/tests/test_cli.py b/tools/hrw4u/tests/test_cli.py
index 30afb30a74..886728de05 100644
--- a/tools/hrw4u/tests/test_cli.py
+++ b/tools/hrw4u/tests/test_cli.py
@@ -16,6 +16,7 @@
# limitations under the License.
from __future__ import annotations
+import json
import subprocess
import sys
import tempfile
@@ -182,3 +183,64 @@ def test_u4wrh_bulk_mode(tmp_path: Path) -> None:
assert out2.exists()
assert "X-Test" in out1.read_text()
assert "404" in out2.read_text()
+
+
+def test_cli_error_format_json_on_parse_error(tmp_path: Path) -> None:
+ """With --error-format json, stderr must be a single JSON object matching the schema."""
+ bad = tmp_path / "bad.hrw4u"
+ bad.write_text("REMAP { this is not valid syntax ( {\n")
+
+ result = run_hrw4u(["--error-format", "json", str(bad)])
+
+ assert result.returncode != 0 or result.stderr
+ payload = json.loads(result.stderr.strip().splitlines()[-1])
+ assert payload["version"] == 1
+ assert payload["summary"]["error_count"] >= 1
+ err = payload["errors"][0]
+ for field in ("filename", "line", "column", "severity", "message", "source_line", "notes"):
+ assert field in err
+
+
+def test_cli_error_format_json_on_missing_file() -> None:
+ """File-I/O errors must also be wrapped in the JSON envelope."""
+ result = run_hrw4u(["--error-format", "json", "nonexistent_file.hrw4u"])
+
+ assert result.returncode != 0
+ payload = json.loads(result.stderr.strip().splitlines()[-1])
+ assert payload["version"] == 1
+ assert payload["summary"]["error_count"] == 1
+ assert "not found" in payload["errors"][0]["message"]
+
+
+def test_cli_error_format_markdown_on_parse_error(tmp_path: Path) -> None:
+ """Markdown format must include heading, fenced code block, and location."""
+ bad = tmp_path / "bad.hrw4u"
+ bad.write_text("REMAP { this is not valid syntax ( {\n")
+
+ result = run_hrw4u(["--error-format", "markdown", str(bad)])
+
+ assert "## hrw4u:" in result.stderr
+ assert "### Error" in result.stderr
+ assert "```" in result.stderr
+
+
+def test_cli_default_error_format_is_plain(tmp_path: Path) -> None:
+ """Omitting --error-format must leave the legacy plain-text output unchanged."""
+ bad = tmp_path / "bad.hrw4u"
+ bad.write_text("REMAP { this is not valid syntax ( {\n")
+
+ result = run_hrw4u([str(bad)])
+
+ assert ": error:" in result.stderr
+ assert "## hrw4u" not in result.stderr
+ assert not result.stderr.strip().startswith("{")
+
+
+def test_cli_help_lists_error_format_flag() -> None:
+ """--help must advertise the new flag."""
+ result = run_hrw4u(["--help"])
+
+ assert result.returncode == 0
+ assert "--error-format" in result.stdout
+ for choice in ("plain", "json", "markdown"):
+ assert choice in result.stdout
diff --git a/tools/hrw4u/tests/test_errors.py b/tools/hrw4u/tests/test_errors.py
index c11e2cba97..d3c096078c 100644
--- a/tools/hrw4u/tests/test_errors.py
+++ b/tools/hrw4u/tests/test_errors.py
@@ -16,8 +16,10 @@
# limitations under the License.
from hrw4u.errors import ErrorCollector, Hrw4uSyntaxError, SymbolResolutionError, \
- ThrowingErrorListener, hrw4u_error, CollectingErrorListener
+ ThrowingErrorListener, hrw4u_error, CollectingErrorListener, Warning
+from hrw4u.formatters import FORMATTERS, JSON_SCHEMA_VERSION, JSONFormatter, MarkdownFormatter, PlainTextFormatter
from hrw4u.validation import Validator, ValidatorChain
+import json
import pytest
@@ -323,5 +325,168 @@ class TestValidatorChainUnits:
assert Validator.quote_if_needed("has space") == '"has space"'
+class TestPlainTextFormatterParity:
+ """The plain formatter must preserve current CLI output byte-for-byte."""
+
+ def test_registry_has_expected_formats(self):
+ assert set(FORMATTERS.keys()) == {"plain", "json", "markdown"}
+
+ def test_empty_returns_no_errors_found(self):
+ ec = ErrorCollector(formatter=PlainTextFormatter())
+ assert ec.get_error_summary() == "No errors found."
+
+ def test_single_error_omits_found_preamble(self):
+ """A single error should not be prefixed with 'Found 1 error:'."""
+ ec = ErrorCollector(formatter=PlainTextFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 1, 4, "oops", "foo bar")
+ ec.add_error(err)
+ out = ec.get_error_summary()
+ assert not out.startswith("Found")
+ assert out.startswith("f.hrw4u:1:4: error: oops")
+ assert " 1 | foo bar" in out
+
+ def test_multiple_errors_include_found_preamble(self):
+ """Two or more errors keep the 'Found N errors:' summary line."""
+ ec = ErrorCollector(formatter=PlainTextFormatter())
+ ec.add_error(Hrw4uSyntaxError("f.hrw4u", 1, 0, "a", ""))
+ ec.add_error(Hrw4uSyntaxError("f.hrw4u", 2, 0, "b", ""))
+ assert ec.get_error_summary().startswith("Found 2 errors:\n")
+
+ def test_at_limit_marker(self):
+ ec = ErrorCollector(max_errors=2, formatter=PlainTextFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 1, 0, "x", "")
+ ec.add_error(err)
+ ec.add_error(err)
+ assert "(stopped after 2 errors)" in ec.get_error_summary()
+
+ def test_sandbox_message_appended(self):
+ ec = ErrorCollector(formatter=PlainTextFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 1, 0, "x", "")
+ ec.add_error(err)
+ ec.set_sandbox_message("sandbox blocked thing")
+ assert ec.get_error_summary().endswith("sandbox blocked thing")
+
+ def test_default_formatter_is_plain(self):
+ """ErrorCollector() with no formatter must produce the legacy output."""
+ legacy = ErrorCollector()
+ custom = ErrorCollector(formatter=PlainTextFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 2, 3, "m", "src")
+ err.add_note("hint")
+ legacy.add_error(err)
+ custom.add_error(err)
+ assert legacy.get_error_summary() == custom.get_error_summary()
+
+
+class TestJSONFormatter:
+ """JSON output is the stable contract for downstream UIs (edgeconf, etc.)."""
+
+ def _collect(self) -> ErrorCollector:
+ ec = ErrorCollector(formatter=JSONFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 3, 4, "unexpected '('", "if foo ( {")
+ err.add_note("hint: try X")
+ ec.add_error(err)
+ ec.add_warning(Warning(filename="f.hrw4u", line=7, column=0, message="deprecated", source_line="old;"))
+ return ec
+
+ def test_output_is_valid_json(self):
+ payload = json.loads(self._collect().get_error_summary())
+ assert payload["version"] == JSON_SCHEMA_VERSION
+
+ def test_error_fields_are_preserved(self):
+ payload = json.loads(self._collect().get_error_summary())
+ err = payload["errors"][0]
+ assert err["filename"] == "f.hrw4u"
+ assert err["line"] == 3
+ assert err["column"] == 4
+ assert err["severity"] == "error"
+ assert err["message"] == "unexpected '('"
+ assert err["source_line"] == "if foo ( {"
+ assert err["notes"] == ["hint: try X"]
+
+ def test_warning_severity_and_message(self):
+ payload = json.loads(self._collect().get_error_summary())
+ w = payload["warnings"][0]
+ assert w["severity"] == "warning"
+ assert w["message"] == "deprecated"
+ assert w["notes"] == []
+
+ def test_summary_counts_and_truncation(self):
+ payload = json.loads(self._collect().get_error_summary())
+ assert payload["summary"]["error_count"] == 1
+ assert payload["summary"]["warning_count"] == 1
+ assert payload["summary"]["truncated"] is False
+ assert payload["summary"]["max_errors"] == 5
+
+ def test_truncated_flag_flips_at_limit(self):
+ ec = ErrorCollector(max_errors=2, formatter=JSONFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 1, 0, "x", "")
+ ec.add_error(err)
+ ec.add_error(err)
+ payload = json.loads(ec.get_error_summary())
+ assert payload["summary"]["truncated"] is True
+ assert payload["summary"]["max_errors"] == 2
+
+ def test_sandbox_message_is_top_level(self):
+ ec = self._collect()
+ ec.set_sandbox_message("sandbox blocked x")
+ payload = json.loads(ec.get_error_summary())
+ assert payload["sandbox_message"] == "sandbox blocked x"
+
+ def test_empty_collector_still_emits_valid_schema(self):
+ ec = ErrorCollector(formatter=JSONFormatter())
+ payload = json.loads(ec.get_error_summary())
+ assert payload["errors"] == []
+ assert payload["warnings"] == []
+ assert payload["sandbox_message"] is None
+
+ def test_single_line_output_for_ndjson(self):
+ out = self._collect().get_error_summary()
+ assert "\n" not in out, "JSON output must be single-line for NDJSON streaming"
+
+
+class TestMarkdownFormatter:
+ """Markdown output is pure markdown — no ANSI, no colors."""
+
+ def _collect(self) -> ErrorCollector:
+ ec = ErrorCollector(formatter=MarkdownFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 3, 4, "unexpected '('", "if foo ( {")
+ err.add_note("hint: try X")
+ ec.add_error(err)
+ return ec
+
+ def test_has_top_level_heading(self):
+ assert self._collect().get_error_summary().startswith("## hrw4u:")
+
+ def test_error_heading_includes_location(self):
+ assert "### Error — `f.hrw4u:3:4`" in self._collect().get_error_summary()
+
+ def test_contains_fenced_code_block_with_caret(self):
+ md = self._collect().get_error_summary()
+ assert "```" in md
+ assert " 3 | if foo ( {" in md
+ assert "^" in md
+
+ def test_notes_render_as_blockquotes(self):
+ assert "> hint: try X" in self._collect().get_error_summary()
+
+ def test_empty_collector_friendly_message(self):
+ ec = ErrorCollector(formatter=MarkdownFormatter())
+ assert ec.get_error_summary() == "_No errors found._"
+
+ def test_no_source_line_skips_code_block(self):
+ ec = ErrorCollector(formatter=MarkdownFormatter())
+ ec.add_error(Hrw4uSyntaxError("f.hrw4u", 0, 0, "file not found", ""))
+ md = ec.get_error_summary()
+ assert "```" not in md
+ assert "file not found" in md
+
+ def test_at_limit_marker(self):
+ ec = ErrorCollector(max_errors=2, formatter=MarkdownFormatter())
+ err = Hrw4uSyntaxError("f.hrw4u", 1, 0, "x", "src")
+ ec.add_error(err)
+ ec.add_error(err)
+ assert "Stopped after 2 errors" in ec.get_error_summary()
+
+
if __name__ == "__main__":
pytest.main([__file__, "-v"])
diff --git a/tools/hrw4u/tests/test_units.py b/tools/hrw4u/tests/test_units.py
index 162681ce49..1058ff1ab5 100644
--- a/tools/hrw4u/tests/test_units.py
+++ b/tools/hrw4u/tests/test_units.py
@@ -95,7 +95,7 @@ class TestErrorCollectorUnits:
error_summary = self.error_collector.get_error_summary()
assert "Test error" in error_summary
- assert "Found 1 error:" in error_summary
+ assert "Found" not in error_summary
def test_error_collector_multiple_errors(self):
error1 = Hrw4uSyntaxError("test1.hrw4u", 1, 0, "Error 1", "line 1")