andreahlert opened a new pull request, #65:
URL: https://github.com/apache/airflow-steward/pull/65

   ## Summary
   
   `skill-validator`'s `slugify()` was using `re.sub(r"[\s]+", "-", text)`, 
which collapses runs of whitespace into a single dash. GitHub's anchor renderer 
(and doctoc, which generates our TOCs) replaces each whitespace character 
one-for-one, so a heading whose text contains an em-dash, like `## Mode B — 
conversational mentoring`, has `mode-b--conversational-mentoring` as its real 
anchor (em-dash strips to `""`, leaving two adjacent spaces, each becoming a 
dash).
   
   The validator instead produced `mode-b-conversational-mentoring` (single 
dash) and reported `anchor 'X' not found` against doctoc-generated TOC anchors, 
because it had slugified the same heading differently from doctoc. Running 
`skill-validate` locally on `main`, this one-character fix cuts the count from 
191 violations to 153; every dropped violation was a false positive of this 
exact shape.
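
The difference can be reproduced with a minimal sketch. This is not the validator's actual `slugify()`: the punctuation-stripping step (`STRIP_PATTERN`) and the two-argument signature are assumptions made for illustration; only the two whitespace patterns come from the description above.

```python
import re

# Hypothetical stand-in for the punctuation-stripping step: drop every
# character that is not a word character, whitespace, or a hyphen.
STRIP_PATTERN = re.compile(r"[^\w\s-]")

OLD_SPACE_PATTERN = re.compile(r"[\s]+")  # collapses a run of whitespace
NEW_SPACE_PATTERN = re.compile(r"[\s]")   # one dash per whitespace character


def slugify(text: str, space_pattern: re.Pattern) -> str:
    """Approximate a GitHub-style heading anchor (illustrative only)."""
    text = text.strip().lower()
    text = STRIP_PATTERN.sub("", text)  # the em-dash disappears here
    return space_pattern.sub("-", text)


heading = "Mode B — conversational mentoring"
print(slugify(heading, OLD_SPACE_PATTERN))  # mode-b-conversational-mentoring
print(slugify(heading, NEW_SPACE_PATTERN))  # mode-b--conversational-mentoring
```

Stripping the em-dash leaves two adjacent spaces, so only the pattern without `+` yields the double dash that appears in the rendered anchor.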
   
   ## Change
   
   - `tools/skill-validator/src/skill_validator/__init__.py`: drop the `+` 
quantifier from `ANCHOR_SPACE_PATTERN` so each whitespace becomes its own dash.
   - `tools/skill-validator/tests/test_validator.py`: update the existing 
`test_multiple_spaces` expectation to match the actual GitHub algorithm (it had 
been pinning the bug), and add `test_em_dash_in_heading` for the canonical case.
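
For orientation, the two tests might look roughly like the sketch below. The test names and `ANCHOR_SPACE_PATTERN` come from this PR; the `slugify()` body, the `Hello  World` input, and the exact assertions are assumptions, and the real tests in `test_validator.py` may differ.

```python
import re

# Assumed post-fix pattern: no "+" quantifier, so each whitespace
# character maps to exactly one dash.
ANCHOR_SPACE_PATTERN = re.compile(r"[\s]")


def slugify(text: str) -> str:
    # Hypothetical stand-in for skill_validator's slugify(); the real
    # function may normalise headings differently.
    cleaned = re.sub(r"[^\w\s-]", "", text.strip().lower())
    return ANCHOR_SPACE_PATTERN.sub("-", cleaned)


def test_multiple_spaces():
    # Two spaces now yield two dashes instead of collapsing to one.
    assert slugify("Hello  World") == "hello--world"


def test_em_dash_in_heading():
    # The em-dash is stripped, leaving two adjacent spaces.
    assert slugify("Mode B — conversational mentoring") == (
        "mode-b--conversational-mentoring"
    )
```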
   
   ## Why now
   
   This unblocks wiring `skill-validator` into prek + CI in a follow-up. With 
the false-positive noise removed, the remaining violations are real (broken 
links, missing frontmatter keys, anchor renames the skills did not follow) and 
worth gating on.
   
   ## What this does not fix
   
   - The 13 skills missing `license: Apache-2.0` frontmatter (see #pending 
sibling PR).
   - The 13 broken links to the deleted `code-review.instructions.md`.
   - The 4 unsubstituted `<...>` placeholders in 
`pr-management-triage/comment-templates.md` (need a decision on adopter-config 
convention vs. literal substitution).
   - Wiring the validator itself into prek + CI.
   
   Each is a separate, scoped follow-up.

