juliuszsompolski opened a new pull request, #55581: URL: https://github.com/apache/spark/pull/55581
### What changes were proposed in this pull request?

Two changes that let the unidoc diagnostic banner added in [SPARK-56630](https://github.com/apache/spark/pull/55548) actually pinpoint the source of a docs-job failure. Neither silences a real failure; they only remove the existing masking layers that hide which scaladoc / Java doc-comment is the cause.

1. **`project/SparkBuild.scala`**: bump `-Xmaxerrs` and `-Xmaxwarns` from javadoc's defaults of 100 to 999999, and add `-verbose`.
   - `-Xmaxerrs 100` made javadoc bail during source loading whenever the cumulative count of benign genjavadoc-stub errors crossed 100, before any HTML was generated. The diagnostic banner from SPARK-56630 cannot identify a culprit class in that mode because there is no `Generating .../<Class>.html` line yet.
   - `-Xmaxwarns 100` capped the printed warning count even when javadoc completed; on a real Spark unidoc run the actual warning count from doclint (`no comment`, `empty <p> tag`, `no @return`, `no @param ...`) is in the tens of thousands, and per-link `error: reference not found` messages were being clipped along with them.
   - `-verbose` makes javadoc emit a `<path>.java:<line>: error: reference not found` line for every broken `{@link}` it encounters during HTML generation. Without it, javadoc tracks reference errors in its internal counter and reports the bulk total in the final `<N> errors / <M> warnings` summary, but does not print a file:line for each one. The flag also dumps `Loading source file ...` progress lines, so the unidoc log grows by an order of magnitude; that is the price of being able to debug reference errors at all from CI logs.
2. **`docs/_plugins/build_api_docs.rb`**: extend the diagnostic banner. The original banner names the class javadoc was rendering when it crashed mid-HTML-generation.
With `-verbose` now enabled, the build can also fail with a non-zero `<N> errors` summary after javadoc completed all HTML output, when broken `{@link}` references push the doclint reference-error count above zero. The extension scans the captured log for `error: reference not found` lines and lists their `<path>.java:<line>` in the banner, plus a short note about the most common root cause (`[[Class.member]]` in scaladoc on a regular class/trait; fix by using `[[Class#member]]`).

### Why are the changes needed?

Today, when the unidoc step fails, the visible signal is hundreds of `[error]` lines on `target/java/...` files (`error: illegal combination of modifiers: abstract and static`, `error: cannot find symbol` on type variables, etc.). Those errors are inert: they come from the way genjavadoc lifts type-parameterised methods into static Java emission, every Spark unidoc run produces them, and javadoc normally finishes anyway. The commit message for https://github.com/apache/spark/pull/55548 documents this.

The actual cause of the failure is something else: either javadoc bailed because `-Xmaxerrs 100` was hit during source loading, or javadoc completed but reported a non-zero error count from broken `{@link}` references. The diagnostic banner from https://github.com/apache/spark/pull/55548 only handles the mid-HTML-generation crash mode, and only when javadoc reaches HTML generation in the first place. Both the source-loading early bailout and the post-HTML reference-error tally were invisible to it.

As a concrete example: PR #55371's docs job was failing deterministically with the SPARK-56630 banner reporting "Javadoc exited but no class HTML generation was in progress", which was useless because the failure mode was the early bailout. With `-Xmaxerrs` / `-Xmaxwarns` raised, the same run reached HTML generation and reported `6 errors / 22188 warnings`, but the 6 errors had no per-error log line and the banner had nothing to say.
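To illustrate the log scan described for `docs/_plugins/build_api_docs.rb`, here is a minimal Ruby sketch. It is not the code from this PR: the names `REFERENCE_ERROR`, `reference_error_locations`, and `reference_error_banner`, the handling of an optional `[error]`-style prefix, and the sample log lines are all assumptions made for illustration.

```ruby
# Hypothetical sketch, NOT the implementation from the pull request.
# Matches "<path>.java:<line>: error: reference not found", optionally
# preceded by a bracketed tag such as "[error]" (the prefix is an assumption).
REFERENCE_ERROR = /^(?:\[\w+\]\s+)?(\S+\.java):(\d+): error: reference not found/

# Returns the unique "<path>.java:<line>" locations in first-seen order.
def reference_error_locations(log_text)
  log_text.each_line.map do |line|
    match = REFERENCE_ERROR.match(line)
    match && "#{match[1]}:#{match[2]}"
  end.compact.uniq
end

# Builds a banner listing the locations, or nil when there are none.
def reference_error_banner(log_text)
  locations = reference_error_locations(log_text)
  return nil if locations.empty?

  rule = "=" * 78
  lines = [rule, "Unidoc failed -- diagnostic summary", rule,
           "Javadoc reference-resolution errors:"]
  lines.concat(locations)
  lines << rule
  lines.join("\n")
end

# Invented sample log, for illustration only.
sample_log = <<~LOG
  [info] Loading source file /tmp/example/Foo.java...
  [error] /tmp/example/Foo.java:9: error: reference not found
  [warn] /tmp/example/Foo.java:12: warning: no @return
LOG

banner = reference_error_banner(sample_log)
puts banner if banner
```

Returning `nil` when no reference errors are found lets a caller fall back to whatever other diagnostics it has, rather than printing an empty banner.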
With `-verbose` and the extended banner, the same run prints

```
==============================================================================
Unidoc failed -- diagnostic summary
==============================================================================
Javadoc reference-resolution errors (each one is a broken {@link} in a doc comment that genjavadoc copied verbatim from the corresponding scaladoc; fix the [[link]] in the Scala source):
/__w/.../core/target/java/org/apache/spark/util/LastAttemptAccumulator.java:9
/__w/.../core/target/java/org/apache/spark/util/LastAttemptAccumulator.java:15
/__w/.../core/target/java/org/apache/spark/util/LastAttemptAccumulator.java:22
/__w/.../core/target/java/org/apache/spark/util/LastAttemptAccumulator.java:23
/__w/.../core/target/java/org/apache/spark/util/LastAttemptAccumulator.java:42
/__w/.../core/target/java/org/apache/spark/util/LastAttemptAccumulator.java:51
Common cause: [[Class.member]] in scaladoc when Class is a regular `class`/`trait` (not a Scala `object`) and there is no companion-object member with that name. genjavadoc emits {@link Class.member}, javadoc reads `.` as the inner-class separator and fails to resolve. Use [[Class#member]] instead.
==============================================================================
```

which points the developer directly at the `{@link}` they need to fix.

These changes do not silence any real failure. javadoc still exits non-zero when there are real errors. They only remove the noise-driven and clipping-driven masks so the SPARK-56630 banner can do its job.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Locally:

- With deliberately broken `[[AccumulatorV2.add]]`-style refs in a scaladoc comment, `build/sbt -Pkinesis-asl unidoc` completed all 2465 HTML files but reported a non-zero error count and exited non-zero. The diagnostic banner listed each broken-reference file:line entry.
- After converting the same refs to the `[[AccumulatorV2#add]]` form, the same command reported `Main Java API documentation successful.` and exited zero.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude (Opus 4.7)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
