Re: [PR] [SPARK-56827][INFRA] Clean stale genjavadoc output before JavaUnidoc [spark]

via GitHub Mon, 11 May 2026 16:08:41 -0700


dtenedor commented on code in PR #55801:
URL: https://github.com/apache/spark/pull/55801#discussion_r3222671610



##########
project/SparkBuild.scala:
##########
@@ -1729,6 +1778,11 @@ object Unidoc {
       inAnyProject -- inProjects(OldDeps.project, repl, examples, tools, kubernetes,
         yarn, tags, streamingKafka010, sqlKafka010, connectCommon, connect, connectJdbc,
         connectClient, connectShims, protobuf, profiler, udfWorkerProto, udfWorkerCore),
+
+    cleanGenjavadocOutput := {
+      IO.delete(genjavadocJavaOutputDirs((ThisBuild / baseDirectory).value))
+      ()
+    }

Review Comment:
   nit: this task can use sbt's built-in project enumeration instead of a 
hand-rolled filesystem walk. That makes the entire `genjavadocJavaOutputDirs` 
helper and its `skipNames` denylist (and the one-off `tpcds-` rule) 
unnecessary, and it stays correct as new top-level directories appear in the 
worktree or in a developer's clone:
   
   ```suggestion
       cleanGenjavadocOutput := {
         // Resolve every project's `target` setting and keep only existing target/java dirs.
         val dirs = target.all(ScopeFilter(inAnyProject)).value
           .map(_ / "java").filter(_.isDirectory)
         // Remove any stale genjavadoc target/java tree(s).
         IO.delete(dirs)
       }
   ```
   
   After applying the suggestion, the `cleanGenjavadocOutput` `taskKey` 
declaration above stays, but `genjavadocJavaOutputDirs` and the `skipNames` 
`Set` become dead code and should be deleted in the same commit. 
`target.all(ScopeFilter(inAnyProject))` is the documented sbt pattern for 
collecting a setting from every project in the build, so the work scales with 
the number of modules instead of requiring a recursive walk of the whole repo.
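   
   For reference, a minimal standalone sketch of the same pattern outside 
   Spark's build. The `core`, `util`, and `root` projects are hypothetical, and 
   the sketch assumes genjavadoc writes to `target/java` in each project, as 
   the suggestion above does:
   
   ```scala
   // build.sbt for a hypothetical multi-project build (not Spark's).
   lazy val cleanGenjavadocOutput =
     taskKey[Unit]("Delete each project's target/java directory if it exists")
   
   lazy val core = project
   lazy val util = project
   
   lazy val root = (project in file("."))
     .settings(
       cleanGenjavadocOutput := {
         // `target.all(filter)` evaluates the `target` setting once per project
         // matched by the filter; `inAnyProject` matches every project in the build.
         val javaDirs = target.all(ScopeFilter(inAnyProject)).value
           .map(_ / "java")        // assumed genjavadoc output location
           .filter(_.isDirectory)  // skip projects that never produced output
         // IO.delete removes each directory recursively.
         IO.delete(javaDirs)
       }
     )
   ```
   
   In this sketch, `sbt root/cleanGenjavadocOutput` would remove `target/java` 
   under `core`, `util`, and `root` without consulting any directory denylist.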



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

