dtenedor commented on code in PR #55801:
URL: https://github.com/apache/spark/pull/55801#discussion_r3222671610
##########
project/SparkBuild.scala:
##########
@@ -1729,6 +1778,11 @@ object Unidoc {
inAnyProject -- inProjects(OldDeps.project, repl, examples, tools,
kubernetes,
yarn, tags, streamingKafka010, sqlKafka010, connectCommon, connect,
connectJdbc,
connectClient, connectShims, protobuf, profiler, udfWorkerProto,
udfWorkerCore),
+
+ cleanGenjavadocOutput := {
+ IO.delete(genjavadocJavaOutputDirs((ThisBuild / baseDirectory).value))
+ ()
+ }
Review Comment:
nit: this task can use sbt's built-in project enumeration instead of a
hand-rolled filesystem walk. That makes the entire `genjavadocJavaOutputDirs`
helper and its `skipNames` denylist (and the one-off `tpcds-` rule)
unnecessary, and it stays correct as new top-level directories appear in the
worktree or in a developer's clone:
```suggestion
cleanGenjavadocOutput := {
  val dirs = target.all(ScopeFilter(inAnyProject)).value
    .map(_ / "java").filter(_.isDirectory)
  // Remove any stale genjavadoc target/java tree(s).
  IO.delete(dirs)
}
```
After applying, the `cleanGenjavadocOutput` `taskKey` declaration above
stays, but `genjavadocJavaOutputDirs` and the `skipNames` `Set` become dead
code and should be deleted in the same commit.
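`target.all(ScopeFilter(inAnyProject))` is the documented sbt pattern for
collecting a setting's value from every project in the build, so the work is
O(modules) rather than a recursive walk of the whole repo. The `isDirectory`
filter runs when the task executes, so projects that have not produced a
`target/java` tree yet are simply skipped.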
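For reference, here is a minimal standalone sketch of the same pattern with
logging added, in case it is useful to see which directories get removed. The
key description string and the log message are my own placeholders, not taken
from the PR, and I am assuming the key is declared as a `taskKey[Unit]`:

```scala
// build.sbt sketch (sbt 1.x); sbt._ and Keys._ are imported implicitly here.
// Assumption: the key is a taskKey[Unit]; the description text is made up.
lazy val cleanGenjavadocOutput =
  taskKey[Unit]("Deletes stale genjavadoc target/java directories.")

cleanGenjavadocOutput := {
  val log = streams.value.log
  // One `target` per project known to the build; no filesystem walk.
  val dirs = target.all(ScopeFilter(inAnyProject)).value
    .map(_ / "java")
    .filter(_.isDirectory) // evaluated at task run time
  // Hypothetical log line, only to make the deletion visible.
  dirs.foreach(d => log.info(s"Deleting stale genjavadoc output: $d"))
  IO.delete(dirs)
}
```

Running `sbt cleanGenjavadocOutput` should then print one line per existing
`target/java` directory before deleting it. The logging is optional; drop the
`log` lines if the output is too noisy.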
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]