bobbai00 opened a new issue, #4674:
URL: https://github.com/apache/texera/issues/4674

   ### What happened?
   
   After #4668 lands, the per-module `NOTICE-binary` files describe each Docker 
image's bundled third-party content, but they are hand-curated subsets of the 
previously curated root `NOTICE-binary`. Hand-curated NOTICE files rot fast: 
every dependency bump silently drifts the committed content away from what the 
jars' `META-INF/NOTICE` files actually carry.
   
   ASF compliance under Apache-2.0 §4(d) requires reproducing the attribution 
notices in every Apache-2.0 dep's bundled `NOTICE` file. Those notices live in 
each jar's `META-INF/NOTICE`. The right source of truth is the jars themselves.
   
   ### Proposed change
   
   Add a generator that produces each `<module>/NOTICE-binary` from the actual 
bundled jars:
   
   1. Walks the module's `lib/` dir.
   2. For each jar, extracts every `META-INF/NOTICE`-style file.
   3. Dedupes by content hash so jars sharing an upstream NOTICE collapse into 
one block.
   4. Emits one block per unique blob with a synthesized project heading + the 
verbatim upstream content.
   5. Accepts an optional `--extras` input for non-jar attributions (Apache-2.0 
Python wheels like aiohttp + Matplotlib that don't ship a NOTICE inside any jar).
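
   A minimal Python sketch of steps 1-5, offered only to make the proposal 
concrete. The function names, the `META-INF/NOTICE*` matching rule, the use of 
SHA-256, and the "Bundled in: <jars>" heading format are all assumptions made 
for illustration; the real generator may synthesize headings differently.

```python
import hashlib
import zipfile
from pathlib import Path


def is_notice_entry(name: str) -> bool:
    # Assumption: "NOTICE-style" means any file whose path starts with
    # META-INF/NOTICE (covers NOTICE, NOTICE.txt, NOTICE.md).
    return name.upper().startswith("META-INF/NOTICE") and not name.endswith("/")


def collect_notices(lib_dir: Path) -> dict[str, tuple[str, list[str]]]:
    """Map content hash -> (notice text, names of jars that carry it)."""
    blocks: dict[str, tuple[str, list[str]]] = {}
    for jar in sorted(lib_dir.glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            for entry in zf.namelist():
                if not is_notice_entry(entry):
                    continue
                text = zf.read(entry).decode("utf-8", errors="replace").strip()
                if not text:
                    continue
                # Dedupe by content hash: identical notices share one block.
                digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
                jars = blocks.setdefault(digest, (text, []))[1]
                if jar.name not in jars:
                    jars.append(jar.name)
    return blocks


def render(blocks: dict[str, tuple[str, list[str]]], extras: str = "") -> str:
    """Emit one block per unique notice, then any hand-written extras."""
    parts = []
    # Sort by first jar name so the output is stable across runs.
    for text, jars in sorted(blocks.values(), key=lambda b: b[1][0]):
        heading = "Bundled in: " + ", ".join(jars)  # hypothetical heading format
        parts.append(f"{heading}\n{'-' * len(heading)}\n{text}\n")
    if extras:
        parts.append(extras.strip() + "\n")
    return "\n".join(parts)
```

   Sorting jars and blocks keeps the output deterministic, which matters if the 
generated file is later compared byte-for-byte against a committed copy.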
   
   Then add a CI check that regenerates `<module>/NOTICE-binary` from the 
freshly built dist `lib/` and diffs the result against the committed file. Any 
drift fails the build and prints a one-line fix-up command.
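
   The drift check could look roughly like the Python sketch below. It assumes 
the regenerated output has already been written to a file; `notice_drift` and 
`main` are hypothetical names, and the failure message stands in for whatever 
fix-up command the real generator ends up exposing.

```python
import difflib
import sys
from pathlib import Path


def notice_drift(committed: str, regenerated: str,
                 label: str = "NOTICE-binary") -> list[str]:
    """Return unified-diff lines; an empty list means no drift."""
    return list(difflib.unified_diff(
        committed.splitlines(keepends=True),
        regenerated.splitlines(keepends=True),
        fromfile=f"{label} (committed)",
        tofile=f"{label} (regenerated)",
    ))


def main(committed_path: str, regenerated_path: str) -> int:
    """Exit-code style result: 0 if the files match, 1 on drift."""
    diff = notice_drift(
        Path(committed_path).read_text(encoding="utf-8"),
        Path(regenerated_path).read_text(encoding="utf-8"),
        label=committed_path,
    )
    if diff:
        sys.stderr.writelines(diff)
        # Placeholder message: the actual fix-up command is not specified here.
        sys.stderr.write(
            f"\n{committed_path} is stale; rerun the generator and commit it.\n"
        )
        return 1
    return 0
```

   A CI step would call `main` with the committed and regenerated paths and 
pass its return value to `sys.exit`.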
   
   ### Version
   
   1.1.0-incubating (Pre-release/Master)
   
   ### Depends on
   
   This change requires #4668 to land first (which introduces the per-module 
`NOTICE-binary` files in the first place).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
