bobbai00 opened a new issue, #4395: URL: https://github.com/apache/texera/issues/4395
### Task Summary

Binary distributions (Docker images, sbt-native-packager dist zips) bundle ~450 third-party dependency jars alongside Texera's own jars. Per ASF policy, the LICENSE file shipped with a binary distribution must describe all bundled contents, not just the source release.

Currently, binary distributions copy the repo-root LICENSE, which only covers vendored source code (mbknor, Angular formly, TypeFox, SVGRepo). It does not account for the hundreds of third-party jars in `lib/`.

What needs to be done:

- Add a `LICENSE-binary` file that lists all non-Apache-2.0 bundled dependencies grouped by license (MIT, BSD, EPL, MPL, CDDL, etc.), with the full text of each license in the `licenses/` directory.
- Add a `tools/licensing/collect_binary_licenses.sh` helper script (modeled after Flink's) that extracts `META-INF/LICENSE` and `META-INF/NOTICE` from each bundled jar for review.
- Add a `tools/licensing/check_binary_deps.sh` script and a CI workflow that compare the actual bundled jars against a known list (`known-binary-deps.txt`) and fail if a new dependency is added without updating `LICENSE-binary`.
- Wire `LICENSE-binary` into the Dockerfiles and dist zips so it replaces the repo-root LICENSE in binary artifacts.

### Priority

P1 – High

### Task Type

- [x] Code Implementation
- [x] Documentation
- [ ] Refactor / Cleanup
- [ ] Testing / QA
- [x] DevOps / Deployment
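
As a rough illustration of the dependency check described in the task summary, the comparison could look something like the sketch below. This is not the proposed `check_binary_deps.sh` itself. The function names `list_bundled_jars` and `find_unknown_jars` are hypothetical, and the sketch assumes that `known-binary-deps.txt` holds one jar file name per line, which the issue does not specify.

```shell
# Hypothetical sketch, not the actual tools/licensing/check_binary_deps.sh.
# Assumption: the known-deps file lists one jar file name per line.

# Print the file names of all jars under directory $1, sorted and de-duplicated.
list_bundled_jars() {
  find "$1" -type f -name '*.jar' -exec basename {} \; | LC_ALL=C sort -u
}

# Print every jar found under $1 that is not listed in file $2.
# Returns 1 if any unlisted jar exists, 0 otherwise.
find_unknown_jars() {
  # grep -v -x -F -f keeps only the lines that do not exactly match
  # any line of the known list; "|| true" absorbs grep's exit status 1
  # when every jar is already known.
  unknown=$(list_bundled_jars "$1" | grep -vxF -f "$2" || true)
  if [ -n "$unknown" ]; then
    printf '%s\n' "$unknown"
    return 1
  fi
  return 0
}
```

A CI step could then call something like `find_unknown_jars lib known-binary-deps.txt || exit 1`, so that the build fails and prints the unlisted jars whenever a dependency is bundled without being recorded. Matching on exact file names means a version bump also counts as a new entry, which may or may not be the desired strictness.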
