ewianda opened a new issue, #33063:
URL: https://github.com/apache/beam/issues/33063
### What would you like to happen?
### Summary
It would be beneficial to have the Apache Beam SDK published as a `.tar.gz`
archive containing the `/opt/apache/beam` directory. This archive would make it
easier for Bazel users, particularly those using `rules_oci` for container
image building, to incorporate the SDK without relying on the `COPY --from`
instruction of Docker multi-stage builds.
### Motivation
When building images using Bazel with `rules_oci`, the `COPY --from`
instruction commonly used in Docker multi-stage builds is not available,
because `rules_oci` does not execute Dockerfiles. Images are assembled from a
base image plus tar layers, so there is no direct way to copy files out of the
Apache Beam SDK Docker images into a custom image.
Publishing a `.tar.gz` archive of the `/opt/apache/beam` directory would
allow Bazel users to incorporate the SDK directly, simplifying the process and
ensuring compatibility with `rules_oci`. This addition would enable a
straightforward way to include the Beam SDK files as part of Bazel’s
`oci_image` rules by downloading and extracting the `.tar.gz` file as part of
the build process.
### Proposed Solution
Include a new build artifact in the Apache Beam release process that
packages the `/opt/apache/beam` directory into a `.tar.gz` file. This file
would match the directory structure expected in Beam Docker images and would be
published alongside the Docker images.
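As a rough illustration of the packaging step, the sketch below writes a local directory into a `.tar.gz` whose entries are rooted at `opt/apache/beam`, so that the archive unpacks into the same location the Docker images use. The function name `package_sdk_dir` and its parameters are hypothetical and are not part of Beam's release tooling; it only assumes the SDK files have already been staged in some local directory.

```python
import tarfile
from pathlib import Path


def package_sdk_dir(src_dir: str, out_path: str, prefix: str = "opt/apache/beam") -> list[str]:
    """Write src_dir into a gzip-compressed tar whose entries sit under prefix.

    Hypothetical helper: returns the sorted member names so the caller can
    inspect the resulting layout.
    """
    src = Path(src_dir)
    with tarfile.open(out_path, "w:gz") as tar:
        # arcname maps the on-disk directory to the in-image path;
        # tarfile adds the directory contents recursively.
        tar.add(str(src), arcname=prefix)
    with tarfile.open(out_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Keeping the prefix relative (no leading `/`) is a deliberate choice in this sketch, since layer tarballs are normally unpacked relative to the image root.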
#### Example Workflow with `rules_oci`
With the `.tar.gz` package, Bazel users can specify the SDK directory within
their Bazel `BUILD` files without relying on Docker’s `COPY --from` syntax.
Here’s an example:
```python
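load("@rules_oci//oci:defs.bzl", "oci_image")

oci_image(
    name = "apache_beam",
    # Layer containing the proposed SDK archive (placeholder path).
    tars = ["path/to/apache-beam-sdk.tar.gz"],
    entrypoint = ["/opt/apache/beam/boot"],
)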
```
This Bazel rule can then build an image with the Beam SDK included,
compatible with `rules_oci`.
### Benefits
- **Bazel Compatibility**: Allows Bazel users to build images without
Docker’s multi-stage build functionality.
- **Enhanced Flexibility**: Users can specify their own base image while
easily incorporating Apache Beam.
- **Easier CI/CD Integration**: Simplifies the build process for users in
CI/CD environments where Docker multi-stage builds are challenging to implement
with Bazel.
### Additional Context
The proposed `.tar.gz` archive would contain files and directories currently
in `/opt/apache/beam`, such as `target/base_image_requirements.txt`,
`target/apache-beam.tar.gz`, `target/LICENSE`, `target/NOTICE`, and
`target/LICENSE.python`, matching the Docker SDK image structure.
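A consumer could check that a downloaded archive has the expected layout before wiring it into a build. The following is a minimal sketch of such a check; the helper name `missing_entries` is hypothetical, and the list of expected paths is supplied by the caller rather than taken from any published Beam manifest.

```python
import tarfile


def missing_entries(archive_path: str, expected: list[str], root: str = "opt/apache/beam") -> list[str]:
    """Return the expected paths (relative to root) that the archive lacks."""
    with tarfile.open(archive_path, "r:*") as tar:
        # Treat "./opt/..." and "opt/..." member names as equivalent.
        present = {name.removeprefix("./") for name in tar.getnames()}
    return [rel for rel in expected if f"{root}/{rel}" not in present]
```

For example, `missing_entries("apache-beam-sdk.tar.gz", ["target/LICENSE", "target/NOTICE"])` would return an empty list when both files are present under `opt/apache/beam`.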
Thank you for considering this feature request! This addition would
significantly enhance Apache Beam's flexibility for Bazel users.
### Issue Priority
Priority: 2 (default / most feature requests should be filed as P2)
### Issue Components
- [X] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner