ewianda opened a new issue, #33063:
URL: https://github.com/apache/beam/issues/33063

   ### What would you like to happen?
   
   ### Summary
   It would be beneficial to have the Apache Beam SDK published as a `.tar.gz` 
archive containing the `/opt/apache/beam` directory. This archive would make it 
easier for Bazel users, particularly those using `rules_oci` for container 
image building, to incorporate the SDK without relying on Docker’s multi-stage 
`COPY --from` instruction.
   
   ### Motivation
   When building images using Bazel with `rules_oci`, the `COPY --from` 
instruction commonly used in Docker multi-stage builds is not available, 
because `rules_oci` assembles images from tar layers instead of running a 
Dockerfile. There is no direct way to copy files out of another image, which 
limits the ability to build custom images based on the Apache Beam SDK Docker 
images.
   
   Publishing a `.tar.gz` archive of the `/opt/apache/beam` directory would 
allow Bazel users to incorporate the SDK directly while staying compatible 
with `rules_oci`. The build would download the `.tar.gz` file and pass it to 
Bazel’s `oci_image` rule as a layer, so the Beam SDK files end up in the image 
without any Dockerfile step.
   
   ### Proposed Solution
   Include a new build artifact in the Apache Beam release process that 
packages the `/opt/apache/beam` directory into a `.tar.gz` file. This file 
would match the directory structure expected in Beam Docker images and would be 
published alongside the Docker images.
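
   As a rough illustration of the packaging step, the sketch below uses 
Python's standard `tarfile` module to write a directory into a `.tar.gz` whose 
entries are rooted at `opt/apache/beam`. The function name `package_sdk_dir` 
and its parameters are hypothetical, and this is not how the Beam release 
process works today; it only shows the archive layout this request assumes.

```python
import tarfile


def package_sdk_dir(src_dir, out_path, arcname="opt/apache/beam"):
    """Write src_dir into a gzip-compressed tar with entries rooted at arcname.

    Hypothetical helper: the name and signature are illustrative only.
    """

    def normalize(info):
        # Drop owner and timestamp metadata so the entries do not depend
        # on the machine that produced the archive.
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        info.mtime = 0
        return info

    with tarfile.open(out_path, "w:gz") as tar:
        # arcname replaces the source path prefix, so files land under
        # opt/apache/beam/... when the archive is used as an image layer.
        tar.add(src_dir, arcname=arcname, filter=normalize)
    return out_path
```

   Rooting the entries at `opt/apache/beam` (no leading slash) means the 
archive can be used as an image layer as-is, with files appearing under 
`/opt/apache/beam` in the resulting image.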
   
   #### Example Workflow with `rules_oci`
   With the `.tar.gz` package, Bazel users can specify the SDK directory within 
their Bazel `BUILD` files without relying on Docker’s `COPY --from` syntax. 
Here’s an example:
   
   ```python
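   load("@rules_oci//oci:defs.bzl", "oci_image")

   oci_image(
       name = "apache_beam",
       # Hypothetical base image target; substitute your own.
       base = "@python_base",
       tars = ["path/to/apache-beam-sdk.tar.gz"],
       entrypoint = ["/opt/apache/beam/boot"],
   )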
   ```
   
   This target builds an image with the Beam SDK included as a layer, using 
only `rules_oci`.
   
   ### Benefits
   - **Bazel Compatibility**: Allows Bazel users to build images without 
Docker’s multi-stage build functionality.
   - **Enhanced Flexibility**: Users can specify their own base image while 
easily incorporating Apache Beam.
   - **Easier CI/CD Integration**: Simplifies the build process for users in 
CI/CD environments where Docker multi-stage builds are challenging to implement 
with Bazel.
   
   ### Additional Context
   The proposed `.tar.gz` archive would contain files and directories currently 
in `/opt/apache/beam`, such as `target/base_image_requirements.txt`, 
`target/apache-beam.tar.gz`, `target/LICENSE`, `target/NOTICE`, and 
`target/LICENSE.python`, matching the Docker SDK image structure.
   
   Thank you for considering this feature request! This addition would 
significantly enhance Apache Beam's flexibility for Bazel users.
   
   ### Issue Priority
   
   Priority: 2 (default / most feature requests should be filed as P2)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam YAML
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Infrastructure
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


