[
https://issues.apache.org/jira/browse/TIKA-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076576#comment-18076576
]
ASF GitHub Bot commented on TIKA-4703:
--------------------------------------
nddipiazza opened a new pull request, #2790:
URL: https://github.com/apache/tika/pull/2790
## Summary
The `tika-grpc` Docker image was failing to start with:
```
Error: Unable to initialize main class
org.apache.tika.pipes.grpc.TikaGrpcServer
Caused by: java.lang.NoClassDefFoundError: io/grpc/BindableService
```
## Root Cause
The Docker build context included only the thin jar (`tika-grpc-X.jar`), not
the `lib/` directory containing its runtime dependencies. The jar's
`MANIFEST.MF` has `Class-Path: lib/grpc-stub-X.jar lib/grpc-netty-shaded-X.jar
...` (set by `classpathPrefix=lib/` in `maven-jar-plugin`). Java resolves those
entries relative to the jar's own location, so it looks for deps at
`/tika/libs/lib/`, which did not exist in the image.
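
One way to catch this kind of mismatch before an image ships is to read the jar's manifest and check that every `Class-Path` entry exists relative to the jar. The following is a minimal sketch in Python, not part of this PR; the function name `missing_class_path_entries` is hypothetical, and it assumes the manifest is UTF-8 and that all `Class-Path` entries are relative paths.

```python
import zipfile
from pathlib import Path


def missing_class_path_entries(jar_path):
    """Return manifest Class-Path entries that do not exist next to the jar.

    Hypothetical helper for illustration; assumes relative Class-Path entries.
    """
    jar = Path(jar_path)
    with zipfile.ZipFile(jar) as zf:
        manifest = zf.read("META-INF/MANIFEST.MF").decode("utf-8")

    # Manifest lines are wrapped at 72 bytes; a continuation line starts
    # with a single space, so unfold those before looking for the header.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")

    entries = []
    for line in unfolded.split("\n"):
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "class-path":
            entries = value.split()
            break

    # Relative entries are resolved against the directory holding the jar.
    return [e for e in entries if not (jar.parent / e).exists()]
```

Run against `/tika/libs/tika-grpc-X.jar` inside the image, a non-empty result would indicate the same missing-`lib/` condition described above.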
## Fix
Instead of copying just the bare jar, unzip the Maven assembly zip (which
bundles the jar + `lib/` with all runtime deps) and include both in the Docker
build context:
```
/tika/libs/tika-grpc-X.jar ← main jar
/tika/libs/lib/*.jar ← runtime deps (matches MANIFEST Class-Path)
```
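
The staging step can be sketched as follows. This is an illustration in Python rather than the workflow's actual commands, and it rests on assumptions: the function name `stage_build_context` is hypothetical, and the assembly zip is assumed to hold the jar and `lib/` at its root with no wrapping top-level directory.

```python
import zipfile
from pathlib import Path


def stage_build_context(assembly_zip, libs_dir):
    """Extract the assembly zip into libs_dir and verify the expected layout.

    Hypothetical helper; assumes the jar and lib/ sit at the zip's root.
    """
    libs = Path(libs_dir)
    libs.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(assembly_zip) as zf:
        zf.extractall(libs)

    # The main jar must sit directly in libs_dir, with its deps under lib/,
    # so that the manifest's relative Class-Path entries resolve.
    main_jars = sorted(libs.glob("tika-grpc-*.jar"))
    dep_jars = sorted((libs / "lib").glob("*.jar"))
    if not main_jars or not dep_jars:
        raise FileNotFoundError(
            "expected tika-grpc-*.jar and lib/*.jar in " + str(libs)
        )
    return main_jars[0], dep_jars
```

Failing fast when either the main jar or `lib/` is missing surfaces a broken layout at build time rather than as a `NoClassDefFoundError` at container start.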
## Critical Files
- `.github/workflows/docker-snapshot.yml`
- `.github/workflows/docker-release.yml`
> Integrate Docker image builds into apache/tika and deprecate standalone
> Docker repos
> ------------------------------------------------------------------------------------
>
> Key: TIKA-4703
> URL: https://issues.apache.org/jira/browse/TIKA-4703
> Project: Tika
> Issue Type: Task
> Reporter: Nicholas DiPiazza
> Priority: Major
>
> h2. Summary
> Move Docker image building and publishing into the main
> [apache/tika|https://github.com/apache/tika] repository, deprecating the
> standalone Docker repos. This ensures Docker image releases are naturally
> tied to Tika releases through the existing Maven workflow, rather than
> requiring cross-repo coordination.
> h2. Current State
> * [tika-docker|https://github.com/apache/tika-docker] - standalone repo that
> builds the tika-server Docker image, published to [apache/tika on Docker
> Hub|https://hub.docker.com/r/apache/tika]
> * [tika-grpc-docker|https://github.com/apache/tika-grpc-docker] - standalone
> repo that builds the tika-grpc Docker image, published to [apache/tika-grpc
> on Docker Hub|https://hub.docker.com/r/apache/tika-grpc]
> h2. Problem
> Having Docker builds in separate repos means:
> * Docker image releases are decoupled from Tika releases, so they require
> manual coordination
> * No guarantee Docker images match the released Tika version
> * Extra maintenance burden across multiple repos
> * Harder for contributors to understand the full release pipeline
> h2. Proposed Approach
> # Move Dockerfiles and related build config from {{tika-docker}} and
> {{tika-grpc-docker}} into the main {{apache/tika}} repo
> # Add GitHub Actions workflows to {{apache/tika}} that build and publish
> Docker images as part of the release process
> # Integrate with the existing Maven workflow so Docker builds happen
> naturally alongside Java artifact publishing
> # Docker images to publish:
> #* {{apache/tika}} (tika-server) to [Docker
> Hub|https://hub.docker.com/r/apache/tika]
> #* {{apache/tika-grpc}} (tika-grpc) to [Docker
> Hub|https://hub.docker.com/r/apache/tika-grpc]
> # Support multi-architecture builds (amd64, arm64) if applicable
> # Proper image tagging tied to Maven release versions (e.g. {{3.1.0}},
> {{latest}})
> # Deprecate {{tika-docker}} and {{tika-grpc-docker}} repos with README
> notices pointing to {{apache/tika}}
> h2. Acceptance Criteria
> * Dockerfiles and build config live in the {{apache/tika}} repo
> * GitHub Actions in {{apache/tika}} build and publish both Docker images on
> release
> * Docker image versions are automatically tied to Tika release versions
> * {{tika-docker}} and {{tika-grpc-docker}} repos are marked as deprecated