[ https://issues.apache.org/jira/browse/TIKA-4703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18074240#comment-18074240 ]

ASF GitHub Bot commented on TIKA-4703:
--------------------------------------

nddipiazza commented on code in PR #2715:
URL: https://github.com/apache/tika/pull/2715#discussion_r3100525459


##########
tika-grpc/docker-build/Dockerfile:
##########
@@ -0,0 +1,53 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may not
+# use this file except in compliance with the License. You may obtain a copy of
+# the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+
+FROM ubuntu:plucky
+COPY libs/ /tika/libs/
+COPY plugins/ /tika/plugins/
+COPY config/ /tika/config/
+COPY bin/ /tika/bin
+ARG JRE='openjdk-17-jre-headless'
+ARG VERSION
+ARG TIKA_GRPC_MAX_INBOUND_MESSAGE_SIZE=104857600
+ARG TIKA_GRPC_MAX_OUTBOUND_MESSAGE_SIZE=104857600
+ARG TIKA_GRPC_NUM_THREADS=4
+RUN set -eux \
+    && apt-get update \
+    && apt-get install --yes --no-install-recommends gnupg2 software-properties-common \
+    && DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends $JRE \
+        gdal-bin \
+        tesseract-ocr \
+        tesseract-ocr-eng \
+        tesseract-ocr-ita \
+        tesseract-ocr-fra \
+        tesseract-ocr-spa \
+        tesseract-ocr-deu \
+    && echo ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true | debconf-set-selections \
+    && DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends \
+        xfonts-utils \
+        fonts-freefont-ttf \
+        fonts-liberation \
+        ttf-mscorefonts-installer \
+        wget \
+        cabextract \
+    && apt-get clean -y \
+    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
+
+EXPOSE 9090

Review Comment:
   I pushed a fix for this on the updated branch. Could you please repeat the 
analysis when you have a moment?



##########
tika-grpc/docker-build/Dockerfile:
##########
@@ -0,0 +1,53 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may not
+# use this file except in compliance with the License. You may obtain a copy of
+# the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+
+FROM ubuntu:plucky

Review Comment:
   I pushed a fix for this on the updated branch. Could you please repeat the 
analysis when you have a moment?



##########
tika-grpc/docker-build/Dockerfile:
##########
@@ -0,0 +1,53 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may not
+# use this file except in compliance with the License. You may obtain a copy of
+# the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+
+FROM ubuntu:plucky
+COPY libs/ /tika/libs/
+COPY plugins/ /tika/plugins/
+COPY config/ /tika/config/
+COPY bin/ /tika/bin
+ARG JRE='openjdk-17-jre-headless'

Review Comment:
   I pushed a fix for this on the updated branch. Could you please repeat the 
analysis when you have a moment?






> Integrate Docker image builds into apache/tika and deprecate standalone 
> Docker repos
> ------------------------------------------------------------------------------------
>
>                 Key: TIKA-4703
>                 URL: https://issues.apache.org/jira/browse/TIKA-4703
>             Project: Tika
>          Issue Type: Task
>            Reporter: Nicholas DiPiazza
>            Priority: Major
>
> h2. Summary
> Move Docker image building and publishing into the main 
> [apache/tika|https://github.com/apache/tika] repository, deprecating the 
> standalone Docker repos. This ensures Docker image releases are naturally 
> tied to Tika releases through the existing Maven workflow, rather than 
> requiring cross-repo coordination.
> h2. Current State
> * [tika-docker|https://github.com/apache/tika-docker] - standalone repo that 
> builds the tika-server Docker image, published to [apache/tika on Docker 
> Hub|https://hub.docker.com/r/apache/tika]
> * [tika-grpc-docker|https://github.com/apache/tika-grpc-docker] - standalone 
> repo that builds the tika-grpc Docker image, published to [apache/tika-grpc 
> on Docker Hub|https://hub.docker.com/r/apache/tika-grpc]
> h2. Problem
> Having Docker builds in separate repos means:
> * Docker image releases are decoupled from Tika releases - requires manual 
> coordination
> * No guarantee Docker images match the released Tika version
> * Extra maintenance burden across multiple repos
> * Harder for contributors to understand the full release pipeline
> h2. Proposed Approach
> # Move Dockerfiles and related build config from {{tika-docker}} and 
> {{tika-grpc-docker}} into the main {{apache/tika}} repo
> # Add GitHub Actions workflows to {{apache/tika}} that build and publish 
> Docker images as part of the release process
> # Integrate with the existing Maven workflow so Docker builds happen 
> naturally alongside Java artifact publishing
> # Docker images to publish:
> #* {{apache/tika}} (tika-server) to [Docker 
> Hub|https://hub.docker.com/r/apache/tika]
> #* {{apache/tika-grpc}} (tika-grpc) to [Docker 
> Hub|https://hub.docker.com/r/apache/tika-grpc]
> # Support multi-architecture builds (amd64, arm64) if applicable
> # Proper image tagging tied to Maven release versions (e.g. {{3.1.0}}, 
> {{latest}})
> # Deprecate {{tika-docker}} and {{tika-grpc-docker}} repos with README 
> notices pointing to {{apache/tika}}
> h2. Acceptance Criteria
> * Dockerfiles and build config live in the {{apache/tika}} repo
> * GitHub Actions in {{apache/tika}} build and publish both Docker images on 
> release
> * Docker image versions are automatically tied to Tika release versions
> * {{tika-docker}} and {{tika-grpc-docker}} repos are marked as deprecated
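
The tagging rule in the proposed approach (image tags tied to Maven release versions, e.g. 3.1.0 and latest) could be sketched roughly as below. This is only an illustration under stated assumptions, not the workflow from the PR: the helper names docker_tags and image_refs are hypothetical, and the choice to withhold "latest" from qualified versions such as -SNAPSHOT builds is an assumption the issue does not state.

```python
import re

# Assumed version shape: MAJOR.MINOR.PATCH with an optional qualifier
# such as "-SNAPSHOT" or "-BETA". The issue only gives "3.1.0" as an example.
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+(-[A-Za-z0-9.]+)?")


def docker_tags(maven_version):
    """Return the Docker tags to publish for a Maven version (hypothetical helper)."""
    version = maven_version.strip()
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"unexpected Maven version: {maven_version!r}")
    tags = [version]
    # Assumption: only unqualified release versions also move the "latest" tag.
    if "-" not in version:
        tags.append("latest")
    return tags


def image_refs(repo, maven_version):
    """Build full image references such as "apache/tika-grpc:3.1.0"."""
    return [f"{repo}:{tag}" for tag in docker_tags(maven_version)]


if __name__ == "__main__":
    # The two Docker Hub repositories named in the issue.
    for repo in ("apache/tika", "apache/tika-grpc"):
        print(image_refs(repo, "3.1.0"))
```

Deriving every tag from the single Maven version string is what keeps the image version and the Java artifact version from drifting apart, which is the main problem the issue describes.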



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
