This is an automated email from the ASF dual-hosted git repository.

ndipiazza pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git


The following commit(s) were added to refs/heads/main by this push:
     new 30e46db4fa TIKA-4606: Add e2e tests for Ignite 3.x upgrade (#2655)
30e46db4fa is described below

commit 30e46db4fab603dd69b152448aba3b241805c9b7
Author: Nicholas DiPiazza <[email protected]>
AuthorDate: Wed Mar 4 09:53:30 2026 -0600

    TIKA-4606: Add e2e tests for Ignite 3.x upgrade (#2655)
    
    * TIKA-4606: Add e2e tests for Ignite 3.x upgrade
    
    - Add tika-e2e-tests module to root pom.xml build
    - Update tika-e2e-tests parent pom: inherit from tika-parent, add Micronaut
      version alignment for Ignite 3.x transitive dependencies
    - Update tika-grpc e2e pom: local server mode by default (tika.e2e.useLocalServer=true),
      add Awaitility dependency, disable enforcer for Ignite 3.x dep conflicts
    - Add tika-config-ignite-local.json for local server mode testing
    - Update ExternalTestBase and IgniteConfigStoreTest for local vs Docker switching
    - Add FileSystemFetcherTest e2e test
    
    Note: these tests require the Ignite 3.x core upgrade from TIKA-4606-ignite-core.
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix forbidden API violations and hardcoded path in e2e tests
    
    - Use toLowerCase(Locale.ROOT) instead of toLowerCase() in ExternalTestBase
    - Use InputStreamReader with StandardCharsets.UTF_8 instead of default charset
    - Fix hardcoded absolute basePath in tika-config-ignite-local.json to use relative target/govdocs1
    - Add tika-config.json for FileSystemFetcherTest (ExternalTestBase) local server mode
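    
    The two forbidden-API fixes listed above can be illustrated with a minimal
    sketch. The class and method names (ForbiddenApiSketch, isWindows,
    readFirstLine) are hypothetical and are not taken from ExternalTestBase;
    only the JDK calls are real.
    
    ```java
    import java.io.BufferedReader;
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import java.util.Locale;

    // Hypothetical illustration; not the code from ExternalTestBase.
    public class ForbiddenApiSketch {

        // Locale.ROOT keeps case mapping independent of the JVM default locale
        // (a Turkish default locale, for example, lowercases 'I' to a dotless i).
        static boolean isWindows(String osName) {
            return osName.toLowerCase(Locale.ROOT).contains("windows");
        }

        // Decode bytes with an explicit charset instead of the platform default.
        static String readFirstLine(InputStream in) throws IOException {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        }

        public static void main(String[] args) throws IOException {
            System.out.println(isWindows(System.getProperty("os.name")));
            byte[] bytes = "first line\nsecond line".getBytes(StandardCharsets.UTF_8);
            System.out.println(readFirstLine(new ByteArrayInputStream(bytes)));
        }
    }
    ```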
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix tika-e2e-tests parent version to use ${revision}
    
    Use ${revision} instead of hardcoded 4.0.0-SNAPSHOT for tika-parent
    reference, consistent with all other modules in the project. Fixes
    CI failure on Windows where Maven cannot resolve the parent POM.
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix tika-grpc-e2e-test parent version to use ${revision}
    
    Same fix as tika-e2e-tests: use ${revision} with explicit relativePath
    instead of hardcoded 4.0.0-SNAPSHOT. Fixes Windows CI failure.
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Disable FileSystemFetcherTest on Windows
    
    exec:exec with %%classpath exceeds Windows CreateProcess command-line
    length limit (error=206). Consistent with IgniteConfigStoreTest which
    has the same @DisabledOnOs(WINDOWS) annotation.
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Remove AI-generated slop comments from e2e tests
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Address PR review comments
    
    - Move tika-e2e-tests behind -Pe2e Maven profile; update CI workflows
    - tika-e2e-tests/pom.xml: tika.version 4.0.0-SNAPSHOT -> ${revision}
    - tika-grpc/pom.xml: govdocs index properties pass-through with defaults
    - FileSystemFetcherTest: useLocalServer default false -> true
    - ExternalTestBase: use mvnw wrapper instead of bare mvn
    - ExternalTestBase: require tika.docker.compose.file system property for Docker mode
    - ExternalTestBase: remove Executors.newCachedThreadPool() leak from channel builder
    - IgniteConfigStoreTest: use mvnw wrapper instead of bare mvn
    - IgniteConfigStoreTest: require tika.docker.compose.ignite.file for Docker mode
    - IgniteConfigStoreTest: remove Executors.newCachedThreadPool() leak from channel builder
    - tika-config-ignite-local.json: replicas 2 -> 1 (single-node local test)
    - README.md: update note to reflect -Pe2e profile inclusion
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix Windows CI profile flag (PowerShell comma issue)
    
    Use -Pci -Pe2e instead of -Pci,e2e; PowerShell parses comma as array separator.
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Address second round of PR review comments
    
    - Wrap ManagedChannel in try/finally in FileSystemFetcherTest
    - Quote config file paths in exec.args to handle paths with spaces
    - Fix USE_LOCAL_SERVER default to true in ExternalTestBase
    - Remove pinned jackson.version from tika-e2e-tests/pom.xml (use parent)
    - Fix ./mvnw -> ../mvnw in README.md (no wrapper in this dir)
    - Clarify Docker is only required for Docker Compose mode in README.md
    - Move e2e tests to separate CI job that only runs on push
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Replace GovDocs1 download with committed test fixtures
    
    The default test run now uses small committed fixture files (txt, html,
    csv, xml) instead of downloading GovDocs1 from the internet. This removes
    the external network dependency from every CI run.
    
    GovDocs1 download is still available as an opt-in by setting
    -Dgovdocs1.fromIndex=N for large-corpus testing.
    
    The e2e CI job now runs on both pull_request and push since there is
    no longer a large corpus download to gate behind push-only.
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix remaining PR review issues
    
    - Fix USE_LOCAL_SERVER default to true in IgniteConfigStoreTest
    - Update @DisabledOnOs reason to accurately describe the Windows
      classpath length limit (CreateProcess error=206)
    - Replace channel-state-based waitForServerReady with a real listFetchers
      gRPC call so readiness confirms the service layer is up
    - Rewrite tika-grpc/README.md: fix ./mvnw -> ../../mvnw, clarify Docker
      is only needed for Docker mode, update corpus section to explain fixtures
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix Copilot review issues - govdocs opt-in, RAT excludes, typo corpus.numDocs, remove duplicates, safe killProcessOnPort
    
    - Replace govdocs1.fromIndex != null check with tika.e2e.useGovdocs=false so default CI runs always use committed fixtures
    - Add tika.e2e.useGovdocs and corpus.numDocs properties to pom.xml; pass through Surefire
    - Add RAT excludes for src/test/resources/tika-config*.json and src/test/resources/test-fixtures/**
    - Rename corpa.numdocs typo to corpus.numDocs in pom.xml, FileSystemFetcherTest, IgniteConfigStoreTest
    - Remove duplicate fields and methods from IgniteConfigStoreTest; use ExternalTestBase.TEST_FOLDER, OBJECT_MAPPER, copyTestFixtures, downloadAndUnzipGovdocs1, assertAllFilesFetched
    - Make ExternalTestBase.copyTestFixtures() public static for shared use
    - Fix killProcessOnPort to check process cmdline before killing to avoid collateral damage
    
    Co-authored-by: Copilot <[email protected]>
    
    * Update .github/workflows/main-jdk17-build.yml
    
    Co-authored-by: Copilot <[email protected]>
    
    * Update tika-e2e-tests/README.md
    
    Co-authored-by: Copilot <[email protected]>
    
    * Update tika-e2e-tests/tika-grpc/README.md
    
    Co-authored-by: Copilot <[email protected]>
    
    * Update tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ExternalTestBase.java
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix README GovDocs opt-in flag - document tika.e2e.useGovdocs=true
    
    The README was still documenting govdocs1.fromIndex as the download trigger.
    The actual opt-in property is tika.e2e.useGovdocs=true; govdocs1.fromIndex/toIndex
    only control which zip range to fetch once the download is enabled.
    
    Co-authored-by: Copilot <[email protected]>
    
    * Update tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java
    
    Co-authored-by: Copilot <[email protected]>
    
    * Update tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/filesystem/FileSystemFetcherTest.java
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix corpus init check to require at least one non-zip extracted file
    
    Previously setupIgnite() skipped initialization if the folder was non-empty,
    which could leave tests running against only zip files from a partial prior run.
    hasExtractedFiles() now checks for at least one non-zip file before treating
    the corpus as ready.
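    
    A minimal sketch of such a check, assuming it walks the corpus folder and
    looks for any regular file that does not end in .zip. The class name
    CorpusReadySketch and the method body are assumptions for illustration;
    only the method name hasExtractedFiles comes from this message, and the
    committed implementation may differ.
    
    ```java
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Locale;
    import java.util.stream.Stream;

    // Hypothetical illustration; not the code from IgniteConfigStoreTest.
    public class CorpusReadySketch {

        // True only if the folder holds at least one regular file that is not a zip.
        static boolean hasExtractedFiles(Path folder) throws IOException {
            if (!Files.isDirectory(folder)) {
                return false;
            }
            try (Stream<Path> paths = Files.walk(folder)) {
                return paths
                        .filter(Files::isRegularFile)
                        .anyMatch(p -> !p.getFileName().toString()
                                .toLowerCase(Locale.ROOT).endsWith(".zip"));
            }
        }

        public static void main(String[] args) throws IOException {
            Path dir = Files.createTempDirectory("corpus-sketch");
            Files.createFile(dir.resolve("001.zip"));
            System.out.println(hasExtractedFiles(dir)); // only a zip is present
            Files.createFile(dir.resolve("sample.txt"));
            System.out.println(hasExtractedFiles(dir)); // an extracted file is present
        }
    }
    ```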
    
    Co-authored-by: Copilot <[email protected]>
    
    * TIKA-4606: Fix malformed file-walking forEach block in IgniteConfigStoreTest
    
    The if (maxDocs > 0) block and the forEach lambda were merged into one
    broken block during a prior rebase. Restore the correct structure: apply
    limit() inside the if block, then call forEach() unconditionally outside it.
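    
    The restored structure can be sketched as follows. This is an illustration
    only: the class LimitThenForEachSketch, the method collect, and the list of
    file names are hypothetical stand-ins for the file-walking code in
    IgniteConfigStoreTest.
    
    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.stream.Stream;

    // Hypothetical illustration; not the code from IgniteConfigStoreTest.
    public class LimitThenForEachSketch {

        static List<String> collect(List<String> files, int maxDocs) {
            List<String> processed = new ArrayList<>();
            Stream<String> stream = files.stream();
            if (maxDocs > 0) {
                // Cap the stream only when a positive limit was requested.
                stream = stream.limit(maxDocs);
            }
            // forEach runs unconditionally, outside the if block.
            stream.forEach(processed::add);
            return processed;
        }

        public static void main(String[] args) {
            List<String> files = List.of("a.txt", "b.html", "c.csv");
            System.out.println(collect(files, 2));
            System.out.println(collect(files, 0));
        }
    }
    ```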
    
    Co-authored-by: Copilot <[email protected]>
    
    ---------
    
    Co-authored-by: Copilot <[email protected]>
    Co-authored-by: Copilot <[email protected]>
---
 .github/workflows/main-jdk17-build.yml             |  19 +
 .../main-jdk17-windows-build-multi-locale.yml      |   2 +-
 .github/workflows/main-jdk17-windows-build.yml     |   2 +-
 pom.xml                                            |   6 +
 tika-e2e-tests/README.md                           |  59 +++
 tika-e2e-tests/pom.xml                             | 173 +++++++
 tika-e2e-tests/tika-grpc/README.md                 |  84 ++++
 tika-e2e-tests/tika-grpc/pom.xml                   | 178 +++++++
 .../sample-configs/ignite/tika-config-ignite.json  |  24 +
 .../org/apache/tika/pipes/ExternalTestBase.java    | 364 ++++++++++++++
 .../pipes/filesystem/FileSystemFetcherTest.java    | 164 +++++++
 .../tika/pipes/ignite/IgniteConfigStoreTest.java   | 533 +++++++++++++++++++++
 .../src/test/resources/test-fixtures/sample.csv    |   4 +
 .../src/test/resources/test-fixtures/sample.html   |   8 +
 .../src/test/resources/test-fixtures/sample.txt    |   3 +
 .../src/test/resources/test-fixtures/sample.xml    |   5 +
 .../test/resources/tika-config-ignite-local.json   |  52 ++
 .../src/test/resources/tika-config-ignite.json     |  52 ++
 .../tika-grpc/src/test/resources/tika-config.json  |  32 ++
 19 files changed, 1762 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/main-jdk17-build.yml b/.github/workflows/main-jdk17-build.yml
index bd51d496a0..75c923c10e 100644
--- a/.github/workflows/main-jdk17-build.yml
+++ b/.github/workflows/main-jdk17-build.yml
@@ -45,3 +45,22 @@ jobs:
           cache: 'maven'
       - name: Build with Maven
         run: mvn clean apache-rat:check test install javadoc:aggregate -Pci -B "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn"
+
+  e2e-tests:
+    runs-on: ubuntu-latest
+    timeout-minutes: 30
+    needs: build
+    strategy:
+      matrix:
+        java: [ '17' ]
+
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up JDK ${{ matrix.java }}
+        uses: actions/setup-java@v4
+        with:
+          distribution: 'temurin'
+          java-version: ${{ matrix.java }}
+          cache: 'maven'
+      - name: Run E2E Tests
+        run: mvn -pl tika-e2e-tests -am clean verify -Pe2e -B "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn"
diff --git a/.github/workflows/main-jdk17-windows-build-multi-locale.yml b/.github/workflows/main-jdk17-windows-build-multi-locale.yml
index c99ee86e30..d5ce36220c 100644
--- a/.github/workflows/main-jdk17-windows-build-multi-locale.yml
+++ b/.github/workflows/main-jdk17-windows-build-multi-locale.yml
@@ -48,4 +48,4 @@ jobs:
           java-version: ${{ matrix.java }}
           cache: 'maven'
       - name: Build with Maven
-        run: mvn clean test install javadoc:aggregate -Pci -B "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn"
+        run: mvn clean test install javadoc:aggregate -Pci -Pe2e -B "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn"
diff --git a/.github/workflows/main-jdk17-windows-build.yml b/.github/workflows/main-jdk17-windows-build.yml
index 9a7f85715d..41d55d6f60 100644
--- a/.github/workflows/main-jdk17-windows-build.yml
+++ b/.github/workflows/main-jdk17-windows-build.yml
@@ -47,4 +47,4 @@ jobs:
           cache: 'maven'
       - name: Build with Maven
         working-directory: 'tika build dir'
-        run: mvn clean test install javadoc:aggregate -Pci -B "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn"
+        run: mvn clean test install javadoc:aggregate -Pci -Pe2e -B "-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn"
diff --git a/pom.xml b/pom.xml
index 71e23c5823..493c617957 100644
--- a/pom.xml
+++ b/pom.xml
@@ -60,6 +60,12 @@
   </modules>
 
   <profiles>
+    <profile>
+      <id>e2e</id>
+      <modules>
+        <module>tika-e2e-tests</module>
+      </modules>
+    </profile>
     <profile>
       <id>apache-release</id>
       <modules>
diff --git a/tika-e2e-tests/README.md b/tika-e2e-tests/README.md
new file mode 100644
index 0000000000..bb0c538ace
--- /dev/null
+++ b/tika-e2e-tests/README.md
@@ -0,0 +1,59 @@
+# Apache Tika End-to-End Tests
+
+End-to-end integration tests for Apache Tika components.
+
+## Overview
+
+This module contains standalone end-to-end (E2E) tests for various Apache Tika distribution formats and deployment modes. Unlike unit and integration tests in the main Tika build, these E2E tests validate complete deployment scenarios using Docker containers and real-world test data.
+
+**Note:** This module is included in the main Tika build under the `e2e` Maven profile (`-Pe2e`). Run `mvn test -Pe2e` from the repo root to execute these tests.
+
+## Test Modules
+
+- **tika-grpc** - E2E tests for tika-grpc server
+
+## Prerequisites
+
+- Java 17 or later
+- Maven 3.6 or later
+- Internet connection (only when running tests that download external corpora, e.g. with `-Dtika.e2e.useGovdocs=true`)
+- Docker and Docker Compose (only required for Docker Compose mode; not needed for the default local-server mode)
+
+## Building All E2E Tests
+
+From this directory:
+
+```bash
+../mvnw clean install
+```
+
+## Running All E2E Tests
+
+```bash
+../mvnw test
+```
+
+## Running Specific Test Module
+
+```bash
+cd tika-grpc
+../../mvnw test
+```
+
+## Why Standalone?
+
+The E2E tests are kept separate from the main build because they:
+
+- Have different build requirements (Docker, Testcontainers)
+- Take significantly longer to run than unit tests
+- Require external resources (test corpora, Docker images)
+- Can be run independently in CI/CD pipelines
+- Allow developers to run them selectively
+
+## Integration with CI/CD
+
+These tests can be integrated into the release pipeline as a separate step.
+
+## License
+
+Licensed under the Apache License, Version 2.0. See the main Tika LICENSE.txt file for details.
diff --git a/tika-e2e-tests/pom.xml b/tika-e2e-tests/pom.xml
new file mode 100644
index 0000000000..e87d0149c5
--- /dev/null
+++ b/tika-e2e-tests/pom.xml
@@ -0,0 +1,173 @@
+<?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.tika</groupId>
+        <artifactId>tika-parent</artifactId>
+        <version>${revision}</version>
+        <relativePath>../tika-parent/pom.xml</relativePath>
+    </parent>
+
+    <artifactId>tika-e2e-tests</artifactId>
+    <packaging>pom</packaging>
+    <name>Apache Tika End-to-End Tests</name>
+    <description>End-to-end integration tests for Apache Tika components</description>
+
+    <properties>
+        <maven.compiler.source>17</maven.compiler.source>
+        <maven.compiler.target>17</maven.compiler.target>
+        <maven.compiler.release>17</maven.compiler.release>
+        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+
+        <!-- Tika version -->
+        <tika.version>${revision}</tika.version>
+
+        <!-- Test dependencies -->
+        <junit.version>5.11.4</junit.version>
+        <testcontainers.version>2.0.3</testcontainers.version>
+
+        <!-- Logging -->
+        <slf4j.version>2.0.16</slf4j.version>
+        <log4j.version>2.25.3</log4j.version>
+
+        <!-- Other -->
+        <lombok.version>1.18.32</lombok.version>
+    </properties>
+
+    <modules>
+        <module>tika-grpc</module>
+    </modules>
+
+    <dependencyManagement>
+        <dependencies>
+            <!-- JUnit 5 -->
+            <dependency>
+                <groupId>org.junit.jupiter</groupId>
+                <artifactId>junit-jupiter-engine</artifactId>
+                <version>${junit.version}</version>
+                <scope>test</scope>
+            </dependency>
+            <dependency>
+                <groupId>org.junit.jupiter</groupId>
+                <artifactId>junit-jupiter-api</artifactId>
+                <version>${junit.version}</version>
+                <scope>test</scope>
+            </dependency>
+
+            <!-- Testcontainers -->
+            <dependency>
+                <groupId>org.testcontainers</groupId>
+                <artifactId>testcontainers</artifactId>
+                <version>${testcontainers.version}</version>
+                <scope>test</scope>
+            </dependency>
+            <dependency>
+                <groupId>org.testcontainers</groupId>
+                <artifactId>testcontainers-junit-jupiter</artifactId>
+                <version>${testcontainers.version}</version>
+                <scope>test</scope>
+            </dependency>
+
+            <!-- Logging -->
+            <dependency>
+                <groupId>org.apache.logging.log4j</groupId>
+                <artifactId>log4j-core</artifactId>
+                <version>${log4j.version}</version>
+            </dependency>
+            <dependency>
+                <groupId>org.apache.logging.log4j</groupId>
+                <artifactId>log4j-slf4j2-impl</artifactId>
+                <version>${log4j.version}</version>
+            </dependency>
+            <dependency>
+                <groupId>org.slf4j</groupId>
+                <artifactId>slf4j-api</artifactId>
+                <version>${slf4j.version}</version>
+            </dependency>
+
+            <!-- Jackson for JSON -->
+            <dependency>
+                <groupId>com.fasterxml.jackson.core</groupId>
+                <artifactId>jackson-databind</artifactId>
+                <version>${jackson.version}</version>
+            </dependency>
+
+            <!-- Micronaut version alignment for Ignite 3.x -->
+            <dependency>
+                <groupId>io.micronaut</groupId>
+                <artifactId>micronaut-validation</artifactId>
+                <version>3.10.4</version>
+            </dependency>
+            <dependency>
+                <groupId>io.micronaut</groupId>
+                <artifactId>micronaut-http</artifactId>
+                <version>3.10.4</version>
+            </dependency>
+
+            <!-- Lombok -->
+            <dependency>
+                <groupId>org.projectlombok</groupId>
+                <artifactId>lombok</artifactId>
+                <version>${lombok.version}</version>
+                <optional>true</optional>
+            </dependency>
+        </dependencies>
+    </dependencyManagement>
+
+    <build>
+        <pluginManagement>
+            <plugins>
+                <plugin>
+                    <groupId>org.apache.maven.plugins</groupId>
+                    <artifactId>maven-compiler-plugin</artifactId>
+                    <version>3.13.0</version>
+                    <configuration>
+                        <release>17</release>
+                    </configuration>
+                </plugin>
+                <plugin>
+                    <groupId>org.apache.maven.plugins</groupId>
+                    <artifactId>maven-surefire-plugin</artifactId>
+                    <version>3.5.2</version>
+                </plugin>
+            </plugins>
+        </pluginManagement>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.rat</groupId>
+                <artifactId>apache-rat-plugin</artifactId>
+                <configuration>
+                    <inputExcludes>
+                        <inputExclude>README.md</inputExclude>
+                        <inputExclude>**/README.md</inputExclude>
+                        <inputExclude>**/target/**</inputExclude>
+                        <inputExclude>**/.idea/**</inputExclude>
+                    </inputExcludes>
+                </configuration>
+            </plugin>
+        </plugins>
+    </build>
+</project>
diff --git a/tika-e2e-tests/tika-grpc/README.md b/tika-e2e-tests/tika-grpc/README.md
new file mode 100644
index 0000000000..2da3969e94
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/README.md
@@ -0,0 +1,84 @@
+# Tika gRPC End-to-End Tests
+
+End-to-end integration tests for Apache Tika gRPC Server.
+
+## Overview
+
+This test module validates the functionality of Apache Tika gRPC Server by:
+- Starting a local tika-grpc server using the Maven exec plugin (default)
+- Parsing small committed test fixture documents
+- Testing various fetchers (filesystem, Ignite config store, etc.)
+- Verifying parsing results and metadata extraction
+
+## Prerequisites
+
+- Java 17 or later
+- Maven 3.6 or later
+- Docker and Docker Compose (only required when using `tika.e2e.useLocalServer=false`)
+
+## Building
+
+```bash
+../../mvnw clean install
+```
+
+## Running Tests
+
+### Run all tests (default: local server mode, committed fixtures)
+
+```bash
+../../mvnw test
+```
+
+### Run specific test
+
+```bash
+../../mvnw test -Dtest=FileSystemFetcherTest
+../../mvnw test -Dtest=IgniteConfigStoreTest
+```
+
+### Test with the full GovDocs1 corpus (opt-in)
+
+By default tests use small committed fixture files. To run against the real GovDocs1 corpus, pass `-Dtika.e2e.useGovdocs=true` to trigger a download:
+
+```bash
+../../mvnw test -Dtika.e2e.useGovdocs=true
+```
+
+`govdocs1.fromIndex` and `govdocs1.toIndex` control which zip files are downloaded (default: zip 001 only). To fetch a wider range or cap the number of documents parsed:
+
+```bash
+../../mvnw test -Dtika.e2e.useGovdocs=true -Dgovdocs1.fromIndex=1 -Dgovdocs1.toIndex=5 -Dcorpus.numDocs=100
+```
+
+## Test Structure
+
+- `ExternalTestBase.java` - Base class for all tests
+  - Manages local server or Docker Compose containers
+  - Provides utility methods for gRPC communication
+
+- `filesystem/FileSystemFetcherTest.java` - Tests for filesystem fetcher
+  - Tests fetching and parsing files from local filesystem
+
+- `ignite/IgniteConfigStoreTest.java` - Tests for Ignite config store
+  - Tests configuration storage and retrieval via Ignite
+
+## Docker Mode
+
+To run against a Docker Compose deployment instead of a local server:
+
+```bash
+../../mvnw test -Dtika.e2e.useLocalServer=false -Dtika.docker.compose.file=/path/to/docker-compose.yml
+```
+
+The Docker image `apache/tika-grpc:local` can be built from the Tika root:
+
+```bash
+cd /path/to/tika
+./mvnw clean install -DskipTests
+# then follow tika-grpc Docker build instructions
+```
+
+## License
+
+Licensed under the Apache License, Version 2.0. See the main Tika LICENSE.txt file for details.
diff --git a/tika-e2e-tests/tika-grpc/pom.xml b/tika-e2e-tests/tika-grpc/pom.xml
new file mode 100644
index 0000000000..3d7c8ead4c
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/pom.xml
@@ -0,0 +1,178 @@
+<?xml version="1.0" encoding="UTF-8"?>
+
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <parent>
+        <groupId>org.apache.tika</groupId>
+        <artifactId>tika-e2e-tests</artifactId>
+        <version>${revision}</version>
+        <relativePath>../pom.xml</relativePath>
+    </parent>
+
+    <artifactId>tika-grpc-e2e-test</artifactId>
+    <name>Apache Tika gRPC End-to-End Tests</name>
+    <description>End-to-end tests for Apache Tika gRPC Server using test containers</description>
+
+    <properties>
+        <!-- Use local server mode by default in CI (faster, no Docker required) -->
+        <govdocs1.fromIndex>1</govdocs1.fromIndex>
+        <govdocs1.toIndex>1</govdocs1.toIndex>
+        <tika.e2e.useLocalServer>true</tika.e2e.useLocalServer>
+        <tika.e2e.useGovdocs>false</tika.e2e.useGovdocs>
+        <corpus.numDocs>2</corpus.numDocs>
+    </properties>
+
+    <dependencies>
+        <!-- Tika gRPC -->
+        <dependency>
+            <groupId>org.apache.tika</groupId>
+            <artifactId>tika-grpc</artifactId>
+            <version>${tika.version}</version>
+        </dependency>
+
+        <!-- Tika Fetchers -->
+        <dependency>
+            <groupId>org.apache.tika</groupId>
+            <artifactId>tika-pipes-file-system</artifactId>
+            <version>${tika.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.tika</groupId>
+            <artifactId>tika-pipes-core</artifactId>
+            <version>${tika.version}</version>
+        </dependency>
+
+        <!-- Jackson for JSON -->
+        <dependency>
+            <groupId>com.fasterxml.jackson.core</groupId>
+            <artifactId>jackson-databind</artifactId>
+        </dependency>
+
+        <!-- Lombok -->
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <optional>true</optional>
+        </dependency>
+
+        <!-- JUnit 5 -->
+        <dependency>
+            <groupId>org.junit.jupiter</groupId>
+            <artifactId>junit-jupiter-engine</artifactId>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.junit.jupiter</groupId>
+            <artifactId>junit-jupiter-api</artifactId>
+            <scope>test</scope>
+        </dependency>
+
+        <!-- Testcontainers -->
+        <dependency>
+            <groupId>org.testcontainers</groupId>
+            <artifactId>testcontainers</artifactId>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.testcontainers</groupId>
+            <artifactId>testcontainers-junit-jupiter</artifactId>
+            <scope>test</scope>
+        </dependency>
+
+        <!-- Logging -->
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-core</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.logging.log4j</groupId>
+            <artifactId>log4j-slf4j2-impl</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>
+        </dependency>
+        
+        <!-- Awaitility for robust waiting -->
+        <dependency>
+            <groupId>org.awaitility</groupId>
+            <artifactId>awaitility</artifactId>
+            <version>4.2.0</version>
+            <scope>test</scope>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-compiler-plugin</artifactId>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-surefire-plugin</artifactId>
+                <configuration>
+                    <includes>
+                        <include>**/*Test.java</include>
+                    </includes>
+                    <systemPropertyVariables>
+                        <govdocs1.fromIndex>${govdocs1.fromIndex}</govdocs1.fromIndex>
+                        <govdocs1.toIndex>${govdocs1.toIndex}</govdocs1.toIndex>
+                        <tika.e2e.useLocalServer>${tika.e2e.useLocalServer}</tika.e2e.useLocalServer>
+                        <tika.e2e.useGovdocs>${tika.e2e.useGovdocs}</tika.e2e.useGovdocs>
+                        <corpus.numDocs>${corpus.numDocs}</corpus.numDocs>
+                    </systemPropertyVariables>
+                </configuration>
+            </plugin>
+            <!-- Disable dependency convergence check for e2e tests -->
+            <!-- Ignite 3.x brings many transitive dependencies that conflict -->
+            <!-- but tests work fine in practice -->
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-enforcer-plugin</artifactId>
+                <executions>
+                    <execution>
+                        <id>enforce-maven</id>
+                        <phase>none</phase>
+                    </execution>
+                </executions>
+            </plugin>
+            <!-- Configure RAT to exclude files that don't need license headers -->
+            <plugin>
+                <groupId>org.apache.rat</groupId>
+                <artifactId>apache-rat-plugin</artifactId>
+                <configuration>
+                    <inputExcludes>
+                        <inputExclude>**/README*.md</inputExclude>
+                        <inputExclude>src/test/resources/docker-compose*.yml</inputExclude>
+                        <inputExclude>src/test/resources/log4j2.xml</inputExclude>
+                        <inputExclude>src/test/resources/tika-config*.json</inputExclude>
+                        <inputExclude>src/test/resources/test-fixtures/**</inputExclude>
+                    </inputExcludes>
+                </configuration>
+            </plugin>
+        </plugins>
+    </build>
+</project>
diff --git a/tika-e2e-tests/tika-grpc/sample-configs/ignite/tika-config-ignite.json b/tika-e2e-tests/tika-grpc/sample-configs/ignite/tika-config-ignite.json
new file mode 100644
index 0000000000..1262f8c549
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/sample-configs/ignite/tika-config-ignite.json
@@ -0,0 +1,24 @@
+{
+  "pipes": {
+    "configStoreType": "ignite",
+    "configStoreParams": "{\n      \"tableName\": \"tika-config-store\",\n      \"igniteInstanceName\": \"TikaIgniteCluster\",\n      \"replicas\": 2,\n      \"partitions\": 10,\n      \"autoClose\": true\n    }"
+  },
+  "fetchers": [
+    {
+      "id": "fs",
+      "name": "file-system",
+      "params": {
+        "basePath": "/data/input"
+      }
+    }
+  ],
+  "emitters": [
+    {
+      "id": "fs",
+      "name": "file-system",
+      "params": {
+        "basePath": "/data/output"
+      }
+    }
+  ]
+}
diff --git a/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ExternalTestBase.java b/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ExternalTestBase.java
new file mode 100644
index 0000000000..d0d135309d
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ExternalTestBase.java
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.io.OutputStream;
+import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.StandardCopyOption;
+import java.time.Duration;
+import java.time.temporal.ChronoUnit;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Locale;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+import java.util.stream.Stream;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipInputStream;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import io.grpc.ManagedChannel;
+import io.grpc.ManagedChannelBuilder;
+import lombok.extern.slf4j.Slf4j;
+import org.junit.jupiter.api.AfterAll;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Tag;
+import org.junit.jupiter.api.TestInstance;
+import org.testcontainers.containers.DockerComposeContainer;
+import org.testcontainers.containers.output.Slf4jLogConsumer;
+import org.testcontainers.containers.wait.strategy.Wait;
+import org.testcontainers.junit.jupiter.Testcontainers;
+
+import org.apache.tika.FetchAndParseReply;
+import org.apache.tika.ListFetchersRequest;
+import org.apache.tika.TikaGrpc;
+
+@TestInstance(TestInstance.Lifecycle.PER_CLASS)
+@Testcontainers
+@Slf4j
+@Tag("E2ETest")
+public abstract class ExternalTestBase {
+    public static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
+    public static final int MAX_STARTUP_TIMEOUT = 120;
+    public static final String GOV_DOCS_FOLDER = "/tika/govdocs1";
+    public static final File TEST_FOLDER = new File("target", "govdocs1");
+    public static final int GOV_DOCS_FROM_IDX = Integer.parseInt(System.getProperty("govdocs1.fromIndex", "1"));
+    public static final int GOV_DOCS_TO_IDX = Integer.parseInt(System.getProperty("govdocs1.toIndex", "1"));
+    public static final String DIGITAL_CORPORA_ZIP_FILES_URL = "https://corp.digitalcorpora.org/corpora/files/govdocs1/zipfiles";
+    private static final boolean USE_LOCAL_SERVER = Boolean.parseBoolean(System.getProperty("tika.e2e.useLocalServer", "true"));
+    private static final int GRPC_PORT = Integer.parseInt(System.getProperty("tika.e2e.grpcPort", "50052"));
+    
+    public static DockerComposeContainer<?> composeContainer;
+    private static Process localGrpcProcess;
+
+    @BeforeAll
+    static void setup() throws Exception {
+        loadGovdocs1();
+        
+        if (USE_LOCAL_SERVER) {
+            startLocalGrpcServer();
+        } else {
+            startDockerGrpcServer();
+        }
+    }
+    
+    private static void startLocalGrpcServer() throws Exception {
+        log.info("Starting local tika-grpc server using Maven exec");
+        
+        Path tikaGrpcDir = findTikaGrpcDirectory();
+        Path configFile = Path.of("src/test/resources/tika-config.json").toAbsolutePath();
+        
+        if (!Files.exists(configFile)) {
+            throw new IllegalStateException("Config file not found: " + configFile);
+        }
+        
+        log.info("Using tika-grpc from: {}", tikaGrpcDir);
+        log.info("Using config file: {}", configFile);
+        
+        String javaHome = System.getProperty("java.home");
+        boolean isWindows = System.getProperty("os.name").toLowerCase(Locale.ROOT).contains("win");
+        String javaCmd = javaHome + (isWindows ? "\\bin\\java.exe" : "/bin/java");
+        String mvnCmd = tikaGrpcDir.getParent().resolve(isWindows ? "mvnw.cmd" : "mvnw").toString();
+        
+        ProcessBuilder pb = new ProcessBuilder(
+            mvnCmd,
+            "exec:exec",
+            "-Dexec.executable=" + javaCmd,
+            "-Dexec.args=" +
+                "--add-opens=java.base/java.lang=ALL-UNNAMED " +
+                "--add-opens=java.base/java.nio=ALL-UNNAMED " +
+                "--add-opens=java.base/java.util=ALL-UNNAMED " +
+                "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED " +
+                "-classpath %classpath " +
+                "org.apache.tika.pipes.grpc.TikaGrpcServer " +
+                "-c \"" + configFile + "\" " +
+                "-p " + GRPC_PORT
+        );
+        
+        pb.directory(tikaGrpcDir.toFile());
+        pb.redirectErrorStream(true);
+        pb.redirectOutput(ProcessBuilder.Redirect.PIPE);
+        
+        localGrpcProcess = pb.start();
+        
+        Thread logThread = new Thread(() -> {
+            try (BufferedReader reader = new BufferedReader(
+                    new InputStreamReader(localGrpcProcess.getInputStream(), StandardCharsets.UTF_8))) {
+                String line;
+                while ((line = reader.readLine()) != null) {
+                    log.info("tika-grpc: {}", line);
+                }
+            } catch (IOException e) {
+                log.error("Error reading server output", e);
+            }
+        });
+        logThread.setDaemon(true);
+        logThread.start();
+        
+        waitForServerReady();
+        
+        log.info("Local tika-grpc server started successfully on port {}", GRPC_PORT);
+    }
+    
+    private static Path findTikaGrpcDirectory() {
+        Path currentDir = Path.of("").toAbsolutePath();
+        Path tikaRootDir = currentDir;
+        
+        while (tikaRootDir != null && 
+               !(Files.exists(tikaRootDir.resolve("tika-grpc")) && 
+                 Files.exists(tikaRootDir.resolve("tika-e2e-tests")))) {
+            tikaRootDir = tikaRootDir.getParent();
+        }
+        
+        if (tikaRootDir == null) {
+            throw new IllegalStateException("Cannot find tika root directory. " +
+                "Current dir: " + currentDir);
+        }
+        
+        return tikaRootDir.resolve("tika-grpc");
+    }
+    
+    private static void waitForServerReady() throws Exception {
+        int maxAttempts = 60;
+        for (int i = 0; i < maxAttempts; i++) {
+            ManagedChannel testChannel = ManagedChannelBuilder
+                    .forAddress("localhost", GRPC_PORT)
+                    .usePlaintext()
+                    .build();
+            try {
+                TikaGrpc.TikaBlockingStub stub = TikaGrpc.newBlockingStub(testChannel);
+                stub.listFetchers(ListFetchersRequest.newBuilder().build());
+                log.info("gRPC server is ready");
+                return;
+            } catch (Exception e) {
+                log.trace("gRPC server not ready yet (attempt {}/{}): {}", i + 1, maxAttempts, e.getMessage());
+            } finally {
+                testChannel.shutdown();
+                testChannel.awaitTermination(1, TimeUnit.SECONDS);
+            }
+            TimeUnit.SECONDS.sleep(1);
+        }
+
+        if (localGrpcProcess != null && localGrpcProcess.isAlive()) {
+            localGrpcProcess.destroyForcibly();
+        }
+        throw new RuntimeException("Local gRPC server failed to start within timeout");
+    }
+    
+    private static void startDockerGrpcServer() {
+        log.info("Starting Docker Compose tika-grpc server");
+        
+        String composeFilePath = System.getProperty("tika.docker.compose.file");
+        if (composeFilePath == null || composeFilePath.isBlank()) {
+            throw new IllegalStateException(
+                    "Docker Compose mode requires system property 'tika.docker.compose.file' " +
+                    "pointing to a valid docker-compose.yml file.");
+        }
+        File composeFile = new File(composeFilePath);
+        if (!composeFile.isFile()) {
+            throw new IllegalStateException("Docker Compose file not found: " + composeFile.getAbsolutePath());
+        }
+        composeContainer = new DockerComposeContainer<>(composeFile)
+                .withEnv("HOST_GOVDOCS1_DIR", TEST_FOLDER.getAbsolutePath())
+                .withStartupTimeout(Duration.of(MAX_STARTUP_TIMEOUT, ChronoUnit.SECONDS))
+                .withExposedService("tika-grpc", 50052, 
+                    Wait.forLogMessage(".*Server started.*\\n", 1))
+                .withLogConsumer("tika-grpc", new Slf4jLogConsumer(log));
+        
+        composeContainer.start();
+        
+        log.info("Docker Compose containers started successfully");
+    }
+
+    private static void loadGovdocs1() throws IOException, InterruptedException {
+        if (Boolean.parseBoolean(System.getProperty("tika.e2e.useGovdocs", "false"))) {
+            // Opt-in: download the actual GovDocs1 corpus when explicitly requested via -Dtika.e2e.useGovdocs=true.
+            // Default CI runs use committed test fixtures to avoid any network dependency.
+            int retries = 3;
+            int attempt = 0;
+            while (true) {
+                try {
+                    downloadAndUnzipGovdocs1(GOV_DOCS_FROM_IDX, GOV_DOCS_TO_IDX);
+                    break;
+                } catch (IOException e) {
+                    attempt++;
+                    if (attempt >= retries) {
+                        throw e;
+                    }
+                    log.warn("Download attempt {} failed, retrying in 10 seconds...", attempt, e);
+                    TimeUnit.SECONDS.sleep(10);
+                }
+            }
+        } else {
+            copyTestFixtures();
+        }
+    }
+
+    public static void copyTestFixtures() throws IOException {
+        Path targetDir = TEST_FOLDER.toPath();
+        Files.createDirectories(targetDir);
+        String[] fixtures = {"sample.txt", "sample.html", "sample.csv", "sample.xml"};
+        for (String fixture : fixtures) {
+            URL resource = ExternalTestBase.class.getClassLoader()
+                    .getResource("test-fixtures/" + fixture);
+            if (resource == null) {
+                throw new IllegalStateException("Test fixture not found: test-fixtures/" + fixture);
+            }
+            try (InputStream in = resource.openStream()) {
+                Files.copy(in, targetDir.resolve(fixture), StandardCopyOption.REPLACE_EXISTING);
+            }
+        }
+        log.info("Copied {} test fixtures to {}", fixtures.length, targetDir);
+    }
+
+    @AfterAll
+    void close() {
+        if (USE_LOCAL_SERVER && localGrpcProcess != null) {
+            log.info("Stopping local gRPC server");
+            localGrpcProcess.destroy();
+            try {
+                if (!localGrpcProcess.waitFor(10, TimeUnit.SECONDS)) {
+                    localGrpcProcess.destroyForcibly();
+                }
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt();
+                localGrpcProcess.destroyForcibly();
+            }
+        } else if (composeContainer != null) {
+            composeContainer.close();
+        }
+    }
+
+    public static void downloadAndUnzipGovdocs1(int fromIndex, int toIndex) throws IOException {
+        Path targetDir = TEST_FOLDER.toPath();
+        Files.createDirectories(targetDir);
+
+        for (int i = fromIndex; i <= toIndex; i++) {
+            String zipName = String.format(java.util.Locale.ROOT, "%03d.zip", i);
+            String url = DIGITAL_CORPORA_ZIP_FILES_URL + "/" + zipName;
+            Path zipPath = targetDir.resolve(zipName);
+
+            if (Files.exists(zipPath)) {
+                log.info("{} already exists, skipping download", zipName);
+            } else {
+                log.info("Downloading {} from {}...", zipName, url);
+                try (InputStream in = new URL(url).openStream()) {
+                    Files.copy(in, zipPath, StandardCopyOption.REPLACE_EXISTING);
+                }
+            }
+            log.info("Unzipping {}...", zipName);
+            try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipPath.toFile()))) {
+                ZipEntry entry;
+                while ((entry = zis.getNextEntry()) != null) {
+                    Path outPath = targetDir.resolve(entry.getName());
+                    if (entry.isDirectory()) {
+                        Files.createDirectories(outPath);
+                    } else {
+                        Files.createDirectories(outPath.getParent());
+                        try (OutputStream out = Files.newOutputStream(outPath)) {
+                            zis.transferTo(out);
+                        }
+                    }
+                    zis.closeEntry();
+                }
+            }
+        }
+        
+        log.info("Finished downloading and extracting govdocs1 files");
+    }
+
+    public static void assertAllFilesFetched(Path baseDir, List<FetchAndParseReply> successes,
+                                            List<FetchAndParseReply> errors) {
+        Set<String> allFetchKeys = new HashSet<>();
+        for (FetchAndParseReply reply : successes) {
+            allFetchKeys.add(reply.getFetchKey());
+        }
+        for (FetchAndParseReply reply : errors) {
+            allFetchKeys.add(reply.getFetchKey());
+        }
+        
+        Set<String> keysFromGovdocs1 = new HashSet<>();
+        try (Stream<Path> paths = Files.walk(baseDir)) {
+            paths.filter(Files::isRegularFile)
+                    .forEach(file -> {
+                        String relPath = baseDir.relativize(file).toString();
+                        if (Pattern.compile("\\d{3}\\.zip").matcher(relPath).find()) {
+                            return;
+                        }
+                        keysFromGovdocs1.add(relPath);
+                    });
+        } catch (IOException e) {
+            throw new RuntimeException(e);
+        }
+        
+        Assertions.assertNotEquals(0, successes.size(), "Should have some successful fetches");
+        log.info("Processed {} files: {} successes, {} errors", allFetchKeys.size(), successes.size(), errors.size());
+        Assertions.assertEquals(keysFromGovdocs1, allFetchKeys, () -> {
+            Set<String> missing = new HashSet<>(keysFromGovdocs1);
+            missing.removeAll(allFetchKeys);
+            return "Missing fetch keys: " + missing;
+        });
+    }
+
+    public static ManagedChannel getManagedChannel() {
+        if (USE_LOCAL_SERVER) {
+            return ManagedChannelBuilder
+                    .forAddress("localhost", GRPC_PORT)
+                    .usePlaintext()
+                    .maxInboundMessageSize(160 * 1024 * 1024)
+                    .build();
+        } else {
+            return ManagedChannelBuilder
+                    .forAddress(composeContainer.getServiceHost("tika-grpc", 50052),
+                               composeContainer.getServicePort("tika-grpc", 50052))
+                    .usePlaintext()
+                    .maxInboundMessageSize(160 * 1024 * 1024)
+                    .build();
+        }
+    }
+}
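Reviewer note on downloadAndUnzipGovdocs1 above: the extraction loop resolves entry.getName() directly against targetDir, so an entry name containing ".." could in principle land outside target/govdocs1. The following is a minimal sketch of one way to guard that resolution, not part of this commit. The class and method names (SafeZipPaths, resolveInside) are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper, not part of the Tika code base.
public final class SafeZipPaths {
    private SafeZipPaths() {
    }

    /**
     * Resolves entryName under targetDir and rejects any name that would
     * normalize to a location outside targetDir (for example "../x" or an
     * absolute path).
     */
    public static Path resolveInside(Path targetDir, String entryName) throws IOException {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IOException("Zip entry escapes target directory: " + entryName);
        }
        return resolved;
    }
}
```

Under that assumption, the extraction loop would call SafeZipPaths.resolveInside(targetDir, entry.getName()) in place of targetDir.resolve(entry.getName()). Whether this matters for a test-only download from a fixed corpus URL is a judgment call.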
diff --git a/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/filesystem/FileSystemFetcherTest.java b/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/filesystem/FileSystemFetcherTest.java
new file mode 100644
index 0000000000..9947763e08
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/filesystem/FileSystemFetcherTest.java
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.filesystem;
+
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Stream;
+
+import io.grpc.ManagedChannel;
+import io.grpc.stub.StreamObserver;
+import lombok.extern.slf4j.Slf4j;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.condition.DisabledOnOs;
+import org.junit.jupiter.api.condition.OS;
+
+import org.apache.tika.FetchAndParseReply;
+import org.apache.tika.FetchAndParseRequest;
+import org.apache.tika.SaveFetcherReply;
+import org.apache.tika.SaveFetcherRequest;
+import org.apache.tika.TikaGrpc;
+import org.apache.tika.pipes.ExternalTestBase;
+import org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig;
+
+@Slf4j
+@DisabledOnOs(value = OS.WINDOWS, disabledReason = "exec:exec classpath exceeds Windows CreateProcess command-line length limit")
+class FileSystemFetcherTest extends ExternalTestBase {
+    
+    @Test
+    void testFileSystemFetcher() throws Exception {
+        String fetcherId = "defaultFetcher";
+        ManagedChannel channel = getManagedChannel();
+        try {
+        TikaGrpc.TikaBlockingStub blockingStub = TikaGrpc.newBlockingStub(channel);
+        TikaGrpc.TikaStub tikaStub = TikaGrpc.newStub(channel);
+
+        FileSystemFetcherConfig config = new FileSystemFetcherConfig();
+        boolean useLocalServer = Boolean.parseBoolean(System.getProperty("tika.e2e.useLocalServer", "true"));
+        String basePath = useLocalServer ? TEST_FOLDER.getAbsolutePath() : GOV_DOCS_FOLDER;
+        config.setBasePath(basePath);
+        
+        String configJson = OBJECT_MAPPER.writeValueAsString(config);
+        log.info("Creating fetcher with config (basePath={}): {}", basePath, configJson);
+        
+        SaveFetcherReply saveReply = blockingStub.saveFetcher(SaveFetcherRequest
+                .newBuilder()
+                .setFetcherId(fetcherId)
+                .setFetcherClass("org.apache.tika.pipes.fetcher.fs.FileSystemFetcher")
+                .setFetcherConfigJson(configJson)
+                .build());
+        
+        log.info("Fetcher created: {}", saveReply.getFetcherId());
+
+        List<FetchAndParseReply> successes = Collections.synchronizedList(new ArrayList<>());
+        List<FetchAndParseReply> errors = Collections.synchronizedList(new ArrayList<>());
+
+        CountDownLatch countDownLatch = new CountDownLatch(1);
+        StreamObserver<FetchAndParseRequest>
+                requestStreamObserver = tikaStub.fetchAndParseBiDirectionalStreaming(new StreamObserver<>() {
+            @Override
+            public void onNext(FetchAndParseReply fetchAndParseReply) {
+                log.debug("Reply from fetch-and-parse - key={}, status={}", 
+                    fetchAndParseReply.getFetchKey(), fetchAndParseReply.getStatus());
+                if ("FETCH_AND_PARSE_EXCEPTION".equals(fetchAndParseReply.getStatus())) {
+                    errors.add(fetchAndParseReply);
+                } else {
+                    successes.add(fetchAndParseReply);
+                }
+            }
+
+            @Override
+            public void onError(Throwable throwable) {
+                log.error("Received an error", throwable);
+                Assertions.fail(throwable);
+                countDownLatch.countDown();
+            }
+
+            @Override
+            public void onCompleted() {
+                log.info("Finished streaming fetch and parse replies");
+                countDownLatch.countDown();
+            }
+        });
+
+        int maxDocs = Integer.parseInt(System.getProperty("corpus.numDocs", "-1"));
+        try (Stream<Path> paths = Files.walk(TEST_FOLDER.toPath())) {
+            Stream<Path> fileStream = paths
+                    .filter(Files::isRegularFile)
+                    .filter(p -> !p.toString().endsWith(".zip"));
+            if (maxDocs > 0) {
+                fileStream = fileStream.limit(maxDocs);
+            }
+            fileStream.forEach(file -> {
+                        try {
+                            String relPath = TEST_FOLDER.toPath().relativize(file).toString();
+                            requestStreamObserver.onNext(FetchAndParseRequest
+                                    .newBuilder()
+                                    .setFetcherId(fetcherId)
+                                    .setFetchKey(relPath)
+                                    .build());
+                        } catch (Exception e) {
+                            throw new RuntimeException(e);
+                        }
+                    });
+        }
+        log.info("Done submitting files to fetcher {}", fetcherId);
+
+        requestStreamObserver.onCompleted();
+
+        try {
+            if (!countDownLatch.await(3, TimeUnit.MINUTES)) {
+                log.error("Timed out waiting for parse to complete");
+                Assertions.fail("Timed out waiting for parsing to complete");
+            }
+        } catch (InterruptedException e) {
+            Thread.currentThread().interrupt();
+            Assertions.fail("Interrupted while waiting for parsing to complete");
+        }
+        
+        if (maxDocs == -1) {
+            assertAllFilesFetched(TEST_FOLDER.toPath(), successes, errors);
+        } else {
+            int totalProcessed = successes.size() + errors.size();
+            log.info("Processed {} documents (limit was {})", totalProcessed, maxDocs);
+            Assertions.assertTrue(totalProcessed <= maxDocs, 
+                "Should not process more than " + maxDocs + " documents");
+            Assertions.assertTrue(totalProcessed > 0, 
+                "Should have processed at least one document");
+        }
+        
+        log.info("Test completed successfully - {} successes, {} errors", 
+            successes.size(), errors.size());
+        } finally {
+            channel.shutdown();
+            try {
+                if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
+                    channel.shutdownNow();
+                }
+            } catch (InterruptedException e) {
+                channel.shutdownNow();
+                Thread.currentThread().interrupt();
+            }
+        }
+    }
+}
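Reviewer note on the StreamObserver in testFileSystemFetcher above: in onError, Assertions.fail(throwable) throws before countDownLatch.countDown() on the next line can run, and it throws on the gRPC callback thread rather than the test thread. As far as can be told from the code, a stream error would therefore surface only as the 3-minute timeout. The following is a minimal sketch of one alternative that records the failure and always releases the latch, not part of this commit. The class name AsyncOutcome and its methods are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper, not part of the Tika code base.
public final class AsyncOutcome {
    private final CountDownLatch done = new CountDownLatch(1);
    private final AtomicReference<Throwable> failure = new AtomicReference<>();

    /** Call from onCompleted(): releases the waiting thread. */
    public void completed() {
        done.countDown();
    }

    /** Call from onError(): remembers the first error, then releases the waiter. */
    public void failed(Throwable t) {
        failure.compareAndSet(null, t);
        done.countDown();
    }

    /**
     * Blocks until completed() or failed() has been called.
     * Returns the recorded error, or null if the stream completed normally.
     * Throws IllegalStateException if neither happened within the timeout.
     */
    public Throwable await(long timeout, TimeUnit unit) throws InterruptedException {
        if (!done.await(timeout, unit)) {
            throw new IllegalStateException("Timed out waiting for stream to finish");
        }
        return failure.get();
    }
}
```

With such a helper, the test thread would call await(3, TimeUnit.MINUTES) and pass any non-null result to Assertions.fail itself, so the assertion failure is raised on the thread JUnit is watching.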
diff --git a/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java b/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java
new file mode 100644
index 0000000000..fb986362a9
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/java/org/apache/tika/pipes/ignite/IgniteConfigStoreTest.java
@@ -0,0 +1,533 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.tika.pipes.ignite;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.time.Duration;
+import java.time.temporal.ChronoUnit;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Locale;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Stream;
+
+import io.grpc.ManagedChannel;
+import io.grpc.ManagedChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import lombok.extern.slf4j.Slf4j;
+import org.junit.jupiter.api.AfterAll;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Tag;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.TestInstance;
+import org.junit.jupiter.api.condition.DisabledOnOs;
+import org.junit.jupiter.api.condition.OS;
+import org.testcontainers.containers.DockerComposeContainer;
+import org.testcontainers.containers.output.Slf4jLogConsumer;
+import org.testcontainers.containers.wait.strategy.Wait;
+import org.testcontainers.junit.jupiter.Testcontainers;
+
+import org.apache.tika.FetchAndParseReply;
+import org.apache.tika.FetchAndParseRequest;
+import org.apache.tika.SaveFetcherReply;
+import org.apache.tika.SaveFetcherRequest;
+import org.apache.tika.TikaGrpc;
+import org.apache.tika.pipes.ExternalTestBase;
+import org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig;
+
+@TestInstance(TestInstance.Lifecycle.PER_CLASS)
+@Testcontainers
+@Slf4j
+@Tag("E2ETest")
+@DisabledOnOs(value = OS.WINDOWS, disabledReason = "Windows classpath length limit (CreateProcess error=206) exceeded by exec:exec with full Tika classpath")
+class IgniteConfigStoreTest {
+    
+    private static final int MAX_STARTUP_TIMEOUT = ExternalTestBase.MAX_STARTUP_TIMEOUT;
+    private static final File TEST_FOLDER = ExternalTestBase.TEST_FOLDER;
+    private static final boolean USE_LOCAL_SERVER = Boolean.parseBoolean(System.getProperty("tika.e2e.useLocalServer", "true"));
+    private static final int GRPC_PORT = Integer.parseInt(System.getProperty("tika.e2e.grpcPort", "50052"));
+    
+    private static DockerComposeContainer<?> igniteComposeContainer;
+    private static Process localGrpcProcess;
+    
+    @BeforeAll
+    static void setupIgnite() throws Exception {
+        if (USE_LOCAL_SERVER) {
+            try {
+                killProcessOnPort(GRPC_PORT);
+                killProcessOnPort(3344);
+                killProcessOnPort(10800);
+            } catch (Exception e) {
+                log.debug("No orphaned processes to clean up");
+            }
+        }
+        
+        if (!hasExtractedFiles(TEST_FOLDER)) {
+            if (Boolean.parseBoolean(System.getProperty("tika.e2e.useGovdocs", "false"))) {
+                ExternalTestBase.downloadAndUnzipGovdocs1(ExternalTestBase.GOV_DOCS_FROM_IDX, ExternalTestBase.GOV_DOCS_TO_IDX);
+            } else {
+                ExternalTestBase.copyTestFixtures();
+            }
+        }
+        
+        if (USE_LOCAL_SERVER) {
+            startLocalGrpcServer();
+        } else {
+            startDockerGrpcServer();
+        }
+    }
+    
+    /** Returns true only if the folder contains at least one non-zip extracted file. */
+    private static boolean hasExtractedFiles(File folder) {
+        if (!folder.exists()) {
+            return false;
+        }
+        File[] files = folder.listFiles(f -> f.isFile() && !f.getName().endsWith(".zip"));
+        return files != null && files.length > 0;
+    }
+
+    private static void startLocalGrpcServer() throws Exception {
+        log.info("Starting local tika-grpc server using Maven");
+        
+        Path currentDir = Path.of("").toAbsolutePath();
+        Path tikaRootDir = currentDir;
+        
+        while (tikaRootDir != null && 
+               !(Files.exists(tikaRootDir.resolve("tika-grpc")) && 
+                 Files.exists(tikaRootDir.resolve("tika-e2e-tests")))) {
+            tikaRootDir = tikaRootDir.getParent();
+        }
+        
+        if (tikaRootDir == null) {
+            throw new IllegalStateException("Cannot find tika root directory. " +
+                "Current dir: " + currentDir + ". " +
+                "Please run from within the tika project.");
+        }
+        
+        Path tikaGrpcDir = tikaRootDir.resolve("tika-grpc");
+        if (!Files.exists(tikaGrpcDir)) {
+            throw new IllegalStateException("Cannot find tika-grpc directory at: " + tikaGrpcDir);
+        }
+        
+        String configFileName = "tika-config-ignite-local.json";
+        Path configFile = Path.of("src/test/resources/" + configFileName).toAbsolutePath();
+        
+        if (!Files.exists(configFile)) {
+            throw new IllegalStateException("Config file not found: " + configFile);
+        }
+        
+        log.info("Tika root: {}", tikaRootDir);
+        log.info("Using tika-grpc from: {}", tikaGrpcDir);
+        log.info("Using config file: {}", configFile);
+        
+        // Use mvn exec:exec to run as external process (not exec:java which breaks ServiceLoader)
+        String javaHome = System.getProperty("java.home");
+        boolean isWindows = System.getProperty("os.name").toLowerCase(Locale.ROOT).contains("win");
+        String javaCmd = javaHome + (isWindows ? "\\bin\\java.exe" : "/bin/java");
+        String mvnCmd = tikaRootDir.resolve(isWindows ? "mvnw.cmd" : "mvnw").toString();
+        
+        ProcessBuilder pb = new ProcessBuilder(
+            mvnCmd,
+            "exec:exec",
+            "-Dexec.executable=" + javaCmd,
+            "-Dexec.args=" +
+                "--add-opens=java.base/java.lang=ALL-UNNAMED " +
+                "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED " +
+                "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED " +
+                "--add-opens=java.base/java.io=ALL-UNNAMED " +
+                "--add-opens=java.base/java.nio=ALL-UNNAMED " +
+                "--add-opens=java.base/java.math=ALL-UNNAMED " +
+                "--add-opens=java.base/java.util=ALL-UNNAMED " +
+                "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED " +
+                "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED " +
+                "--add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED " +
+                "--add-opens=java.base/java.time=ALL-UNNAMED " +
+                "--add-opens=java.base/jdk.internal.misc=ALL-UNNAMED " +
+                "--add-opens=java.base/jdk.internal.access=ALL-UNNAMED " +
+                "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED " +
+                "--add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED " +
+                "--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED " +
+                "-Dio.netty.tryReflectionSetAccessible=true " +
+                "-Dignite.work.dir=\"" + tikaGrpcDir.resolve("target/ignite-work") + "\" " +
+                "-classpath %classpath " +
+                "org.apache.tika.pipes.grpc.TikaGrpcServer " +
+                "-c \"" + configFile + "\" " +
+                "-p " + GRPC_PORT
+        );
+        
+        pb.directory(tikaGrpcDir.toFile());
+        pb.redirectErrorStream(true);
+        pb.redirectOutput(ProcessBuilder.Redirect.PIPE);
+        
+        localGrpcProcess = pb.start();
+        
+        final boolean[] igniteStarted = {false};
+        
+        Thread logThread = new Thread(() -> {
+            try (java.io.BufferedReader reader = new java.io.BufferedReader(
+                    new java.io.InputStreamReader(localGrpcProcess.getInputStream(), java.nio.charset.StandardCharsets.UTF_8))) {
+                String line;
+                while ((line = reader.readLine()) != null) {
+                    log.info("tika-grpc: {}", line);
+                    
+                    if (line.contains("Ignite server started") ||
+                        (line.contains("Table") && line.contains("created successfully")) ||
+                        line.contains("Server started, listening on")) {
+                        synchronized (igniteStarted) {
+                            igniteStarted[0] = true;
+                            igniteStarted.notifyAll();
+                        }
+                    }
+                }
+            } catch (IOException e) {
+                log.error("Error reading server output", e);
+            }
+        });
+        logThread.setDaemon(true);
+        logThread.start();
+        
+        try {
+            org.awaitility.Awaitility.await()
+                .atMost(java.time.Duration.ofSeconds(180))
+                .pollInterval(java.time.Duration.ofSeconds(2))
+                .until(() -> {
+                    boolean igniteReady;
+                    synchronized (igniteStarted) {
+                        igniteReady = igniteStarted[0];
+                    }
+                    
+                    if (!igniteReady) {
+                        log.debug("Waiting for Ignite to start...");
+                        return false;
+                    }
+                    
+                    try {
+                        ManagedChannel testChannel = ManagedChannelBuilder
+                            .forAddress("localhost", GRPC_PORT)
+                            .usePlaintext()
+                            .build();
+                        
+                        try {
+                            io.grpc.health.v1.HealthGrpc.HealthBlockingStub healthStub =
+                                io.grpc.health.v1.HealthGrpc.newBlockingStub(testChannel)
+                                    .withDeadlineAfter(2, TimeUnit.SECONDS);
+                            
+                            io.grpc.health.v1.HealthCheckResponse response = healthStub.check(
+                                io.grpc.health.v1.HealthCheckRequest.getDefaultInstance());
+                            
+                            boolean serving = response.getStatus() ==
+                                io.grpc.health.v1.HealthCheckResponse.ServingStatus.SERVING;
+                            
+                            if (serving) {
+                                log.info("gRPC server is healthy and serving!");
+                                return true;
+                            } else {
+                                log.debug("gRPC server responding but not serving yet: {}", response.getStatus());
+                                return false;
+                            }
+                        } finally {
+                            testChannel.shutdown();
+                            testChannel.awaitTermination(1, TimeUnit.SECONDS);
+                        }
+                    } catch (io.grpc.StatusRuntimeException e) {
+                        if (e.getStatus().getCode() == io.grpc.Status.Code.UNIMPLEMENTED) {
+                            // Health check not implemented, just verify channel works
+                            log.info("Health check not available, assuming server is ready");
+                            return true;
+                        }
+                        log.debug("gRPC server not ready yet: {}", e.getMessage());
+                        return false;
+                    } catch (Exception e) {
+                        log.debug("gRPC server not ready yet: {}", e.getMessage());
+                        return false;
+                    }
+                });
+            
+            log.info("Both gRPC server and Ignite are ready!");
+        } catch (org.awaitility.core.ConditionTimeoutException e) {
+            if (localGrpcProcess.isAlive()) {
+                localGrpcProcess.destroyForcibly();
+            }
+            throw new RuntimeException("Local gRPC server or Ignite failed to start within timeout", e);
+        }
+        
+        log.info("Local tika-grpc server started successfully on port {}", GRPC_PORT);
+    }
+    
+    
+    private static void startDockerGrpcServer() {
+        String composeFilePath = System.getProperty("tika.docker.compose.ignite.file");
+        if (composeFilePath == null || composeFilePath.isBlank()) {
+            throw new IllegalStateException(
+                    "Docker Compose mode requires system property 'tika.docker.compose.ignite.file' " +
+                    "pointing to a valid docker-compose-ignite.yml file.");
+        }
+        File composeFile = new File(composeFilePath);
+        if (!composeFile.isFile()) {
+            throw new IllegalStateException("Docker Compose file not found: " + composeFile.getAbsolutePath());
+        }
+        igniteComposeContainer = new DockerComposeContainer<>(composeFile)
+                .withEnv("HOST_GOVDOCS1_DIR", TEST_FOLDER.getAbsolutePath())
+                .withStartupTimeout(Duration.of(MAX_STARTUP_TIMEOUT, ChronoUnit.SECONDS))
+                .withExposedService("tika-grpc", 50052,
+                    Wait.forLogMessage(".*Server started.*\\n", 1))
+                .withLogConsumer("tika-grpc", new Slf4jLogConsumer(log));
+        
+        igniteComposeContainer.start();
+    }
+    
+    @AfterAll
+    static void teardownIgnite() {
+        if (USE_LOCAL_SERVER && localGrpcProcess != null) {
+            log.info("Stopping local gRPC server and all child processes");
+            
+            try {
+                long mvnPid = localGrpcProcess.pid();
+                log.info("Maven process PID: {}", mvnPid);
+                localGrpcProcess.destroy();
+                
+                if (!localGrpcProcess.waitFor(10, TimeUnit.SECONDS)) {
+                    log.warn("Process didn't stop gracefully, forcing shutdown");
+                    localGrpcProcess.destroyForcibly();
+                    localGrpcProcess.waitFor(5, TimeUnit.SECONDS);
+                }
+                
+                Thread.sleep(2000);
+                
+                try {
+                    killProcessOnPort(GRPC_PORT);
+                    killProcessOnPort(3344);
+                    killProcessOnPort(10800);
+                } catch (Exception e) {
+                    log.debug("Error killing processes on ports (may already be stopped): {}", e.getMessage());
+                }
+                
+                log.info("Local gRPC server stopped");
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt();
+                localGrpcProcess.destroyForcibly();
+            }
+        } else if (igniteComposeContainer != null) {
+            igniteComposeContainer.close();
+        }
+    }
+    
+    private static void killProcessOnPort(int port) throws IOException, InterruptedException {
+        ProcessBuilder findPb = new ProcessBuilder("lsof", "-ti", ":" + port);
+        findPb.redirectErrorStream(true);
+        Process findProcess = findPb.start();
+        
+        try (java.io.BufferedReader reader = new java.io.BufferedReader(
+                new java.io.InputStreamReader(findProcess.getInputStream(), java.nio.charset.StandardCharsets.UTF_8))) {
+            String pidStr = reader.readLine();
+            if (pidStr != null && !pidStr.trim().isEmpty()) {
+                long pid = Long.parseLong(pidStr.trim());
+                long myPid = ProcessHandle.current().pid();
+                
+                if (pid == myPid || isParentProcess(pid)) {
+                    log.debug("Skipping kill of PID {} on port {} (test process or parent)", pid, port);
+                    return;
+                }
+                
+                // Only kill processes we can identify as tika-grpc or Ignite instances to avoid
+                // accidentally killing unrelated processes that happen to be on the same port.
+                String cmdLine = ProcessHandle.of(pid)
+                        .flatMap(h -> h.info().commandLine())
+                        .orElse("");
+                if (!cmdLine.contains("tika") && !cmdLine.contains("TikaGrpc") && !cmdLine.contains("ignite")) {
+                    log.debug("Skipping kill of PID {} on port {} — not a tika/ignite process: {}", pid, port, cmdLine);
+                    return;
+                }
+                
+                log.info("Found tika/ignite process {} on port {}, killing it", pid, port);
+                
+                ProcessBuilder killPb = new ProcessBuilder("kill", String.valueOf(pid));
+                Process killProcess = killPb.start();
+                killProcess.waitFor(2, TimeUnit.SECONDS);
+                
+                Thread.sleep(1000);
+                ProcessBuilder forceKillPb = new ProcessBuilder("kill", "-9", String.valueOf(pid));
+                Process forceKillProcess = forceKillPb.start();
+                forceKillProcess.waitFor(2, TimeUnit.SECONDS);
+            }
+        }
+        
+        findProcess.waitFor(2, TimeUnit.SECONDS);
+    }
+    
+    private static boolean isParentProcess(long pid) {
+        try {
+            ProcessHandle current = ProcessHandle.current();
+            while (current.parent().isPresent()) {
+                current = current.parent().get();
+                if (current.pid() == pid) {
+                    return true;
+                }
+            }
+        } catch (Exception e) {
+            log.debug("Error checking parent process", e);
+        }
+        return false;
+    }
+    
+    @Test
+    void testIgniteConfigStore() throws Exception {
+        String fetcherId = "dynamicIgniteFetcher";
+        ManagedChannel channel = getManagedChannelForIgnite();
+        
+        try {
+            TikaGrpc.TikaBlockingStub blockingStub = TikaGrpc.newBlockingStub(channel);
+            TikaGrpc.TikaStub tikaStub = TikaGrpc.newStub(channel);
+
+            FileSystemFetcherConfig config = new FileSystemFetcherConfig();
+            String basePath = USE_LOCAL_SERVER ? TEST_FOLDER.getAbsolutePath() : "/tika/govdocs1";
+            config.setBasePath(basePath);
+            
+            String configJson = ExternalTestBase.OBJECT_MAPPER.writeValueAsString(config);
+            log.info("Creating fetcher with Ignite ConfigStore (basePath={}): {}", basePath, configJson);
+            
+            SaveFetcherReply saveReply = blockingStub.saveFetcher(SaveFetcherRequest
+                    .newBuilder()
+                    .setFetcherId(fetcherId)
+                    .setFetcherClass("org.apache.tika.pipes.fetcher.fs.FileSystemFetcher")
+                    .setFetcherConfigJson(configJson)
+                    .build());
+            
+            log.info("Fetcher saved to Ignite: {}", saveReply.getFetcherId());
+
+            List<FetchAndParseReply> successes = Collections.synchronizedList(new ArrayList<>());
+            List<FetchAndParseReply> errors = Collections.synchronizedList(new ArrayList<>());
+
+            CountDownLatch countDownLatch = new CountDownLatch(1);
+            StreamObserver<FetchAndParseRequest>
+                    requestStreamObserver = tikaStub.fetchAndParseBiDirectionalStreaming(new StreamObserver<>() {
+                @Override
+                public void onNext(FetchAndParseReply fetchAndParseReply) {
+                    log.debug("Reply from fetch-and-parse - key={}, status={}",
+                        fetchAndParseReply.getFetchKey(), fetchAndParseReply.getStatus());
+                    if ("FETCH_AND_PARSE_EXCEPTION".equals(fetchAndParseReply.getStatus())) {
+                        errors.add(fetchAndParseReply);
+                    } else {
+                        successes.add(fetchAndParseReply);
+                    }
+                }
+
+                @Override
+                public void onError(Throwable throwable) {
+                    log.error("Received an error", throwable);
+                    countDownLatch.countDown();
+                    Assertions.fail(throwable);
+                }
+
+                @Override
+                public void onCompleted() {
+                    log.info("Finished streaming fetch and parse replies");
+                    countDownLatch.countDown();
+                }
+            });
+
+            int maxDocs = Integer.parseInt(System.getProperty("corpus.numDocs", "-1"));
+            log.info("Document limit: {}", maxDocs == -1 ? "unlimited" : maxDocs);
+            
+            try (Stream<Path> paths = Files.walk(TEST_FOLDER.toPath())) {
+                Stream<Path> fileStream = paths
+                        .filter(Files::isRegularFile)
+                        .filter(p -> !p.getFileName().toString()
+                                .toLowerCase(Locale.ROOT)
+                                .endsWith(".zip"));
+                
+                if (maxDocs > 0) {
+                    fileStream = fileStream.limit(maxDocs);
+                }
+
+                fileStream.forEach(file -> {
+                    try {
+                        String relPath = TEST_FOLDER.toPath().relativize(file).toString();
+                        requestStreamObserver.onNext(FetchAndParseRequest
+                                .newBuilder()
+                                .setFetcherId(fetcherId)
+                                .setFetchKey(relPath)
+                                .build());
+                    } catch (Exception e) {
+                        throw new RuntimeException(e);
+                    }
+                });
+            }
+            log.info("Done submitting files to Ignite-backed fetcher {}", fetcherId);
+
+            requestStreamObserver.onCompleted();
+
+            try {
+                if (!countDownLatch.await(3, TimeUnit.MINUTES)) {
+                    log.error("Timed out waiting for parse to complete");
+                    Assertions.fail("Timed out waiting for parsing to complete");
+                }
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt();
+                Assertions.fail("Interrupted while waiting for parsing to complete");
+            }
+            
+            if (maxDocs == -1) {
+                ExternalTestBase.assertAllFilesFetched(TEST_FOLDER.toPath(), successes, errors);
+            } else {
+                int totalProcessed = successes.size() + errors.size();
+                log.info("Processed {} documents with Ignite ConfigStore (limit was {})",
+                    totalProcessed, maxDocs);
+                Assertions.assertTrue(totalProcessed <= maxDocs,
+                    "Should not process more than " + maxDocs + " documents");
+                Assertions.assertTrue(totalProcessed > 0,
+                    "Should have processed at least one document");
+            }
+            
+            log.info("Ignite ConfigStore test completed successfully - {} successes, {} errors",
+                successes.size(), errors.size());
+        } finally {
+            channel.shutdown();
+            try {
+                if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
+                    channel.shutdownNow();
+                }
+            } catch (InterruptedException e) {
+                channel.shutdownNow();
+                Thread.currentThread().interrupt();
+            }
+        }
+    }
+    
+    private static ManagedChannel getManagedChannelForIgnite() {
+        if (USE_LOCAL_SERVER) {
+            return ManagedChannelBuilder
+                    .forAddress("localhost", GRPC_PORT)
+                    .usePlaintext()
+                    .maxInboundMessageSize(160 * 1024 * 1024)
+                    .build();
+        } else {
+            return ManagedChannelBuilder
+                    .forAddress(igniteComposeContainer.getServiceHost("tika-grpc", 50052),
+                               igniteComposeContainer.getServicePort("tika-grpc", 50052))
+                    .usePlaintext()
+                    .maxInboundMessageSize(160 * 1024 * 1024)
+                    .build();
+        }
+    }
+}
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.csv b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.csv
new file mode 100644
index 0000000000..a48fabae5b
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.csv
@@ -0,0 +1,4 @@
+name,value,description
+alpha,1,first entry
+beta,2,second entry
+gamma,3,third entry
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.html b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.html
new file mode 100644
index 0000000000..81c7c85a19
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.html
@@ -0,0 +1,8 @@
+<!DOCTYPE html>
+<html>
+<head><title>Sample E2E Test Document</title></head>
+<body>
+<h1>Hello from Tika</h1>
+<p>This HTML fixture is used by the tika-grpc end-to-end tests.</p>
+</body>
+</html>
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.txt b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.txt
new file mode 100644
index 0000000000..099b946056
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.txt
@@ -0,0 +1,3 @@
+This is a sample plain text document used by the tika-grpc e2e tests.
+It contains several lines so the parser has something to work with.
+Apache Tika extracts text and metadata from a wide variety of document formats.
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.xml b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.xml
new file mode 100644
index 0000000000..cdd1962aa3
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/test-fixtures/sample.xml
@@ -0,0 +1,5 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<document>
+  <title>Sample E2E Test Document</title>
+  <body>This XML fixture is used by the tika-grpc end-to-end tests.</body>
+</document>
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/tika-config-ignite-local.json b/tika-e2e-tests/tika-grpc/src/test/resources/tika-config-ignite-local.json
new file mode 100644
index 0000000000..a8e19f8bb0
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/tika-config-ignite-local.json
@@ -0,0 +1,52 @@
+{
+  "plugin-roots": ["/var/cache/tika/plugins"],
+  "pipes": {
+    "numClients": 1,
+    "configStoreType": "ignite",
+    "configStoreParams": "{\"tableName\": \"tika_e2e_test\", \"igniteInstanceName\": \"TikaE2ETest\", \"replicas\": 1, \"partitions\": 10, \"autoClose\": true}",
+    "forkedJvmArgs": [
+      "--add-opens=java.base/jdk.internal.access=ALL-UNNAMED",
+      "--add-opens=java.base/jdk.internal.misc=ALL-UNNAMED",
+      "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
+      "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
+      "--add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED",
+      "--add-opens=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED",
+      "--add-opens=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED",
+      "--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED",
+      "--add-opens=java.base/java.io=ALL-UNNAMED",
+      "--add-opens=java.base/java.nio=ALL-UNNAMED",
+      "--add-opens=java.base/java.net=ALL-UNNAMED",
+      "--add-opens=java.base/java.util=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
+      "--add-opens=java.base/java.math=ALL-UNNAMED",
+      "--add-opens=java.sql/java.sql=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
+      "--add-opens=java.base/java.time=ALL-UNNAMED",
+      "--add-opens=java.base/java.text=ALL-UNNAMED",
+      "--add-opens=java.management/sun.management=ALL-UNNAMED",
+      "--add-opens=java.desktop/java.awt.font=ALL-UNNAMED"
+    ]
+  },
+  "fetchers": [
+    {
+      "fs": {
+        "staticFetcher": {
+          "basePath": "target/govdocs1"
+        }
+      }
+    }
+  ],
+  "emitters": [
+    {
+      "fs": {
+        "defaultEmitter": {
+          "basePath": "/tmp/output"
+        }
+      }
+    }
+  ]
+}
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/tika-config-ignite.json b/tika-e2e-tests/tika-grpc/src/test/resources/tika-config-ignite.json
new file mode 100644
index 0000000000..e39b9fb2a2
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/tika-config-ignite.json
@@ -0,0 +1,52 @@
+{
+  "plugin-roots": ["/var/cache/tika/plugins"],
+  "pipes": {
+    "numClients": 1,
+    "configStoreType": "ignite",
+    "configStoreParams": "{\"tableName\": \"tika_e2e_test\", \"igniteInstanceName\": \"TikaE2ETest\", \"replicas\": 2, \"partitions\": 10, \"autoClose\": true}",
+    "forkedJvmArgs": [
+      "--add-opens=java.base/jdk.internal.access=ALL-UNNAMED",
+      "--add-opens=java.base/jdk.internal.misc=ALL-UNNAMED",
+      "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
+      "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
+      "--add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED",
+      "--add-opens=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED",
+      "--add-opens=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED",
+      "--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED",
+      "--add-opens=java.base/java.io=ALL-UNNAMED",
+      "--add-opens=java.base/java.nio=ALL-UNNAMED",
+      "--add-opens=java.base/java.net=ALL-UNNAMED",
+      "--add-opens=java.base/java.util=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
+      "--add-opens=java.base/java.math=ALL-UNNAMED",
+      "--add-opens=java.sql/java.sql=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
+      "--add-opens=java.base/java.time=ALL-UNNAMED",
+      "--add-opens=java.base/java.text=ALL-UNNAMED",
+      "--add-opens=java.management/sun.management=ALL-UNNAMED",
+      "--add-opens=java.desktop/java.awt.font=ALL-UNNAMED"
+    ]
+  },
+  "fetchers": [
+    {
+      "fs": {
+        "staticFetcher": {
+          "basePath": "/tika/govdocs1"
+        }
+      }
+    }
+  ],
+  "emitters": [
+    {
+      "fs": {
+        "defaultEmitter": {
+          "basePath": "/tmp/output"
+        }
+      }
+    }
+  ]
+}
diff --git a/tika-e2e-tests/tika-grpc/src/test/resources/tika-config.json b/tika-e2e-tests/tika-grpc/src/test/resources/tika-config.json
new file mode 100644
index 0000000000..8863a4d90a
--- /dev/null
+++ b/tika-e2e-tests/tika-grpc/src/test/resources/tika-config.json
@@ -0,0 +1,32 @@
+{
+  "plugin-roots": ["/var/cache/tika/plugins"],
+  "pipes": {
+    "numClients": 1,
+    "forkedJvmArgs": [
+      "--add-opens=java.base/jdk.internal.access=ALL-UNNAMED",
+      "--add-opens=java.base/jdk.internal.misc=ALL-UNNAMED",
+      "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
+      "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED",
+      "--add-opens=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED",
+      "--add-opens=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED",
+      "--add-opens=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED",
+      "--add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED",
+      "--add-opens=java.base/java.io=ALL-UNNAMED",
+      "--add-opens=java.base/java.nio=ALL-UNNAMED",
+      "--add-opens=java.base/java.net=ALL-UNNAMED",
+      "--add-opens=java.base/java.util=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED",
+      "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED",
+      "--add-opens=java.base/java.math=ALL-UNNAMED",
+      "--add-opens=java.sql/java.sql=ALL-UNNAMED",
+      "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED",
+      "--add-opens=java.base/java.time=ALL-UNNAMED",
+      "--add-opens=java.base/java.text=ALL-UNNAMED",
+      "--add-opens=java.management/sun.management=ALL-UNNAMED",
+      "--add-opens=java.desktop/java.awt.font=ALL-UNNAMED"
+    ]
+  }
+}

