LucaDai opened a new pull request, #2360:
URL: https://github.com/apache/tika/pull/2360
## Description
**Type of change**
* [ ] New feature
* [ ] Bug fix for existing feature
* [x] Code quality improvement
* [x] Addition or Improvement of tests
* [ ] Addition/Improvement of documentation
## Summary
The flaky `LanguageResourceTest` failure was caused by ambiguous method mappings
in `LanguageResource`: the handlers for `/language/stream` and `/language/string`
each carried both `@PUT` and `@POST` on the same method.
Example of the original code:
```java
@PUT
@POST
@Path("/stream")
@Consumes("*/*")
@Produces("text/plain")
public String detect(final InputStream is) throws IOException {
...
}
```
Although this works in most runs, the CXF JAX-RS server registers resource
methods using reflection, and reflection order is not guaranteed. Under the
randomized iteration order that NonDex injects, the router may bind the handler
to the wrong HTTP method, which intermittently produced HTTP 405
(Method Not Allowed) responses.
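The order sensitivity can be illustrated with a minimal, standard-library-only sketch. It does not reproduce CXF's actual resolution code. It assumes a resolver that takes the first verb annotation it encounters, and `Put`, `Post`, and `Resource` are hypothetical stand-ins for the JAX-RS annotations and for `LanguageResource`.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

class VerbOrderSketch {

    // Hypothetical stand-ins for jakarta.ws.rs.PUT and jakarta.ws.rs.POST.
    @Retention(RetentionPolicy.RUNTIME) @interface Put {}
    @Retention(RetentionPolicy.RUNTIME) @interface Post {}

    // Hypothetical resource with both verb annotations on one method.
    static class Resource {
        @Put
        @Post
        public String detect(String body) {
            return body;
        }
    }

    /** Returns the two verb annotations on Resource.detect in a fixed PUT, POST order. */
    static List<Annotation> putThenPost() {
        try {
            Method m = Resource.class.getMethod("detect", String.class);
            List<Annotation> result = new ArrayList<>();
            result.add(m.getAnnotation(Put.class));
            result.add(m.getAnnotation(Post.class));
            return result;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Order-sensitive lookup (assumed behavior): the first verb annotation wins. */
    static String firstVerb(List<Annotation> annotations) {
        for (Annotation a : annotations) {
            if (a instanceof Put) {
                return "PUT";
            }
            if (a instanceof Post) {
                return "POST";
            }
        }
        return null;
    }

    /** Order-insensitive lookup: every verb annotation is collected. */
    static TreeSet<String> allVerbs(List<Annotation> annotations) {
        TreeSet<String> verbs = new TreeSet<>();
        for (Annotation a : annotations) {
            if (a instanceof Put) {
                verbs.add("PUT");
            }
            if (a instanceof Post) {
                verbs.add("POST");
            }
        }
        return verbs;
    }

    public static void main(String[] args) {
        List<Annotation> forward = putThenPost();
        List<Annotation> reversed = new ArrayList<>(forward);
        Collections.reverse(reversed);

        System.out.println(firstVerb(forward));   // PUT
        System.out.println(firstVerb(reversed));  // POST
        System.out.println(allVerbs(forward).equals(allVerbs(reversed)));  // true
    }
}
```

With a first-match resolver, the same method answers to a different verb depending only on the order in which its annotations are visited, which matches the intermittent 405 described above.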
**Error Message running NonDex:**
```text
[INFO] Running org.apache.tika.server.core.LanguageResourceTest
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.638 s <<< FAILURE! -- in org.apache.tika.server.core.LanguageResourceTest
[ERROR] org.apache.tika.server.core.LanguageResourceTest.testDetectEnglishFile -- Time elapsed: 0.629 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <en> but was: <>
    at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:158)
    at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:139)
    at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:201)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:184)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:179)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1188)
    at org.apache.tika.server.core.LanguageResourceTest.testDetectEnglishFile(LanguageResourceTest.java:98)
```
**In Debug Mode:**
<img width="1040" height="240" alt="image"
src="https://github.com/user-attachments/assets/f0130002-ee7f-405d-91b6-ae33260ac6ba"
/>
`jakarta.ws.rs.ClientErrorException: HTTP 405 Method Not Allowed at org.apache.cxf.jaxrs.utils.JAXRSUtils.findTargetMethod(JAXRSUtils.java:673)`
## Fix
The fix splits the ambiguous methods into four deterministic handlers (one
per HTTP verb and endpoint), removing the shared annotation conflict:
```java
@PUT
@Path("/stream")
public String detectStreamPut(InputStream is) { ... }

@POST
@Path("/stream")
public String detectStreamPost(InputStream is) { ... }

@PUT
@Path("/string")
public String detectStringPut(String text) { ... }

@POST
@Path("/string")
public String detectStringPost(String text) { ... }
```
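As a rough illustration of why one handler per verb and endpoint is independent of reflection order, the sketch below builds a routing table keyed by verb and path. `Route` and `Resource` are hypothetical stand-ins that use only the standard library. They are not JAX-RS, CXF, or Tika types, and the handler bodies are placeholders.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class RouteTableSketch {

    // Hypothetical annotation folding the JAX-RS verb and @Path into one place.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Route {
        String verb();
        String path();
    }

    // Hypothetical resource mirroring the four-handler layout; bodies are placeholders.
    static class Resource {
        @Route(verb = "PUT", path = "/stream")
        public String detectStreamPut(String in) { return in; }

        @Route(verb = "POST", path = "/stream")
        public String detectStreamPost(String in) { return in; }

        @Route(verb = "PUT", path = "/string")
        public String detectStringPut(String in) { return in; }

        @Route(verb = "POST", path = "/string")
        public String detectStringPost(String in) { return in; }
    }

    /** Maps "VERB path" to a handler name and rejects two handlers claiming one key. */
    static Map<String, String> buildTable(Class<?> resource) {
        Map<String, String> table = new TreeMap<>();
        // getDeclaredMethods() returns methods in no particular order; since each
        // key is claimed by exactly one method, the resulting table is the same.
        for (Method m : resource.getDeclaredMethods()) {
            Route route = m.getAnnotation(Route.class);
            if (route == null) {
                continue;
            }
            String key = route.verb() + " " + route.path();
            String previous = table.put(key, m.getName());
            if (previous != null) {
                throw new IllegalStateException(
                        key + " is claimed by both " + previous + " and " + m.getName());
            }
        }
        return table;
    }

    public static void main(String[] args) {
        buildTable(Resource.class).forEach(
                (key, handler) -> System.out.println(key + " -> " + handler));
    }
}
```

Because every key has exactly one owner, visiting the methods in any order yields the same four entries.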
A private helper method handles the shared input conversion for both the
stream and string variants:
```java
private String toText(Object input) throws IOException {
    return (input instanceof InputStream)
            ? IOUtils.toString((InputStream) input, UTF_8)
            : (String) input;
}
```
This ensures that the same logic is executed consistently, regardless of
HTTP method registration order.
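The helper above accepts `Object` and casts at run time. One possible alternative, shown only as a sketch and not as the code in this PR, is a pair of overloads so the compiler rejects unsupported input types. It uses `InputStream.readAllBytes` (Java 9+) in place of `IOUtils`, and the class name `ToTextSketch` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ToTextSketch {

    /** Reads the whole stream and decodes it as UTF-8 text. */
    static String toText(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    /** Strings pass through unchanged. */
    static String toText(String text) {
        return text;
    }

    public static void main(String[] args) throws IOException {
        InputStream stream =
                new ByteArrayInputStream("bonjour".getBytes(StandardCharsets.UTF_8));
        System.out.println(toText(stream));   // bonjour
        System.out.println(toText("hello"));  // hello
    }
}
```

With overloads, passing any other type fails at compile time instead of throwing a `ClassCastException` at run time.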
---
## Related Tests
```
org.apache.tika.server.core.LanguageResourceTest.testDetectEnglishFile
org.apache.tika.server.core.LanguageResourceTest.testDetectFrenchFile
org.apache.tika.server.core.LanguageResourceTest.testDetectEnglishString
org.apache.tika.server.core.LanguageResourceTest.testDetectFrenchString
```
## How to Reproduce
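The failure was reproduced using
[NonDex](https://github.com/TestingResearchIllinois/NonDex), a tool from the
University of Illinois designed to detect ID (implementation-dependent) tests,
i.e. tests that rely on behavior the Java API specification leaves unspecified,
such as iteration order.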
**Environment**
* Java: 17.x
* Maven: 3.9.x
**Build module**
```bash
mvn clean install -DskipTests -pl tika-server/tika-server-core -am
```
**Run a single test (no shuffling)**
```bash
mvn test -pl tika-server/tika-server-core \
-Dtest=org.apache.tika.server.core.LanguageResourceTest \
-Dmaven.test.redirectTestOutputToFile=false
```
**Run with NonDex (shuffling)**
```bash
mvn edu.illinois:nondex-maven-plugin:2.2.1:nondex \
-pl tika-server/tika-server-core \
-Dtest=org.apache.tika.server.core.LanguageResourceTest \
-DnondexRuns=5
```
**Expected**
* Without this fix: intermittent test failures caused by HTTP 405 responses,
logged as `org.opentest4j.AssertionFailedError: expected: <en> but was: <>`.
* With this fix: all runs pass consistently.
---
## Verification
* ✅ `mvn test -pl tika-server/tika-server-core` passes.
* ✅ Multiple NonDex runs (`-DnondexRuns=100`) pass with no flakes.
* ✅ Checkstyle passes.
* ✅ No behavior change to REST paths or media types; only internal method
split and safer input handling.
---
## Risk / Impact
* **Low**: External API unchanged (`/language/stream`, `/language/string`,
same verbs and content types).
* Internal method names changed, but CXF mapping uses annotations, so other
classes/tests are unaffected.
## Why this matters
Nondeterministic behavior may not surface in every run, but it introduces
long-term risks for reliability and maintainability:
- **Different environments**: Different JDK versions or JVM implementations
may return reflection results or iterate hash-based collections in a different
order.
- **Scale effects**: With more data or rehashing, iteration order can differ.
- **CI/CD reliability**: Tests may pass locally but fail in automated
pipelines.
- **Parallel execution**: Concurrency may exacerbate ordering issues.
By making the endpoint mapping independent of reflection order, we ensure
stable, reproducible test results across environments and future JDK versions.