Kaiyao Ke created TIKA-4254:
-------------------------------

             Summary: The test `TestMimeTypes#testJavaRegex` is not idempotent, as it passes in the first run and fails in repeated runs in the same environment.
                 Key: TIKA-4254
                 URL: https://issues.apache.org/jira/browse/TIKA-4254
             Project: Tika
          Issue Type: Bug
            Reporter: Kaiyao Ke

### Brief Description of the Bug

The test `TestMimeTypes#testJavaRegex` is not idempotent: it passes in the first run but fails in the second run in the same environment. The cause is that each test execution creates a new media type (`MimeType`) instance `testType` (and likewise `testType2`), and every one of these instances is registered under the same name pattern `"rtg_sst_grb_0\\.5\\.\\d{8}"`. In the second execution of the test, the line `this.repo.addPattern(testType, pattern, true);` therefore throws, because the pattern is already bound to the `testType` instance created by the first execution. Specifically, in the second run the `addGlob()` method of the `Patterns` class detects the conflicting pattern and throws a `MimeTypeException` (line 123 in `Patterns.java`).

### Failure Message in the 2nd Test Run:

```
org.apache.tika.mime.MimeTypeException: Conflicting glob pattern: rtg_sst_grb_0\.5\.\d{8}
	at org.apache.tika.mime.Patterns.addGlob(Patterns.java:123)
	at org.apache.tika.mime.Patterns.add(Patterns.java:71)
	at org.apache.tika.mime.MimeTypes.addPattern(MimeTypes.java:450)
	at org.apache.tika.mime.TestMimeTypes.testJavaRegex(TestMimeTypes.java:851)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
```

### Reproduce

Use the `NIOInspector` plugin, which supports rerunning individual tests in the same environment:

```
cd tika-parsers/tika-parsers-standard/tika-parsers-standard-package
mvn edu.illinois:NIODetector:rerun -Dtest=org.apache.tika.mime.TestMimeTypes#testJavaRegex
```

### Proposed Fix

Declare `testType` and `testType2` as static variables and initialize them at class loading time, so that every run registers the pattern against the same instances. Repeated runs of `testJavaRegex()` will then no longer conflict with each other. All tests pass and are idempotent after the fix.

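To illustrate the reasoning behind the proposed fix, here is a minimal, self-contained sketch. It does not use Tika's classes: `FakeType`, `FakeRegistry`, `survivesRerun`, and the name `example/type` are hypothetical stand-ins, and the registry's behavior (binding the same instance to the same pattern again is accepted, while binding a different instance is rejected) is an assumption that the proposed fix relies on.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobConflictSketch {

    /** Hypothetical stand-in for a media type; this is not Tika's MimeType. */
    static final class FakeType {
        final String name;

        FakeType(String name) {
            this.name = name;
        }
    }

    /** Hypothetical registry whose state outlives a single test execution. */
    static final class FakeRegistry {
        private final Map<String, FakeType> globs = new HashMap<>();

        void addPattern(FakeType type, String pattern) {
            FakeType previous = globs.get(pattern);
            if (previous == null) {
                globs.put(pattern, type);
            } else if (previous != type) {
                // Assumption: only a *different* instance counts as a conflict.
                throw new IllegalStateException("Conflicting glob pattern: " + pattern);
            }
        }
    }

    static final String PATTERN = "rtg_sst_grb_0\\.5\\.\\d{8}";

    // Sketch of the proposed fix: one instance, created once at class-loading time.
    static final FakeType SHARED_TYPE = new FakeType("example/type");

    /** Runs the simulated test body twice against one registry; true if both runs pass. */
    static boolean survivesRerun(boolean useSharedInstance) {
        FakeRegistry repo = new FakeRegistry();
        try {
            for (int run = 0; run < 2; run++) {
                FakeType type = useSharedInstance ? SHARED_TYPE : new FakeType("example/type");
                repo.addPattern(type, PATTERN);
            }
            return true;
        } catch (IllegalStateException conflict) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("fresh instance per run survives rerun: " + survivesRerun(false));
        System.out.println("shared static instance survives rerun: " + survivesRerun(true));
    }
}
```

Under these assumptions, the per-run instance fails on the second registration while the shared static instance does not, which mirrors the pass-then-fail behavior reported above.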
### Necessity of Fix

A fix is recommended because unit tests should be idempotent, and state pollution should be kept in check so that newly introduced tests do not fail later because of polluted shared state.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)