[ https://issues.apache.org/jira/browse/TIKA-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845566#comment-17845566 ]
Tilman Hausherr commented on TIKA-4254:
---------------------------------------

Why would we ever run the test twice in the same environment?

> The test `TestMimeTypes#testJavaRegex` is not idempotent, as it passes in the
> first run and fails in repeated runs in the same environment.
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-4254
>                 URL: https://issues.apache.org/jira/browse/TIKA-4254
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Kaiyao Ke
>            Priority: Major
>
> ### Brief Description of the Bug
> The test `TestMimeTypes#testJavaRegex` is non-idempotent: it passes in the
> first run but fails in the second run in the same environment. The source of
> the problem is that each test execution creates a new media type
> (`MimeType`) instance `testType` (and likewise `testType2`), and the
> instances from different test executions all try to register the same name
> pattern `"rtg_sst_grb_0\\.5\\.\\d{8}"`. In the second execution of the test,
> the line `this.repo.addPattern(testType, pattern, true);` therefore throws an
> error, because the name pattern is already registered to the `testType`
> instance created by the first execution. Specifically, in the second run the
> `addGlob()` method of the `Patterns` class detects the conflicting pattern
> and throws a `MimeTypeException` (line 123 in `Patterns.java`).
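
For illustration, a minimal, self-contained Java sketch of the mechanism described above. `PatternRegistry`, `runWithFreshType` and `runWithSharedType` are hypothetical stand-ins, not Tika's `MimeTypes`/`Patterns` API. The only behaviour assumed is that a pattern already claimed by a different type instance is rejected, while re-adding it for the same instance is accepted.

```java
import java.util.HashMap;
import java.util.Map;

class IdempotencySketch {

    /** Hypothetical registry: a pattern may be owned by only one type instance. */
    static class PatternRegistry {
        private final Map<String, Object> owners = new HashMap<>();

        void addPattern(Object type, String pattern) {
            // putIfAbsent returns the current owner, or null if the pattern was free.
            Object previous = owners.putIfAbsent(pattern, type);
            if (previous != null && previous != type) {
                throw new IllegalStateException("Conflicting glob pattern: " + pattern);
            }
        }
    }

    static final String PATTERN = "rtg_sst_grb_0\\.5\\.\\d{8}";

    // Created once at class-loading time, like a static test fixture.
    static final Object SHARED_TYPE = new Object();

    /** Mimics a test body that creates a new type instance on every execution. */
    static void runWithFreshType(PatternRegistry repo) {
        Object testType = new Object();
        repo.addPattern(testType, PATTERN);
    }

    /** Mimics a test body that reuses one instance across executions. */
    static void runWithSharedType(PatternRegistry repo) {
        repo.addPattern(SHARED_TYPE, PATTERN);
    }

    /** Returns true if the given test body throws the conflict error. */
    static boolean fails(Runnable testBody) {
        try {
            testBody.run();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The registry outlives individual runs, as shared state does in one JVM.
        PatternRegistry fresh = new PatternRegistry();
        System.out.println("fresh type, run 1 fails: " + fails(() -> runWithFreshType(fresh)));
        System.out.println("fresh type, run 2 fails: " + fails(() -> runWithFreshType(fresh)));

        PatternRegistry shared = new PatternRegistry();
        System.out.println("shared type, run 1 fails: " + fails(() -> runWithSharedType(shared)));
        System.out.println("shared type, run 2 fails: " + fails(() -> runWithSharedType(shared)));
    }
}
```

In this sketch only the second fresh-type run reports a conflict. Reusing a single instance, which is the approach the issue proposes, keeps repeated runs against the same registry from conflicting.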
> ### Failure Message in the 2nd Test Run:
> ```
> org.apache.tika.mime.MimeTypeException: Conflicting glob pattern: rtg_sst_grb_0\.5\.\d{8}
>       at org.apache.tika.mime.Patterns.addGlob(Patterns.java:123)
>       at org.apache.tika.mime.Patterns.add(Patterns.java:71)
>       at org.apache.tika.mime.MimeTypes.addPattern(MimeTypes.java:450)
>       at org.apache.tika.mime.TestMimeTypes.testJavaRegex(TestMimeTypes.java:851)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>       at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
>       at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
> ```
> ### Reproduce
> Use the `NIOInspector` plugin, which supports rerunning individual tests in the
> same environment:
> ```
> cd tika-parsers/tika-parsers-standard/tika-parsers-standard-package
> mvn edu.illinois:NIODetector:rerun -Dtest=org.apache.tika.mime.TestMimeTypes#testJavaRegex
> ```
> ### Proposed Fix
> Declare `testType` and `testType2` as static variables and initialize them at
> class loading time, so that repeated runs of `testJavaRegex()` do not
> conflict with each other. All tests pass and are idempotent after the fix.
> ### Necessity of Fix
> A fix is recommended because unit tests should be idempotent, and state
> pollution should be mitigated so that newly introduced tests do not fail in
> the future because of polluted shared state.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)