[ https://issues.apache.org/jira/browse/TIKA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044515#comment-18044515 ]
Hudson commented on TIKA-4565:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #1101 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/1101/])
TIKA-4565 -- tweak configurations for include/exclude (#2441) (github:
[https://github.com/apache/tika/commit/f63bebfdea38c152ab1ffdff591938d7ef8c02b3])
* (edit) tika-serialization/src/test/resources/configs/example-tika-config.json
* (edit) tika-serialization/src/test/resources/configs/test-default-parser-with-exclusions.json
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/TikaLoader.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/TIKA-1708-detector-default.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/TIKA-2273-encoding-detector-outside-static-init.json
* (edit) tika-app/src/test/resources/configs/tika-config2.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/TIKA-1702-detector-exclude.json
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/ParserLoader.java
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/FrameworkConfig.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/TIKA-2273-exclude-encoding-detector-default.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/TIKA-2273-no-icu4j-encoding-detector.json
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/EncodingDetectorLoader.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/tika-config-lib-pst.json
* (edit) tika-app/src/test/java/org/apache/tika/cli/XmlToJsonConfigConverterTest.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/tika-config-digests-pdf-only.json
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/TikaJsonConfig.java
* (edit) tika-serialization/src/test/java/org/apache/tika/config/loader/FrameworkConfigTest.java
* (edit) tika-serialization/src/test/resources/configs/test-decoration-config.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/org/apache/tika/config/TIKA-1558-exclude.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/tika-4424-config.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/org/apache/tika/config/TIKA-1558-excludesub.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/org/apache/tika/parser/ocr/tesseract-config.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/configs/test-default-with-exclusions.json
* (edit) tika-app/src/main/java/org/apache/tika/cli/XmlToJsonConfigConverter.java
* (edit) tika-serialization/src/test/java/org/apache/tika/config/loader/TikaLoaderTest.java
* (edit) tika-serialization/src/test/resources/configs/test-loader-config.json
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/DetectorLoader.java
* (edit) tika-integration-tests/tika-pipes-s3-integration-tests/src/test/resources/s3/tika-config-s3.json
* (edit) tika-app/src/test/resources/configs/tika-config1.json
> Tweak include/exclude syntax in json for parsers
> ------------------------------------------------
>
> Key: TIKA-4565
> URL: https://issues.apache.org/jira/browse/TIKA-4565
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Minor
>
> For parsers we currently require the user to know about the decorator
> implementation detail:
> {noformat}
> "parsers": [
>   {
>     "default-parser": {
>       "exclude": ["executable-parser"],
>       "_decorate": {
>         "mimeExclude": ["image/jpeg", "application/pdf"]
>       }
>     }
>   }, ...
> {noformat}
> I like a leading _ to signify that a key such as "exclude" is not a
> configuration parameter on the actual parser, but a framework-level
> directive. How about something like this:
> {noformat}
> "parsers": [
>   {
>     "default-parser": {
>       "_exclude": ["executable-parser"],
>       "_mime-exclude": ["image/jpeg", "application/pdf"]
>     }
>   },
>   ...
> {noformat}
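
To make the proposed key change concrete, here is a minimal Python sketch that rewrites the old form shown in the description into the proposed one. It is not the code from the commit: the function names migrate_parser_params and migrate_parsers are hypothetical, and the sketch covers only the two keys that appear in the description ("exclude" and "_decorate"/"mimeExclude"). Any other key, including other entries under "_decorate", is passed through unchanged because the description does not say how those should be renamed.

```python
import json


def migrate_parser_params(params):
    """Rewrite one parser's parameter object to the proposed key names.

    Hypothetical helper: handles only the keys shown in the issue text.
    """
    migrated = {}
    for key, value in params.items():
        if key == "exclude":
            # Old "exclude" becomes the underscore-prefixed "_exclude".
            migrated["_exclude"] = value
        elif key == "_decorate" and isinstance(value, dict):
            # Lift "mimeExclude" out of the decorator wrapper.
            rest = dict(value)
            if "mimeExclude" in rest:
                migrated["_mime-exclude"] = rest.pop("mimeExclude")
            # Assumption: unknown decorator keys are kept as they were.
            if rest:
                migrated["_decorate"] = rest
        else:
            migrated[key] = value
    return migrated


def migrate_parsers(parsers):
    """Apply the rewrite to every {"parser-name": {...}} entry in a list."""
    result = []
    for entry in parsers:
        result.append({
            name: migrate_parser_params(params) if isinstance(params, dict) else params
            for name, params in entry.items()
        })
    return result


old_config = {
    "parsers": [
        {
            "default-parser": {
                "exclude": ["executable-parser"],
                "_decorate": {
                    "mimeExclude": ["image/jpeg", "application/pdf"]
                },
            }
        }
    ]
}

new_config = {"parsers": migrate_parsers(old_config["parsers"])}
print(json.dumps(new_config, indent=2))
```

The input dictionary is left unmodified, so the old and new forms can be compared side by side.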