[ https://issues.apache.org/jira/browse/TIKA-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr reopened TIKA-4545:
-----------------------------------

The Windows build fails in the tika-pipes Solr integration tests because of 
the path problem we've had in the past. It can be fixed in three places in 
TikaPipesSolrTestBase.getTikaConfig() by converting backslashes in the 
absolute paths to forward slashes, like this:
{code:java}
        String res = json.replace("UPDATE_STRATEGY", updateStrategy.toString())
                .replace("ATTACHMENT_STRATEGY", attachmentStrategy.toString())
                .replaceAll("FETCHER_BASE_PATH",
                        Matcher.quoteReplacement(testFileFolder.toAbsolutePath().toString().replace("\\", "/")))
                .replace("PARSE_MODE", parseMode.name())
                .replace("SOLR_URLS", solrUrls)
                .replace("SOLR_ZK_HOSTS", solrZkHosts);

        res = res.replace("TIKA_CONFIG", tikaConfig.toAbsolutePath().toString().replace("\\", "/"));

        Path log4jPropFile = pipesDirectory.resolve("log4j2.xml");
        try (InputStream is = this.getClass().getResourceAsStream("/pipes-fork-server-custom-log4j2.xml")) {
            Files.copy(is, log4jPropFile, java.nio.file.StandardCopyOption.REPLACE_EXISTING);
        }
        res = res.replace("LOG4J_PROPERTIES_FILE", log4jPropFile.toAbsolutePath().toString().replace("\\", "/"));
{code}
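
To illustrate the idea in isolation, here is a minimal standalone sketch. The class name, helper names, template and sample path are hypothetical and not taken from the Tika code base. It shows that a Windows path with its backslashes converted to forward slashes can be dropped into a JSON template without introducing escape sequences, and that Matcher.quoteReplacement is only needed with replaceAll, since String.replace already treats its replacement literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsonPathPlaceholders {

    // Hypothetical helper: turn Windows separators into forward slashes so the
    // path contains no backslashes when it ends up inside a JSON string.
    public static String toForwardSlashes(String path) {
        return path.replace("\\", "/");
    }

    // Literal substitution: String.replace does not interpret the replacement,
    // so no quoting is needed here.
    public static String fillPlaceholder(String template, String placeholder, String path) {
        return template.replace(placeholder, toForwardSlashes(path));
    }

    // Regex substitution: replaceAll treats '\' and '$' in the replacement as
    // special characters, so the replacement is wrapped in quoteReplacement.
    public static String fillWithReplaceAll(String template, String placeholder, String path) {
        return template.replaceAll(Pattern.quote(placeholder),
                Matcher.quoteReplacement(toForwardSlashes(path)));
    }

    public static void main(String[] args) {
        // Assumed template and path, for illustration only.
        String template = "{\"basePath\": \"FETCHER_BASE_PATH\"}";
        String windowsPath = "C:\\work\\tika\\test-files";
        System.out.println(fillPlaceholder(template, "FETCHER_BASE_PATH", windowsPath));
        // prints {"basePath": "C:/work/tika/test-files"}
    }
}
```

Both helpers give the same result for ordinary paths; the replaceAll variant additionally stays safe if a folder name happens to contain a '$'.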


> Fully integrate new json based deserializer in 4.x
> --------------------------------------------------
>
>                 Key: TIKA-4545
>                 URL: https://issues.apache.org/jira/browse/TIKA-4545
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>             Fix For: 4.0.0
>
>
> Follow on for TIKA-4544.
> Steps:
>  * Add annotations to components (parsers, etc.) and unit tests to confirm 
> they work (finished this today)
>  * Modify components (parsers etc), at least a few of them so that they are 
> actually configurable. We don't have to modify all, just the most important 
> ones PDFParser, tesseract, MSOffice, and others???
>  * Move to tika-config.json in tika-pipes client/server, tika-async-cli, 
> tika-app and tika-server one by one



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
