[
https://issues.apache.org/jira/browse/TIKA-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-4387.
-------------------------------
Fix Version/s: 2.9.4, 4.0.0, 3.1.1
Resolution: Fixed
> Improve robustness of file extension parsing
> --------------------------------------------
>
> Key: TIKA-4387
> URL: https://issues.apache.org/jira/browse/TIKA-4387
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Fix For: 2.9.4, 4.0.0, 3.1.1
>
>
> {{FilenameUtils.getSuffixFromPath()}} does not check that the extension
> contains only alphanumeric characters.
> If a "file path" derives from an internal path in a PST, such as {{/Début du
> fichier de données Outlook/[WEBINAR] - "Introducing Couchbase Server 2.5"}},
> then the extension is {{.5"}}, which causes problems on Windows because
> a double quote is not a legal character in a Windows file name.
> The problem occurs when TemporaryResources writes a temp file and
> tries to preserve the file extension based on the {{resourceName}} in the
> Metadata.
> We should add a check that the extension contains only alphanumeric
> characters, or apply some similar validation.
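
For illustration only, the kind of check proposed above might look like the following sketch. It is not the change that was committed for this issue: the class name {{SafeSuffix}}, the method {{getSafeSuffix()}} and the length cap are hypothetical, and the real {{FilenameUtils.getSuffixFromPath()}} may behave differently. The sketch takes the text after the last dot of the last path segment and returns an empty string unless every character is an ASCII letter or digit.

```java
public class SafeSuffix {

    // Hypothetical upper bound on extension length; an assumption, not a Tika value.
    private static final int MAX_EXT_LENGTH = 10;

    /**
     * Returns the extension of the last path segment, including the leading dot,
     * or an empty string if there is no extension or if the extension contains
     * anything other than ASCII letters and digits.
     */
    public static String getSafeSuffix(String path) {
        if (path == null) {
            return "";
        }
        // Treat both '/' and '\' as separators so embedded container paths work.
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(slash + 1);

        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return ""; // no dot, or the name ends with a dot
        }
        String ext = name.substring(dot + 1);
        if (ext.length() > MAX_EXT_LENGTH) {
            return "";
        }
        for (int i = 0; i < ext.length(); i++) {
            char c = ext.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (!alphanumeric) {
                return ""; // e.g. the 5" suffix from the PST path in this issue
            }
        }
        return "." + ext;
    }
}
```

With a check like this, the PST-derived path from the description yields an empty suffix, so a caller that creates a temp file (for example via {{java.nio.file.Files.createTempFile(prefix, suffix)}}) would fall back to a name without the illegal character.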