[ https://issues.apache.org/jira/browse/TIKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689203#comment-17689203 ]
Tim Allison commented on TIKA-3973:
-----------------------------------

To emphasize Nick's point: if you need detection of other container formats, such as OLE2 (.doc, .ppt, .xls) or zip-based formats (.docx, .pptx, .xlsx), you should include the full tika-parsers-standard-package. If you only care about Ogg, then go with what Nick recommends.

> Content of Ogg file with Opus encoded content not correctly recognized
> ----------------------------------------------------------------------
>
>                 Key: TIKA-3973
>                 URL: https://issues.apache.org/jira/browse/TIKA-3973
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 2.7.0
>            Reporter: Adam Bialas
>            Priority: Major
>         Attachments: speech_output.ogg
>
>
> We are using tika-core:2.7.0 for file content detection. We have an Ogg file
> that uses the Opus audio codec (see attachment). When we try to detect content
> with metadata:
>
> {code:java}
> Metadata metadata = new Metadata();
> metadata.set(TikaCoreProperties.RESOURCE_NAME_KEY,
> FilenameUtils.getName(url));{code}
> this file is recognized as audio/vorbis, which is incorrect. Can you please
> verify?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
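
For readers wondering why a file name alone cannot separate Opus from Vorbis: both codecs share the Ogg container and often the same .ogg extension, so a detector has to look at the first packet inside the first Ogg page. The following is a minimal standard-library sketch of that idea, not Tika's implementation. The class name OggCodecSniffer, the method sniff, and the returned type strings are hypothetical choices for illustration, and the sketch assumes the codec header sits in the first page of the byte array it is given (it ignores multiplexed streams).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tika: peeks into the first Ogg page
// to tell an Opus stream from a Vorbis stream.
public class OggCodecSniffer {

    // Every Ogg page starts with the capture pattern "OggS".
    private static final byte[] OGG_CAPTURE = ascii("OggS");
    // An Opus identification header starts with "OpusHead".
    private static final byte[] OPUS_MAGIC = ascii("OpusHead");
    // A Vorbis identification header starts with packet type 0x01, then "vorbis".
    private static final byte[] VORBIS_MAGIC = {0x01, 'v', 'o', 'r', 'b', 'i', 's'};

    // Fixed part of an Ogg page header; byte 26 holds the segment count.
    private static final int FIXED_HEADER_LENGTH = 27;

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    private static boolean matchesAt(byte[] data, int offset, byte[] magic) {
        if (offset < 0 || offset + magic.length > data.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[offset + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns an illustrative media type string based on the first Ogg page
     * found at the start of the given bytes.
     */
    public static String sniff(byte[] head) {
        if (head.length < FIXED_HEADER_LENGTH || !matchesAt(head, 0, OGG_CAPTURE)) {
            return "application/octet-stream"; // not an Ogg stream
        }
        // The segment table follows the fixed header; the payload follows the table.
        int segmentCount = head[26] & 0xFF;
        int payloadStart = FIXED_HEADER_LENGTH + segmentCount;

        if (matchesAt(head, payloadStart, OPUS_MAGIC)) {
            return "audio/opus";
        }
        if (matchesAt(head, payloadStart, VORBIS_MAGIC)) {
            return "audio/vorbis";
        }
        return "audio/ogg"; // Ogg container with some other or unread codec
    }
}
```

Passing the first few hundred bytes of a file to OggCodecSniffer.sniff is enough for this check, since the identification header is the first packet of the stream. For real use, the recommendations in the comment above still apply.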