[ 
https://issues.apache.org/jira/browse/TIKA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689161#comment-17689161
 ] 

Nick Burch commented on TIKA-3973:
----------------------------------

For container-based detection (such as the Ogg container format), you really 
need to include the Tika Parsers jars too.

With the Ogg container detector enabled (it ships with the Tika media 
parsers), Tika correctly detects the type as {{audio/opus}}.

We have magic which will detect an Opus file with a single stream if you're 
lucky, but with containers it's very hit-and-miss whether you can tell with 
magic alone. Enabling the Ogg container detector is the best solution, though: 
it should always work, no matter what order the streams are in, what streams 
are contained, etc.
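
To illustrate why walking the container is more reliable than magic at a fixed 
offset, here is a minimal sketch in plain Java. It is not Tika's Ogg detector 
and the class name OggCodecSniffer is made up for this example. It only 
assumes the standard Ogg page layout (a 27-byte header starting with "OggS", 
a flag marking the first page of each logical stream, then a segment table) 
and the well-known first-packet signatures "OpusHead" and 0x01 "vorbis". Real 
files need checksum validation and handling for more codecs.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a simplified sketch, not Tika's actual Ogg detector. */
class OggCodecSniffer {
    private static final byte[] OPUS = {'O', 'p', 'u', 's', 'H', 'e', 'a', 'd'};
    private static final byte[] VORBIS = {0x01, 'v', 'o', 'r', 'b', 'i', 's'};

    /** Returns one media type per logical stream, in the order the streams begin. */
    public static List<String> sniff(byte[] data) {
        List<String> types = new ArrayList<>();
        int pos = 0;
        // Each Ogg page: 27-byte fixed header, segment table, then the payload.
        while (pos + 27 <= data.length
                && data[pos] == 'O' && data[pos + 1] == 'g'
                && data[pos + 2] == 'g' && data[pos + 3] == 'S') {
            // Bit 0x02 of the header-type byte marks the first page of a stream.
            boolean beginsStream = (data[pos + 5] & 0x02) != 0;
            int segments = data[pos + 26] & 0xFF;
            int payloadStart = pos + 27 + segments;
            if (payloadStart > data.length) {
                break; // truncated page
            }
            // The payload length is the sum of the segment table entries.
            int payloadLength = 0;
            for (int i = 0; i < segments; i++) {
                payloadLength += data[pos + 27 + i] & 0xFF;
            }
            if (beginsStream) {
                if (startsWith(data, payloadStart, OPUS)) {
                    types.add("audio/opus");
                } else if (startsWith(data, payloadStart, VORBIS)) {
                    types.add("audio/vorbis");
                } else {
                    types.add("application/ogg"); // codec unknown to this sketch
                }
            }
            pos = payloadStart + payloadLength;
        }
        return types;
    }

    private static boolean startsWith(byte[] data, int offset, byte[] prefix) {
        if (offset + prefix.length > data.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[offset + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Because every stream's first page is inspected, the result does not depend on 
which stream happens to come first in the file, which is the property that 
magic at a fixed offset cannot give you.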

> Content of Ogg file with Opus encoded content not correctly recognized
> ----------------------------------------------------------------------
>
>                 Key: TIKA-3973
>                 URL: https://issues.apache.org/jira/browse/TIKA-3973
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 2.7.0
>            Reporter: Adam Bialas
>            Priority: Major
>         Attachments: speech_output.ogg
>
>
> We are using tika-core:2.7.0 for file content detection. We have an Ogg file 
> that uses the Opus audio codec (see attachment). When we try to detect content 
> with metadata:
>  
> {code:java}
> Metadata metadata = new Metadata(); 
> metadata.set(TikaCoreProperties.RESOURCE_NAME_KEY, 
> FilenameUtils.getName(url));{code}
> this file is recognized as audio/vorbis, which is incorrect. Can you please 
> verify?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
