[ 
https://issues.apache.org/jira/browse/TIKA-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison reassigned TIKA-4612:
---------------------------------

    Assignee: Tim Allison

> Some mp3 files are detected as audio/x-aac instead of audio/mpeg
> ----------------------------------------------------------------
>
>                 Key: TIKA-4612
>                 URL: https://issues.apache.org/jira/browse/TIKA-4612
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 2.9.0, 3.2.3
>            Reporter: V. S.
>            Assignee: Tim Allison
>            Priority: Major
>         Attachments: test.mp3
>
>
> When the attached test.mp3 file is passed to Tika.detect, _all versions since 
> Tika 2.9.0_ incorrectly report "audio/x-aac" instead of "audio/mpeg". Tika 
> 2.8.0 correctly reports "audio/mpeg".
> I suspect this is caused by the priority setting here, but I do not fully 
> understand how it works:
> [https://github.com/apache/tika/blob/main/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml#L6166|https://github.com/apache/tika/blob/3.2.3/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml#L6166]
> Note that I can only supply the first 1024 bytes of the MP3 file for legal 
> reasons. However, this seems to be enough to reproduce the detection result.
> This error has occurred with about 30% of the MP3 files we were processing.
>  
> Other tools correctly identify the file as MP3, e.g.:
> {{$ file test.mp3 }}
> {{test.mp3: Audio file with ID3 version 2.3.0, contains:\012- MPEG ADTS, 
> layer III, v2,  64 kbps, 16 kHz, JntStereo}}
>  
> Minimal test program:
> {{package com.example;}}
> {{import org.apache.tika.Tika;}}
> {{import java.io.FileInputStream;}}
> {{import java.io.IOException;}}
> {{public class TikaTest {}}
> {{  public static void main(String args[]) {}}
> {{    Tika tika = new Tika();}}
> {{    }}
> {{    try (FileInputStream fis = new FileInputStream("test.mp3")) {}}
> {{      System.out.println(tika.detect(fis));}}
> {{    } catch (IOException e) {}}
> {{      e.printStackTrace();}}
> {{    }}}
> {{  }}}
> {{}}}
>  
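
For anyone triaging similar files, the following is a rough sketch of how one might inspect the bytes by hand. It is not Tika's detection logic, and the class and method names (Id3FrameProbe, id3v2Length, classifyFrame) are made up for illustration. It assumes the file starts with an ID3v2 tag, as the `file` output in the report indicates, skips the tag using its synchsafe size field, and then reads the two layer bits of the following frame header. MPEG audio Layer III uses layer bits 01, while ADTS AAC uses 00, so the two share the same sync pattern and differ only in those bits. The sketch ignores the optional ID3v2.4 footer and does not validate the rest of the frame header.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Id3FrameProbe {

    /** Returns the ID3v2 tag length (10-byte header plus body), or 0 if there is no tag. */
    public static int id3v2Length(byte[] b) {
        if (b.length < 10 || b[0] != 'I' || b[1] != 'D' || b[2] != '3') {
            return 0;
        }
        // The size field is "synchsafe": four bytes carrying 7 bits each.
        int size = ((b[6] & 0x7F) << 21) | ((b[7] & 0x7F) << 14)
                 | ((b[8] & 0x7F) << 7) | (b[9] & 0x7F);
        return 10 + size;
    }

    /** Classifies the two bytes at off by their sync and layer bits (illustrative labels). */
    public static String classifyFrame(byte[] b, int off) {
        if (off < 0 || off + 1 >= b.length) {
            return "unknown"; // offset lies outside the available bytes
        }
        int b0 = b[off] & 0xFF;
        int b1 = b[off + 1] & 0xFF;
        if (b0 != 0xFF || (b1 & 0xE0) != 0xE0) {
            return "unknown"; // no 11-bit frame sync at this offset
        }
        int layer = (b1 >> 1) & 0x03;
        switch (layer) {
            case 0:  return "adts-aac-like"; // layer bits 00, as in ADTS AAC
            case 1:  return "mpeg-layer3";
            case 2:  return "mpeg-layer2";
            default: return "mpeg-layer1";
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Id3FrameProbe <file>");
            return;
        }
        byte[] head = Files.readAllBytes(Paths.get(args[0]));
        int off = id3v2Length(head);
        System.out.println("audio expected at offset " + off + ": " + classifyFrame(head, off));
    }
}
```

If the ID3v2 tag is longer than the bytes available (for example with a truncated sample), classifyFrame returns "unknown" because the computed offset falls outside the buffer.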



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
