[ 
https://issues.apache.org/jira/browse/TIKA-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731249#comment-17731249
 ] 

Hudson commented on TIKA-4060:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1103 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1103/])
TIKA-4060 Test AAC files, based on testWAV.wav, one without ID3, one with dummy 
ID3 values (nick: 
[https://github.com/apache/tika/commit/500900d67ede02e87440caa9f67501d3fe59b770])
* (add) 
tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/test-documents/testAACid3.aac
* (add) 
tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/resources/test-documents/testAAC.aac


> Add magic to audio/aac in tika-mimetypes.xml
> --------------------------------------------
>
>                 Key: TIKA-4060
>                 URL: https://issues.apache.org/jira/browse/TIKA-4060
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Gregory Lepore
>            Priority: Minor
>             Fix For: 2.8.1
>
>         Attachments: 
> 067aece423d8694a891a61a45ac0e870914bc1314ef510ac40b36ca3397843ef, 
> cb1bec08898db7a733b42ac44bdd76b6177cd3a07a2435a83fd99b7453d564d1
>
>
> Currently tika-mimetypes only recognizes audio/aac files by file 
> extension. PRONOM recently added support for identifying AAC files, but the 
> signature is tricky. There are two signatures, shown below. In PRONOM format, 
> curly braces mean to look ahead by a number of bytes between the two values 
> for the subsequent pattern.
>  
> The first pattern is fairly basic; the second is the first pattern preceded 
> by an ID3 header of up to 2048 bytes.
>  
> ||Name|Audio Data Transport Stream sig.1|
> ||Description|An FF pattern from BOF with variation of byte stream|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|FF(F0\|F1\|F8\|F9)(40\|41\|44\|45\|48\|49\|4C\|4D\|50\|51\|54\|55\|58\|59\|5C\|5D\|60\|61\|64\|65\|68\|69\|6C\|6D\|70\|71\|80\|81\|84\|85\|88\|89\|8C\|8D\|90\|91\|94\|95\|98\|99\|9C\|9D\|A0\|A1\|A4\|A5\|A8\|A9\|AC\|AD\|B0\|B1)(00\|01\|20\|40\|41\|60\|80\|81\|60\|A0\|C0\|C1\|E0)|
> |
> ||Name|Audio Data Transport Stream sig.2|
> ||Description|ID3 tag variation with variable byte stream|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|494433\{0-2045}FF(F0\|F1\|F8\|F9)(40\|41\|44\|45\|48\|49\|4C\|4D\|50\|51\|54\|55\|58\|59\|5C\|5D\|60\|61\|64\|65\|68\|69\|6C\|6D\|70\|71\|80\|81\|84\|85\|88\|89\|8C\|8D\|90\|91\|94\|95\|98\|99\|9C\|9D\|A0\|A1\|A4\|A5\|A8\|A9\|AC\|AD\|B0\|B1)(00\|01\|20\|40\|41\|60\|80\|81\|60\|A0\|C0\|C1\|E0)|
> |
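
As an illustration of the two signatures quoted above, here is a minimal Python sketch of how they could be checked against the start of a file. It is not Tika's detector or PRONOM/DROID's matching engine; the names BYTE2, BYTE3, BYTE4, _adts_at and looks_like_aac are hypothetical, and the byte sets are copied from the Value rows above (the repeated 60 in the last group collapses in the set).

```python
# Allowed values for the three bytes that follow the leading FF,
# copied from the "Value" rows of the two signatures.
BYTE2 = set(bytes.fromhex("F0 F1 F8 F9"))
BYTE3 = set(bytes.fromhex(
    "40 41 44 45 48 49 4C 4D 50 51 54 55 58 59 5C 5D "
    "60 61 64 65 68 69 6C 6D 70 71 80 81 84 85 88 89 "
    "8C 8D 90 91 94 95 98 99 9C 9D A0 A1 A4 A5 A8 A9 "
    "AC AD B0 B1"
))
BYTE4 = set(bytes.fromhex("00 01 20 40 41 60 80 81 A0 C0 C1 E0"))


def _adts_at(data: bytes, pos: int) -> bool:
    """Return True if the four-byte FF pattern starts at data[pos]."""
    return (
        len(data) >= pos + 4
        and data[pos] == 0xFF
        and data[pos + 1] in BYTE2
        and data[pos + 2] in BYTE3
        and data[pos + 3] in BYTE4
    )


def looks_like_aac(data: bytes) -> bool:
    """Check the start of a file against sig.1 and sig.2."""
    # sig.1: the FF pattern sits at offset 0.
    if _adts_at(data, 0):
        return True
    # sig.2: "ID3" (49 44 33), then a gap of 0 to 2045 bytes,
    # then the same FF pattern.
    if data[:3] == b"ID3":
        return any(_adts_at(data, 3 + gap) for gap in range(0, 2046))
    return False
```

With this sketch, looks_like_aac(bytes.fromhex("FFF15080")) matches sig.1, and the same four bytes placed after "ID3" plus up to 2045 filler bytes match sig.2. A gap of 2046 bytes falls outside the {0-2045} range and does not match.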



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
