[ https://issues.apache.org/jira/browse/TIKA-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17726220#comment-17726220 ]
Tim Allison commented on TIKA-4054:
-----------------------------------

These look great, [~g...@rhobard.com]. Would you be able to add proposed mime-types for each? Thank you!

> Add various file identifications to reduce application/octet-stream
> -------------------------------------------------------------------
>
>                 Key: TIKA-4054
>                 URL: https://issues.apache.org/jira/browse/TIKA-4054
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Gregory Lepore
>            Priority: Major
>
> Catch all task for various format identification data which are currently
> being identified as application/octet-stream. Most data is from PRONOM.
>
> SPSS Data File
> ||External signatures|File extension: sav|
> ||Internal signatures|
> ||Name|SPSS Data File|
> ||Description|BOF: $FL2@(#)|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|24464C3240282329|
> |
> |
>
> Amiga Disk File
>
> ||External signatures|File extension: adf|
> ||Internal signatures|
> ||Name|Amiga Disk File|
> ||Description|BOF: ‘DOS’ followed by ‘00\|01\|02\|03\|04\|05\|06\|07’
> depending on the format of the disk. More information on the internal
> signature can be found here: http://lclevy.free.fr/adflib/adf_info.html#p41|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|444F53(00\|01\|02\|03\|04\|05\|06\|07)|
> |
> |
>
> JEOL NMR Spectroscopy
> ||External signatures|File extension: jdf|
> ||Internal signatures| |
> ||Name|JDF NMR Spectroscopy big endian|
> ||Description|Big Endian: BOF: 4A454F4C2E4E4D52 (JEOL.NMR)|
> ||Byte sequences|
>
> |
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|4A454F4C2E4E4D52|
> | | |
> ||Name|JDF little endian|
> ||Description|Little Endian: 524D4E2E4C4F454A (RMN.LOEJ)|
> ||Byte sequences| |
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|524D4E2E4C4F454A|
>
> ASPRS Lidar Data Exchange Format
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures|
> ||Name|ASPRS Lidar Data Exchange Format 1.2|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number
> 1.2|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0102\{78}[00:99]|
> |
> |
>
> ASPRS Lidar Data Exchange Format v1.1
>
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures|
> ||Name|ASPRS Lidar Data Exchange Format 1.1|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number
> 1.1|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0101\{78}[00:99]|
> |
> |
>
> 3D Studio
> ||External signatures|File extension: 3ds|
> ||Internal signatures|
> ||Name|3D Studio (V1)|
> ||Description|Primary chunk ID, chunk length, version subchunk ID, chunk
> length, version, 3D-editor chunk ID.|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order|Little-endian|
> ||Value|4D4D\{4}02000A000000(03\|04)\{3}3D3D|
> |
> ||Name|3D Studio (V2)|
> ||Description|Primary chunk ID, chunk length, 3D-editor chunk ID|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|4D4D\{4}3D3D|
> |
> |
>
> TAP (ZX Spectrum)
> ||External signatures|File extension: tap|
> ||Internal signatures|
> ||Name|TAPZX|
> ||Description|…\{20}ÿ|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|130000\{20}FF|
> |
> |
>
> Sibelius
> ||External signatures|File extension: sib|
> ||Internal signatures|
> ||Name|Sibelius|
> ||Description|Absolute from beginning of file, magic bytes: .SIBELIUS|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|0F534942454C495553|
> |
> |
>
> Portable Sound Format
> ||External signatures|File extension: psf
> File extension: psf1
> File extension: psflib
> File extension: minipsf
> File extension: minipsf1
> File extension: gsf
> File extension: gsflib
> File extension: minigsf|
> ||Internal signatures|
> ||Name|Portable Sound Format|
> ||Description|BOF: PSFx, where x represents one of the following values for
> which PSF has been adapted 4th byte: 0x01: Playstation (PSF1) 0x02:
> Playstation 2 (PSF2) 0x11: Sega Saturn (SSF) 0x12: Sega Dreamcast (DSF) 0x13:
> Sega Genesis 0x21: Nintendo 64 (USF) 0x22: GameBoy Advance (GSF) 0x23: Super
> NES (SNSF) 0x41: Capcom QSound (QSF) Format description:
> http://web.archive.org/web/20140125155137/http://wiki.neillcorlett.com/PSFFormat|
> ||Byte sequences|
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|505346(01\|02\|11\|12\|13\|21\|22\|23\|41)|
> |
> |

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
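
For anyone wanting to sanity-check the byte sequences quoted in the issue against sample files before they are turned into Tika magic entries, here is a minimal Python sketch. It is not Tika code and not how Tika matches magic. The `SIGNATURES` table and the `identify` function are hypothetical names made up for this illustration, the labels are the PRONOM names from the issue rather than proposed mime-types, and only signatures anchored at offset 0 are included. For the two LAS entries, only the leading `LASF{20}0102` / `LASF{20}0101` part is checked; the trailing `{78}[00:99]` part is left out.

```python
import re
from typing import Optional

# Hypothetical lookup table: (PRONOM name from the issue, pattern anchored at BOF).
# re.match() only matches at position 0, which corresponds to "Absolute from BOF, offset 0".
SIGNATURES = [
    ("SPSS Data File", re.compile(rb"\$FL2@\(#\)")),                 # 24464C3240282329
    ("Amiga Disk File", re.compile(rb"DOS[\x00-\x07]")),             # 444F53(00|..|07)
    ("JDF NMR Spectroscopy big endian", re.compile(rb"JEOL\.NMR")),  # 4A454F4C2E4E4D52
    ("JDF little endian", re.compile(rb"RMN\.LOEJ")),                # 524D4E2E4C4F454A
    # Leading part only; the trailing {78}[00:99] from the issue is not checked here.
    ("ASPRS Lidar Data Exchange Format 1.2",
     re.compile(rb"LASF.{20}\x01\x02", re.DOTALL)),
    ("ASPRS Lidar Data Exchange Format 1.1",
     re.compile(rb"LASF.{20}\x01\x01", re.DOTALL)),
    ("Sibelius", re.compile(rb"\x0fSIBELIUS")),                      # 0F534942454C495553
    ("Portable Sound Format",
     re.compile(rb"PSF[\x01\x02\x11\x12\x13\x21\x22\x23\x41]")),     # 505346(01|02|...|41)
]


def identify(data: bytes) -> Optional[str]:
    """Return the first signature name whose pattern matches at the start of data."""
    for name, pattern in SIGNATURES:
        if pattern.match(data):
            return name
    return None


if __name__ == "__main__":
    # The SPSS value from the issue, followed by arbitrary payload bytes.
    print(identify(bytes.fromhex("24464C3240282329") + b"payload"))
```

Reading the first few hundred bytes of a file and passing them to `identify` is enough for every entry in this table, since the longest pattern here spans 26 bytes.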