[ https://issues.apache.org/jira/browse/TIKA-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-4054.
-------------------------------
    Fix Version/s: 2.8.1
       Resolution: Fixed

Thank you [~g...@rhobard.com]!

> Add various file identifications to reduce application/octet-stream
> -------------------------------------------------------------------
>
>                 Key: TIKA-4054
>                 URL: https://issues.apache.org/jira/browse/TIKA-4054
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Gregory Lepore
>            Priority: Major
>             Fix For: 2.8.1
>
>
> Catch all task for various format identification data which are currently
> being identified as application/octet-stream. Most data is from PRONOM.
>
> SPSS Data File
> application/x-spss-sav
> ||External signatures|File extension: sav|
> ||Internal signatures||
> ||Name|SPSS Data File|
> ||Description|BOF: $FL2@(#)|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|24464C3240282329|
>
> Amiga Disk File
> application/x-amiga-disk-format
> ||External signatures|File extension: adf|
> ||Internal signatures||
> ||Name|Amiga Disk File|
> ||Description|BOF: ‘DOS’ followed by ‘00\|01\|02\|03\|04\|05\|06\|07’
> depending on the format of the disk. More information on the internal
> signature can be found here: [http://lclevy.free.fr/adflib/adf_info.html#p41]|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|444F53(00\|01\|02\|03\|04\|05\|06\|07)|
>
> JEOL NMR Spectroscopy
> chemical/x-jeol-jdf
> ||External signatures|File extension: jdf|
> ||Internal signatures| |
> ||Name|JDF NMR Spectroscopy big endian|
> ||Description|Big Endian: BOF: 4A454F4C2E4E4D52 (JEOL.NMR)|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|4A454F4C2E4E4D52|
> | | |
> ||Name|JDF little endian|
> ||Description|Little Endian: 524D4E2E4C4F454A (RMN.LOEJ)|
> ||Byte sequences| |
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|524D4E2E4C4F454A|
>
> ASPRS Lidar Data Exchange Format
> no mimetype found
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures||
> ||Name|ASPRS Lidar Data Exchange Format 1.2|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number
> 1.2|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0102\{78}[00:99]|
>
> ASPRS Lidar Data Exchange Format v1.1
> no mimetype found
> ||External signatures|File extension: las
> File extension: laz|
> ||Internal signatures||
> ||Name|ASPRS Lidar Data Exchange Format 1.1|
> ||Description|ASCII header: LASF, followed after 20 bytes by version number
> 1.1|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order| |
> ||Value|4C415346\{20}0101\{78}[00:99]|
>
> 3D Studio
> image/x-3ds
> ||External signatures|File extension: 3ds|
> ||Internal signatures||
> ||Name|3D Studio (V1)|
> ||Description|Primary chunk ID, chunk length, version subchunk ID, chunk
> length, version, 3D-editor chunk ID.|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Byte order|Little-endian|
> ||Value|4D4D\{4}02000A000000(03\|04)\{3}3D3D|
> ||Name|3D Studio (V2)|
> ||Description|Primary chunk ID, chunk length, 3D-editor chunk ID|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|4D4D\{4}3D3D|
>
> TAP (ZX Spectrum)
> [application/x-spectrum-tzx|https://www.digipres.org/formats/mime-types/#application/x-spectrum-tzx]
> ||External signatures|File extension: tap|
> ||Internal signatures||
> ||Name|TAPZX|
> ||Description|…\{20}ÿ|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|130000\{20}FF|
>
> Sibelius
> no mimetype found
> ||External signatures|File extension: sib|
> ||Internal signatures||
> ||Name|Sibelius|
> ||Description|Absolute from beginning of file, magic bytes: .SIBELIUS|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|0F534942454C495553|
>
> Portable Sound Format
> no mimetype found
> ||External signatures|File extension: psf
> File extension: psf1
> File extension: psflib
> File extension: minipsf
> File extension: minipsf1
> File extension: gsf
> File extension: gsflib
> File extension: minigsf|
> ||Internal signatures||
> ||Name|Portable Sound Format|
> ||Description|BOF: PSFx, where x represents one of the following values for
> which PSF has been adapted 4th byte: 0x01: Playstation (PSF1) 0x02:
> Playstation 2 (PSF2) 0x11: Sega Saturn (SSF) 0x12: Sega Dreamcast (DSF) 0x13:
> Sega Genesis 0x21: Nintendo 64 (USF) 0x22: GameBoy Advance (GSF) 0x23: Super
> NES (SNSF) 0x41: Capcom QSound (QSF) Format description:
> [http://web.archive.org/web/20140125155137/http://wiki.neillcorlett.com/PSFFormat]|
> ||Byte sequences||
> ||Position type|Absolute from BOF|
> ||Offset|0|
> ||Maximum Offset|0|
> ||Byte order| |
> ||Value|505346(01\|02\|11\|12\|13\|21\|22\|23\|41)|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
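
For readers who want to try the "Value" rows quoted in the issue, here is a minimal Python sketch of how such PRONOM-style byte sequences could be checked against the start of a file. It is not how Tika implements detection. The names `compile_signature`, `identify` and `SIGNATURES` are hypothetical, and the syntax reading is an assumption based only on the values in this issue: hex pairs are literal bytes, `{n}` skips n arbitrary bytes, `(a|b)` is an alternation, and `[lo:hi]` is an inclusive byte range. All signatures here are "Absolute from BOF" at offset 0, so each pattern is anchored at the first byte.

```python
import re

# One token of a signature value: {n} skip, (a|b) alternation,
# [lo:hi] byte range, or a run of literal hex pairs.
TOKEN = re.compile(
    r"\{(?P<skip>\d+)\}"
    r"|\((?P<alt>[0-9A-Fa-f|]+)\)"
    r"|\[(?P<lo>[0-9A-Fa-f]{2}):(?P<hi>[0-9A-Fa-f]{2})\]"
    r"|(?P<hex>(?:[0-9A-Fa-f]{2})+)"
)


def _literal(hex_str):
    """Turn a hex string into a regex-escaped bytes literal."""
    return re.escape(bytes.fromhex(hex_str))


def compile_signature(value):
    """Compile a signature value into a bytes regex (assumed syntax, see above)."""
    value = value.replace("\\", "")  # drop Jira escapes such as \| and \{
    parts = []
    pos = 0
    while pos < len(value):
        m = TOKEN.match(value, pos)
        if m is None:
            raise ValueError(f"unsupported signature syntax at offset {pos}: {value!r}")
        if m["skip"] is not None:
            parts.append(b".{" + m["skip"].encode("ascii") + b"}")
        elif m["alt"] is not None:
            options = [_literal(a) for a in m["alt"].split("|")]
            parts.append(b"(?:" + b"|".join(options) + b")")
        elif m["lo"] is not None:
            parts.append(b"[" + _literal(m["lo"]) + b"-" + _literal(m["hi"]) + b"]")
        else:
            parts.append(_literal(m["hex"]))
        pos = m.end()
    # DOTALL so that a skipped byte may also be a newline (0x0A).
    return re.compile(b"".join(parts), re.DOTALL)


# Values copied from the issue text; the labels are the "Name" rows.
SIGNATURES = {
    "SPSS Data File": r"24464C3240282329",
    "Amiga Disk File": r"444F53(00\|01\|02\|03\|04\|05\|06\|07)",
    "JDF NMR Spectroscopy big endian": r"4A454F4C2E4E4D52",
    "ASPRS Lidar Data Exchange Format 1.2": r"4C415346\{20}0102\{78}[00:99]",
    "ASPRS Lidar Data Exchange Format 1.1": r"4C415346\{20}0101\{78}[00:99]",
    "3D Studio (V2)": r"4D4D\{4}3D3D",
    "Sibelius": r"0F534942454C495553",
    "Portable Sound Format": r"505346(01\|02\|11\|12\|13\|21\|22\|23\|41)",
}

COMPILED = {name: compile_signature(value) for name, value in SIGNATURES.items()}


def identify(header):
    """Return the names of all signatures that match at offset 0 of header."""
    return [name for name, pattern in COMPILED.items() if pattern.match(header)]
```

A caller would read the first few hundred bytes of a file and pass them to `identify`; for example, `identify(b"$FL2@(#) SPSS DATA FILE")` returns `["SPSS Data File"]`. Short signatures such as the 3D Studio (V2) one match very little data, so a real detector would likely combine them with the file-extension hints listed under "External signatures".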