[ https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17816191#comment-17816191 ]
Tim Allison commented on TIKA-3784:
-----------------------------------

[~nick] (and cc [~tom_1st] from TIKA-4194), I agree that parsing these things would probably be best done with a container detector.

When I run ASN1Dump on one of the p12 files, I get this:
{noformat}
Sequence
    Integer(3)
    Sequence
        ObjectIdentifier(1.2.840.113549.1.7.1)
        Tagged [CONTEXT 0]
            DER Octet String[2603]
    Sequence
        Sequence
            Sequence
                ObjectIdentifier(2.16.840.1.101.3.4.2.3)
                NULL
            DER Octet String[64]
        DER Octet String[64]
        Integer(1000000)
{noformat}

Is there anything in there I can use to detect p12?

> Detector returns "application/x-x509-key" when scanning a .p12 file
> -------------------------------------------------------------------
>
>                 Key: TIKA-3784
>                 URL: https://issues.apache.org/jira/browse/TIKA-3784
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 1.26
>            Reporter: Matthias Hofbauer
>            Priority: Critical
>
> We are using Tika to check whether the MIME type implied by a file's
> extension matches the MIME type detected from the file's content.
> After our upgrade from tika-core 1.22 to 1.26, our logic no longer works
> for certificates of type .p12, .pfx, .cer, .der.
> For the .p12 and .pfx extensions the MIME type is "application/x-pkcs12", but
> the Tika detector returns "application/x-x509-key" instead.
> After checking tika-mimetype.xml and comparing it to my .p12 file, I found
> the following MIME magic, which explains why I got these types back.
> {code:xml}
>   <mime-type type="application/x-x509-key;format=der">
>     <sub-class-of type="application/x-x509-key"/>
>     <!-- These are just a bunch of magic integers as defined by the key format... -->
>     <!-- Always seem to have a version integer as their first entry, -->
>     <!-- normally 00, 01 or 02, check for that -->
>     <magic priority="40">
>       <match value="0x3081FF020100" type="string"
>              mask="0xFFFF00FFFFFC" offset="0"/>
>       <match value="0x3082FFFF020100" type="string"
>              mask="0xFFFF0000FFFFFC" offset="0"/>
>     </magic>
>   </mime-type>
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
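
For illustration, here is a rough sketch of why the quoted DER key magic can also fire on a .p12 file. It is a simplified model of masked magic matching, not Tika's actual detector code. It assumes that both the file bytes and the rule value are ANDed with the mask before comparison. The function names and the sample header bytes (including the length 0x0A5B) are made up for the example. The only facts taken from the thread are the two value/mask pairs and the version integer 3 at the start of the dump.

```python
def masked_match(data: bytes, value: bytes, mask: bytes, offset: int = 0) -> bool:
    """Return True if data at offset equals value on the bits selected by mask.

    Assumption: both sides are masked before comparing (simplified model).
    """
    window = data[offset:offset + len(value)]
    if len(window) < len(value):
        return False
    return all((d & m) == (v & m) for d, v, m in zip(window, value, mask))


# The two <match> rules from the quoted tika-mimetypes entry (value, mask).
DER_KEY_RULES = [
    (bytes.fromhex("3081FF020100"), bytes.fromhex("FFFF00FFFFFC")),
    (bytes.fromhex("3082FFFF020100"), bytes.fromhex("FFFF0000FFFFFC")),
]


def matches_der_key_magic(data: bytes) -> bool:
    """Hypothetical helper: does any of the quoted rules match the header?"""
    return any(masked_match(data, value, mask) for value, mask in DER_KEY_RULES)


# Hypothetical p12 header: SEQUENCE (0x30) with a two-byte length (0x82 0x0A 0x5B,
# length value chosen arbitrarily), then INTEGER (0x02) of length 1 with value 3.
p12_header = bytes.fromhex("30820A5B020103")

# The last mask byte is 0xFC, so versions 0 to 3 all reduce to 0x00 after masking.
print(matches_der_key_magic(p12_header))
```

Under these assumptions the mask byte 0xFC ignores the two lowest bits of the version, so version 3 is accepted along with 0, 1 and 2, while a version of 4 would not match.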