[ https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372133#comment-14372133 ]
Tyler Palsulich commented on TIKA-1379:
---------------------------------------

The file is still detected as text/html. Should we update the magic to detect it as xml?

> error in Tika().detect for xml files with xades signature
> ---------------------------------------------------------
>
>                 Key: TIKA-1379
>                 URL: https://issues.apache.org/jira/browse/TIKA-1379
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 1.4
>            Reporter: Alessandro De Angelis
>              Labels: new-parser
>             Fix For: 1.8
>
>
> we tried to get the mime type of an xml file with xades signature embedded.
> the result is "text/html" and not the expected "text/xml" or "application/xml".
> here is an example of the xml file:
> {code}
> <VERBALI ad_cod="D69017" batch_id="0" cds_cod="D69" data_app="2013-09-23">
> <VERBALE Id="1" tipologia="Verbale esame">
> <VERB_NUM>00094853 0003 2</VERB_NUM>
> <DATA_APP>2013-09-23</DATA_APP>
> <DATA_ESA>2013-09-23</DATA_ESA>
> <AD_COD>D69017</AD_COD>
> <AD>FILOSOFIA DELLA SCIENZA</AD>
> <CDS_COD>D69</CDS_COD>
> <CDS>TEATRO E ARTI VISIVE</CDS>
> <TIPO_ESA></TIPO_ESA>
> <MAT>1233456</MAT>
> <NOME>PAOLINO</NOME>
> <COGNOME>PAPERINO</COGNOME>
> <VOTO>23.0</VOTO>
> <VOTODECOD>23</VOTODECOD>
> <CAUSALE></CAUSALE>
> <TIPO_MODULO></TIPO_MODULO>
> <IMG_PATH></IMG_PATH>
> <AA_SES_ID>2012</AA_SES_ID>
> <AD_CFU>6.0</AD_CFU>
> <NOTA></NOTA>
> <ATENEO>99999</ATENEO>
> <ATENEO_DES>جامعة البندقية - TEST</ATENEO_DES>
> <TIPO_DOCUMENTO>Verbale_XXXX3</TIPO_DOCUMENTO>
> <TITOLARE_PROCEDIMENTO>QUI QUO QUA</TITOLARE_PROCEDIMENTO>
> <AD_STU_COD>D69017</AD_STU_COD>
> <AD_STU>FILOSOFIA DELLA SCIENZA</AD_STU>
> <CDS_STU_COD>D69</CDS_STU_COD>
> <CDS_STU>TEATRO E ARTI VISIVE</CDS_STU>
> <DOCENTE>QUI QUO QUA</DOCENTE>
> <DATA_DOCUMENTO>26-09-2013 09:55:53 CEST(+0200)</DATA_DOCUMENTO>
> <SOFTWARE_DI_CREAZIONE>
> <NOME>XXXX3</NOME>
> <VERSIONE>11.09.03</VERSIONE>
> </SOFTWARE_DI_CREAZIONE>
> </VERBALE><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#" >
Id="sig0000087443000008748200000000010000048377"> > <ds:SignedInfo> > <ds:CanonicalizationMethod > Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:CanonicalizationMethod> > <ds:SignatureMethod > Algorithm="http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"></ds:SignatureMethod> > <ds:Reference URI=""> > <ds:Transforms> > <ds:Transform Algorithm="http://www.w3.org/2002/06/xmldsig-filter2"> > <dsig-xpath:XPath > xmlns:dsig-xpath="http://www.w3.org/2002/06/xmldsig-filter2" > Filter="subtract">/descendant::ds:Signature</dsig-xpath:XPath> > </ds:Transform> > <ds:Transform Algorithm="http://www.w3.org/TR/1999/REC-xslt-19991116"> > <xsl:stylesheet xmlns:kion="http://www.kion.it/webesse3/multilingua" > xmlns:xsl="http://www.w3.org/1999/XSL/Transform" > exclude-result-prefixes="kion" version="1.0"> > <kion:ml module="FirmaDigitale" target="kion"></kion:ml> > <xsl:output method="xml"></xsl:output> > <xsl:variable name="mostra_ad_figlie" select="1"></xsl:variable> > <xsl:variable name="verbale_root" > select="/VERBALI/VERBALE"></xsl:variable> > <xsl:variable name="sostituzione_root" > select="/VERBALI/VERBALE/SOSTITUZIONE_DOCUMENTO"></xsl:variable> > <xsl:variable name="RAGG_ROOT" > select="/VERBALI/VERBALE/RAGGRUPPAMENTO"></xsl:variable> > <xsl:variable name="COMM_ROOT" > select="/VERBALI/VERBALE/COMMISSIONE"></xsl:variable> > > <xsl:template match="/"> > <html> > <head> > <meta content="text/html;charset=UTF-8" > http-equiv="Content-Type"></meta> > <xsl:choose> > <xsl:when > test="$sostituzione_root"> > <title>Dichiarazione > conformità Verbale Esame</title> > </xsl:when> > <xsl:otherwise> > <title>Verbalizzazione > esame</title> > </xsl:otherwise> > </xsl:choose> > <style type="text/css"> > td {font-family: Arial; font-size:10pt;} > div {font-family: Arial; font-size:10pt;} > pre {font-family: Arial; font-size:10pt;} > </style> > </head> > <body> > <table> > <xsl:choose> > <xsl:when > test="$sostituzione_root"> > <tr><td align="center" > 
colspan="2"><big><strong><xsl:value-of > select="$verbale_root/ATENEO_DES"></xsl:value-of></strong></big><br></br></td></tr> > <tr><td align="center" > colspan="2"><big><strong>DICHIARAZIONE DI > CONFORMITÀ</strong></big><br></br></td></tr> > <tr><td align="left" > colspan="2"><strong>Il sottoscritto <xsl:value-of > select="$verbale_root/TITOLARE_PROCEDIMENTO"></xsl:value-of>, docente di > <xsl:value-of select="$verbale_root/AD"></xsl:value-of></strong><br></br> > </td> > </tr> > <tr> > <td> </td> > <td> > <xsl:if > test="$sostituzione_root/MOTIVAZIONE"> > > PREMESSO CHE > > <div> </div> > > <pre><xsl:value-of > select="$sostituzione_root/MOTIVAZIONE"></xsl:value-of></pre> > > <div> </div> > > </xsl:if> > </td></tr> > <tr><td></td> > <td> > > <strong>DICHIARA</strong> > <div> > </div> > <div>- > <xsl:value-of > select="$sostituzione_root/DESCRIZIONE"></xsl:value-of>(**)</div> > <div>- > che il verbale in calce, firmato digitalmente dal sottoscritto, sostituisce a > tutti gli effetti di legge quello precedentemente firmato, indicato nella > linea precedente e conservato a norma</div> > <div>- > A maggior tutela del firmatario viene riportata la versione originale e gli > estremi dell'ultima versione firmata</div> > <div> > </div> > </td> > </tr> > <tr><td colspan="2"> > > <div > align="center"><table border="1" > cellpadding="10px"><tr><td><xsl:call-template > name="tabella_verbale"></xsl:call-template></td></tr></table></div> > </td></tr> > </xsl:when> > <xsl:otherwise> > <tr><td > colspan="2"><xsl:call-template > name="tabella_verbale"></xsl:call-template></td></tr> > </xsl:otherwise> > </xsl:choose> > <tr><td colspan="2"> > <xsl:if > test="$sostituzione_root"> > > <br></br><xsl:call-template > name="intestazione_rifirma_verbale"></xsl:call-template> > </xsl:if> > <xsl:call-template > name="produzione_documento"> > <xsl:with-param name="root" > select="$verbale_root"></xsl:with-param> > </xsl:call-template> > </td></tr> > </table> > </body> > </html> > 
</xsl:template> > > <xsl:template name="produzione_documento"> > <xsl:param name="root"></xsl:param> > <xsl:if test="$root/DATA_DOCUMENTO and > $root/SOFTWARE_DI_CREAZIONE/NOME and $root/SOFTWARE_DI_CREAZIONE/VERSIONE"> > <br></br><div><small><i>Documento Generato in data: > <xsl:value-of select="$root/DATA_DOCUMENTO"></xsl:value-of> dal sistema > <xsl:value-of select="$root/SOFTWARE_DI_CREAZIONE/NOME"></xsl:value-of> con > versione <xsl:value-of > select="$root/SOFTWARE_DI_CREAZIONE/VERSIONE"></xsl:value-of></i></small></div> > </xsl:if> > </xsl:template> > > <xsl:template name="intestazione_rifirma_verbale"> > <br></br> > <table border="0"> > <tr> > <td><strong>(**)Estremi verbale > sostituito</strong></td> > </tr> > <xsl:choose> > <xsl:when > test="$sostituzione_root/DATA_DOCUMENTO_SOSTITUITO"> > <tr> > <td><i>Data Documento: > </i><xsl:value-of > select="$sostituzione_root/DATA_DOCUMENTO_SOSTITUITO"></xsl:value-of></td> > </tr> > </xsl:when> > <xsl:when > test="$sostituzione_root/DATA_FIRMA_DOCUMENTO_SOSTITUITO"> > <tr> > <td><i>Data Firma Documento: > </i>xsl:value-of > select="$sostituzione_root/DATA_FIRMA_DOCUMENTO_SOSTITUITO"/></td> > </tr> > </xsl:when> > </xsl:choose> > <tr> > <td><i>Imponta Documento (<xsl:value-of > select="$sostituzione_root/HASH_DOCUMENTO_SOSTITUITO/@tipo"></xsl:value-of>): > </i><xsl:value-of > select="$sostituzione_root/HASH_DOCUMENTO_SOSTITUITO"></xsl:value-of></td> > </tr> > </table> > </xsl:template> > > <xsl:template name="tabella_verbale"> > <strong> > <div align="center"><big><xsl:value-of > select="$verbale_root/ATENEO_DES"></xsl:value-of></big></div><br></br> > <div align="center"><big>VERBALE D'ESAME</big></div><br></br> > <div>ATTIVITA' DIDATTICA: <xsl:value-of > select="concat($verbale_root/AD,' > [',$verbale_root/AD_COD,']')"></xsl:value-of></div> > <xsl:if test="$verbale_root/UD_COD != '' "> > <div>UNITA' DIDATTICA: <xsl:value-of > select="concat($verbale_root/UD,' > [',$verbale_root/UD_COD,']')"></xsl:value-of></div> > 
</xsl:if> > </strong> > <div>CORSO: <xsl:value-of select="concat($verbale_root/CDS,' > [',$verbale_root/CDS_COD,']')"></xsl:value-of></div> > <div>APPELLO DEL <xsl:value-of > select="concat(substring($verbale_root/DATA_APP,9,2),'/',substring($verbale_root/DATA_APP,6,2),'/',substring($verbale_root/DATA_APP,1,4))"></xsl:value-of></div> > <div>DOCENTE: <xsl:value-of > select="$verbale_root/DOCENTE"></xsl:value-of></div> > <br></br> > <table border="1"> > <thead> > <tr> > <th>Matricola</th> > <th>Cognome</th> > <th>Nome</th> > <th>Voto<i>(*)</i></th> > <th>CFU</th> > <th>Data esame</th> > <th>Verbale N.</th> > </tr> > </thead> > <tbody> > <tr> > <td><xsl:value-of > select="$verbale_root/MAT"></xsl:value-of></td> > <td><xsl:value-of > select="$verbale_root/COGNOME"></xsl:value-of></td> > <td><xsl:value-of > select="$verbale_root/NOME"></xsl:value-of></td> > <td><xsl:value-of > select="$verbale_root/VOTODECOD"></xsl:value-of></td> > <td><xsl:value-of > select="$verbale_root/AD_CFU"></xsl:value-of></td> > <td><xsl:value-of > select="concat(substring($verbale_root/DATA_ESA,9,2),'/',substring($verbale_root/DATA_ESA,6,2),'/',substring($verbale_root/DATA_ESA,1,4))"></xsl:value-of></td> > <td><xsl:value-of > select="$verbale_root/VERB_NUM"></xsl:value-of></td> > </tr> > <tr> > <td colspan="7"> > Corso di studi [Codice]: > <xsl:value-of select="$verbale_root/CDS_STU"></xsl:value-of> [<xsl:value-of > select="$verbale_root/CDS_STU_COD"></xsl:value-of>] > </td> > </tr> > <tr> > <td colspan="7"> > <xsl:choose> > <xsl:when > test="$mostra_ad_figlie and $RAGG_ROOT"> > <xsl:value-of > select="concat($RAGG_ROOT/TIPO_RAG_DES,' ', $RAGG_ROOT/AD_PADRE_DES,' [', > $RAGG_ROOT/AD_PADRE_COD,'],')"></xsl:value-of> > <xsl:for-each > select="$RAGG_ROOT/FIGLIO"> > > <xsl:sort select="AD_DES"></xsl:sort> > > <xsl:value-of select="concat(' ', AD_DES)"></xsl:value-of> [<xsl:value-of > select="AD_COD"></xsl:value-of>]<xsl:if test="position() != last()">, > </xsl:if> > </xsl:for-each> > </xsl:when> > 
<xsl:otherwise> > Attivita' > didattica [Codice]: > <xsl:value-of > select="$verbale_root/AD_STU"></xsl:value-of> [<xsl:value-of > select="$verbale_root/AD_STU_COD"></xsl:value-of>] > </xsl:otherwise> > </xsl:choose> > </td> > </tr> > <xsl:if test="$verbale_root/UD_COD != '' "> > <tr> > <td colspan="7"> > Unita' didattica > [Codice]: <xsl:value-of select="concat($verbale_root/UD,' > [',$verbale_root/UD_COD,']')"></xsl:value-of> > </td> > </tr> > </xsl:if> > <tr> > <td colspan="7"> > Domande d'esame:<br></br> > <xsl:value-of > select="$verbale_root/DOMANDE"></xsl:value-of> > </td> > </tr> > </tbody> > </table> > <small><i>(*) Codifica di sistema : [voto:<xsl:value-of > select="$verbale_root/VOTO"></xsl:value-of>][causale:<xsl:value-of > select="$verbale_root/CAUSALE"></xsl:value-of>]</i></small> > <xsl:if test="$COMM_ROOT"> > <br></br> > <div><strong>COMMISSIONE</strong></div> > <xsl:for-each select="$COMM_ROOT/DOCENTE"> > <div><xsl:value-of > select="concat(COGNOME,' ',NOME,' [',RUOLO,']')"></xsl:value-of></div> > </xsl:for-each> > </xsl:if> > </xsl:template> > </xsl:stylesheet> > </ds:Transform> > <ds:Transform Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:Transform> > </ds:Transforms> > <ds:DigestMethod > Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"></ds:DigestMethod> > <ds:DigestValue>TJZV1fkO7me+9cwfgO8PB/711aBqCCVEnmpoyesGJ10=</ds:DigestValue> > </ds:Reference> > <ds:Reference URI=""> > <ds:Transforms> > <ds:Transform Algorithm="http://www.w3.org/2002/06/xmldsig-filter2"> > <dsig-xpath:XPath > xmlns:dsig-xpath="http://www.w3.org/2002/06/xmldsig-filter2" > Filter="subtract">/descendant::ds:Signature</dsig-xpath:XPath> > </ds:Transform> > <ds:Transform Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:Transform> > </ds:Transforms> > <ds:DigestMethod > Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"></ds:DigestMethod> > <ds:DigestValue>e7b+kmdfPuly923kod7ZiwXR4/0GZeqDqc/0jBGboLQ=</ds:DigestValue> > </ds:Reference> > <ds:Reference 
Type="http://uri.etsi.org/01903#SignedProperties" > URI="#sig0000087443000008748200000000010000048377SignedProperties"> > <ds:Transforms> > <ds:Transform Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:Transform> > </ds:Transforms> > <ds:DigestMethod > Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"></ds:DigestMethod> > <ds:DigestValue>By8pd2dVbxlVQLBsVTJr4Owak9qZp6uPwL5FWltD39w=</ds:DigestValue> > </ds:Reference> > <ds:Reference URI="#sig0000087443000008748200000000010000048377KeyInfo"> > <ds:Transforms> > <ds:Transform Algorithm="http://www.w3.org/2006/12/xml-c14n11"></ds:Transform> > </ds:Transforms> > <ds:DigestMethod > Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"></ds:DigestMethod> > <ds:DigestValue>avLFaD5rNypP3GLG7z8/k7q99JhjJy7ipVjaN5gbYrs=</ds:DigestValue> > </ds:Reference> > </ds:SignedInfo> > <ds:SignatureValue Id="sigSV0000087443000008748200000000010000048377"> > LKeSxNvEsCI0m+xAOAVkcBAbDdnNhExz+khmn8gYN/OaspYEQXXCfEx5R1eb+Zh30LovGIZpjIYc > r7fAJyA1WJlXvSlXP/XGH99E5WYymCmJhar+uMMCIz1PRXvaXdGNvfVCMNPYgIGQEX8HXFiUG7ha > qwkfhn3xcJWg0RqnJRk= > </ds:SignatureValue> > <ds:KeyInfo Id="sig0000087443000008748200000000010000048377KeyInfo"> > <ds:X509Data> > <ds:X509Certificate> > MIIFWTCCBEGgAwIBAgIDMiomMA0GCSqGSIb3DQEBCwUAMIGDMQswCQYDVQQGEwJJVDEVMBMGA1UE > CgwMSU5GT0NFUlQgU1BBMRQwEgYDVQQFEwswNzk0NTIxMTAwNjEiMCAGA1UECwwZQ2VydGlmaWNh > dG9yZSBBY2NyZWRpdGF0bzEjMCEGA1UEAwwaSW5mb0NlcnQgRmlybWEgUXVhbGlmaWNhdGEwHhcN > MTMwNzA5MDkwNDQyWhcNMTYwNzA5MDAwMDAwWjCBkDELMAkGA1UEBhMCSVQxFTATBgNVBAoMDE5P > TiBQUkVTRU5URTEOMAwGA1UEBAwFQ0VMTEkxDzANBgNVBCoMBk1BVFRFTzEcMBoGA1UEBRMTSVQ6 > Q0xMTVRUNzhFMzBIMTk5SzEUMBIGA1UELhMLMjAxMzUwMTA3MjkxFTATBgNVBAMMDE1hdHRlbyBD > ZWxsaTCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEA0VjYU8AQFOvaNcyp7Xoe8RcfaYtBJwiZ > j2sgmlqbojduPDWxM7LuxT1/qh79LIMQGHljI+fzyWON7skLNNd4VOwiZpTkYTNuIBB1OzBFxWUc > vKida+4+Q/E1HG5RVuzmGs/rsV94YDhWmHOW7B+8c90y/D6L18Y/6/Eil1w7Xs0CAwEAAaOCAkkw > 
ggJFMAkGA1UdEwQCMAAwZAYDVR0gBF0wWzBPBgYrTCQBAQEwRTBDBggrBgEFBQcCARY3aHR0cDov > L3d3dy5maXJtYS5pbmZvY2VydC5pdC9kb2N1bWVudGF6aW9uZS9tYW51YWxpLnBocDAIBgYrTBgB > AQIwLwYIKwYBBQUHAQMEIzAhMAgGBgQAjkYBATALBgYEAI5GAQMCARQwCAYGBACORgEEMCgGA1Ud > CQQhMB8wHQYIKwYBBQUHCQExERgPMTk3ODA1MzAwMDAwMDBaME4GCCsGAQUFBwEBBEIwQDA+Bggr > BgEFBQcwAYYyaHR0cDovL29jc3AuaW5mb2NlcnQuaXQvT0NTUFNlcnZlcl9JQ0UvT0NTUFNlcnZs > ZXQwDgYDVR0PAQH/BAQDAgZAMCUGA1UdEgQeMByBGmZpcm1hLmRpZ2l0YWxlQGluZm9jZXJ0Lml0 > MB8GA1UdIwQYMBaAFDD8IXx80nxtvIzDuhNQ93qgK8W2MIGvBgNVHR8EgacwgaQwgaGggZ6ggZuG > gZhsZGFwOi8vbGRhcC5pbmZvY2VydC5pdC9jbiUzZEluZm9DZXJ0JTIwRmlybWElMjBRdWFsaWZp > Y2F0YSUyMENSTDAyLG91JTNkQ2VydGlmaWNhdG9yZSUyMEFjY3JlZGl0YXRvLG8lM2RJTkZPQ0VS > VCUyMFNQQSxjJTNkSVQ/Y2VydGlmaWNhdGVSZXZvY2F0aW9uTGlzdDAdBgNVHQ4EFgQUvh5KW1bU > Rk1rwJOsU4DmlDEoCAQwDQYJKoZIhvcNAQELBQADggEBAEDnj2mCFmlvAMXWmMoCaWFJsm5qMWuG > 5y1z60PplUnn45CGnxYosho+HxwO7TTH7gj8M4P/FcKPOdCbvyue4av8yUhaUMKfkSwZd3sCGC7q > Bvi3KJaftzuLNKred6KKOaD8qHEM+Cs8uU8hm8w6Fec2ClQBdL7Jr0+rZtCqkEhUaiWgEPkJjkpk > 7Ia3pYnxtY+1odCUq6k1i76CJCxFdwsTQMUK2sf0abbngzgNy3E3v3oFuOV3CO8Uii/XrVr9+C3k > 1c6tuqn9mSNNfSYFAbpBBPJyaVu6K3cHCeei2vmQMOc18e7PBgRz8fTKSx4QXWCRVRXxP5WMej+j > WCYA4fE= > </ds:X509Certificate> > </ds:X509Data> > </ds:KeyInfo> > <ds:Object><xades:QualifyingProperties > xmlns:xades="http://uri.etsi.org/01903/v1.3.2#" > Target="#sig0000087443000008748200000000010000048377"><xades:SignedProperties > Id="sig0000087443000008748200000000010000048377SignedProperties"><xades:SignedSignatureProperties><xades:SigningTime>2013-09-26T11:55:54.000+02:00</xades:SigningTime></xades:SignedSignatureProperties></xades:SignedProperties></xades:QualifyingProperties></ds:Object> > </ds:Signature></VERBALI> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
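
To make the question about the magic concrete: in the sample above, the `<html>` element appears only deep inside the embedded XSLT transform, while the document's root element is `<VERBALI>`. The following is a minimal sketch, using only the JDK's StAX API, of a check that looks at the root element rather than at tags anywhere in the content. It is not Tika's detector and does not use Tika's API; the class name `RootElementSniffer` and the method `rootElementName` are hypothetical names chosen for illustration, and the sample string is a shortened stand-in for the attached file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper (not part of Tika): reports the local name of the root element. */
class RootElementSniffer {

    /** Returns the root element's local name, or empty if the input is not parseable as XML. */
    static Optional<String> rootElementName(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or resolve external entities while sniffing.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    // The first START_ELEMENT event is the document's root element.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return Optional.of(reader.getLocalName());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Not well-formed XML up to the root element: fall through to empty.
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the reported file: XML root with an <html> element nested inside.
        String sample = "<VERBALI ad_cod=\"D69017\"><VERBALE Id=\"1\"/>"
                + "<html><head></head><body></body></html></VERBALI>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(rootElementName(in).orElse("(not XML)"));
    }
}
```

With a check like this, a file whose root element is something other than `html` would not be classified as HTML just because an `<html>` tag occurs later in the content. Whether and how this maps onto the existing magic definitions is exactly the open question in the comment above.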