Jiri B said:
> file -i *.epub returns 'application/x-not-regular-file' or 'application/zip'
> and it should return 'application/epub+zip' (at least this is on Fedora).

Below comes the patch that changes the way zip archives are treated in the
magic file:

1. Immediately write "Zip archive data" for all zip files.
2. If the zip file does not contain extra attributes and starts with the
   file "mimetype", whose content begins with the word "application",
   treat it as a container format and add the document type name.
3. Otherwise, detect the zip version required to open the file and note it.

#2 breaks backward compatibility in several ways:

* previously the "mimetype" test only considered zip 2.0 archives, although
  at least the OpenDocument standard permits any version.
* output for ODT and friends changes from "<Document type>" to
  "Zip archive data, <Document type>".
* if you run file(1) on an archive that you make by decompressing and
  recompressing an ODT file, the type would be "Zip archive data, at least
  v1.0 to extract" instead of just "data".

Comments? OKs?

P.S.: While doing this I noticed that under some circumstances file gets
tricked into discarding all matches and falling back to
'application/x-not-regular-file', as Jiri B notes above. This looks like
a bug.

P.P.S.: It looks like the magic format could benefit from something like
Python's "finally". Do we want to maintain compatibility with other magic
file parsers?

-- 
Dmitrij D. Czarkoff

Index: magdir/archive
===================================================================
RCS file: /cvs/src/usr.bin/file/magdir/archive,v
retrieving revision 1.6
diff -u -p -r1.6 archive
--- magdir/archive	24 Apr 2009 18:54:34 -0000	1.6
+++ magdir/archive	4 Mar 2016 18:20:07 -0000
@@ -563,75 +563,78 @@
 0	string	UC2\x1a	UC2 archive data
 
 # ZIP archives (Greg Roelofs, c/o zip-b...@wkuvx1.wku.edu)
-0	string	PK\003\004
->4	byte	0x00	Zip archive data
-!:mime	application/zip
->4	byte	0x09	Zip archive data, at least v0.9 to extract
-!:mime	application/zip
->4	byte	0x0a	Zip archive data, at least v1.0 to extract
-!:mime	application/zip
->4	byte	0x0b	Zip archive data, at least v1.1 to extract
-!:mime	application/zip
->0x161	string	WINZIP	Zip archive data, WinZIP self-extracting
-!:mime	application/zip
->4	byte	0x14
->>30	ubelong	!0x6d696d65	Zip archive data, at least v2.0 to extract
+0	string	PK\003\004	Zip archive data
 !:mime	application/zip
 
 # OpenOffice.org / KOffice / StarOffice documents
 # From: Abel Cheung <a...@oaka.org>
 # Listed here because they are basically zip files
->>30	string	mimetype
+>30	string	mimetypeapplication
 
 # KOffice (1.2 or above) formats
->>>50	string	vnd.kde.	KOffice (>=1.2)
->>>>58	string	karbon	Karbon document
->>>>58	string	kchart	KChart document
->>>>58	string	kformula	KFormula document
->>>>58	string	kivio	Kivio document
->>>>58	string	kontour	Kontour document
->>>>58	string	kpresenter	KPresenter document
->>>>58	string	kspread	KSpread document
->>>>58	string	kword	KWord document
+>>50	string	vnd.kde.	\b, KOffice (>=1.2)
+>>>58	string	karbon	Karbon document
+>>>58	string	kchart	KChart document
+>>>58	string	kformula	KFormula document
+>>>58	string	kivio	Kivio document
+>>>58	string	kontour	Kontour document
+>>>58	string	kpresenter	KPresenter document
+>>>58	string	kspread	KSpread document
+>>>58	string	kword	KWord document
 
 # OpenOffice formats (for OpenOffice 1.x / StarOffice 6/7)
->>>50	string	vnd.sun.xml.	OpenOffice.org 1.x
->>>>62	string	writer	Writer
->>>>>68	byte	!0x2e	document
->>>>>68	string	.template	template
->>>>>68	string	.global	global document
->>>>62	string	calc	Calc
->>>>>66	byte	!0x2e	spreadsheet
->>>>>66	string	.template	template
->>>>62	string	draw	Draw
->>>>>66	byte	!0x2e	document
->>>>>66	string	.template	template
->>>>62	string	impress	Impress
->>>>>69	byte	!0x2e	presentation
->>>>>69	string	.template	template
->>>>62	string	math	Math document
+>>50	string	vnd.sun.xml.	\b, OpenOffice.org 1.x
+>>>62	string	writer	Writer
+>>>>68	byte	!0x2e	document
+>>>>68	string	.template	template
+>>>>68	string	.global	global document
+>>>62	string	calc	Calc
+>>>>66	byte	!0x2e	spreadsheet
+>>>>66	string	.template	template
+>>>62	string	draw	Draw
+>>>>66	byte	!0x2e	document
+>>>>66	string	.template	template
+>>>62	string	impress	Impress
+>>>>69	byte	!0x2e	presentation
+>>>>69	string	.template	template
+>>>62	string	math	Math document
 
 # OpenDocument formats (for OpenOffice 2.x / StarOffice >= 8)
 # http://lists.oasis-open.org/archives/office/200505/msg00006.html
->>>50	string	vnd.oasis.opendocument.	OpenDocument
->>>>73	string	text
->>>>>77	byte	!0x2d	Text
+>>50	string	vnd.oasis.opendocument.	\b, OpenDocument
+>>>73	string	text
+>>>>77	byte	!0x2d	Text
 !:mime	application/vnd.oasis.opendocument.text
->>>>>77	string	-template	Text Template
->>>>>77	string	-web	HTML Document Template
->>>>>77	string	-master	Master Document
->>>>73	string	graphics	Drawing
->>>>>81	string	-template	Template
->>>>73	string	presentation	Presentation
->>>>>85	string	-template	Template
->>>>73	string	spreadsheet	Spreadsheet
->>>>>84	string	-template	Template
->>>>73	string	chart	Chart
->>>>>78	string	-template	Template
->>>>73	string	formula	Formula
->>>>>80	string	-template	Template
->>>>73	string	database	Database
->>>>73	string	image	Image
+>>>>77	string	-template	Text Template
+>>>>77	string	-web	HTML Document Template
+>>>>77	string	-master	Master Document
+>>>73	string	graphics	Drawing
+>>>>81	string	-template	Template
+>>>73	string	presentation	Presentation
+>>>>85	string	-template	Template
+>>>73	string	spreadsheet	Spreadsheet
+>>>>84	string	-template	Template
+>>>73	string	chart	Chart
+>>>>78	string	-template	Template
+>>>73	string	formula	Formula
+>>>>80	string	-template	Template
+>>>73	string	database	Database
+>>>73	string	image	Image
+
+# EPUB (OEBPS) books using OCF (OEBPS Container Format)
+>>50	string	epub+zip	\b, EPUB book
+!:mime	application/epub+zip
+
+# Generic ZIP archives
+# Greg Roelofs <zip-b...@wkuvx1.wku.edu>
+>30	string	!mimetypeapplication
+!:mime	application/zip
+>>4	byte	0x00
+>>4	byte	0x09	\b, at least v0.9 to extract
+>>4	byte	0x0a	\b, at least v1.0 to extract
+>>4	byte	0x0b	\b, at least v1.1 to extract
+>>0x161	string	WINZIP	\b, WinZIP self-extracting
+>>4	byte	0x14	\b, at least v2.0 to extract
 
 # Zoo archiver
 20	lelong	0xfdc4a7dc	Zoo archive data
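
For anyone who wants to check the offsets without rebuilding file(1), the three steps from the description can be sketched in Python roughly as below. This is only an illustration under assumptions, not part of the patch: the names describe_zip, first_entry, CONTAINER_TYPES and GENERIC_CHECKS are made up here, only the top-level container prefixes are handled (the Writer/Calc/Text sub-types are skipped), MIME output is ignored, and first_entry writes a dummy CRC, so its output is good for testing offsets but is not a valid archive.

```python
import struct

ZIP_MAGIC = b"PK\x03\x04"

# Prefixes looked up at offset 50, copied from the patch. Sub-types such as
# Writer, Calc or Text Template are deliberately left out of this sketch.
CONTAINER_TYPES = [
    (b"vnd.kde.", "KOffice (>=1.2)"),
    (b"vnd.sun.xml.", "OpenOffice.org 1.x"),
    (b"vnd.oasis.opendocument.", "OpenDocument"),
    (b"epub+zip", "EPUB book"),
]

# (offset, expected bytes, text) in the same order as the generic ZIP rules.
GENERIC_CHECKS = [
    (4, b"\x09", "at least v0.9 to extract"),
    (4, b"\x0a", "at least v1.0 to extract"),
    (4, b"\x0b", "at least v1.1 to extract"),
    (0x161, b"WINZIP", "WinZIP self-extracting"),
    (4, b"\x14", "at least v2.0 to extract"),
]


def describe_zip(buf):
    """Return a file(1)-like description for buf, or None if it is not a zip."""
    if not buf.startswith(ZIP_MAGIC):
        return None
    parts = ["Zip archive data"]                      # step 1
    if buf.startswith(b"mimetypeapplication", 30):    # step 2: container
        for prefix, label in CONTAINER_TYPES:
            if buf.startswith(prefix, 50):
                parts.append(label)
                break
    else:                                             # step 3: generic zip
        for offset, needle, label in GENERIC_CHECKS:
            if buf.startswith(needle, offset):
                parts.append(label)
    return ", ".join(parts)


def first_entry(name, payload, version=20):
    """Hypothetical test helper: a 30-byte local file header with no extra
    field, followed by the name and the stored payload. The CRC is a dummy."""
    header = struct.pack("<4sHHHHHIIIHH", ZIP_MAGIC, version, 0, 0, 0, 0, 0,
                         len(payload), len(payload), len(name), 0)
    return header + name + payload


if __name__ == "__main__":
    print(describe_zip(first_entry(b"mimetype", b"application/epub+zip")))
    print(describe_zip(first_entry(b"content.xml", b"<x/>", version=10)))
```

The second call models the recompressed-ODT case from the third bullet: once "mimetype" is no longer the first stored entry, only the generic version text is appended.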