[jira] [Updated] (TIKA-779) Detection of Microsoft Works 2000 Word Processor files

2011-11-10 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-779: -- Description: In older versions of Tika, our Microsoft Works 2000 Word Processor example file would get r

[jira] [Updated] (TIKA-779) Detection of Microsoft Works 2000 Word Processor files

2011-11-10 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-779: -- Attachment: microsoft-works-word-processor-2000.wps A test WPS file with no SPELLING top-level name

[jira] [Updated] (TIKA-779) Detection of Microsoft Works 2000 Word Processor files

2011-11-10 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-779: -- Attachment: tika-779.patch My workaround + test. > Detection of Microsoft Works 2000 Wor

[jira] [Updated] (TIKA-791) Fix the detection of protected OOXML files

2011-11-25 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-791: -- Attachment: tika-791.zip A ZIP file with the patch and some test documents. They differ from the ones in

[jira] [Updated] (TIKA-791) Fix the detection of protected OOXML files

2011-11-28 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-791: -- Attachment: tika-791-ver2.zip Attached an updated patch which uses a new media type "application/x-tika-

[jira] [Updated] (TIKA-797) MimeType.getExtension for application/vnd.ms-powerpoint returns ppz. I'd expect ppt.

2011-12-02 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-797: -- Attachment: tika-powerpointextension.patch A patch which reversed the order of globs for vnd.ms-powerpoin

[jira] [Updated] (TIKA-798) Distinguish between EMF and WMF

2011-12-02 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-798: -- Attachment: tika-emfwmf.zip A patch with two example files. From now on, WMF stays at application/x-msmetafile

[jira] [Updated] (TIKA-806) MS Word Detection magics are a bit overzealous

2011-12-09 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-806: -- Attachment: tika-806.patch A patch which removes those magics from tika-mimetypes.xml.

[jira] [Updated] (TIKA-806) MS Word Detection magics are a bit overzealous

2011-12-09 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-806: -- Attachment: (was: tika-806.patch)

[jira] [Updated] (TIKA-806) MS Word Detection magics are a bit overzealous

2011-12-09 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-806: -- Attachment: tika-806-ver2.patch A second version of the patch which doesn't break the build. The unit tes

[jira] [Updated] (TIKA-806) MS Word Detection magics are a bit overzealous

2011-12-12 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-806: -- Attachment: tika-806-ver3.zip It turns out that the XLR files are not detected by POIFSContainerDetector.

[jira] [Updated] (TIKA-812) Improve the detection of Works Spreadsheet 7.0 files

2011-12-13 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-812: -- Attachment: tika-812.patch testWORKSSpreadsheet7.0.xlr Attached a test file and a patch,

[jira] [Updated] (TIKA-813) Webarchive detection.

2011-12-13 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-813: -- Attachment: tika-webarchive-detection.patch A patch which adds the appropriate rules to tika-mimetypes.xm

[jira] [Updated] (TIKA-814) Increase the amount of bytes read by TextDetector

2011-12-13 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-814: -- Attachment: tika-textdetector.patch A patch, which makes the text detector work on the entire array suppl

[jira] [Updated] (TIKA-813) Webarchive detection.

2011-12-14 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-813: -- Attachment: (was: tika-webarchive-detection.patch)

[jira] [Updated] (TIKA-813) Webarchive detection.

2011-12-14 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-813: -- Attachment: testWEBARCHIVE.webarchive tika-813.patch A second version of the patch which

[jira] [Updated] (TIKA-812) Improve the detection of Works Spreadsheet 7.0 files

2011-12-14 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-812: -- Attachment: tika-812-ver2.patch A second version of the patch. Contains a magic pattern for WksSSWorkBook

[jira] [Updated] (TIKA-823) Detect StarOffice files

2011-12-20 Thread Antoni Mylka (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoni Mylka updated TIKA-823: -- Attachment: testStarOffice-5.2-write.sdw testStarOffice-5.2-impress.sdd te