[jira] [Commented] (TIKA-1536) Upgrade compiler definition in pom's to Java 7

2015-02-03 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304388#comment-14304388 ] Tyler Palsulich commented on TIKA-1536: Are there significant upgrades we're looking

[jira] [Commented] (TIKA-1343) Create a Tika Translator implementation that uses JoshuaDecoder

2015-02-03 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304163#comment-14304163 ] Lewis John McGibbney commented on TIKA-1343: [~chrismattmann] what is current s

[jira] [Comment Edited] (TIKA-1334) Add presentation layer for results of each run

2015-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303809#comment-14303809 ] Tim Allison edited comment on TIKA-1334 at 2/3/15 7:58 PM: This

[jira] [Commented] (TIKA-1331) Find/configure a vm and gather initial corpus

2015-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303869#comment-14303869 ] Tim Allison commented on TIKA-1331: There's a full run on an early version of 1.7 here:

[jira] [Comment Edited] (TIKA-1334) Add presentation layer for results of each run

2015-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303809#comment-14303809 ] Tim Allison edited comment on TIKA-1334 at 2/3/15 7:29 PM: This

[jira] [Updated] (TIKA-1334) Add presentation layer for results of each run

2015-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1334: Attachment: static_stats.zip This is the current output of a static dump from a sqlite database that is

[jira] [Resolved] (TIKA-1383) Simplify TikeServerCli endpoint setup code

2015-02-03 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Palsulich resolved TIKA-1383. Resolution: Fixed

[jira] [Commented] (TIKA-1331) Find/configure a vm and gather initial corpus

2015-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303801#comment-14303801 ] Tim Allison commented on TIKA-1331: The uncompressed output of tika-batch for govdocs1 i

[jira] [Updated] (TIKA-1473) Apache Tika is not working for .docx documents

2015-02-03 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Palsulich updated TIKA-1473: Priority: Major (was: Blocker)

[jira] [Commented] (TIKA-1331) Find/configure a vm and gather initial corpus

2015-02-03 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303783#comment-14303783 ] Tyler Palsulich commented on TIKA-1331: The big chunks of data are the corpus (or co

[jira] [Commented] (TIKA-1540) New Tika plugin for image based feature extraction using computer vision techniques

2015-02-03 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303701#comment-14303701 ] Tyler Palsulich commented on TIKA-1540: Will this feature extraction happen external

[jira] [Created] (TIKA-1540) New Tika plugin for image based feature extraction using computer vision techniques

2015-02-03 Thread Aashish Chaudhary (JIRA)
Aashish Chaudhary created TIKA-1540: Summary: New Tika plugin for image based feature extraction using computer vision techniques Key: TIKA-1540 URL: https://issues.apache.org/jira/browse/TIKA-1540

[jira] [Commented] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303162#comment-14303162 ] Miguel commented on TIKA-1538: I was working on the junit test, but Konstantin's comment coul

[jira] [Commented] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Konstantin Gribov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303146#comment-14303146 ] Konstantin Gribov commented on TIKA-1538: {code:java} Tika tika = new Tika(); byte

[jira] [Commented] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303128#comment-14303128 ] Nick Burch commented on TIKA-1538: It's possible that it could still be a bug. Are you ab

[jira] [Comment Edited] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303125#comment-14303125 ] Miguel edited comment on TIKA-1538 at 2/3/15 11:44 AM: I have tried

[jira] [Commented] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303125#comment-14303125 ] Miguel commented on TIKA-1538: I have tried it and the result is as you describe, Nick (i cop

[jira] [Commented] (TIKA-1539) GRB file magic bytes and extension matching

2015-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303102#comment-14303102 ] Nick Burch commented on TIKA-1539: The GRB parser will need one (or possibly a few) test

[jira] [Created] (TIKA-1539) GRB file magic bytes and extension matching

2015-02-03 Thread Luke sh (JIRA)
Luke sh created TIKA-1539: Summary: GRB file magic bytes and extension matching Key: TIKA-1539 URL: https://issues.apache.org/jira/browse/TIKA-1539 Project: Tika Issue Type: Improvement C

[jira] [Commented] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303089#comment-14303089 ] Nick Burch commented on TIKA-1538: I've just tried with {{java -jar tika-app-1.7.jar < Pr

[jira] [Updated] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miguel updated TIKA-1538: Description: [SCENARIO] - Working on a "supposed to be a valid JPEG file" (the file is attached to this issue repo

[jira] [Updated] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miguel updated TIKA-1538: Attachment: Product345037-000.jpg Troublesome image file

[jira] [Issue Comment Deleted] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miguel updated TIKA-1538: Comment: was deleted (was: Troublesome image file)

[jira] [Created] (TIKA-1538) Wrong mimetype detection

2015-02-03 Thread Miguel (JIRA)
Miguel created TIKA-1538: Summary: Wrong mimetype detection Key: TIKA-1538 URL: https://issues.apache.org/jira/browse/TIKA-1538 Project: Tika Issue Type: Bug Affects Versions: 1.7 Rep