RE: aspectj dependency

2016-04-22 Thread Allison, Timothy B.
Hi Ben, We tried to upgrade to 1.1.18 on TIKA-1924. Unfortunately, there was a bug (which we reported [0]) that causes the parser to go into an infinite loop on some files in our test corpus. We had to back off to 1.1.7 (TIKA-1931), and unfortunately as I look this morning, that seems to be

Re: JIRA issue?

2016-04-22 Thread Nick Burch
On Thu, 21 Apr 2016, Ben McCann wrote: > I'd like to create an issue on the JIRA. When I visit https://issues.apache.org/jira/browse/TIKA/ and hit Create I don't see Tika as an option. I can only create issues for Zookeeper and other projects. > If you let us know your JIRA username, someone can g

RE: aspectj dependency

2016-04-22 Thread Ben McCann
Thank you! On Apr 22, 2016 5:18 AM, "Allison, Timothy B." wrote: > Hi Ben, > > We tried to upgrade to 1.1.18 on TIKA-1924. Unfortunately, there was a > bug (which we reported [0]) that causes the parser to go into an infinite > loop on some files in our test corpus. We had to back off to 1.1.

[jira] [Reopened] (TIKA-1924) Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-1924: --- I think we can add a workaround to prevent the infinite loop at the Tika level. > Upgrade com.googlecode.m

[jira] [Reopened] (TIKA-1931) Revert mp4 parser version because of new permanent hangs with 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-1931: --- No need to revert. > Revert mp4 parser version because of new permanent hangs with 1.1.18 > --

[jira] [Resolved] (TIKA-1931) Revert mp4 parser version because of new permanent hangs with 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1931. --- Resolution: Won't Fix I think there's a workaround that will allow 1.1.18 and prevent infinite loops.

[jira] [Commented] (TIKA-1924) Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254029#comment-15254029 ] Tim Allison commented on TIKA-1924: --- As [~b...@benmccann.com] pointed out, this upgrade r

[GitHub] tika pull request: Tika 1913 - MIT Information Extraction itegrate...

2016-04-22 Thread manalishah
GitHub user manalishah opened a pull request: https://github.com/apache/tika/pull/108 Tika 1913 - MIT Information Extraction itegrated with Tika This pull request consists of yet another NamedEntityRecognizer that uses the open-source trained models and functions of MIT-nlp to perf

[jira] [Resolved] (TIKA-1924) Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1924. --- Resolution: Fixed Re-upgraded to 1.1.18, added work-around to avoid infinite loops, added tiny trigger

[GitHub] tika pull request: Backport tika-langdetect from 2.x branch to 1.1...

2016-04-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/tika/pull/90

[jira] [Resolved] (TIKA-1872) Backport tika-langdetect from 2.x branch to 1.13 branch

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1872. - Resolution: Fixed This is now done, Ken's Optimaize langdetect, N-gram langdetect and Text.

[jira] [Resolved] (TIKA-1696) Language Identification with Text Processing Toolkit from MITLL

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1696. - Resolution: Fixed This is now done, Ken's Optimaize langdetect, N-gram langdetect and Text.

[jira] [Resolved] (TIKA-1723) Integrate language-detector into Tika

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1723. - Resolution: Fixed Fix Version/s: 1.13 This is now done, Ken's Optimaize langdetect,

[jira] [Updated] (TIKA-776) ExifTool Embedder

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-776: --- Fix Version/s: (was: 1.13) 1.14 > ExifTool Embedder > - >

[jira] [Updated] (TIKA-1808) Head section closed too eager

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1808: Fix Version/s: (was: 1.13) 1.14 > Head section closed too eager >

[jira] [Updated] (TIKA-1840) No way to link slide notes to slide in PPT output.

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1840: Fix Version/s: (was: 1.13) 1.14 > No way to link slide notes to slide

[jira] [Updated] (TIKA-1513) Add mime detection and parsing for dbf files

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1513: Fix Version/s: (was: 1.13) 1.14 > Add mime detection and parsing for d

[jira] [Updated] (TIKA-1674) Add example to show how to extract embedded files

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1674: Fix Version/s: (was: 1.13) 1.14 > Add example to show how to extract e

[jira] [Updated] (TIKA-1505) chmparser breaks down when extracting from file of CHM format v3

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1505: Fix Version/s: (was: 1.13) 1.14 > chmparser breaks down when extractin

[jira] [Updated] (TIKA-1609) Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1609: Fix Version/s: (was: 1.13) 1.14 > Leverage Google's LibPhonenumber for

[jira] [Updated] (TIKA-987) Embedded drawing (SHAPE MERGEFORMAT) sometimes not extracted

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-987: --- Fix Version/s: (was: 1.13) 1.14 > Embedded drawing (SHAPE MERGEFORMAT) so

[jira] [Updated] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1367: Fix Version/s: (was: 1.13) 1.14 > Tika documentation should list tika-

[jira] [Updated] (TIKA-1917) Just a quick fix to allow NLTK Parser extract measurement information from text

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1917: Fix Version/s: (was: 1.13) 1.14 > Just a quick fix to allow NLTK Parse

[jira] [Updated] (TIKA-1465) Implement extraction of non-global variables from netCDF3 and netCDF4

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1465: Fix Version/s: (was: 1.13) 1.14 > Implement extraction of non-global v

[jira] [Updated] (TIKA-1343) Create a Tika Translator implementation that uses JoshuaDecoder

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1343: Fix Version/s: (was: 1.13) 1.14 > Create a Tika Translator implementat

[jira] [Updated] (TIKA-1577) NetCDF Data Extraction

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1577: Fix Version/s: (was: 1.13) 1.14 > NetCDF Data Extraction > ---

[jira] [Updated] (TIKA-1295) Make some Dublin Core items multi-valued

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1295: Fix Version/s: (was: 1.13) 1.14 > Make some Dublin Core items multi-va

[jira] [Updated] (TIKA-1379) error in Tika().detect for xml files with xades signature

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1379: Fix Version/s: (was: 1.13) 1.14 > error in Tika().detect for xml files

[jira] [Updated] (TIKA-1508) Add uniformity to parser parameter configuration

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1508: Fix Version/s: (was: 1.13) 1.14 > Add uniformity to parser parameter c

[jira] [Updated] (TIKA-1436) improvement to PDFParser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1436: Fix Version/s: (was: 1.13) 1.14 > improvement to PDFParser > -

[jira] [Updated] (TIKA-1829) org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:92) NPE

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1829: Fix Version/s: (was: 1.13) 1.14 > org.apache.tika.parser.ocr.Tesseract

[jira] [Updated] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1885: Fix Version/s: (was: 1.13) 1.14 > Tika MIME updates for *.cdf and *.xa

[jira] [Updated] (TIKA-1276) Missing embedded dependencies in tika-bundle

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1276: Fix Version/s: (was: 1.13) 1.14 > Missing embedded dependencies in tik

[jira] [Updated] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1308: Fix Version/s: (was: 1.13) 1.14 > Support in memory parse mode(don't c

[jira] [Updated] (TIKA-1939) Preparation for Tika 1.13 release

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1939: Fix Version/s: (was: 1.13) 1.14 > Preparation for Tika 1.13 release >

[jira] [Updated] (TIKA-1301) Establish TikaServer on Apache hosted VM

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1301: Fix Version/s: (was: 1.13) 1.14 > Establish TikaServer on Apache hoste

[jira] [Updated] (TIKA-1913) Integrate MIT Information Extraction(MITIE) into Tika to perform Named Entity Recognition

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1913: Fix Version/s: (was: 1.13) 1.14 > Integrate MIT Information Extraction

[jira] [Updated] (TIKA-1888) Update mimetype for application/x-netcdf

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1888: Fix Version/s: (was: 1.13) 1.14 > Update mimetype for application/x-ne

[jira] [Updated] (TIKA-1108) Represent individual slides in pptx

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1108: Fix Version/s: (was: 1.13) 1.14 > Represent individual slides in pptx

[jira] [Updated] (TIKA-1705) Update ASM dependency to 5.0.4

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1705: Fix Version/s: (was: 1.13) 1.14 > Update ASM dependency to 5.0.4 > ---

[jira] [Updated] (TIKA-1709) Tika Server doesn't handle multi-part attachments or form-encoded inputs

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1709: Fix Version/s: (was: 1.13) 1.14 > Tika Server doesn't handle multi-par

[jira] [Updated] (TIKA-1390) Create tika-example module

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1390: Fix Version/s: (was: 1.13) 1.14 > Create tika-example module > ---

[jira] [Updated] (TIKA-539) Encoding detection is too biased by encoding in meta tag

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-539: --- Fix Version/s: (was: 1.13) 1.14 > Encoding detection is too biased by enc

[jira] [Updated] (TIKA-1955) MIME types updates and additions for Scientific Data based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1955: Fix Version/s: (was: 1.13) 1.14 > MIME types updates and additions for

[jira] [Updated] (TIKA-1724) Create parser for .obo file format.

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1724: Fix Version/s: (was: 1.13) 1.14 > Create parser for .obo file format.

[jira] [Updated] (TIKA-1106) CLAVIN Integration

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1106: Fix Version/s: (was: 1.13) 1.14 > CLAVIN Integration > ---

[jira] [Updated] (TIKA-1328) Translate Metadata and Content

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1328: Fix Version/s: (was: 1.13) 1.14 > Translate Metadata and Content > ---

[jira] [Updated] (TIKA-1800) MediaType#parse does not decode escaped special characters

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1800: Fix Version/s: (was: 1.13) 1.14 > MediaType#parse does not decode esca

[jira] [Updated] (TIKA-1425) Automatic batching of Microsoft service calls

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1425: Fix Version/s: (was: 1.13) 1.14 > Automatic batching of Microsoft serv

[jira] [Updated] (TIKA-1395) Create embedded image extraction example

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1395: Fix Version/s: (was: 1.13) 1.14 > Create embedded image extraction exa

[jira] [Updated] (TIKA-1952) Access Date is getting modified while capturing the MetaData information using AutoDetectParser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1952: Fix Version/s: (was: 1.13) 1.14 > Access Date is getting modified whil

[jira] [Updated] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-819: --- Fix Version/s: (was: 1.13) 1.14 > Make Option to Exclude Embedded Files'

[jira] [Updated] (TIKA-1607) Introduce new arbitrary object key/values data structure for persistence of Tika Metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1607: Fix Version/s: (was: 1.13) 1.14 > Introduce new arbitrary object key/v

[jira] [Updated] (TIKA-891) Use POST in addition to PUT on method calls in tika-server

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-891: --- Fix Version/s: (was: 1.13) 1.14 > Use POST in addition to PUT on method c

[jira] [Updated] (TIKA-1598) Parser Implementation for Streaming Video

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1598: Fix Version/s: (was: 1.13) 1.14 > Parser Implementation for Streaming

[jira] [Updated] (TIKA-988) We don't extract a placeholder for a Word document embedded in an Excel document

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-988: --- Fix Version/s: (was: 1.13) 1.14 > We don't extract a placeholder for a Wo

[jira] [Updated] (TIKA-1059) Better Handling of InterruptedException in ExternalParser and ExternalEmbedder

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1059: Fix Version/s: (was: 1.13) 1.14 > Better Handling of InterruptedExcept

[jira] [Updated] (TIKA-1456) Visual Sentiment API parser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1456: Fix Version/s: (was: 1.13) 1.14 > Visual Sentiment API parser > --

[jira] [Updated] (TIKA-1329) Add RecursiveParserWrapper aka Jukka's (and Nick's) RecursiveMetadataParser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1329: Fix Version/s: (was: 1.13) 1.14 > Add RecursiveParserWrapper aka Jukka

[jira] [Updated] (TIKA-1801) Integrate MITIE Named Entity Recognition support

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1801: Fix Version/s: (was: 1.13) 1.14 > Integrate MITIE Named Entity Recogni

[jira] [Updated] (TIKA-1953) tika-server NullPointerException while processing rtfs

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1953: Fix Version/s: (was: 1.13) 1.14 > tika-server NullPointerException whi

[jira] [Updated] (TIKA-1925) Composite External Parser like Exiftool fails to run on Windows.

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1925: Fix Version/s: (was: 1.13) 1.14 > Composite External Parser like Exift

[jira] [Updated] (TIKA-1688) Tika Version in Metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1688: Fix Version/s: (was: 1.13) 1.14 > Tika Version in Metadata > -

[jira] [Updated] (TIKA-715) Some parsers produce non-well-formed XHTML SAX events

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-715: --- Fix Version/s: (was: 1.13) 1.14 > Some parsers produce non-well-formed XH

[jira] [Updated] (TIKA-1518) Docker with Tika Server

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1518: Fix Version/s: (was: 1.13) 1.14 > Docker with Tika Server > --

[jira] [Updated] (TIKA-1220) Parser implementration for IFC files

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1220: Fix Version/s: (was: 1.13) 1.14 > Parser implementration for IFC files

[jira] [Updated] (TIKA-985) Support for HTML5 elements

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-985: --- Fix Version/s: (was: 1.13) 1.14 > Support for HTML5 elements > --

[jira] [Updated] (TIKA-894) Add webapp mode for Tika Server, simplifies deployment

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-894: --- Fix Version/s: (was: 1.13) 1.14 > Add webapp mode for Tika Server, simpli

[jira] [Updated] (TIKA-774) ExifTool Parser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-774: --- Fix Version/s: (was: 1.13) 1.14 > ExifTool Parser > --- > >

[jira] [Updated] (TIKA-1366) Update some of Tika Server services to support JAX-RS 2.0 AsyncResponse

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1366: Fix Version/s: (was: 1.13) 1.14 > Update some of Tika Server services

[jira] [Updated] (TIKA-1318) Use of Deprecated Word6Extractor.getParagraphText() Method

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1318: Fix Version/s: (was: 1.13) 1.14 > Use of Deprecated Word6Extractor.get

[jira] [Updated] (TIKA-1417) Create Extract Embedded Images from PDFs Example

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1417: Fix Version/s: (was: 1.13) 1.14 > Create Extract Embedded Images from

[jira] [Updated] (TIKA-1616) Tika Parser for GIBS Metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1616: Fix Version/s: (was: 1.13) 1.14 > Tika Parser for GIBS Metadata >

[jira] [Updated] (TIKA-1540) New Tika plugin for image based feature extraction using computer vision techniques

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1540: Fix Version/s: (was: 1.13) 1.14 > New Tika plugin for image based feat

[jira] [Updated] (TIKA-1706) Bring back commons-io to tika-core

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1706: Fix Version/s: (was: 1.13) 1.14 > Bring back commons-io to tika-core >

[jira] [Updated] (TIKA-1640) Make ExternalParser support aliases for key names in extracted metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1640: Fix Version/s: (was: 1.13) 1.14 > Make ExternalParser support aliases

[jira] [Updated] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1815: Fix Version/s: (was: 1.13) 1.14 > Text content from parser is empty wh

[jira] [Updated] (TIKA-980) MicrodataContentHandler for Apache Tika

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-980: --- Fix Version/s: (was: 1.13) 1.14 > MicrodataContentHandler for Apache Tika

[jira] [Updated] (TIKA-1672) Integrate tika-java7 component

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1672: Fix Version/s: (was: 1.13) 1.14 > Integrate tika-java7 component > ---

[jira] [Updated] (TIKA-1208) Migrate Any23 mime contributions to Tika

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1208: Fix Version/s: (was: 1.13) 1.14 > Migrate Any23 mime contributions to

[jira] [Updated] (TIKA-1697) Parser Implementation for AkomaNtoso Legal XML Documents

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1697: Fix Version/s: (was: 1.13) 1.14 > Parser Implementation for AkomaNtoso

[jira] [Updated] (TIKA-1738) ForkClient does not always delete temporary bootstrap jar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1738: Fix Version/s: (was: 1.13) 1.14 > ForkClient does not always delete te

[jira] [Updated] (TIKA-1801) Integrate MITIE Named Entity Recognition support

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1801: Fix Version/s: (was: 1.14) 1.13 > Integrate MITIE Named Entity Recogni

[jira] [Updated] (TIKA-1888) Update mimetype for application/x-netcdf

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1888: Fix Version/s: (was: 1.14) 1.13 > Update mimetype for application/x-ne

[jira] [Updated] (TIKA-1913) Integrate MIT Information Extraction(MITIE) into Tika to perform Named Entity Recognition

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1913: Fix Version/s: (was: 1.14) 1.13 > Integrate MIT Information Extraction

[jira] [Updated] (TIKA-1955) MIME types updates and additions for Scientific Data based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1955: Fix Version/s: (was: 1.14) 1.13 > MIME types updates and additions for

[jira] [Updated] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1885: Fix Version/s: (was: 1.14) 1.13 > Tika MIME updates for *.cdf and *.xa

[jira] [Updated] (TIKA-1939) Preparation for Tika 1.13 release

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1939: Fix Version/s: (was: 1.14) 1.13 > Preparation for Tika 1.13 release >

[jira] [Updated] (TIKA-1917) Just a quick fix to allow NLTK Parser extract measurement information from text

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1917: Fix Version/s: (was: 1.14) 1.13 > Just a quick fix to allow NLTK Parse

[jira] [Commented] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254825#comment-15254825 ] Chris A. Mattmann commented on TIKA-1885: - ping [~adeshgup] > Tika MIME updates fo

tika-trunk-jdk1.7 - Build # 963 - Failure

2016-04-22 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #963) Status: Failure Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/963/ to view the results.

tika-trunk-jdk1.7 - Build # 964 - Still Failing

2016-04-22 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #964) Status: Still Failing Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/964/ to view the results.