[jira] [Commented] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183996#comment-14183996 ] Chris A. Mattmann commented on TIKA-1442: Working on a solution over in TIKA-1445.

[jira] [Updated] (TIKA-988) We don't extract a placeholder for a Word document embedded in an Excel document

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-988: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > We don't extract a placeho

[jira] [Updated] (TIKA-1300) Switch default PDFBox parser to NonSequentialParser

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1300: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Switch default PDFBox p

[jira] [Updated] (TIKA-1416) Refactor Translator Exception Handling

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1416: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Refactor Translator Exc

[jira] [Updated] (TIKA-776) ExifTool Embedder

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-776: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > ExifTool Embedder > --

[jira] [Updated] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1367: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Tika documentation shou

[jira] [Updated] (TIKA-1307) Jenkins Java7 job requires a profile in order to build 'tika-java7' module.

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1307: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Jenkins Java7 job requi

[jira] [Updated] (TIKA-1366) Update some of Tika Server services to support JAX-RS 2.0 AsyncResponse

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1366: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Update some of Tika Ser

[jira] [Updated] (TIKA-1108) Represent individual slides in pptx

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1108: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Represent individual sl

[jira] [Updated] (TIKA-1079) Word document hits AIOOBE in SummaryExtractor.parseSummaries

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1079: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Word document hits AIOO

[jira] [Updated] (TIKA-539) Encoding detection is too biased by encoding in meta tag

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-539: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Encoding detection is too

[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1423: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Build a parser to extra

[jira] [Updated] (TIKA-1379) error in Tika().detect for xml files with xades signature

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1379: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > error in Tika().detect

[jira] [Updated] (TIKA-1456) Visual Sentiment API parser

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1456: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Visual Sentiment API pa

[jira] [Updated] (TIKA-1301) Establish TikaServer on Apache hosted VM

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1301: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Establish TikaServer on

[jira] [Updated] (TIKA-980) MicrodataContentHandler for Apache Tika

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-980: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > MicrodataContentHandler fo

[jira] [Updated] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1269: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Self-hosted documentati

[jira] [Updated] (TIKA-1395) Create embedded image extraction example

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1395: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Create embedded image e

[jira] [Updated] (TIKA-1383) Simplify TikeServerCli endpoint setup code

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1383: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Simplify TikeServerCli

[jira] [Updated] (TIKA-1167) Embedded object not extracted

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1167: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Embedded object not ext

[jira] [Updated] (TIKA-995) XHTMLContentHandler doesn't pass attributes of body element

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-995: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > XHTMLContentHandler doesn'

[jira] [Updated] (TIKA-1435) Update rome dependency to 1.5

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1435: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Update rome dependency

[jira] [Updated] (TIKA-1387) Add forbidden-apis checker to TIKA build

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1387: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Add forbidden-apis chec

[jira] [Updated] (TIKA-987) Embedded drawing (SHAPE MERGEFORMAT) sometimes not extracted

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-987: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Embedded drawing (SHAPE ME

[jira] [Updated] (TIKA-1072) AIOOBE when handling embedded document in .doc file

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1072: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > AIOOBE when handling em

[jira] [Updated] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1308: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Support in memory parse

[jira] [Updated] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-819: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Make Option to Exclude Emb

[jira] [Updated] (TIKA-1306) ClassCastException WARN [main] (COSDocument.java:303) - java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName in o

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1306: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > ClassCastException WAR

[jira] [Updated] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1276: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Missing embedded depend

[jira] [Updated] (TIKA-1390) Create tika-example module

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1390: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Create tika-example mod

[jira] [Updated] (TIKA-1315) Basic list support in WordExtractor

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1315: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Basic list support in W

[jira] [Updated] (TIKA-1318) Use of Deprecated Word6Extractor.getParagraphText() Method

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1318: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Use of Deprecated Word6

[jira] [Updated] (TIKA-1388) Tika IOUtils java.lang.OutOfMemoryError

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1388: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Tika IOUtils java.lang.

[jira] [Updated] (TIKA-1425) Automatic batching of Microsoft service calls

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1425: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Automatic batching of M

[jira] [Updated] (TIKA-715) Some parsers produce non-well-formed XHTML SAX events

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-715: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Some parsers produce non-w

[jira] [Updated] (TIKA-1106) CLAVIN Integration

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1106: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > CLAVIN Integration > --

[jira] [Updated] (TIKA-891) Use POST in addition to PUT on method calls in tika-server

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-891: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Use POST in addition to PU

[jira] [Updated] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1238: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Update OutlookExtractor

[jira] [Updated] (TIKA-1324) Use a common path for the Tika Server unpacker resources

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1324: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Use a common path for t

[jira] [Updated] (TIKA-1273) old tika-server jar artifact contains no manifest so not able to invoke from shell

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1273: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > old tika-server jar art

[jira] [Updated] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1445: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Figure out how to add I

[jira] [Updated] (TIKA-1384) Use tika-parent dependency management for common dependencies

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1384: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Use tika-parent depende

[jira] [Updated] (TIKA-985) Support for HTML5 elements

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-985: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Support for HTML5 elements

[jira] [Updated] (TIKA-1343) Create a Tika Translator implementation that uses JoshuaDecoder

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1343: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Create a Tika Translato

[jira] [Updated] (TIKA-1295) Make some Dublin Core items multi-valued

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1295: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Make some Dublin Core i

[jira] [Updated] (TIKA-1059) Better Handling of InterruptedException in ExternalParser and ExternalEmbedder

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1059: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Better Handling of Inte

[jira] [Updated] (TIKA-1417) Create Extract Embedded Images from PDFs Example

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1417: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Create Extract Embedded

[jira] [Updated] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1442: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Upgrade to PDFBox 1.8.8

[jira] [Updated] (TIKA-1328) Translate Metadata and Content

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1328: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Translate Metadata and

[jira] [Updated] (TIKA-1426) Let's allow users to specify a tika config file on the commandline for tika-app and tika-server

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1426: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Let's allow users to sp

[jira] [Updated] (TIKA-774) ExifTool Parser

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-774: --- Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > ExifTool Parser >

[jira] [Updated] (TIKA-1208) Migrate Any23 mime contributions to Tika

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1208: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Migrate Any23 mime cont

[jira] [Updated] (TIKA-1220) Parser implementration for IFC files

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1220: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Parser implementration

[jira] [Updated] (TIKA-1408) Fix version for tikadotnet to be tracked along with trunk and release version

2014-10-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1408: Fix Version/s: (was: 1.7) 1.8 - push to 1.8 > Fix version for tikadot

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-10-24 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183873#comment-14183873 ] Tyler Palsulich commented on TIKA-1445: --- I've been trying my hand at this some time n

Re: 1.7 release?

2014-10-24 Thread Mattmann, Chris A (3980)
Hey Tim, What do you think about my existing patch for 1445? For example to just call all the parsers? I thought I was seeing behavior that was slow because of that, but it turned out to be Tesseract and my machine at the time? I think my patch for 1445 may be enough, and we should get the metada

RE: 1.7 release?

2014-10-24 Thread Allison, Timothy B.
Sorry for coming late to the game on the implications of TIKA-1445. I don't want to hold up the release of 1.7. However, would it be possible to return to the legacy default behavior of extracting metadata from images? We can then document on the OCR parser page on the wiki that you need t

[jira] [Comment Edited] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183383#comment-14183383 ] Tyler Palsulich edited comment on TIKA-1442 at 10/24/14 8:05 PM:

[jira] [Commented] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183383#comment-14183383 ] Tyler Palsulich commented on TIKA-1442: --- Yes, unfortunately. Please see TIKA-1445. [~

[jira] [Comment Edited] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183338#comment-14183338 ] Tim Allison edited comment on TIKA-1442 at 10/24/14 7:22 PM: - H

[jira] [Commented] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183338#comment-14183338 ] Tim Allison commented on TIKA-1442: --- Hmmm...I can't explain those files, and I recently d

Re: 1.7 release?

2014-10-24 Thread Oleg Tikhonov
Hi Tyler, don't mention. Cheers, Oleg On Oct 24, 2014 8:02 PM, "Tyler Palsulich" wrote: > Thank you for the help, Oleg! I just resolved TIKA-1422. So, are there any > other issues anyone would like to resolve before a new release? > > Thanks, > Tyler > > On Tue, Oct 21, 2014 at 2:42 AM, Oleg Tik

[jira] [Commented] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183161#comment-14183161 ] Hudson commented on TIKA-1422: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #262 (See [https://b

[jira] [Commented] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183103#comment-14183103 ] Hudson commented on TIKA-1422: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #282 (See [https://b

Re: 1.7 release?

2014-10-24 Thread Tyler Palsulich
Thank you for the help, Oleg! I just resolved TIKA-1422. So, are there any other issues anyone would like to resolve before a new release? Thanks, Tyler On Tue, Oct 21, 2014 at 2:42 AM, Oleg Tikhonov wrote: > Sorry!!! > > On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980) < > chris.a.mat

[jira] [Resolved] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-24 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Palsulich resolved TIKA-1422. --- Resolution: Fixed Fixed in r1634094. Skip over the two failing checks if Tesseract is installed

RE: import (re)ordering?

2014-10-24 Thread Nick Burch
On Fri, 24 Oct 2014, Allison, Timothy B. wrote: Y, I'll try to be more careful about separating out formatting from content in the future (apologies for TIKA-1451). What I didn't want to do was start an IDE war if others have different settings that will order imports in a different way. I'd

RE: import (re)ordering?

2014-10-24 Thread Tyler Palsulich
Thanks, Tim. I'll be sure to update my settings for this. On a similar note, can we standardize the formatting of the pom.xml files? Right now, they are pretty irregular. Tyler On Oct 24, 2014 10:52 AM, "Allison, Timothy B." wrote: > Y, I'll try to be more careful about separating out formatting

RE: import (re)ordering?

2014-10-24 Thread Allison, Timothy B.
Y, I'll try to be more careful about separating out formatting from content in the future (apologies for TIKA-1451). What I didn't want to do was start an IDE war if others have different settings that will order imports in a different way. Thank you! -Original Message- From: Mattmann

[jira] [Commented] (TIKA-1451) Add Recursive Metadata Parser Wrapper output to tika-app and gui

2014-10-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182674#comment-14182674 ] Tim Allison commented on TIKA-1451: --- Thank you, Chris. The credit goes to [~jukkaz] and

[jira] [Comment Edited] (TIKA-1442) Upgrade to PDFBox 1.8.8

2014-10-24 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173983#comment-14173983 ] Tilman Hausherr edited comment on TIKA-1442 at 10/24/14 11:02 AM: ---