[jira] [Updated] (TIKA-2417) Not able to parase a doc file

2017-07-03 Thread Gaurav (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav updated TIKA-2417: - Attachment: R1-1608674 Discussion on measurement related reference signals.doc > Not able to parase a doc file > -

[jira] [Created] (TIKA-2417) Not able to parase a doc file

2017-07-03 Thread Gaurav (JIRA)
Gaurav created TIKA-2417: Summary: Not able to parase a doc file Key: TIKA-2417 URL: https://issues.apache.org/jira/browse/TIKA-2417 Project: Tika Issue Type: Bug Affects Versions: 1.14

build failures

2017-07-03 Thread Allison, Timothy B.
Any ideas what might be causing the following? e.g.: https://builds.apache.org/job/Tika-trunk/1306/consoleFull java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:28

[jira] [Commented] (TIKA-2414) Upgrade gson to 2.8.1

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072907#comment-16072907 ] Hudson commented on TIKA-2414: -- FAILURE: Integrated in Jenkins build Tika-trunk #1306 (See [h

[jira] [Commented] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072908#comment-16072908 ] Hudson commented on TIKA-2415: -- FAILURE: Integrated in Jenkins build Tika-trunk #1306 (See [h

[jira] [Commented] (TIKA-2416) Upgrade dependencies in tika-eval

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072909#comment-16072909 ] Hudson commented on TIKA-2416: -- FAILURE: Integrated in Jenkins build Tika-trunk #1306 (See [h

[jira] [Comment Edited] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072862#comment-16072862 ] Tim Allison edited comment on TIKA-2415 at 7/3/17 8:30 PM: --- Y, I

[jira] [Resolved] (TIKA-2414) Upgrade gson to 2.8.1

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2414. --- Resolution: Fixed Fix Version/s: 1.16 > Upgrade gson to 2.8.1 > - > >

[jira] [Resolved] (TIKA-2413) Upgrade mime4j to 0.8.1

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2413. --- Resolution: Fixed Fix Version/s: 1.16 > Upgrade mime4j to 0.8.1 > --- > >

[jira] [Resolved] (TIKA-2416) Upgrade dependencies in tika-eval

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2416. --- Resolution: Fixed Fix Version/s: 1.16 > Upgrade dependencies in tika-eval >

[jira] [Created] (TIKA-2416) Upgrade dependencies in tika-eval

2017-07-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2416: - Summary: Upgrade dependencies in tika-eval Key: TIKA-2416 URL: https://issues.apache.org/jira/browse/TIKA-2416 Project: Tika Issue Type: Improvement Comp

[jira] [Resolved] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2415. --- Resolution: Won't Fix Too many problems...for now. > Upgrade libpst to 0.9.3 > --

[jira] [Comment Edited] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072862#comment-16072862 ] Tim Allison edited comment on TIKA-2415 at 7/3/17 7:52 PM: --- Y, I

[jira] [Commented] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072862#comment-16072862 ] Tim Allison commented on TIKA-2415: --- Y, I just found that we can't upgrade anyhow because

[jira] [Commented] (TIKA-2312) [Mp3Parser] expose fields form ID3TagsAndAudio

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072853#comment-16072853 ] Hudson commented on TIKA-2312: -- FAILURE: Integrated in Jenkins build Tika-trunk #1305 (See [h

[jira] [Commented] (TIKA-2368) Clean up SentimentParser dependencies

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072852#comment-16072852 ] Hudson commented on TIKA-2368: -- FAILURE: Integrated in Jenkins build Tika-trunk #1305 (See [h

[jira] [Commented] (TIKA-2313) Old Word document (Word 6.0, 1997) has a badly encoded(?) output.

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072854#comment-16072854 ] Hudson commented on TIKA-2313: -- FAILURE: Integrated in Jenkins build Tika-trunk #1305 (See [h

[jira] [Commented] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072842#comment-16072842 ] Luis Filipe Nassif commented on TIKA-2415: -- Hi Tim, Not sure if we should update

[jira] [Created] (TIKA-2415) Upgrade libpst to 0.9.3

2017-07-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2415: - Summary: Upgrade libpst to 0.9.3 Key: TIKA-2415 URL: https://issues.apache.org/jira/browse/TIKA-2415 Project: Tika Issue Type: Improvement Reporter: T

[jira] [Created] (TIKA-2414) Upgrade gson to 2.8.1

2017-07-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2414: - Summary: Upgrade gson to 2.8.1 Key: TIKA-2414 URL: https://issues.apache.org/jira/browse/TIKA-2414 Project: Tika Issue Type: Improvement Reporter: Tim

[jira] [Created] (TIKA-2413) Upgrade mime4j to 0.8.1

2017-07-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2413: - Summary: Upgrade mime4j to 0.8.1 Key: TIKA-2413 URL: https://issues.apache.org/jira/browse/TIKA-2413 Project: Tika Issue Type: Improvement Reporter: Ti

[jira] [Resolved] (TIKA-2412) Upgrade xerial to 3.19.3

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2412. --- Resolution: Fixed > Upgrade xerial to 3.19.3 > > > Key: TIKA-2

[jira] [Commented] (TIKA-2411) Clean up tika-bundle

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072811#comment-16072811 ] Hudson commented on TIKA-2411: -- FAILURE: Integrated in Jenkins build Tika-trunk #1304 (See [h

[jira] [Commented] (TIKA-2089) Macros not extracted from ppt files

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072810#comment-16072810 ] Hudson commented on TIKA-2089: -- FAILURE: Integrated in Jenkins build Tika-trunk #1304 (See [h

[jira] [Created] (TIKA-2412) Upgrade xerial to 3.19.3

2017-07-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2412: - Summary: Upgrade xerial to 3.19.3 Key: TIKA-2412 URL: https://issues.apache.org/jira/browse/TIKA-2412 Project: Tika Issue Type: Improvement Reporter: T

RE: Tika 1.15.1? -> 1.16

2017-07-03 Thread Allison, Timothy B.
Sounds good. I'll kick off regression tests now, with a goal of creating 1.16-rc1 on Wednesday 14:00 UTC? -Original Message- From: Mattmann, Chris A (3010) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Monday, July 3, 2017 2:24 PM To: dev@tika.apache.org Subject: Re: Tika 1.15.1? -> 1.16

Re: Tika 1.15.1? -> 1.16

2017-07-03 Thread Mattmann, Chris A (3010)
Hey Tim, if I don’t get it done by today, push 1.16 and we’ll put Age Detection in 1.17. ++ Chris Mattmann, Ph.D. Principal Data Scientist, Engineering Administrative Office (3010) Manager, NSF & Open Source Projects Formulat

[jira] [Resolved] (TIKA-2411) Clean up tika-bundle

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2411. --- Resolution: Fixed Fix Version/s: 1.16 Thank you, [~bobpaulin]! > Clean up tika-bundle > ---

[jira] [Commented] (TIKA-2335) Extract path info from Excel 2013 .xlsx and .xlsb

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072768#comment-16072768 ] Hudson commented on TIKA-2335: -- FAILURE: Integrated in Jenkins build Tika-trunk #1303 (See [h

[jira] [Resolved] (TIKA-2089) Macros not extracted from ppt files

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2089. --- Resolution: Fixed Fix Version/s: 1.16 > Macros not extracted from ppt files > --

[jira] [Comment Edited] (TIKA-2089) Macros not extracted from ppt files

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072754#comment-16072754 ] Tim Allison edited comment on TIKA-2089 at 7/3/17 5:32 PM: --- PPT s

[jira] [Reopened] (TIKA-2089) Macros not extracted from ppt files

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-2089: --- Assignee: Tim Allison PPT stores macros in a different npoifs than xls, doc. Until this is all stre

[jira] [Commented] (TIKA-2411) Clean up tika-bundle

2017-07-03 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072715#comment-16072715 ] Bob Paulin commented on TIKA-2411: -- opennlp-maxent and jwnl used to be a transitive depend

[jira] [Commented] (TIKA-2410) RTF parser is tagging non-bold text as bold

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072712#comment-16072712 ] Hudson commented on TIKA-2410: -- FAILURE: Integrated in Jenkins build Tika-trunk #1302 (See [h

[jira] [Resolved] (TIKA-2336) Upgrade to POI 3.17-beta1 when available

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2336. --- Resolution: Fixed Fix Version/s: 1.16 > Upgrade to POI 3.17-beta1 when available > -

[jira] [Resolved] (TIKA-2335) Extract path info from Excel 2013 .xlsx and .xlsb

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2335. --- Resolution: Fixed Fix Version/s: 1.16 > Extract path info from Excel 2013 .xlsx and .xlsb >

[jira] [Updated] (TIKA-2254) Provide chart support for MS Office documents

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2254: -- Fix Version/s: 1.16 (was: 1.15.1) > Provide chart support for MS Office documents

[jira] [Updated] (TIKA-2386) Improve digest options

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2386: -- Fix Version/s: 1.16 (was: 1.15.1) > Improve digest options > -

[jira] [Updated] (TIKA-2391) Extract

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2391: -- Fix Version/s: 1.16 (was: 1.15.1) > Extract

[jira] [Updated] (TIKA-1945) Powerpoint parser doesn't extract text from diagrams

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1945: -- Fix Version/s: 1.16 (was: 1.15.1) > Powerpoint parser doesn't extract text from di

[jira] [Updated] (TIKA-2379) tika-bundle 1.15 has wrong import of org.sfl4j.event package which does not exists

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2379: -- Fix Version/s: 1.16 (was: 1.15.1) > tika-bundle 1.15 has wrong import of org.sfl4j

[jira] [Updated] (TIKA-2377) Remove org.json from TEIParser

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2377: -- Fix Version/s: 1.16 (was: 1.15.1) > Remove org.json from TEIParser > -

[jira] [Updated] (TIKA-2384) Double close of InputStream in accept text/plain in tika-server

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2384: -- Fix Version/s: 1.16 (was: 1.15.1) > Double close of InputStream in accept text/pla

[jira] [Commented] (TIKA-2410) RTF parser is tagging non-bold text as bold

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072671#comment-16072671 ] Tim Allison commented on TIKA-2410: --- Thank you for opening this issue and supplying a tri

[jira] [Resolved] (TIKA-2410) RTF parser is tagging non-bold text as bold

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2410. --- Resolution: Fixed Fix Version/s: 1.16 > RTF parser is tagging non-bold text as bold > --

Re: Tika 1.15.1? -> 1.16

2017-07-03 Thread Tyler Bui-Palsulich
+1 for 1.16. Tyler On Mon, Jul 3, 2017 at 7:17 AM, Allison, Timothy B. wrote: > All, > I think we're now solidly at 1.16. Anyone still strongly in favor of > 1.15.1? > > Chris, > Will age detection be ready soon, or should we push that to 1.17? > > -Original Message- > From: Alliso

[jira] [Commented] (TIKA-2368) Clean up SentimentParser dependencies

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072548#comment-16072548 ] Hudson commented on TIKA-2368: -- FAILURE: Integrated in Jenkins build Tika-trunk #1301 (See [h

[jira] [Comment Edited] (TIKA-2403) Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072483#comment-16072483 ] Tim Allison edited comment on TIKA-2403 at 7/3/17 2:22 PM: --- Y. T

RE: Tika 1.15.1? -> 1.16

2017-07-03 Thread Allison, Timothy B.
All, I think we're now solidly at 1.16. Anyone still strongly in favor of 1.15.1? Chris, Will age detection be ready soon, or should we push that to 1.17? -Original Message- From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Friday, June 30, 2017 7:01 AM To: dev@tika.apa

Re: documenting configuration

2017-07-03 Thread Nick Burch
On Mon, 3 Jul 2017, Allison, Timothy B. wrote: To help a user configure a parameter in the PDFParser, I just started: https://wiki.apache.org/tika/TikaConfig. I realize, though, that I probably should update: https://tika.apache.org/1.15/configuring.html instead. Preferences, recommendations

documenting configuration

2017-07-03 Thread Allison, Timothy B.
To help a user configure a parameter in the PDFParser, I just started: https://wiki.apache.org/tika/TikaConfig. I realize, though, that I probably should update: https://tika.apache.org/1.15/configuring.html instead. Preferences, recommendations?

[jira] [Resolved] (TIKA-2403) Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2403. --- Resolution: Not A Problem > Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue > -

[jira] [Commented] (TIKA-2403) Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072483#comment-16072483 ] Tim Allison commented on TIKA-2403: --- Y. Thank you. Sorry for the delay. The text you d

[jira] [Commented] (TIKA-2374) Tika App -z should extract PDF inline images by default

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072464#comment-16072464 ] Hudson commented on TIKA-2374: -- FAILURE: Integrated in Jenkins build Tika-trunk #1300 (See [h

[jira] [Commented] (TIKA-2368) Clean up SentimentParser dependencies

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072416#comment-16072416 ] Tim Allison commented on TIKA-2368: --- For now, I'll rename Tika's SentimentParser to Senti

[jira] [Updated] (TIKA-2411) Clean up tika-bundle

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2411: -- Priority: Trivial (was: Major) > Clean up tika-bundle > > > Key: TI

[jira] [Created] (TIKA-2411) Clean up tika-bundle

2017-07-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2411: - Summary: Clean up tika-bundle Key: TIKA-2411 URL: https://issues.apache.org/jira/browse/TIKA-2411 Project: Tika Issue Type: Improvement Reporter: Tim A

[jira] [Commented] (TIKA-2389) Warn log level is pretty strong for missing JBIG2ImageReader

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072381#comment-16072381 ] Hudson commented on TIKA-2389: -- FAILURE: Integrated in Jenkins build Tika-trunk #1299 (See [h

[jira] [Resolved] (TIKA-2374) Tika App -z should extract PDF inline images by default

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2374. --- Resolution: Fixed Fix Version/s: 1.16 If a user does not supply a TikaConfig on the commandline,

[jira] [Commented] (TIKA-2403) Elasticsearch 5.2.2 - Ingest Node - PDF - Parsing Issue

2017-07-03 Thread Boopathi (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072355#comment-16072355 ] Boopathi commented on TIKA-2403: Hope you have received the file > Elasticsearch 5.2.2 - I

[jira] [Resolved] (TIKA-2389) Warn log level is pretty strong for missing JBIG2ImageReader

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2389. --- Resolution: Fixed Fix Version/s: 1.16 > Warn log level is pretty strong for missing JBIG2ImageRe

[jira] [Commented] (TIKA-2380) Upgrade to Jackcess 2.1.8 when available

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072303#comment-16072303 ] Hudson commented on TIKA-2380: -- FAILURE: Integrated in Jenkins build Tika-trunk #1298 (See [h

[jira] [Commented] (TIKA-2336) Upgrade to POI 3.17-beta1 when available

2017-07-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072302#comment-16072302 ] Hudson commented on TIKA-2336: -- FAILURE: Integrated in Jenkins build Tika-trunk #1298 (See [h

[jira] [Commented] (TIKA-2399) Version conflict with non-ASL jai-imageio-jpeg2000 and edu.ucar jj2000

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072295#comment-16072295 ] Tim Allison commented on TIKA-2399: --- For the next release (1.16?) should we exclude this

[jira] [Resolved] (TIKA-2404) XMLException in DOCX->TXT conversion

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2404. --- Resolution: Workaround Fix Version/s: 1.16 Without significant work, we can't fix this in POI's

[jira] [Resolved] (TIKA-2405) SAXParseException in text extraction from DOCX file

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2405. --- Resolution: Workaround Fix Version/s: 1.16 Without significant work, we can't fix this in POI's

[jira] [Resolved] (TIKA-2408) ZipException in text extraction from DOCX file

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2408. --- Resolution: Workaround Fix Version/s: 1.16 Without significant work, we can't fix this in POI's

[jira] [Resolved] (TIKA-2201) OutOfMemoryError on a reasonably sized document

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2201. --- Resolution: Workaround Fix Version/s: 1.16 Without significant work, we can't fix this in POI's

[jira] [Resolved] (TIKA-2147) ClassCastException on a valid Word template

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2147. --- Resolution: Fixed Fix Version/s: 1.15 Without significant work, we can't fix this in POI's DOM p

[jira] [Resolved] (TIKA-2376) Avoid org.json dependency

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2376. --- Resolution: Fixed Fix Version/s: 1.16 > Avoid org.json dependency > - >

[jira] [Resolved] (TIKA-1804) Tika use no free json.org

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1804. --- Resolution: Fixed Fix Version/s: 1.16 > Tika use no free json.org > - >

[jira] [Commented] (TIKA-2378) Error extracting text from application/x-msaccess mime type

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16072279#comment-16072279 ] Tim Allison commented on TIKA-2378: --- Many thanks, again, to James Ahlborn! [~sreynolds],

[jira] [Resolved] (TIKA-2378) Error extracting text from application/x-msaccess mime type

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2378. --- Resolution: Fixed Fix Version/s: 1.16 > Error extracting text from application/x-msaccess mime t

[jira] [Resolved] (TIKA-2380) Upgrade to Jackcess 2.1.8 when available

2017-07-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2380. --- Resolution: Fixed Fix Version/s: 1.16 > Upgrade to Jackcess 2.1.8 when available > -