Re: [VOTE] Release Apache Tika 2.8.0 Candidate #2

2023-05-14 Thread Dave Meikle
On Thu, 11 May 2023 at 21:08, Tim Allison wrote: > > Please vote on releasing this package as Apache Tika 2.8.0. > The vote is open for the next 72 hours and passes if a majority of at > least three +1 Tika PMC votes are cast. > > [ ] +1 Release this package as Apache Tika 2.8.0 > [ ] -1 Do not

[jira] [Commented] (TIKA-3884) MarianTranslator blocks on Windows

2022-11-06 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17629472#comment-17629472 ] Dave Meikle commented on TIKA-3884: --- Hi [~tallison]  - yes it is now. Just marking

[jira] [Resolved] (TIKA-3884) MarianTranslator blocks on Windows

2022-11-06 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-3884. --- Resolution: Fixed > MarianTranslator blocks on Wind

[jira] [Created] (TIKA-3884) MarianTranslator blocks on Windows

2022-10-17 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-3884: - Summary: MarianTranslator blocks on Windows Key: TIKA-3884 URL: https://issues.apache.org/jira/browse/TIKA-3884 Project: Tika Issue Type: Bug Components

[jira] [Resolved] (TIKA-3660) Add parser for TMX Files

2022-01-22 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-3660. --- Fix Version/s: 2.2.2 Resolution: Done Added in: https://github.com/apache/tika/commit

[jira] [Created] (TIKA-3660) Add parser for TMX Files

2022-01-22 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-3660: - Summary: Add parser for TMX Files Key: TIKA-3660 URL: https://issues.apache.org/jira/browse/TIKA-3660 Project: Tika Issue Type: New Feature Components

[jira] [Resolved] (TIKA-3636) Add MarianTranslator to support Marian NMT Engines

2021-12-30 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-3636. --- Resolution: Fixed > Add MarianTranslator to support Marian NMT Engi

[jira] [Commented] (TIKA-3636) Add MarianTranslator to support Marian NMT Engines

2021-12-30 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467057#comment-17467057 ] Dave Meikle commented on TIKA-3636: --- Merged in https://github.com/apache/tika/commit

[jira] [Updated] (TIKA-3636) Add MarianTranslator to support Marian NMT Engines

2021-12-30 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-3636: -- Fix Version/s: 2.2.1 > Add MarianTranslator to support Marian NMT Engi

[jira] [Created] (TIKA-3636) Add MarianTranslator to support Marian NMT Engines

2021-12-30 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-3636: - Summary: Add MarianTranslator to support Marian NMT Engines Key: TIKA-3636 URL: https://issues.apache.org/jira/browse/TIKA-3636 Project: Tika Issue Type: New

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Dave Meikle
On Mon, 20 Dec 2021 at 15:59, Tim Allison wrote: > A candidate for the Tika 2.2.1 release is available at: > https://dist.apache.org/repos/dist/dev/tika/2.2.1 > > The release candidate is a zip archive of the sources in: > https://github.com/apache/tika/tree/2.2.1-rc3/ > > The SHA-512 checksum

[jira] [Commented] (TIKA-3453) SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" Defaulting to no-operation (NOP) logger implementation for tika-docker 2.0.0-BETA and 2.1.0

2021-10-10 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426851#comment-17426851 ] Dave Meikle commented on TIKA-3453: --- Good spot [~lewismc]. I got to the same conclusion as you. Merging

Re: [VOTE] Release Apache Tika 2.0.0 Candidate #1

2021-07-18 Thread Dave Meikle
+1 Cheers, Dave On Wed, 14 Jul 2021 at 19:16, Tim Allison wrote: > All, > A candidate for the Tika 2.0.0 release is available > at: > https://dist.apache.org/repos/dist/dev/tika/2.0.0 > > The release candidate is a zip archive

Re: logging formatter configuration compatible with StackDriver

2021-06-17 Thread Dave Meikle
>> >> >>> >> On Fri, 11 Jun 2021, Cristian Zamfir wrote: >>> >> > I think for most people it would be quite critical to have logs >>> working. Do >>> >> > you happen to know how I can reach out to the person maintaining >>> the docker >>> >> > images https://hub.docker.com/u/dameikle to see if they are >>> available to >>> >> > update the images? Sounds like it is mostly >>> >> > https://hub.docker.com/u/dameikle >>> >> >>> >> Paging our very own Dave Meikle! >>> >> >>> >> Nick >>> >>

Re: [VOTE] Release Apache Tika 1.25 Candidate #2

2020-11-25 Thread Dave Meikle
On Wed, 25 Nov 2020 at 12:20, Tim Allison wrote: > Please vote on releasing this package as Apache Tika 1.25. > The vote is open for the next 72 hours and passes if a majority of at > least three +1 Tika PMC votes are cast. > > [ ] +1 Release this package as Apache Tika 1.25 > [ ] -1 Do not

[jira] [Created] (TIKA-3227) Allow Tika Server to skip embedded files through HTTP Header

2020-11-11 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-3227: - Summary: Allow Tika Server to skip embedded files through HTTP Header Key: TIKA-3227 URL: https://issues.apache.org/jira/browse/TIKA-3227 Project: Tika Issue

[jira] [Resolved] (TIKA-3191) Issue with GrobidJournalParser

2020-11-09 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-3191. --- Fix Version/s: 1.25 Resolution: Fixed Added to branch_1x in [https://github.com/apache/tika

[jira] [Assigned] (TIKA-3191) Issue with GrobidJournalParser

2020-11-09 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-3191: - Assignee: Dave Meikle > Issue with GrobidJournalPar

[jira] [Resolved] (TIKA-3156) Missing content from .odt file with hyperlinked image

2020-11-09 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-3156. --- Fix Version/s: 1.25 Resolution: Fixed Resolved in main in: [https://github.com/apache/tika

[jira] [Assigned] (TIKA-3156) Missing content from .odt file with hyperlinked image

2020-11-08 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-3156: - Assignee: Dave Meikle > Missing content from .odt file with hyperlinked im

[jira] [Commented] (TIKA-3189) Add FrameMaker MIF Parser

2020-09-21 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199597#comment-17199597 ] Dave Meikle commented on TIKA-3189: --- No worries. It was good getting up to speed with all the awesome

[jira] [Commented] (TIKA-3189) Add FrameMaker MIF Parser

2020-09-21 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199584#comment-17199584 ] Dave Meikle commented on TIKA-3189: --- Hi [~tallison] Looks like we were off trying to do a similar thing

[jira] [Closed] (TIKA-2976) Add an XLZ parser

2020-09-20 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle closed TIKA-2976. - Resolution: Implemented Implemented in: [https://github.com/apache/tika/commit

[jira] [Updated] (TIKA-2976) Add an XLZ parser

2020-09-20 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-2976: -- Fix Version/s: 1.23 > Add an XLZ parser > - > > Ke

[jira] [Updated] (TIKA-3188) Add IDML Parser

2020-09-20 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-3188: -- Description: Add a basic IDML parser to get content, XMP metadata and spread counts. > Add IDML Par

[jira] [Resolved] (TIKA-3188) Add IDML Parser

2020-09-20 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-3188. --- Resolution: Implemented Implemented in commit: [https://github.com/apache/tika/commit

[jira] [Updated] (TIKA-3188) Add IDML Parser

2020-09-20 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-3188: -- Fix Version/s: 1.25 > Add IDML Parser > --- > > Ke

Re: release planning?

2020-09-16 Thread Dave Meikle
Hi Tim, I have a couple of things to push (MIF and IDML parsers). Is there still time? I didn't get a chance to complete it before going in for surgery the other week. Thanks, Dave On Wed, 9 Sep 2020 at 17:31, Tim Allison wrote: > Hi Lee, > Thank you for those PRs. I merged them into main

[jira] [Created] (TIKA-3189) Add FrameMaker MIF Parser

2020-09-01 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-3189: - Summary: Add FrameMaker MIF Parser Key: TIKA-3189 URL: https://issues.apache.org/jira/browse/TIKA-3189 Project: Tika Issue Type: Task Components: parser

[jira] [Created] (TIKA-3188) Add IDML Parser

2020-08-31 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-3188: - Summary: Add IDML Parser Key: TIKA-3188 URL: https://issues.apache.org/jira/browse/TIKA-3188 Project: Tika Issue Type: Task Components: parser

[jira] [Commented] (TIKA-3121) Rename master branch

2020-07-13 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17156951#comment-17156951 ] Dave Meikle commented on TIKA-3121: --- Hi [~tallison], Just pushed a main branch there and let Drew know

Re: [EXTERNAL] Do we have a community supported approach for deploying Tika Server in production?

2020-01-08 Thread Dave Meikle
Hi Eric, Will take a look. On a related note, I've created a new repos: https://github.com/apache/tika-docker Thinking based on looking at the PRs and Issues on LogicalSpark docker-tikaserver, I'll create an updated docker file using what you've added here and look to publish builds to docker

[jira] [Commented] (TIKA-3014) XLIFF12Parser fails with ToXMLHandler

2019-12-18 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999571#comment-16999571 ] Dave Meikle commented on TIKA-3014: --- Scratch that, easier just to map lang over to XHTML one as no need

[jira] [Commented] (TIKA-3014) XLIFF12Parser fails with ToXMLHandler

2019-12-18 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999507#comment-16999507 ] Dave Meikle commented on TIKA-3014: --- Good spot. I think we need to add the explicit declaration

[jira] [Created] (TIKA-2976) Add an XLZ parser

2019-10-29 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-2976: - Summary: Add an XLZ parser Key: TIKA-2976 URL: https://issues.apache.org/jira/browse/TIKA-2976 Project: Tika Issue Type: New Feature Components: parser

[jira] [Resolved] (TIKA-2894) Add support for WebAssembly (Content-Type application/wasm, or .wasm extension)

2019-10-28 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2894. --- Fix Version/s: 1.23 Resolution: Fixed Added to master

[jira] [Assigned] (TIKA-2894) Add support for WebAssembly (Content-Type application/wasm, or .wasm extension)

2019-10-28 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2894: - Assignee: Dave Meikle > Add support for WebAssembly (Content-Type application/wasm, or .w

[jira] [Assigned] (TIKA-2900) Removing comments from *.docx, *.pdf files

2019-10-26 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2900: - Assignee: Dave Meikle > Removing comments from *.docx, *.pdf fi

[jira] [Resolved] (TIKA-2975) XLIFF 1.2 Parser

2019-10-26 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2975. --- Fix Version/s: 1.23 Resolution: Fixed > XLIFF 1.2 Par

[jira] [Commented] (TIKA-2975) XLIFF 1.2 Parser

2019-10-26 Thread Dave Meikle (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960484#comment-16960484 ] Dave Meikle commented on TIKA-2975: --- Merged into master in 80b533b86cf8e4c8e090e95479ef60f2a641194f

[jira] [Created] (TIKA-2975) XLIFF 1.2 Parser

2019-10-26 Thread Dave Meikle (Jira)
Dave Meikle created TIKA-2975: - Summary: XLIFF 1.2 Parser Key: TIKA-2975 URL: https://issues.apache.org/jira/browse/TIKA-2975 Project: Tika Issue Type: New Feature Components: parser

[jira] [Commented] (TIKA-2760) LinkContentHandler does not report hyperlinks

2018-11-01 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671189#comment-16671189 ] Dave Meikle commented on TIKA-2760: --- Hi [~markus17], Looking at the Nutch code I can see

[jira] [Commented] (TIKA-2760) LinkContentHandler does not report hyperlinks

2018-10-31 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671148#comment-16671148 ] Dave Meikle commented on TIKA-2760: --- Hi [~markus17], I used your test but moved it in the tika-parsers

[jira] [Updated] (TIKA-2760) LinkContentHandler does not report hyperlinks

2018-10-31 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-2760: -- Attachment: TIKA-2760 - Test for Outlinks.diff > LinkContentHandler does not report hyperli

[jira] [Commented] (TIKA-2760) LinkContentHandler does not report hyperlinks

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667966#comment-16667966 ] Dave Meikle commented on TIKA-2760: --- [~markus17] - is it typically the HTML parser being used in Nutch

[jira] [Commented] (TIKA-2630) Wrong height and width metadata for JPEG images

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667918#comment-16667918 ] Dave Meikle commented on TIKA-2630: --- After writing it, I know it really won't given the class of metadata

[jira] [Commented] (TIKA-2630) Wrong height and width metadata for JPEG images

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667903#comment-16667903 ] Dave Meikle commented on TIKA-2630: --- Thanks for raising this one. Short term we can add in the reading

[jira] [Assigned] (TIKA-2630) Wrong height and width metadata for JPEG images

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2630: - Assignee: Dave Meikle > Wrong height and width metadata for JPEG ima

[jira] [Commented] (TIKA-2767) Problem with import xlsx with null cells

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667840#comment-16667840 ] Dave Meikle commented on TIKA-2767: --- Hi [~iodor] - I've tried to recreate this by building my own Excel

[jira] [Resolved] (TIKA-2599) Hyperlink surrounded by Italics not closed Properly

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2599. --- Resolution: Fixed Committed to branch_1x in 324cbd2eb4d64f1e34aba9789ee8b06cbf4d991e and master

[jira] [Commented] (TIKA-2599) Hyperlink surrounded by Italics not closed Properly

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667823#comment-16667823 ] Dave Meikle commented on TIKA-2599: --- Committed to branch_1x in 324cbd2eb4d64f1e34aba9789ee8b06cbf4d991e

[jira] [Updated] (TIKA-2599) Hyperlink surrounded by Italics not closed Properly

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-2599: -- Fix Version/s: 1.20 > Hyperlink surrounded by Italics not closed Prope

[jira] [Assigned] (TIKA-2599) Hyperlink surrounded by Italics not closed Properly

2018-10-29 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2599: - Assignee: Dave Meikle > Hyperlink surrounded by Italics not closed Prope

[jira] [Created] (TIKA-2740) Update Python dependency check for TesseractOCR Parser rotation.py script

2018-09-28 Thread Dave Meikle (JIRA)
Dave Meikle created TIKA-2740: - Summary: Update Python dependency check for TesseractOCR Parser rotation.py script Key: TIKA-2740 URL: https://issues.apache.org/jira/browse/TIKA-2740 Project: Tika

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390317#comment-16390317 ] Dave Meikle commented on TIKA-1518: --- [~talli...@mitre.org] - ah it looks like the proxy settings aren't

[jira] [Comment Edited] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390284#comment-16390284 ] Dave Meikle edited comment on TIKA-1518 at 3/7/18 9:41 PM: --- It is a choice we

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390284#comment-16390284 ] Dave Meikle commented on TIKA-1518: --- It is a choice we have to make. There are three main routes

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390241#comment-16390241 ] Dave Meikle commented on TIKA-1518: --- {quote}I do have Docker installed, [0] but it is Windows, and I've

[jira] [Comment Edited] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390202#comment-16390202 ] Dave Meikle edited comment on TIKA-1518 at 3/7/18 8:51 PM: --- Sorry [~talli

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390202#comment-16390202 ] Dave Meikle commented on TIKA-1518: --- Sorry [~talli...@mitre.org] - this is me getting too excited. I'll

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385167#comment-16385167 ] Dave Meikle commented on TIKA-1518: --- As the current Dockerfile was out of date, I've updated it to use

[jira] [Assigned] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-1518: - Assignee: Dave Meikle > Docker with Tika Server > --- > >

Fwd: Travel Assistance applications open. Please inform your communities

2018-02-16 Thread Dave Meikle
Hello, With ApacheCon NA coming up later this year, please see the below from the Travel Assistance Committee (TAC). Cheers, Dave The Travel Assistance Committee (TAC) are pleased to announce that travel assistance applications for ApacheCon NA 2018 are now open! We will be supporting

[jira] [Commented] (TIKA-2509) TesseractOCRParser ignores configured ImageMagickPath in processImage method

2018-01-15 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326160#comment-16326160 ] Dave Meikle commented on TIKA-2509: --- Created new improvement ticket for the Python path configuration

[jira] [Created] (TIKA-2548) Add Python Path configuration to TesseractOCRParser

2018-01-15 Thread Dave Meikle (JIRA)
Dave Meikle created TIKA-2548: - Summary: Add Python Path configuration to TesseractOCRParser Key: TIKA-2548 URL: https://issues.apache.org/jira/browse/TIKA-2548 Project: Tika Issue Type

[jira] [Resolved] (TIKA-2509) TesseractOCRParser ignores configured ImageMagickPath in processImage method

2018-01-15 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2509. --- Resolution: Fixed Fix Version/s: 1.18 Updated in  [0b9aa9b5efde795f6b863c987abff5be07530a41

[jira] [Assigned] (TIKA-2509) TesseractOCRParser ignores configured ImageMagickPath in processImage method

2018-01-15 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2509: - Assignee: Dave Meikle > TesseractOCRParser ignores configured ImageMagickPath in processIm

[jira] [Assigned] (TIKA-2385) Tesseract OCR rotation.py not run

2017-11-24 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2385: - Assignee: Dave Meikle > Tesseract OCR rotation.py not

[jira] [Resolved] (TIKA-2347) Underlined text is not decorated as such when extracting from word documents

2017-11-23 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2347. --- Resolution: Fixed Fix Version/s: 1.17 Committed in [639f3bf361a08210da8fae68e3eeb4e12df6c4de

[jira] [Assigned] (TIKA-2347) Underlined text is not decorated as such when extracting from word documents

2017-11-23 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2347: - Assignee: Dave Meikle > Underlined text is not decorated as such when extracting from w

[ANNOUNCE] Welcome Madhav Sharan as Tika Committer and PMC Member

2017-08-31 Thread Dave Meikle
Hello Everyone, Please join me in welcoming Madhav Sharan as a PMC Member and Committer to the project! Welcome to the team, Madhav. Feel free to say a bit about yourself and how you got involved in Tika. Cheers, Dave

Re: [VOTE] Release Apache Tika 1.16 Candidate #1

2017-07-12 Thread Dave Meikle
On 8 July 2017 at 03:40, Tim Allison wrote: > > A candidate for the Tika 1.16 release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources in: > https://github.com/apache/tika/tree/1.16-rc1 > > The SHA1

[jira] [Resolved] (TIKA-2357) Allow Tesseract PSM up to 13

2017-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2357. --- Resolution: Fixed Assignee: Dave Meikle Merged in [0aaa121|https://github.com/apache/tika

[jira] [Created] (TIKA-2357) Allow Tesseract PSM up to 13

2017-05-08 Thread Dave Meikle (JIRA)
Dave Meikle created TIKA-2357: - Summary: Allow Tesseract PSM up to 13 Key: TIKA-2357 URL: https://issues.apache.org/jira/browse/TIKA-2357 Project: Tika Issue Type: Improvement

[jira] [Resolved] (TIKA-2297) Add Lingo24 Language Detector

2017-03-13 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2297. --- Resolution: Fixed Added in commit 64652824f4fd7e9bbbd0c66701c6a814d3739157 (https://github.com/apache

[jira] [Commented] (TIKA-2297) Add Lingo24 Language Detector

2017-03-13 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907117#comment-15907117 ] Dave Meikle commented on TIKA-2297: --- Failure due to issue communicating with https

[jira] [Resolved] (TIKA-2292) Update CXF version to 3.0.12

2017-03-12 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-2292. --- Resolution: Fixed Assignee: Dave Meikle (was: Sergey Beryozkin) Committed in https

[jira] [Updated] (TIKA-2297) Add Lingo24 Language Detector

2017-03-11 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-2297: -- Fix Version/s: 1.15 > Add Lingo24 Language Detec

[jira] [Created] (TIKA-2297) Add Lingo24 Language Detector

2017-03-11 Thread Dave Meikle (JIRA)
Dave Meikle created TIKA-2297: - Summary: Add Lingo24 Language Detector Key: TIKA-2297 URL: https://issues.apache.org/jira/browse/TIKA-2297 Project: Tika Issue Type: Improvement

[jira] [Assigned] (TIKA-2003) Tika 1.13 gpg signature not validating.

2016-06-15 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle reassigned TIKA-2003: - Assignee: Dave Meikle > Tika 1.13 gpg signature not validat

[jira] [Resolved] (TIKA-1972) Download page points to 1.12 which is not on the ASF mirror hosts anymore

2016-05-16 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-1972. --- Resolution: Fixed Assignee: Dave Meikle The website is up to date now (pushed after mirror

[jira] [Commented] (TIKA-1972) Download page points to 1.12 which is not on the ASF mirror hosts anymore

2016-05-16 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284782#comment-15284782 ] Dave Meikle commented on TIKA-1972: --- Hi [~s...@apache.org] No, we have done a few like this. I just

[jira] [Commented] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275763#comment-15275763 ] Dave Meikle commented on TIKA-1885: --- Good point re Stream. Checking for -1 from read() will be more

[jira] [Resolved] (TIKA-1939) Preparation for Tika 1.13 release

2016-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-1939. --- Resolution: Fixed > Preparation for Tika 1.13 rele

[jira] [Resolved] (TIKA-1955) MIME types updates and additions for Scientific Data based on TREC-DD-Polar

2016-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-1955. --- Resolution: Fixed > MIME types updates and additions for Scientific Data based on TREC-DD-Po

[jira] [Resolved] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-1885. --- Resolution: Fixed Code committed in d447193f29531df3022f5137b8f0ec1c73e58cc8 > Tika MIME upda

[jira] [Comment Edited] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275549#comment-15275549 ] Dave Meikle edited comment on TIKA-1885 at 5/8/16 10:31 AM: Have incorporated

[jira] [Commented] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-05-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275549#comment-15275549 ] Dave Meikle commented on TIKA-1885: --- Have incorporated this code so as not to block TIKA-1955. Ended up

[jira] [Resolved] (TIKA-1965) Added types to Grobid quantities parser

2016-05-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-1965. --- Resolution: Fixed Tested locally and added in 8e4c3ff0a37fa7a64f5f675ffb7c0f7a8322cfc4. Thanks

[jira] [Commented] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-05-07 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275267#comment-15275267 ] Dave Meikle commented on TIKA-1885: --- Hi [~adeshgup] - Just reviewing the pull request. Do you have any

[jira] [Commented] (TIKA-1966) Issue in parsing iWorksDocument with Apache Tika

2016-05-04 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270642#comment-15270642 ] Dave Meikle commented on TIKA-1966: --- Yes, the iWorks 13 formats are very different. I have done some work

[jira] [Commented] (TIKA-1939) Preparation for Tika 1.13 release

2016-05-04 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270316#comment-15270316 ] Dave Meikle commented on TIKA-1939: --- Just reviewing the two remaining items (TIKA-1885 and TIKA-1955

[jira] [Commented] (TIKA-1705) Update ASM dependency to 5.0.4

2015-08-11 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681530#comment-14681530 ] Dave Meikle commented on TIKA-1705: --- Thanks [~thetaphi]. Have made the change

[jira] [Resolved] (TIKA-1705) Update ASM dependency to 5.0.4

2015-08-10 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle resolved TIKA-1705. --- Resolution: Fixed Assignee: Dave Meikle Fix Version/s: 1.11 Fixed committed

[jira] [Commented] (TIKA-1705) Update ASM dependency to 5.0.4

2015-08-10 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680853#comment-14680853 ] Dave Meikle commented on TIKA-1705: --- Committed in r1695177. Thanks [~thetaphi]! Update

[jira] [Updated] (TIKA-776) ExifTool Embedder

2015-08-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-776: - Fix Version/s: (was: 1.10) 1.11 * Pushed to 1.11 following 1.10 release ExifTool

[jira] [Updated] (TIKA-1435) Update rome dependency to 1.5

2015-08-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-1435: -- Fix Version/s: (was: 1.10) 1.11 * Pushed to 1.11 following 1.10 release Update

[jira] [Updated] (TIKA-1106) CLAVIN Integration

2015-08-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-1106: -- Fix Version/s: (was: 1.10) 1.11 * Pushed to 1.11 following 1.10 release CLAVIN

[jira] [Updated] (TIKA-987) Embedded drawing (SHAPE MERGEFORMAT) sometimes not extracted

2015-08-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-987: - Fix Version/s: (was: 1.10) 1.11 * Pushed to 1.11 following 1.10 release Embedded

[jira] [Updated] (TIKA-1672) Integrate tika-java7 component

2015-08-08 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-1672: -- Fix Version/s: (was: 1.10) 1.11 * Pushed to 1.11 following 1.10 release
