Re: [VOTE] Release Apache Tika 2.9.2 Candidate #2

2024-04-01 Thread Oleg Tikhonov
+1, Thanks. On Mon, 1 Apr 2024 at 23:36 Tim Allison wrote: > Any fellow devs able to vote? We need one more vote. Thank you! > > On Tue, Mar 26, 2024 at 12:22 PM Tilman Hausherr > wrote: > > > +1 > > > > successful build on Windows 10, oracle jdk 1.8.0_391 > > > > Tilman > > > > On 26.03.2024

Re: [VOTE] Release Apache Tika 2.9.1 Candidate #1

2023-10-18 Thread Oleg Tikhonov
+1 Jdk 8 and 11, ubuntu 20 On Tue, 17 Oct 2023 at 21:05 Tilman Hausherr wrote: > +1 > > successful build on german windows on jdk 11.0.20 > > Tilman > > On 17.10.2023 13:13, Tim Allison wrote: > > A candidate for the Tika 2.9.1 release is available at: > >

Re: [VOTE] Release Apache Tika 2.9.0 Candidate #1

2023-08-25 Thread Oleg Tikhonov
Here is mine +1 Thanks On Fri, 25 Aug 2023 at 11:48 Konstantin Gribov wrote: > Hi, folks. > > Built successfully on ArchLinux, OpenJdk 11 & 17 (Temurin 11.0.20+8 and > 17.0.8+7) with tesseract 5.3.2 and leptonica 1.83.1. > > SHA512 and GPG signatures look fine to me. > > [x] +1 Release this
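The SHA512 check mentioned in the quoted vote can be sketched as follows. This is a minimal illustration, not the project's release tooling: the helper names `sha512_of` and `checksum_matches` are hypothetical, and it assumes the published digest is the first whitespace-separated token of the checksum text.

```python
import hashlib
import hmac


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the file's digest with a published one.

    Assumption: the digest is the first whitespace-separated token of
    `expected_hex`; case and surrounding whitespace are ignored.
    """
    expected = expected_hex.strip().split()[0].lower()
    return hmac.compare_digest(sha512_of(path), expected)
```

A GPG signature check is a separate step and is not covered by this sketch.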

Re: [VOTE] Release Apache Tika 2.8.0 Candidate #2

2023-05-13 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 2.8.0 Ubuntu 20.04, open jdk 11, basic stuff. On Sat, 13 May 2023 at 18:03 Tilman Hausherr wrote: > +1 > > Successful build on latest oracle jdk8 on german windows. > > Tilman > > On 11.05.2023 22:07, Tim Allison wrote: > > A candidate for the Tika

Re: [VOTE] Apache Tika 2.8.0 Release Candidate 1

2023-05-10 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 2.8.0 Ubuntu 20, java 11. Thanks, Oleg > On Tue, May 9, 2023, 11:40 AM Tim Allison wrote: > > > A candidate for the Tika 2.8.0 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/2.8.0 > > > > The release candidate is a zip

Re: [VOTE] Release Apache Tika 2.7.0 Candidate #1

2023-02-02 Thread Oleg Tikhonov
Hey, +1 Ubuntu, jdk 8 (Oracle). Thanks, Oleg On Fri, Feb 3, 2023 at 6:09 AM Tilman Hausherr wrote: > +1 > > builds on german W10 with jdk8 > > Tilman > > On 31.01.2023 20:13, Tim Allison wrote: > > A candidate for the Tika 2.7.0 release is available at: > >

Re: [VOTE] Release Apache Tika 2.6.0 Candidate #1

2022-11-04 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 2.6.0 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 38:54 min [INFO] Finished at:

Re: Possibly speeding up tests with Gradle - anyone interested?

2022-10-05 Thread Oleg Tikhonov
Hi Nick, Honestly, I am trying to port our project to Gradle, but it is not going well. It is a good idea. If some folks can help, we can do it together. +1 Cheers, Oleg On Wed, Oct 5, 2022, 22:05 Nick Burch wrote: > Hi All > > At ApacheCon this week, a Bob and myself ended up chatting with the folks >

Re: [VOTE] Release Apache Tika 2.5.0 Candidate #1

2022-09-30 Thread Oleg Tikhonov
Ubuntu 20.04, java sdk 11, +1 Thanks On Fri, Sep 30, 2022, 21:33 Tilman Hausherr wrote: > +1 > > builds on windows 10, oracle jdk1.8.0_341 > > Tilman > > On 30.09.2022 16:12, Tim Allison wrote: > > A candidate for the Tika 2.5.0 release is available at: > >

Re: [VOTE] Release Apache Tika 1.28.4 Candidate #1

2022-06-16 Thread Oleg Tikhonov
Hey, [x] +1 Release this package as Apache Tika 1.28.4 Java 8, ubuntu 20, basic stuff. Thanks, Oleg On Thu, Jun 16, 2022, 17:42 Konstantin Gribov wrote: > Built successfully on ArchLinux, OpenJDK 11 & 17 (Temurin-11.0.15+10 & > 17.0.3+7) w/ Tesseract 5.1.0, Leptonica 1.82. > The issue with the

Re: [VOTE] Release Apache Tika 2.4.1 Candidate #1

2022-06-15 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 2.4.1 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 23:55 min [INFO] Finished at:

Re: [VOTE] Release Apache Tika 1.28.3 Candidate #1

2022-05-26 Thread Oleg Tikhonov
Hi, Here is +1, ubuntu, java 11, x86_64. Thanks, Oleg On Thu, May 26, 2022, 11:04 Tilman Hausherr wrote: > +1 > > Tilman > > Am 23.05.2022 um 20:38 schrieb Tim Allison: > > I'm indifferent but lean slightly towards going forward as is. > > > > If anyone has a hesitation, I'm happy to revert the

Re: next release: 1.28.3?

2022-05-18 Thread Oleg Tikhonov
Good idea! +1. Cheers, Oleg On Wed, May 18, 2022, 17:11 Tim Allison wrote: > All, > I propose kicking off a release for 1.28.3 early next week. I've updated > some dependencies. What do you think? > > Best, > > Tim >

Re: [VOTE] Release Apache Tika 2.4.0 Candidate #1

2022-04-29 Thread Oleg Tikhonov
Hi, +1, Ubuntu 20, x86, Java 11. Thanks! > On 29 Apr 2022, at 2:23, Tim Allison wrote: > > A candidate for the Tika 2.4.0 release is available at: > https://dist.apache.org/repos/dist/dev/tika/2.4.0 > > The release candidate is a zip archive of the sources in: >

Re: [VOTE] Release Apache Tika 1.28.2 Candidate #2

2022-04-29 Thread Oleg Tikhonov
Hi, +1. Basic stuff, linux ubuntu 20, x86, java 11. Thanks. On Thu, Apr 28, 2022, 20:23 Tilman Hausherr wrote: > +1 > > Tilman > > Am 28.04.2022 um 16:54 schrieb Tim Allison: > > A candidate for the Tika 1.28.2 release is available at: > >https://dist.apache.org/repos/dist/dev/tika/1.28.2 >

Re: [VOTE] Release Apache Tika 1.28.1 Candidate #1

2022-02-10 Thread Oleg Tikhonov
+1 , ubuntu 20.04, open jdk 11. Thanks, Oleg On Fri, Feb 11, 2022, 04:34 David Meikle wrote: > Hello, > > On Tue, 8 Feb 2022 at 18:22, Tim Allison wrote: > > > A candidate for the Tika 1.28.1 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/1.28.1 > > > > The release

Re: [VOTE] Release Apache Tika 2.3.0 Candidate 1

2022-02-06 Thread Oleg Tikhonov
Hi, Linux Ubuntu 20.04, java 11. +1 Thanks, Oleg On Sun, Feb 6, 2022, 22:05 Konstantin Gribov wrote: > Hi, folks. > > SHA512 checksums and GPG signatures are fine. > > Built successfully on ArchLinux, OpenJDK 17 & 11 (Temurin-17.0.1+12 & > Temurin-11.0.13+8), Tesseract 5.0.1-2, Leptonica

Re: [VOTE] Release Apache Tika 1.28 Candidate #3

2021-12-21 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 1.28 mvn clean install -U OK *OS and arch*: Linux oleg-vb 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux *Java version*: openjdk version "1.8.0_312" OpenJDK Runtime Environment (build

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 2.2.1 mvn clean install -U *OK* OS and arch: Linux oleg-vb 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux Java version: openjdk version "1.8.0_312" OpenJDK Runtime Environment (build

Re: [VOTE] Release Apache Tika 2.2.0 Candidate #1

2021-12-14 Thread Oleg Tikhonov
+1 > On 15 Dec 2021, at 0:01, Tim Allison wrote: > > +1 > > On Tue, Dec 14, 2021 at 4:31 PM Lewis John McGibbney > wrote: > >> I'll submit a PR for the README but I think it's also worthwile to augment >> the release management guide so that the message to review the release >> candidate

Re: [VOTE] Release Apache Tika 2.1.0 Candidate #2

2021-08-23 Thread Oleg Tikhonov
+1 basic stuff, ubuntu 20.04, java 11 Thanks, Oleg On Mon, Aug 23, 2021, 20:58 Konstantin Gribov wrote: > Hi, Tim. > > SHA512 and gpg signatures are fine, build succeeds on Linux/OpenJDK11 > except Tesseract issue (same as before, 4.1.1 extracts "Page?2" instead of > "Page 2" in multipage

Re: [DISCUSS] Support Elasticsearch in the tika-pipes module?

2021-07-26 Thread Oleg Tikhonov
Hi Tim, I would prefer to cut our support for licenses outside the Apache realm. Thanks, Oleg On Tue, Jul 27, 2021, 00:08 Tim Allison wrote: > All, > > As you may have heard, Amazon forked the last Apache licensed > version of Elasticsearch and is now releasing it as pure ASL 2.0 under > the name

Re: [VOTE] Release Apache Tika 2.0.0 Candidate #1

2021-07-18 Thread Oleg Tikhonov
+1 Thanks, Oleg > On 19 Jul 2021, at 4:04, Dave Meikle wrote: > > +1 > > Cheers, > Dave > > On Wed, 14 Jul 2021 at 19:16, Tim Allison wrote: > >> All, >> A candidate for the Tika 2.0.0 release is available >> at: >>

Re: [VOTE] Release Apache Tika 1.27 Candidate #1

2021-07-02 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.27 > On 2 Jul 2021, at 21:21, Tilman Hausherr wrote: > > +1 > > Tilman > > Am 30.06.2021 um 22:03 schrieb Tim Allison: >> A candidate for the Tika 1.27 release is available at: >> https://dist.apache.org/repos/dist/dev/tika/1.27 >> >> The KEYS

Re: [VOTE] Release Apache Tika 2.0.0-BETA Candidate #1

2021-05-21 Thread Oleg Tikhonov
Hi Tim, My +1. Ubuntu 20, basic stuff. Java 11. Best regards, Oleg > On 19 May 2021, at 18:29, Tim Allison wrote: > > All, > > A candidate for the Tika 2.0.0-BETA release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources

Re: 2.0.0-BETA?

2021-05-11 Thread Oleg Tikhonov
Hi Tim, Thanks for the effort! +1. BR, Oleg On Tue, May 11, 2021, 16:51 Tim Allison wrote: > All, > What would you say to a beta release towards the end of this > week/beginning of next? > > Cheers, > > Tim >

Re: Release 1.27?

2021-04-28 Thread Oleg Tikhonov
+1 On Wed, Apr 28, 2021, 19:22 Tim Allison wrote: > All, > > There have been a number of key fixes in 1.x and some security fixes > in some of our dependencies. Any objections to starting the release > process for 1.27 in the next few weeks? Any blockers we need to fix > for 1.27? > >

Re: [VOTE] Accept tika-helm source code into the Apache Tika project

2021-04-10 Thread Oleg Tikhonov
Great! +1 On Fri, Apr 9, 2021, 06:10 Lewis John McGibbney wrote: > Hi dev@, > > I am opening this VOTE with the goal of donating the tika-helm source code > [0] into the Apache Tika project. > Tika-helm is a Helm chart [1] to deploy Apache Tika on Kubernetes (K8s) > [2]. More specifically the

Re: [VOTE] Release Apache Tika 1.26 Candidate #1

2021-03-25 Thread Oleg Tikhonov
[INFO] [INFO] Reactor Summary for Apache Tika 1.26: [INFO] [INFO] Apache Tika parent . SUCCESS [ 40.841 s] [INFO] Apache Tika core ... SUCCESS [01:08 min] [INFO]

Re: [VOTE] Release Apache Tika 2.0.0-ALPHA Candidate #1

2021-01-15 Thread Oleg Tikhonov
+1. Good job! On Thu, Jan 14, 2021 at 8:44 PM Tilman Hausherr wrote: > +1 > > Tilman > > Am 14.01.2021 um 02:19 schrieb Tim Allison: > > All, > > > > A candidate for the Tika 2.0.0-ALPHA release is available at: > >https://dist.apache.org/repos/dist/dev/tika/ > > > > The release candidate

Re: [VOTE] Release Apache Tika 1.25 Candidate #2

2020-11-27 Thread Oleg Tikhonov
Here is my +1. Did basic stuff. Seems ok. Thanks! On Thu, Nov 26, 2020, 01:15 Ken Krugler wrote: > +1 > > Thanks Tim. > > — Ken > > > On Nov 25, 2020, at 4:20 AM, Tim Allison wrote: > > > > A candidate for the Tika 1.25 release is available at: > >

Re: [EXTERNAL] Tika 2.0 modularization

2020-08-18 Thread Oleg Tikhonov
Hi Tim, looks awesome. Somehow I did not find a couple of parsers, probably because of ongoing work ... In addition, I was thinking about "getting rid of" Maven. If we are going to make Tika more modern, maybe Gradle can do the trick? Do we plan to add new Java "goodies" like lambdas,

Re: renaming master?

2020-06-16 Thread Oleg Tikhonov
Hi Tim, for me, "main" makes more sense. But, no objection to any other option! Thanks, Oleg On Tue, Jun 16, 2020 at 8:31 PM Tim Allison wrote: > All, > > As you may have seen, there's a movement to rename the "master" branch to > "main" or "trunk" (at least in the U.S.)[1][2]. Github is

Re: [VOTE] Release Apache Tika 1.24.1 Candidate #1

2020-04-19 Thread Oleg Tikhonov
Hi Tim, Thanks for doing this! I ran all the basic stuff on Ubuntu 18 with Java 8. All tests passed. Here is my +1. BR, Oleg On Sat, Apr 18, 2020 at 12:38 AM Tim Allison wrote: > A candidate for the Tika 1.24.1 release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > >

Re: 1.24.1?

2020-04-15 Thread Oleg Tikhonov
+1. Seems ok to me. Thanks, Oleg On Wed, Apr 15, 2020, 00:18 Tim Allison wrote: > I fixed the hwp5 multithreading problem. > > I looked into tar files, and the handful I reviewed had a "skip the rest of > the final block with x bytes", but there weren't actually x bytes. This > didn't harm

Re: [EXTERNAL] Re: JDK 12 build issues

2020-03-18 Thread Oleg Tikhonov
Hi Chris, I'm currently trying to build an env with Java 12/13 ... in order to try your setup. Which Java version are you using? OpenJDK or Oracle? Once upon a time there was a bug in OpenJDK: https://bugs.openjdk.java.net/browse/JDK-8131146 But it seems to be OK in recent releases. I'll keep you updated.

Re: 1.24?

2020-02-05 Thread Oleg Tikhonov
>> Should we wait for the next version of PDFBox? May be it's worth waiting >> what would you think of the week of the 23rd/ first week of March? Sounds good. BR, Oleg On Wed, Feb 5, 2020 at 4:41 PM Tim Allison wrote: > All, > > The new version of POI will be out soon. I have a couple of

Re: [VOTE] Release Apache Tika 1.23 Candidate #2

2019-12-03 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.23 Thanks, Oleg On Tue, Dec 3, 2019 at 5:15 AM Tim Allison wrote: > A candidate for the Tika 1.23 release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources in: >

Re: [VOTE] Release Apache Tika 1.23 Candidate #1

2019-11-29 Thread Oleg Tikhonov
Hi, here is my +1. All tests passed on Ubuntu 19.04. Thanks Tim! Best Regards, Oleg On Thu, Nov 28, 2019, 15:39 Markus Jelsma wrote: > +1! > > All tests pass and i can seamlessly update our internal software to 1.23. > > Thanks! > > -Original message- > > From:Tim Allison > >

Re: [EXTERNAL] Docker image along with 1.23?

2019-11-21 Thread Oleg Tikhonov
My question is more pragmatic. What do we put inside the Dockerfile, and which image will it be based on (say Ubuntu)? What will the entrypoint contain? Tika Server? Should we "install" Tesseract? Anything more? Thanks, Oleg On Thu, Nov 21, 2019 at 4:46 AM Chris Mattmann wrote: > Yeah

Re: [VOTE] Release Apache Tika 1.22 Candidate #4

2019-07-30 Thread Oleg Tikhonov
Hi Tim, thanks for the release !!! Here is my +1, tested on Ubuntu 18.04.2 LTS, x_86 arc. Best wishes, Oleg On Mon, Jul 29, 2019 at 8:50 PM Tim Allison wrote: > A candidate for the Tika 1.22 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/ > > > The release candidate

Re: 1.22?

2019-07-15 Thread Oleg Tikhonov
+1 On Mon, Jul 15, 2019 at 2:41 PM Tim Allison wrote: > Anyone have anything they want to get into 1.22? If not, I’ll kick off the > regression tests shortly. > > Cheers, > Tim >

Re: Tika 1.22?

2019-06-25 Thread Oleg Tikhonov
Would be great!!! Cheers, Oleg On Tue, Jun 25, 2019, 17:45 Tim Allison wrote: > All, > The vote for the next version of PDFBox is under way. I think we've > had a number of useful upgrades since our last release. Any > objections to starting the release process for Tika 1.22 a week or so >

Re: [jira] [Commented] (TIKA-2878) Update dependencies for 1.21.1 or 1.22

2019-05-20 Thread Oleg Tikhonov
Today I've also used a master branch and got the same result. On Mon, May 20, 2019 at 8:59 PM Tim Allison (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844167#comment-16844167 > ] > > Tim

Re: [VOTE] Release Apache Tika 1.21 Candidate #2

2019-05-15 Thread Oleg Tikhonov
Here is my +1. Thanks, Tim! On Wed, May 15, 2019 at 5:16 AM Tim Allison wrote: > A candidate for the Tika 1.21 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources in: >

Re: [VOTE] Release Apache Tika 1.21 Candidate #1

2019-05-14 Thread Oleg Tikhonov
:-) I'm good with any option. RC1 seems to be good from my point of view. Cheers, Oleg On Tue, May 14, 2019 at 3:56 PM Tim Allison wrote: > All, > I'm happy to close rc1 and respin an rc2 after Oleg's findings > (TIKA-2871 and TIKA-2872)...many thanks, Oleg! I'm also happy to > proceed with

Re: [VOTE] Release Apache Tika 1.21 Candidate #1

2019-05-14 Thread Oleg Tikhonov
Hi all, [x] +1 Release this package as Apache Tika 1.21 I ran just the basic stuff, mvn clean install (Ubuntu x86, java 8). Seems to be good. Thanks, Oleg On Mon, May 13, 2019 at 8:33 PM Tim Allison wrote: > A candidate for the Tika 1.21 release is available at: > >

[jira] [Commented] (TIKA-2872) tika-dl - add slf4j-log4j12 dependency to pom.xml

2019-05-14 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839311#comment-16839311 ] Oleg Tikhonov commented on TIKA-2872: - Possible fix attached. > tika-dl - add slf4j-log4

[jira] [Updated] (TIKA-2872) tika-dl - add slf4j-log4j12 dependency to pom.xml

2019-05-14 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleg Tikhonov updated TIKA-2872: Attachment: tika-dl-pom.xml.patch > tika-dl - add slf4j-log4j12 dependency to pom.

[jira] [Created] (TIKA-2872) tika-dl - add slf4j-log4j12 dependency to pom.xml

2019-05-14 Thread Oleg Tikhonov (JIRA)
Oleg Tikhonov created TIKA-2872: --- Summary: tika-dl - add slf4j-log4j12 dependency to pom.xml Key: TIKA-2872 URL: https://issues.apache.org/jira/browse/TIKA-2872 Project: Tika Issue Type: Bug

[jira] [Created] (TIKA-2871) TestChmExtraction - testMultiThreaded throws exception 1.21-rc1

2019-05-14 Thread Oleg Tikhonov (JIRA)
Oleg Tikhonov created TIKA-2871: --- Summary: TestChmExtraction - testMultiThreaded throws exception 1.21-rc1 Key: TIKA-2871 URL: https://issues.apache.org/jira/browse/TIKA-2871 Project: Tika

Re: Tika 1.21?

2019-04-23 Thread Oleg Tikhonov
could start the > regression tests now (well, tomorrowish), though, unless anyone has > anything they want to get in...I'm happy to wait, though, till next > week to start the regression tests. > WDYT? > >Cheers, > >Tim > > On Mon, Apr 8, 2019 at 2:2

Re: Tika 1.21?

2019-04-08 Thread Oleg Tikhonov
Great! +1. Thanks, Oleg On Mon, Apr 8, 2019, 21:11 Tim Allison wrote: > All, > PDFBox will be out in a few days, and POI should be out soon as > well. I _think_ I'd like to get in a first draft of "auto" mode for > OCR'ing PDFs (TIKA-2749), but other than that, I'd be willing to run a >

[jira] [Commented] (TIKA-2650) Soft-hyphen is not extracted properly

2019-03-31 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806141#comment-16806141 ] Oleg Tikhonov commented on TIKA-2650: - There is no simple solution. Here is some research related

Re: [VOTE] Release Apache Tika 1.20 Candidate #1

2018-12-22 Thread Oleg Tikhonov
*stuff On Sat, Dec 22, 2018, 11:01 Oleg Tikhonov All basic staff passed. > +1. > Oleg > > On Fri, Dec 21, 2018, 22:02 Ken Krugler wrote: > >> Hi Tim, >> >> Thanks for rolling the release. >> >> Built & validated on Mac OS X 10.12 >> >&

Re: [VOTE] Release Apache Tika 1.20 Candidate #1

2018-12-22 Thread Oleg Tikhonov
All basic staff passed. +1. Oleg On Fri, Dec 21, 2018, 22:02 Ken Krugler Hi Tim, > > Thanks for rolling the release. > > Built & validated on Mac OS X 10.12 > > Updated flink-crawler, all tests pass. > > So here’s my +1 > > — Ken > > > > On Dec 17, 2018, at 6:14 PM, Tim Allison wrote: > > > > A

[jira] [Comment Edited] (TIKA-2368) Clean up SentimentParser dependencies

2018-10-14 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649290#comment-16649290 ] Oleg Tikhonov edited comment on TIKA-2368 at 10/14/18 8:14 AM: --- {code:java

[jira] [Commented] (TIKA-2368) Clean up SentimentParser dependencies

2018-10-14 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649290#comment-16649290 ] Oleg Tikhonov commented on TIKA-2368: - {code:java} [INFO] Apache Tika parent

Fwd: DIH for TikaEntityProcessor

2018-10-12 Thread Oleg Tikhonov
-- Forwarded message - From: Martin Frank Hansen (MHQ) Date: Wed, Oct 10, 2018, 11:15 Subject: DIH for TikaEntityProcessor To: solr-u...@lucene.apache.org Hi, I am trying to read documents from a file system into Solr, using dataimporthandler but keep getting the following

Re: [VOTE] Release Apache Tika 1.19.1 Candidate #2

2018-10-09 Thread Oleg Tikhonov
sorry. +1 On Tue, Oct 9, 2018 at 7:26 PM Tim Allison wrote: > Thank you, Dave! > > Fellow devs, would anyone else have a chance to vote? We need a third > for the release. Thank you! > On Mon, Oct 8, 2018 at 4:36 AM wrote: > > > > Hello, > > > > On Thu, 4 Oct 2018 at 23:03, Tim Allison

Re: Release Announcement: General Availability of JDK 11

2018-09-26 Thread Oleg Tikhonov
Good news!!! On Thu, Sep 27, 2018, 00:06 Tim Allison wrote: > +1 successful build > On Wed, Sep 26, 2018 at 5:20 AM Rory O'Donnell > wrote: > > > > Hi Tim, > > > > *1) Release Announcement: General Availability of JDK 11 * > > > > * JDK 11, the reference implementation of Java 11 and the

Re: [jira] [Created] (TIKA-2730) parseToString fails for a simple mp3

2018-09-19 Thread Oleg Tikhonov
Hi, It would be great if you could attach such a file. Or does it fail on any file? On Wed, Sep 19, 2018, 13:13 Boris Petrov (JIRA) wrote: > Boris Petrov created TIKA-2730: > -- > > Summary: parseToString fails for a simple mp3 > Key:

Re: [VOTE] Release Apache Tika 1.19 Candidate #1

2018-09-17 Thread Oleg Tikhonov
Hi Tim, thanks ! [INFO] Apache Tika parent . SUCCESS [ 5.138 s] [INFO] Apache Tika core ... SUCCESS [ 58.722 s] [INFO] Apache Tika parsers SUCCESS [04:20 min] [INFO] Apache Tika XMP

Re: [jira] [Commented] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-07 Thread Oleg Tikhonov
Yep, seems to be best match... unblocked execution. On Thu, Sep 6, 2018, 23:47 Tim Allison (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606373#comment-16606373 > ] > > Tim Allison commented on

Re: [jira] [Commented] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-06 Thread Oleg Tikhonov
Ideally, tika server is dockerized and runs on swarm as a service. In addition, it has a healthcheck mechanism, say something like an HTTP GET request that returns code 200. Docker will run this health check periodically and, if it fails, will restart tika server. However, we are far from that. Two ways to go, from my point of view
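The health-check idea in this message (a periodic HTTP GET that must return 200, with a restart on failure) can be sketched outside Docker as a small watchdog. This is only an illustration under assumptions: the function names, the failure threshold, and the injected `restart` callback are hypothetical and not part of tika-server.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True only if an HTTP GET to `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts and HTTP error statuses all count as unhealthy.
        return False


def watch(probe, restart, checks, max_failures=3, interval=0.0):
    """Call `probe` `checks` times and `restart` after `max_failures`
    consecutive failures. Returns the number of restarts triggered."""
    failures = 0
    restarts = 0
    for _ in range(checks):
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        if interval > 0:
            time.sleep(interval)
    return restarts
```

In real use `probe` would be something like `lambda: is_healthy(url)` for whatever endpoint the server exposes, and `restart` would restart the process or container. Passing both in as callables keeps the loop testable without a running server.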

Re: [jira] [Commented] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-06 Thread Oleg Tikhonov
In this approach, it is probably the only way ... What is the typical tika-server env? Stand-alone, distributed ... like replicas in a cluster? Is there a time limit for recovery? How do we know which point to start processing from? Do we mark documents which were processed? For example, if

Re: [jira] [Created] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-06 Thread Oleg Tikhonov
Hi Tim, What if watcher thread fails/gets stuck etc? On Thu, Sep 6, 2018 at 3:27 PM Tim Allison (JIRA) wrote: > Tim Allison created TIKA-2725: > - > > Summary: Make tika-server robust against ooms/infinite > loops/memory leaks >

Re: [jira] [Created] (TIKA-2647) Create a "security" page on our website

2018-05-22 Thread Oleg Tikhonov
Hi Tim, definitely would be helpful ! +1 Thanks, Oleg On Tue, May 22, 2018 at 3:38 PM, Tim Allison (JIRA) wrote: > Tim Allison created TIKA-2647: > - > > Summary: Create a "security" page on our website > Key:

Re: [VOTE] Release Apache Tika 1.18 Candidate #3

2018-04-22 Thread Oleg Tikhonov
Hi, thanks a lot. [x] +1 Release this package as Apache Tika 1.18 Even did a security scan: mvn org.owasp:dependency-check-maven:3.1.2:check Report is attached. Best regards, Oleg On Sat, Apr 21, 2018 at 12:54 AM, talli...@apache.org wrote: > All, > A candidate for the

Re: [VOTE] Release Apache Tika 1.18 Candidate #1

2018-04-11 Thread Oleg Tikhonov
[+] Release this package as Apache Tika 1.18 [INFO] Apache Tika parent . SUCCESS [ 12.379 s] [INFO] Apache Tika core ... SUCCESS [ 55.650 s] [INFO] Apache Tika parsers SUCCESS [05:55 min] [INFO]

Re: tsdb extraction

2018-03-29 Thread Oleg Tikhonov
ok. time to read the spec :-) On Thu, Mar 29, 2018 at 4:02 PM, Allison, Timothy B. <talli...@mitre.org> wrote: > Sorry...not aware of anything... > > -Original Message- > From: olegtikho...@gmail.com [mailto:olegtikho...@gmail.com] On Behalf Of > Oleg Tikhonov >

tsdb extraction

2018-03-28 Thread Oleg Tikhonov
Hi guys, I am wondering if we have a parser which can deal with time series, like InfluxDB or Prometheus? Maybe you know of such a "work in progress"; that would also be good. Thanks in advance, Oleg

Re: [VOTE] Release Apache Tika 1.16 Candidate #1

2017-07-12 Thread Oleg Tikhonov
[x]+1 Release this package as Apache Tika 1.16 Basic tests and build on Ubuntu 17.04 + Java 8 (Oracle). Thanks, Oleg On Wed, Jul 12, 2017 at 11:03 AM, Dave Meikle wrote: > On 8 July 2017 at 03:40, Tim Allison wrote: > > > > > A candidate for the Tika

Re: experiences with Tika in Docker

2017-06-02 Thread Oleg Tikhonov
Guys, I can help with Tika dockerization. Let's just design/plan what we are going to do. On Thu, Jun 1, 2017 at 4:02 PM, Eric Pugh wrote: > As the Tika project starts embracing more non Java tools (I’m thinking of > Tesseract for example), dockerizing your Tika setup

Re: [VOTE] Release Apache Tika 1.15 Candidate #1

2017-05-24 Thread Oleg Tikhonov
@gmail.com] On Behalf Of > Oleg Tikhonov > Sent: Tuesday, May 23, 2017 4:33 PM > To: dev@tika.apache.org > Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1 > > Also put > ./tika-dl/src/test/java/org/apache/tika/dl/imagerec/ > DL4JInceptionV3NetTest.java > @Ignore b

Re: [VOTE] Release Apache Tika 1.15 Candidate #2

2017-05-24 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.15 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 19:41 min [INFO] Finished at:

Re: [VOTE] Release Apache Tika 1.15 Candidate #1

2017-05-23 Thread Oleg Tikhonov
Also put ./tika-dl/src/test/java/org/apache/tika/dl/imagerec/DL4JInceptionV3NetTest.java @Ignore because I do not have any DL installed on my comp. On Tue, May 23, 2017 at 11:00 PM, Oleg Tikhonov <o...@apache.org> wrote: > Hi guys, > Here is wrong ... > > org.apache.tika

Re: [VOTE] Release Apache Tika 1.15 Candidate #1

2017-05-23 Thread Oleg Tikhonov
Hi guys, Here is what is wrong ... <parent> <groupId>org.apache.tika</groupId> <artifactId>tika-parent</artifactId> <version>1.16-SNAPSHOT</version> <relativePath>tika-parent/pom.xml</relativePath> </parent> If you are cloning the project, the upper level pom contains this. The fix is to change 1.16-SNAPSHOT to 1.15 What I did was: git clone https://github.com/apache/tika.git Any
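The mismatch reported in this message (a parent version of 1.16-SNAPSHOT where 1.15 was expected) is the kind of thing a small script can catch before a vote. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it assumes the POM uses the standard Maven 4.0.0 namespace.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace (assumed to be declared on <project>).
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def parent_version(pom_xml):
    """Return the <parent><version> text from a POM document, or None if absent."""
    root = ET.fromstring(pom_xml)
    node = root.find("m:parent/m:version", POM_NS)
    return node.text.strip() if node is not None and node.text else None


def check_release(pom_xml, expected):
    """Return a list of problems; an empty list means the parent version matches."""
    found = parent_version(pom_xml)
    if found is None:
        return ["no <parent><version> element found"]
    if found != expected:
        return [f"parent version is {found}, expected {expected}"]
    return []
```

Reading the top-level `pom.xml` into a string and calling `check_release(text, "1.15")` would report the SNAPSHOT version described above.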

Re: 1.15?

2017-04-18 Thread Oleg Tikhonov
+1 for the release. On Mon, Apr 17, 2017 at 8:39 PM, David Meikle wrote: > +1 from me too. > > Cheers, > Dave > > On 13 April 2017 at 13:08, Konstantin Gribov wrote: > > > Preliminary +1 from me, I'll the a closer look this weekend > > > > чт, 13 апр. 2017,

Re: Master Build Failing

2016-10-25 Thread Oleg Tikhonov
hi Luis, Here what I did: git clone https://git-wip-us.apache.org/repos/asf/tika.git git branch * master gdalinfo --version GDAL 1.11.3, released 2015/09/16 mvn clean install -U Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 42.59 sec - in

Re: [VOTE] Apache Tika 1.14 Release Candidate #1

2016-10-20 Thread Oleg Tikhonov
Hi, +1 for release. Built on Ubuntu 16.04 and CentOS 7.0 x86_64. All tests are passed. Java 8. BR, Oleg On Thu, Oct 20, 2016 at 5:54 PM, Julien Nioche < lists.digitalpeb...@gmail.com> wrote: > Hi Tim > > I had exiftool installed indeed, so that might explain it. All tests now > pass. Will have

Re: [VOTE] Apache Tika 1.12 Release Candidate #1

2016-01-28 Thread Oleg Tikhonov
Hi Chris, thanks for doing it. Yesterday I successfully built Tika using mvn clean install. All tests passed. Platform: x86_64 Kubuntu with Oracle Java 8. Nothing special was run. [x] +1 Release this package as Apache Tika 1.12 Best regards, Oleg On Mon, Jan 25, 2016 at 9:58 PM,

Re: [DISCUSS] Moving to Git

2015-11-19 Thread Oleg Tikhonov
+1. There is a bunch of add-ons. For instance - git flow. On Wed, Nov 18, 2015 at 7:15 PM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Hey Nick, > > Git has something similar to svn:externals: > > http://stackoverflow.com/questions/571232/svnexternals-equivalent-in-git >

Re: [VOTE] Apache Tika 1.11 Release Candidate #1

2015-10-25 Thread Oleg Tikhonov
Hi guys, all looks fine on basic set up in x86_64 Ubuntu, however I got the following: Running org.apache.tika.parser.journal.JournalParserTest 25 Oct 2015 10:45:53 WARN PhaseInterceptorChain - Interceptor for { http://localhost:8080/grobid}WebClient has thrown exception, unwinding now

Re: [ANNOUNCE] Welcome Bob Paulin as Tika Committer + PMC Member

2015-09-17 Thread Oleg Tikhonov
Good intro. Welcome aboard. Oleg On 17 Sep 2015 03:05, "David Meikle" wrote: > Hello All, > > Please welcome Bob Paulin as he joins us as the latest Tika committer and > PMC Member. > > Bob, please feel free to say a bit about yourself as an introduction to > the group. > >

Re: Remove support for building language identifier profiles?

2015-08-30 Thread Oleg Tikhonov
Hi Ken, I would choose the last option you've mentioned. -- Oleg On Sat, Aug 29, 2015 at 7:58 PM, Ken Krugler kkrugler_li...@transpac.com wrote: Hi all, As part of integrating language-detector into Tika (see TIKA-1723), I noticed TIKA-546 (Add ability to create language profiles to

Re: Apache Tika: In use at Goldman Sachs

2015-08-20 Thread Oleg Tikhonov
Wow !!! Amazing. How does it perform? BR, Oleg On Thu, Aug 20, 2015 at 9:48 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Just saw this online: http://www.informationweek.com/software/enterprise-applications/goldman-sachs-puts-elasticsearch-to-work/d/d-id/1321778

Re: release Tika 1.10?

2015-08-04 Thread Oleg Tikhonov
Thanks! +1 BR, Oleg On Tue, Aug 4, 2015 at 5:37 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: +1 ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA

Re: [VOTE] Apache Tika 1.10 Release Candidate #1

2015-08-04 Thread Oleg Tikhonov
Hi, thanks for doing that !!! +1 for the release. Ran on Kubuntu 15 x64. All basic tests are passed. BR, Oleg On Tue, Aug 4, 2015 at 6:17 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: +1 from me, great work Dave SIGS and CHECKSUMS are sound:

Re: Bayesian N-Gram Language Detection

2015-07-29 Thread Oleg Tikhonov
+1 !!! My two cents. Please also add ability to change/retrain/tote language profiles. Thanks !!! BR, Oleg On Wed, Jul 29, 2015 at 3:59 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Cool. Well with this one I found, along with language-detector, along with Ramirez and the

Re: [VOTE] Release Apache Tika 1.9 Candidate #2

2015-06-09 Thread Oleg Tikhonov
Hi, All basic tests are passed. java version 1.7.0_75 Java(TM) SE Runtime Environment (build 1.7.0_75-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.75-b04, mixed mode) Linux/Ubuntu x86_64 Superb !!! [x] +1 Release this package as Apache Tika 1.9 Thanks, Oleg On Tue, Jun 9, 2015 at 2:12 PM,

Re: [VOTE] Apache Tika 1.8 Release Candidate #2

2015-04-15 Thread Oleg Tikhonov
Hi Tyler, good job, indeed !!! [x] +1 Release this package as Apache Tika 1.8 On Wed, Apr 15, 2015 at 8:22 AM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Thanks Tyler! +1 from me: SIGS, checksums check out: [chipotle:~/tmp/apache-tika-1.8-rc2] mattmann%

Re: [VOTE] Release Apache Tika 1.8 Candidate #1

2015-04-08 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 1.8. Tested on: Ubuntu 14.10, x86_64. Java 1.7 (Oracle) Don't we want to update the following dependencies: biz.aQute:bndlib . 1.43.0 - 2.0.0.20130123-133441 org.apache.felix:org.apache.felix.scr.annotations 1.6.0 - 1.9.10

Re: FW: Any interest in running Apache Tika as part of CommonCrawl?

2015-04-03 Thread Oleg Tikhonov
Hi Tim, Having looked at CC, a couple of ideas crossed my mind. I think it's cool. +1. BR, Oleg On 3 Apr 2015 17:29, Allison, Timothy B. talli...@mitre.org wrote: All, What do you think? https://groups.google.com/forum/#!topic/common-crawl/Cv21VRQjGN0 On Friday, April 3, 2015 at 8:23:11

Re: [DISCUSS] Tika 1.8 or 1.7.1

2015-03-29 Thread Oleg Tikhonov
+1 for 1.8 release. On 29 Mar 2015 02:04, Konstantin Gribov gros...@gmail.com wrote: Also, I think, we should resolve TIKA-1575 (upgrade to pdfbox 1.8.9) since pdfbox 1.8.8 hangs on some pdf forms. -- Best regards, Konstantin Gribov сб, 28 марта 2015 г. в 23:22, Konstantin Gribov

Re: trunk test failure

2015-03-26 Thread Oleg Tikhonov
Hi Chris, just to confirm: [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Tika parent . SUCCESS [ 9.268 s] [INFO] Apache Tika core ... SUCCESS [ 25.823 s]

Re: [jira] [Closed] (TIKA-993) Language Detection Fault

2015-03-03 Thread Oleg Tikhonov
, What do you mean, the detection is faulty? What is the expected result in that case? Thanks, Tyler On Mar 3, 2015 1:10 AM, Oleg Tikhonov o...@apache.org wrote: Hi, Just for the record ... It can happen if a file contains content written in at least two different languages

Re: [jira] [Closed] (TIKA-993) Language Detection Fault

2015-03-02 Thread Oleg Tikhonov
Hi, Just for the record ... It can happen if a file contains content written in at least two different languages. For instance, the first half of the file is, say, German and the second half is, say, French. In such a case detection would be faulty. Br, Oleg On 3 Mar 2015 04:03, Tyler Palsulich
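The mixed-language case described here can be illustrated with a toy detector. To be clear about the assumptions: this is not Tika's language identifier; it swaps in a naive stopword count with tiny hand-picked word lists, only to show why a single label for a half-German, half-French text is unreliable while per-half detection is not.

```python
# Tiny, hand-picked stopword lists (illustrative only, not real language profiles).
STOPWORDS = {
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine"},
    "fr": {"le", "la", "les", "et", "est", "pas", "un", "une"},
}


def scores(tokens):
    """Fraction of tokens that are stopwords of each language."""
    total = max(len(tokens), 1)
    return {lang: sum(token in words for token in tokens) / total
            for lang, words in STOPWORDS.items()}


def detect(tokens):
    """Pick the language with the highest stopword fraction (ties go to the first)."""
    ranked = scores(tokens)
    return max(ranked, key=ranked.get)


def detect_by_halves(text):
    """Detect the language of the first and second half of the text separately."""
    tokens = text.lower().split()
    middle = len(tokens) // 2
    return detect(tokens[:middle]), detect(tokens[middle:])


MIXED = ("der hund ist nicht klein und die katze ist gross "
         "le chien est petit et la maison est une ruine")
```

On `MIXED`, the whole-document scores for the two languages are tied, so whichever label `detect` returns is arbitrary, whereas `detect_by_halves` separates the German and French parts.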

Re: [jira] [Created] (TIKA-1543) TesseractOCRParser.setTesseractPath() doesn't work on Linux

2015-02-06 Thread Oleg Tikhonov
Hi, Just one guess. Did you check the permissions? Does it have executable permission? Br, Oleg On 6 Feb 2015 12:15, Sean Zhao (JIRA) j...@apache.org wrote: Sean Zhao created TIKA-1543: --- Summary: TesseractOCRParser.setTesseractPath() doesn't work

Re: TIKA-1423 Build a parser to extract data from GRIB formats not good with Java 6

2015-01-30 Thread Oleg Tikhonov
Hi there, +1 for dropping. On 30 Jan 2015 05:05, Tyler Palsulich tpalsul...@gmail.com wrote: +1 Tyler On Jan 29, 2015 9:52 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: +1 move to 1.7 Sent from my iPhone On Jan 29, 2015, at 5:04 PM, Allison, Timothy B.
