[jira] [Created] (TIKA-3850) Spanish text is incorrectly detected as Galician

2022-09-13 Thread Lenne Hendrickx (Jira)
Lenne Hendrickx created TIKA-3850: Summary: Spanish text is incorrectly detected as Galician Key: TIKA-3850 URL: https://issues.apache.org/jira/browse/TIKA-3850 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-3850) Spanish text is incorrectly detected as Galician

2022-09-13 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603483#comment-17603483 ] Nick Burch commented on TIKA-3850: The kind of statistical language model used in Tika s

[jira] [Commented] (TIKA-3850) Spanish text is incorrectly detected as Galician

2022-09-13 Thread Lenne Hendrickx (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603498#comment-17603498 ] Lenne Hendrickx commented on TIKA-3850: If I append Spanish text to the original te

[jira] [Created] (TIKA-3851) Add detection for e57

2022-09-13 Thread Tim Allison (Jira)
Tim Allison created TIKA-3851: Summary: Add detection for e57 Key: TIKA-3851 URL: https://issues.apache.org/jira/browse/TIKA-3851 Project: Tika Issue Type: Wish Reporter: Tim Allison

[jira] [Resolved] (TIKA-3851) Add detection for e57

2022-09-13 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3851. Fix Version/s: 2.4.2 Resolution: Fixed > Add detection for e57

[jira] [Created] (TIKA-3852) Extract signature info from PDFs

2022-09-13 Thread Tim Allison (Jira)
Tim Allison created TIKA-3852: Summary: Extract signature info from PDFs Key: TIKA-3852 URL: https://issues.apache.org/jira/browse/TIKA-3852 Project: Tika Issue Type: Wish Reporter:

[jira] [Updated] (TIKA-3852) Extract signature info from PDFs

2022-09-13 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-3852: Description: We already do this for acroforms (at least in the contents), but we should also do this at

Re: [VOTE] Release Apache Tika 1.28.5 Candidate #1

2022-09-13 Thread Tim Allison
Fellow devs, we need one more vote to release 1.28.5 On Thu, Sep 8, 2022 at 11:19 AM Tim Allison wrote: > A candidate for the Tika 1.28.5 release is available at: > https://dist.apache.org/repos/dist/dev/tika/1.28.5 > > The release candidate is a zip archive of the sources in: > https://gith

Re: [VOTE] Release Apache Tika 1.28.5 Candidate #1

2022-09-13 Thread Tim Allison
W00t! One dev has dm'd that the build is underway. On Tue, Sep 13, 2022 at 11:34 AM Tim Allison wrote: > Fellow devs, we need one more vote to release 1.28.5 > > On Thu, Sep 8, 2022 at 11:19 AM Tim Allison wrote: > >> A candidate for the Tika 1.28.5 release is available at: >> https://dist.a

[jira] [Commented] (TIKA-3852) Extract signature info from PDFs

2022-09-13 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603644#comment-17603644 ] Tim Allison commented on TIKA-3852: [~tilman]...this is embarrassing... Is it possible

Re: [VOTE] Release Apache Tika 1.28.5 Candidate #1

2022-09-13 Thread Konstantin Gribov
Built successfully on ArchLinux, OpenJDK 11 & 17 (Temurin-11.0.16+8 & 17.0.4.1+1) w/ Tesseract 5.1.0, Leptonica 1.82.0. The issue with the tesseract multipage test is still the same, it extracts "Page?2" instead of "Page 2" on my laptop. GPG signatures and SHA512 hashes are fine. [x] +1 Release t

[jira] [Commented] (TIKA-3851) Add detection for e57

2022-09-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603654#comment-17603654 ] Hudson commented on TIKA-3851: SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #7

[jira] [Commented] (TIKA-3852) Extract signature info from PDFs

2022-09-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603692#comment-17603692 ] Hudson commented on TIKA-3852: SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #7

[jira] [Commented] (TIKA-3852) Extract signature info from PDFs

2022-09-13 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603700#comment-17603700 ] Tilman Hausherr commented on TIKA-3852: I'm not sure what you mean... acroform fiel

[RESULT][VOTE] Release Apache Tika 1.28.5 Candidate #1

2022-09-13 Thread Tim Allison
The vote has passed with 3 PMC +1s and no -1s. I'll publish the artifacts tomorrow and update the website. * Tim Allison * Konstantin Gribov * Tilman Hausherr Thank you all! Best, Tim On Tue, Sep 13, 2022 at 12:07 PM Konstantin Gribov wrote: > Built successfully on ArchLinux,

[GitHub] [tika] dependabot[bot] opened a new pull request, #693: Bump netty.version from 4.1.81.Final to 4.1.82.Final

2022-09-13 Thread GitBox
dependabot[bot] opened a new pull request, #693: URL: https://github.com/apache/tika/pull/693 Bumps `netty.version` from 4.1.81.Final to 4.1.82.Final. Updates `netty-common` from 4.1.81.Final to 4.1.82.Final Commits https://github.com/netty/netty/commit/47799635143d7a11b56c4a

[GitHub] [tika] dependabot[bot] opened a new pull request, #694: Bump reactor-core from 3.4.22 to 3.4.23

2022-09-13 Thread GitBox
dependabot[bot] opened a new pull request, #694: URL: https://github.com/apache/tika/pull/694 Bumps [reactor-core](https://github.com/reactor/reactor-core) from 3.4.22 to 3.4.23. Release notes Sourced from [reactor-core's releases](https://github.com/reactor/reactor-core/releases).

[GitHub] [tika] THausherr merged pull request #693: Bump netty.version from 4.1.81.Final to 4.1.82.Final

2022-09-13 Thread GitBox
THausherr merged PR #693: URL: https://github.com/apache/tika/pull/693 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

[GitHub] [tika] THausherr merged pull request #694: Bump reactor-core from 3.4.22 to 3.4.23

2022-09-13 Thread GitBox
THausherr merged PR #694: URL: https://github.com/apache/tika/pull/694