[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620995#comment-17620995 ] Tim Allison commented on TIKA-3890: --- I concur with all that [~nick] said. Within a docu

[jira] [Comment Edited] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620995#comment-17620995 ] Tim Allison edited comment on TIKA-3890 at 10/20/22 10:41 AM: --

[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620996#comment-17620996 ] Tim Allison commented on TIKA-3890: --- Oh, finally, for docx and pptx, there is a SAX opti

[jira] [Commented] (TIKA-1508) Add uniformity to parser parameter configuration

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621103#comment-17621103 ] Tim Allison commented on TIKA-1508: --- I think we're basically good with this in 2.x now.

[jira] [Resolved] (TIKA-1508) Add uniformity to parser parameter configuration

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1508. --- Fix Version/s: 1.15 (was: 1.14) Resolution: Fixed Opening a follow on ti

[jira] [Created] (TIKA-3891) Add generic serialization of params to TikaConfigSeralizer

2022-10-20 Thread Tim Allison (Jira)
Tim Allison created TIKA-3891: - Summary: Add generic serialization of params to TikaConfigSeralizer Key: TIKA-3891 URL: https://issues.apache.org/jira/browse/TIKA-3891 Project: Tika Issue Type: T

[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Ethan Wilansky (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621155#comment-17621155 ] Ethan Wilansky commented on TIKA-3890: -- Thanks Nick and Tim. This is really helpful.

[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621191#comment-17621191 ] Tim Allison commented on TIKA-3890: --- d) writeLimit can be used with /rmeta (and /tika II

[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621192#comment-17621192 ] Tim Allison commented on TIKA-3890: --- bq. Outside of this, I'm assuming it's okay to send

[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621194#comment-17621194 ] Tim Allison commented on TIKA-3890: --- There are some gotchas with /tika because it was de

[jira] [Created] (TIKA-3892) Add csv emitter for tika-pipes

2022-10-20 Thread Tim Allison (Jira)
Tim Allison created TIKA-3892: - Summary: Add csv emitter for tika-pipes Key: TIKA-3892 URL: https://issues.apache.org/jira/browse/TIKA-3892 Project: Tika Issue Type: Task Reporter: Ti

[jira] [Created] (TIKA-3893) Create jdbc-pipes-reporter

2022-10-20 Thread Tim Allison (Jira)
Tim Allison created TIKA-3893: - Summary: Create jdbc-pipes-reporter Key: TIKA-3893 URL: https://issues.apache.org/jira/browse/TIKA-3893 Project: Tika Issue Type: Task Reporter: Tim Al

[jira] [Updated] (TIKA-3892) Add csv emitter for tika-pipes

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-3892: -- Description: This is actually non-trivial because it requires cross process coordination. My initial ap

[jira] [Commented] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Ethan Wilansky (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621322#comment-17621322 ] Ethan Wilansky commented on TIKA-3890: -- Great information, thanks. I'll close this is

[jira] [Closed] (TIKA-3890) Identifying an efficient approach for getting page count prior to running an extraction

2022-10-20 Thread Ethan Wilansky (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Wilansky closed TIKA-3890. Fix Version/s: 2.5.0 Resolution: Fixed > Identifying an efficient approach for getting page c

[jira] [Created] (TIKA-3894) Documentation update needed

2022-10-20 Thread Ethan Wilansky (Jira)
Ethan Wilansky created TIKA-3894: Summary: Documentation update needed Key: TIKA-3894 URL: https://issues.apache.org/jira/browse/TIKA-3894 Project: Tika Issue Type: Improvement Comp

[jira] [Created] (TIKA-3895) Turn off dl4j tests on aarch64

2022-10-20 Thread Tim Allison (Jira)
Tim Allison created TIKA-3895: - Summary: Turn off dl4j tests on aarch64 Key: TIKA-3895 URL: https://issues.apache.org/jira/browse/TIKA-3895 Project: Tika Issue Type: Task Reporter: Ti

[jira] [Resolved] (TIKA-3895) Turn off dl4j tests on aarch64

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3895. --- Fix Version/s: 2.5.1 Resolution: Fixed > Turn off dl4j tests on aarch64 > -

[jira] [Created] (TIKA-3896) General upgrades for 2.5.1

2022-10-20 Thread Tim Allison (Jira)
Tim Allison created TIKA-3896: - Summary: General upgrades for 2.5.1 Key: TIKA-3896 URL: https://issues.apache.org/jira/browse/TIKA-3896 Project: Tika Issue Type: Task Reporter: Tim Al

[jira] [Commented] (TIKA-3896) General upgrades for 2.5.1

2022-10-20 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621362#comment-17621362 ] Tim Allison commented on TIKA-3896: --- reactor-netty-http and reactor-netty-core are causi

[jira] [Commented] (TIKA-3896) General upgrades for 2.5.1

2022-10-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621489#comment-17621489 ] Hudson commented on TIKA-3896: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #8

[jira] [Commented] (TIKA-3895) Turn off dl4j tests on aarch64

2022-10-20 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621488#comment-17621488 ] Hudson commented on TIKA-3895: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #8

[GitHub] [tika] dependabot[bot] opened a new pull request, #757: Bump twelvemonkeys.version from 3.9.1 to 3.9.3

2022-10-20 Thread GitBox
dependabot[bot] opened a new pull request, #757: URL: https://github.com/apache/tika/pull/757 Bumps `twelvemonkeys.version` from 3.9.1 to 3.9.3. Updates `common-io` from 3.9.1 to 3.9.3 Updates `imageio-bmp` from 3.9.1 to 3.9.3 Updates `imageio-jpeg` from 3.9.1 to 3.9.3

[GitHub] [tika] dependabot[bot] opened a new pull request, #758: Bump aws.version from 1.12.324 to 1.12.325

2022-10-20 Thread GitBox
dependabot[bot] opened a new pull request, #758: URL: https://github.com/apache/tika/pull/758 Bumps `aws.version` from 1.12.324 to 1.12.325. Updates `aws-java-sdk-transcribe` from 1.12.324 to 1.12.325 Changelog Sourced from https://github.com/aws/aws-sdk-java/blob/master/CHANGELO

[GitHub] [tika] dependabot[bot] opened a new pull request, #759: Bump google-cloud-storage from 2.13.0 to 2.13.1

2022-10-20 Thread GitBox
dependabot[bot] opened a new pull request, #759: URL: https://github.com/apache/tika/pull/759 Bumps [google-cloud-storage](https://github.com/googleapis/java-storage) from 2.13.0 to 2.13.1. Release notes Sourced from https://github.com/googleapis/java-storage/releases google-clou

[GitHub] [tika] THausherr merged pull request #758: Bump aws.version from 1.12.324 to 1.12.325

2022-10-20 Thread GitBox
THausherr merged PR #758: URL: https://github.com/apache/tika/pull/758 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

[GitHub] [tika] THausherr merged pull request #757: Bump twelvemonkeys.version from 3.9.1 to 3.9.3

2022-10-20 Thread GitBox
THausherr merged PR #757: URL: https://github.com/apache/tika/pull/757

[GitHub] [tika] THausherr merged pull request #759: Bump google-cloud-storage from 2.13.0 to 2.13.1

2022-10-20 Thread GitBox
THausherr merged PR #759: URL: https://github.com/apache/tika/pull/759