THausherr merged PR #787:
URL: https://github.com/apache/tika/pull/787
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org
THausherr merged PR #786:
URL: https://github.com/apache/tika/pull/786
dependabot[bot] opened a new pull request, #787:
URL: https://github.com/apache/tika/pull/787
Bumps [google-cloud-storage](https://github.com/googleapis/java-storage)
from 2.14.0 to 2.15.0.
Release notes
Sourced from https://github.com/googleapis/java-storage/releases google-clou
dependabot[bot] opened a new pull request, #786:
URL: https://github.com/apache/tika/pull/786
Bumps `aws.version` from 1.12.336 to 1.12.337.
Updates `aws-java-sdk-s3` from 1.12.336 to 1.12.337
Changelog
Sourced from https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md a
[
https://issues.apache.org/jira/browse/TIKA-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630091#comment-17630091
]
Hudson commented on TIKA-3917:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #9
Hi, folks.
Sorry, I totally missed the moment when TIKA-3842 [1] (update slf4j to
2.0.x) was discussed and merged.
*TL;DR:* the *slf4j-api* update from 1.7.x to 2.0.x in *tika-core* 2.5.0 requires
downstream library users to update their logging backend. *Tika Server* & *Tika
Python* are not affected. In most
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630067#comment-17630067
]
Hudson commented on TIKA-3919:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #9
[
https://issues.apache.org/jira/browse/TIKA-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630029#comment-17630029
]
Tim Allison edited comment on TIKA-3735 at 11/7/22 9:01 PM:
Th
[
https://issues.apache.org/jira/browse/TIKA-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630029#comment-17630029
]
Tim Allison commented on TIKA-3735:
---
This kinda feels like we're hitting a tipping point
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630019#comment-17630019
]
Tim Allison commented on TIKA-2536:
---
Thank you!
It looks like we include this in our RE
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630004#comment-17630004
]
Tim Allison commented on TIKA-3919:
---
All that said, we can do better in our codebase. L
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630002#comment-17630002
]
Tim Allison commented on TIKA-3919:
---
Further, if you are parsing enough files, you'll hi
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1762#comment-1762
]
Tim Allison commented on TIKA-3919:
---
As I look at the code, I think we should use the Bo
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629993#comment-17629993
]
Tim Allison edited comment on TIKA-3919 at 11/7/22 7:56 PM:
Yo
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629993#comment-17629993
]
Tim Allison edited comment on TIKA-3919 at 11/7/22 7:37 PM:
Yo
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629993#comment-17629993
]
Tim Allison commented on TIKA-3919:
---
You can set the markLimit with this:
{noformat}
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629914#comment-17629914
]
Narendran Solai Sridharan edited comment on TIKA-3919 at 11/7/22 4:59 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629914#comment-17629914
]
Narendran Solai Sridharan commented on TIKA-3919:
-
Thanks for the quick re
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Description:
Out of Memory during file parsing in AutoDetectParser. Issue is
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: (was: Thread dump-1.PNG)
> Out of Memory during file parsing
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: Model.xlsx
> Out of Memory during file parsing in AutoDetectPars
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: (was: Model_comparison.xls)
> Out of Memory during file pars
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Description:
Out of Memory during file parsing in AutoDetectParser. Issue is
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: Thread dump.PNG
> Out of Memory during file parsing in AutoDetec
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: Large Object-1.PNG
Thread dump.PNG
> Out of Memo
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: (was: Large Object.PNG)
> Out of Memory during file parsing
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Attachment: (was: Thread dump.PNG)
> Out of Memory during file parsing i
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629810#comment-17629810
]
Tim Allison commented on TIKA-3919:
---
We could also modify the LookaheadInputStream to no
[
https://issues.apache.org/jira/browse/TIKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629794#comment-17629794
]
David Pilato commented on TIKA-2536:
For future readers, the workaround to depend on T
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629793#comment-17629793
]
Tim Allison commented on TIKA-3919:
---
I can update the documentation on how to decrease t
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629792#comment-17629792
]
Tim Allison commented on TIKA-3919:
---
The LookaheadInputStream buffers the marklimit to m
The Apache Tika project is pleased to announce the release of Apache
Tika 2.6.0. The release contents have been pushed out to the main
Apache release site and to the Maven Central sync.
Apache Tika is a toolkit for detecting and extracting metadata and
structured text content from various document
The vote has passed with 4 PMC +1s and no -1s.
Dave Meikle
Oleg Tikhonov
Tilman Hausherr
Tim Allison
Thank you, all. I'll release the artifacts now and update the website.
On Sun, Nov 6, 2022 at 7:51 AM David Meikle wrote:
> On Thu, 3 Nov 2022 at 13:47, Tim Allison wrote:
>
> > A candidate f
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Component/s: detector
parser
> Out of Memory during file pa
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Description:
Out of Memory during file parsing in AutoDetectParser. Issue is
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Environment:
OS : Windows 10,
Software Platform : Java
was:
While t
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Description:
Out of Memory during file parsing in AutoDetectParser. Issue is
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Description:
Out of Memory during file parsing in AutoDetectParser. Issue is
[
https://issues.apache.org/jira/browse/TIKA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Narendran Solai Sridharan updated TIKA-3919:
Description:
Out of Memory during file parsing in AutoDetectParser. Issue is
Greetings,
With JavaOne in Las Vegas, last month was epically busy! It was great to
finally have the ability to meet and discuss the Quality Outreach
program with some of you... face-to-face!
This installment of the newsletter is packed as we have several
heads-ups, including new Early-Acces