[jira] [Created] (TIKA-3134) totalCharsPerPage and unmappedUnicodeCharsPerPage configuration

2020-07-15 Thread Jira
Dávid Tóth created TIKA-3134: Summary: totalCharsPerPage and unmappedUnicodeCharsPerPage configuration Key: TIKA-3134 URL: https://issues.apache.org/jira/browse/TIKA-3134 Project: Tika Issue Typ

[jira] [Closed] (TIKA-3097) Out of memory while parsing docx

2020-07-15 Thread suchendra (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] suchendra closed TIKA-3097. Resolution: Implemented > Out of memory while parsing docx

[jira] [Commented] (TIKA-3097) Out of memory while parsing docx

2020-07-15 Thread suchendra (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158187#comment-17158187 ] suchendra commented on TIKA-3097: [~tallison], thank you very much for your guidance and

[jira] [Closed] (TIKA-3098) Detecting embedded image

2020-07-15 Thread suchendra (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] suchendra closed TIKA-3098. Resolution: Fixed > Detecting embedded image > Key: TIKA-3098

[GitHub] [tika] tothd91 opened a new pull request #327: fix for TIKA-3134 contributed by tothd

2020-07-15 Thread GitBox
tothd91 opened a new pull request #327: URL: https://github.com/apache/tika/pull/327 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[jira] [Commented] (TIKA-3134) totalCharsPerPage and unmappedUnicodeCharsPerPage configuration

2020-07-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158226#comment-17158226 ] ASF GitHub Bot commented on TIKA-3134: tothd91 opened a new pull request #327: URL:

[jira] [Created] (TIKA-3135) No need to spool file for HeifParser

2020-07-15 Thread Tim Allison (Jira)
Tim Allison created TIKA-3135: Summary: No need to spool file for HeifParser Key: TIKA-3135 URL: https://issues.apache.org/jira/browse/TIKA-3135 Project: Tika Issue Type: Task Report

[GitHub] [tika] tballison merged pull request #326: TIKA-3133 - writeLimit and maxEmbeddedResources for recursive parsing - add header

2020-07-15 Thread GitBox
tballison merged pull request #326: URL: https://github.com/apache/tika/pull/326

[jira] [Commented] (TIKA-3133) /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources

2020-07-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158568#comment-17158568 ] ASF GitHub Bot commented on TIKA-3133: tballison merged pull request #326: URL: http

[jira] [Commented] (TIKA-3134) totalCharsPerPage and unmappedUnicodeCharsPerPage configuration

2020-07-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158571#comment-17158571 ] ASF GitHub Bot commented on TIKA-3134: tballison commented on pull request #327: URL

[GitHub] [tika] tballison commented on pull request #327: fix for TIKA-3134 contributed by tothd

2020-07-15 Thread GitBox
tballison commented on pull request #327: URL: https://github.com/apache/tika/pull/327#issuecomment-658905767 @tothd91 thank you for opening this! It looks like there are quite a few changes that are white-space only. Would it be possible to update so that the diff includes only logic di

[jira] [Resolved] (TIKA-3133) /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources

2020-07-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3133. Fix Version/s: 1.25 Assignee: Tim Allison Resolution: Fixed > /rmeta endpoint should n

[jira] [Commented] (TIKA-3133) /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources

2020-07-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158623#comment-17158623 ] Tim Allison commented on TIKA-3133: Merged, fixed and added unit tests in {{main}} and

[jira] [Resolved] (TIKA-3135) No need to spool file for HeifParser

2020-07-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3135. Fix Version/s: 1.25 Resolution: Fixed > No need to spool file for HeifParser

[jira] [Resolved] (TIKA-3126) Consider new endpoint (metadata + content non recursive)

2020-07-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3126. Fix Version/s: 1.25 Resolution: Fixed. Fixed via TIKA-3133. > Consider new endpoint (metadata +

[jira] [Comment Edited] (TIKA-3133) /rmeta endpoint should not hard code writeLimit and maxEmbeddedResources

2020-07-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158623#comment-17158623 ] Tim Allison edited comment on TIKA-3133 at 7/15/20, 7:04 PM:

[GitHub] [tika] tballison commented on pull request #325: TIKA-3131 -- swap default values of averageCharTolerance and spacingT…

2020-07-15 Thread GitBox
tballison commented on pull request #325: URL: https://github.com/apache/tika/pull/325#issuecomment-658951531 Thank you @clarkperkins!

[GitHub] [tika] tballison merged pull request #325: TIKA-3131 -- swap default values of averageCharTolerance and spacingT…

2020-07-15 Thread GitBox
tballison merged pull request #325: URL: https://github.com/apache/tika/pull/325

[jira] [Commented] (TIKA-3131) PDFParserConfig default values were accidentally swapped

2020-07-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158631#comment-17158631 ] ASF GitHub Bot commented on TIKA-3131: tballison commented on pull request #325: URL

[jira] [Commented] (TIKA-3131) PDFParserConfig default values were accidentally swapped

2020-07-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158630#comment-17158630 ] ASF GitHub Bot commented on TIKA-3131: tballison merged pull request #325: URL: http

[jira] [Resolved] (TIKA-3131) PDFParserConfig default values were accidentally swapped

2020-07-15 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3131. Resolution: Fixed. Thank you! > PDFParserConfig default values were accidentally swapped

[jira] [Commented] (TIKA-3135) No need to spool file for HeifParser

2020-07-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158685#comment-17158685 ] Hudson commented on TIKA-3135: SUCCESS: Integrated in Jenkins build tika-branch-1x #346 (See

[jira] [Commented] (TIKA-3134) totalCharsPerPage and unmappedUnicodeCharsPerPage configuration

2020-07-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158686#comment-17158686 ] Hudson commented on TIKA-3134: SUCCESS: Integrated in Jenkins build tika-branch-1x #346 (See

[jira] [Commented] (TIKA-3130) Add "ICC:" as a namespace ICC metadata

2020-07-15 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158684#comment-17158684 ] Hudson commented on TIKA-3130: SUCCESS: Integrated in Jenkins build tika-branch-1x #346 (See

[jira] [Created] (TIKA-3136) Add additional OCR support : EasyOCR

2020-07-15 Thread Kranthi Kiran GV (Jira)
Kranthi Kiran GV created TIKA-3136: Summary: Add additional OCR support : EasyOCR Key: TIKA-3136 URL: https://issues.apache.org/jira/browse/TIKA-3136 Project: Tika Issue Type: New Feature

[jira] [Commented] (TIKA-3136) Add additional OCR support : EasyOCR

2020-07-15 Thread Kranthi Kiran GV (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158928#comment-17158928 ] Kranthi Kiran GV commented on TIKA-3136: I would like to take this up. I have earl