[jira] [Commented] (TIKA-1599) Switch from TagSoup to JSoup

2023-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768076#comment-17768076 ] ASF GitHub Bot commented on TIKA-1599: -- tballison commented on PR #1356: URL:

[jira] [Commented] (TIKA-1599) Switch from TagSoup to JSoup

2023-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768074#comment-17768074 ] ASF GitHub Bot commented on TIKA-1599: -- tballison opened a new pull request, #1356: URL:

[jira] [Commented] (TIKA-4138) Move boilerpipehandler to its own package outside of tika-parsers

2023-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768041#comment-17768041 ] ASF GitHub Bot commented on TIKA-4138: -- tballison opened a new pull request, #1355: URL:

[jira] [Commented] (TIKA-4137) Building current Tika main branch fails under Java 20/21

2023-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768013#comment-17768013 ] ASF GitHub Bot commented on TIKA-4137: -- tballison closed pull request #1350: TIKA-4137 URL:

[jira] [Commented] (TIKA-4137) Building current Tika main branch fails under Java 20/21

2023-09-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768012#comment-17768012 ] ASF GitHub Bot commented on TIKA-4137: -- tballison commented on PR #1350: URL:

[jira] [Commented] (TIKA-4108) Upgrade IntelliJ IDEA Annotations 12.0 to latest version

2023-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767612#comment-17767612 ] ASF GitHub Bot commented on TIKA-4108: -- tballison merged PR #1351: URL:

[jira] [Commented] (TIKA-4137) Building current Tika main branch fails under Java 20/21

2023-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767611#comment-17767611 ] ASF GitHub Bot commented on TIKA-4137: -- tballison commented on PR #1350: URL:

[jira] [Commented] (TIKA-4108) Upgrade IntelliJ IDEA Annotations 12.0 to latest version

2023-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767600#comment-17767600 ] ASF GitHub Bot commented on TIKA-4108: -- tballison opened a new pull request, #1351: URL:

[jira] [Commented] (TIKA-4137) Building current Tika main branch fails under Java 20/21

2023-09-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767589#comment-17767589 ] ASF GitHub Bot commented on TIKA-4137: -- tballison opened a new pull request, #1350: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766256#comment-17766256 ] ASF GitHub Bot commented on TIKA-3948: -- theit commented on PR #1345: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-18 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766246#comment-17766246 ] ASF GitHub Bot commented on TIKA-3948: -- theit commented on code in PR #1345: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766008#comment-17766008 ] ASF GitHub Bot commented on TIKA-3948: -- desruisseaux commented on PR #1345: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765994#comment-17765994 ] ASF GitHub Bot commented on TIKA-3948: -- tballison commented on PR #1345: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765907#comment-17765907 ] ASF GitHub Bot commented on TIKA-3948: -- solomax commented on PR #1345: URL:

[jira] [Commented] (TIKA-4133) Add capture group metadataFilter

2023-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765752#comment-17765752 ] ASF GitHub Bot commented on TIKA-4133: -- tballison merged PR #1346: URL:

[jira] [Commented] (TIKA-4133) Add capture group metadataFilter

2023-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765717#comment-17765717 ] ASF GitHub Bot commented on TIKA-4133: -- tballison opened a new pull request, #1346: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765635#comment-17765635 ] ASF GitHub Bot commented on TIKA-3948: -- tballison commented on PR #1345: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765634#comment-17765634 ] ASF GitHub Bot commented on TIKA-3948: -- tballison opened a new pull request, #1345: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765596#comment-17765596 ] ASF GitHub Bot commented on TIKA-3948: -- tballison merged PR #1342: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765455#comment-17765455 ] ASF GitHub Bot commented on TIKA-3948: -- solomax commented on PR #1342: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765454#comment-17765454 ] ASF GitHub Bot commented on TIKA-3948: -- solomax commented on PR #1342: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765437#comment-17765437 ] ASF GitHub Bot commented on TIKA-3948: -- solomax commented on PR #1342: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765430#comment-17765430 ] ASF GitHub Bot commented on TIKA-3948: -- solomax opened a new pull request, #1342: URL:

[jira] [Commented] (TIKA-4130) Conflict with duplicate org/w3c and org/xml packages in tika-app jar

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765342#comment-17765342 ] ASF GitHub Bot commented on TIKA-4130: -- Raahul3010 opened a new pull request, #1341: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765249#comment-17765249 ] ASF GitHub Bot commented on TIKA-3948: -- desruisseaux commented on PR #1337: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765248#comment-17765248 ] ASF GitHub Bot commented on TIKA-3948: -- tballison commented on PR #1337: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765242#comment-17765242 ] ASF GitHub Bot commented on TIKA-3948: -- desruisseaux commented on PR #1337: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765239#comment-17765239 ] ASF GitHub Bot commented on TIKA-3948: -- tballison commented on PR #1337: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765218#comment-17765218 ] ASF GitHub Bot commented on TIKA-3948: -- tballison merged PR #1337: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-14 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765206#comment-17765206 ] ASF GitHub Bot commented on TIKA-3948: -- tballison commented on PR #1337: URL:

[jira] [Commented] (TIKA-3948) Migrate to jakarta in Tika 3.x

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764952#comment-17764952 ] ASF GitHub Bot commented on TIKA-3948: -- solomax opened a new pull request, #1337: URL:

[jira] [Commented] (TIKA-3641) Upgrade to Lucene 9.x

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764844#comment-17764844 ] ASF GitHub Bot commented on TIKA-3641: -- tballison merged PR #1335: URL:

[jira] [Commented] (TIKA-4129) Upgrade dependencies requiring > Java 8

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764845#comment-17764845 ] ASF GitHub Bot commented on TIKA-4129: -- tballison merged PR #1336: URL:

[jira] [Commented] (TIKA-3641) Upgrade to Lucene 9.x

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764832#comment-17764832 ] ASF GitHub Bot commented on TIKA-3641: -- tballison opened a new pull request, #1335: URL:

[jira] [Commented] (TIKA-4129) Upgrade dependencies requiring > Java 8

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764834#comment-17764834 ] ASF GitHub Bot commented on TIKA-4129: -- tballison opened a new pull request, #1336: URL:

[jira] [Commented] (TIKA-4128) Bump main to java 11

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764812#comment-17764812 ] ASF GitHub Bot commented on TIKA-4128: -- tballison merged PR #1334: URL:

[jira] [Commented] (TIKA-4128) Bump main to java 11

2023-09-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764789#comment-17764789 ] ASF GitHub Bot commented on TIKA-4128: -- tballison opened a new pull request, #1334: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763946#comment-17763946 ] ASF GitHub Bot commented on TIKA-4126: -- tballison merged PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763945#comment-17763945 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763944#comment-17763944 ] ASF GitHub Bot commented on TIKA-4126: -- tballison commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763933#comment-17763933 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on code in PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763931#comment-17763931 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on code in PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763928#comment-17763928 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763926#comment-17763926 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763924#comment-17763924 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763923#comment-17763923 ] ASF GitHub Bot commented on TIKA-4126: -- tballison commented on code in PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763921#comment-17763921 ] ASF GitHub Bot commented on TIKA-4126: -- tballison commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763916#comment-17763916 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on PR #1329: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763912#comment-17763912 ] ASF GitHub Bot commented on TIKA-4126: -- patrickdalla commented on PR #1329: URL:

[jira] [Commented] (TIKA-4124) embedded html of type http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk is not parsed

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763902#comment-17763902 ] ASF GitHub Bot commented on TIKA-4124: -- tballison merged PR #1324: URL:

[jira] [Commented] (TIKA-4125) Tweak rfc822 detection, yet again

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763885#comment-17763885 ] ASF GitHub Bot commented on TIKA-4125: -- tballison merged PR #1323: URL:

[jira] [Commented] (TIKA-4126) PDF XMP ModifyDate extracted without TimeZone info

2023-09-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763814#comment-17763814 ] ASF GitHub Bot commented on TIKA-4126: -- tballison opened a new pull request, #1329: URL:

[jira] [Commented] (TIKA-4124) embedded html of type http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk is not parsed

2023-09-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763148#comment-17763148 ] ASF GitHub Bot commented on TIKA-4124: -- tballison commented on PR #1324: URL:

[jira] [Commented] (TIKA-4124) embedded html of type http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk is not parsed

2023-09-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763146#comment-17763146 ] ASF GitHub Bot commented on TIKA-4124: -- tballison opened a new pull request, #1324: URL:

[jira] [Commented] (TIKA-4125) Tweak rfc822 detection, yet again

2023-09-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763145#comment-17763145 ] ASF GitHub Bot commented on TIKA-4125: -- tballison opened a new pull request, #1323: URL:

[jira] [Commented] (TIKA-2328) HtmlParser fails when DOCTYPE has unbalanced quotes

2023-08-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760147#comment-17760147 ] ASF GitHub Bot commented on TIKA-2328: -- kkrugler commented on code in PR #1310: URL:

[jira] [Commented] (TIKA-2328) HtmlParser fails when DOCTYPE has unbalanced quotes

2023-08-29 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760099#comment-17760099 ] ASF GitHub Bot commented on TIKA-2328: -- yakovsh opened a new pull request, #1310: URL:

[jira] [Commented] (TIKA-4106) Digesting and content length on embedded ole/zip/pdf files are not calculated

2023-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758702#comment-17758702 ] ASF GitHub Bot commented on TIKA-4106: -- tballison merged PR #1303: URL:

[jira] [Commented] (TIKA-4016) Upgrade to PDFBox 2.0.28

2023-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758693#comment-17758693 ] ASF GitHub Bot commented on TIKA-4016: -- tballison commented on PR #1302: URL:

[jira] [Commented] (TIKA-4106) Digesting and content length on embedded ole/zip/pdf files are not calculated

2023-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758688#comment-17758688 ] ASF GitHub Bot commented on TIKA-4106: -- tballison opened a new pull request, #1303: URL:

[jira] [Commented] (TIKA-4016) Upgrade to PDFBox 2.0.28

2023-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758686#comment-17758686 ] ASF GitHub Bot commented on TIKA-4016: -- tballison closed pull request #1302: TIKA-4016 URL:

[jira] [Commented] (TIKA-4016) Upgrade to PDFBox 2.0.28

2023-08-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758682#comment-17758682 ] ASF GitHub Bot commented on TIKA-4016: -- tballison opened a new pull request, #1302: URL:

[jira] [Commented] (TIKA-3109) Ingest attachment: failed to extract text from iframe

2023-08-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755224#comment-17755224 ] ASF GitHub Bot commented on TIKA-3109: -- tballison merged PR #1291: URL:

[jira] [Commented] (TIKA-4048) Gzipped WARC not identifying all assets

2023-08-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755213#comment-17755213 ] ASF GitHub Bot commented on TIKA-4048: -- tballison merged PR #1290: URL:

[jira] [Commented] (TIKA-3109) Ingest attachment: failed to extract text from iframe

2023-08-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755210#comment-17755210 ] ASF GitHub Bot commented on TIKA-3109: -- tballison opened a new pull request, #1291: URL:

[jira] [Commented] (TIKA-4048) Gzipped WARC not identifying all assets

2023-08-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755192#comment-17755192 ] ASF GitHub Bot commented on TIKA-4048: -- tballison opened a new pull request, #1290: URL:

[jira] [Commented] (TIKA-4116) Duplicate macros extracted from some embedded OLE2 containers

2023-08-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754697#comment-17754697 ] ASF GitHub Bot commented on TIKA-4116: -- tballison merged PR #1285: URL:

[jira] [Commented] (TIKA-4091) OLE2 / CFB entry names should be treated case-insensitively

2023-08-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754668#comment-17754668 ] ASF GitHub Bot commented on TIKA-4091: -- tballison opened a new pull request, #1286: URL:

[jira] [Commented] (TIKA-4116) Duplicate macros extracted from some embedded OLE2 containers

2023-08-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754667#comment-17754667 ] ASF GitHub Bot commented on TIKA-4116: -- tballison opened a new pull request, #1285: URL:

[jira] [Commented] (TIKA-2545) RereadableInputStream backing byte array not constructed properly

2023-07-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748776#comment-17748776 ] ASF GitHub Bot commented on TIKA-2545: -- genekhart closed pull request #217: fix for TIKA-2545

[jira] [Commented] (TIKA-4101) Regression in identifying vnd.ms-equation

2023-07-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746462#comment-17746462 ] ASF GitHub Bot commented on TIKA-4101: -- tballison merged PR #1251: URL:

[jira] [Commented] (TIKA-4101) Regression in identifying vnd.ms-equation

2023-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745660#comment-17745660 ] ASF GitHub Bot commented on TIKA-4101: -- tballison opened a new pull request, #1251: URL:

[jira] [Commented] (TIKA-4104) Add temporary workaround to fix NPE from bug in jaiimageio-core

2023-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745657#comment-17745657 ] ASF GitHub Bot commented on TIKA-4104: -- tballison merged PR #1250: URL:

[jira] [Commented] (TIKA-4104) Add temporary workaround to fix NPE from bug in jaiimageio-core

2023-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745630#comment-17745630 ] ASF GitHub Bot commented on TIKA-4104: -- tballison opened a new pull request, #1250: URL:

[jira] [Commented] (TIKA-4102) Add filename to tika-eval reports where each line is a file or embedded file

2023-07-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745156#comment-17745156 ] ASF GitHub Bot commented on TIKA-4102: -- tballison merged PR #1247: URL:

[jira] [Commented] (TIKA-4102) Add filename to tika-eval reports where each line is a file or embedded file

2023-07-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745137#comment-17745137 ] ASF GitHub Bot commented on TIKA-4102: -- tballison opened a new pull request, #1247: URL:

[jira] [Commented] (TIKA-4100) Work-around for ~infinite loop in one of the reports in tika-eval Report tool

2023-07-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17744721#comment-17744721 ] ASF GitHub Bot commented on TIKA-4100: -- tballison merged PR #1245: URL:

[jira] [Commented] (TIKA-4100) Work-around for ~infinite loop in one of the reports in tika-eval Report tool

2023-07-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17744698#comment-17744698 ] ASF GitHub Bot commented on TIKA-4100: -- tballison opened a new pull request, #1245: URL:

[jira] [Commented] (TIKA-4098) Detection fails on PDF with garbage before header

2023-07-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741557#comment-17741557 ] ASF GitHub Bot commented on TIKA-4098: -- SchwingSK opened a new pull request, #1231: URL:

[jira] [Commented] (TIKA-4097) Improve cleanup of temporary resources for crashed PipesServers

2023-07-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740318#comment-17740318 ] ASF GitHub Bot commented on TIKA-4097: -- tballison opened a new pull request, #1226: URL:

[jira] [Commented] (TIKA-4096) Allow users to configure how often the jdbc pipes reporter commits updates

2023-07-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740314#comment-17740314 ] ASF GitHub Bot commented on TIKA-4096: -- tballison merged PR #1225: URL:

[jira] [Commented] (TIKA-4096) Allow users to configure how often the jdbc pipes reporter commits updates

2023-07-05 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740299#comment-17740299 ] ASF GitHub Bot commented on TIKA-4096: -- tballison opened a new pull request, #1225: URL:

[jira] [Commented] (TIKA-4092) Upgrade to POI 5.2.4 when available

2023-06-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17736199#comment-17736199 ] ASF GitHub Bot commented on TIKA-4092: -- tballison opened a new pull request, #1211: URL:

[jira] [Commented] (TIKA-3666) Detect and indicate file encrypted with Rights Management Service RMS/IRM

2023-06-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735422#comment-17735422 ] ASF GitHub Bot commented on TIKA-3666: -- tballison merged PR #1204: URL:

[jira] [Commented] (TIKA-3666) Detect and indicate file encrypted with Rights Management Service RMS/IRM

2023-06-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735413#comment-17735413 ] ASF GitHub Bot commented on TIKA-3666: -- tballison opened a new pull request, #1204: URL:

[jira] [Commented] (TIKA-4091) OLE2 / CFB entry names should be treated case-insensitively

2023-06-20 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735353#comment-17735353 ] ASF GitHub Bot commented on TIKA-4091: -- tballison opened a new pull request, #1203: URL:

[jira] [Commented] (TIKA-4082) Extraction from Microsoft Sharepoint protected PDFs doesn't expose exception like other parsers.

2023-06-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17733528#comment-17733528 ] ASF GitHub Bot commented on TIKA-4082: -- tballison merged PR #1196: URL:

[jira] [Commented] (TIKA-4082) Extraction from Microsoft Sharepoint protected PDFs doesn't expose exception like other parsers.

2023-06-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17733210#comment-17733210 ] ASF GitHub Bot commented on TIKA-4082: -- tballison opened a new pull request, #1196: URL:

[jira] [Commented] (TIKA-4062) OfflineContentHandler/ContentHandlerDecorator does not provide option for custom error handling

2023-06-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731698#comment-17731698 ] ASF GitHub Bot commented on TIKA-4062: -- tballison commented on PR #1186: URL:

[jira] [Commented] (TIKA-4062) OfflineContentHandler/ContentHandlerDecorator does not provide option for custom error handling

2023-06-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731697#comment-17731697 ] ASF GitHub Bot commented on TIKA-4062: -- tballison merged PR #1186: URL:

[jira] [Commented] (TIKA-4043) Fix build for variations in tesseract and timezone info in RTFs

2023-06-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731656#comment-17731656 ] ASF GitHub Bot commented on TIKA-4043: -- tballison merged PR #1187: URL:

[jira] [Commented] (TIKA-4043) Fix build for variations in tesseract and timezone info in RTFs

2023-06-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731634#comment-17731634 ] ASF GitHub Bot commented on TIKA-4043: -- tballison opened a new pull request, #1187: URL:

[jira] [Commented] (TIKA-4062) OfflineContentHandler/ContentHandlerDecorator does not provide option for custom error handling

2023-06-09 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730835#comment-17730835 ] ASF GitHub Bot commented on TIKA-4062: -- raviranjanjha opened a new pull request, #1186: URL:

[jira] [Commented] (TIKA-3941) Consider having pipesserver return intermediate results

2023-06-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730603#comment-17730603 ] ASF GitHub Bot commented on TIKA-3941: -- tballison merged PR #1167: URL:

[jira] [Commented] (TIKA-4062) OfflineContentHandler/ContentHandlerDecorator does not provide option for custom error handling

2023-06-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730599#comment-17730599 ] ASF GitHub Bot commented on TIKA-4062: -- tballison merged PR #1179: URL:

[jira] [Commented] (TIKA-4039) Allow users to set the maximum attachment size in the /unpack resource of tika-server

2023-06-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730594#comment-17730594 ] ASF GitHub Bot commented on TIKA-4039: -- tballison merged PR #1181: URL:

[jira] [Commented] (TIKA-4039) Make max byte array parameter in Unrar parser configurable

2023-06-08 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730580#comment-17730580 ] ASF GitHub Bot commented on TIKA-4039: -- tballison opened a new pull request, #1181: URL:

[jira] [Commented] (TIKA-4062) OfflineContentHandler/ContentHandlerDecorator does not provide option for custom error handling

2023-06-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730121#comment-17730121 ] ASF GitHub Bot commented on TIKA-4062: -- tballison opened a new pull request, #1179: URL:

[jira] [Commented] (TIKA-4048) Gzipped WARC not identifying all assets

2023-06-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728763#comment-17728763 ] ASF GitHub Bot commented on TIKA-4048: -- tballison merged PR #1166: URL:

[jira] [Commented] (TIKA-4048) Gzipped WARC not identifying all assets

2023-06-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728764#comment-17728764 ] ASF GitHub Bot commented on TIKA-4048: -- tballison commented on PR #1166: URL:
