[jira] [Commented] (TIKA-3666) Detect and indicate file encrypted with Rights Management Service RMS/IRM

2022-04-19 Thread August Valera (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524177#comment-17524177 ] August Valera commented on TIKA-3666: - [~tallison] I can confirm that this works on th

[jira] [Commented] (TIKA-3666) Detect and indicate file encrypted with Rights Management Service RMS/IRM

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524204#comment-17524204 ] Tim Allison commented on TIKA-3666: --- There’s a chance pdfs are also wrapped in an ole2 c

[jira] [Created] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
denisn created TIKA-3720: Summary: IllegalArgumentException in PDF parser Key: TIKA-3720 URL: https://issues.apache.org/jira/browse/TIKA-3720 Project: Tika Issue Type: Bug Affects Versions: 1.23

[jira] [Updated] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] denisn updated TIKA-3720: - Description: Tika packages:  "org.apache.tika" % "tika" %  c "org.apache.tika" % "tika-core" %  1.28.1 "org.apach

[jira] [Updated] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] denisn updated TIKA-3720: - Description: Tika packages:  {code:java} "org.apache.tika" % "tika" %  c "org.apache.tika" % "tika-core" %  1.28.1

[jira] [Updated] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] denisn updated TIKA-3720: - Description: Tika packages:  {code:java} "org.apache.tika" % "tika" %  1.28.1 "org.apache.tika" % "tika-core" %  1

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524298#comment-17524298 ] Tim Allison commented on TIKA-3720: --- The problem isn't the file, although, thank you for

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524311#comment-17524311 ] Tim Allison commented on TIKA-3720: --- In TIKA-2970 (which happened between 1.22 and 1.23)

[jira] [Commented] (TIKA-3666) Detect and indicate file encrypted with Rights Management Service RMS/IRM

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524334#comment-17524334 ] Tim Allison commented on TIKA-3666: --- I'm wondering if we should throw an exception in th

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524337#comment-17524337 ] denisn commented on TIKA-3720: -- Sorry, I was confused because 2.x gave me an output and 1.28.

[jira] [Comment Edited] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524337#comment-17524337 ] denisn edited comment on TIKA-3720 at 4/19/22 2:06 PM: --- Sorry, I was

[jira] [Resolved] (TIKA-2359) Extreme slow parsing on the attachment attached

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2359. --- Resolution: Not A Problem > Extreme slow parsing on the attachment attached >

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2022-04-19 Thread Alexander Bias (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524341#comment-17524341 ] Alexander Bias commented on TIKA-2359: -- === English version follows === Sehr geehrt

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524343#comment-17524343 ] Tim Allison commented on TIKA-3720: --- bq. I had no Tesseract installed Ah, ok, that tog

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524362#comment-17524362 ] denisn commented on TIKA-3720: -- I waited for the parser in 1.22 long enough and got the result

[jira] [Comment Edited] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524362#comment-17524362 ] denisn edited comment on TIKA-3720 at 4/19/22 2:57 PM: --- I've waited

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524368#comment-17524368 ] denisn commented on TIKA-3720: -- Well, it's a different problem, it seems. Should I open another

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524376#comment-17524376 ] Tim Allison commented on TIKA-3720: --- The problem with 1.x or 2.x? 1.x (I think) is cau

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524386#comment-17524386 ] denisn commented on TIKA-3720: -- I guess the problem is only in 1.x starting with 1.23. 1.23 a

[jira] [Comment Edited] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524386#comment-17524386 ] denisn edited comment on TIKA-3720 at 4/19/22 3:25 PM: --- I guess the

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524515#comment-17524515 ] Tim Allison commented on TIKA-3720: --- Interesting. The issue (I think) is that you're "h

[jira] [Commented] (TIKA-3719) Tika Server Ability to Run HTTPs

2022-04-19 Thread Daniel Coldrick (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524526#comment-17524526 ] Daniel Coldrick commented on TIKA-3719: --- I managed to get it working with https by u

[jira] [Created] (TIKA-3721) DGN parser

2022-04-19 Thread Dan Coldrick (Jira)
Dan Coldrick created TIKA-3721: -- Summary: DGN parser Key: TIKA-3721 URL: https://issues.apache.org/jira/browse/TIKA-3721 Project: Tika Issue Type: New Feature Components: parser Af

[jira] [Created] (TIKA-3722) OOM exception on xlsx parsing

2022-04-19 Thread sagi shechter (Jira)
sagi shechter created TIKA-3722: --- Summary: OOM exception on xlsx parsing Key: TIKA-3722 URL: https://issues.apache.org/jira/browse/TIKA-3722 Project: Tika Issue Type: Bug Reporter:

[jira] [Updated] (TIKA-3722) OOM exception on xlsx parsing

2022-04-19 Thread sagi shechter (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sagi shechter updated TIKA-3722: Attachment: records_headers_02.xlsx > OOM exception on xlsx parsing > -

[jira] [Updated] (TIKA-3722) OOM exception on xlsx parsing

2022-04-19 Thread sagi shechter (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sagi shechter updated TIKA-3722: Attachment: (was: records_headers_02.xlsx) > OOM exception on xlsx parsing > ---

[jira] [Updated] (TIKA-3722) OOM exception on xlsx parsing

2022-04-19 Thread sagi shechter (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sagi shechter updated TIKA-3722: Description: The file is ~3 MB and fails with tika app 2.3.0 {code:java} java.lang.OutOfMemoryError: Ja

[jira] [Updated] (TIKA-3722) OOM exception on xlsx parsing

2022-04-19 Thread sagi shechter (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sagi shechter updated TIKA-3722: Affects Version/s: 2.3.0 > OOM exception on xlsx parsing > - > >

[jira] [Commented] (TIKA-3719) Tika Server Ability to Run HTTPs

2022-04-19 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524570#comment-17524570 ] Tim Allison commented on TIKA-3719: --- Yes, yes, and more yes! Thank you! How can we p

[jira] [Commented] (TIKA-3719) Tika Server Ability to Run HTTPs

2022-04-19 Thread Dan Coldrick (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524588#comment-17524588 ] Dan Coldrick commented on TIKA-3719: Hi [~tallison]  I'm far from being a java develo

New JDK 19 EA builds and JCE Survey!

2022-04-19 Thread David Delabassee
Greetings! The proposed schedule for JDK 19 is now known [1] with ‘Rampdown Phase One’ set for June 9th and ‘General Availability’ set for September 20th. The next several weeks will be interesting to watch as the scope of JDK 19 is revealed. You also play an important role during these phas

[jira] [Commented] (TIKA-3721) DGN parser

2022-04-19 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524718#comment-17524718 ] Nick Burch commented on TIKA-3721: -- After a quick look, I can't spot any free tools or li

[jira] [Commented] (TIKA-3722) OOM exception on xlsx parsing

2022-04-19 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524736#comment-17524736 ] Tilman Hausherr commented on TIKA-3722: --- Please attach the file and try with differe

[jira] [Commented] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524767#comment-17524767 ] denisn commented on TIKA-3720: -- I've changed the contentParser to: {code:java} def contentPa

[jira] [Comment Edited] (TIKA-3720) IllegalArgumentException in PDF parser

2022-04-19 Thread denisn (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17524767#comment-17524767 ] denisn edited comment on TIKA-3720 at 4/20/22 6:45 AM: --- I've changed