[jira] [Commented] (TIKA-4440) General updates for 3.2.2

2025-06-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17986020#comment-17986020 ] Hudson commented on TIKA-4440: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4327) General updates for 4.0.0

2025-06-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985997#comment-17985997 ] Hudson commented on TIKA-4327: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #

[jira] [Updated] (TIKA-4443) ClassCastException while extracting the text of a PDF

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4443: -- Attachment: screenshot-1.png > ClassCastException while extracting the text of a PDF > -

[jira] [Comment Edited] (TIKA-4443) ClassCastException while extracting the text of a PDF

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985860#comment-17985860 ] Tilman Hausherr edited comment on TIKA-4443 at 6/24/25 1:29 PM:

[jira] [Commented] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985892#comment-17985892 ] Hudson commented on TIKA-4442: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985902#comment-17985902 ] Tilman Hausherr commented on TIKA-4442: --- The build is now available at https://repo

[jira] [Resolved] (TIKA-4443) ClassCastException while extracting the text of a PDF

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-4443. --- Fix Version/s: 4.0.0 3.2.2 Assignee: Tilman Hausherr Resolut

[jira] [Updated] (TIKA-4443) ClassCastException while extracting the text of a PDF

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4443: -- Attachment: screenshot-2.png > ClassCastException while extracting the text of a PDF > -

[jira] [Updated] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Peter Hoogendijk (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Hoogendijk updated TIKA-4442: --- Attachment: lorem-ipsum.pdf > PDFParser does not list all metadata extracted by PDFBox > -

[jira] [Updated] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Peter Hoogendijk (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Hoogendijk updated TIKA-4442: --- Attachment: lorem-ipsum.xml > PDFParser does not list all metadata extracted by PDFBox > -

[jira] [Commented] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Peter Hoogendijk (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985879#comment-17985879 ] Peter Hoogendijk commented on TIKA-4442: I'll have a look at the code myself. both

[jira] [Commented] (TIKA-4443) ClassCastException while extracting the text of a PDF

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985860#comment-17985860 ] Tilman Hausherr commented on TIKA-4443: --- I think it's a broken PDF, I also found som

[jira] [Comment Edited] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Peter Hoogendijk (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985846#comment-17985846 ] Peter Hoogendijk edited comment on TIKA-4442 at 6/24/25 11:57 AM: --

[jira] [Created] (TIKA-4443) ClassCastException while extracting the text of a PDF

2025-06-24 Thread Olivier Ceulemans (Jira)
Olivier Ceulemans created TIKA-4443: --- Summary: ClassCastException while extracting the text of a PDF Key: TIKA-4443 URL: https://issues.apache.org/jira/browse/TIKA-4443 Project: Tika Issue

[jira] [Commented] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985848#comment-17985848 ] Tilman Hausherr commented on TIKA-4442: --- Thank you, my code gets them all. However i

[jira] [Commented] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Peter Hoogendijk (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985846#comment-17985846 ] Peter Hoogendijk commented on TIKA-4442: I've created a PDF-file for testing purpos

[jira] [Updated] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Peter Hoogendijk (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Hoogendijk updated TIKA-4442: --- Environment: * Docker container based on python:3-slim * Debian 12.11 * Python 3.13.5 * ope

[jira] [Comment Edited] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985799#comment-17985799 ] Tilman Hausherr edited comment on TIKA-4442 at 6/24/25 10:22 AM: ---

[jira] [Commented] (TIKA-4327) General updates for 4.0.0

2025-06-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985832#comment-17985832 ] Hudson commented on TIKA-4327: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #

[jira] [Commented] (TIKA-4440) General updates for 3.2.2

2025-06-24 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985809#comment-17985809 ] Hudson commented on TIKA-4440: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4442) PDFParser does not list all metadata extracted by PDFBox

2025-06-24 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985799#comment-17985799 ] Tilman Hausherr commented on TIKA-4442: --- That was just a note for myself. I have now