[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840922#comment-17840922 ] Tilman Hausherr commented on TIKA-4245: --- The file claims to be utf-16 but it isn't. If I change it
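A minimal sketch of the kind of mismatch described in the comment above: content that declares one encoding but is stored in another. Everything here is an assumption made for illustration. The sample markup, the use of a `meta charset` declaration, and UTF-8 as the real encoding are hypothetical and do not come from the attached file, and the sketch uses plain Python decoding rather than Tika's own detection logic.

```python
# Hypothetical sample: the markup declares utf-16, but the bytes are written as UTF-8.
html = '<html><head><meta charset="utf-16"></head><body>Hello Tika</body></html>'
raw = html.encode("utf-8")

# Trusting the declared charset pairs up the single-byte ASCII characters
# into unrelated 16-bit code units, so the readable text is lost.
as_declared = raw.decode("utf-16", errors="replace")

# Decoding with the encoding the bytes were actually written in recovers the text.
as_actual = raw.decode("utf-8")

print("Hello Tika" in as_declared)
print("Hello Tika" in as_actual)
```

A parser that relies on the declared charset would see something like `as_declared`, which could explain missing or garbled extracted content in a case like this.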

[jira] [Commented] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840908#comment-17840908 ] Tilman Hausherr commented on TIKA-4245: --- This also happens with the Tika app GUI. > Tika does not get

[jira] [Updated] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-4245: -- Description: We use org.apache.tika.parser.AutoDetectParser to get the content of HTML files.

[jira] [Created] (TIKA-4245) Tika does not get html content properly

2024-04-25 Thread Xiaohong Yang (Jira)
Xiaohong Yang created TIKA-4245: --- Summary: Tika does not get html content properly Key: TIKA-4245 URL: https://issues.apache.org/jira/browse/TIKA-4245 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840893#comment-17840893 ] Hudson commented on TIKA-4244: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1612 (See

[jira] [Resolved] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4244. --- Fix Version/s: 3.0.0 2.9.3 Resolution: Fixed Thank you [~boomxlucifer]! >

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840860#comment-17840860 ] ASF GitHub Bot commented on TIKA-4244: -- tballison merged PR #1731: URL:

Re: [PR] TIKA-4244 -- improve ics detection [tika]

2024-04-25 Thread via GitHub
tballison merged PR #1731: URL: https://github.com/apache/tika/pull/1731

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840850#comment-17840850 ] ASF GitHub Bot commented on TIKA-4244: -- tballison opened a new pull request, #1731: URL:

[jira] [Commented] (TIKA-4244) Tika idenifies MIME type of ics files with html content as text/html

2024-04-25 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840852#comment-17840852 ] Tim Allison commented on TIKA-4244: --- Thank you [~boomxlucifer] for finding this and reporting it. The

[PR] TIKA-4244 -- improve ics detection [tika]

2024-04-25 Thread via GitHub
tballison opened a new pull request, #1731: URL: https://github.com/apache/tika/pull/1731