[ 
https://issues.apache.org/jira/browse/TIKA-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18077407#comment-18077407
 ] 

Tim Allison commented on TIKA-4683:
-----------------------------------

Yes, thank you for reviewing. It really is a number of count differences. There
were two main causes: one is extra comment repetition in 3.x that we've fixed
in 4.x. The other has to do with how we're handling some URLs in 4.x, and the
new handling looks correct. There may have been a third...

Still working on reverting to the 3.x charset chain...lots of tests were
updated. Once that's finished, I'll run what is hopefully the last batch of
regression tests. :P

> Prep for 4.0.0-ALPHA release
> ----------------------------
>
>                 Key: TIKA-4683
>                 URL: https://issues.apache.org/jira/browse/TIKA-4683
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: reports-20260429.tar.gz, reports-4.0.0-20260411.tgz, 
> reports.tar.gz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
