[ 
https://issues.apache.org/jira/browse/TIKA-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18077797#comment-18077797
 ] 

Tim Allison edited comment on TIKA-4683 at 5/2/26 11:47 AM:
------------------------------------------------------------

New reports: unpack: a known issue. Some churn in octet-stream getting detected 
as text. Some diffs in encoding detection; I will take a deeper look on 
Monday, but nothing immediately leaps out.

gzip file names.

More zero-byte file exceptions and some churn in ole vs msoffice in embedded 
doc detection; I will take a further look on Monday.

I'll take a look again early Monday (EST), but I think we're good enough for 
4.0.0-ALPHA?


was (Author: [email protected]):
New reports: unpack: a known issue. Some churn in octet-stream getting detected 
as text. Some diffs in encoding detection; I will take a deeper look on 
Monday, but nothing immediately leaps out.

More zero-byte file exceptions and some churn in ole vs msoffice in embedded 
doc detection; I will take a further look on Monday.

I'll take a look again early Monday (EST), but I think we're good enough for 
4.0.0-ALPHA?

> Prep for 4.0.0-ALPHA release
> ----------------------------
>
>                 Key: TIKA-4683
>                 URL: https://issues.apache.org/jira/browse/TIKA-4683
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: reports-20260429.tar.gz, reports-20260502.tar.gz, 
> reports-4.0.0-20260411.tgz, reports.tar.gz
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
