[ 
https://issues.apache.org/jira/browse/RAT-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033015#comment-18033015
 ] 

Philipp Ottlinger edited comment on RAT-512 at 10/26/25 10:27 AM:
------------------------------------------------------------------

Because of the JDK 8 restriction we are bound to a very old version of Tika, 
which seems to have introduced regressions in document scanning. Switching to 
Tika was a deliberate, major change, because the previous manual document 
recognition was prone to other errors. Thanks for the report.


was (Author: hugo.hirsch):
Due to the JDK8-restriction we are bound to a very old version of Tika that 
seems to have introduced regressions in document scanning. Thanks for the 
report.

> PDF files fail the check
> ------------------------
>
>                 Key: RAT-512
>                 URL: https://issues.apache.org/jira/browse/RAT-512
>             Project: Apache Rat
>          Issue Type: Bug
>    Affects Versions: 0.17
>            Reporter: Niels Basjes
>            Priority: Major
>
> Simply upgrading from 0.16.1 to 0.17 now fails the build if PDF files are 
> present (desired and committed in my case) in the directory tree.
> *Reproduction:*
> * Empty directory with an empty pom.xml (i.e. only the basics and no mention 
> of apache rat at all)
> {code:java}
> <?xml version="1.0"?>
> <project xmlns="http://maven.apache.org/POM/4.0.0" 
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
> http://maven.apache.org/xsd/maven-4.0.0.xsd">
>   <modelVersion>4.0.0</modelVersion>
>   <groupId>nl.basjes.bugreports</groupId>
>   <artifactId>dummy</artifactId>
>   <version>0.0.1-SNAPSHOT</version>
> </project> {code}
>  * Add a pdf file to the directory 
> {code}
> $ file Something.pdf 
> Something.pdf: PDF document, version 1.5 (zip deflate encoded)
> {code}
> *mvn org.apache.rat:apache-rat-plugin:0.16.1:check*
> This output is expected because pom.xml does not have a license and the PDF 
> is a binary file that cannot have a license.
> {code}
> [INFO] Rat check: Summary over all files. Unapproved: 1, unknown: 1, 
> generated: 0, approved: 2 licenses.
> [WARNING] Files with unapproved licenses:
>   pom.xml
> {code}
> *mvn org.apache.rat:apache-rat-plugin:0.17:check*
> {code}
> [INFO] RAT summary:
> [INFO]   Approved:  0
> [INFO]   Archives:  0
> [INFO]   Binaries:  0
> [INFO]   Document types:  2
> [INFO]   Ignored:  1
> [INFO]   License categories:  1
> [INFO]   License names:  1
> [INFO]   Notices:  0
> [INFO]   Standards:  2
> [INFO]   Unapproved:  2
> [INFO]   Unknown:  2
> [ERROR] Unexpected count for UNAPPROVED, limit is [0,0].  Count: 2
> [INFO] UNAPPROVED (Unapproved) is a count of unapproved licenses.
> [WARNING] *****************************************************
> Generated at: 2025-10-26T09:38:21+01:00
> Files with unapproved licenses:
>   /Something.pdf
>   /pom.xml
> {code}
> I have not been able to spot a change in the change log that explains this 
> change in behavior so I think this is an unintended bug.
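
The reproduction steps quoted above could be scripted roughly as follows. This 
is only a sketch: the directory name rat-512-repro and the environment 
variables WORKDIR, PDF_SOURCE and RUN_MVN are hypothetical names introduced 
here, not part of the report. The pom.xml content and the two mvn invocations 
are taken from the issue description.

```shell
#!/bin/sh
# Sketch of the RAT-512 reproduction. WORKDIR, PDF_SOURCE and RUN_MVN are
# hypothetical knobs added for illustration; they do not come from the report.
set -eu

workdir="${WORKDIR:-rat-512-repro}"
mkdir -p "$workdir"

# Minimal pom.xml from the report: only the basics, no mention of Apache Rat.
cat > "$workdir/pom.xml" <<'EOF'
<?xml version="1.0"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>nl.basjes.bugreports</groupId>
  <artifactId>dummy</artifactId>
  <version>0.0.1-SNAPSHOT</version>
</project>
EOF

# The report uses a file named Something.pdf; copy in any PDF you have at hand.
if [ -n "${PDF_SOURCE:-}" ]; then
  cp "$PDF_SOURCE" "$workdir/Something.pdf"
fi

# Run the check with both plugin versions only when explicitly requested,
# since this needs Maven and network access. The check can fail the build for
# either version (pom.xml has no license header), so a failure is reported
# instead of aborting the script.
if [ "${RUN_MVN:-0}" = "1" ]; then
  for version in 0.16.1 0.17; do
    echo "== apache-rat-plugin ${version} =="
    (cd "$workdir" && mvn "org.apache.rat:apache-rat-plugin:${version}:check") \
      || echo "check failed for ${version}"
  done
fi
```

Run it, for example, as PDF_SOURCE=/path/to/some.pdf RUN_MVN=1 sh repro.sh and 
compare which files each version lists as unapproved.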



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
