If there is a requirement that a NOTICE file exists then there are probably
other projects that have requirements for other files to exist.  Perhaps we
need a way to specify that specific file patterns must exist.

We could add:

   - --require-file-literal to require a literal file name (e.g. "AUTHOR.TXT")
   - --require-file-wildcard to require files based on file wildcards (e.g.
   "AUTHOR.*")
   - --require-file-regex to require files based on regular expressions
   (e.g. "AUTHORS?(\.[Tt][Xx][Tt])?")
   - --require-contents-literal to require files based on a literal match
   to text in the file.
   - --require-contents-regex to require files based on a regular
   expression match to text in the file.
   - --require-source to require files and directories based on entries
   read from a file.  Multiple file structures could be accepted, but in
   general each entry has a flag for the file/contents dichotomy and one for
   the literal/wildcard/regex trichotomy. For example:
   file:literal:AUTHOR.TXT or
   <contents><literal>Generated  by</literal></contents>
   - --no-default-require to remove any required files that are included by
   default.
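
As a rough sketch of how the file-name half of this could be checked, here
is a small Java example. Everything in it is hypothetical: the class and
method names do not exist in RAT, it assumes the colon-separated entry form
(file:literal:AUTHOR.TXT), and it only handles "file" entries, not
"contents" entries. Wildcards are delegated to the JDK glob matcher, whose
case sensitivity depends on the platform.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names exist in Apache RAT.
final class RequiredFiles {

    // Turns an entry such as "file:literal:AUTHOR.TXT" into a file-name test.
    static Predicate<String> parse(String entry) {
        String[] parts = entry.split(":", 3); // target, match style, pattern
        if (parts.length != 3 || !parts[0].equals("file")) {
            throw new IllegalArgumentException("Unsupported entry: " + entry);
        }
        String pattern = parts[2];
        switch (parts[1]) {
            case "literal":
                return pattern::equals;
            case "wildcard": {
                PathMatcher glob =
                        FileSystems.getDefault().getPathMatcher("glob:" + pattern);
                return name -> glob.matches(Paths.get(name));
            }
            case "regex": {
                Pattern regex = Pattern.compile(pattern);
                return name -> regex.matcher(name).matches();
            }
            default:
                throw new IllegalArgumentException("Unknown match style: " + parts[1]);
        }
    }

    // Returns the entries that no file name satisfies, i.e. the unmet requirements.
    static List<String> missing(List<String> entries, List<String> fileNames) {
        List<String> unmet = new ArrayList<>();
        for (String entry : entries) {
            if (fileNames.stream().noneMatch(parse(entry))) {
                unmet.add(entry);
            }
        }
        return unmet;
    }

    public static void main(String[] args) {
        List<String> names = List.of("NOTICE", "AUTHORS.txt", "pom.xml");
        List<String> entries = List.of(
                "file:literal:NOTICE",
                "file:wildcard:LICENSE*",
                "file:regex:AUTHORS?(\\.[Tt][Xx][Tt])?");
        // Only the LICENSE* requirement is unmet for this file list.
        System.out.println(missing(entries, names));
    }
}
```

A report could then fail the run whenever missing(...) returns a non-empty
list, which is the "must exist" counterpart of the exclude options.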

We can add options to the configuration XML to provide a default mechanism
to do complex exclude / require operations in a way that is portable across
build systems.  In this case --exclude-source and --require-source are not
required.

Currently the configuration system supports multiple configuration file
formats [1], but only XML is implemented.  commons-configuration files or
YAML files might be reasonable additional implementations.  Now that I think
about it, a pom.xml parser might be a way to solve some issues with the
Maven plugin.

[1]
https://github.com/apache/creadur-rat/blob/master/apache-rat-core/src/main/java/org/apache/rat/configuration/Format.java

On Sun, May 5, 2024 at 4:02 PM P. Ottlinger <pottlin...@apache.org> wrote:

> Hi,
>
> On 05.05.24 at 08:52, Claude Warren wrote:
> > *Current state*
> >
> > We attempt to provide a default configuration that is ASF requirements
> > based.
> >
> > We currently categorize documents into one of six categories: Generated,
> > Unknown, Archive, Notice, Binary, Standard.
> >
> >     - Standard documents get scanned for the presence or absence of
> >     license headers.
> >     - Archive documents may get scanned for the presence or absence of
> >     license headers.
> >     - Notice files are determined by file name [1]  and are excluded from
> >     processing
>
> If I'm not mistaken the special handling for NOTICE files was introduced
> as there is a requirement to include these files into a JAR.
>
> Personally I wouldn't mind your changes as I never understood the
> special handling, but downstream users might use RAT's classification to
> ensure that a NOTICE file exists ....
>
> Cheers,
> Phil
>


-- 
LinkedIn: http://www.linkedin.com/in/claudewarren
