I think that we should do the following:

Create a repository for license files separate from the main tool.  Perhaps
in the tools section, perhaps as a separate git project.  The idea is that
teams can develop/refine license definitions and release them into a shared
space.   This will make it easier if someone wants to find the PHP license
for example.  Then we should put the PHP and EUPL licenses in there.  We
can also put the XSD for license/configuration in there.

We need to support RAT-7, but I think doing so will require some data design
changes.  Currently we have License family and License as separate
entities.  At one time I thought these should be the same, but I think now
that it is a reasonable segregation.  License files can exclude families,
but this is too coarse a segregation.  For example, to support the ASF
licenses we should:

   - Have a single file that defines all the ASF licenses under the family
   "AL" (Though I would like to change this to ASF)
   - Each license version would be listed as a separate license in the "AL"
   family.
   - Licenses have unique IDs.  Make the license ID for the AL licenses
   "ASF<version>" (e.g. ASF1.0, ASF1.1, and ASF2.0).

Currently the approved licenses list specifies which families are
approved.  I think it should specify which licenses within the families are
approved.  So currently we have "AL" as an approved license when it should
be ASF2.0.  That requires the ASF licenses to be defined so we can detect
1.0 and 1.1 and list them as not approved, which is the request for RAT-7.
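
A rough sketch of approving by license ID rather than by family.  The class
and method names here are made up for illustration and are not existing RAT
code; the family name and license IDs are the ones proposed above.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class LicenseApprovalSketch {
    /** Hypothetical model: a license belongs to a family and has its own unique ID. */
    static final class License {
        final String family;
        final String id;

        License(String family, String id) {
            this.family = family;
            this.id = id;
        }
    }

    /** Returns the IDs of defined licenses that are not in the approved set. */
    static List<String> unapproved(List<License> defined, Set<String> approvedIds) {
        return defined.stream()
                .map(license -> license.id)
                .filter(id -> !approvedIds.contains(id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<License> asf = List.of(
                new License("AL", "ASF1.0"),
                new License("AL", "ASF1.1"),
                new License("AL", "ASF2.0"));
        // Approve one license in the family instead of the whole family.
        System.out.println(unapproved(asf, Set.of("ASF2.0")));
    }
}
```

With only ASF2.0 approved, the 1.0 and 1.1 licenses are still detected but
reported as not approved.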

Create a licenses section in the configuration.

--licenses : a list of files to be read as license files.
--licenses-approved : a list of license IDs to approve.
--licenses-approved-file : A file containing license IDs to approve.
--licenses-no-default : An enumeration of DEF (do not load license
definitions), APPROVAL (do not load default license approvals)
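
One way the --licenses-no-default values could be modelled is as an enum set.
This is only a sketch: the type and method names are invented, and accepting
the values case-insensitively is my assumption.

```java
import java.util.EnumSet;
import java.util.Locale;

class NoDefaultSketch {
    /** The two values proposed for --licenses-no-default. */
    enum NoDefault {
        DEF,      // do not load the default license definitions
        APPROVAL  // do not load the default license approvals
    }

    /** Converts option values into a set; unknown values throw IllegalArgumentException. */
    static EnumSet<NoDefault> parse(String... values) {
        EnumSet<NoDefault> result = EnumSet.noneOf(NoDefault.class);
        for (String value : values) {
            result.add(NoDefault.valueOf(value.trim().toUpperCase(Locale.ROOT)));
        }
        return result;
    }

    public static void main(String[] args) {
        EnumSet<NoDefault> skip = parse("def", "APPROVAL");
        System.out.println("load default definitions: " + !skip.contains(NoDefault.DEF));
        System.out.println("load default approvals: " + !skip.contains(NoDefault.APPROVAL));
    }
}
```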

We should also change

   - --list-families to --licenses-list-families
   - --list-licenses to --licenses-list

The above will group all the license commands into one section of the help
output.


The Default will be the list of ASF approved licenses.

Other CLI changes

Create an "input" section that defines what files will be processed.

   - change --exclude to --input-exclude (exclusion to be defined by
   .github exclude pattern)
   - change --exclude-file to --input-exclude-file
   - add --input-exclude-no-default :  to disable default exclusion.
   - add --input-exclude-scm : an enumeration of known SCM exclusion files
   formats based on the SCM (e.g. github, svn) that would apply exclusions
   found by those files in the input.
   - add --input-file : specify a file to read rather than the input file.
   - remove --scan-hidden-directories as that would be the default if
   --input-exclude-no-default were specified and no other filters were added.

Add exclusion files to the same repository that has the license definitions
and enable community support.

Create an "edit" section to describe how any edits on the files will take
place.

   - change --copyright to --edit-copyright to add a copyright message
   - change --addLicense to --edit-license <file> : the optional <file>
   specifies the text to insert with ${copyright} in the text replaced with
   the --edit-copyright value or empty if --edit-copyright is not specified.
   - change --force to --edit-overwrite : to better describe what it does.
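
The ${copyright} replacement described for --edit-license could look roughly
like this.  The class and method names are hypothetical; the only behaviour
taken from the proposal is that the placeholder becomes the --edit-copyright
value, or empty if that option is not given.

```java
class EditLicenseSketch {
    /**
     * Replaces every ${copyright} in the license text with the copyright
     * message, or with the empty string when no message was supplied.
     */
    static String renderLicense(String template, String copyright) {
        String value = (copyright == null) ? "" : copyright;
        // String.replace does literal (non-regex) replacement, so "$" is safe here.
        return template.replace("${copyright}", value);
    }

    public static void main(String[] args) {
        String template = "${copyright}\nLicensed under the Example License.";
        System.out.println(renderLicense(template, "Copyright 2024 Example Project"));
        System.out.println(renderLicense(template, null));
    }
}
```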

Create an "output" section to describe how the output will be handled.

   - change --out to --output-file
   - change --xml to --output-xml
   - change --stylesheet to --output-stylesheet
   - change --standard to --output-standard
   - change --archive to --output-archive


Expand the help section to provide more detailed help and to provide links
to specific documentation.

   - add --help-input
   - add --help-licenses
   - add --help-output
   - add --help-edit
   - add -? for help

Deprecate ALL single letter options. Deprecate all other long options.
Add a warning that a deprecated command is being used.
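
As a sketch of the deprecation handling, the old names can be mapped to the
new ones with a warning on use.  The class is hypothetical and the table only
lists a few of the renames proposed above; it does not use commons-cli.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DeprecatedOptionSketch {
    /** A few of the proposed renames: old option name to new option name. */
    static final Map<String, String> RENAMED = new LinkedHashMap<>();
    static {
        RENAMED.put("--exclude", "--input-exclude");
        RENAMED.put("--exclude-file", "--input-exclude-file");
        RENAMED.put("--force", "--edit-overwrite");
        RENAMED.put("--out", "--output-file");
    }

    /** Returns the current name for an option, warning on stderr if the old name was used. */
    static String resolve(String option) {
        String replacement = RENAMED.get(option);
        if (replacement == null) {
            return option; // not a deprecated name, pass it through unchanged
        }
        System.err.println("Warning: " + option + " is deprecated, use " + replacement);
        return replacement;
    }

    public static void main(String[] args) {
        System.out.println(resolve("--out"));
        System.out.println(resolve("--output-file"));
    }
}
```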

Add the ability to use the minimum characters necessary to differentiate
the commands (e.g. --help-i for --help-input, or --input-f for
--input-file).  This is/was supported by the commons-cli library and should
be a simple switch to activate.
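
To illustrate the matching rule only (this is not the commons-cli
implementation, and the class name is made up): an abbreviation resolves when
exactly one known option starts with it, and an exact match always wins.

```java
import java.util.List;
import java.util.stream.Collectors;

class PrefixMatchSketch {
    /** Resolves a possibly abbreviated option against the list of known options. */
    static String resolve(String typed, List<String> known) {
        if (known.contains(typed)) {
            return typed; // exact match wins even if it is a prefix of a longer option
        }
        List<String> hits = known.stream()
                .filter(option -> option.startsWith(typed))
                .collect(Collectors.toList());
        if (hits.size() == 1) {
            return hits.get(0);
        }
        throw new IllegalArgumentException(hits.isEmpty()
                ? "Unknown option: " + typed
                : "Ambiguous option " + typed + ", could be any of " + hits);
    }

    public static void main(String[] args) {
        List<String> known = List.of("--help-input", "--help-licenses", "--help-output",
                "--help-edit", "--input-file", "--input-exclude", "--input-exclude-file");
        System.out.println(resolve("--help-i", known));
        System.out.println(resolve("--input-f", known));
    }
}
```

Note that something like --input-e would be rejected as ambiguous under this
rule, since both --input-exclude and --input-exclude-file start with it.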

Claude

On Sat, May 11, 2024 at 8:31 AM Claude Warren <cla...@apache.org> wrote:

> Revisiting the list:
>
> Config
>
> [X] 322 Hidden dir processing
> [X] 335 .gitignore not processed correctly
> 313 Add EUPL license
> [X] 110 Ability to define target license(s)
> 146 list of directories to exclude by default.
> 2   reports should be able to skip certain file types or contents.
> 130 support new languages and build systems
> 128 be consistent in the name of the Apache license.
> 141 Javadoc license not required is redundant
> 97  excludeSubProjects=true fails if hierarchy more than 1 deep.
>
>
> Engine
>
> [X] 150 Use Tika to determine file types
> [X] 335 .gitignore not processed correctly
> [X] 333 --force changes permissions.
> [X] 301 chinese char reports as binary file
> [X] 325 performance degradation
> 255 header above 'package' for Java files
> 238 Broken symlinks result in crash
> 235 symlinks are not recognized in compressed tarball
> [X] 211 rat-output.xml must be well formed even if BinaryGuesser fails
> [X] 190 false negatives in license checking
> 209 ignore short files when checking for licenses
> 60  create a crawler module
> [X] 54  mime detection using tika
> 183 report binary file analysis as debug
> 182 allow modification of list of binary file types
> [X] 178 binary guesser should not ignore all IOExceptions
> 149 generate incremental output
> 4   check for export controlled software noted in README
> 51  check for crypto and not its usage.
> [X] 53  factor out mime detection to allow for multiple implementations
> 14  provide number information about license headers
> 131 refactor core to a classic object oriented design
>
>
> UI
>
> 323 Harmonization
> 259 File to list files to scan
> [X] 77  output missing headers
> 240 cli exclude does not work with full path.
> 265 certain wildcard filters do not work anymore
> [X] 248 exclusions are duplicated in Maven with consoleOutput defined
> 239 maven build fails in multi project build with surefire-plugin
> forkCount=0
> 163 gradle plugin request
> [X] 107 useEclipseDefaults does not ignore sub-module eclipse dotfiles.
> 98  maven rat does not document skipped files.
> 134 CLI uses different wildcard syntax from ANT/Maven.
>
>
> Reporting
>
> 7   Distinguish older Apache headers
>
> Docs
>
> 326 Fix Javadoc errors
> 302 site build does not work with JDK 18
> 132 develop lexicon and drive configuration with it.
>
> Maint
>
> [X] 311 Dependabot changes
> [?] 272 multi-matrix builds
> 293 configure sonar
> 282 release 1
>
