James,

In general, the license matching process strips out all non-essential
text (e.g. '/' and '*'), so the current implementation cannot determine
whether the license text is within a Javadoc block.  Some matchers (e.g.
Copyright, SPDX, and regex) do use the unmodified text, but they are
generally much slower.  In fact, the original SPDX and Copyright
implementations caused a significant (two orders of magnitude or more)
increase in processing time.  It would be possible to create a custom
matcher to do what you want, but there is currently no mechanism in the
code base to call a matcher only on specific file types.
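
As a rough illustration of the kind of check such a custom matcher would
have to run against the unmodified text, here is a minimal sketch.  The
class name, method name and marker phrase are hypothetical and not part of
the RAT code base, and the regex is naive (it does not account for string
literals or comment markers nested inside other comments).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the RAT code base.
final class JavadocHeaderCheck {
    // A Javadoc block: "/**" (but not the empty comment "/**/"), then
    // everything up to the first closing marker.
    private static final Pattern JAVADOC_BLOCK =
            Pattern.compile("/\\*\\*(?!/)(.*?)\\*/", Pattern.DOTALL);

    // Assumed marker phrase; a real matcher would use the configured license text.
    private static final String MARKER = "Licensed to the Apache Software Foundation";

    // Returns true if the marker phrase occurs inside any Javadoc block.
    static boolean licenseInJavadoc(String source) {
        Matcher m = JAVADOC_BLOCK.matcher(source);
        while (m.find()) {
            // Drop the leading "*" decoration on each line and collapse
            // whitespace so that a phrase wrapped across lines still matches.
            String body = m.group(1)
                    .replaceAll("(?m)^\\s*\\*+\\s?", " ")
                    .replaceAll("\\s+", " ");
            if (body.contains(MARKER)) {
                return true;
            }
        }
        return false;
    }
}
```

Because this scans the raw file contents with a regex, it would carry the
same kind of cost as the other matchers that use the unmodified text.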

There is a section of code that understands file types, but this is the
code that inserts headers into files that don't have them.  It may be
possible to build on that to create a custom matcher that ensures license
comments are not within Javadoc comments.  There is a ticket open to modify
how this code works so that new file types, with comment start/stop
definitions, restrictions on first lines and so on, can be defined outside
of the code base, making it possible to insert headers into as yet
unrecognized file formats.[1]  This might be extended to provide input to
the process you are requesting.
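
To make the idea concrete, externally defined comment styles could be as
simple as a properties file keyed by file extension.  The sketch below is
purely illustrative: the key names, the "doc.start" entry and the class
itself are invented here and are not the format proposed in the ticket.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: per-extension comment definitions loaded from text
// kept outside the code base. Keys and format are invented for illustration.
final class CommentStyles {
    private final Properties props = new Properties();

    CommentStyles(String definitions) {
        try {
            props.load(new StringReader(definitions));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the doc-comment opener for the file's extension, or null if
    // no such opener is defined for that file type.
    String docCommentStart(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        return props.getProperty(fileName.substring(dot + 1) + ".doc.start");
    }

    // True if a doc-comment-aware matcher would have anything to do for this file.
    boolean hasDocComments(String fileName) {
        return docCommentStart(fileName) != null;
    }
}
```

With definitions such as "java.start=/*", "java.stop=*/" and
"java.doc.start=/**", a matcher could be skipped for any file type that
defines no doc-comment opener.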

There is also a section of code that removes the non-essential text.  The
'prune' method could be modified to remove blocks of code between the
opening Javadoc '/**' and the closing '*/'.  But this may lead to problems
with non-Java files.  Speaking of non-Java files, have you thought about
ensuring that the license does not appear in other Javadoc-like systems?
[2]  Once this can of worms is opened we will need a way to manage all the
requests that will follow for other file types.
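
A minimal sketch of that pruning idea follows.  The class and method names
are hypothetical, and the prune step shown is only a stand-in that keeps
letters and digits; it is an assumption about the behaviour, not the
actual RAT implementation.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the RAT implementation.
final class JavadocPruneSketch {
    // "/**" (but not the empty comment "/**/") up to the first closing marker.
    private static final Pattern JAVADOC_BLOCK =
            Pattern.compile("/\\*\\*(?!/).*?\\*/", Pattern.DOTALL);

    // Removes every Javadoc block so that its contents never reach the matchers.
    static String dropJavadoc(String text) {
        return JAVADOC_BLOCK.matcher(text).replaceAll("");
    }

    // Stand-in for the existing pruning (assumption): keep only letters
    // and digits, lower-cased, after the Javadoc blocks have been dropped.
    static String prune(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : dropJavadoc(text).toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }
}
```

With this in place a header inside '/** ... */' is discarded before
matching, so the file would be reported as unlicensed, while the same
header inside '/* ... */' survives.  Applied blindly to a non-Java file,
the same pattern could discard unrelated text that merely happens to
contain '/**'.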

If you have any ideas for implementing the change I would be interested to
hear them.

Claude

[1] https://issues.apache.org/jira/browse/RAT-330?
[2]
https://stackoverflow.com/questions/5334531/using-javadoc-for-python-documentation

On Fri, Jan 26, 2024 at 2:38 PM James Turton <dz...@apache.org> wrote:

> Thanks Phil.
>
> Here's some background [1] which comes from before I was involved with
> Drill. What they wanted was for the license header checker to accept, in
> .java files,
>
> /*
>   * Licensed to the Apache Software Foundation (ASF) under one
>   * or more contributor license agreements.  See the NOTICE file
>   * distributed with this work for additional information
>     etc.
>
> but reject
>
> /**
>   * Licensed to the Apache Software Foundation (ASF) under one
>   * or more contributor license agreements.  See the NOTICE file
>   * distributed with this work for additional information
>     etc.
>
> Notice the two asterisks that open the Java comment block in the second
> form thereby making it a Javadoc comment that will appear in generated
> Javadoc. There are no longer any examples of the latter in Drill but
> this has been enforced by the addition of the license-maven-plugin.
>
> I got here because I want to remove that plugin, which essentially
> duplicates RAT, in favour of another (with exactly the same name :()
> that can generate license and notice information for our third party
> code. This last task is what I'm really doing, the Javadoc license
> header rejection matter is yak shaving that came up on the road.
>
> So my yak shaving question is: if I make RAT Drill's only license header
> checker then could I make it reject license headers of the second form?
> Even if I can't I'm inclined to make it the only header checker since I
> think that it's in any case mandatory and authoritative. But in an
> effort to retain the work of the previous Drill developers I'm trying to
> preserve what they implemented.
>
> 1. https://issues.apache.org/jira/browse/DRILL-6320
>
> On 2024/01/26 14:06, P. Ottlinger wrote:
> > Hi James,
> >
> > thanks for reaching out!
> >
> > Am 26.01.24 um 08:21 schrieb James Turton:
> >> I'd like to ask about a feature to prevent RAT from allowing license
> >> headers to appear inside Javadoc comments  (/**) while still requiring
> >> them in Java comments (/*) in .java files. Currently the Drill project
> >> makes use of com.mycila.license-maven-plugin to reject licenses in
> >> Javadoc comments because the developers at the time didn't want
> >> license headers cluttering the Javadoc website that is generated from
> >> the source. Are you aware of  a general view on Apache license headers
> >> appearing in Javadoc pages? If preventing them from doing so is a good
> >> idea, could this become a (configurable) feature in RAT?
> >
> > could you be so kind to provide an example of what you want to achieve
> > and how your use case looks like?
> >
> > I'm afraid I do not really understand what you mean with
> > javadoc-specific licenses?
> >
> > At the moment we don't have a file specific parsing to exclude comments
> > - is that what you want to achieve?
> >
> > On the other hand if a license header is needed per file, it has to be
> > somewhere in the sources ;)
> >
> > Thanks,
> > Phil
>


-- 
LinkedIn: http://www.linkedin.com/in/claudewarren
