Currently Rat does nothing with binary files except note that they are
binary.  However, the Tika library gives us a view into the interiors of
some binary files.  The ones that come to mind are image files.

Using Tika we can extract the metadata from binary files.  For image files
(and some others) this includes items like copyright and usage permission
entries.  We should process such tags and report copyrights that are not
included in notice files.  We should also explore permissions.  I expect
that, as we move forward, some permissions will be expressed with SPDX
tags, making license detection easier.
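
As a rough sketch of the "report copyrights that are not included in notice
files" step, something like the following could work.  It is only an
illustration: the class name UnlistedCopyrights is made up, the map stands in
for whatever name/value pairs Tika returns for a file, and matching on keys
that contain "copyright" is an assumption, since the actual metadata key
names vary by file format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Rat or Tika.
final class UnlistedCopyrights {

    private UnlistedCopyrights() {
    }

    /**
     * Returns the values of all metadata entries whose key mentions
     * "copyright" and whose value does not appear (ignoring case) in the
     * notice text.
     *
     * @param metadata   name/value pairs extracted from a binary file
     *                   (assumed to come from Tika)
     * @param noticeText the full text of the notice file
     */
    static List<String> find(Map<String, String> metadata, String noticeText) {
        String notice = noticeText.toLowerCase(Locale.ROOT);
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String key = entry.getKey().toLowerCase(Locale.ROOT);
            String value = entry.getValue().trim();
            // Assumption: copyright tags can be recognised by their key name.
            if (key.contains("copyright")
                    && !value.isEmpty()
                    && !notice.contains(value.toLowerCase(Locale.ROOT))) {
                missing.add(value);
            }
        }
        return missing;
    }
}
```

A plain substring check against the notice text is deliberately naive; a real
implementation would more likely go through our copyright matcher.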

We may need to add a new child node to the resource node in the XML
document.  This should probably be a simple copyright statement, something
like <copyright start='year' end='year' owner='somebody' />, that can be
matched by our copyright matcher.  This leads to another expansion of Rat
capabilities: detecting copyrights, listing them in the XML report, and
eventually verifying that they are recorded in the Notice files.
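
To make the proposed element concrete, here is a minimal sketch that turns a
copyright string, such as one found in image metadata, into the
<copyright ... /> form above.  The class name CopyrightElement, the accepted
statement format ("Copyright (c) 2019-2023 Owner"), and repeating the start
year as the end year when only one year is given are all assumptions made
for illustration, not existing Rat behaviour.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Rat.
final class CopyrightElement {

    // "Copyright", an optional "(c)" or copyright sign, a start year,
    // an optional "-endYear", then the owner (the rest of the line).
    private static final Pattern STATEMENT = Pattern.compile(
            "(?i)copyright\\s*(?:\\(c\\)|\\u00A9)?\\s*(\\d{4})(?:\\s*-\\s*(\\d{4}))?\\s+(.+)");

    private CopyrightElement() {
    }

    /** Returns the XML element for the statement, or empty if it does not parse. */
    static Optional<String> fromStatement(String statement) {
        Matcher m = STATEMENT.matcher(statement);
        if (!m.find()) {
            return Optional.empty();
        }
        String start = m.group(1);
        // Assumption: a single year means start and end are the same.
        String end = m.group(2) != null ? m.group(2) : start;
        String owner = escape(m.group(3).trim());
        return Optional.of("<copyright start='" + start + "' end='" + end
                + "' owner='" + owner + "' />");
    }

    /** Escapes characters that are unsafe inside a single-quoted XML attribute. */
    private static String escape(String text) {
        return text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("'", "&apos;");
    }

    public static void main(String[] args) {
        fromStatement("Copyright (c) 2019-2023 Example Owner")
                .ifPresent(System.out::println);
    }
}
```

Strings that do not fit the pattern yield an empty result, so they could
still be reported as unparsed copyright metadata rather than silently
dropped.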

As always, I am looking for your thoughts.

Claude
