> On Dec 20, 2020, at 1:18 AM, Matt Casters <[email protected]>
> wrote:
>
> Thank you very much Julian.
> I mainly wonder where on earth that font comes from since we're not using
> it anywhere.
Yeah, fonts have a habit of sneaking in. :)
> As for rat exclusions: are there any particular file formats besides .java
> files that need an Apache license header? We'd be happy to add them
> elsewhere.
> The shell scripts perhaps as they support comments? We could even add them
> to the SVG files even though it will probably blow up memory consumption
> unless we code the comments out of the file loads somehow.
> Perhaps it's easier to just look at other projects and ask which files need
> a header?
My preference is to put a header on pretty much any file that can have a
header. Which in my experience is pretty much all text files, except those used
as test inputs or reference logs. For example, in .md files you can add the
header inside comments that do not appear in the generated HTML. Shell scripts,
pom files, properties files, etc. all support comments, so we should add
headers.
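To see which text files still lack a header, something like the following
might help. It is only a sketch under a few assumptions: the function name
missing_header is made up for illustration, it walks the tree with find rather
than asking git, it relies on grep -I to skip binaries, and it assumes the
header contains the standard ASF phrase "Licensed to the Apache Software
Foundation". RAT remains the authoritative check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print text files under a directory that do not
# contain the ASF header phrase. Binary and empty files are skipped.
missing_header() {
  local dir="${1:-.}" f
  find "$dir" -type f ! -path '*/.git/*' -print0 | sort -z |
  while IFS= read -r -d '' f; do
    # With -I, grep treats binary files as non-matching; an empty file has
    # no lines to match. Both are skipped here.
    grep -Iq '' "$f" || continue
    # Assumption: the header wording includes this phrase.
    grep -q 'Licensed to the Apache Software Foundation' "$f" || printf '%s\n' "$f"
  done
}
```

Running missing_header . from the top of a checkout would list the candidates,
and test inputs and reference logs could then be filtered out by path.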
I agree about SVG: I would not put a header on SVG files, because they are
treated as de facto binaries and they need to be small.
I suggest that for 0.60 we pare down the RAT exclusions to the absolute
minimum. RAT is a powerful tool if we are not holding it back! I ran RAT with
the -debug flag and I saw lots of Java files being excluded, and that was
concerning.
Binary files are always a problem. They are just as susceptible to copyright
and licensing issues but are more difficult to audit. One strategy is to audit
them one by one and add an exclusion line for each individual file. I know
that’s a big task, so definitely not for 0.50.
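As a starting point for the one-line-per-file approach, the list of binaries
could be generated rather than typed. The sketch below is an assumption-laden
illustration, not something I have run against Hop: the function name
binary_excludes is hypothetical, it treats any non-empty file that grep -I
considers binary as a binary, and it prints lines in the <exclude> form used by
the RAT Maven plugin's <excludes> list, assuming that is how the exclusions are
configured. Each printed line would still need a human to audit the file
before it goes into the pom.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one <exclude> line per binary file under a
# directory, with paths relative to that directory.
binary_excludes() {
  local dir="${1:-.}" f
  find "$dir" -type f ! -path '*/.git/*' -print0 | sort -z |
  while IFS= read -r -d '' f; do
    [ -s "$f" ] || continue          # empty files are not binaries
    grep -Iq '' "$f" && continue     # text file: it should get a header instead
    printf '<exclude>%s</exclude>\n' "${f#"$dir"/}"
  done
}
```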
By the way, I ran a command to find out what kinds of files are in Hop. The
results are interesting. There’s even one FoxPro file in there:
$ git ls-files -z | xargs -0 file -b | sort | uniq -c
2827 ASCII text
9 ASCII text, with CRLF, LF line terminators
47 ASCII text, with CRLF line terminators
3 ASCII text, with CR line terminators
16 ASCII text, with no line terminators
428 ASCII text, with very long lines
2 Big-endian UTF-16 Unicode text, with no line terminators
7 Bourne-Again shell script, ASCII text executable
1 Bourne-Again shell script, ASCII text executable, with very long lines
2 bzip2 compressed data, block size = 900k
2 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 10.0, Code page: 1252, Author: Matthias Hietland Heie, Last Saved By:
Sergio Ribeiro, Name of Creating Application: Microsoft Excel, Create
Time/Date: Fri Nov 17 14:48:53 2017, Last Saved Time/Date: Tue Jun 18 09:34:04
2019, Security: 0
2 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 10.0, Code page: 1252, Author: Sergio Ribeiro, Last Saved By: Sergio
Ribeiro, Name of Creating Application: Microsoft Excel, Create Time/Date: Tue
Sep 11 09:41:24 2018, Last Saved Time/Date: Tue Sep 11 10:20:56 2018, Security: 0
2 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 10.0, Code page: 1252, Author: Sergio Ribeiro, Last Saved By: Sergio
Ribeiro, Name of Creating Application: Microsoft Excel, Create Time/Date: Tue
Sep 11 09:41:24 2018, Last Saved Time/Date: Tue Sep 11 10:55:49 2018, Security: 0
2 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 1.0, Code page: -535, Author: JB, Revision Number: 3, Total Editing
Time: 02:08, Create Time/Date: Thu Oct 27 19:46:23 2011, Last Saved Time/Date:
Thu Feb 20 09:00:44 2014
2 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 5.0, Code page: 0
1 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 5.0, Code page: 1252, Author: Jens Bleuel, Last Saved By: Jens Bleuel,
Name of Creating Application: Microsoft Excel, Create Time/Date: Wed Aug 23
15:46:56 2006, Last Saved Time/Date: Wed Aug 23 15:56:14 2006, Security: 0
1 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 5.1, Code page: 1252, Author: Matt Casters, Last Saved By: Matt
Casters, Name of Creating Application: Microsoft Excel, Create Time/Date: Tue
Sep 7 16:08:18 2010, Last Saved Time/Date: Tue Sep 7 16:15:32 2010, Security: 0
2 Composite Document File V2 Document, Little Endian, Os: Windows,
Version 5.1, Code page: 1252, Last Saved By: Jens Bleuel, Name of Creating
Application: Microsoft Excel, Create Time/Date: Thu Oct 17 06:27:31 1996, Last
Saved Time/Date: Tue Nov 28 15:07:48 2006, Security: 0
5 C source, ASCII text
7 C++ source, ASCII text
25 CSV text
1 data
3 DOS batch file, ASCII text
1 Embedded OpenType (EOT), icomoon family
1 Embedded OpenType (EOT), OpenSansLight family
1 Embedded OpenType (EOT), OpenSansRegular family
28 empty
9 exported SGML document, ASCII text
1 FoxBase+/dBase III DBF, 279 records * 52, update-date 106-7-25,
codepage ID=0xf, at offset 161 1st record " 1das ist doch keine leistung
44.00hw * 2Meister 48"
2 GIF image data, version 89a, 16 x 16
1 GIF image data, version 89a, 9 x 9
1 gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT), original
size modulo 2^32 703
2 gzip compressed data, was "default.csv", last modified: Wed Aug 26
08:50:54 2015, from Unix, original size modulo 2^32 67
30 HTML document, ASCII text
1 HTML document, ASCII text, with very long lines
2 HTML document, UTF-8 Unicode text
1 ISO-8859 text
1 ISO-8859 text, with CR line terminators
3 ISO-8859 text, with very long lines
3179 Java source, ASCII text
1 Java source, ASCII text, with CRLF, LF line terminators
1 Java source, ASCII text, with very long lines
13 Java source, UTF-8 Unicode text
1 JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment
length 16, progressive, precision 8, 400x400, components 3
29 JSON data
2 Little-endian UTF-16 Unicode text, with CRLF line terminators
2 Little-endian UTF-16 Unicode text, with no line terminators
10 Microsoft Excel 2007+
1 Microsoft OOXML
1 MS Windows icon resource - 1 icon, 32x32, 24 bits/pixel
2 MS Windows icon resource - 1 icon, 32x32, 32 bits/pixel
3 Non-ISO extended-ASCII text, with no line terminators
5 OpenDocument Spreadsheet
1 PNG image data, 1244 x 686, 8-bit/color RGB, non-interlaced
1 PNG image data, 1460 x 816, 8-bit/color RGB, non-interlaced
2 PNG image data, 15 x 15, 8-bit/color RGBA, non-interlaced
1 PNG image data, 1680 x 1050, 8-bit/color RGB, non-interlaced
3 PNG image data, 16 x 16, 8-bit/color RGBA, non-interlaced
2 PNG image data, 22 x 22, 8-bit/color RGB, non-interlaced
1 PNG image data, 403 x 138, 8-bit/color RGB, non-interlaced
4 PNG image data, 4702 x 1702, 8-bit/color RGB, non-interlaced
4 PNG image data, 5010 x 1990, 8-bit/color RGB, non-interlaced
1 PNG image data, 551 x 626, 8-bit/color RGB, non-interlaced
1 PNG image data, 642 x 368, 8-bit/color RGBA, non-interlaced
1 PNG image data, 972 x 464, 8-bit/color RGB, non-interlaced
3 ReStructuredText file, ASCII text
1 ReStructuredText file, ASCII text, with very long lines
2 SAS
654 SVG Scalable Vector Graphics image
1 TIFF image data, big-endian, direntries=16, height=16, bps=0,
compression=none, PhotometricIntepretation=RGB, orientation=upper-left, width=16
1 TrueType Font data, 11 tables, 1st "OS/2", 14 names, Macintosh, type 1
string, icomoon
1 TrueType Font data, 18 tables, 1st "FFTM", 26 names, Macintosh
1 TrueType Font data, 18 tables, 1st "FFTM", 30 names, Macintosh
2 Unicode text, UTF-32, big-endian
2 Unicode text, UTF-32, little-endian
385 UTF-8 Unicode text
2 UTF-8 Unicode text, with no line terminators
40 UTF-8 Unicode text, with very long lines
2 UTF-8 Unicode (with BOM) text, with no line terminators
1 Visual FoxPro DBF, 2 records * 205, update-date 15-10-20, at offset 129
1st record "value11
"
1 Web Open Font Format, TrueType, length 1168, version 1.0
1 Web Open Font Format, TrueType, length 67528, version 1.10
1 Web Open Font Format, TrueType, length 69392, version 1.10
958 XML 1.0 document, ASCII text
1 XML 1.0 document, ASCII text, with CRLF, LF line terminators
82 XML 1.0 document, ASCII text, with very long lines
1 XML 1.0 document, ASCII text, with very long lines, with no line
terminators
1 XML 1.0 document, UTF-8 Unicode text
2 XML 1.0 document, UTF-8 Unicode text, with very long lines
1 XML 1.0 document, UTF-8 Unicode (with BOM) text
2 Zip data (MIME type "application/vnd.pentaho.reporting.classic"?)
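The list above is long partly because file -b prints per-file details (image
dimensions, spreadsheet authors and so on), so near-identical files land on
separate lines. A shorter summary might come from asking file for MIME types
instead. This is a sketch only: the function name mime_summary is made up, it
uses find so that it also works outside a git checkout, and the exact MIME
types reported depend on the installed version of file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count files under a directory by MIME type,
# most common type first.
mime_summary() {
  local dir="${1:-.}"
  find "$dir" -type f ! -path '*/.git/*' -print0 |
    xargs -0 file -b --mime-type |   # one MIME type per file, no file names
    sort | uniq -c | sort -rn        # tally, then order by count
}
```

Inside a checkout, the find stage could be swapped for git ls-files -z so that
only tracked files are counted.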
Julian