Thanks, I did not realize that Coverage -> Match[n] could be that useful!
Though the field Match.Name is not a file name I can os.Open().
How can I directly access the known license texts?
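
A minimal sketch of the offset-based extraction Dan suggests in the quoted message below: slice the input by a match's start and end offsets, then compare the length of that slice with the length of the known license text. The span type and the licenseText argument are hypothetical stand-ins. licensecheck.Match does carry Start and End, but where the reference license text comes from is exactly the open question above, and a byte-length ratio is only a rough proxy for a real match percentage.

```go
package main

import "fmt"

// span is a hypothetical stand-in for the Start and End offsets that
// licensecheck.Match reports for a match inside the scanned input.
type span struct {
	Start, End int
}

// extract returns the part of input covered by s, or nil if the
// offsets do not describe a valid range of input.
func extract(input []byte, s span) []byte {
	if s.Start < 0 || s.End > len(input) || s.Start > s.End {
		return nil
	}
	return input[s.Start:s.End]
}

// lengthRatio reports the length of the matched portion as a percentage
// of the length of the reference license text. It is only a rough
// length comparison, not a text similarity measure.
func lengthRatio(portion, licenseText []byte) float64 {
	if len(licenseText) == 0 {
		return 0
	}
	return 100 * float64(len(portion)) / float64(len(licenseText))
}

func main() {
	// Placeholder license text followed by unrelated Go code.
	license := []byte("Permission is hereby granted, free of charge")
	input := append(append([]byte{}, license...), []byte("\npackage main\n")...)

	// Assume the matcher reported the license at the start of the input.
	s := span{Start: 0, End: len(license)}
	portion := extract(input, s)
	fmt.Printf("matched portion is %.0f%% of the license length\n",
		lengthRatio(portion, license))
}
```

The extracted portion could also be passed back to Cover on its own, which is the second option Dan mentions.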


On Thu, Nov 14, 2019 at 10:42 AM Dan Kortschak <d...@kortschak.io> wrote:
>
> The licensecheck.Match type holds the start and end offsets in the
> file. Can't you use that to extract the license portion and either
> check its length against the length of the license or repeat the Check
> with only that portion of the file?
>
> On Thu, 2019-11-14 at 10:24 +0100, fge...@gmail.com wrote:
> > Sorry if I was not clear: on walking the file system, that's clear, I
> > did not intend to talk about that, only about matching and reporting
> > on matching. The example I gave was just to put in context why I
> > believe I'd need a different api.
> >
> > Using the Options field is good enough in the first example. (That's
> > how I used licensecheck first.)
> > Although for the second example Cover() does not report what I'd
> > need.
> >
> > As far as I've seen currently using func Cover(INPUT []byte, opts
> > Options) (Coverage, bool) reports 100% MIT if INPUT matches byte for
> > byte 100% MIT. If INPUT has more text than the complete 100% matching
> > > text of the MIT license, for example the MIT license is only in the
> > > beginning of INPUT and the rest of INPUT is for example Go code, then
> > Coverage will report len(INPUT)/len(MIT license) which is less than
> > 100%.
> >
> > > In this case, the new api would report 100%, since input contains 100%
> > > MIT license text (and some programming code, which is not relevant
> > > here).
> >
> > If I understand correctly the current api is for checking _already_
> > identified license files, which contain _only_ the license text.
> > > I believe to look for files containing - complete or possibly broken -
> > > license references a different matching is needed.
> >
> >
> > On 11/14/19, Rob Pike <r...@golang.org> wrote:
> > > > As I understand what you're trying to do, you just need to write a tree
> > > > walker, perhaps using filepath.Walk, that opens each file and calls Cover
> > > > on it. You can set the Options field to control the threshold for
> > > > reporting, and use the result of that to choose which licenses to report.
> > >
> > > I don't believe an API change is called for.
> > >
> > > -rob
> > >
> > >
> > > On Thu, Nov 14, 2019 at 6:14 PM <fge...@gmail.com> wrote:
> > >
> > > > > func Cover(input []byte, opts Options) (Coverage, bool) in
> > > > > licensecheck currently reports len(input)/len(one of the licenses) for
> > > > > each known license. I'd need for all known licenses len(known
> > > > > license)/len(license reference in input).
> > > >
> > > > > I'd like to scan >100000 files (possibly a lot more), where some of
> > > > > them (<0.1%) contain full or partial known license texts.
> > > >
> > > > > An example scenario for an example /src, containing >100000 files:
> > > > > $ listlicenses /src     # to get an overview of 100% matching license references
> > > > > LGPL-2.1
> > > > > MIT
> > > > > $ listlicenses -details /src     # same tree, more detailed output, to see the details
> > > > > /src/license refers 100% MIT   # the bytes in /src/license correspond one for one for the MIT license
> > > > > /src/fonts/LICENSE refers 100% MIT   # the bytes in /src/fonts/LICENSE correspond one for one for the MIT license
> > > > > /src/a/Notice refers 100% LGPL-2.1   # same as above with LGPL-2.1
> > > > > /src/a/b/whatever.go refers 94% GPL2   # most probably a broken license reference in whatever.go, maybe someone inadvertently deleted the last word from the lines containing the GPL2 license text. Needs human inspection to check what's the license situation with whatever.go
> > > > > /src/c/ConfusingLicenseReferences.c refers 7% ZLIB   # ConfusingLicenseReferences.c has most probably a false positive report for reference to ZLIB
> > > > > /src/c/ConfusingLicenseReferences.c refers 65% MIT   # ConfusingLicenseReferences.c has only 65% of MIT, the author intended to refer to MIT, but some inadvertent edit later broke the license reference in ConfusingLicenseReferences.c
> > > >
> > > > > Command listlicenses iterates over all files in the subtree, gathering
> > > > > all full or partial (broken) license references. Command listlicenses
> > > > > uses the functionality similar to github.com/google/licensecheck to
> > > > > check the files in the file system.
> > > >
> > > >
> > > >
> > > > thanks!
> > > >
> > > > On 11/13/19, Rob Pike <r...@golang.org> wrote:
> > > > > > Can you please explain in more detail what you're asking for? I don't
> > > > > > understand the problem you have or why the current package cannot
> > > > > > handle it.
> > > > >
> > > > > -rob
> > > > >
> > > > >
> > > > > On Wed, Nov 13, 2019 at 7:05 PM <fge...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > >  "licensecheck classifies license files and heuristically determines
> > > > > > > how well they correspond to known open source licenses."
> > > > > > >
> > > > > > > I'd like to identify license references in the file system. If I
> > > > > > > understand correctly package licensecheck in its current form is not
> > > > > > > useful to help with this.
> > > > > > > If it's still possible, could you please share a hint how to do that?
> > > > > > > (input: byte array, output: license references in the byte array)
> > > > > > > If I understand correctly and I can't use licensecheck in its current
> > > > > > > form, which one is preferred:
> > > > > > > extend current api, (maybe: func Refers(input []byte) (References,
> > > > > > > bool)) or fork+rename the package? (References{...} being similar to
> > > > > > > Coverage{...})
> > > > > >
> > > > > > thanks,
> > > > > > Gergely Födémesi
> > > > > >
> > > > > > --
> > > > > > You received this message because you are subscribed to the
> > > > > > Google
> > > >
> > > > Groups
> > > > > > "golang-nuts" group.
> > > > > > To unsubscribe from this group and stop receiving emails from
> > > > > > it, send
> > > >
> > > > an
> > > > > > email to golang-nuts+unsubscr...@googlegroups.com.
> > > > > > To view this discussion on the web visit
> > > > > >
> > > >
> > > >
> https://groups.google.com/d/msgid/golang-nuts/CA%2BctqrqKKUPTHihMLhLTH5O-tBm1qENQV6y41Qwde4jHp1kNmA%40mail.gmail.com
> > > > > > .
> > > > > >
> >
> >
>

