Hi Jonas,

On 6/22/21 5:22 AM, Jonas Smedegaard wrote:
Quoting Walter Lozano (2021-06-16 14:44:01)
On 6/16/21 2:50 AM, Jonas Smedegaard wrote:
Quoting Walter Lozano (2021-06-16 04:12:23)
On 6/15/21 9:17 PM, Jonas Smedegaard wrote:
Quoting Walter Lozano (2021-06-15 20:42:53)
As a user of licensecheck I found it does not provide
deterministic results in some circumstances. An example of this
is gnutls28/m4/ax_code_coverage.m4 which is detected as UNKNOWN
or LGPL.

After some debugging I found that the root cause could be in
libregexp-pattern-license-perl. I have proposed a fix, which you
can find at

https://salsa.debian.org/perl-team/modules/packages/libregexp-pattern-license-perl/-/merge_requests/1

I hope you can help me to clarify this issue.
Great - thanks a lot!

I suspect that this might be bug#982849.
Yes, it looks like exactly the same issue I faced. I hope you can
confirm and fix it.
I will certainly do that.
In relation to this, I find that the problem is more evident at least
after these commits, which are related to versioning

   * eddc64dd1f0e6f9bd1769ef580a217aa4be762b8 (synthesize subject pattern
     name: optimize version matching)
   * cd75d77da201260bc9deef4631d5c4d3a42fa41d (add license patterns
     lgpl_2 lgpl-2_1 lgpl-3)

I hope this information is useful.
Thanks.  You are right that those commits are directly related to the
issue - but not the cause, it turned out:

At build-time, the library composes regular expressions from metadata
(what I call "synthesizing").  If done right, the order of stepping
through and synthesizing objects should not matter - but the synthesizing
logic was buggy in three places:

a) Synthesizing metadata from single-version object (e.g. "lgpl_2_1") as
regex patterns in versioned object (e.g. "lgpl") cannot be fully random,
but must wait till after the single-version object has been synthesized.
Now fixed in commit 2ec7af9eb0fdf72711eeb2689a6726b5ff30f82d

b) Only a subset of metadata from single-version object was synthesized.
Now fixed in commit bfd071032a88fd2d56e20b3a7ef092524dc3491a

With those two underlying bugs fixed, the library should now build its
DefHash structure deterministically.
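
The ordering problem in (a) can be illustrated with a minimal sketch. This is not the library's actual code: the `METADATA` table, the `synthesize` function and the pattern strings are all hypothetical, and Python is used only for illustration. The sketch assumes that a versioned object ("lgpl") builds its pattern from whatever single-version objects have already been synthesized, so visiting the objects in an arbitrary order gives varying results unless the versioned object is deferred.

```python
import re

# Hypothetical metadata, loosely modelled on the names in this thread.
# Single-version objects carry a version; the versioned object "lgpl"
# is derived from them.
METADATA = {
    "lgpl": {"derived_from": ["lgpl_2", "lgpl_2_1", "lgpl_3"]},
    "lgpl_2": {"version": "2"},
    "lgpl_2_1": {"version": "2.1"},
    "lgpl_3": {"version": "3"},
}


def synthesize(order, defer_versioned):
    """Build one regex string per object, visiting keys in `order`."""
    if defer_versioned:
        # Stable sort: single-version objects first, versioned ones last.
        order = sorted(order, key=lambda name: "derived_from" in METADATA[name])
    patterns = {}
    for name in order:
        meta = METADATA[name]
        if "derived_from" in meta:
            # Only sources that were already synthesized can contribute.
            parts = [patterns[src] for src in meta["derived_from"] if src in patterns]
            patterns[name] = "(?:" + "|".join(parts) + ")" if parts else None
        else:
            patterns[name] = r"LGPL\s+version\s+" + re.escape(meta["version"])
    return patterns
```

With `defer_versioned=True` the result is the same for any visiting order; with `defer_versioned=False` the "lgpl" entry depends on which single-version objects happened to be visited before it.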

...but the structure now has richer versioned objects, which revealed
another bug in Licensecheck:

Licensecheck looks for more specific objects first - first singleversion
objects with optional trailer (e.g. "lgpl_2_1" + "version 2.1" + "or any
later"), and then versioned object with optional trailer (e.g. "lgpl" +
"version 2.1" + "or any later").

Notice the bug?  For singleversion objects it should skip the version
part of a trailer (i.e. only e.g. "lgpl_2_1" + "or any later").

So Licensecheck would fail to detect "or later" for singleversion
objects because it bogusly looked for double version, and would then
succeed in detecting "or later" with the more general versioned object -
as long as that was crippled to miss the version on its own, so that
version was part of the trailer.
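
The double-version problem can be sketched as follows. This is a simplified illustration, not Licensecheck's real matching logic: the regex fragments, the `detect_single` function and its `skip_version` flag are hypothetical, and only the single-version half of the lookup is modelled.

```python
import re

# Hypothetical pattern fragments, for illustration only.
SINGLE = r"GNU Lesser General Public License,?\s+version\s+2\.1"  # "lgpl_2_1"
VERSION = r"(?:,?\s+version\s+2\.1)"                               # version trailer
OR_LATER = r"(?:,?\s+or\s+(?:any\s+)?later)"                       # "or later" trailer


def detect_single(text, skip_version):
    """Return (name, or_later_detected) for the single-version object."""
    # Buggy variant: the trailer repeats the version that the
    # single-version pattern already contains.
    trailer = OR_LATER if skip_version else VERSION + OR_LATER
    match = re.search(SINGLE + "(" + trailer + ")?", text)
    if match is None:
        return None
    return ("lgpl_2_1", match.group(1) is not None)


TEXT = "GNU Lesser General Public License, version 2.1 or any later version"
buggy = detect_single(TEXT, skip_version=False)
fixed = detect_single(TEXT, skip_version=True)
```

In this sketch the buggy variant still recognizes the license but misses the "or later" part, because the text contains the version only once; skipping the version in the trailer lets the same text match with "or later" detected.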

If you are still with me in all this (I am not good at describing this,
I realize that), you can imagine how frustrated I have been trying to
figure out what was really failing - until you pointed out the one place
where I could make the build-time (still wrong but at least) deterministic.

Thanks a lot!

Thank you for your detailed explanation. I cannot follow it completely, but I can follow the high-level idea. I was sure that the issue was related to how license versioning was handled, but my limited experience with Perl and with these particular modules made it impossible for me to go deeper. So I set myself the goal of at least filing a bug report that would be really useful for you and provide a basis for your investigation, mainly by pointing to what was most evident to me: the non-deterministic output.

I'm really happy that this report was helpful.

New releases went out upstream to CPAN last night, and I expect to
release packages for Debian today.  Unfortunately too late to be
included with the upcoming Bullseye release of Debian.

Thanks again!

Regards,

Walter
