On 11/17/22 07:40, Yuchen Pei wrote:
> On Fri 2022-11-11 15:27:41 -0600, Jacob K wrote:
>
>> Hello, thanks for the explanation.
>>
>> On 11/9/22 19:44, Yuchen Pei wrote:
>>
>>> Hello,
>>>
>>> Thanks for the detailed report.
>>>
>>> On Tue 2022-11-08 17:07:18 -0600, Jacob K wrote:
>>>
>>>> [...]
>>>
>>> LibreJS removes the query part of a script url as a preprocessing in most (if not all) functions handling scripts. This means if you whitelist https://foo.com/bar.js, https://foo.com/bar.js?blah is also let through. OTOH without such whitelisting, https://foo.com/bar.js?blah is blocked as usual if it is not labelled. This is because the response processor checks the external script and rewrites it to /* LibreJS: script blocked ... */.
>>>
>>> I suspect the reason for discarding the query part is to avoid having to whitelist all possible query strings which can be tedious. Perhaps a better approach is to refine the whitelisting facility to allow patterns like globbing and regexes.
>>
>> Would it make sense to generally keep handling query strings the same, but make the link the user clicks on go to the version with the query string included (possibly with a warning that there is a query string and that whitelisting the script will whitelist all query strings)? That way clicking "Show" next to a script will always take the user to the currently blocked or running script.
>
> Definitely. Patches welcome, otherwise I'll work on it when I get time.
Thank you. I'm not sure if I'll send a patch, as I'm not familiar with LibreJS development, but if I happen to have extra time I may look into understanding LibreJS so I can contribute. (I might have more free time this month, but there are also other things I want to work on, so maybe not.)
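To make the proposal concrete, here is a minimal sketch in Python (LibreJS itself is JavaScript, and this is not its code). It assumes the whitelist is a plain set of query-less URLs; the names `WHITELIST`, `strip_query`, `is_whitelisted` and `show_link` are hypothetical and exist only for illustration. Matching still ignores the query string, but the "Show" link keeps the full URL and carries a warning.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical user whitelist; the real storage format is not shown in this thread.
WHITELIST = {"https://foo.com/bar.js"}


def strip_query(url: str) -> str:
    """Return the URL with its query string removed."""
    return urlsplit(url)._replace(query="").geturl()


def is_whitelisted(url: str) -> bool:
    """Match on the query-less URL, so every query string is let through."""
    return strip_query(url) in WHITELIST


def show_link(url: str) -> Tuple[str, Optional[str]]:
    """Return the full URL for the "Show" link, plus a warning if it has a query string."""
    warning = None
    if urlsplit(url).query:
        warning = (
            "Whitelisting this script will whitelist every query string of "
            + strip_query(url)
        )
    return url, warning
```

With this, `show_link("https://foo.com/bar.js?blah")` points at the script that is actually blocked or running, while the whitelist entry stays `https://foo.com/bar.js`.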
>>>> Ideally, I think LibreJS should store checksums of scripts, but it seems like it only does this for inline scripts currently?
>>>
>>> LibreJS does use hashes of scripts, but only in the built-in whitelist (see /utilities/hash_script/whitelist).
>>>
>>> Best,
>>> Yuchen
>>
>> Slightly off-topic, but is there a good system set up to add new scripts to the internal whitelist? I often see free libraries that are not recognized by LibreJS, and it seems like a group of motivated users might be better at labeling them than the library developers, at least when the library developers do not care about LibreJS.
>
> There isn't one yet, but I've been thinking about how to improve the script recognition. One idea is to set up a server program that maintains a database of webpages and the external scripts used in these webpages. Users can submit a url containing only free js, and the server will run the headless compliance check on the page, display the check results to the user, and record the results (librejs version, webpage url, script urls, script hash, status of each script - accepted or rejected, reason for acceptance (what licenses) / rejection).
>
> The server will provide API endpoints for listing fully compliant urls, and statistics of scripts (e.g. counts which indicate well-knownness / popularity of the scripts). The former can be used by users for discovery of nice websites, and the latter can be used by librejs users to whitelist scripts by hashes / names and by librejs developers to decide which mechanisms to add for more recognition (for example, if 99% of the unrecognised scripts are annotated using spdx, then maybe it makes sense to add a user option in librejs to enable spdx, despite the problems with the lack of license headers in spdx annotations). Librejs can also simply download the database from the server, and provide user options to auto whitelist scripts by hash (e.g. set a threshold for the counts).
>
> The tricky part is how do we make sure the server only contains free scripts. FSD has a review process, but we probably want something faster.
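The record layout and the count threshold described above could look roughly like the following Python sketch. The field names mirror the list in the quoted message, but the class `CheckResult`, the functions `hash_script` and `hashes_to_whitelist`, and the choice of SHA-256 are all assumptions made for illustration; no such server exists yet. "Count" is taken here to mean the number of distinct pages on which a script hash was accepted.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class CheckResult:
    """One row recorded by the (hypothetical) server after a compliance check."""
    librejs_version: str
    page_url: str
    script_url: str
    script_hash: str
    accepted: bool
    reason: str  # licenses found if accepted, otherwise why it was rejected


def hash_script(source: str) -> str:
    """Hash a script body. SHA-256 is an assumption, not something the thread specifies."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def hashes_to_whitelist(results: Iterable[CheckResult], threshold: int) -> Set[str]:
    """Return hashes accepted on at least `threshold` distinct pages."""
    pages_by_hash: Dict[str, Set[str]] = defaultdict(set)
    for result in results:
        if result.accepted:
            pages_by_hash[result.script_hash].add(result.page_url)
    return {h for h, pages in pages_by_hash.items() if len(pages) >= threshold}
```

A client that downloaded the database could then call `hashes_to_whitelist(rows, threshold=2)` with a threshold chosen by the user. Whether the rows themselves can be trusted is a separate question.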
I have participated in some FSD meetings, and although we often see really big software with lots of dependencies that takes a long time to verify, we also see lots of simple software with no dependencies (or only known-free dependencies) that is verified and added to the FSD rather quickly. I think that if the review process does not need to look at most dependencies, then it could be quite fast. LibreJS will usually block nonfree dependencies anyway; an exception would be things like emulators, where a free script might, for example, download a nonfree game not written in JavaScript.
I don't think there is an automatic way to detect whether JavaScript is really source or not, but there are lots of automatic ways to detect licenses (REUSE, SPDX, scancode-toolkit, etc.).
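As a rough illustration of the automatic side, here is a small Python sketch that looks for an `SPDX-License-Identifier:` tag in a script and checks it against a short list. The function names and the `KNOWN_FREE` set are hypothetical, and the set is only a tiny illustrative sample, not an authoritative list of free licenses. It reads the tag as a single identifier, so compound SPDX expressions (with `OR`, `AND`, `WITH`) are not handled.

```python
import re
from typing import Optional

# Matches e.g. "// SPDX-License-Identifier: MIT" or "/* SPDX-License-Identifier: MIT */".
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([^\n\r*]+)")

# Illustrative sample only; a real tool would need a maintained list.
KNOWN_FREE = {"GPL-3.0-or-later", "MIT", "Apache-2.0"}


def find_spdx_identifier(source: str) -> Optional[str]:
    """Return the first SPDX license tag in the script, or None if there is none."""
    match = SPDX_TAG.search(source)
    return match.group(1).strip() if match else None


def has_known_free_tag(source: str) -> bool:
    """True if the script carries an SPDX tag from the illustrative list."""
    return find_spdx_identifier(source) in KNOWN_FREE
```

This only reports what the file claims about itself. It says nothing about whether the file is really source (for example, minified code can carry the same tag).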
> One problem is the server is basically an SaaSS. The server program will be free and easy to self-host, but we'll probably want one central server with THE database. The server runs the librejs headless compliance check, which is computation the user can do on their own computer. Alternatively the server can simply take user input for compliance results, but then users may make mistakes, and this opens the door to more spam and inaccuracies.
>
> Best,
> Yuchen
I think, if the server is basically SaaSS, then that means it should be possible to have everyone run the server locally, perhaps even integrated into the extension. But if the server also shows things like popularity and manual reviews by trusted reviewers, then I think it is not SaaSS, as it wouldn't make sense to run that on each individual's computer.
