There is also this document by Debian's Deep Learning Team that is worth looking at:
https://salsa.debian.org/deeplearning-team/ml-policy/-/blob/master/ML-Policy.rst

There they make a distinction between a model and its artifacts: if a model is
non-free or trained on non-free data, they consider its output artifacts to be
proprietary.

Andrius

On Tuesday, 26 March 2024 16:33:56 GMT Volker Krause wrote:
> On Monday, 25 March 2024 15:17:48 CET Halla Rempt wrote:
> > We're looking into adding an experimental AI-based feature to Krita:
> > automated inking. That gives us three components, and we're not sure about
> > the license we should use for two of them: the model and the dataset. Would
> > CC be best here?
>
> Looking at https://community.kde.org/Policies/Licensing_Policy the closest
> thing would either be "media" files (generalized to "data files") and thus
> CC-BY-SA (and presumably CC-BY/CC0) or "source code" (xGPL, BSD/MIT).
>
> I think this is a bit more tricky though, depending on whether we assume a
> model is derivative work of the input data, and whether the output generated
> from a model is derivative work of the model (and thus potentially
> derivative work of the input data). The industry assumption so far seems to
> be that at least one of those isn't derivative work (AFAIK that has yet to
> be legally tested though), but I'm not sure that interpretation is in the
> best interest of FOSS developers or artists...
>
> One scenario that would work regardless I think is using a license with
> practically no constraints (CC0, MIT, etc), but that also offers no
> protection for the training or model data (which might or might not be what
> you want).
>
> Any other scenario I can think of involving more protective licenses runs
> into interesting issues:
> - if the output is derivative work, Krita users would be bound by e.g. the
> attribution or share-alike requirements of the license (which I guess is not
> what you want).
> - a Bison/Flex style "code generator exception" to state that the model
> output is free of any license requirements regardless of the model license
> itself requires that either the model isn't derivative work of the input or
> that the input data is licensed in a way compatible with that.
> - In the latter case we are back to essentially unprotected CC0-like input,
> or a protective license with a special exception, which then gets awfully
> close to developing new licenses.
>
> So I guess this boils down to how much protection you have in mind for the
> input and model data?
>
> Interesting topic, sorry if my ramblings on this are of limited help :)
>
> Regards,
> Volker