I am away from my key (travelling), but I'll second this option as soon as
I can. I agree it is much clearer than Mo's proposal.


Thomas Goirand (zigo)


On Apr 24, 2025 05:44, Thorsten Glaser <[email protected]> wrote:

>

> -----BEGIN PGP SIGNED MESSAGE----- 

> Hash: SHA384 

>

> Cover letter 

> ============ 

>

> (Please do keep me in Cc, I’m not subscribed to the list.) 

>

> Hi! I had not realised it’s going to GR with this, so I’ve drafted 

> a counter proposal, based on the thread on debian-private around 

> <[email protected]> and 

> earlier thoughts I’ve collected regarding this topic, such as on 

> https://evolvis.org/~tg/cc.htm and the interpretation guidelines 

> on https://mbsd.evolvis.org/MirOS-Licence.htm (this is a mirror on 

> a more capable VM). 

>

> I’m not sure how quickly I’ll need seconds, but I would also welcome 

> input on this proposal (including from the l10n-en team as I’m not a 

> native English speaker). 

>

> I’m PGP-signing this with my DD key so that, for the avoidance of doubt,

> should time indeed be short, I’m submitting this as a choice. If time

> isn’t short, I’m submitting it tentatively, with the option of working

> in feedback and updating it first.

>

>

> Counter-Proposal -- Interpretation of DFSG on (AI) Models 

> ========================================================= 

>

> Please see the original proposal for background on this. 

>

> The counter-proposal is as follows: 

>

> The Debian project requires the same level of freedom for AI models 

> as it does for other works entering the archive.

>

> Notably: 

>

> 1. A model must be trained only from legally obtained and used works, 

>    honour all licences of the works used in training, and be licenced 

>    under a suitable licence itself that allows distribution, or it is 

>    not even acceptable for non-free. This includes an understanding 

>    that “generative AI” outputs are derivative works of their inputs

>    (including training data and the prompt), insofar as these pass the

>    threshold of originality; that is, generative AI acts similarly to

>    a lossy compression followed by decompression, or to a compiler.

>

>    Any work resulting from generative use of a model can at most be 

>    as free as the model itself; e.g. programming with a model from 

>    contrib/non-free assisting prevents the result from entering main. 

>

>    The "/usr/share/doc/PACKAGE/copyright" file must include copyright 

>    notices from all training inputs as required by Policy for “any 

>    files which are compiled into the object code shipped in the binary 

>    package”, except for inputs already separately packaged (such as 

>    the training software, libraries, or inputs already available from 

>    packages such as word lists also used for spellchecking). 

>

>    Regarding availability of sources used for training, the normal 

>    rules of the non-free archive apply. 

>

> 2. Models are not suitable for the non-free-firmware archive.

>

> 3. For a model to enter the contrib archive, it may at runtime require 

>    components from outside of Debian main, but the model itself must 

>    still comply with the DFSG, i.e. follow the requirements below for

>    models entering main. If a model requires a component outside of 

>    main at build or training time, it is only admissible to non-free. 

>

> 4. For a model to enter the main archive, all works used in training 

>    must additionally be available, auditable, and under DFSG-compliant 

>    licencing. All software used to do the training must be available 

>    in Debian main. 

>

>    If the training happens during package build, the sources must be 

>    present in Debian packages or in the model’s source packages; if 

>    not, they must still be available in the same way. 

>

>    This is the same rule as is used for other precompiled works in 

>    Debian packages that are not regenerated during build: they must 

>    be able to be regenerated using only Debian tools; waiving the

>    requirement to actually do the regenerating during package build 

>    is a nod to realistic build time and resource usage. 

>

> 5. For a model to enter the main archive, the model training itself 

>    must *either* happen during package build (which, for models of 

>    a certain size, may need special infrastructure; the handling of 

>    this is outside of the scope of this resolution), *or* the model 

>    resulting from training must build in a sufficiently reproducible 

>    way that a separate rebuilding effort from the same source will 

>    result in the same trained model. (This includes using reproducible 

>    seeds for PRNGs used, etc.) 

>

>    For realistic achievability of this goal, the reproducibility 

>    requirement is relaxed to not require bitwise equality, as long 

>    as the resulting model is effectively identical. (As a comparison, 

>    for C programs this would be equivalent to allowing different 

>    linking order of the object files in the binary or embedded 

>    timestamps to differ, or a different encoding of the same opcodes 

>    (like 31 C0 vs. 33 C0 for i386 “xor eax,eax”), but no functional 

>    changes as determined by experts in the field.) 

>

> 6. For handling of any large packages resulting from this, the normal

>    processes are followed (such as discussing in advance with the 

>    relevant teams, ensuring mirrors are not over-burdened, etc). 

>

> The Debian project asks that training sources not be obtained

> unethically, and that the ecological impact of training and using 

> AI models be considered. 

>

> [End of proposal.] 

>

> -----BEGIN PGP SIGNATURE----- 

>

> iQIcBAEBCQAGBQJoCV23AAoJEHa1NLLpkAfgfQcP/jDN+p+rY0fPhQUZ/HpJadkJ 

> BawiUYp+TMjsXowrXXy9Mp7FyrlWrj+zROfA1tup2+TkdlQSY8A62aWYS62y5z9y 

> x5TxqwS3+xH6UmtchmX7alxy7u9vUrcsdUM9NKt1DZQANyqq8+pVTpMKauNNsXr+ 

> L8zq/37ludyjCf+c9pnJ066CUaLBBMQGWmfPO8c1mjYWNnACXgYuUH1cw8Sgzr5u 

> vQrdURGfebrmTCQBbmCO5FOzQ3Q/uLjl5CocC8HWF0TBh7vcVtnYCkrvalECJpO5 

> PlCMUZ0MApuEJ1UTUcj+5lDxdH02dcMdFd7v+OB7+E5Jr+MHDR0wWoVaScm9MYno 

> Eip0sxbzVRqozeAH5bKKSaIQN+4KL/pVB2bYxwR4N5/W/9cxDsJmF/uoB1lZNtL8 

> DOvLar3RmHNVbaXin/E3afhw5L3O7JeppTSCby9Unyow8hmRjfjhz//ApEbOrWfv 

> CNH7sdM2mkEe0SXoxLyX7wfmZuWQ2SUZ4nwbj3vmHvM6jrVragCJxibQyVEIzuSQ 

> 1FB0MsFa1TrYN4tnR7/q9AiskcHKiTwcdJh0LFCiLZ2F2d2sd4ne60qQTCpmjzzG 

> WkhgeTOeLPCDgkHmC+oUEzGpQruKI/surQ9NSGWbFDyEPTGf9rVzMNlVRp0jJSob 

> 2PclqIcmvlO8Krw+9klA 

> =U1FJ 

> -----END PGP SIGNATURE----- 

>
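
As an illustration of point 5 of the quoted proposal, here is a minimal
sketch of what "reproducible seeds for PRNGs" and "effectively identical,
not bitwise equal" could look like in practice. It is not part of the
proposal. The toy training loop, the function names train() and
effectively_identical(), and the numeric tolerance are all assumptions made
for this example; the proposal itself leaves the judgement to "experts in
the field" and does not name any numeric criterion.

```python
import math
import random


def train(seed, steps=200, lr=0.05):
    """Toy 'training': fit y = 2x + 1 by SGD.

    All randomness comes from one explicitly seeded PRNG, so a rebuild
    with the same seed walks through the same sequence of samples.
    """
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(steps):
        x = rng.uniform(-1, 1)
        err = (w * x + b) - (2 * x + 1)
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def effectively_identical(a, b, tol=1e-9):
    """Hypothetical check: same shape, every weight within an absolute
    tolerance. The tolerance value is an assumption of this sketch."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=tol) for x, y in zip(a, b)
    )


# Two independent "rebuilds" from the same source and the same seed.
first = train(seed=42)
second = train(seed=42)
print("rebuild matches:", effectively_identical(first, second))
```

A real model would need a comparison that its domain experts accept
(for example behaviour on a fixed evaluation set), so treat the per-weight
tolerance above purely as a placeholder for that decision.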

