Hey all, a quick note: as someone outside the FFmpeg project who just happened to contribute a patch, here is my understanding of the legal situation:

1) Strictly speaking, nobody knows what the legal status of LLMs is going to be. The big LLM providers are trying hard to establish precedent so that, when the actual laws are adapted, they will reflect current practice; the providers therefore try very hard to establish "as practice" whatever is beneficial to themselves.

2) It is very instructive to look at the process that ended with software falling under copyright law. This is much more recent than people think: the CONTU commission ran from 1974 to 1978, and it wasn't until 1980 that the law putting software firmly under the copyright regime we know today was passed. If you love copyright law, you can find their meeting notes online.

3) Under a strict interpretation of current copyright law, LLM weights cannot be copyrighted (they are derived by applying a formula to data, not by a creative act). This hasn't stopped the LLM companies from attaching license terms to their releases, as if copyright applied. The goal here is to establish precedent so that in the future LLM weights will be deemed copyrightable.

4) There are valid arguments that - if LLM weights are copyrightable - they might be derivative works of the training data, and the output would then be tainted as well (much like a song that consists only of sampled music: there is some input by the composer, but it remixes lots of other copyrighted material). There are practical issues with this, but more importantly, given the current importance of the AI boom for US GDP, there are strong economic incentives for this interpretation not to gain traction.

5) So the current position of the LLM providers is "our weights are copyrightable" (even though current law says they aren't) "but all your data we trained on is present in such minuscule dilution that there's no taint" (even though current law provides arguments that there is).
Clearly this primarily serves their own interests, with the goal of establishing law in their favour.

Given that the future legal regime is entirely unclear, it is a valid decision for each person (or group of persons) maintaining code to either (a) take the side that the most likely outcome is that LLM-generated output is taint-free, or (b) take the side that the most likely outcome is that LLM-generated output is tainted. This is less a statement about today's laws and more a statement about which societal forces will be stronger in shaping the consensus.

I'm completely impartial as to what FFmpeg (as a project) decides - for the moment, the patch is human-authored anyhow, so it doesn't matter much for *this patch*.

That said, it would be helpful to know whether commit messages can be authored by AI if clearly labeled. If societal consensus falls on the side of AI output being tainted, commit messages *can* be removed automatically, albeit at the cost of changing the hashes in the git commit history.

Cheers,
Thomas

On Wed, 12 Nov 2025 at 09:24, Christophe Gisquet via ffmpeg-devel <[email protected]> wrote:
> Hello,
>
> On Tue, 11 Nov 2025 at 04:01, Michael Niedermayer via ffmpeg-devel
> <[email protected]> wrote:
> > If you have concrete legal analysis or case law that supports this
> > claim, please share it.
>
> I can name at least one Fortune 500 companies, that maybe won't
> disclose publicly these facts, that did equivalent analysis and have
> basically forbidden use of "AI"-generated code for distributed
> software.
> By way of consequence, if that matters to you, maybe these companies
> would be very concerned that the ffmpeg project included such code.
>
> Second, Gyan's Linux Foundation link is extremely telling:
> 1) You need to be able to identify whether the LLM output comes from
> copyrighted code. ie, what it was trained on.
> 2) You need to report the portions affected, included with license
> It's not making it forbidden, just impossible to abide by.
>
> --
> Christophe
> _______________________________________________
> ffmpeg-devel mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
