Thanks for your feedback!

I think your statement "There's nothing special about LLMs and this, other than perhaps the speed with which you can make mistakes" hits the nail on the head. To me that means there should actually be no special rule for GenAI in this respect, but rather a "warning" that with GenAI you risk breaking existing copyright law more easily, or without realizing it.

Re examples, I could imagine that some GenAI tools make it more obvious where content comes from, or provide better references, than other tools. That of course does not mean you should be any less aware of possibly breaking copyright law.

Under the EU AI Act, LLM providers should actually have to declare what data their models were trained on, which should also make this more transparent in the future.

Thanks

Michael

On 22.04.24 at 10:35, Nick Burch wrote:
On Sun, 21 Apr 2024, Michael Wechner wrote:
Thanks for the pointer to the Generative Tooling rules, which I was not aware of until now.

At the bottom it says that the ASF does not tell developers what tools to use, but I think it would be useful to have some concrete examples, which would make the rules clearer.

(Not a lawyer, not an official ASF response)

There's nothing special about LLMs and this, other than perhaps the speed with which you can make mistakes... When including other people's code, it's all about license compatibility and attribution.

The ASF started when a bunch of people started sharing patches for a web server, with attribution and code under a compatible license. The foundation grew during a period where it got easier to find code + code snippets online, including much that wasn't under a compatible license. Rules didn't change, other than clarifying processes for checking licenses and what was/wasn't compatible.

You weren't, and still aren't, allowed to copy + paste large chunks of someone else's code without a compatible license and suitable attribution. Using an LLM to read all the internet and suggest the code to copy doesn't change that. Well, other than the well-documented issues with getting LLMs to cite their sources...

LLMs have loads of great uses, including helping you learn new things, decoding error messages, finding common patterns, rubber-ducking etc. However, they're even worse than many internet forums when it comes to suggesting large chunks of code of unclear provenance to copy + paste.

It doesn't matter if it's ChatGPT, GitHub Copilot, a local LLM, someone on Stack Overflow, or a YouTube video that's giving you some code you want to copy. 3 characters are almost certainly fine, 3 pages are almost certainly not, and a general idea is often fine. Either way, you absolutely need to engage your brain before committing to ASF repos!


Otherwise, if you do still think more rules / examples / etc. are needed, you'll be wanting legal-discuss@
https://lists.apache.org/list.html?legal-disc...@apache.org

Cheers
Nick
