On Thu, Jul 02, 2026 at 01:51:10PM +0200, David Hildenbrand (Arm) wrote:
> On 7/2/26 12:04, Lorenzo Stoakes wrote:
> > (thanks for the cc-!)
> >
> > On Thu, Jul 02, 2026 at 09:46:37AM +0200, David Hildenbrand (Arm) wrote:
> >> On 7/2/26 09:27, Christian Brauner wrote:
> >>>
> >>> I think we should just drop any attribution as a general kernel-wide
> >>> rule and let subsystems require them as needed. Then you can have all
> >>> the complexity in mm for this that you think is needed for your
> >>> workflow to function. This is precisely what the subsystem profiles are
> >>> for. So maybe just add:
> >
> > A single comment is complexity?
>
> I think Christian meant more elaborate rules. More than just "If you used
> LLMs, disclose how you used them."

What's elaborate?

"Say how much of your patch is LLM written, here are some examples".

Surely?

>
> >>
> >> I'm not really sure if having (more?) subsystem-specific tags is the way
> >> to go. (below)
> >>
> >> So either we find a very simple, kernel-wide rule for such tags, or we
> >> drop them entirely.
> >
> > Yup I couldn't disagree more with Christian here, the whole thing feels like
> > trying to 'wish away' the AI issue, and now punting off to subsystem
> > maintainers...
> >
> > Subsystems impact each other. Right now I'm writing a series that changes
> > driver code so we can enforce some sanity in mm APIs.
> >
> > I've had to interact with fs code quite a bit that uses mm logic.
> >
> > It's all interconnected, and one subsystem going with, let's say, 'let it
> > all in' impacts another.
> >
> > Yes some people lie about it, but having the guidelines only STRENGTHENS our
> > position on that, and I've seen that in practice.
> >
> > So yeah, sorry, I think it's beyond silly to push back on requesting
> > somebody disclose how much of a patch/series was AI generated.
> >
> > And [0] already essentially says people NEED to do this now. But that doc
> > has been rather downplayed unfortunately I think.
>
> [...]
>
> >> I agree on the "enforce" aspect. It's impossible, but it's still easy to
> >> catch people using AI irresponsibly today ... and that's what we care
> >> about. Not people that know what they are doing using AI responsibly.
> >
> > For me it's about empowering maintainers to push back.
>
> Right, but I suspect maintainers do have this power already, it's just not
> exercised that often on obvious AI slop yet.

Well I certainly don't feel I do :)

I tried pushing back on obvious AI slop and got a huge amount of blowback for
it because the guy wasn't honest about it.

A key reason for me pushing back on the tooling documentation was precisely
because I felt we needed a clear means of doing this.

This being the part:

"As with the output of any tooling, the result may be incorrect or
inappropriate. You are expected to understand and to be able to defend
everything you submit. If you are unable to do so, then do not submit the
resulting changes.

If you do so anyway, maintainers are entitled to reject your series without
detailed review."

But if somebody denies it, no matter how strong the evidence, you can never
really 'prove' it.

I think honestly if there's a newcomer who suddenly out of nowhere does a huge
involved series in an area they've not touched before and LLMs assess it as 90%
likely to be LLM generated, and they reply making mistakes that only an LLM
would make (misinterpreting a field's symbol and then acting as if it really
exists) - it's not unreasonable to cite these things as a reason to 'not really
trust' that it's their work.

Perhaps worded nicely to say 'sorry if I'm mistaken'?

All I'm really asking is for the ability to say something like "I reasonably
believe that this is generated, so we need to build more trust here, apologies
if I'm mistaken, but can we see some smaller patches in this area first" or
something like this.

>
> >
> >>
> >>>
> >>> If the information is mostly useful during review then I still would
> >>> question why it has to end up in our git logs. It's completely
> >>> irrelevant information imho.
> >>
> >> Fully agreed. In the tree it's irrelevant.
> >
> > Not sure about that, if it turns out AI-generated patches are causing, say,
> > 95% more bugs, that's pretty useful information, no?
>
> Well
>
> a) You don't know how much AI was used. In particular, it could just slip in as

Hence 'tell us how much was used' :)

> the submitter tries to untangle some of the mess the AI created (so not AI's
> fault). Or the submitter just used it to write+translate the patch
> description. Really, the tag itself doesn't tell you much as it stands, which
> is the biggest problem I am having with it.
>
> b) You don't catch all the cases where people didn't use the tag.

Is this arguing 'we don't have complete information so let's have no
information'? Because I would say something > nothing?

>
> >
> > Or if you find that a patch somebody sent from another subsystem that has a
> > laissez-faire approach to AI slop completely breaks you in some subtle way,
> > isn't it easier to push for a revert if you see it's LLM-generated?
>
> The information would have to be had from the linked mailing list posting.

That's creating a lot more work for maintainers?

You could even figure out bug rate from Fixes: tags alone using metadata.

And yes it will be imperfect but something > nothing.
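
To make that concrete, here is a rough sketch of the kind of metadata check I
have in mind. Everything in it is an assumption for illustration: the
fix_rates() helper and both regexes are hypothetical, it takes (sha, message)
pairs that you would have to extract from the git history yourself, and it
treats "named in a later Fixes: tag" as a crude proxy for "introduced a bug".

```python
import re

# Hypothetical trailer patterns; the exact tag name and format are assumptions.
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{7,40})\b", re.MULTILINE)
ASSISTED_RE = re.compile(r"^Assisted-by:", re.MULTILINE)


def fix_rates(commits):
    """Take (sha, message) pairs; return (assisted_rate, other_rate).

    Each rate is the fraction of commits in that group whose hash is named
    by a Fixes: trailer in some other commit of the same input.
    """
    commits = list(commits)
    # Hashes (possibly abbreviated) that a Fixes: trailer points at.
    fixed = {h for _, msg in commits for h in FIXES_RE.findall(msg)}

    def was_fixed(sha):
        return any(sha.startswith(prefix) for prefix in fixed)

    def rate(group):
        if not group:
            return 0.0
        return sum(was_fixed(sha) for sha in group) / len(group)

    assisted = [sha for sha, msg in commits if ASSISTED_RE.search(msg)]
    other = [sha for sha, msg in commits if not ASSISTED_RE.search(msg)]
    return rate(assisted), rate(other)
```

Comparing the two rates over a release or two would at least give a signal,
with all the caveats above about missing or suppressed tags.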

>
> Given that some subsystems already started suppressing the tags when applying
> patches, that doesn't really help ... :/

Well that's unfortunate. But something > nothing, again.

>
> >
> > And is it really that egregious to include a tag? You can ignore it if you
> > don't care.
>
> I hate the current tags as they are. The question I am asking myself: assume
> we stop using the Assisted-by for LLM stuff. What to do with the other tools?
> Why are LLMs suddenly no longer a tool to mention there?

Because it turns out it's useful to have this information and more information >
less information?

>
> --
> Cheers,
>
> David

Thanks, Lorenzo
