Re: [DISCUSS] AI-generated contributions

Nic Crane Fri, 13 Feb 2026 07:53:16 -0800

On a similar note, after conversations with folks around what appear to be
AI-generated mailing list responses, I've also opened a PR suggesting
people disclose any AI-generated questions they post to mailing list
discussions; feel free to add any comments there (if you're a human! ;) )


https://github.com/apache/arrow/pull/49277/changes


On Thu, 22 Jan 2026 at 20:42, Nic Crane <[email protected]> wrote:

> PR here for anyone interested: https://github.com/apache/arrow/pull/48952
>
> On Thu, 22 Jan 2026 at 09:56, Nic Crane <[email protected]> wrote:
>
>> Thanks Andrew, I really like how you spell out the reasoning around it. I
>> will see how we can incorporate some of those ideas.
>>
>> On Thu, 22 Jan 2026 at 09:23, Andrew Lamb <[email protected]> wrote:
>>
>>> > We have had repeated attempts at contributions by some folks who simply
>>> > do not understand their generated code and when asked for clarification,
>>> > have the LLM generate more incorrect commentary.  It's very
>>> > Dunning-Kruger and leads to lots of frustration all around.
>>>
>>> We saw this too in DataFusion and I was pleased with what we came up with
>>> for rationale about why it is not helpful[1]. Basically the reviewers are
>>> more efficient using the LLM tools directly and the contributor isn't
>>> learning anything either.
>>>
>>> Andrew
>>>
>>>
>>> [1]:
>>>
>>> https://datafusion.apache.org/contributor-guide/index.html#why-fully-ai-generated-prs-without-understanding-are-not-helpful
>>>
>>> On Mon, Jan 19, 2026 at 12:48 PM R Tyler Croy <[email protected]>
>>> wrote:
>>>
>>> > (replies inline)
>>> >
>>> > On Sunday, January 18th, 2026 at 7:43 PM, Gang Wu <[email protected]>
>>> > wrote:
>>> >
>>> > > - Submitters should review all lines of generated code before creating
>>> > >   the PR to understand every detail, just as if the code were written
>>> > >   by the submitters themselves.
>>> > > - AI tools are notorious for generating overly verbose comments,
>>> > >   unnecessary test cases, fixing test failures using wrong approaches,
>>> > >   etc. Make sure these are checked and fixed.
>>> > > - Reviewers are humans, so please try to break down large PRs into
>>> > >   smaller ones to make reviewers' lives easier and get PRs promptly
>>> > >   reviewed.
>>> >
>>> >
>>> > Like others, I think Nic's draft is a good one. I would like to offer
>>> > some thoughts as a maintainer of a project (delta-rs) which has received
>>> > an increased number of AI-assisted pull requests over the past six months.
>>> >
>>> >
>>> > I would strongly encourage moving the "PR may be closed without further
>>> > review" statement to the very beginning of the policy.  I would also
>>> > encourage using labels like "ai-assisted" to signal to other
>>> > contributors who may or may not wish to engage in reviewing potential
>>> > slop.
>>> >
>>> > We have had repeated attempts at contributions by some folks who simply
>>> > do not understand their generated code and when asked for clarification,
>>> > have the LLM generate more incorrect commentary.  It's very
>>> > Dunning-Kruger and leads to lots of frustration all around.
>>> >
>>> > As with most policies, it's important to speak to those who are acting
>>> > in good faith, but don't rely on everybody following the rules; come up
>>> > with an agreed-upon way to handle those who don't.
>>> >
>>> >
>>> > Either way I think it's good to ship! :)
>>> >
>>> >
>>> >
>>> > Cheers
>>> >
>>> >
>>>
>>
