Hi Nic, Gang, and all,

Thanks for raising this — I agree this guidance would be very valuable. AI
tools can be helpful, but only when contributors fully understand, own, and
are willing to iterate on the generated changes.

I strongly support the emphasis on transparency, code ownership, and active
engagement during review. Gang’s additions around reviewing every generated
line, avoiding unnecessary verbosity, and keeping PRs small also address
common reviewer pain points.

Having this documented (e.g., in the contributor guide) would set clear
expectations and give maintainers a consistent reference when handling
low-engagement or undisclosed AI-generated PRs.

Happy to support moving this forward.

Best regards,
Vignesh

On Mon, 19 Jan 2026, 9:11 am Gang Wu, <[email protected]> wrote:

> Thanks Nic for raising this!
>
> I totally agree with your suggestions and would like to add additional ones
> based on my review experience:
>
> - Submitters should review all lines of generated code before creating the
>   PR to understand every detail, just as if they had written it
>   themselves.
> - AI tools are notorious for generating overly verbose comments,
>   unnecessary test cases, fixing test failures using wrong approaches,
>   etc. Make sure these are checked and fixed.
> - Reviewers are humans, so please try to break down large PRs into smaller
>   ones to make reviewers' lives easier and get PRs reviewed promptly.
>
> Best,
> Gang
>
> On Mon, Jan 19, 2026 at 3:14 AM Nic Crane <[email protected]> wrote:
>
> > Hi folks,
> >
> > I'm just emailing to solicit opinions on adding a page about AI-generated
> > contributions to the docs. The ASF has its own guidance[1], which is fairly
> > high-level and is mainly concerned with licensing. However, we are seeing
> > more AI-generated contributions in which the author doesn't seem to have
> > engaged with the code at all and appears to have no intention of engaging
> > with review comments, and I feel like it would be beneficial to have
> > somewhere in the docs to point to if we close the pull request.
> >
> > Having guidelines also makes it easier to tell whether a contributor has
> > made any effort to follow them.
> >
> > I experimented with approaches to being transparent about AI use in my own
> > PRs and have an example here, where the changes were needed but the subject
> > matter was a little out of my comfort zone[2] - see resolved comments.
> >
> > I've made a rough draft[3] of what I think could constitute some
> > guidelines, but keen to hear what folks think. Happy to hear thoughts on
> > the wording, whether this belongs in the contributor guide, or if there are
> > concerns I haven't considered.
> >
> > Nic
> >
> >
> > [1] https://www.apache.org/legal/generative-tooling.html
> >
> > [2] https://github.com/apache/arrow/pull/48634
> >
> > [3]
> > We recognise that AI coding assistants are now a regular part of many
> > developers' workflows and can improve productivity. Thoughtful use of these
> > tools can be beneficial, but AI-generated PRs can sometimes lead to
> > undesirable additional maintainer burden. Human-generated mistakes tend to
> > be easier to spot and reason about, and code review often feels like a
> > collaborative learning experience that benefits both submitter and
> > reviewer. When a PR appears to have been generated without much engagement
> > from the submitter, it can feel like work that the maintainer might as well
> > have done themselves.
> >
> > We are not opposed to the use of AI tools in generating PRs, but recommend
> > the following:
> > - Only take on a PR if you are able to debug and own the changes yourself
> > - Make sure that the PR title and body match the style and length of
> >   others in this repo
> > - Follow coding conventions used in the rest of the codebase
> > - Be upfront about AI usage and summarise what was AI-generated
> > - If there are parts you don't fully understand, add inline comments
> >   explaining what steps you took to verify correctness
> >   - Reference any sources that guided your changes (e.g. "took a similar
> >     approach to #123456")
> >
> > PR authors are also responsible for disclosing any copyrighted materials in
> > submitted contributions, as discussed in the ASF generative tooling
> > guidance: https://www.apache.org/legal/generative-tooling.html
> >
> > If a PR appears to be AI-generated, and the submitter hasn't engaged with
> > the output, doesn't respond to review feedback, or hasn't disclosed AI
> > usage, we may close it without further review.
> >
>
