I think that Calcite (and the wider ASF) needs to embrace AI — to do otherwise will make us irrelevant — but we need to avoid some of the excesses that will make the project impossible to maintain.
The enemies are slop (large volumes of code of dubious value) and inconsistency (code whose design contradicts the logic elsewhere in the project). I agree with Mihai’s principle — allow a contribution as long as “the result is arguably correct and reviewers that we trust can understand it” — and note that it is basically a Turing test for good programmers. We know that good programmers produce clear, concise, understandable code, and bad programmers don’t.

Our scarcest resource is reviewer time. I would like to see reviewers push back hard on PRs that are too verbose, are not consistent with how Calcite does things, or do not clearly state the problem they are trying to fix. If a PR is unfocused, reviewers should just ignore it. The person who submitted the PR — yes, legally, it’s always a person who submits a PR — needs to make it fit for review. I strongly believe that we should require a Jira case, with a good summary and description, before we read a single line of a PR’s code.

Last, I think vibe-coding will be increasingly important. In vibe-coding, the code is written by AI and is never reviewed by a human, but we trust it because there is a comprehensive specification and tests, written by humans. I propose that we have “vibe-coded” components in Calcite, where we put extra effort into reviewing tests and specification, and less effort into reviewing code. In Calcite, SQL functions and SQL dialects would be good candidates. One of the benefits of vibe-coding, in this manner, is that from one test suite you can create and maintain implementations in multiple languages. (In my Morel project, I was easily able to port Morel’s standard library from Java to Rust.)

My vision is that five years from now, Calcite will be in multiple languages — say Java, Rust and Python — with a large, healthy, efficient test suite, and with about half of the non-test code either vibe-coded or translated by AI from the original Java.

Julian
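
P.S. To make the “review the tests and the specification, not the code” idea concrete, below is a rough sketch of what a human-reviewed specification for one vibe-coded SQL function might look like. The names (CharLengthSpecTest, VibeCodedFunctions) are invented for illustration and are not existing Calcite API; the point is only that reviewer effort goes into the expected values, while the implementation behind them can be regenerated by AI in Java, Rust or Python and simply has to keep passing the same cases.

import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.stream.Stream;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

/**
 * Human-written specification for a hypothetical vibe-coded CHAR_LENGTH
 * component. Reviewers sign off on the cases below; the implementation
 * behind them could be generated, and regenerated, by AI.
 */
class CharLengthSpecTest {
  /** Each case: input string, expected character count. */
  static Stream<Arguments> cases() {
    return Stream.of(
        Arguments.of("", 0),
        Arguments.of("abc", 3),
        Arguments.of("héllo", 5)); // counts characters, not bytes
  }

  @ParameterizedTest
  @MethodSource("cases")
  void charLengthMatchesSpec(String input, int expected) {
    assertEquals(expected, VibeCodedFunctions.charLength(input));
  }

  /** Stand-in for the generated code; in the vibe-coded model this body is
   * written by AI and only the cases above are reviewed by humans. */
  static class VibeCodedFunctions {
    static int charLength(String s) {
      return s.codePointCount(0, s.length()); // characters (code points)
    }
  }
}

The same table of cases can drive a port in another language, which is what makes cross-language maintenance cheap.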
> On Jan 12, 2026, at 12:49 AM, Stamatis Zampetakis <[email protected]> wrote:
>
> It depends how and to what extent AI is used in a contribution so we
> probably have to make a decision on a case by case basis. Note that
> ASF already provides some guidelines on how/when AI can be used [1].
> Obviously, using AI for generating lots of "new" code is quite risky
> and quite impractical to verify copyrights so it should be avoided.
>
> [1] https://www.apache.org/legal/generative-tooling.html
>
> On Mon, Jan 12, 2026 at 6:12 AM Dmitry Sysolyatin
> <[email protected]> wrote:
>>
>> If AI is used to search for answers to project-related questions (although
>> one should be careful here when there is a lot of legacy), for
>> self-validation, to help find a solution, or for translating from one
>> language to another (specifically a 1-to-1 translation), I don’t see
>> anything wrong with that.
>>
>> However, I am quite skeptical about using it to implement solutions. This
>> is up to each individual developer whether they use it or not as long as it
>> is not clearly visible that the code (which is sometimes very obvious) or
>> the comment is AI-generated (by “generation” I mean not translating one’s
>> own text from one language to another 1-to-1, but actual generation). In
>> such cases, it becomes unclear whether the developer actually understands
>> what they have written, and whether it is worth continuing the review, the
>> discussion, and spending time on it.
>>
>> In the case of Apache Calcite, I have seen only once such a case. But in
>> other projects, AI-generated issues and fixes sometimes reach the point of
>> absurdity.
>>
>> On Mon, Jan 12, 2026 at 1:08 AM jensen <[email protected]> wrote:
>>
>>> Personally, I think using AI tools has its advantages; they often help us
>>> quickly locate simple problems. For the Calcite community, we have many
>>> experienced reviewers, and as long as we don't completely rely on AI tools
>>> to review code, I think it's acceptable. As for contributors, it's best to
>>> explain their thought process behind the changes (or provide good code
>>> comments), and ideally, to demonstrate whether the changes are reasonable
>>> (of course, new contributors may not be able to confirm the reasonableness
>>> of their changes even without using AI). If these things can be done to a
>>> certain extent, it will reduce the time and effort reviewers need to put in.
>>>
>>> Best regards,
>>>
>>> Zhen Chen
>>>
>>> ---- Replied Message ----
>>> | From | Mihai Budiu<[email protected]> |
>>> | Date | 1/12/2026 06:03 |
>>> | To | [email protected]<[email protected]> |
>>> | Subject | Re: AI/LLM and Calcite contributions |
>>>
>>> I personally do not care which tools have been used as long as the result
>>> is arguably correct and reviewers that we trust can understand it.
>>>
>>> Mihai
>>>
>>> ________________________________
>>> From: Alessandro Solimando <[email protected]>
>>> Sent: Sunday, January 11, 2026 12:22 PM
>>> To: [email protected] <[email protected]>
>>> Subject: AI/LLM and Calcite contributions
>>>
>>> Hello,
>>> a recent discussion [1] made me realize that, as a community, we haven't
>>> made a precise statement if LLM-assisted contributions should be accepted,
>>> and in case how they should be handled.
>>>
>>> Dmitry cites [2] in the discussion (on the strict side of the spectrum),
>>> while I have seen more nuanced statements in the Apache foundation like [3]
>>> (fine as long as you understand and can justify all you submitted).
>>>
>>> I'd like to hear your opinions, and ideally update the contributors
>>> guideline accordingly, when we reach consensus.
>>>
>>> Best regards,
>>> Alessandro
>>>
>>> 1: https://github.com/apache/calcite/pull/4692#discussion_r2639007178
>>> 2: https://wiki.gentoo.org/wiki/Project:Council/AI_policy
>>> 3: https://datafusion.apache.org/contributor-guide/index.html#ai-assisted-contributions
