Dear devs,

I would like to open a discussion about the fact that the development of
many Flink SQL features currently depends on Calcite releases, which
seriously blocks the release of some of those features.
I would therefore like to discuss whether we can solve this problem
by creating Flink's own Calcite repository.

Currently, Flink depends on Calcite-1.26, FLIP-204[1] relies on Calcite-1.30,
and we now also want to fully support join hints in Flink-1.16, which relies
on Calcite-1.31 (which will probably be released in two or three months).

To support new features or fix bugs, we need to upgrade the Calcite version,
but every upgrade (especially one that spans multiple versions) is very
painful: I clearly remember that the Calcite upgrade from 1.22 to 1.26 took
two weeks of full-time work to complete.

Currently, to fix some bugs without upgrading the Calcite version,
we copy the corresponding Calcite classes directly into the Flink project
and modify them there.[2] This approach is rather hacky and makes
code maintenance and upgrades hard.

So I wondered whether we could solve this problem by maintaining a
Calcite repository in the Flink community. This approach has been
practiced within my company for many years, and there are similar
practices in the industry. For example, Apache Drill also maintains
a separate Calcite repository[3].

The following is a brief analysis of the approach and the pros and
cons of maintaining a separate repository.

Approach:
1. Where to put the code? https://github.com/flink-extended is a good place.
2. What extra code can be added to this repository? Only bug fixes and features
that are already merged into Calcite can be cherry-picked to this repository.
We should also try to push our own bug fixes to the Calcite community.
As a side effect, the copied Calcite classes in the Flink project can be removed.
3. How to upgrade the Calcite version? Check out the target Calcite
release branch and rebase our bug fix code onto it. (As we upgrade, there
will be fewer and fewer old bug fixes left to carry.) Then run all of
Calcite's tests and Flink's tests in the developer's local environment.
If all tests pass, release the Calcite branch; otherwise fix the problem
in the branch and re-test. After the branch is released, the Calcite
version in Flink can be upgraded.
 For example: check out a calcite-1.26.0-flink-v1-SNAPSHOT branch from
calcite-1.26.0, move all the copied Calcite code in Flink to the branch,
and cherry-pick all the hint-related changes from Calcite-1.31 to the
branch. Then change the Calcite version in Flink to
calcite-1.26.0-flink-v1-SNAPSHOT and verify all tests locally.
Release calcite-1.26.0-flink-v1 after all tests pass.
Finally, upgrade the Calcite version in Flink to
calcite-1.26.0-flink-v1 and open a PR.
4. Who will maintain it? The maintenance workload is minimal, but the
upgrade work is laborious (actually, about the same as before). I can
maintain it in the early stage and standardize the process.
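
To make the version naming in step 3 concrete, here is a minimal Python
sketch of the SNAPSHOT -> release -> next SNAPSHOT cycle. It is only an
illustration: the helper names (ForkVersion, parse, release, next_snapshot)
are hypothetical, and the rule that the "flink-vN" counter restarts at 1
when we move to a new Calcite base version is my assumption, not something
that is decided yet.

```python
import re
from typing import NamedTuple, Optional


class ForkVersion(NamedTuple):
    """A version of the proposed Flink-maintained Calcite fork (hypothetical helper)."""
    base: str       # upstream Calcite release, e.g. "1.26.0"
    revision: int   # Flink-side counter, e.g. 1 for "flink-v1"
    snapshot: bool  # True while the branch has not been released yet

    def __str__(self) -> str:
        suffix = "-SNAPSHOT" if self.snapshot else ""
        return f"calcite-{self.base}-flink-v{self.revision}{suffix}"


_PATTERN = re.compile(r"calcite-(\d+\.\d+\.\d+)-flink-v(\d+)(-SNAPSHOT)?")


def parse(name: str) -> ForkVersion:
    """Parse a name such as 'calcite-1.26.0-flink-v1-SNAPSHOT'."""
    m = _PATTERN.fullmatch(name)
    if m is None:
        raise ValueError(f"not a valid fork version: {name!r}")
    return ForkVersion(m.group(1), int(m.group(2)), m.group(3) is not None)


def release(v: ForkVersion) -> ForkVersion:
    """Drop the -SNAPSHOT suffix once Calcite's and Flink's tests pass."""
    if not v.snapshot:
        raise ValueError(f"{v} is already released")
    return v._replace(snapshot=False)


def next_snapshot(v: ForkVersion, new_base: Optional[str] = None) -> ForkVersion:
    """Open the next development branch.

    Assumption: same Calcite base -> bump the flink-vN counter;
    new Calcite base -> restart the counter at 1.
    """
    if new_base is None or new_base == v.base:
        return ForkVersion(v.base, v.revision + 1, True)
    return ForkVersion(new_base, 1, True)
```

With this scheme, parse("calcite-1.26.0-flink-v1-SNAPSHOT") followed by
release() gives the version that Flink would finally depend on, and
next_snapshot(..., "1.31.0") gives the branch name for a later upgrade.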

Pros:
1. The release of Flink is decoupled from the release of Calcite,
 making feature development and bug fixes quicker
2. Reduces the hassle of unnecessary Calcite upgrades
3. No more hacks in Flink to maintain copied Calcite code

Cons:
1. Need to maintain an additional Calcite repository
2. Upgrades are a little more complicated than before

Any feedback is very welcome!


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join
[2] https://github.com/apache/flink/tree/master/flink-table/flink-table-planner/src/main/java/org/apache/calcite
[3] https://github.com/apache/drill/blob/master/pom.xml#L64

Best,
Godfrey
