Hello,

This is a PIP to package the Pulsar Trino distro and config in a dedicated folder.

Link: https://github.com/apache/pulsar/issues/17137
Prototype: https://github.com/apache/pulsar/pull/17062

Below you can find the proposal (I will amend the GH issue while we discuss it).

Best,
tison.

Motivation
========

After https://github.com/apache/pulsar/pull/16683 was merged, we upgraded the PrestoSQL dependency in Pulsar SQL to one of the first Trino versions.

To handle the name change and gradually refactor Pulsar SQL into a self-contained module that we can move into a standalone repository, I see three major issues to resolve:

1. Configs of Pulsar SQL go under the `conf/` folder and mix with other Pulsar configs.
2. Pulsar Docker images (base and all) bundle Pulsar SQL.
3. Integration tests of Pulsar SQL are tightly coupled with the main repo (test infra).

This proposal aims to resolve the first issue: package the Pulsar Trino distro and config in a dedicated folder, that is, make it self-contained.

Goal
====

I have already prepared a draft of the changes at https://github.com/apache/pulsar/pull/17062.

In general, we move the config files under `PRESTO_HOME` and update the scripts accordingly. In this way, all Trino distro artifacts are under the same home path, so that we can later move them out as a whole.

This change should not affect those who use Pulsar through the entry point script, but it changes the layout of the release artifact, so I'd prefer to go through the PIP process.

Implementation
============

The implementation is straightforward enough that it is already described inline in the "Goal" section.

However, the name of the folder (`presto` or `trino`) and the level of the folder (`lib/presto/` or `trino/`) are open to discussion. I think both are fine and will try `trino/` first.

To minimize unnecessary changes, I tend to keep the module names `pulsar-presto-xxx` as is.

Alternatives
=========

I am not making a complete proposal that resolves all three issues listed above, because I'm still unfamiliar with the latter two topics and I'd prefer to implement these improvements one by one, since they're naturally independent. If I tried to make a complete proposal at once, I would very likely give up halfway.

Anything else?
===========

Previous discussion: [DISCUSS] Move Pulsar SQL to a separated repository?
https://lists.apache.org/thread/mflm0pb5235jjk80vol0vs7v0hvowkq8