Hello,

This is a PIP to package the Pulsar Trino distro and config in a dedicated
folder.

Link: https://github.com/apache/pulsar/issues/17137
Prototype: https://github.com/apache/pulsar/pull/17062

Below you can find the proposal (I will amend the GH issue while we discuss
it).

Best,
tison.

Motivation
==========

After https://github.com/apache/pulsar/pull/16683 was merged, the PrestoSQL
dependency in Pulsar SQL was upgraded to one of the first Trino versions. To
handle the name change and gradually refactor Pulsar SQL into a
self-contained module that we can move into a standalone repository, I find
that there are three major issues to resolve.

1. Configs of Pulsar SQL go under the `conf/` folder and mix with other
Pulsar configs.
2. Pulsar Docker images (base and all) bundle Pulsar SQL.
3. Integration tests of Pulsar SQL are tightly coupled with the main repo
(test infra).

This proposal aims to resolve the first issue: packaging the Pulsar Trino
distro and config in a dedicated folder, that is, making it self-contained.

Goal
====

I have already prepared a draft of the changes in
https://github.com/apache/pulsar/pull/17062. In general, we move the config
files under `PRESTO_HOME` and update the scripts accordingly.

In this way, all Trino distro artifacts live under the same home path, so
that we can later move them out as a whole.
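
As a rough illustration of the intended layout, here is a minimal Python
sketch of how a script might resolve the Trino home and its config folder
relative to the Pulsar distribution root. The helper names, the default
folder name `trino`, and the `conf` subfolder are assumptions made for this
example only; the actual paths are defined by the prototype PR and the
outcome of this discussion.

```python
from pathlib import Path

# Folder candidates mentioned in this proposal; the final name and level
# are still open, so both are listed here purely for illustration.
CANDIDATE_TRINO_DIRS = ("trino", "lib/presto")


def trino_home(pulsar_home: str, relative_dir: str = "trino") -> Path:
    """Return the dedicated Trino home under the Pulsar distribution root."""
    return Path(pulsar_home) / relative_dir


def trino_conf_dir(pulsar_home: str, relative_dir: str = "trino") -> Path:
    """Return the config folder, assumed here to be <trino home>/conf."""
    return trino_home(pulsar_home, relative_dir) / "conf"


if __name__ == "__main__":
    # Print the resolved config folder for each candidate layout.
    for candidate in CANDIDATE_TRINO_DIRS:
        print(trino_conf_dir("/opt/pulsar", candidate).as_posix())
```

Because everything hangs off a single home path, moving the distro out
later only requires changing that one root.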

This change should not affect those who use Pulsar through the entry point
script, but it changes the layout of the release artifact, so I'd prefer to
go through the PIP process.

Implementation
==============

The implementation is straightforward and is already described inline in
the "Goal" section.

However, the name of the folder (`presto` or `trino`) and the level of the
folder (`lib/presto/` or `trino/`) are open to discussion. I think both are
fine and will try `trino/` first.

To minimize unnecessary changes, I tend to keep the module names
`pulsar-presto-xxx` as they are.

Alternatives
============

I am not making a complete proposal to resolve all three issues listed
above, because I'm still unfamiliar with the latter two topics and I'd
prefer to implement these improvements one by one, since they're naturally
independent. If I tried to make a complete proposal at once, it's highly
likely that I would give up halfway.

Anything else?
==============

Previous discussion:

[DISCUSS] Move Pulsar SQL to a separated repository?
https://lists.apache.org/thread/mflm0pb5235jjk80vol0vs7v0hvowkq8
