I’m not a fan of shading. It doesn’t make things simpler; it makes them 
complicated in a different way. But of course people can take the 
Calcite/Avatica jars and create their own shaded/fat artifacts. I think that’s 
the simplest and least disruptive solution for people who don’t like adding the 
Avatica jar to their classpath.

The underlying issue is: Where should we draw the boundaries of “core”? Should 
“core” just contain the parser and algebra? Should it contain a JDBC adapter 
and JDBC driver? Should it contain non-standard SQL operators? Should it have 
the ability to evaluate expressions and queries?

Many people (myself included) think that core’s boundaries should be redrawn, 
but everyone disagrees about what those boundaries should be.

Julian


> On Nov 27, 2020, at 12:11 AM, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> 
> Hi colleagues,
> 
> Thank you for the valuable feedback. The problem is indeed complex. I share
> the worry that complete decoupling might be too disruptive for users, since
> they will observe compilation problems when migrating to the newer version,
> and will have to update their dependencies, which also could be problematic
> (e.g. due to security concerns). So I'd like to propose a slightly
> different approach that should not cause any problems for the existing
> users. We change the goal from complete decoupling to the *isolation* of
> dependent classes.
> 
> Let me explain it with Avatica as an example. There are two classes of
> Avatica-related dependencies in the core: (1) utilities (e.g. classes from
> org.apache.calcite.avatica.util), and (2) logic (e.g. classes from
> org.apache.calcite.jdbc, org.apache.calcite.adapter.jdbc). The first class
> is very easy to eliminate. The second class cannot be eliminated without a
> serious repackaging of the whole of Calcite. So we can do the following:
> 
> 1. Introduce the "commons" module, and move utilities there, thus solving
> (1).
> 2. Shade the "commons" module into the "core" during the build - if we do
> this, the existing users will not have to change their dependencies, so
> this is a critically important step (at least for now). An alternative to
> this is just to copy-paste utility classes into the "core" module,
> violating DRY.
> 3. Confine the remaining Avatica dependencies to a couple of JDBC-related
> packages, and add a static analysis rule to disallow Avatica classes in any
> other package. This may require some advanced refactoring (e.g.
> CalciteConnectionConfig).
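
The static analysis rule in step 3 could be sketched roughly as follows. This
is only an illustration, not an existing Calcite check: the class name
AvaticaImportRule, its methods, and the exact list of allowed packages are
assumptions; only the package names themselves come from the message above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical checker; not part of Calcite or of any existing build plugin. */
final class AvaticaImportRule {
  // Prefix shared by Avatica classes, as named in the proposal.
  static final String FORBIDDEN_PREFIX = "org.apache.calcite.avatica.";

  // Packages that may keep their Avatica dependencies (assumed list).
  static final List<String> ALLOWED_PACKAGES =
      List.of("org.apache.calcite.jdbc", "org.apache.calcite.adapter.jdbc");

  /** Returns true if the package, or one of its parents, is on the allow list. */
  static boolean isAllowed(String packageName) {
    for (String allowed : ALLOWED_PACKAGES) {
      if (packageName.equals(allowed) || packageName.startsWith(allowed + ".")) {
        return true;
      }
    }
    return false;
  }

  /** Returns the Avatica imports that a class in the given package may not use. */
  static List<String> violations(String packageName, List<String> imports) {
    List<String> result = new ArrayList<>();
    if (isAllowed(packageName)) {
      return result;
    }
    for (String imported : imports) {
      if (imported.startsWith(FORBIDDEN_PREFIX)) {
        result.add(imported);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // A class outside the JDBC packages that imports an Avatica utility.
    System.out.println(violations("org.apache.calcite.plan",
        List.of("org.apache.calcite.avatica.util.Spaces", "java.util.List")));
  }
}
```

In a real build the package name and imports would come from parsed sources or
bytecode, most likely through an existing static analysis tool rather than
hand-written code like this.
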
> 
> As a result, the Avatica dependency is reduced to a handful of packages, and
> existing applications will work mostly seamlessly during migration. Now we
> can do one of two things:
> 1. Either create a separate reduced artifact "core-reduced" without
> Avatica-dependent packages.
> 2. Or, since many products shade Calcite during the build, advise them
> to just exclude Avatica-dependent packages when shading.
> 
> How does it sound?
> 
> Regards,
> Vladimir
> 
> 
> On Wed, Nov 25, 2020 at 10:48, Chunwei Lei <chunwei.l...@gmail.com> wrote:
> 
>> I like the idea. But I have the same worry as Haisheng.
>> 
>> 
>> Best,
>> Chunwei
>> 
>> 
>> On Wed, Nov 25, 2020 at 3:07 PM Xin Wang <data.xinw...@gmail.com> wrote:
>> 
>>> +1 for this idea. We only use the parser/optimizer part.
>>> 
>>> On Wed, Nov 25, 2020 at 2:38 PM JiaTao Tao <taojia...@gmail.com> wrote:
>>> 
>>>> +1 for this idea. I have been developing with Calcite for a long time
>>>> (counting my time on the Kylin project). We all treat Calcite as an
>>>> optimizer, but we need to consider the overhead.
>>>> 
>>>> I agree with Stamatis: "since those dependencies were not causing any
>>>> real trouble."
>>>> 
>>>> 
>>>> What really troubles me is that when we work at the logical level, we may
>>>> have to consider the implementation. For example, we used to keep "IN"
>>>> rather than convert it to a join or "OR", but Calcite has no
>>>> implementation for "IN".
>>>> 
>>>> 
>>>> Regards!
>>>> 
>>>> Aron Tao
>>>> 
>>>> 
>>>> 
>>>> On Wed, Nov 25, 2020 at 12:57 PM Haisheng Yuan <hy...@apache.org> wrote:
>>>> 
>>>>>> I would like to propose to decouple the "core" module from "linq4j"
>>>>>> and Avatica.
>>>>> I like the idea.
>>>>> 
>>>>> Moving Enumerable out of core may be time-consuming and disruptive,
>>>>> because many core tests use Enumerable to verify plan quality and
>>>>> correctness.
>>>>> 
>>>>> Best,
>>>>> Haisheng
>>>>> 
>>>>> On 2020/11/24 23:42:19, Stamatis Zampetakis <zabe...@gmail.com> wrote:
>>>>>> Hi Vladimir,
>>>>>> 
>>>>>> Personally, I like the idea.
>>>>>> I had similar thoughts in the past, but I didn't try to break it down
>>>>>> since those dependencies were not causing any real trouble.
>>>>>> 
>>>>>> Let's see what the others think.
>>>>>> 
>>>>>> Best,
>>>>>> Stamatis
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 24, 2020 at 7:30 PM Vladimir Ozerov <ppoze...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi colleagues,
>>>>>>> 
>>>>>>> Many Calcite integrations use only part of the framework. Specifically,
>>>>>>> it is common to use only the parser/optimizer part. JDBC and runtime are
>>>>>>> used less frequently because they are not very well suited for mature
>>>>>>> processing engines (e.g. Enumerable runs out of memory easily).
>>>>>>> 
>>>>>>> However, in order to use the parser/optimizer from the core module, you
>>>>>>> also need to add the "linq4j" and Avatica modules to the classpath,
>>>>>>> which is not convenient: why include modules that you do not use?
>>>>>>> 
>>>>>>> It turns out that most of the dependencies are indeed leaky
>>>>>>> abstractions that could be decoupled easily. For example, the
>>>>>>> RelOptUtil class from the "core" depends on ... two string constants
>>>>>>> from the Avatica module.
>>>>>>> 
>>>>>>> I would like to propose to decouple the "core" module from "linq4j" and
>>>>>>> Avatica. For example, we may introduce a new "common" module that will
>>>>>>> hold common constants, utility classes, and interfaces (e.g. Meta).
>>>>>>> Then, we can organize the dependencies like this:
>>>>>>> common -> core
>>>>>>> common -> linq4j
>>>>>>> common -> Avatica
>>>>>>> 
>>>>>>> Finally, we may shade and relocate the "common" module into the "core"
>>>>>>> during the build. In the end, we will have two fewer runtime
>>>>>>> dependencies with relatively little effort. In principle, the same
>>>>>>> approach could be applied to the Janino and Jackson dependencies, but it
>>>>>>> could be more complex, so my proposal is only about "linq4j" and
>>>>>>> Avatica.
>>>>>>> 
>>>>>>> How do you feel about it? Does this proposal make sense to the
>>>>>>> community? If yes, I can try implementing a POC for this.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Vladimir.
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Thanks,
>>> Xin
>>> 
>> 
