Personally, I like having small and self-contained modules. Quite a few times I have also found myself discussing with security folks whether a given CVE affects Calcite/Avatica because of its dependencies. Moreover, there seems to be a concrete use case behind this decomposition, so if someone is willing to do the work I think it would be beneficial for the project. Inherently there is going to be some breakage, but how much is acceptable can only be determined once there is a concrete proposal/PR in place.

Best,
Stamatis

On Wed, Oct 30, 2024 at 3:51 PM Laurent Goujon <[email protected]> wrote:
>
> > Can you elaborate which parts of Avatica are useful as a framework for
> > writing JDBC drivers ?
> > I always thought of Avatica as a small and focused project implementing a
> > JDBC over HTTP proxy,
> > to complement Calcite which is the actual framework for writing SQL
> > databases.
> >
> > Is it the type system ? Or the property handling ? What's left of Avatica
> > if you factor out the networking parts ?
>
> All the abstract classes related to the JDBC driver API
> (UnregisteredDriver, AvaticaConnection, AvaticaStatement, AvaticaResult,
> ...)? There's actually a lot of boilerplate and standard behavior to be
> implemented if you want your JDBC driver implementation to be conformant,
> and Avatica takes care of this boilerplate (Yes, JDBC API surface seems
> small but the PDF spec is 228 pages).
>
> Although the rpc part is an important part of the project, I believe the
> overall goal was also to have other engines be able to use it as well,
> hence some of the interfaces like the Meta and Cursor classes.
>
> > I suspect that projects that depend on Avatica without using the network
> > part may not be built optimally.
> > JDBC is a rather small specification, and most of the implementation is
> > very DB specific.
>
> Define optimally? But personally I think they are doing fine, the core
> Avatica abstractions are fairly flexible (in retrospect, why would Avatica
> have so many abstractions if the goal was to only support the avatica
> protocol? That would not be very optimal as well :) )
>
> You can also check by yourself:
> * https://github.com/apache/drill/tree/master/exec/jdbc
> * https://github.com/apache/arrow/blob/main/java/flight/flight-sql-jdbc-core
>
> Laurent
>
> On Wed, Oct 30, 2024 at 1:37 AM Istvan Toth <[email protected]>
> wrote:
>
> > Interesting.
> >
> > Can you elaborate which parts of Avatica are useful as a framework for
> > writing JDBC drivers ?
> > I always thought of Avatica as a small and focused project implementing a
> > JDBC over HTTP proxy,
> > to complement Calcite which is the actual framework for writing SQL
> > databases.
> >
> > Is it the type system ? Or the property handling ? What's left of Avatica
> > if you factor out the networking parts ?
> >
> > I suspect that projects that depend on Avatica without using the network
> > part may not be built optimally.
> > JDBC is a rather small specification, and most of the implementation is
> > very DB specific.
> >
> > Istvan
> >
> > On Tue, Oct 29, 2024 at 8:47 PM Laurent Goujon <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > I'd like to submit an idea regarding the current coupling of Avatica with
> > > protobuf and jackson.
> > >
> > > Avatica is both a framework to write a JDBC driver (used by Drill and
> > > Flight SQL JDBC drivers) AND also a client/server protocol (used by
> > > Phoenix and Druid).
> > >
> > > Today most of the code is under the core module, and it handles both
> > > aspects at the same time. But for projects only using the framework part,
> > > this results in the introduction of extra dependencies which may either
> > > conflict with their own use of those dependencies, or one could use the
> > > shaded version which relocates those dependencies but this result in an
> > > increase in artifact size (and also extra work re CVE analysis).
> > >
> > > To illustrate the current situation:
> > > * avatica jar (1.25.0) is 7MB (~20MB uncompressed)
> > > * avatica classes only represent less than 4MB (out of 20MB), 800kb if
> > >   you remove proto, remote and ha classes.
> > >
> > > The ideal scenario is that core Avatica classes do not depend on protobuf
> > > and jackson (possibly by moving the protocol part out of core to make
> > > sure concerns stay separated) but the effort is not trivial and would
> > > result in some API breakage.
> > >
> > > Would people be interested in addressing this issue, and okay to
> > > introduce some public API breakage? Or are there other alternatives to
> > > be considered?
> > >
> > > Cheers,
> > >
> > > Laurent
> >
> > --
> > *István Tóth* | Sr. Staff Software Engineer
> > *Email*: [email protected]
> > cloudera.com <https://www.cloudera.com>
