Hi Paul, In principle, I like the idea. Currently, the main sticking point with format and storage plugins is the protobuf. But for that, they would be completely “pluggable".
> On Jan 7, 2020, at 5:54 PM, Paul Rogers <[email protected]> wrote: > > Hi All, > > Wanted to chime in on this topic. We've long talked about the idea of > building plugins separately from Drill itself; but have never had the > resources to achieve this goal. Turns out Presto has a nice, simple way to > build plugins separately from Presto itself. [1] > > If Drill were to adopt something similar, then we could divide plugins into > two tiers: core plugins supported by the Drill project (with the kind of > testing Arena suggests), and true contributed plugins that are maintained by > others, using whatever form of testing works for them. > > Over time, if we find that a plugin becomes "stale" (lack of support or > usage), we can perhaps "demote" it to contributed status as a way of managing > the fact that we can't/won't do full testing on stale plugins. > > Presto uses the Java ServiceLoader [2] class to load plugins in a class > loader separate from that for Presto itself. A side benefit of such an > approach is that, with the right class loader, we get isolation of plugin and > Drill dependencies: different plugins can use different Guava versions > without the conflicts that we ran into in the run-up to the recent release. > > At some point we can make a simple trade-off: it might cost less to follow > Presto's lead than to continue to deal with the conflicts and ambiguities in > our current "everything in the Drill core" approach. > > Thanks, > - Paul > > [1] https://prestodb.io/docs/current/develop/spi-overview.html > > [2] https://docs.oracle.com/javase/7/docs/api/java/util/ServiceLoader.html > > > > > > > On Friday, January 3, 2020, 11:04:53 AM PST, Charles Givre > <[email protected]> wrote: > > Arina, > Thanks for the response. See responses inline: > >> On Jan 3, 2020, at 4:40 AM, Arina Yelchiyeva <[email protected]> >> wrote: >> >> Hi Charles, >> >> I would expect contributor to provide efficient way to test storage plugins. >> Depending on the source, some have embedded versions, some don’t. So >> contributor should provide instructions how plugin can be tested. >> Ideally, tests should be able to run without dependency on external system. >> Kudu is old plugin and does not comply with Drill coding standards, so the >> fact it does not have unit tests does not mean that other won’t have. > > I wasn't suggesting that new plugins shouldn't have good testing ;-). Far > from it. What I am wondering is whether it is necessary to have tests for > every class, or can we go for good coverage by providing a lot of query unit > tests that test a lot of different cases? > > Let's say I'm writing a plugin for Druid and Druid doesn't have an embedded > mode. (I have no idea whether Druid does or doesn't, just using it as an > example.) Would it be acceptable for a developer to provide a Docker > container loaded with Druid and a small dataset to serve as a testbed for the > unit tests? Just FYI, for the API plugin, I wrote 16 or so unit tests using > a mock webserver, but that one was relatively easy because there are no > shortage of libraries for testing HTTP responses. > > >> >> I would suggest we won’t accept storage plugins that does not have good unit >> tests coverage, documentations and code quality. Since plugins are easily >> pluggable, if plugin does not qualify Drill coding standards, it can be >> build manually and added as jar to the Drill classpath. > > I completely agree. One of my biggest frustrations when writing the Drill > book was the amount of undocumented functionality in Drill. One thing > however, is that I don't think plugins are truly "pluggable" yet because most > if not all require updates to the protobufs. > >> From my perspective, Drill users (or any users) expect if code is available >> from official build, it means it fully tested and works. Otherwise, users >> would complain that somethings does not work and say the Drill product is of >> bad quality. > >> >> Regarding code standards, this should apply to all PRs. We don’t have many >> code reviewers for the project and submitting PRs that have poor code >> quality would mean that these people (who voluntarily spend time reviewing >> and understanding someone code) would have to spend lots of time reviewing >> and correcting simple things. Many Apache projects have high code quality >> standards, much higher than Drill, so I think it would be fair if >> contributor would spend more time making code better than expect reviewer to >> point to every trivial issue. >> >> Kind regards, >> Arina >> >>> On Jan 2, 2020, at 8:50 PM, Charles Givre <[email protected]> wrote: >>> >>> Hello Drill Devs, >>> I wanted to ask a question about testing storage plugins. Currently there >>> are PRs for storage plugins for Apache Druid >>> (https://github.com/apache/drill/pull/1888 >>> <https://github.com/apache/drill/pull/1888>), and a generic REST API plugin >>> (https://github.com/apache/drill/pull/1892 >>> <https://github.com/apache/drill/pull/1892>). >>> >>> I wanted to ask about what is necessary with respect to testing to get a >>> storage plugin accepted? I looked at some of the others, and some, like >>> the HBase plugin, have many unit tests and others, like the Kudu plugin >>> barely have any. Also, since obviously, these unit tests require an >>> external system (or at least a docker container of one) what should a >>> contributor provide with a storage plugin? Should they provide a docker >>> container with a data set pre-loaded, or should the unit tests be marked >>> "Ignore"? >>> >>> When we do releases, are we running the tests against external systems? >>> Thanks, >>> -- C >>> >>> >>> >>> >>
