Hi Paul, 
In principle, I like the idea.  Currently, the main sticking point with format 
and storage plugins is the protobuf.  But for that, they would be completely 
“pluggable". 


> On Jan 7, 2020, at 5:54 PM, Paul Rogers <[email protected]> wrote:
> 
> Hi All,
> 
> Wanted to chime in on this topic. We've long talked about the idea of 
> building plugins separately from Drill itself; but have never had the 
> resources to achieve this goal. Turns out Presto has a nice, simple way to 
> build plugins separately from Presto itself. [1]
> 
> If Drill were to adopt something similar, then we could divide plugins into 
> two tiers: core plugins supported by the Drill project (with the kind of 
> testing Arena suggests), and true contributed plugins that are maintained by 
> others, using whatever form of testing works for them.
> 
> Over time, if we find that a plugin becomes "stale" (lack of support or 
> usage), we can perhaps "demote" it to contributed status as a way of managing 
> the fact that we can't/won't do full testing on stale plugins.
> 
> Presto uses the Java ServiceLoader [2] class to load plugins in a class 
> loader separate from that for Presto itself. A side benefit of such an 
> approach is that, with the right class loader, we get isolation of plugin and 
> Drill dependencies: different plugins can use different Guava versions 
> without the conflicts that we ran into in the run-up to the recent release.
> 
> At some point we can make a simple trade-off: it might cost less to follow 
> Presto's lead than to continue to deal with the conflicts and ambiguities in 
> our current "everything in the Drill core" approach.
> 
> Thanks,
> - Paul
> 
> [1] https://prestodb.io/docs/current/develop/spi-overview.html
> 
> [2] https://docs.oracle.com/javase/7/docs/api/java/util/ServiceLoader.html
> 
> 
> 
> 
> 
> 
>    On Friday, January 3, 2020, 11:04:53 AM PST, Charles Givre 
> <[email protected]> wrote:  
> 
> Arina, 
> Thanks for the response.  See responses inline:
> 
>> On Jan 3, 2020, at 4:40 AM, Arina Yelchiyeva <[email protected]> 
>> wrote:
>> 
>> Hi Charles,
>> 
>> I would expect contributor to provide efficient way to test storage plugins. 
>> Depending on the source, some have embedded versions, some don’t. So 
>> contributor should provide instructions how plugin can be tested.
>> Ideally, tests should be able to run without dependency on external system. 
>> Kudu is old plugin and does not comply with Drill coding standards, so the 
>> fact it does not have unit tests does not mean that other won’t have.
> 
> I wasn't suggesting that new plugins shouldn't have good testing ;-). Far 
> from it.  What I am wondering is whether it is necessary to have tests for 
> every class, or can we go for good coverage by providing a lot of query unit 
> tests that test a lot of different cases?
> 
> Let's say I'm writing a plugin for Druid and Druid doesn't have an embedded 
> mode. (I have no idea whether Druid does or doesn't, just using it as an 
> example.)  Would it be acceptable for a developer to provide a Docker 
> container loaded with Druid and a small dataset to serve as a testbed for the 
> unit tests?  Just FYI, for the API plugin, I wrote 16 or so unit tests using 
> a mock webserver, but that one was relatively easy because there are no 
> shortage of libraries for testing HTTP responses.
> 
> 
>> 
>> I would suggest we won’t accept storage plugins that does not have good unit 
>> tests coverage, documentations and code quality. Since plugins are easily 
>> pluggable, if plugin does not qualify Drill coding standards, it can be 
>> build manually and added as jar to the Drill classpath.
> 
> I completely agree. One of my biggest frustrations when writing the Drill 
> book was the amount of undocumented functionality in Drill.  One thing 
> however, is that I don't think plugins are truly "pluggable" yet because most 
> if not all require updates to the protobufs. 
> 
>> From my perspective, Drill users (or any users) expect if code is available 
>> from official build, it means it fully tested and works. Otherwise, users 
>> would complain that somethings does not work and say the Drill product is of 
>> bad quality.
> 
>> 
>> Regarding code standards, this should apply to all PRs. We don’t have many 
>> code reviewers for the project and submitting PRs that have poor code 
>> quality would mean that these people (who voluntarily spend time reviewing 
>> and understanding someone code) would have to spend lots of time reviewing 
>> and correcting simple things. Many Apache projects have high code quality 
>> standards, much higher than Drill, so I think it would be fair if 
>> contributor would spend more time making code better than expect reviewer to 
>> point to every trivial issue.
>> 
>> Kind regards,
>> Arina
>> 
>>> On Jan 2, 2020, at 8:50 PM, Charles Givre <[email protected]> wrote:
>>> 
>>> Hello Drill Devs, 
>>> I wanted to ask a question about testing storage plugins.  Currently there 
>>> are PRs for storage plugins for Apache Druid 
>>> (https://github.com/apache/drill/pull/1888 
>>> <https://github.com/apache/drill/pull/1888>), and a generic REST API plugin 
>>> (https://github.com/apache/drill/pull/1892 
>>> <https://github.com/apache/drill/pull/1892>).  
>>> 
>>> I wanted to ask about what is necessary with respect to testing to get a 
>>> storage plugin accepted?  I looked at some of the others, and some, like 
>>> the HBase plugin, have many unit tests and others, like the Kudu plugin 
>>> barely have any.  Also, since obviously, these unit tests require an 
>>> external system (or at least a docker container of one) what should a 
>>> contributor provide with a storage plugin?  Should they provide a docker 
>>> container with a data set pre-loaded, or should the unit tests be marked 
>>> "Ignore"?
>>> 
>>> When we do releases, are we running the tests against external systems?
>>> Thanks,
>>> -- C
>>> 
>>> 
>>> 
>>> 
>> 

Reply via email to