Hi Lokendra,

a) Drill has two definitions for "schema":

   - the first is the schema of a table (the table structure: column
   names and types)
   - the second is a schema in the sense of a database ("plugin name" +
   "workspace name") that Drill can query [1]

registerSchemas() deals with the second meaning. The schema list is
therefore obtained from the storage plugins config file together with the
list of workspaces.
You can find more details in the current PR for the SYSLOG format plugin [2].
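
To illustrate the second meaning, here is a rough sketch of how queryable
schema names could be derived from a plugin name plus its workspaces, and why
a plugin that registers nothing cannot be resolved. It is a toy model only:
SchemaRegistrySketch and its methods are hypothetical names and are not
Drill's SchemaFactory API, and the "dfs" plugin with workspaces "root" and
"tmp" is used purely as an example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: hypothetical types, NOT Drill's SchemaFactory API. */
public class SchemaRegistrySketch {

    /** Maps a queryable schema name (e.g. "dfs.tmp") to the plugin that owns it. */
    private final Map<String, String> schemas = new LinkedHashMap<>();

    /** Registers the plugin name itself plus one schema per workspace. */
    public void register(String pluginName, List<String> workspaces) {
        schemas.put(pluginName, pluginName);
        for (String workspace : workspaces) {
            schemas.put(pluginName + "." + workspace, pluginName);
        }
    }

    /** Returns the schema names in registration order. */
    public List<String> schemaNames() {
        return new ArrayList<>(schemas.keySet());
    }

    /** Resolves a schema name to its plugin, or fails if it was never registered. */
    public String resolve(String schemaName) {
        String plugin = schemas.get(schemaName);
        if (plugin == null) {
            throw new IllegalArgumentException("Schema [" + schemaName + "] is not valid");
        }
        return plugin;
    }

    public static void main(String[] args) {
        SchemaRegistrySketch registry = new SchemaRegistrySketch();
        registry.register("dfs", Arrays.asList("root", "tmp"));
        System.out.println(registry.schemaNames());
        System.out.println(registry.resolve("dfs.tmp"));
        try {
            // "myplugin" never registered anything, so the lookup fails.
            registry.resolve("myplugin");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the failed lookup for "myplugin" plays the role of the
validation error you saw: the name was never added to the registry, so there
is nothing to resolve the query against.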

Since these two definitions cause confusion, we could consider dropping the
second one in Drill 2.0.0.

b) The first conversion is done at the planning stage, and the second one
at the execution stage. The planning stage will not always return exact
column types.
But if it returns proper data types (not ANY), they can be used further
without additional work.
This is expected to be improved as part of the "Schema provision" work [3].
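
As one possible reading of the two stages, here is a small sketch in which
planning returns ANY for columns it cannot type, and execution only has to
work out the types that planning left as ANY. It is a toy model under my own
assumptions: TwoStageTypesSketch, ColumnType, planningTypes and
executionTypes are hypothetical names, not Drill's DrillTable or RecordReader
API, and inferring the type from the first row is just a stand-in for what a
reader does while loading value vectors.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: hypothetical names, NOT Drill's DrillTable or RecordReader API. */
public class TwoStageTypesSketch {

    public enum ColumnType { ANY, INT, VARCHAR }

    /** Planning stage: use a declared type when one is known, otherwise ANY. */
    public static Map<String, ColumnType> planningTypes(
            String[] columns, Map<String, ColumnType> declared) {
        Map<String, ColumnType> result = new LinkedHashMap<>();
        for (String column : columns) {
            result.put(column, declared.getOrDefault(column, ColumnType.ANY));
        }
        return result;
    }

    /** Execution stage: keep concrete planning types, infer only the ANY columns. */
    public static Map<String, ColumnType> executionTypes(
            Map<String, ColumnType> planned, Map<String, Object> firstRow) {
        Map<String, ColumnType> result = new LinkedHashMap<>();
        for (Map.Entry<String, ColumnType> entry : planned.entrySet()) {
            ColumnType type = entry.getValue();
            if (type == ColumnType.ANY) {
                // Assumption for the sketch: anything that is not an Integer is text.
                Object value = firstRow.get(entry.getKey());
                type = (value instanceof Integer) ? ColumnType.INT : ColumnType.VARCHAR;
            }
            result.put(entry.getKey(), type);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, ColumnType> declared = new HashMap<>();
        declared.put("id", ColumnType.INT); // only "id" is known at planning time

        Map<String, ColumnType> planned =
                planningTypes(new String[] {"id", "name"}, declared);
        System.out.println("planning:  " + planned);

        Map<String, Object> firstRow = new HashMap<>();
        firstRow.put("id", 7);
        firstRow.put("name", "drill");
        System.out.println("execution: " + executionTypes(planned, firstRow));
    }
}
```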

[1] https://drill.apache.org/docs/show-databases-and-show-schemas/
[2] https://github.com/apache/drill/pull/1530
[3] https://issues.apache.org/jira/browse/DRILL-6835


Kind regards
Vitalii


On Fri, Feb 1, 2019 at 2:19 AM Lokendra Singh Panwar <[email protected]>
wrote:

> Hi All,
>
> I am relatively new to Drill and trying to write a custom storage plugin.
>
> I have a couple of (naive sounding) queries, so I mostly need some brief
> pointers:
>
> a) Why does a StoragePlugin have to implement registerSchemas() (coming
> from SchemaFactory)? I assumed that Drill would discover the data schema
> on the fly, so there shouldn't be a need for the plugin to register it
> beforehand.
>
> (I created a version of my plugin and skipped implementing the
> registerSchemas() method, assuming the schema would be discovered, then
> tried "SELECT * FROM myplugin.`tableid`" and it threw "VALIDATION_ERROR:
> Schema [[myplugin]] is not valid with respect to either root schema or
> current default schema". So I suspect that might be due to me not
> implementing registerSchemas(), hence the question.)
>
> b) Similarly, I see plugins creating their own DrillTable class extending
> either DrillTable or DynamicDrillTable and then overriding the
> RelDataType getRowType(RelDataTypeFactory typeFactory) method, which seems
> to convert the relation items to Drill types. But I see a similar type
> conversion also being done in the RecordReader classes when creating and
> loading the value vectors. Am I reading it right that we are doing it
> twice?
>
> Any pointers will be greatly appreciated.
>
> Thanks,
> Lokendra
>
