[
https://issues.apache.org/jira/browse/DRILL-8474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18035640#comment-18035640
]
ASF GitHub Bot commented on DRILL-8474:
---------------------------------------
cgivre commented on PR #2989:
URL: https://github.com/apache/drill/pull/2989#issuecomment-3492642333
> I guess I don't understand how this connects to queries. This looks like
some very nice registry behavior that handles the distributed nature of Drill
w.r.t moving a schema jar around and making it available on the classpath for
Daffodil to use in every drill bit.
>
> But how then does a query access things in the jar? Is there some sort of
path/access mechanism to load things from the jar? I.e., the jar ends up on the
class path, and then normal Java loading i.e., getResource() calls, are used to
get stuff out of the jar?
>
> I guess I'm looking for the piece that puts this registry together with a
query that uses it.
I added a new parameter to the Daffodil reader: `schemaFile`. If it is
defined, Drill will look in the persistent storage for a schema file.
So the workflow would be:
1. Register the schema:
```sql
CREATE DAFFODIL SCHEMA USING JAR 'schema.jar'
```
That schema will now be propagated to all Drillbits and is ready to use.
2. Query data:
```sql
SELECT * FROM table(dfs.`data/data06Int.dat`
(type => 'daffodil',
validationMode => 'true',
schemaFile => 'schema.jar',
rootName => 'row',
rootNamespace => null))
```
In theory, Drill should handle all of the file management. The `schemaURI`
parameter functions as before.
The query language uses the word `JAR`, but the schema files can be anything
supported by Daffodil.
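For anyone assembling the step 2 query programmatically, here is a minimal sketch of building the table-function call as a string. The class and method names (`DaffodilQueryBuilder`, `build`, `quote`) are hypothetical and not part of Drill; only the parameter names (`type`, `validationMode`, `schemaFile`, `rootName`, `rootNamespace`) are taken from the query above. It does not escape backticks in the data path.

```java
// Hypothetical helper, not part of Drill: formats the table-function query
// shown in step 2 from a data path, a registered schema file, and a root name.
final class DaffodilQueryBuilder {
    private DaffodilQueryBuilder() {}

    // Wraps a value in single quotes, doubling any embedded single quote.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    static String build(String dataPath, String schemaFile, String rootName) {
        return "SELECT * FROM table(dfs.`" + dataPath + "`\n"
            + "(type => 'daffodil',\n"
            + "validationMode => 'true',\n"
            + "schemaFile => " + quote(schemaFile) + ",\n"
            + "rootName => " + quote(rootName) + ",\n"
            + "rootNamespace => null))";
    }

    public static void main(String[] args) {
        // Prints the same query text as in step 2.
        System.out.println(build("data/data06Int.dat", "schema.jar", "row"));
    }
}
```

The resulting string could then be submitted through whichever Drill client is already in use.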
> Add Daffodil Format Plugin
> --------------------------
>
> Key: DRILL-8474
> URL: https://issues.apache.org/jira/browse/DRILL-8474
> Project: Apache Drill
> Issue Type: New Feature
> Affects Versions: 1.21.1
> Reporter: Charles Givre
> Priority: Major
>