[ 
https://issues.apache.org/jira/browse/DRILL-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802826#comment-16802826
 ] 

ASF GitHub Bot commented on DRILL-7011:
---------------------------------------

arina-ielchiieva commented on issue #1711: DRILL-7011: Support schema in scan 
framework
URL: https://github.com/apache/drill/pull/1711#issuecomment-477160154
 
 
   @paul-rogers the command syntax is as follows:
   ```
   CREATE [OR REPLACE] SCHEMA
   [LOAD 'file:///path/to/file']
   [(column_name data_type nullability,...)]
   [FOR TABLE `table_name`]
   [PATH 'file:///schema_file_path/schema_file_name'] 
   [PROPERTIES ('key1'='value1', 'key2'='value2', ...)]
   ```
   `PROPERTIES` should be provided in parentheses as key/value pairs, where 
the value follows the key and an equals sign, and each is enclosed in single 
quotes. Key/value pairs should be separated by commas.
   ```
   create schema
   (col1 int, col2 int)
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
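   As a rough illustration of the quoting rules above, the sketch below 
renders a map of table properties into a `properties (...)` clause. The class 
and method names (`SchemaPropertiesSketch`, `tableProperties`) are hypothetical 
and not part of Drill, and the sketch assumes keys and values contain no 
single quotes, since no escaping is done.
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;
   import java.util.stream.Collectors;

   // Hypothetical helper, not part of Drill: formats table properties
   // using the syntax described in this comment.
   final class SchemaPropertiesSketch {

     // Each entry becomes 'key' = 'value'; entries are comma-separated
     // and the whole list is wrapped in parentheses.
     // Assumption: keys and values contain no single quotes (no escaping).
     static String tableProperties(Map<String, String> props) {
       return props.entrySet().stream()
           .map(e -> "'" + e.getKey() + "' = '" + e.getValue() + "'")
           .collect(Collectors.joining(", ", "properties (", ")"));
     }

     public static void main(String[] args) {
       // LinkedHashMap keeps insertion order so the output is stable.
       Map<String, String> props = new LinkedHashMap<>();
       props.put("drill.strict", "true");
       props.put("some_other_prop", "val");
       // prints: properties ('drill.strict' = 'true', 'some_other_prop' = 'val')
       System.out.println(tableProperties(props));
     }
   }
   ```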
   In the `.drill.schema` JSON file, this would look as follows:
   ```
   {
     "table" : "dfs.tmp.`t`",
     "schema" : {
       "columns" : [
         {
           "name" : "col1",
           "type" : "INT",
           "mode" : "OPTIONAL"
         },
         {
           "name" : "col2",
           "type" : "INT",
           "mode" : "OPTIONAL"
         }
       ],
       "properties" : {
         "drill.strict" : "true",
         "some_other_prop" : "val"
       }
     },
     "version" : 1
   }
   ```
   During deserialization, the properties are stored in the `TupleMetadata` 
class (use the `property(key)` and `properties` methods to extract them).
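   A minimal sketch of how a consumer might interpret such properties once 
extracted. It uses a plain `Map<String, String>` as a stand-in rather than the 
real `TupleMetadata` class, and it assumes that a missing `drill.strict` key 
means a partial (non-strict) schema. `StrictModeSketch` and `isStrict` are 
hypothetical names.
   ```java
   import java.util.Collections;
   import java.util.Map;

   // Hypothetical stand-in: works on a plain map, not on Drill's TupleMetadata.
   final class StrictModeSketch {

     static final String STRICT_KEY = "drill.strict";

     // Assumption: an absent key means the schema is partial (not strict).
     static boolean isStrict(Map<String, String> properties) {
       return Boolean.parseBoolean(properties.getOrDefault(STRICT_KEY, "false"));
     }

     public static void main(String[] args) {
       Map<String, String> strict = Collections.singletonMap(STRICT_KEY, "true");
       Map<String, String> empty = Collections.emptyMap();
       System.out.println(isStrict(strict)); // prints: true
       System.out.println(isStrict(empty));  // prints: false
     }
   }
   ```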
   
   To add column properties, use similar syntax, but with curly braces 
instead of parentheses:
   ```
   create schema
   (col1 int properties {'drill.strict' = 'true'}, col2 int)
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   JSON output:
   ```
   {
     "table" : "dfs.tmp.`t`",
     "schema" : {
       "columns" : [
         {
           "name" : "col1",
           "type" : "INT",
           "mode" : "OPTIONAL",
           "properties" : {
             "drill.strict" : "true"
           }
         },
         {
           "name" : "col2",
           "type" : "INT",
           "mode" : "OPTIONAL"
         }
       ],
       "properties" : {
         "drill.strict" : "true",
         "some_other_prop" : "val"
       }
     },
     "version" : 1
   }
   ```
   Please let me know if there are any other syntax-related questions.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow hybrid model in the Row set-based scan framework
> ------------------------------------------------------
>
>                 Key: DRILL-7011
>                 URL: https://issues.apache.org/jira/browse/DRILL-7011
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.15.0
>            Reporter: Arina Ielchiieva
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.16.0
>
>
> As part of the schema provisioning project, we want to allow a hybrid model 
> in the Row set-based scan framework, namely to allow passing custom schema 
> metadata, which can be partial.
> Currently, schema provisioning has a SchemaContainer class that contains the 
> following information (which can be obtained from the metastore, a schema 
> file, or a table function):
> 1. the schema, represented by 
> org.apache.drill.exec.record.metadata.TupleMetadata
> 2. properties, represented by Map<String, String>, which can indicate 
> whether the schema is strict or partial (default is partial), etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
