[druid] branch master updated: Information schema now uses numeric column types (#13777)

progers Fri, 17 Feb 2023 14:39:43 -0800

This is an automated email from the ASF dual-hosted git repository.

progers pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git



The following commit(s) were added to refs/heads/master by this push:
     new 85d36be085 Information schema now uses numeric column types (#13777)
85d36be085 is described below

commit 85d36be0853b6eaa5f2f94be3028a91291ed56b3
Author: Paul Rogers <[email protected]>
AuthorDate: Fri Feb 17 14:39:31 2023 -0800

    Information schema now uses numeric column types (#13777)
    
    Change to use SQL schemas to allow null numeric columns
    
    * Updated docs
---
 docs/querying/sql-metadata-tables.md               | 174 +++++++++---------
 .../sql/calcite/schema/InformationSchema.java      | 203 ++++++++++++---------
 2 files changed, 199 insertions(+), 178 deletions(-)

diff --git a/docs/querying/sql-metadata-tables.md 
b/docs/querying/sql-metadata-tables.md
index 2373062870..bd1c1489d5 100644
--- a/docs/querying/sql-metadata-tables.md
+++ b/docs/querying/sql-metadata-tables.md
@@ -53,50 +53,50 @@ WHERE "TABLE_SCHEMA" = 'druid' AND "TABLE_NAME" = 'foo'
 ### SCHEMATA table
 `INFORMATION_SCHEMA.SCHEMATA` provides a list of all known schemas, which 
include `druid` for standard [Druid Table datasources](datasource.md#table), 
`lookup` for [Lookups](datasource.md#lookup), `sys` for the virtual [System 
metadata tables](#system-schema), and `INFORMATION_SCHEMA` for these virtual 
tables. Tables are allowed to have the same name across different schemas, so 
the schema may be included in an SQL statement to distinguish them, e.g. 
`lookup.table` vs `druid.table`.
 
-|Column|Notes|
-|------|-----|
-|CATALOG_NAME|Always set as `druid`|
-|SCHEMA_NAME|`druid`, `lookup`, `sys`, or `INFORMATION_SCHEMA`|
-|SCHEMA_OWNER|Unused|
-|DEFAULT_CHARACTER_SET_CATALOG|Unused|
-|DEFAULT_CHARACTER_SET_SCHEMA|Unused|
-|DEFAULT_CHARACTER_SET_NAME|Unused|
-|SQL_PATH|Unused|
+|Column|Type|Notes|
+|------|----|-----|
+|CATALOG_NAME|VARCHAR|Always set as `druid`|
+|SCHEMA_NAME|VARCHAR|`druid`, `lookup`, `sys`, or `INFORMATION_SCHEMA`|
+|SCHEMA_OWNER|VARCHAR|Unused|
+|DEFAULT_CHARACTER_SET_CATALOG|VARCHAR|Unused|
+|DEFAULT_CHARACTER_SET_SCHEMA|VARCHAR|Unused|
+|DEFAULT_CHARACTER_SET_NAME|VARCHAR|Unused|
+|SQL_PATH|VARCHAR|Unused|
 
 ### TABLES table
 `INFORMATION_SCHEMA.TABLES` provides a list of all known tables and schemas.
 
-|Column|Notes|
-|------|-----|
-|TABLE_CATALOG|Always set as `druid`|
-|TABLE_SCHEMA|The 'schema' which the table falls under, see [SCHEMATA table 
for details](#schemata-table)|
-|TABLE_NAME|Table name. For the `druid` schema, this is the `dataSource`.|
-|TABLE_TYPE|"TABLE" or "SYSTEM_TABLE"|
-|IS_JOINABLE|If a table is directly joinable if on the right hand side of a 
`JOIN` statement, without performing a subquery, this value will be set to 
`YES`, otherwise `NO`. Lookups are always joinable because they are globally 
distributed among Druid query processing nodes, but Druid datasources are not, 
and will use a less efficient subquery join.|
-|IS_BROADCAST|If a table is 'broadcast' and distributed among all Druid query 
processing nodes, this value will be set to `YES`, such as lookups and Druid 
datasources which have a 'broadcast' load rule, else `NO`.|
+|Column|Type|Notes|
+|------|----|-----|
+|TABLE_CATALOG|VARCHAR|Always set as `druid`|
+|TABLE_SCHEMA|VARCHAR|The 'schema' which the table falls under, see [SCHEMATA 
table for details](#schemata-table)|
+|TABLE_NAME|VARCHAR|Table name. For the `druid` schema, this is the 
`dataSource`.|
+|TABLE_TYPE|VARCHAR|"TABLE" or "SYSTEM_TABLE"|
+|IS_JOINABLE|VARCHAR|If a table is directly joinable if on the right hand side 
of a `JOIN` statement, without performing a subquery, this value will be set to 
`YES`, otherwise `NO`. Lookups are always joinable because they are globally 
distributed among Druid query processing nodes, but Druid datasources are not, 
and will use a less efficient subquery join.|
+|IS_BROADCAST|VARCHAR|If a table is 'broadcast' and distributed among all 
Druid query processing nodes, this value will be set to `YES`, such as lookups 
and Druid datasources which have a 'broadcast' load rule, else `NO`.|
 
 ### COLUMNS table
 `INFORMATION_SCHEMA.COLUMNS` provides a list of all known columns across all 
tables and schema.
 
-|Column|Notes|
-|------|-----|
-|TABLE_CATALOG|Always set as `druid`|
-|TABLE_SCHEMA|The 'schema' which the table column falls under, see [SCHEMATA 
table for details](#schemata-table)|
-|TABLE_NAME|The 'table' which the column belongs to, see [TABLES table for 
details](#tables-table)|
-|COLUMN_NAME|The column name|
-|ORDINAL_POSITION|The order in which the column is stored in a table|
-|COLUMN_DEFAULT|Unused|
-|IS_NULLABLE||
-|DATA_TYPE||
-|CHARACTER_MAXIMUM_LENGTH|Unused|
-|CHARACTER_OCTET_LENGTH|Unused|
-|NUMERIC_PRECISION||
-|NUMERIC_PRECISION_RADIX||
-|NUMERIC_SCALE||
-|DATETIME_PRECISION||
-|CHARACTER_SET_NAME||
-|COLLATION_NAME||
-|JDBC_TYPE|Type code from java.sql.Types (Druid extension)|
+|Column|Type|Notes|
+|------|----|-----|
+|TABLE_CATALOG|VARCHAR|Always set as `druid`|
+|TABLE_SCHEMA|VARCHAR|The 'schema' which the table column falls under, see 
[SCHEMATA table for details](#schemata-table)|
+|TABLE_NAME|VARCHAR|The 'table' which the column belongs to, see [TABLES table 
for details](#tables-table)|
+|COLUMN_NAME|VARCHAR|The column name|
+|ORDINAL_POSITION|BIGINT|The order in which the column is stored in a table|
+|COLUMN_DEFAULT|VARCHAR|Unused|
+|IS_NULLABLE|VARCHAR||
+|DATA_TYPE|VARCHAR||
+|CHARACTER_MAXIMUM_LENGTH|BIGINT|Unused|
+|CHARACTER_OCTET_LENGTH|BIGINT|Unused|
+|NUMERIC_PRECISION|BIGINT||
+|NUMERIC_PRECISION_RADIX|BIGINT||
+|NUMERIC_SCALE|BIGINT||
+|DATETIME_PRECISION|BIGINT||
+|CHARACTER_SET_NAME|VARCHAR||
+|COLLATION_NAME|VARCHAR||
+|JDBC_TYPE|BIGINT|Type code from java.sql.Types (Druid extension)|
 
 For example, this query returns [data type](sql-data-types.md) information for 
columns in the `foo` table:
 
@@ -119,24 +119,24 @@ Segments table provides details on all Druid segments, 
whether they are publishe
 
 |Column|Type|Notes|
 |------|-----|-----|
-|segment_id|STRING|Unique segment identifier|
-|datasource|STRING|Name of datasource|
-|start|STRING|Interval start time (in ISO 8601 format)|
-|end|STRING|Interval end time (in ISO 8601 format)|
-|size|LONG|Size of segment in bytes|
-|version|STRING|Version string (generally an ISO8601 timestamp corresponding 
to when the segment set was first started). Higher version means the more 
recently created segment. Version comparing is based on string comparison.|
-|partition_num|LONG|Partition number (an integer, unique within a 
datasource+interval+version; may not necessarily be contiguous)|
-|num_replicas|LONG|Number of replicas of this segment currently being served|
-|num_rows|LONG|Number of rows in this segment, or zero if the number of rows 
is not known.<br /><br />This row count is gathered by the Broker in the 
background. It will be zero if the Broker has not gathered a row count for this 
segment yet. For segments ingested from streams, the reported row count may lag 
behind the result of a `count(*)` query because the cached `num_rows` on the 
Broker may be out of date. This will settle shortly after new rows stop being 
written to that particular  [...]
-|is_active|LONG|True for segments that represent the latest state of a 
datasource.<br /><br />Equivalent to `(is_published = 1 AND is_overshadowed = 
0) OR is_realtime = 1`. In steady state, when no ingestion or data management 
operations are happening, `is_active` will be equivalent to `is_available`. 
However, they may differ from each other when ingestion or data management 
operations have executed recently. In these cases, Druid will load and unload 
segments appropriately to bring actu [...]
-|is_published|LONG|Boolean represented as long type where 1 = true, 0 = false. 
1 if this segment has been published to the metadata store and is marked as 
used. See the [segment lifecycle 
documentation](../design/architecture.md#segment-lifecycle) for more details.|
-|is_available|LONG|Boolean represented as long type where 1 = true, 0 = false. 
1 if this segment is currently being served by any data serving process, like a 
Historical or a realtime ingestion task. See the [segment lifecycle 
documentation](../design/architecture.md#segment-lifecycle) for more details.|
-|is_realtime|LONG|Boolean represented as long type where 1 = true, 0 = false. 
1 if this segment is _only_ served by realtime tasks, and 0 if any Historical 
process is serving this segment.|
-|is_overshadowed|LONG|Boolean represented as long type where 1 = true, 0 = 
false. 1 if this segment is published and is _fully_ overshadowed by some other 
published segments. Currently, `is_overshadowed` is always 0 for unpublished 
segments, although this may change in the future. You can filter for segments 
that "should be published" by filtering for `is_published = 1 AND 
is_overshadowed = 0`. Segments can briefly be both published and overshadowed 
if they were recently replaced, but ha [...]
-|shard_spec|STRING|JSON-serialized form of the segment `ShardSpec`|
-|dimensions|STRING|JSON-serialized form of the segment dimensions|
-|metrics|STRING|JSON-serialized form of the segment metrics|
-|last_compaction_state|STRING|JSON-serialized form of the compaction task's 
config (compaction task which created this segment). May be null if segment was 
not created by compaction task.|
+|segment_id|VARCHAR|Unique segment identifier|
+|datasource|VARCHAR|Name of datasource|
+|start|VARCHAR|Interval start time (in ISO 8601 format)|
+|end|VARCHAR|Interval end time (in ISO 8601 format)|
+|size|BIGINT|Size of segment in bytes|
+|version|VARCHAR|Version string (generally an ISO8601 timestamp corresponding 
to when the segment set was first started). Higher version means the more 
recently created segment. Version comparing is based on string comparison.|
+|partition_num|BIGINT|Partition number (an integer, unique within a 
datasource+interval+version; may not necessarily be contiguous)|
+|num_replicas|BIGINT|Number of replicas of this segment currently being served|
+|num_rows|BIGINT|Number of rows in this segment, or zero if the number of rows 
is not known.<br /><br />This row count is gathered by the Broker in the 
background. It will be zero if the Broker has not gathered a row count for this 
segment yet. For segments ingested from streams, the reported row count may lag 
behind the result of a `count(*)` query because the cached `num_rows` on the 
Broker may be out of date. This will settle shortly after new rows stop being 
written to that particula [...]
+|is_active|BIGINT|True for segments that represent the latest state of a 
datasource.<br /><br />Equivalent to `(is_published = 1 AND is_overshadowed = 
0) OR is_realtime = 1`. In steady state, when no ingestion or data management 
operations are happening, `is_active` will be equivalent to `is_available`. 
However, they may differ from each other when ingestion or data management 
operations have executed recently. In these cases, Druid will load and unload 
segments appropriately to bring ac [...]
+|is_published|BIGINT|Boolean represented as long type where 1 = true, 0 = 
false. 1 if this segment has been published to the metadata store and is marked 
as used. See the [segment lifecycle 
documentation](../design/architecture.md#segment-lifecycle) for more details.|
+|is_available|BIGINT|Boolean represented as long type where 1 = true, 0 = 
false. 1 if this segment is currently being served by any data serving process, 
like a Historical or a realtime ingestion task. See the [segment lifecycle 
documentation](../design/architecture.md#segment-lifecycle) for more details.|
+|is_realtime|BIGINT|Boolean represented as long type where 1 = true, 0 = 
false. 1 if this segment is _only_ served by realtime tasks, and 0 if any 
Historical process is serving this segment.|
+|is_overshadowed|BIGINT|Boolean represented as long type where 1 = true, 0 = 
false. 1 if this segment is published and is _fully_ overshadowed by some other 
published segments. Currently, `is_overshadowed` is always 0 for unpublished 
segments, although this may change in the future. You can filter for segments 
that "should be published" by filtering for `is_published = 1 AND 
is_overshadowed = 0`. Segments can briefly be both published and overshadowed 
if they were recently replaced, but  [...]
+|shard_spec|VARCHAR|JSON-serialized form of the segment `ShardSpec`|
+|dimensions|VARCHAR|JSON-serialized form of the segment dimensions|
+|metrics|VARCHAR|JSON-serialized form of the segment metrics|
+|last_compaction_state|VARCHAR|JSON-serialized form of the compaction task's 
config (compaction task which created this segment). May be null if segment was 
not created by compaction task.|
 
 For example, to retrieve all currently active segments for datasource 
"wikipedia", use the query:
 
@@ -199,15 +199,15 @@ Servers table lists all discovered servers in the cluster.
 
 |Column|Type|Notes|
 |------|-----|-----|
-|server|STRING|Server name in the form host:port|
-|host|STRING|Hostname of the server|
-|plaintext_port|LONG|Unsecured port of the server, or -1 if plaintext traffic 
is disabled|
-|tls_port|LONG|TLS port of the server, or -1 if TLS is disabled|
-|server_type|STRING|Type of Druid service. Possible values include: 
COORDINATOR, OVERLORD,  BROKER, ROUTER, HISTORICAL, MIDDLE_MANAGER or PEON.|
-|tier|STRING|Distribution tier see 
[druid.server.tier](../configuration/index.md#historical-general-configuration).
 Only valid for HISTORICAL type, for other types it's null|
-|current_size|LONG|Current size of segments in bytes on this server. Only 
valid for HISTORICAL type, for other types it's 0|
-|max_size|LONG|Max size in bytes this server recommends to assign to segments 
see 
[druid.server.maxSize](../configuration/index.md#historical-general-configuration).
 Only valid for HISTORICAL type, for other types it's 0|
-|is_leader|LONG|1 if the server is currently the 'leader' (for services which 
have the concept of leadership), otherwise 0 if the server is not the leader, 
or the default long value (0 or null depending on 
`druid.generic.useDefaultValueForNull`) if the server type does not have the 
concept of leadership|
+|server|VARCHAR|Server name in the form host:port|
+|host|VARCHAR|Hostname of the server|
+|plaintext_port|BIGINT|Unsecured port of the server, or -1 if plaintext 
traffic is disabled|
+|tls_port|BIGINT|TLS port of the server, or -1 if TLS is disabled|
+|server_type|VARCHAR|Type of Druid service. Possible values include: 
COORDINATOR, OVERLORD,  BROKER, ROUTER, HISTORICAL, MIDDLE_MANAGER or PEON.|
+|tier|VARCHAR|Distribution tier see 
[druid.server.tier](../configuration/index.md#historical-general-configuration).
 Only valid for HISTORICAL type, for other types it's null|
+|current_size|BIGINT|Current size of segments in bytes on this server. Only 
valid for HISTORICAL type, for other types it's 0|
+|max_size|BIGINT|Max size in bytes this server recommends to assign to 
segments see 
[druid.server.maxSize](../configuration/index.md#historical-general-configuration).
 Only valid for HISTORICAL type, for other types it's 0|
+|is_leader|BIGINT|1 if the server is currently the 'leader' (for services 
which have the concept of leadership), otherwise 0 if the server is not the 
leader, or the default long value (0 or null depending on 
`druid.generic.useDefaultValueForNull`) if the server type does not have the 
concept of leadership|
 
 To retrieve information about all servers, use the query:
 
@@ -221,8 +221,8 @@ SERVER_SEGMENTS is used to join servers with segments table
 
 |Column|Type|Notes|
 |------|-----|-----|
-|server|STRING|Server name in format host:port (Primary key of [servers 
table](#servers-table))|
-|segment_id|STRING|Segment identifier (Primary key of [segments 
table](#segments-table))|
+|server|VARCHAR|Server name in format host:port (Primary key of [servers 
table](#servers-table))|
+|segment_id|VARCHAR|Segment identifier (Primary key of [segments 
table](#segments-table))|
 
 JOIN between "servers" and "segments" can be used to query the number of 
segments for a specific datasource,
 grouped by server, example query:
@@ -244,20 +244,20 @@ check out the documentation for [ingestion 
tasks](../ingestion/tasks.md).
 
 |Column|Type|Notes|
 |------|-----|-----|
-|task_id|STRING|Unique task identifier|
-|group_id|STRING|Task group ID for this task, the value depends on the task 
`type`. For example, for native index tasks, it's same as `task_id`, for sub 
tasks, this value is the parent task's ID|
-|type|STRING|Task type, for example this value is "index" for indexing tasks. 
See [tasks-overview](../ingestion/tasks.md)|
-|datasource|STRING|Datasource name being indexed|
-|created_time|STRING|Timestamp in ISO8601 format corresponding to when the 
ingestion task was created. Note that this value is populated for completed and 
waiting tasks. For running and pending tasks this value is set to 
1970-01-01T00:00:00Z|
-|queue_insertion_time|STRING|Timestamp in ISO8601 format corresponding to when 
this task was added to the queue on the Overlord|
-|status|STRING|Status of a task can be RUNNING, FAILED, SUCCESS|
-|runner_status|STRING|Runner status of a completed task would be NONE, for 
in-progress tasks this can be RUNNING, WAITING, PENDING|
-|duration|LONG|Time it took to finish the task in milliseconds, this value is 
present only for completed tasks|
-|location|STRING|Server name where this task is running in the format 
host:port, this information is present only for RUNNING tasks|
-|host|STRING|Hostname of the server where task is running|
-|plaintext_port|LONG|Unsecured port of the server, or -1 if plaintext traffic 
is disabled|
-|tls_port|LONG|TLS port of the server, or -1 if TLS is disabled|
-|error_msg|STRING|Detailed error message in case of FAILED tasks|
+|task_id|VARCHAR|Unique task identifier|
+|group_id|VARCHAR|Task group ID for this task, the value depends on the task 
`type`. For example, for native index tasks, it's same as `task_id`, for sub 
tasks, this value is the parent task's ID|
+|type|VARCHAR|Task type, for example this value is "index" for indexing tasks. 
See [tasks-overview](../ingestion/tasks.md)|
+|datasource|VARCHAR|Datasource name being indexed|
+|created_time|VARCHAR|Timestamp in ISO8601 format corresponding to when the 
ingestion task was created. Note that this value is populated for completed and 
waiting tasks. For running and pending tasks this value is set to 
1970-01-01T00:00:00Z|
+|queue_insertion_time|VARCHAR|Timestamp in ISO8601 format corresponding to 
when this task was added to the queue on the Overlord|
+|status|VARCHAR|Status of a task can be RUNNING, FAILED, SUCCESS|
+|runner_status|VARCHAR|Runner status of a completed task would be NONE, for 
in-progress tasks this can be RUNNING, WAITING, PENDING|
+|duration|BIGINT|Time it took to finish the task in milliseconds, this value 
is present only for completed tasks|
+|location|VARCHAR|Server name where this task is running in the format 
host:port, this information is present only for RUNNING tasks|
+|host|VARCHAR|Hostname of the server where task is running|
+|plaintext_port|BIGINT|Unsecured port of the server, or -1 if plaintext 
traffic is disabled|
+|tls_port|BIGINT|TLS port of the server, or -1 if TLS is disabled|
+|error_msg|VARCHAR|Detailed error message in case of FAILED tasks|
 
 For example, to retrieve tasks information filtered by status, use the query
 
@@ -271,14 +271,14 @@ The supervisors table provides information about 
supervisors.
 
 |Column|Type|Notes|
 |------|-----|-----|
-|supervisor_id|STRING|Supervisor task identifier|
-|state|STRING|Basic state of the supervisor. Available states: 
`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, 
`STOPPING`. Check [Kafka 
Docs](../development/extensions-core/kafka-supervisor-operations.md) for 
details.|
-|detailed_state|STRING|Supervisor specific state. (See documentation of the 
specific supervisor for details, e.g. 
[Kafka](../development/extensions-core/kafka-ingestion.md) or 
[Kinesis](../development/extensions-core/kinesis-ingestion.md))|
-|healthy|LONG|Boolean represented as long type where 1 = true, 0 = false. 1 
indicates a healthy supervisor|
-|type|STRING|Type of supervisor, e.g. `kafka`, `kinesis` or 
`materialized_view`|
-|source|STRING|Source of the supervisor, e.g. Kafka topic or Kinesis stream|
-|suspended|LONG|Boolean represented as long type where 1 = true, 0 = false. 1 
indicates supervisor is in suspended state|
-|spec|STRING|JSON-serialized supervisor spec|
+|supervisor_id|VARCHAR|Supervisor task identifier|
+|state|VARCHAR|Basic state of the supervisor. Available states: 
`UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, 
`STOPPING`. Check [Kafka 
Docs](../development/extensions-core/kafka-supervisor-operations.md) for 
details.|
+|detailed_state|VARCHAR|Supervisor specific state. (See documentation of the 
specific supervisor for details, e.g. 
[Kafka](../development/extensions-core/kafka-ingestion.md) or 
[Kinesis](../development/extensions-core/kinesis-ingestion.md))|
+|healthy|BIGINT|Boolean represented as long type where 1 = true, 0 = false. 1 
indicates a healthy supervisor|
+|type|VARCHAR|Type of supervisor, e.g. `kafka`, `kinesis` or 
`materialized_view`|
+|source|VARCHAR|Source of the supervisor, e.g. Kafka topic or Kinesis stream|
+|suspended|BIGINT|Boolean represented as long type where 1 = true, 0 = false. 
1 indicates supervisor is in suspended state|
+|spec|VARCHAR|JSON-serialized supervisor spec|
 
 For example, to retrieve supervisor tasks information filtered by health 
status, use the query
 
diff --git 
a/sql/src/main/java/org/apache/druid/sql/calcite/schema/InformationSchema.java 
b/sql/src/main/java/org/apache/druid/sql/calcite/schema/InformationSchema.java
index 7f16d0ae9f..2e46406cdf 100644
--- 
a/sql/src/main/java/org/apache/druid/sql/calcite/schema/InformationSchema.java
+++ 
b/sql/src/main/java/org/apache/druid/sql/calcite/schema/InformationSchema.java
@@ -46,14 +46,14 @@ import org.apache.calcite.schema.impl.AbstractSchema;
 import org.apache.calcite.schema.impl.AbstractTable;
 import org.apache.calcite.sql.type.SqlTypeName;
 import org.apache.druid.java.util.emitter.EmittingLogger;
-import org.apache.druid.segment.column.ColumnType;
-import org.apache.druid.segment.column.RowSignature;
 import org.apache.druid.server.security.Action;
 import org.apache.druid.server.security.AuthenticationResult;
 import org.apache.druid.server.security.AuthorizationUtils;
 import org.apache.druid.server.security.AuthorizerMapper;
 import org.apache.druid.server.security.Resource;
 import org.apache.druid.server.security.ResourceAction;
+import org.apache.druid.sql.calcite.planner.Calcites;
+import org.apache.druid.sql.calcite.planner.DruidTypeSystem;
 import org.apache.druid.sql.calcite.planner.PlannerContext;
 import org.apache.druid.sql.calcite.table.DruidTable;
 import org.apache.druid.sql.calcite.table.RowSignatures;
@@ -73,44 +73,65 @@ public class InformationSchema extends AbstractSchema
   private static final String SCHEMATA_TABLE = "SCHEMATA";
   private static final String TABLES_TABLE = "TABLES";
   private static final String COLUMNS_TABLE = "COLUMNS";
-  private static final RowSignature SCHEMATA_SIGNATURE = RowSignature
-      .builder()
-      .add("CATALOG_NAME", ColumnType.STRING)
-      .add("SCHEMA_NAME", ColumnType.STRING)
-      .add("SCHEMA_OWNER", ColumnType.STRING)
-      .add("DEFAULT_CHARACTER_SET_CATALOG", ColumnType.STRING)
-      .add("DEFAULT_CHARACTER_SET_SCHEMA", ColumnType.STRING)
-      .add("DEFAULT_CHARACTER_SET_NAME", ColumnType.STRING)
-      .add("SQL_PATH", ColumnType.STRING)
+
+  private static class RowTypeBuilder
+  {
+    final RelDataTypeFactory typeFactory = DruidTypeSystem.TYPE_FACTORY;
+    final RelDataTypeFactory.Builder builder = typeFactory.builder();
+
+    public RowTypeBuilder add(String name, SqlTypeName type)
+    {
+      builder.add(name, Calcites.createSqlTypeWithNullability(typeFactory, 
type, false));
+      return this;
+    }
+
+    public RowTypeBuilder add(String name, SqlTypeName type, boolean nullable)
+    {
+      builder.add(name, Calcites.createSqlTypeWithNullability(typeFactory, 
type, nullable));
+      return this;
+    }
+
+    public RelDataType build()
+    {
+      return builder.build();
+    }
+  }
+
+  private static final RelDataType SCHEMATA_SIGNATURE = new RowTypeBuilder()
+      .add("CATALOG_NAME", SqlTypeName.VARCHAR)
+      .add("SCHEMA_NAME", SqlTypeName.VARCHAR)
+      .add("SCHEMA_OWNER", SqlTypeName.VARCHAR)
+      .add("DEFAULT_CHARACTER_SET_CATALOG", SqlTypeName.VARCHAR)
+      .add("DEFAULT_CHARACTER_SET_SCHEMA", SqlTypeName.VARCHAR)
+      .add("DEFAULT_CHARACTER_SET_NAME", SqlTypeName.VARCHAR)
+      .add("SQL_PATH", SqlTypeName.VARCHAR)
       .build();
-  private static final RowSignature TABLES_SIGNATURE = RowSignature
-      .builder()
-      .add("TABLE_CATALOG", ColumnType.STRING)
-      .add("TABLE_SCHEMA", ColumnType.STRING)
-      .add("TABLE_NAME", ColumnType.STRING)
-      .add("TABLE_TYPE", ColumnType.STRING)
-      .add("IS_JOINABLE", ColumnType.STRING)
-      .add("IS_BROADCAST", ColumnType.STRING)
+  private static final RelDataType TABLES_SIGNATURE = new RowTypeBuilder()
+      .add("TABLE_CATALOG", SqlTypeName.VARCHAR)
+      .add("TABLE_SCHEMA", SqlTypeName.VARCHAR)
+      .add("TABLE_NAME", SqlTypeName.VARCHAR)
+      .add("TABLE_TYPE", SqlTypeName.VARCHAR)
+      .add("IS_JOINABLE", SqlTypeName.VARCHAR)
+      .add("IS_BROADCAST", SqlTypeName.VARCHAR)
       .build();
-  private static final RowSignature COLUMNS_SIGNATURE = RowSignature
-      .builder()
-      .add("TABLE_CATALOG", ColumnType.STRING)
-      .add("TABLE_SCHEMA", ColumnType.STRING)
-      .add("TABLE_NAME", ColumnType.STRING)
-      .add("COLUMN_NAME", ColumnType.STRING)
-      .add("ORDINAL_POSITION", ColumnType.STRING)
-      .add("COLUMN_DEFAULT", ColumnType.STRING)
-      .add("IS_NULLABLE", ColumnType.STRING)
-      .add("DATA_TYPE", ColumnType.STRING)
-      .add("CHARACTER_MAXIMUM_LENGTH", ColumnType.STRING)
-      .add("CHARACTER_OCTET_LENGTH", ColumnType.STRING)
-      .add("NUMERIC_PRECISION", ColumnType.STRING)
-      .add("NUMERIC_PRECISION_RADIX", ColumnType.STRING)
-      .add("NUMERIC_SCALE", ColumnType.STRING)
-      .add("DATETIME_PRECISION", ColumnType.STRING)
-      .add("CHARACTER_SET_NAME", ColumnType.STRING)
-      .add("COLLATION_NAME", ColumnType.STRING)
-      .add("JDBC_TYPE", ColumnType.LONG)
+  private static final RelDataType COLUMNS_SIGNATURE = new RowTypeBuilder()
+      .add("TABLE_CATALOG", SqlTypeName.VARCHAR)
+      .add("TABLE_SCHEMA", SqlTypeName.VARCHAR)
+      .add("TABLE_NAME", SqlTypeName.VARCHAR)
+      .add("COLUMN_NAME", SqlTypeName.VARCHAR)
+      .add("ORDINAL_POSITION", SqlTypeName.BIGINT)
+      .add("COLUMN_DEFAULT", SqlTypeName.VARCHAR)
+      .add("IS_NULLABLE", SqlTypeName.VARCHAR)
+      .add("DATA_TYPE", SqlTypeName.VARCHAR)
+      .add("CHARACTER_MAXIMUM_LENGTH", SqlTypeName.VARCHAR, true)
+      .add("CHARACTER_OCTET_LENGTH", SqlTypeName.VARCHAR, true)
+      .add("NUMERIC_PRECISION", SqlTypeName.BIGINT, true)
+      .add("NUMERIC_PRECISION_RADIX", SqlTypeName.BIGINT, true)
+      .add("NUMERIC_SCALE", SqlTypeName.BIGINT, true)
+      .add("DATETIME_PRECISION", SqlTypeName.BIGINT, true)
+      .add("CHARACTER_SET_NAME", SqlTypeName.VARCHAR, true)
+      .add("COLLATION_NAME", SqlTypeName.VARCHAR, true)
+      .add("JDBC_TYPE", SqlTypeName.BIGINT)
       .build();
   private static final RelDataTypeSystem TYPE_SYSTEM = 
RelDataTypeSystem.DEFAULT;
 
@@ -175,7 +196,7 @@ public class InformationSchema extends AbstractSchema
     @Override
     public RelDataType getRowType(final RelDataTypeFactory typeFactory)
     {
-      return RowSignatures.toRelDataType(SCHEMATA_SIGNATURE, typeFactory);
+      return SCHEMATA_SIGNATURE;
     }
 
     @Override
@@ -279,7 +300,7 @@ public class InformationSchema extends AbstractSchema
     @Override
     public RelDataType getRowType(final RelDataTypeFactory typeFactory)
     {
-      return RowSignatures.toRelDataType(TABLES_SIGNATURE, typeFactory);
+      return TABLES_SIGNATURE;
     }
 
     @Override
@@ -333,10 +354,15 @@ public class InformationSchema extends AbstractSchema
                                     @Override
                                     public Iterable<Object[]> apply(final 
String tableName)
                                     {
+                                      Table table = 
subSchema.getTable(tableName);
+                                      if (table == null) {
+                                        // Table just disappeared.
+                                        return null;
+                                      }
                                       return generateColumnMetadata(
                                           schemaName,
                                           tableName,
-                                          subSchema.getTable(tableName),
+                                           table.getRowType(typeFactory),
                                           typeFactory
                                       );
                                     }
@@ -349,20 +375,20 @@ public class InformationSchema extends AbstractSchema
                                     public Iterable<Object[]> apply(final 
String functionName)
                                     {
                                       final TableMacro viewMacro = 
getView(subSchema, functionName);
-                                      if (viewMacro == null) {
-                                        return null;
-                                      }
-
-                                      try {
-                                        return generateColumnMetadata(
-                                            schemaName,
-                                            functionName,
-                                            
viewMacro.apply(Collections.emptyList()),
-                                            typeFactory
-                                        );
-                                      }
-                                      catch (Exception e) {
-                                        log.error(e, "Encountered exception 
while handling view[%s].", functionName);
+                                      if (viewMacro != null) {
+                                        try {
+                                          return generateColumnMetadata(
+                                              schemaName,
+                                              functionName,
+                                              
viewMacro.apply(Collections.emptyList()).getRowType(typeFactory),
+                                              typeFactory
+                                          );
+                                        }
+                                        catch (Exception e) {
+                                          log.error(e, "Encountered exception 
while handling view[%s].", functionName);
+                                          return null;
+                                        }
+                                      } else {
                                         return null;
                                       }
                                     }
@@ -382,7 +408,7 @@ public class InformationSchema extends AbstractSchema
     @Override
     public RelDataType getRowType(final RelDataTypeFactory typeFactory)
     {
-      return RowSignatures.toRelDataType(COLUMNS_SIGNATURE, typeFactory);
+      return COLUMNS_SIGNATURE;
     }
 
     @Override
@@ -401,16 +427,12 @@ public class InformationSchema extends AbstractSchema
     private Iterable<Object[]> generateColumnMetadata(
         final String schemaName,
         final String tableName,
-        final Table table,
+        final RelDataType tableSchema,
         final RelDataTypeFactory typeFactory
     )
     {
-      if (table == null) {
-        return null;
-      }
-
       return FluentIterable
-          .from(table.getRowType(typeFactory).getFieldList())
+          .from(tableSchema.getFieldList())
           .transform(
               new Function<RelDataTypeField, Object[]>()
               {
@@ -418,29 +440,30 @@ public class InformationSchema extends AbstractSchema
                 public Object[] apply(final RelDataTypeField field)
                 {
                   final RelDataType type = field.getType();
-                  boolean isNumeric = 
SqlTypeName.NUMERIC_TYPES.contains(type.getSqlTypeName());
-                  boolean isCharacter = 
SqlTypeName.CHAR_TYPES.contains(type.getSqlTypeName());
-                  boolean isDateTime = 
SqlTypeName.DATETIME_TYPES.contains(type.getSqlTypeName());
+                  SqlTypeName sqlTypeName = type.getSqlTypeName();
+                  boolean isNumeric = 
SqlTypeName.NUMERIC_TYPES.contains(sqlTypeName);
+                  boolean isCharacter = 
SqlTypeName.CHAR_TYPES.contains(sqlTypeName);
+                  boolean isDateTime = 
SqlTypeName.DATETIME_TYPES.contains(sqlTypeName);
 
-                  final String typeName = type instanceof 
RowSignatures.ComplexSqlType ? ((RowSignatures.ComplexSqlType) 
type).asTypeString() : type.getSqlTypeName().toString();
+                  final String typeName = type instanceof 
RowSignatures.ComplexSqlType ? ((RowSignatures.ComplexSqlType) 
type).asTypeString() : sqlTypeName.toString();
                   return new Object[]{
                       CATALOG_NAME, // TABLE_CATALOG
                       schemaName, // TABLE_SCHEMA
                       tableName, // TABLE_NAME
                       field.getName(), // COLUMN_NAME
-                      String.valueOf(field.getIndex()), // ORDINAL_POSITION
+                      (long) (field.getIndex() + 1), // ORDINAL_POSITION
                       "", // COLUMN_DEFAULT
                       type.isNullable() ? INFO_TRUE : INFO_FALSE, // 
IS_NULLABLE
                       typeName, // DATA_TYPE
                       null, // CHARACTER_MAXIMUM_LENGTH
                       null, // CHARACTER_OCTET_LENGTH
-                      isNumeric ? String.valueOf(type.getPrecision()) : null, 
// NUMERIC_PRECISION
-                      isNumeric ? "10" : null, // NUMERIC_PRECISION_RADIX
-                      isNumeric ? String.valueOf(type.getScale()) : null, // 
NUMERIC_SCALE
-                      isDateTime ? String.valueOf(type.getPrecision()) : null, 
// DATETIME_PRECISION
+                      isNumeric ? (long) type.getPrecision() : null, // 
NUMERIC_PRECISION
+                      isNumeric ? 10L : null, // NUMERIC_PRECISION_RADIX
+                      isNumeric ? (long) type.getScale() : null, // 
NUMERIC_SCALE
+                      isDateTime ? (long) type.getPrecision() : null, // 
DATETIME_PRECISION
                       isCharacter ? type.getCharset().name() : null, // 
CHARACTER_SET_NAME
                       isCharacter ? type.getCollation().getCollationName() : 
null, // COLLATION_NAME
-                      Long.valueOf(type.getSqlTypeName().getJdbcOrdinal()) // 
JDBC_TYPE (Druid extension)
+                      (long) type.getSqlTypeName().getJdbcOrdinal() // 
JDBC_TYPE (Druid extension)
                   };
                 }
               }
@@ -503,27 +526,25 @@ public class InformationSchema extends AbstractSchema
       final Set<String> names
   )
   {
-    if (schema != null) {
-      return ImmutableSet.copyOf(
-          AuthorizationUtils.filterAuthorizedResources(
-              authenticationResult,
-              names,
-              name -> {
-                final String resoureType = schema.getSchemaResourceType(name);
-                if (resoureType != null) {
-                  return Collections.singletonList(
-                      new ResourceAction(new Resource(name, resoureType), 
Action.READ)
-                  );
-                } else {
-                  return Collections.emptyList();
-                }
-              },
-              authorizerMapper
-          )
-      );
-    } else {
+    if (schema == null) {
       // for schemas with no resource type, or that are not named schemas, we 
don't filter anything
       return names;
     }
+    return ImmutableSet.copyOf(
+        AuthorizationUtils.filterAuthorizedResources(
+            authenticationResult,
+            names,
+            name -> {
+              final String resourseType = schema.getSchemaResourceType(name);
+              if (resourseType == null) {
+                return Collections.emptyList();
+              }
+              return Collections.singletonList(
+                  new ResourceAction(new Resource(name, resourseType), 
Action.READ)
+              );
+            },
+            authorizerMapper
+        )
+    );
   }
 }


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[druid] branch master updated: Information schema now uses numeric column types (#13777)

Reply via email to