tsreaper commented on code in PR #443:
URL: https://github.com/apache/flink-table-store/pull/443#discussion_r1054155956
##########
docs/content/docs/sql-api/creating-tables.md:
##########
@@ -0,0 +1,261 @@
---
title: "Creating Tables"
weight: 2
type: docs
aliases:
- /sql-api/creating-tables.html
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Creating Tables

## Creating Catalog Managed Tables

Tables created in Table Store [catalogs]({{< ref "docs/sql-api/creating-catalogs" >}}) are managed by the catalog. When a table is dropped from the catalog, its table files will also be deleted.

The following SQL assumes that you have registered and are using a Table Store catalog. It creates a managed table named `MyTable` with five columns in the catalog's `default` database.

{{< tabs "catalog-managed-table-example" >}}

{{< tab "Flink" >}}

```sql
CREATE TABLE MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
);
```

{{< /tab >}}

{{< tab "Spark3" >}}

```sql
CREATE TABLE tablestore.default.MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
);
```

{{< /tab >}}

{{< /tabs >}}

### Tables with Primary Keys

The following SQL creates a table named `MyTable` with five columns, where `dt`, `hh` and `user_id` are the primary keys.

{{< tabs "primary-keys-example" >}}

{{< tab "Flink" >}}

```sql
CREATE TABLE MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING,
    PRIMARY KEY (dt, hh, user_id) NOT ENFORCED
);
```

{{< /tab >}}

{{< tab "Spark3" >}}

```sql
CREATE TABLE tablestore.default.MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) TBLPROPERTIES (
    'primary-key' = 'dt,hh,user_id'
);
```

{{< /tab >}}

{{< /tabs >}}

### Partitioned Tables

The following SQL creates a table named `MyTable` with five columns, partitioned by `dt` and `hh`.

{{< tabs "partitions-example" >}}

{{< tab "Flink" >}}

```sql
CREATE TABLE MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) PARTITIONED BY (dt, hh);
```

{{< /tab >}}

{{< tab "Spark3" >}}

```sql
CREATE TABLE tablestore.default.MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) PARTITIONED BY (dt, hh);
```

{{< /tab >}}

{{< /tabs >}}

{{< hint info >}}

If primary keys are defined, partition keys must be a subset of the primary keys.

{{< /hint >}}
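For example, combining the two previous definitions satisfies this rule, because the partition keys `dt` and `hh` are both contained in the primary key `(dt, hh, user_id)`. Below is a minimal Flink SQL sketch that reuses the `MyTable` schema and keys from the examples above:

```sql
-- Valid: the partition keys (dt, hh) are a subset of the primary keys (dt, hh, user_id).
CREATE TABLE MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING,
    PRIMARY KEY (dt, hh, user_id) NOT ENFORCED
) PARTITIONED BY (dt, hh);
```

Partitioning the same table by `item_id` instead would violate this rule, because `item_id` is not part of the primary key.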
### Table Properties

Users can specify table properties to enable features of Table Store or to improve its performance. For a complete list of such properties, see [configurations]({{< ref "docs/maintenance-actions/configurations" >}}).

The following SQL creates a table named `MyTable` with five columns, partitioned by `dt` and `hh`. This table has two properties: `'bucket' = '2'` and `'bucket-key' = 'user_id'`.

{{< tabs "table-properties-example" >}}

{{< tab "Flink" >}}

```sql
CREATE TABLE MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) PARTITIONED BY (dt, hh) WITH (
    'bucket' = '2',
    'bucket-key' = 'user_id'
);
```

{{< /tab >}}

{{< tab "Spark3" >}}

```sql
CREATE TABLE tablestore.default.MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) PARTITIONED BY (dt, hh) TBLPROPERTIES (
    'bucket' = '2',
    'bucket-key' = 'user_id'
);
```

{{< /tab >}}

{{< /tabs >}}

## Creating External Tables

External tables are recorded in, but not managed by, catalogs. If an external table is dropped, its table files will not be deleted.

Table Store external tables can be used in any catalog. If you do not want to create a Table Store catalog and only want to read or write a table, consider using external tables.

{{< tabs "external-table-example" >}}

{{< tab "Flink" >}}

Flink SQL supports reading and writing an external table. External Table Store tables are created by specifying the `connector` and `path` table properties. The following SQL creates an external table named `MyTable` with five columns, where the base path of the table files is `hdfs://path/to/table`.

```sql
CREATE TABLE MyTable (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) WITH (
    'connector' = 'table-store',
    'path' = 'hdfs://path/to/table',
    'auto-create' = 'true' -- this table property creates table files for an empty table if the table path does not exist
                           -- currently this is only supported by Flink
);
```

{{< /tab >}}

{{< tab "Hive" >}}

Hive SQL only supports reading from an external table. The following SQL creates an external table named `my_table`, where the base path of the table files is `hdfs://path/to/table`. As table schemas are stored in the table files, users do not need to write column definitions.

```sql
CREATE EXTERNAL TABLE my_table
STORED BY 'org.apache.flink.table.store.hive.TableStoreHiveStorageHandler'
LOCATION 'hdfs://path/to/table';
```

{{< /tab >}}

Review Comment:
   In that case we'll have to change the category name from "SQL API" to something else.