[doris] branch master updated: [Doc](tvf)Added tvf support for reading documents from avro files (#23436)

diwu Thu, 31 Aug 2023 06:51:04 -0700

This is an automated email from the ASF dual-hosted git repository.

diwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git



The following commit(s) were added to refs/heads/master by this push:
     new b763bfa17d [Doc](tvf)Added tvf support for reading documents from avro files (#23436)
b763bfa17d is described below

commit b763bfa17db45ee62687c72f9d86944dc67fdcb0
Author: DongLiang-0 <[email protected]>
AuthorDate: Thu Aug 31 21:49:27 2023 +0800

    [Doc](tvf)Added tvf support for reading documents from avro files (#23436)
---
 .../sql-manual/sql-functions/table-functions/hdfs.md  |  2 +-
 .../sql-manual/sql-functions/table-functions/s3.md    | 19 +++++++++++++++++++
 .../sql-manual/sql-functions/table-functions/hdfs.md  |  2 +-
 .../sql-manual/sql-functions/table-functions/s3.md    | 18 ++++++++++++++++++
 4 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/docs/en/docs/sql-manual/sql-functions/table-functions/hdfs.md b/docs/en/docs/sql-manual/sql-functions/table-functions/hdfs.md
index 5969585ad5..7cbd21366a 100644
--- a/docs/en/docs/sql-manual/sql-functions/table-functions/hdfs.md
+++ b/docs/en/docs/sql-manual/sql-functions/table-functions/hdfs.md
@@ -69,7 +69,7 @@ Related parameters for accessing HDFS in HA mode:
 
 File format parameters:
 
-- `format`: (required) Currently support `csv/csv_with_names/csv_with_names_and_types/json/parquet/orc`
+- `format`: (required) Currently support `csv/csv_with_names/csv_with_names_and_types/json/parquet/orc/avro`
 - `column_separator`: (optional) default `,`.
 - `line_delimiter`: (optional) default `\n`.
 - `compress_type`: (optional) Currently support `UNKNOWN/PLAIN/GZ/LZO/BZ2/LZ4FRAME/DEFLATE`. Default value is `UNKNOWN`, it will automatically infer the type based on the suffix of `uri`.
diff --git a/docs/en/docs/sql-manual/sql-functions/table-functions/s3.md b/docs/en/docs/sql-manual/sql-functions/table-functions/s3.md
index b42788583f..d089c98155 100644
--- a/docs/en/docs/sql-manual/sql-functions/table-functions/s3.md
+++ b/docs/en/docs/sql-manual/sql-functions/table-functions/s3.md
@@ -424,6 +424,25 @@ MySQL [(none)]> select * from s3(
 +-----------+------------------------------------------+----------------+----------+-------------------------+--------+-------------+---------------+---------------------+
 ```
 
+**avro format**
+
+`avro` format: S3 tvf supports parsing the column names and column types of the table schema from the avro file. Example:
+
+```sql
+select * from s3(
+         "uri" = "http://127.0.0.1:9312/test2/person.avro",
+         "ACCESS_KEY" = "ak",
+         "SECRET_KEY" = "sk",
+         "FORMAT" = "avro");
++--------+--------------+-------------+-----------------+
+| name   | boolean_type | double_type | long_type       |
++--------+--------------+-------------+-----------------+
+| Alyssa |            1 |     10.0012 | 100000000221133 |
+| Ben    |            0 |    5555.999 |      4009990000 |
+| lisi   |            0 | 5992225.999 |      9099933330 |
++--------+--------------+-------------+-----------------+
+```
+
 **uri contains wildcards**
 
 uri can use wildcards to read multiple files. Note: If wildcards are used, the format of each file must be consistent (especially csv/csv_with_names/csv_with_names_and_types count as different formats), S3 tvf uses the first file to parse out the table schema. For example:
diff --git a/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/hdfs.md b/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/hdfs.md
index 47e723a623..c7faaa7a86 100644
--- a/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/hdfs.md
+++ b/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/hdfs.md
@@ -70,7 +70,7 @@ hdfs(
 - `dfs.client.failover.proxy.provider.your-nameservices`: (optional)
 
 File format parameters:
-- `format`: (required) Currently support `csv/csv_with_names/csv_with_names_and_types/json/parquet/orc`
+- `format`: (required) Currently support `csv/csv_with_names/csv_with_names_and_types/json/parquet/orc/avro`
 - `column_separator`: (optional) Column separator, default `,`.
 - `line_delimiter`: (optional) Line delimiter, default `\n`.
 - `compress_type`: (optional) Currently support `UNKNOWN/PLAIN/GZ/LZO/BZ2/LZ4FRAME/DEFLATE`. Default value is `UNKNOWN`, it will automatically infer the type based on the suffix of `uri`.
diff --git a/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/s3.md b/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/s3.md
index 1a64205643..081734985c 100644
--- a/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/s3.md
+++ b/docs/zh-CN/docs/sql-manual/sql-functions/table-functions/s3.md
@@ -428,6 +428,24 @@ MySQL [(none)]> select * from s3(
 |         5 | forest brown coral puff cream            | Manufacturer#3 | Brand#32 | STANDARD POLISHED TIN   |     15 | SM PKG      |           905 |  wake carefully     |
 +-----------+------------------------------------------+----------------+----------+-------------------------+--------+-------------+---------------+---------------------+
 ```
+**avro format**
+
+`avro` format: S3 tvf supports parsing the column names and column types of the table schema from the avro file. Example:
+
+```sql
+select * from s3(
+         "uri" = "http://127.0.0.1:9312/test2/person.avro",
+         "ACCESS_KEY" = "ak",
+         "SECRET_KEY" = "sk",
+         "FORMAT" = "avro");
++--------+--------------+-------------+-----------------+
+| name   | boolean_type | double_type | long_type       |
++--------+--------------+-------------+-----------------+
+| Alyssa |            1 |     10.0012 | 100000000221133 |
+| Ben    |            0 |    5555.999 |      4009990000 |
+| lisi   |            0 | 5992225.999 |      9099933330 |
++--------+--------------+-------------+-----------------+
+```
 
 **uri contains wildcards**
 


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
