This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new f0f3548dfe [regression](multi-catalog)add EMR cloud env test tools
(#21788)
f0f3548dfe is described below
commit f0f3548dfee84160d224b0bfd57e5098efa258f3
Author: slothever <[email protected]>
AuthorDate: Fri Jul 28 09:45:10 2023 +0800
[regression](multi-catalog)add EMR cloud env test tools (#21788)
add emr test tools for aliyun, huawei cloud, tencent cloud.
---
tools/emr_storage_regression/README.md | 178 +++++++++++++++++++
tools/emr_storage_regression/default_emr_env.sh | 58 +++++++
tools/emr_storage_regression/emr_tools.sh | 192 +++++++++++++++++++++
.../ping_test/create_catalog_aliyun.sql | 12 ++
.../ping_test/create_catalog_aws.sql | 7 +
.../ping_test/create_catalog_hw.sql | 5 +
.../ping_test/create_catalog_tx.sql | 5 +
.../ping_test/data/create_dlf_ping.sql | 9 +
.../ping_test/data/create_hive_ping.sql | 9 +
.../ping_test/data/create_spark_ping.sql | 9 +
.../ping_test/data/data_for_dlf.sql | 7 +
.../ping_test/data/data_for_hive.sql | 15 ++
.../ping_test/data/data_for_spark.sql | 10 ++
tools/emr_storage_regression/ping_test/ping.sql | 15 ++
.../ping_test/ping_aliyun.sql | 23 +++
tools/emr_storage_regression/ping_test/ping_poc.sh | 151 ++++++++++++++++
.../standard_set/analyze.sql | 69 ++++++++
.../standard_set/gen_spark_create_sql.sh | 38 ++++
.../gen_tbl/gen_clickbench_create_sql.sh | 153 ++++++++++++++++
.../standard_set/gen_tbl/gen_ssb_create_sql.sh | 166 ++++++++++++++++++
.../standard_set/gen_tbl/gen_tpch_create_sql.sh | 140 +++++++++++++++
.../standard_set/queries/clickbench_queries.sql | 43 +++++
.../standard_set/queries/ssb_flat_queries.sql | 13 ++
.../standard_set/queries/ssb_queries.sql | 13 ++
.../standard_set/queries/tpch_queries.sql | 22 +++
.../standard_set/run_queries.sh | 46 +++++
.../standard_set/run_standard_set.sh | 76 ++++++++
27 files changed, 1484 insertions(+)
diff --git a/tools/emr_storage_regression/README.md
b/tools/emr_storage_regression/README.md
new file mode 100644
index 0000000000..bdc6f4f8ba
--- /dev/null
+++ b/tools/emr_storage_regression/README.md
@@ -0,0 +1,178 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Data Lake Regression Testing Tool for External Tables
+
+This tool tests Doris external tables on cloud vendors' object storage.
+
+> Supported storage systems: HDFS, Alibaba Cloud OSS, Tencent Cloud COS, Huawei Cloud OBS
+
+> Supported data lake table formats: Iceberg
+
+The following examples show the command-line usage:
+
+```
+sh tools/emr_storage_regression/emr_tools.sh --profile default_emr_env.sh
+```
+
+Or
+
+```
+sh tools/emr_storage_regression/emr_tools.sh --case CASE --endpoint ENDPOINT
--region REGION --service SERVICE --ak AK --sk SK --host HOST --user USER
--port PORT
+```
+
+The usage of each option is described below.
+
+## Connectivity Test
+
+When the `--case` option is set to `ping`, the tool checks Doris connectivity on EMR:
+
+- `--endpoint`, object storage endpoint.
+
+- `--region`, object storage region.
+
+- `--ak`, object storage access key.
+
+- `--sk`, object storage secret key.
+
+- `--host`, Doris MySQL client IP.
+
+- `--user`, Doris MySQL client username.
+
+- `--port`, Doris MySQL client port.
+
+- `--service`, EMR cloud vendor: ali (Alibaba Cloud), hw (Huawei Cloud), tx (Tencent Cloud).
+
+### Environment Variables
+
+Modify the environment variables in `default_emr_env.sh`; the script executes `source default_emr_env.sh` to make them take effect.
+
+If environment variables are configured, you can run the test script directly
with the following command:
+
+```
+sh emr_tools.sh --profile default_emr_env.sh
+```
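+
+For reference, the profile simply exports the variables the tests read. An excerpt from `default_emr_env.sh` (sample values to replace with your own):
+
+```
+export SERVICE=ali    # cloud vendor: ali, hw, tx
+export HOST=127.0.0.1 # Doris MySQL client IP
+export USER=root      # Doris MySQL client username
+export PORT=9030      # Doris MySQL client port
+export CASE=ping
+export AK=ak
+export SK=sk
+export ENDPOINT=oss-cn-beijing-internal.aliyuncs.com
+export REGION=oss-cn-beijing
+export HMS_META_URI="thrift://172.16.1.1:9083"
+export HMS_WAREHOUSE=oss://benchmark-oss/user
+```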
+
+### The Script Execution Steps For Connectivity Test
+
+1. Create Spark and Hive tables on EMR
+2. Insert sample data through the Spark and Hive command lines
+3. Create the Doris catalogs for the connectivity test (sketched below)
+4. Execute the connectivity test SQL: `ping.sql`
+
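+Concretely, the test creates each catalog and then queries the sample tables through it. A simplified sketch (placeholders such as `META_URI` are filled in by `ping_poc.sh`; the thrift URI below is just an example value):
+
+```
+CREATE CATALOG IF NOT EXISTS hms_hdfs PROPERTIES (
+    "type" = "hms",
+    "hive.metastore.uris" = "thrift://172.16.1.1:9083"
+);
+SWITCH hms_hdfs;
+SELECT * FROM hive_hms_db.types;
+```
+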
+### Alibaba Cloud
+
+```
+sh emr_tools.sh --profile default_emr_env.sh
+```
+
+Or
+
+Set `--service` to `ali` to test connectivity on Alibaba Cloud:
+
+```
+sh emr_tools.sh --case ping --endpoint oss-cn-beijing-internal.aliyuncs.com
--region cn-beijing --service ali --ak ak --sk sk --host 127.0.0.1 --user root
--port 9030 > log
+```
+
+On Alibaba Cloud EMR, the tool can also test connectivity for Doris with DLF metadata and for Doris on OSS-HDFS storage (see the catalog example below).
+
+- The DLF metadata connectivity test must run on an EMR cluster that uses DLF as its metastore. The default value of `DLF_ENDPOINT` is `datalake-vpc.cn-beijing.aliyuncs.com`, configured in `ping_test/ping_poc.sh`.
+
+- To test OSS-HDFS storage connectivity, first [enable and configure the HDFS service on the OSS bucket](https://www.alibabacloud.com/help/en/e-mapreduce/latest/oss-hdfsnew). The default value of `JINDO_ENDPOINT` is `cn-beijing.oss-dls.aliyuncs.com`, configured in `ping_test/ping_poc.sh`.
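+
+For example, the DLF ping creates catalogs like the following (reformatted from `create_catalog_aliyun.sql`; `DLF_ENDPOINT`, `AK_INPUT`, and `SK_INPUT` are placeholders substituted by `ping_poc.sh`, and the hardcoded `dlf.uid` should be replaced with your own):
+
+```
+CREATE CATALOG IF NOT EXISTS dlf PROPERTIES (
+    "type" = "hms",
+    "hive.metastore.type" = "dlf",
+    "dlf.proxy.mode" = "DLF_ONLY",
+    "dlf.endpoint" = "DLF_ENDPOINT",
+    "dlf.uid" = "217316283625971977",
+    "dlf.access_key" = "AK_INPUT",
+    "dlf.secret_key" = "SK_INPUT"
+);
+```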
+
+### Tencent Cloud
+
+```
+sh emr_tools.sh --profile default_emr_env.sh
+```
+
+Or
+
+Set `--service` to `tx` to test connectivity on Tencent Cloud:
+
+```
+sh emr_tools.sh --case ping --endpoint cos.ap-beijing.myqcloud.com --region
ap-beijing --service tx --ak ak --sk sk --host 127.0.0.1 --user root --port
9030 > log
+```
+
+### Huawei Cloud
+
+```
+sh emr_tools.sh --profile default_emr_env.sh
+```
+
+Or
+
+Set `--service` to `hw` to test connectivity on Huawei Cloud:
+
+```
+sh emr_tools.sh --case ping --endpoint obs.cn-north-4.myhuaweicloud.com
--region cn-north-4 --service hw --ak ak --sk sk --host 127.0.0.1 --user root
--port 9030 > log
+```
+
+## Performance Testing on Standard Test Set
+
+When the `--case` option is set to `data_set`, the tool tests the query performance of Doris external tables:
+
+- `--test`, test data set: ssb, ssb_flat, tpch, clickbench, or all. Defaults to `all`.
+
+- `--service`, EMR cloud vendor: ali (Alibaba Cloud), hw (Huawei Cloud), tx (Tencent Cloud).
+
+- `--host`, Doris MySQL client IP.
+
+- `--user`, Doris MySQL client username.
+
+- `--port`, Doris MySQL client port.
+
+### Environment Variables
+
+As above, modify the environment variables in `default_emr_env.sh`; the script executes `source default_emr_env.sh` to make them take effect.
+
+If environment variables are configured, you can run the test script directly
with the following command:
+
+```
+sh emr_tools.sh --profile default_emr_env.sh
+```
+
+### Prepare Data
+
+1. To run the standard test set with the `emr_tools.sh` script, point the `BUCKET` variable at your object storage bucket and upload the test data to it in advance; the script generates table creation statements from that bucket (see the layout sketch below).
+
+2. The `emr_tools.sh` script currently supports Iceberg, Parquet, and ORC data for ssb, ssb_flat, tpch, and clickbench.
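+
+With the bundled generators, data is expected at the following paths under the bucket (derived from `gen_spark_create_sql.sh`):
+
+```
+${BUCKET}/ssb/ssb100_orc
+${BUCKET}/ssb/ssb100_parquet
+${BUCKET}/tpch/tpch100_orc
+${BUCKET}/tpch/tpch100_parquet
+${BUCKET}/clickbench/hits_orc
+${BUCKET}/clickbench/hits_parquet
+```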
+
+### Execution Steps
+
+1. After the connectivity test, the Doris catalogs for the standard test set are created
+2. Prepare the test set data in the object storage bucket specified by the `BUCKET` variable
+3. Generate Spark table creation statements and create the Spark object storage tables on EMR
+4. Create the Spark tables in the local HDFS directory: `hdfs:///benchmark-hdfs`
+5. Optionally analyze the Doris tables ahead of time by manually executing the statements in `analyze.sql` in the Doris catalog (example below)
+6. Execute the standard test set script: `run_standard_set.sh`
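+
+Step 5 amounts to running `ANALYZE TABLE` for each table in the test set, for example (from `analyze.sql`):
+
+```
+analyze table ssb100_parquet.customer;
+analyze table tpch100_orc.lineitem;
+analyze table clickbench_parquet.hits;
+```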
+
+### Standard data set: ssb, ssb_flat, tpch, clickbench
+
+- Full test. After executing the test command, Doris runs the ssb, ssb_flat, tpch, and clickbench tests in sequence; the results cover both HDFS and the object storage specified by `--service`.
+
+```
+sh emr_tools.sh --case data_set --service ali --host 127.0.0.1 --user root
--port 9030 > log
+```
+
+- Specify a single test. The `--test` option can be set to one of ssb, ssb_flat, tpch, or clickbench.
+
+```
+sh emr_tools.sh --case data_set --test ssb --service ali --host 127.0.0.1
--user root --port 9030 > log
+```
diff --git a/tools/emr_storage_regression/default_emr_env.sh
b/tools/emr_storage_regression/default_emr_env.sh
new file mode 100644
index 0000000000..0fae74e19b
--- /dev/null
+++ b/tools/emr_storage_regression/default_emr_env.sh
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+# specify the service: ali, hw, tx
+export SERVICE=ali
+# doris host
+export HOST=127.0.0.1
+# doris user
+export USER=root
+# doris mysql cli port
+export PORT=9030
+
+# prepare endpoint,region,ak/sk
+if [[ ${SERVICE} == 'ali' ]]; then
+ export CASE=ping
+ export AK=ak
+ export SK=sk
+ export ENDPOINT=oss-cn-beijing-internal.aliyuncs.com
+ export REGION=oss-cn-beijing
+ export HMS_META_URI="thrift://172.16.1.1:9083"
+ export HMS_WAREHOUSE=oss://benchmark-oss/user
+elif [[ ${SERVICE} == 'hw' ]]; then
+ export CASE=ping
+ export AK=ak
+ export SK=sk
+ export ENDPOINT=obs.cn-north-4.myhuaweicloud.com
+ export REGION=cn-north-4
+ export HMS_META_URI="thrift://node1:9083,thrift://node2:9083"
+ export HMS_WAREHOUSE=obs://datalake-bench/user
+ export
BEELINE_URI="jdbc:hive2://192.168.0.1:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;hive.server2.proxy.user=hive"
+elif [[ ${SERVICE} == 'tx' ]]; then
+ export CASE=ping
+ export AK=ak
+ export SK=sk
+    export ENDPOINT=cos.ap-beijing.myqcloud.com
+ export REGION=ap-beijing
+ export HMS_META_URI="thrift://172.21.0.1:7004"
+ export HMS_WAREHOUSE=cosn://datalake-bench-cos-1308700295/user
+fi
diff --git a/tools/emr_storage_regression/emr_tools.sh
b/tools/emr_storage_regression/emr_tools.sh
new file mode 100644
index 0000000000..d1a7c0fe01
--- /dev/null
+++ b/tools/emr_storage_regression/emr_tools.sh
@@ -0,0 +1,192 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# This script is used to test EMR cloud service
+# Usage:
+# provide your env arguments in default_emr_env.sh
+# sh emr_tools.sh --case ping --endpoint
oss-cn-beijing-internal.aliyuncs.com --region cn-beijing --service ali --ak ak
--sk sk
+##############################################################
+
+set -eo pipefail
+
+usage() {
+ echo "
+Usage: $0 <options>
+ Optional options:
+ [no option]
+ --case regression case runner: ping, data_set
+ --profile cloud credential profile
+ --ak cloud access key
+ --sk cloud secret key
+ --endpoint cloud endpoint
+ --region cloud region
+    --service          cloud service provider: ali, tx, hw
+ --host doris mysql cli host, example: 127.0.0.1
+ --user doris username, example: user
+ --port doris port, example: 9030
+ Example:
+ sh emr_tools.sh --case ping --endpoint
oss-cn-beijing-internal.aliyuncs.com --region cn-beijing --service ali --ak ak
--sk sk
+ "
+ exit 1
+}
+
+if ! OPTS="$(getopt \
+ -n "$0" \
+ -o '' \
+ -l 'case:' \
+ -l 'profile:' \
+ -l 'ak:' \
+ -l 'sk:' \
+ -l 'endpoint:' \
+ -l 'region:' \
+ -l 'service:' \
+ -l 'host:' \
+ -l 'user:' \
+ -l 'port:' \
+ -l 'test:' \
+ -o 'h' \
+ -- "$@")"; then
+ usage
+fi
+eval set -- "${OPTS}"
+
+while true; do
+ case "$1" in
+ --profile)
+ PROFILE="$2"
+ # can use custom profile: sh emr_tools.sh --profile default_emr_env.sh
+ if [[ -n "${PROFILE}" ]]; then
+ # example: "$(pwd)/default_emr_env.sh"
+ # shellcheck disable=SC1090
+ source "${PROFILE}"
+ fi
+ shift 2
+ break
+ ;;
+ --case)
+ CASE="$2"
+ shift 2
+ ;;
+ --ak)
+ AK="$2"
+ shift 2
+ ;;
+ --sk)
+ SK="$2"
+ shift 2
+ ;;
+ --endpoint)
+ ENDPOINT="$2"
+ shift 2
+ ;;
+ --region)
+ REGION="$2"
+ shift 2
+ ;;
+ --test)
+ TEST_SET="$2"
+ shift 2
+ ;;
+ --service)
+ SERVICE="$2"
+ shift 2
+ ;;
+ --host)
+ HOST="$2"
+ shift 2
+ ;;
+ --user)
+ USER="$2"
+ shift 2
+ ;;
+ --port)
+ PORT="$2"
+ shift 2
+ ;;
+ -h)
+ usage
+ ;;
+ --)
+ shift
+ break
+ ;;
+ *)
+ echo "$1"
+ echo "Internal error"
+ exit 1
+ ;;
+ esac
+done
+
+export FE_HOST=${HOST}
+export USER=${USER}
+export FE_QUERY_PORT=${PORT}
+
+if [[ ${CASE} == 'ping' ]]; then
+    # HMS_META_URI and HMS_WAREHOUSE are taken from the environment
+    # (usually set via the --profile file); BEELINE_URI is only used for hw.
+    sh ping_test/ping_poc.sh "${ENDPOINT}" "${REGION}" "${SERVICE}" "${AK}" "${SK}" "${HMS_META_URI}" "${HMS_WAREHOUSE}" "${BEELINE_URI}"
+elif [[ ${CASE} == 'data_set' ]]; then
+ if [[ ${SERVICE} == 'tx' ]]; then
+ BUCKET=cosn://datalake-bench-cos-1308700295
+ elif [[ ${SERVICE} == 'ali' ]]; then
+ BUCKET=oss://benchmark-oss
+ fi
+ # gen table for spark
+    if ! sh standard_set/gen_spark_create_sql.sh "${BUCKET}" obj; then
+        echo "Failed to generate spark obj table for test set"
+        exit 1
+    fi
+    if ! sh standard_set/gen_spark_create_sql.sh hdfs:///benchmark-hdfs hdfs; then
+        echo "Failed to generate spark hdfs table for test set, import hdfs data first"
+        exit 1
+    fi
+ # FE_HOST=172.16.1.163
+ # USER=root
+ # PORT=9035
+ if [[ -z ${TEST_SET} ]]; then
+ TEST_SET='all'
+ fi
+    TYPE=hdfs sh standard_set/run_standard_set.sh "${FE_HOST}" "${USER}" "${PORT}" hms_hdfs "${TEST_SET}"
+    TYPE=hdfs sh standard_set/run_standard_set.sh "${FE_HOST}" "${USER}" "${PORT}" iceberg_hms "${TEST_SET}"
+    if [[ ${SERVICE} == 'tx' ]]; then
+        sh standard_set/run_standard_set.sh "${FE_HOST}" "${USER}" "${PORT}" hms_cos "${TEST_SET}"
+        sh standard_set/run_standard_set.sh "${FE_HOST}" "${USER}" "${PORT}" iceberg_hms_cos "${TEST_SET}"
+    elif [[ ${SERVICE} == 'ali' ]]; then
+        sh standard_set/run_standard_set.sh "${FE_HOST}" "${USER}" "${PORT}" hms_oss "${TEST_SET}"
+        sh standard_set/run_standard_set.sh "${FE_HOST}" "${USER}" "${PORT}" iceberg_hms_oss "${TEST_SET}"
+    fi
+ fi
+fi
diff --git a/tools/emr_storage_regression/ping_test/create_catalog_aliyun.sql
b/tools/emr_storage_regression/ping_test/create_catalog_aliyun.sql
new file mode 100644
index 0000000000..e19a9672e8
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/create_catalog_aliyun.sql
@@ -0,0 +1,12 @@
+CREATE CATALOG IF NOT EXISTS hms_hdfs PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS hms_oss PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI", "oss.secret_key" = "SK_INPUT",
"oss.endpoint" = "ENDPOINT", "oss.access_key" = "AK_INPUT" );
+CREATE CATALOG IF NOT EXISTS hms_jindo PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI", "oss.secret_key" = "SK_INPUT",
"oss.endpoint" = "JINDO_ENDPOINT", "oss.access_key" = "AK_INPUT",
"oss.hdfs.enabled" = "true" );
+
+CREATE CATALOG IF NOT EXISTS iceberg_hms PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS iceberg_hms_oss PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI",
"oss.secret_key" = "SK_INPUT", "oss.endpoint" = "ENDPOINT", "oss.access_key" =
"AK_INPUT" );
+CREATE CATALOG IF NOT EXISTS iceberg_hms_jindo PROPERTIES ( "type" =
"iceberg", "iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI",
"oss.secret_key" = "SK_INPUT", "oss.endpoint" = "JINDO_ENDPOINT",
"oss.access_key" = "AK_INPUT", "oss.hdfs.enabled" = "true" );
+
+CREATE CATALOG IF NOT EXISTS dlf PROPERTIES( "type" = "hms",
"hive.metastore.type" = "dlf", "dlf.proxy.mode" = "DLF_ONLY", "dlf.endpoint" =
"DLF_ENDPOINT", "dlf.uid" = "217316283625971977", "dlf.access_key" =
"AK_INPUT", "dlf.secret_key" = "SK_INPUT" );
+CREATE CATALOG IF NOT EXISTS dlf_jindo PROPERTIES( "type" = "hms",
"hive.metastore.type" = "dlf", "dlf.proxy.mode" = "DLF_ONLY", "dlf.endpoint" =
"DLF_ENDPOINT", "dlf.uid" = "217316283625971977", "dlf.access_key" =
"AK_INPUT", "dlf.secret_key" = "SK_INPUT", "oss.hdfs.enabled" = "true" );
+CREATE CATALOG IF NOT EXISTS iceberg_dlf PROPERTIES ( "type"="iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.type" = "dlf", "dlf.endpoint" =
"DLF_ENDPOINT", "dlf.region" = "cn-beijing", "dlf.proxy.mode" = "DLF_ONLY",
"dlf.uid" = "217316283625971977", "dlf.access_key" = "AK_INPUT",
"dlf.secret_key" = "SK_INPUT" );
+CREATE CATALOG IF NOT EXISTS iceberg_dlf_jindo PROPERTIES ( "type"="iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.type" = "dlf", "dlf.endpoint" =
"DLF_ENDPOINT", "dlf.region" = "cn-beijing", "dlf.proxy.mode" = "DLF_ONLY",
"dlf.uid" = "217316283625971977", "dlf.access_key" = "AK_INPUT",
"dlf.secret_key" = "SK_INPUT", "oss.hdfs.enabled" = "true" );
\ No newline at end of file
diff --git a/tools/emr_storage_regression/ping_test/create_catalog_aws.sql
b/tools/emr_storage_regression/ping_test/create_catalog_aws.sql
new file mode 100644
index 0000000000..28ef2aaf4b
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/create_catalog_aws.sql
@@ -0,0 +1,7 @@
+CREATE CATALOG IF NOT EXISTS hms_hdfs PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS hms_s3 PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI", "s3.secret_key" = "SK_INPUT", "s3.endpoint"
= "ENDPOINT", "s3.access_key" = "AK_INPUT" );
+
+CREATE CATALOG IF NOT EXISTS iceberg_hms PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS iceberg_hms_s3 PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI",
"s3.secret_key" = "SK_INPUT", "s3.endpoint" = "ENDPOINT", "s3.access_key" =
"AK_INPUT" );
+
+-- glue s3
\ No newline at end of file
diff --git a/tools/emr_storage_regression/ping_test/create_catalog_hw.sql
b/tools/emr_storage_regression/ping_test/create_catalog_hw.sql
new file mode 100644
index 0000000000..84b9c4b777
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/create_catalog_hw.sql
@@ -0,0 +1,5 @@
+CREATE CATALOG IF NOT EXISTS hms_hdfs PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS hms_obs PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI", "obs.secret_key" = "SK_INPUT",
"obs.endpoint" = "ENDPOINT", "obs.access_key" = "AK_INPUT" );
+
+CREATE CATALOG IF NOT EXISTS iceberg_hms PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS iceberg_hms_obs PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI",
"obs.secret_key" = "SK_INPUT", "obs.endpoint" = "ENDPOINT", "obs.access_key" =
"AK_INPUT" );
diff --git a/tools/emr_storage_regression/ping_test/create_catalog_tx.sql
b/tools/emr_storage_regression/ping_test/create_catalog_tx.sql
new file mode 100644
index 0000000000..2f672736db
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/create_catalog_tx.sql
@@ -0,0 +1,5 @@
+CREATE CATALOG IF NOT EXISTS hms_hdfs PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS hms_cos PROPERTIES ( "type" = "hms",
"hive.metastore.uris" = "META_URI", "cos.secret_key" = "SK_INPUT",
"cos.endpoint" = "ENDPOINT", "cos.access_key" = "AK_INPUT" );
+
+CREATE CATALOG IF NOT EXISTS iceberg_hms PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI" );
+CREATE CATALOG IF NOT EXISTS iceberg_hms_cos PROPERTIES ( "type" = "iceberg",
"iceberg.catalog.type" = "hms", "hive.metastore.uris" = "META_URI",
"cos.secret_key" = "SK_INPUT", "cos.endpoint" = "ENDPOINT", "cos.access_key" =
"AK_INPUT" );
diff --git a/tools/emr_storage_regression/ping_test/data/create_dlf_ping.sql
b/tools/emr_storage_regression/ping_test/data/create_dlf_ping.sql
new file mode 100644
index 0000000000..b37b8085bf
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/data/create_dlf_ping.sql
@@ -0,0 +1,9 @@
+CREATE DATABASE IF NOT EXISTS hive_dlf_db;
+CREATE TABLE IF NOT EXISTS hive_dlf_db.types ( hms_int INT, hms_smallint
SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING, hms_decimal
DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool BOOLEAN,
hms_timstamp TIMESTAMP, hms_date DATE ) ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredP [...]
+CREATE TABLE IF NOT EXISTS hive_dlf_db.types_one_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) PARTITIONED BY (dt STRING) ROW
FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT
'org.apac [...]
+CREATE TABLE IF NOT EXISTS hive_dlf_db.types_multi_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN ) PARTITIONED BY (dt STRING, hms_timstamp TIMESTAMP, hms_date DATE) ROW
FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT
'org.ap [...]
+
+CREATE DATABASE IF NOT EXISTS hive_iceberg_db_dlf;
+CREATE TABLE IF NOT EXISTS hive_iceberg_db_dlf.types ( hms_int INT,
hms_smallint INT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char STRING, hms_varchar STRING, hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) STORED BY
"org.apache.iceberg.mr.hive.HiveIcebergStorageHandler" TBLPROPERTIES
('iceberg.catalog'='dlf', 'format-version'= '2');
+CREATE TABLE IF NOT EXISTS hive_iceberg_db_dlf.types_one_part ( hms_int INT,
hms_smallint INT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char STRING, hms_varchar STRING, hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) PARTITIONED BY (dt STRING)
STORED BY "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler" TBLPROPERTIES
('iceberg.catalog'='dlf', 'format-version'='2');
+CREATE TABLE IF NOT EXISTS hive_iceberg_db_dlf.types_multi_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN ) PARTITIONED BY (dt STRING, hms_timstamp TIMESTAMP, hms_date DATE)
STORED BY "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler" TBLPROPERTIES
('iceberg.catalog'='dlf', 'format-version'='2');
diff --git a/tools/emr_storage_regression/ping_test/data/create_hive_ping.sql
b/tools/emr_storage_regression/ping_test/data/create_hive_ping.sql
new file mode 100644
index 0000000000..54e306f729
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/data/create_hive_ping.sql
@@ -0,0 +1,9 @@
+CREATE DATABASE IF NOT EXISTS hive_hms_db;
+CREATE TABLE IF NOT EXISTS hive_hms_db.types ( hms_int INT, hms_smallint
SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING, hms_decimal
DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool BOOLEAN,
hms_timstamp TIMESTAMP, hms_date DATE ) ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredP [...]
+CREATE TABLE IF NOT EXISTS hive_hms_db.types_one_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) PARTITIONED BY (dt STRING) ROW
FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT
'org.apac [...]
+CREATE TABLE IF NOT EXISTS hive_hms_db.types_multi_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN ) PARTITIONED BY (dt STRING, hms_timstamp TIMESTAMP, hms_date DATE) ROW
FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT
'org.ap [...]
+
+CREATE DATABASE IF NOT EXISTS hive_iceberg_db_hms;
+CREATE TABLE IF NOT EXISTS hive_iceberg_db_hms.types ( hms_int INT,
hms_smallint INT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char STRING, hms_varchar STRING, hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) STORED BY
"org.apache.iceberg.mr.hive.HiveIcebergStorageHandler" TBLPROPERTIES
('format-version'='2');
+CREATE TABLE IF NOT EXISTS hive_iceberg_db_hms.types_one_part ( hms_int INT,
hms_smallint INT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char STRING, hms_varchar STRING, hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) PARTITIONED BY (dt STRING)
STORED BY "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler" TBLPROPERTIES
( 'format-version'='2');
+CREATE TABLE IF NOT EXISTS hive_iceberg_db_hms.types_multi_part ( hms_int INT,
hms_smallint INT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char STRING, hms_varchar STRING, hms_bool
BOOLEAN ) PARTITIONED BY (dt STRING, hms_timstamp TIMESTAMP, hms_date DATE)
STORED BY "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler" TBLPROPERTIES
( 'format-version'='2');
diff --git a/tools/emr_storage_regression/ping_test/data/create_spark_ping.sql
b/tools/emr_storage_regression/ping_test/data/create_spark_ping.sql
new file mode 100644
index 0000000000..06b9cb6501
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/data/create_spark_ping.sql
@@ -0,0 +1,9 @@
+CREATE DATABASE IF NOT EXISTS spark_hms_db;
+CREATE TABLE IF NOT EXISTS spark_hms_db.types ( hms_int INT, hms_smallint
SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING, hms_decimal
DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool BOOLEAN,
hms_timstamp TIMESTAMP, hms_date DATE ) using parquet;
+CREATE TABLE IF NOT EXISTS spark_hms_db.types_one_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) using parquet partitioned by
(dt string);
+CREATE TABLE IF NOT EXISTS spark_hms_db.types_multi_part ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN ) using parquet partitioned by (dt string, hms_timstamp TIMESTAMP,
hms_date DATE);
+
+--CREATE DATABASE IF NOT EXISTS iceberg.spark_iceberg_db_hms;
+--CREATE TABLE IF NOT EXISTS iceberg.spark_iceberg_db_hms.types ( hms_int INT,
hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE, hms_string STRING,
hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar VARCHAR(50), hms_bool
BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE ) USING iceberg TBLPROPERTIES (
'format-version'='2');
+--CREATE TABLE IF NOT EXISTS iceberg.spark_iceberg_db_hms.types_one_part (
hms_int INT, hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE,
hms_string STRING, hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar
VARCHAR(50), hms_bool BOOLEAN, hms_timstamp TIMESTAMP, hms_date DATE )
PARTITIONED BY (dt STRING) USING iceberg TBLPROPERTIES ( 'format-version'='2');
+--CREATE TABLE IF NOT EXISTS iceberg.spark_iceberg_db_hms.types_multi_part (
hms_int INT, hms_smallint SMALLINT, hms_bigint BIGINT, hms_double DOUBLE,
hms_string STRING, hms_decimal DECIMAL(12,4), hms_char CHAR(50), hms_varchar
VARCHAR(50), hms_bool BOOLEAN ) PARTITIONED BY (dt STRING, hms_timstamp
TIMESTAMP, hms_date DATE) USING iceberg TBLPROPERTIES ( 'format-version'='2');
\ No newline at end of file
diff --git a/tools/emr_storage_regression/ping_test/data/data_for_dlf.sql
b/tools/emr_storage_regression/ping_test/data/data_for_dlf.sql
new file mode 100644
index 0000000000..cf9574563b
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/data/data_for_dlf.sql
@@ -0,0 +1,7 @@
+insert into hive_dlf_db.types
values(123,34,3455,34.667754,"wastxali",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23");
+insert into hive_dlf_db.types_one_part
values(604,376,234,123.478,"aswwas",234.1234,"a23f","wsd",false,"2023-04-23
21:23:34.123","2023-04-23","2023-04-22");
+insert into hive_dlf_db.types_one_part
values(223,22,234,234.500,"awsali",234.1234,"a23f","1234vb",true,"2023-04-22
21:21:34.123","2023-04-21","2023-04-24");
+insert into hive_dlf_db.types_multi_part
values(1234,346,234,123.65567,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-20","2023-04-21
19:23:34.123","2023-04-19");
+insert into hive_dlf_db.types_multi_part
values(3212343,34,234,123.730,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-20","2023-04-22
20:23:34.123","2023-04-22");
+insert into hive_dlf_db.types_multi_part
values(355,22,990,123.324235,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-21","2023-04-22
20:23:34.123","2023-04-22");
+insert into hive_dlf_db.types_multi_part
values(23675,22,986,123.324235,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-25","2023-04-21
19:23:34.123","2023-04-24");
\ No newline at end of file
diff --git a/tools/emr_storage_regression/ping_test/data/data_for_hive.sql
b/tools/emr_storage_regression/ping_test/data/data_for_hive.sql
new file mode 100644
index 0000000000..a7599944ef
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/data/data_for_hive.sql
@@ -0,0 +1,15 @@
+insert into hive_hms_db.types
values(1123,5126,51,4534.63463,"wastxali",235.2351,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23");
+insert into hive_hms_db.types_one_part
values(23621,23,234,345.12512356,"aswwas",525.2352,"a23f","wsd",false,"2023-04-23
21:23:34.123","2023-04-23","2023-04-22");
+insert into hive_hms_db.types_one_part
values(11625,62,234,2347.6236,"awsali",546.2342,"a23f","1234vb",true,"2023-04-22
21:21:34.123","2023-04-21","2023-04-24");
+insert into hive_hms_db.types_multi_part
values(123,66,234,13.1242,"hwaws",3463.4363,"a23f","1234vb",true,"2023-04-20","2023-04-21
19:23:34.123","2023-04-19");
+insert into hive_hms_db.types_multi_part
values(324,77,234,123.163446,"hwaws",345.3413,"a23f","1234vb",true,"2023-04-20","2023-04-22
20:23:34.123","2023-04-22");
+insert into hive_hms_db.types_multi_part
values(423,909,234,123657.512,"hwaws",234.2363,"a23f","1234vb",true,"2023-04-21","2023-04-22
20:23:34.123","2023-04-22");
+insert into hive_hms_db.types_multi_part
values(343,712,234,1234.21451,"hwaws",3564.8945,"a23f","1234vb",true,"2023-04-25","2023-04-21
19:23:34.123","2023-04-24");
+
+insert into hive_iceberg_db_hms.types
values(123,22,234,123.324235,"wsawh",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23");
+insert into hive_iceberg_db_hms.types_one_part
values(223,22,234,123.324235,"aswwas",234.1234,"a23f","wsd",false,"2023-04-23
21:23:34.123","2023-04-23","2023-04-22");
+insert into hive_iceberg_db_hms.types_one_part
values(223,22,234,123.324235,"awsali",234.1234,"a23f","1234vb",true,"2023-04-22
21:21:34.123","2023-04-21","2023-04-24");
+insert into hive_iceberg_db_hms.types_multi_part
values(323,22,234,123.324235,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-20","2023-04-21
19:23:34.123","2023-04-19");
+insert into hive_iceberg_db_hms.types_multi_part
values(323,22,234,123.324235,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-20","2023-04-22
20:23:34.123","2023-04-22");
+insert into hive_iceberg_db_hms.types_multi_part
values(323,22,234,123.324235,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-21","2023-04-22
20:23:34.123","2023-04-22");
+insert into hive_iceberg_db_hms.types_multi_part
values(323,22,234,123.324235,"hwaws",234.1234,"a23f","1234vb",true,"2023-04-25","2023-04-21
19:23:34.123","2023-04-24");
diff --git a/tools/emr_storage_regression/ping_test/data/data_for_spark.sql
b/tools/emr_storage_regression/ping_test/data/data_for_spark.sql
new file mode 100644
index 0000000000..a1e2d1e996
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/data/data_for_spark.sql
@@ -0,0 +1,10 @@
+insert into spark_hms_db.types
values(123,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23");
+insert into spark_hms_db.types_one_part
values(223,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23 21:23:34","2023-04-23");
+insert into spark_hms_db.types_one_part
values(223,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23 21:23:34","2023-04-23");
+insert into spark_hms_db.types_multi_part
values(323,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",true,"2023-04-23","2023-04-23
21:23:34.123","2023-04-23");
+insert into spark_hms_db.types_multi_part
values(323,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",true,"2023-04-23","2023-04-23
21:23:34.123","2023-04-23");
+insert into spark_hms_db.types_multi_part
values(323,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",true,"2023-04-23","2023-04-23
21:23:34.123","2023-04-23");
+
+--insert into iceberg.spark_iceberg_db_hms.types
values(123,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23");
+--insert into iceberg.spark_iceberg_db_hms.types_one_part
values(223,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",false,"2023-04-23
21:23:34.123","2023-04-23","2023-04-23");
+--insert into iceberg.spark_iceberg_db_hms.types_multi_part
values(323,22,234,123.324235,"sawer",234.1234,"a23f","1234vb",true,"2023-04-23","2023-04-23
21:23:34.123","2023-04-23");
diff --git a/tools/emr_storage_regression/ping_test/ping.sql
b/tools/emr_storage_regression/ping_test/ping.sql
new file mode 100644
index 0000000000..01b75a15b2
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/ping.sql
@@ -0,0 +1,15 @@
+select * from spark_hms_db.types;
+select * from spark_hms_db.types_one_part;
+select * from spark_hms_db.types_multi_part;
+
+select * from hive_hms_db.types;
+select * from hive_hms_db.types_one_part;
+select * from hive_hms_db.types_multi_part;
+
+select * from spark_iceberg_db_hms.types;
+select * from spark_iceberg_db_hms.types_one_part;
+select * from spark_iceberg_db_hms.types_multi_part;
+
+select * from hive_iceberg_db_hms.types;
+select * from hive_iceberg_db_hms.types_one_part;
+select * from hive_iceberg_db_hms.types_multi_part;
\ No newline at end of file
diff --git a/tools/emr_storage_regression/ping_test/ping_aliyun.sql
b/tools/emr_storage_regression/ping_test/ping_aliyun.sql
new file mode 100644
index 0000000000..63e974a2e6
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/ping_aliyun.sql
@@ -0,0 +1,23 @@
+select * from spark_hms_db.types;
+select * from spark_hms_db.types_one_part;
+select * from spark_hms_db.types_multi_part;
+
+select * from hive_hms_db.types;
+select * from hive_hms_db.types_one_part;
+select * from hive_hms_db.types_multi_part;
+
+select * from spark_iceberg_db_hms.types;
+select * from spark_iceberg_db_hms.types_one_part;
+select * from spark_iceberg_db_hms.types_multi_part;
+
+select * from hive_iceberg_db_hms.types;
+select * from hive_iceberg_db_hms.types_one_part;
+select * from hive_iceberg_db_hms.types_multi_part;
+
+select * from hive_dlf_db.types;
+select * from hive_dlf_db.types_one_part;
+select * from hive_dlf_db.types_multi_part;
+
+select * from hive_iceberg_db_dlf.types;
+select * from hive_iceberg_db_dlf.types_one_part;
+select * from hive_iceberg_db_dlf.types_multi_part;
diff --git a/tools/emr_storage_regression/ping_test/ping_poc.sh
b/tools/emr_storage_regression/ping_test/ping_poc.sh
new file mode 100644
index 0000000000..e9c23e1581
--- /dev/null
+++ b/tools/emr_storage_regression/ping_test/ping_poc.sh
@@ -0,0 +1,151 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+## Step 1: create external table and import data
+ENDPOINT=$1
+REGION=$2
+SERVICE=$3
+AK=$4
+SK=$5
+HMS_META_URI=$6
+HMS_WAREHOUSE=$7
+BEELINE_URI=$8
+
+# set global env to local
+# shellcheck disable=SC2269
+FE_HOST=${FE_HOST}
+# shellcheck disable=SC2269
+FE_QUERY_PORT=${FE_QUERY_PORT}
+# shellcheck disable=SC2269
+USER=${USER}
+
+DLF_ENDPOINT=datalake-vpc.cn-beijing.aliyuncs.com
+JINDO_ENDPOINT=cn-beijing.oss-dls.aliyuncs.com
+
+if [[ -z ${HMS_WAREHOUSE} ]]; then
+ echo "Need warehouse for ${SERVICE}"
+fi
+cd "$(dirname "$0")" || gexit
+
+run_spark_create_sql() {
+ if [[ ${SERVICE} == 'ali' ]]; then
+ PARAM="--conf
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
\
+ --conf
spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
+ --conf
spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.aliyun.dlf.hive.DlfCatalog
\
+ --conf spark.sql.catalog.iceberg.access.key.id=${AK} \
+ --conf spark.sql.catalog.iceberg.access.key.secret=${SK} \
+ --conf spark.sql.catalog.iceberg.dlf.endpoint=${DLF_ENDPOINT} \
+ --conf spark.sql.catalog.iceberg.dlf.region-id=${REGION} \
+ --conf spark.sql.catalog.hms=org.apache.iceberg.spark.SparkCatalog
\
+ --conf spark.sql.catalog.hms.type=hive \
+ --conf spark.sql.defaultCatalog=hms \
+ --conf spark.sql.catalog.hms.warehouse=${HMS_WAREHOUSE} \
+            -f data/create_spark_ping.sql"
+ elif [[ ${SERVICE} == 'tx' ]]; then
+ PARAM="--jars
/usr/local/service/iceberg/iceberg-spark-runtime-3.2_2.12-0.13.1.jar \
+ --conf
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
\
+ --conf
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
+ --conf spark.sql.catalog.spark_catalog.type=hive \
+ --conf
spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
+ --conf spark.sql.catalog.local.type=hadoop \
+ --conf spark.sql.catalog.local.warehouse=/usr/hive/warehouse \
+            -f data/create_spark_ping.sql"
+ elif [[ ${SERVICE} == 'hw' ]]; then
+ PARAM="-f data/create_spark_ping.sql" 2>spark_create.log
+ else
+ echo "Unknown service type: ${SERVICE}"
+ exit 1
+ fi
+    # redirect stderr here, on the actual spark-sql invocation, rather than
+    # on the PARAM assignments above
+    eval spark-sql "${PARAM}" 2>spark_create.log
+}
+
+run_spark_create_sql
+run_hive_create_sql() {
+    if [[ ${SERVICE} == 'hw' ]]; then
+        beeline -u "${BEELINE_URI}" -f data/create_hive_ping.sql 2>hive_create.log
+    else
+        # ali, tx and other services use the plain hive CLI
+        hive -f data/create_hive_ping.sql 2>hive_create.log
+    fi
+}
+
+run_hive_create_sql
+
+## Step 2: make ping data
+spark-sql -f data/data_for_spark.sql >>spark_data.log
+hive -f data/data_for_hive.sql >>hive_data.log
+
+run_query() {
+ QUERY_NUM=1
+ TRIES=2
+ sql_file=$1
+ catalog=$2
+ while read -r query; do
+ echo -n "create catalog ${QUERY_NUM},"
+ for i in $(seq 1 "${TRIES}"); do
+ if [[ -n ${catalog} ]]; then
+ query="switch ${catalog};${query}"
+ fi
+ RES=$(mysql -vvv -h"${FE_HOST}" -u"${USER}" -P"${FE_QUERY_PORT}"
-e "${query}")
+ echo -n "${RES}"
+ [[ "${i}" != "${TRIES}" ]] && echo -n ","
+ done
+ QUERY_NUM=$((QUERY_NUM + 1))
+ done <"${sql_file}"
+}
+
+## Step 3: create external catalog in doris
+case "${SERVICE}" in
+ali)
+    # Substitute DLF_ENDPOINT and JINDO_ENDPOINT before the bare ENDPOINT
+    # placeholder (a plain ENDPOINT pass would mangle them), writing the
+    # result in a single pass instead of redirecting a file onto itself.
+    sed -e 's#DLF_ENDPOINT#'"${DLF_ENDPOINT}"'#g' -e 's#JINDO_ENDPOINT#'"${JINDO_ENDPOINT}"'#g' -e 's#ENDPOINT#'"${ENDPOINT}"'#g' -e 's#META_URI#'"${HMS_META_URI}"'#g' -e 's#AK_INPUT#'"${AK}"'#g' -e 's#SK_INPUT#'"${SK}"'#g' create_catalog_aliyun.sql >emr_catalog.sql
+ ;;
+tx)
+ sed -e 's#ENDPOINT#'"${ENDPOINT}"'#g' -e
's#META_URI#'"${HMS_META_URI}"'#g' -e 's#AK_INPUT#'"${AK}"'#g' -e
's#SK_INPUT#'"${SK}"'#g' create_catalog_tx.sql >emr_catalog.sql
+ ;;
+aws)
+ sed -e 's#ENDPOINT#'"${ENDPOINT}"'#g' -e
's#META_URI#'"${HMS_META_URI}"'#g' -e 's#AK_INPUT#'"${AK}"'#g' -e
's#SK_INPUT#'"${SK}"'#g' create_catalog_aws.sql >emr_catalog.sql
+ ;;
+hw)
+ sed -e 's#ENDPOINT#'"${ENDPOINT}"'#g' -e
's#META_URI#'"${HMS_META_URI}"'#g' -e 's#AK_INPUT#'"${AK}"'#g' -e
's#SK_INPUT#'"${SK}"'#g' create_catalog_hw.sql >emr_catalog.sql
+ ;;
+*)
+ echo "Internal error"
+ exit 1
+ ;;
+esac
+
+run_query emr_catalog.sql
+
+## Step 4: query ping
+EMR_CATALOG=$(awk '{print $6}' emr_catalog.sql)
+# shellcheck disable=SC2116
+# required echo here, or the EMR_CATALOG will not be split.
+for c in $(echo "${EMR_CATALOG}"); do
+ if [[ ${SERVICE} == 'ali' ]]; then
+ run_query ping_aliyun.sql "${c}"
+ fi
+ run_query ping.sql "${c}"
+done
diff --git a/tools/emr_storage_regression/standard_set/analyze.sql
b/tools/emr_storage_regression/standard_set/analyze.sql
new file mode 100644
index 0000000000..7a0e4ef746
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/analyze.sql
@@ -0,0 +1,69 @@
+analyze table ssb100_parquet.customer;
+analyze table ssb100_parquet.dates;
+analyze table ssb100_parquet.lineorder;
+analyze table ssb100_parquet.lineorder_flat;
+analyze table ssb100_parquet.part;
+analyze table ssb100_parquet.supplier;
+
+analyze table ssb100_orc.customer;
+analyze table ssb100_orc.dates;
+analyze table ssb100_orc.lineorder;
+analyze table ssb100_orc.lineorder_flat;
+analyze table ssb100_orc.part;
+analyze table ssb100_orc.supplier;
+
+analyze table tpch100_parquet.customer;
+analyze table tpch100_parquet.lineitem;
+analyze table tpch100_parquet.nation;
+analyze table tpch100_parquet.orders;
+analyze table tpch100_parquet.part;
+analyze table tpch100_parquet.partsupp;
+analyze table tpch100_parquet.region;
+analyze table tpch100_parquet.supplier;
+
+analyze table tpch100_orc.customer;
+analyze table tpch100_orc.lineitem;
+analyze table tpch100_orc.nation;
+analyze table tpch100_orc.orders;
+analyze table tpch100_orc.part;
+analyze table tpch100_orc.partsupp;
+analyze table tpch100_orc.region;
+analyze table tpch100_orc.supplier;
+
+analyze table clickbench_orc.hits;
+analyze table clickbench_parquet.hits;
+
+analyze table ssb100_parquet_hdfs.customer;
+analyze table ssb100_parquet_hdfs.dates;
+analyze table ssb100_parquet_hdfs.lineorder;
+analyze table ssb100_parquet_hdfs.lineorder_flat;
+analyze table ssb100_parquet_hdfs.part;
+analyze table ssb100_parquet_hdfs.supplier;
+
+analyze table ssb100_orc_hdfs.customer;
+analyze table ssb100_orc_hdfs.dates;
+analyze table ssb100_orc_hdfs.lineorder;
+analyze table ssb100_orc_hdfs.lineorder_flat;
+analyze table ssb100_orc_hdfs.part;
+analyze table ssb100_orc_hdfs.supplier;
+
+analyze table tpch100_orc_hdfs.customer;
+analyze table tpch100_orc_hdfs.lineitem;
+analyze table tpch100_orc_hdfs.nation;
+analyze table tpch100_orc_hdfs.orders;
+analyze table tpch100_orc_hdfs.part;
+analyze table tpch100_orc_hdfs.partsupp;
+analyze table tpch100_orc_hdfs.region;
+analyze table tpch100_orc_hdfs.supplier;
+
+analyze table tpch100_parquet_hdfs.customer;
+analyze table tpch100_parquet_hdfs.lineitem;
+analyze table tpch100_parquet_hdfs.nation;
+analyze table tpch100_parquet_hdfs.orders;
+analyze table tpch100_parquet_hdfs.part;
+analyze table tpch100_parquet_hdfs.partsupp;
+analyze table tpch100_parquet_hdfs.region;
+analyze table tpch100_parquet_hdfs.supplier;
+
+analyze table clickbench_hdfs_orc.hits;
+analyze table clickbench_hdfs_parquet.hits;
diff --git a/tools/emr_storage_regression/standard_set/gen_spark_create_sql.sh
b/tools/emr_storage_regression/standard_set/gen_spark_create_sql.sh
new file mode 100644
index 0000000000..e70e67e43e
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/gen_spark_create_sql.sh
@@ -0,0 +1,38 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+# shellcheck disable=SC2129
+BUCKET=$1
+TYPE=$2
+cd "$(dirname "$0")" || exit
+sh gen_tbl/gen_ssb_create_sql.sh "${BUCKET}"/ssb/ssb100_orc
ssb100_orc_"${TYPE}" orc >create_"${TYPE}".sql
+sh gen_tbl/gen_ssb_create_sql.sh "${BUCKET}"/ssb/ssb100_parquet
ssb100_parquet_"${TYPE}" parquet >>create_"${TYPE}".sql
+# tpch
+sh gen_tbl/gen_tpch_create_sql.sh "${BUCKET}"/tpch/tpch100_orc
tpch100_orc_"${TYPE}" orc >>create_"${TYPE}".sql
+sh gen_tbl/gen_tpch_create_sql.sh "${BUCKET}"/tpch/tpch100_parquet
tpch100_parquet_"${TYPE}" parquet >>create_"${TYPE}".sql
+# clickbench
+sh gen_tbl/gen_clickbench_create_sql.sh "${BUCKET}"/clickbench/hits_parquet
clickbench_parquet_"${TYPE}" parquet >>create_"${TYPE}".sql
+sh gen_tbl/gen_clickbench_create_sql.sh "${BUCKET}"/clickbench/hits_orc
clickbench_orc_"${TYPE}" orc >>create_"${TYPE}".sql
+# iceberg
+# sh gen_tbl/gen_ssb_create_sql.sh oss://benchmark-oss/ssb/ssb100_iceberg
ssb100_iceberg iceberg >> create_"${TYPE}".sql
+# sh gen_tbl/gen_tpch_create_sql.sh oss://benchmark-oss/tpch/tpch100_iceberg
tpch100_iceberg iceberg >> create_"${TYPE}".sql
+# sh gen_tbl/gen_clickbench_create_sql.sh
oss://benchmark-oss/clickbench/hits_iceberg clickbench_iceberg_hdfs >>
create_"${TYPE}".sql
diff --git
a/tools/emr_storage_regression/standard_set/gen_tbl/gen_clickbench_create_sql.sh
b/tools/emr_storage_regression/standard_set/gen_tbl/gen_clickbench_create_sql.sh
new file mode 100644
index 0000000000..90e57624ed
--- /dev/null
+++
b/tools/emr_storage_regression/standard_set/gen_tbl/gen_clickbench_create_sql.sh
@@ -0,0 +1,153 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+if [[ -z "$1" ]]; then
+ echo 'the first argument is database location'
+ exit
+else
+ db_loc=$1
+fi
+
+if [[ -z "$2" ]]; then
+ echo 'the second argument is database name'
+ exit
+else
+ db=$2
+fi
+if [[ -z "$3" ]]; then
+ format=parquet
+else
+ format=$3
+fi
+# shellcheck disable=SC2016
+echo '
+CREATE DATABASE IF NOT EXISTS '"${db}"';
+USE '"${db}"';
+CREATE TABLE IF NOT EXISTS `hits`(
+ `WatchID` BIGINT,
+ `JavaEnable` SMALLINT,
+ `Title` STRING,
+ `GoodEvent` SMALLINT,
+ `EventTime` TIMESTAMP,
+ `EventDate` DATE,
+ `CounterID` INT,
+ `ClientIP` INT,
+ `RegionID` INT,
+ `UserID` BIGINT,
+ `CounterClass` SMALLINT,
+ `OS` SMALLINT,
+ `UserAgent` SMALLINT,
+ `URL` STRING,
+ `Referer` STRING,
+ `IsRefresh` SMALLINT,
+ `RefererCategoryID` SMALLINT,
+ `RefererRegionID` INT,
+ `URLCategoryID` SMALLINT,
+ `URLRegionID` INT,
+ `ResolutionWidth` SMALLINT,
+ `ResolutionHeight` SMALLINT,
+ `ResolutionDepth` SMALLINT,
+ `FlashMajor` SMALLINT,
+ `FlashMinor` SMALLINT,
+ `FlashMinor2` STRING,
+ `NetMajor` SMALLINT,
+ `NetMinor` SMALLINT,
+ `UserAgentMajor` SMALLINT,
+ `UserAgentMinor` STRING,
+ `CookieEnable` SMALLINT,
+ `JavascriptEnable` SMALLINT,
+ `IsMobile` SMALLINT,
+ `MobilePhone` SMALLINT,
+ `MobilePhoneModel` STRING,
+ `Params` STRING,
+ `IPNetworkID` INT,
+ `TraficSourceID` SMALLINT,
+ `SearchEngineID` SMALLINT,
+ `SearchPhrase` STRING,
+ `AdvEngineID` SMALLINT,
+ `IsArtifical` SMALLINT,
+ `WindowClientWidth` SMALLINT,
+ `WindowClientHeight` SMALLINT,
+ `ClientTimeZone` SMALLINT,
+ `ClientEventTime` TIMESTAMP,
+ `SilverlightVersion1` SMALLINT,
+ `SilverlightVersion2` SMALLINT,
+ `SilverlightVersion3` INT,
+ `SilverlightVersion4` SMALLINT,
+ `PageCharset` STRING,
+ `CodeVersion` INT,
+ `IsLink` SMALLINT,
+ `IsDownload` SMALLINT,
+ `IsNotBounce` SMALLINT,
+ `FUniqID` BIGINT,
+ `OriginalURL` STRING,
+ `HID` INT,
+ `IsOldCounter` SMALLINT,
+ `IsEvent` SMALLINT,
+ `IsParameter` SMALLINT,
+ `DontCountHits` SMALLINT,
+ `WithHash` SMALLINT,
+ `HitColor` STRING,
+ `LocalEventTime` TIMESTAMP,
+ `Age` SMALLINT,
+ `Sex` SMALLINT,
+ `Income` SMALLINT,
+ `Interests` SMALLINT,
+ `Robotness` SMALLINT,
+ `RemoteIP` INT,
+ `WindowName` INT,
+ `OpenerName` INT,
+ `HistoryLength` SMALLINT,
+ `BrowserLanguage` STRING,
+ `BrowserCountry` STRING,
+ `SocialNetwork` STRING,
+ `SocialAction` STRING,
+ `HTTPError` SMALLINT,
+ `SendTiming` INT,
+ `DNSTiming` INT,
+ `ConnectTiming` INT,
+ `ResponseStartTiming` INT,
+ `ResponseEndTiming` INT,
+ `FetchTiming` INT,
+ `SocialSourceNetworkID` SMALLINT,
+ `SocialSourcePage` STRING,
+ `ParamPrice` BIGINT,
+ `ParamOrderID` STRING,
+ `ParamCurrency` STRING,
+ `ParamCurrencyID` SMALLINT,
+ `OpenstatServiceName` STRING,
+ `OpenstatCampaignID` STRING,
+ `OpenstatAdID` STRING,
+ `OpenstatSourceID` STRING,
+ `UTMSource` STRING,
+ `UTMMedium` STRING,
+ `UTMCampaign` STRING,
+ `UTMContent` STRING,
+ `UTMTerm` STRING,
+ `FromTag` STRING,
+ `HasGCLID` SMALLINT,
+ `RefererHash` BIGINT,
+ `URLHash` BIGINT,
+ `CLID` INT)
+USING '"${format}"'
+LOCATION "'"${db_loc}"'";
+'
diff --git
a/tools/emr_storage_regression/standard_set/gen_tbl/gen_ssb_create_sql.sh
b/tools/emr_storage_regression/standard_set/gen_tbl/gen_ssb_create_sql.sh
new file mode 100644
index 0000000000..ae4b993372
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/gen_tbl/gen_ssb_create_sql.sh
@@ -0,0 +1,166 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+if [[ -z "$1" ]]; then
+ echo 'the first argument is database location'
+ exit
+else
+ db_loc=$1
+fi
+
+if [[ -z "$2" ]]; then
+ echo 'the second argument is database name'
+ exit
+else
+ db=$2
+fi
+
+if [[ -z "$3" ]]; then
+ format=parquet
+else
+ format=$3
+fi
+# shellcheck disable=SC2016
+echo '
+CREATE DATABASE IF NOT EXISTS '"${db}"';
+USE '"${db}"';
+
+CREATE TABLE IF NOT EXISTS `customer`(
+ `c_custkey` BIGINT COMMENT "",
+ `c_name` VARCHAR(26) COMMENT "",
+ `c_address` VARCHAR(41) COMMENT "",
+ `c_city` VARCHAR(11) COMMENT "",
+ `c_nation` VARCHAR(16) COMMENT "",
+ `c_region` VARCHAR(13) COMMENT "",
+ `c_phone` VARCHAR(16) COMMENT "",
+ `c_mktsegment` VARCHAR(11) COMMENT "")
+USING '"${format}"'
+LOCATION "'"${db_loc}"/customer'";
+
+CREATE TABLE IF NOT EXISTS `dates`(
+ `d_datekey` BIGINT COMMENT "",
+ `d_date` VARCHAR(20) COMMENT "",
+ `d_dayofweek` VARCHAR(10) COMMENT "",
+ `d_month` VARCHAR(11) COMMENT "",
+ `d_year` BIGINT COMMENT "",
+ `d_yearmonthnum` BIGINT COMMENT "",
+ `d_yearmonth` VARCHAR(9) COMMENT "",
+ `d_daynuminweek` BIGINT COMMENT "",
+ `d_daynuminmonth` BIGINT COMMENT "",
+ `d_daynuminyear` BIGINT COMMENT "",
+ `d_monthnuminyear` BIGINT COMMENT "",
+ `d_weeknuminyear` BIGINT COMMENT "",
+ `d_sellingseason` VARCHAR(14) COMMENT "",
+ `d_lastdayinweekfl` BIGINT COMMENT "",
+ `d_lastdayinmonthfl` BIGINT COMMENT "",
+ `d_holidayfl` BIGINT COMMENT "",
+ `d_weekdayfl` BIGINT COMMENT "")
+USING '"${format}"'
+LOCATION "'"${db_loc}"/dates'";
+
+CREATE TABLE IF NOT EXISTS `lineorder`(
+ `lo_orderkey` BIGINT COMMENT "",
+ `lo_linenumber` BIGINT COMMENT "",
+ `lo_custkey` BIGINT COMMENT "",
+ `lo_partkey` BIGINT COMMENT "",
+ `lo_suppkey` BIGINT COMMENT "",
+ `lo_orderdate` BIGINT COMMENT "",
+ `lo_orderpriority` VARCHAR(16) COMMENT "",
+ `lo_shippriority` BIGINT COMMENT "",
+ `lo_quantity` BIGINT COMMENT "",
+ `lo_extendedprice` BIGINT COMMENT "",
+ `lo_ordtotalprice` BIGINT COMMENT "",
+ `lo_discount` BIGINT COMMENT "",
+ `lo_revenue` BIGINT COMMENT "",
+ `lo_supplycost` BIGINT COMMENT "",
+ `lo_tax` BIGINT COMMENT "",
+ `lo_commitdate` BIGINT COMMENT "",
+ `lo_shipmode` VARCHAR(11) COMMENT "")
+USING '"${format}"'
+LOCATION "'"${db_loc}"/lineorder'";
+
+CREATE TABLE IF NOT EXISTS `part`(
+ `p_partkey` BIGINT COMMENT "",
+ `p_name` VARCHAR(23) COMMENT "",
+ `p_mfgr` VARCHAR(7) COMMENT "",
+ `p_category` VARCHAR(8) COMMENT "",
+ `p_brand` VARCHAR(10) COMMENT "",
+ `p_color` VARCHAR(12) COMMENT "",
+ `p_type` VARCHAR(26) COMMENT "",
+ `p_size` BIGINT COMMENT "",
+ `p_container` VARCHAR(11) COMMENT "")
+USING '"${format}"'
+LOCATION "'"${db_loc}"/part'";
+
+CREATE TABLE IF NOT EXISTS `supplier`(
+ `s_suppkey` BIGINT COMMENT "",
+ `s_name` VARCHAR(26) COMMENT "",
+ `s_address` VARCHAR(26) COMMENT "",
+ `s_city` VARCHAR(11) COMMENT "",
+ `s_nation` VARCHAR(16) COMMENT "",
+ `s_region` VARCHAR(13) COMMENT "",
+ `s_phone` VARCHAR(16) COMMENT "")
+USING '"${format}"'
+LOCATION "'"${db_loc}"/supplier'";
+
+CREATE TABLE IF NOT EXISTS `lineorder_flat` (
+ `lo_orderdate` BIGINT COMMENT "",
+ `lo_orderkey` BIGINT COMMENT "",
+ `lo_linenumber` TINYINT COMMENT "",
+ `lo_custkey` BIGINT COMMENT "",
+ `lo_partkey` BIGINT COMMENT "",
+ `lo_suppkey` BIGINT COMMENT "",
+ `lo_orderpriority` VARCHAR(100) COMMENT "",
+ `lo_shippriority` TINYINT COMMENT "",
+ `lo_quantity` TINYINT COMMENT "",
+ `lo_extendedprice` BIGINT COMMENT "",
+ `lo_ordtotalprice` BIGINT COMMENT "",
+ `lo_discount` TINYINT COMMENT "",
+ `lo_revenue` BIGINT COMMENT "",
+ `lo_supplycost` BIGINT COMMENT "",
+ `lo_tax` TINYINT COMMENT "",
+ `lo_commitdate` BIGINT COMMENT "",
+ `lo_shipmode` VARCHAR(100) COMMENT "",
+ `c_name` VARCHAR(100) COMMENT "",
+ `c_address` VARCHAR(100) COMMENT "",
+ `c_city` VARCHAR(100) COMMENT "",
+ `c_nation` VARCHAR(100) COMMENT "",
+ `c_region` VARCHAR(100) COMMENT "",
+ `c_phone` VARCHAR(100) COMMENT "",
+ `c_mktsegment` VARCHAR(100) COMMENT "",
+ `s_name` VARCHAR(100) COMMENT "",
+ `s_address` VARCHAR(100) COMMENT "",
+ `s_city` VARCHAR(100) COMMENT "",
+ `s_nation` VARCHAR(100) COMMENT "",
+ `s_region` VARCHAR(100) COMMENT "",
+ `s_phone` VARCHAR(100) COMMENT "",
+ `p_name` VARCHAR(100) COMMENT "",
+ `p_mfgr` VARCHAR(100) COMMENT "",
+ `p_category` VARCHAR(100) COMMENT "",
+ `p_brand` VARCHAR(100) COMMENT "",
+ `p_color` VARCHAR(100) COMMENT "",
+ `p_type` VARCHAR(100) COMMENT "",
+ `p_size` TINYINT COMMENT "",
+ `p_container` VARCHAR(100) COMMENT "")
+USING '"${format}"'
+LOCATION "'"${db_loc}"/lineorder_flat'";
+'
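
For reference, gen_ssb_create_sql.sh only prints Spark SQL DDL to stdout. A minimal usage sketch (the OSS path and database name below are illustrative placeholders, not part of this patch):

    sh gen_tbl/gen_ssb_create_sql.sh oss://my-bucket/ssb100 ssb100_parquet_obj parquet | spark-sql
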
diff --git a/tools/emr_storage_regression/standard_set/gen_tbl/gen_tpch_create_sql.sh b/tools/emr_storage_regression/standard_set/gen_tbl/gen_tpch_create_sql.sh
new file mode 100644
index 0000000000..7905d2a940
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/gen_tbl/gen_tpch_create_sql.sh
@@ -0,0 +1,140 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+if [[ -z "$1" ]]; then
+    echo 'Usage: the first argument is the database location'
+    exit 1
+else
+    db_loc=$1
+fi
+
+if [[ -z "$2" ]]; then
+    echo 'Usage: the second argument is the database name'
+    exit 1
+else
+    db=$2
+fi
+
+if [[ -z "$3" ]]; then
+ format=parquet
+else
+ format=$3
+fi
+# shellcheck disable=SC2016
+echo '
+CREATE DATABASE IF NOT EXISTS '"${db}"';
+USE '"${db}"';
+
+CREATE TABLE IF NOT EXISTS `customer`(
+ `c_custkey` int,
+ `c_name` string,
+ `c_address` string,
+ `c_nationkey` int,
+ `c_phone` string,
+ `c_acctbal` decimal(12,2),
+ `c_mktsegment` string,
+ `c_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/customer'";
+
+CREATE TABLE IF NOT EXISTS `lineitem`(
+ `l_orderkey` int,
+ `l_partkey` int,
+ `l_suppkey` int,
+ `l_linenumber` int,
+ `l_quantity` decimal(12,2),
+ `l_extendedprice` decimal(12,2),
+ `l_discount` decimal(12,2),
+ `l_tax` decimal(12,2),
+ `l_returnflag` string,
+ `l_linestatus` string,
+ `l_shipdate` date,
+ `l_commitdate` date,
+ `l_receiptdate` date,
+ `l_shipinstruct` string,
+ `l_shipmode` string,
+ `l_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/lineitem'";
+
+CREATE TABLE IF NOT EXISTS `nation`(
+ `n_nationkey` int,
+ `n_name` string,
+ `n_regionkey` int,
+ `n_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/nation'";
+
+CREATE TABLE IF NOT EXISTS `orders`(
+ `o_orderkey` int,
+ `o_custkey` int,
+ `o_orderstatus` string,
+ `o_totalprice` decimal(12,2),
+ `o_orderdate` date,
+ `o_orderpriority` string,
+ `o_clerk` string,
+ `o_shippriority` int,
+ `o_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/orders'";
+
+CREATE TABLE IF NOT EXISTS `part`(
+ `p_partkey` int,
+ `p_name` string,
+ `p_mfgr` string,
+ `p_brand` string,
+ `p_type` string,
+ `p_size` int,
+ `p_container` string,
+ `p_retailprice` decimal(12,2),
+ `p_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/part'";
+
+CREATE TABLE IF NOT EXISTS `partsupp`(
+ `ps_partkey` int,
+ `ps_suppkey` int,
+ `ps_availqty` int,
+ `ps_supplycost` decimal(12,2),
+ `ps_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/partsupp'";
+
+CREATE TABLE IF NOT EXISTS `region` (
+ `r_regionkey` int,
+ `r_name` string,
+ `r_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/region'";
+
+CREATE TABLE IF NOT EXISTS `supplier`(
+ `s_suppkey` int,
+ `s_name` string,
+ `s_address` string,
+ `s_nationkey` int,
+ `s_phone` string,
+ `s_acctbal` decimal(12,2),
+ `s_comment` string)
+USING '"${format}"'
+LOCATION "'"${db_loc}"/supplier'";
+
+'
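
The TPC-H generator follows the same argument contract (location, database, optional format defaulting to parquet). One hedged sketch with placeholder paths, capturing the DDL in a file before executing it:

    sh gen_tbl/gen_tpch_create_sql.sh hdfs:///user/hive/warehouse/tpch100 tpch100_orc_hdfs orc > create_tpch.sql
    spark-sql -f create_tpch.sql
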
diff --git a/tools/emr_storage_regression/standard_set/queries/clickbench_queries.sql b/tools/emr_storage_regression/standard_set/queries/clickbench_queries.sql
new file mode 100644
index 0000000000..28dbd29dbd
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/queries/clickbench_queries.sql
@@ -0,0 +1,43 @@
+SELECT COUNT(*) FROM hits;
+SELECT COUNT(*) FROM hits WHERE AdvEngineID <> 0;
+SELECT SUM(AdvEngineID), COUNT(*), AVG(ResolutionWidth) FROM hits;
+SELECT AVG(UserID) FROM hits;
+SELECT COUNT(DISTINCT UserID) FROM hits;
+SELECT COUNT(DISTINCT SearchPhrase) FROM hits;
+SELECT MIN(EventDate), MAX(EventDate) FROM hits;
+SELECT AdvEngineID, COUNT(*) FROM hits WHERE AdvEngineID <> 0 GROUP BY AdvEngineID ORDER BY COUNT(*) DESC;
+SELECT RegionID, COUNT(DISTINCT UserID) AS u FROM hits GROUP BY RegionID ORDER BY u DESC LIMIT 10;
+SELECT RegionID, SUM(AdvEngineID), COUNT(*) AS c, AVG(ResolutionWidth), COUNT(DISTINCT UserID) FROM hits GROUP BY RegionID ORDER BY c DESC LIMIT 10;
+SELECT MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
+SELECT MobilePhone, MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
+SELECT SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
+SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
+SELECT SearchEngineID, SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
+SELECT UserID, COUNT(*) FROM hits GROUP BY UserID ORDER BY COUNT(*) DESC LIMIT 10;
+SELECT UserID, SearchPhrase, COUNT(*) FROM hits GROUP BY UserID, SearchPhrase ORDER BY COUNT(*) DESC LIMIT 10;
+SELECT UserID, SearchPhrase, COUNT(*) FROM hits GROUP BY UserID, SearchPhrase LIMIT 10;
+SELECT UserID, extract(minute FROM EventTime) AS m, SearchPhrase, COUNT(*) FROM hits GROUP BY UserID, m, SearchPhrase ORDER BY COUNT(*) DESC LIMIT 10;
+SELECT UserID FROM hits WHERE UserID = 435090932899640449;
+SELECT COUNT(*) FROM hits WHERE URL LIKE '%google%';
+SELECT SearchPhrase, MIN(URL), COUNT(*) AS c FROM hits WHERE URL LIKE '%google%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
+SELECT SearchPhrase, MIN(URL), MIN(Title), COUNT(*) AS c, COUNT(DISTINCT UserID) FROM hits WHERE Title LIKE '%Google%' AND URL NOT LIKE '%.google.%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
+SELECT * FROM hits WHERE URL LIKE '%google%' ORDER BY EventTime LIMIT 10;
+SELECT SearchPhrase FROM hits WHERE SearchPhrase <> '' ORDER BY EventTime LIMIT 10;
+SELECT SearchPhrase FROM hits WHERE SearchPhrase <> '' ORDER BY SearchPhrase LIMIT 10;
+SELECT SearchPhrase FROM hits WHERE SearchPhrase <> '' ORDER BY EventTime, SearchPhrase LIMIT 10;
+SELECT CounterID, AVG(length(URL)) AS l, COUNT(*) AS c FROM hits WHERE URL <> '' GROUP BY CounterID HAVING COUNT(*) > 100000 ORDER BY l DESC LIMIT 25;
+SELECT REGEXP_REPLACE(Referer, '^https?://(?:www\.)?([^/]+)/.*$', '\1') AS k, AVG(length(Referer)) AS l, COUNT(*) AS c, MIN(Referer) FROM hits WHERE Referer <> '' GROUP BY k HAVING COUNT(*) > 100000 ORDER BY l DESC LIMIT 25;
+SELECT SUM(ResolutionWidth), SUM(ResolutionWidth + 1), SUM(ResolutionWidth + 2), SUM(ResolutionWidth + 3), SUM(ResolutionWidth + 4), SUM(ResolutionWidth + 5), SUM(ResolutionWidth + 6), SUM(ResolutionWidth + 7), SUM(ResolutionWidth + 8), SUM(ResolutionWidth + 9), SUM(ResolutionWidth + 10), SUM(ResolutionWidth + 11), SUM(ResolutionWidth + 12), SUM(ResolutionWidth + 13), SUM(ResolutionWidth + 14), SUM(ResolutionWidth + 15), SUM(ResolutionWidth + 16), SUM(ResolutionWidth + 17), SUM(Resolutio [...]
+SELECT SearchEngineID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM hits WHERE SearchPhrase <> '' GROUP BY SearchEngineID, ClientIP ORDER BY c DESC LIMIT 10;
+SELECT WatchID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM hits WHERE SearchPhrase <> '' GROUP BY WatchID, ClientIP ORDER BY c DESC LIMIT 10;
+SELECT WatchID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM hits GROUP BY WatchID, ClientIP ORDER BY c DESC LIMIT 10;
+SELECT URL, COUNT(*) AS c FROM hits GROUP BY URL ORDER BY c DESC LIMIT 10;
+SELECT 1, URL, COUNT(*) AS c FROM hits GROUP BY 1, URL ORDER BY c DESC LIMIT 10;
+SELECT ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3, COUNT(*) AS c FROM hits GROUP BY ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3 ORDER BY c DESC LIMIT 10;
+SELECT URL, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = 0 AND IsRefresh = 0 AND URL <> '' GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
+SELECT Title, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = 0 AND IsRefresh = 0 AND Title <> '' GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
+SELECT URL, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND IsLink <> 0 AND IsDownload = 0 GROUP BY URL ORDER BY PageViews DESC LIMIT 10 OFFSET 1000;
+SELECT TraficSourceID, SearchEngineID, AdvEngineID, CASE WHEN (SearchEngineID = 0 AND AdvEngineID = 0) THEN Referer ELSE '' END AS Src, URL AS Dst, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 10 OFFSET 1000;
+SELECT URLHash, EventDate, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND TraficSourceID IN (-1, 6) AND RefererHash = 3594120000172545465 GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 10 OFFSET 100;
+SELECT WindowClientWidth, WindowClientHeight, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND DontCountHits = 0 AND URLHash = 2868770270353813622 GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10 OFFSET 10000;
+SELECT DATE_FORMAT(EventTime, '%Y-%m-%d %H:%i:00') AS M, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-14' AND EventDate <= '2013-07-15' AND IsRefresh = 0 AND DontCountHits = 0 GROUP BY DATE_FORMAT(EventTime, '%Y-%m-%d %H:%i:00') ORDER BY DATE_FORMAT(EventTime, '%Y-%m-%d %H:%i:00') LIMIT 10 OFFSET 1000;
diff --git a/tools/emr_storage_regression/standard_set/queries/ssb_flat_queries.sql b/tools/emr_storage_regression/standard_set/queries/ssb_flat_queries.sql
new file mode 100644
index 0000000000..52c249a7f0
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/queries/ssb_flat_queries.sql
@@ -0,0 +1,13 @@
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE LO_ORDERDATE >= 19930101 AND LO_ORDERDATE <= 19931231 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19940131 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35;
+SELECT SUM(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE weekofyear(LO_ORDERDATE) = 6 AND LO_ORDERDATE >= 19940101 AND LO_ORDERDATE <= 19941231 AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35;
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND FROM lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA' GROUP BY YEAR, P_BRAND ORDER BY YEAR, P_BRAND;
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND FROM lineorder_flat WHERE P_BRAND >= 'MFGR#2221' AND P_BRAND <= 'MFGR#2228' AND S_REGION = 'ASIA' GROUP BY YEAR, P_BRAND ORDER BY YEAR, P_BRAND;
+SELECT SUM(LO_REVENUE), (LO_ORDERDATE DIV 10000) AS YEAR, P_BRAND FROM lineorder_flat WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE' GROUP BY YEAR, P_BRAND ORDER BY YEAR, P_BRAND;
+SELECT C_NATION, S_NATION, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231 GROUP BY C_NATION, S_NATION, YEAR ORDER BY YEAR ASC, revenue DESC;
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_NATION = 'UNITED STATES' AND S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231 GROUP BY C_CITY, S_CITY, YEAR ORDER BY YEAR ASC, revenue DESC;
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19920101 AND LO_ORDERDATE <= 19971231 GROUP BY C_CITY, S_CITY, YEAR ORDER BY YEAR ASC, revenue DESC;
+SELECT C_CITY, S_CITY, (LO_ORDERDATE DIV 10000) AS YEAR, SUM(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_CITY IN ('UNITED KI1', 'UNITED KI5') AND S_CITY IN ('UNITED KI1', 'UNITED KI5') AND LO_ORDERDATE >= 19971201 AND LO_ORDERDATE <= 19971231 GROUP BY C_CITY, S_CITY, YEAR ORDER BY YEAR ASC, revenue DESC;
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, C_NATION, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND P_MFGR IN ('MFGR#1', 'MFGR#2') GROUP BY YEAR, C_NATION ORDER BY YEAR ASC, C_NATION ASC;
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_NATION, P_CATEGORY, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_MFGR IN ('MFGR#1', 'MFGR#2') GROUP BY YEAR, S_NATION, P_CATEGORY ORDER BY YEAR ASC, S_NATION ASC, P_CATEGORY ASC;
+SELECT (LO_ORDERDATE DIV 10000) AS YEAR, S_CITY, P_BRAND, SUM(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE S_NATION = 'UNITED STATES' AND LO_ORDERDATE >= 19970101 AND LO_ORDERDATE <= 19981231 AND P_CATEGORY = 'MFGR#14' GROUP BY YEAR, S_CITY, P_BRAND ORDER BY YEAR ASC, S_CITY ASC, P_BRAND ASC;
diff --git a/tools/emr_storage_regression/standard_set/queries/ssb_queries.sql b/tools/emr_storage_regression/standard_set/queries/ssb_queries.sql
new file mode 100644
index 0000000000..8b5e336593
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/queries/ssb_queries.sql
@@ -0,0 +1,13 @@
+SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE FROM lineorder, dates WHERE lo_orderdate = d_datekey AND d_year = 1993 AND lo_discount BETWEEN 1 AND 3 AND lo_quantity < 25;
+SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE FROM lineorder, dates WHERE lo_orderdate = d_datekey AND d_yearmonth = 'Jan1994' AND lo_discount BETWEEN 4 AND 6 AND lo_quantity BETWEEN 26 AND 35;
+SELECT SUM(lo_extendedprice * lo_discount) AS REVENUE FROM lineorder, dates WHERE lo_orderdate = d_datekey AND d_weeknuminyear = 6 AND d_year = 1994 AND lo_discount BETWEEN 5 AND 7 AND lo_quantity BETWEEN 26 AND 35;
+SELECT SUM(lo_revenue), d_year, p_brand FROM lineorder, dates, part, supplier WHERE lo_orderdate = d_datekey AND lo_partkey = p_partkey AND lo_suppkey = s_suppkey AND p_category = 'MFGR#12' AND s_region = 'AMERICA' GROUP BY d_year, p_brand ORDER BY p_brand;
+SELECT SUM(lo_revenue), d_year, p_brand FROM lineorder, dates, part, supplier WHERE lo_orderdate = d_datekey AND lo_partkey = p_partkey AND lo_suppkey = s_suppkey AND p_brand BETWEEN 'MFGR#2221' AND 'MFGR#2228' AND s_region = 'ASIA' GROUP BY d_year, p_brand ORDER BY d_year, p_brand;
+SELECT SUM(lo_revenue), d_year, p_brand FROM lineorder, dates, part, supplier WHERE lo_orderdate = d_datekey AND lo_partkey = p_partkey AND lo_suppkey = s_suppkey AND p_brand = 'MFGR#2239' AND s_region = 'EUROPE' GROUP BY d_year, p_brand ORDER BY d_year, p_brand;
+SELECT c_nation, s_nation, d_year, SUM(lo_revenue) AS REVENUE FROM customer, lineorder, supplier, dates WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_orderdate = d_datekey AND c_region = 'ASIA' AND s_region = 'ASIA' AND d_year >= 1992 AND d_year <= 1997 GROUP BY c_nation, s_nation, d_year ORDER BY d_year ASC, REVENUE DESC;
+SELECT c_city, s_city, d_year, SUM(lo_revenue) AS REVENUE FROM customer, lineorder, supplier, dates WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_orderdate = d_datekey AND c_nation = 'UNITED STATES' AND s_nation = 'UNITED STATES' AND d_year >= 1992 AND d_year <= 1997 GROUP BY c_city, s_city, d_year ORDER BY d_year ASC, REVENUE DESC;
+SELECT c_city, s_city, d_year, SUM(lo_revenue) AS REVENUE FROM customer, lineorder, supplier, dates WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_orderdate = d_datekey AND ( c_city = 'UNITED KI1' OR c_city = 'UNITED KI5' ) AND ( s_city = 'UNITED KI1' OR s_city = 'UNITED KI5' ) AND d_year >= 1992 AND d_year <= 1997 GROUP BY c_city, s_city, d_year ORDER BY d_year ASC, REVENUE DESC;
+SELECT c_city, s_city, d_year, SUM(lo_revenue) AS REVENUE FROM customer, lineorder, supplier, dates WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_orderdate = d_datekey AND ( c_city = 'UNITED KI1' OR c_city = 'UNITED KI5' ) AND ( s_city = 'UNITED KI1' OR s_city = 'UNITED KI5' ) AND d_yearmonth = 'Dec1997' GROUP BY c_city, s_city, d_year ORDER BY d_year ASC, REVENUE DESC;
+SELECT d_year, c_nation, SUM(lo_revenue - lo_supplycost) AS PROFIT FROM dates, customer, supplier, part, lineorder WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_partkey = p_partkey AND lo_orderdate = d_datekey AND c_region = 'AMERICA' AND s_region = 'AMERICA' AND ( p_mfgr = 'MFGR#1' OR p_mfgr = 'MFGR#2' ) GROUP BY d_year, c_nation ORDER BY d_year, c_nation;
+SELECT d_year, s_nation, p_category, SUM(lo_revenue - lo_supplycost) AS PROFIT FROM dates, customer, supplier, part, lineorder WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_partkey = p_partkey AND lo_orderdate = d_datekey AND c_region = 'AMERICA' AND s_region = 'AMERICA' AND ( d_year = 1997 OR d_year = 1998 ) AND ( p_mfgr = 'MFGR#1' OR p_mfgr = 'MFGR#2' ) GROUP BY d_year, s_nation, p_category ORDER BY d_year, s_nation, p_category;
+SELECT d_year, s_city, p_brand, SUM(lo_revenue - lo_supplycost) AS PROFIT FROM dates, customer, supplier, part, lineorder WHERE lo_custkey = c_custkey AND lo_suppkey = s_suppkey AND lo_partkey = p_partkey AND lo_orderdate = d_datekey AND s_nation = 'UNITED STATES' AND ( d_year = 1997 OR d_year = 1998 ) AND p_category = 'MFGR#14' GROUP BY d_year, s_city, p_brand ORDER BY d_year, s_city, p_brand;
diff --git a/tools/emr_storage_regression/standard_set/queries/tpch_queries.sql b/tools/emr_storage_regression/standard_set/queries/tpch_queries.sql
new file mode 100644
index 0000000000..43d46e0b0b
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/queries/tpch_queries.sql
@@ -0,0 +1,22 @@
+select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date '1998-12-01' - interval '90' day group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
+select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 15 and p_type like '%BRASS' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'EUROPE' and ps_supplycost = ( select min(ps_supplycost) from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and s_nationkey = n_nationkey and n_regi [...]
+select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment = 'BUILDING' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-15' and l_shipdate > date '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10;
+select o_orderpriority, count(*) as order_count from orders where o_orderdate >= date '1993-07-01' and o_orderdate < date '1993-07-01' + interval '3' month and exists ( select * from lineitem where l_orderkey = o_orderkey and l_commitdate < l_receiptdate ) group by o_orderpriority order by o_orderpriority;
+select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'ASIA' and o_orderdate >= date '1994-01-01' and o_orderdate < date '1994-01-01' + interval '1' year group by n_name order by revenue desc;
+select sum(l_extendedprice * l_discount) as revenue from lineitem where l_shipdate >= date '1994-01-01' and l_shipdate < date '1994-01-01' + interval '1' year and l_discount between .06 - 0.01 and .06 + 0.01 and l_quantity < 24;
+select supp_nation, cust_nation, l_year, sum(volume) as revenue from ( select n1.n_name as supp_nation, n2.n_name as cust_nation, extract(year from l_shipdate) as l_year, l_extendedprice * (1 - l_discount) as volume from supplier, lineitem, orders, customer, nation n1, nation n2 where s_suppkey = l_suppkey and o_orderkey = l_orderkey and c_custkey = o_custkey and s_nationkey = n1.n_nationkey and c_nationkey = n2.n_nationkey and ( (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY') or (n1.n_ [...]
+select o_year, sum(case when nation = 'BRAZIL' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) as volume, n2.n_name as nation from part, supplier, lineitem, orders, customer, nation n1, nation n2, region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'AMERIC [...]
+select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%green%' ) as profit group by nation, o_year order by na [...]
+select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date '1993-10-01' and o_orderdate < date '1993-10-01' + interval '3' month and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue desc limit 20;
+select ps_partkey, sum(ps_supplycost * ps_availqty) as value from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'GERMANY' group by ps_partkey having sum(ps_supplycost * ps_availqty) > ( select sum(ps_supplycost * ps_availqty) * 0.000002 from partsupp, supplier, nation where ps_suppkey = s_suppkey and s_nationkey = n_nationkey and n_name = 'GERMANY' ) order by value desc;
+select l_shipmode, sum(case when o_orderpriority = '1-URGENT' or o_orderpriority = '2-HIGH' then 1 else 0 end) as high_line_count, sum(case when o_orderpriority <> '1-URGENT' and o_orderpriority <> '2-HIGH' then 1 else 0 end) as low_line_count from orders, lineitem where o_orderkey = l_orderkey and l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate and l_shipdate < l_commitdate and l_receiptdate >= date '1994-01-01' and l_receiptdate < date '1994-01-01' + interval '1' year g [...]
+select c_count, count(*) as custdist from ( select c_custkey, count(o_orderkey) as c_count from customer left outer join orders on c_custkey = o_custkey and o_comment not like '%special%requests%' group by c_custkey ) as c_orders group by c_count order by custdist desc, c_count desc;
+select 100.00 * sum(case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount) else 0 end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue from lineitem, part where l_partkey = p_partkey and l_shipdate >= date '1995-09-01' and l_shipdate < date '1995-09-01' + interval '1' month;
+with revenue0 as ( select l_suppkey supplier_no, sum(l_extendedprice * (1 - l_discount)) total_revenue from lineitem where l_shipdate >= date '1996-01-01' and l_shipdate < date '1996-01-01' + interval '3' month group by l_suppkey ) select /*+SET_VAR(enable_nereids_planner=true,enable_pipeline_engine=true) */ s_suppkey, s_name, s_address, s_phone, total_revenue from supplier, revenue0 where s_suppkey = supplier_no and total_revenue = ( select max(total_revenue) from revenue0 ) order by s_ [...]
+select p_brand, p_type, p_size, count(distinct ps_suppkey) as supplier_cnt from partsupp, part where p_partkey = ps_partkey and p_brand <> 'Brand#45' and p_type not like 'MEDIUM POLISHED%' and p_size in (49, 14, 23, 45, 19, 3, 36, 9) and ps_suppkey not in ( select s_suppkey from supplier where s_comment like '%Customer%Complaints%' ) group by p_brand, p_type, p_size order by supplier_cnt desc, p_brand, p_type, p_size;
+select sum(l_extendedprice) / 7.0 as avg_yearly from lineitem, part where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' and l_quantity < ( select 0.2 * avg(l_quantity) from lineitem where l_partkey = p_partkey );
+select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 300 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate limit 100;
+select sum(l_extendedprice* (1 - l_discount)) as revenue from lineitem, part where ( p_partkey = l_partkey and p_brand = 'Brand#12' and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG') and l_quantity >= 1 and l_quantity <= 1 + 10 and p_size between 1 and 5 and l_shipmode in ('AIR', 'AIR REG') and l_shipinstruct = 'DELIVER IN PERSON' ) or ( p_partkey = l_partkey and p_brand = 'Brand#23' and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK') and l_quantity >= 10 and l_qu [...]
+select s_name, s_address from supplier, nation where s_suppkey in ( select ps_suppkey from partsupp where ps_partkey in ( select p_partkey from part where p_name like 'forest%' ) and ps_availqty > ( select 0.5 * sum(l_quantity) from lineitem where l_partkey = ps_partkey and l_suppkey = ps_suppkey and l_shipdate >= date '1994-01-01' and l_shipdate < date '1994-01-01' + interval '1' year ) ) and s_nationkey = n_nationkey and n_name = 'CANADA' order by s_name;
+select s_name, count(*) as numwait from supplier, lineitem l1, orders, nation where s_suppkey = l1.l_suppkey and o_orderkey = l1.l_orderkey and o_orderstatus = 'F' and l1.l_receiptdate > l1.l_commitdate and exists ( select * from lineitem l2 where l2.l_orderkey = l1.l_orderkey and l2.l_suppkey <> l1.l_suppkey ) and not exists ( select * from lineitem l3 where l3.l_orderkey = l1.l_orderkey and l3.l_suppkey <> l1.l_suppkey and l3.l_receiptdate > l3.l_commitdate ) and s_nationkey = n_nation [...]
+select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal from ( select substring(c_phone, 1, 2) as cntrycode, c_acctbal from customer where substring(c_phone, 1, 2) in ('13', '31', '23', '29', '30', '18', '17') and c_acctbal > ( select avg(c_acctbal) from customer where c_acctbal > 0.00 and substring(c_phone, 1, 2) in ('13', '31', '23', '29', '30', '18', '17') ) and not exists ( select * from orders where o_custkey = c_custkey ) ) as custsale group by cntrycode order by cntrycode;
diff --git a/tools/emr_storage_regression/standard_set/run_queries.sh b/tools/emr_storage_regression/standard_set/run_queries.sh
new file mode 100644
index 0000000000..6448c1ae74
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/run_queries.sh
@@ -0,0 +1,46 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+set -e
+
+FE_HOST=$1
+USER=$2
+FE_QUERY_PORT=$3
+DB=$4
+
+TRIES=3
+QUERY_NUM=1
+RESULT_FILE=result-master-"${DB}".csv
+touch "${RESULT_FILE}"
+truncate -s 0 "${RESULT_FILE}"
+
+while read -r query; do
+    echo -n "query${QUERY_NUM}," | tee -a "${RESULT_FILE}"
+    for i in $(seq 1 "${TRIES}"); do
+        RES=$(mysql -vvv -h"${FE_HOST}" -u"${USER}" -P"${FE_QUERY_PORT}" -D"${DB}" -e "${query}" | perl -nle 'print $1 if /((\d+\.\d+)+ sec)/' || :) # extract the "N.NN sec" timing from verbose mysql output; never fail the loop
+        echo -n "${RES}" | tee -a "${RESULT_FILE}"
+        [[ "${i}" != "${TRIES}" ]] && echo -n "," | tee -a "${RESULT_FILE}"
+    done
+    echo "" | tee -a "${RESULT_FILE}"
+
+    QUERY_NUM=$((QUERY_NUM + 1))
+done <"$5"
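run_queries.sh takes five positional arguments (FE host, user, query port, database, query file) and appends one CSV row per query with one timing per try. A hypothetical invocation against a local FE (host, user, port, and catalog name are examples only):

    sh run_queries.sh 127.0.0.1 root 9030 my_catalog.ssb100_parquet_obj queries/ssb_queries.sql
    # result-master-my_catalog.ssb100_parquet_obj.csv then holds rows like:
    # query1,1.03 sec,0.98 sec,0.95 sec
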
diff --git a/tools/emr_storage_regression/standard_set/run_standard_set.sh b/tools/emr_storage_regression/standard_set/run_standard_set.sh
new file mode 100644
index 0000000000..b237795129
--- /dev/null
+++ b/tools/emr_storage_regression/standard_set/run_standard_set.sh
@@ -0,0 +1,76 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# See emr_tools.sh
+##############################################################
+
+# usage: sh run_standard_set.sh FE_HOST USER QUERY_PORT CATALOG_NAME [ssb|ssb_flat|tpch|clickbench|all]
+FE_HOST=$1
+USER=$2
+PORT=$3
+if [[ -z "$4" ]]; then
+    echo 'catalog name is required'
+    exit 1
+else
+    catalog_name=$4
+fi
+
+if [[ -z "$5" ]]; then
+    echo "run all tests by default"
+elif [[ "$5" = 'all' ]]; then
+    echo "run all tests"
+else
+    case=$5
+fi
+
+if [[ -z ${TYPE} ]]; then
+ TYPE=obj
+fi
+echo "execute ${case} benchmark for ${TYPE}..."
+
+if [[ "${case}" = 'ssb' ]]; then
+ # ssb
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_parquet_"${TYPE}" queries/ssb_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_orc_"${TYPE}" queries/ssb_queries.sql
+elif [[ "${case}" = 'ssb_flat' ]]; then
+ # ssb_flat
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_parquet_"${TYPE}" queries/ssb_flat_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_orc_"${TYPE}" queries/ssb_flat_queries.sql
+elif [[ "${case}" = 'tpch' ]]; then
+ # tpch
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".tpch100_parquet_"${TYPE}" queries/tpch_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".tpch100_orc_"${TYPE}" queries/tpch_queries.sql
+elif [[ "${case}" = 'clickbench' ]]; then
+ # clickbench
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".clickbench_parquet_"${TYPE}" queries/clickbench_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".clickbench_orc_"${TYPE}" queries/clickbench_queries.sql
+else
+ # run all
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_parquet_"${TYPE}" queries/ssb_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_orc_"${TYPE}" queries/ssb_queries.sql
+
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_parquet_"${TYPE}" queries/ssb_flat_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".ssb100_orc_"${TYPE}" queries/ssb_flat_queries.sql
+
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".tpch100_parquet_"${TYPE}" queries/tpch_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".tpch100_orc_"${TYPE}" queries/tpch_queries.sql
+
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".clickbench_parquet_"${TYPE}" queries/clickbench_queries.sql
+ sh run_queries.sh "${FE_HOST}" "${USER}" "${PORT}"
"${catalog_name}".clickbench_orc_"${TYPE}" queries/clickbench_queries.sql
+fi
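
An end-to-end sketch for run_standard_set.sh (the FE endpoint and catalog name are assumptions; TYPE defaults to obj when unset): run only the SSB suite first, then the full set:

    TYPE=obj sh run_standard_set.sh 127.0.0.1 root 9030 my_hms_catalog ssb
    TYPE=obj sh run_standard_set.sh 127.0.0.1 root 9030 my_hms_catalog all
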
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]