hudi-bot opened a new issue, #17328:
URL: https://github.com/apache/hudi/issues/17328
{code:java}
The following `SET` statements do not take effect when they are placed before
the `CREATE TABLE` statement or inside the `OPTIONS` clause of the
`CREATE TABLE` statement.
# These set statements must be run before the create table statement.
SET hoodie.metadata.enable = true;
SET hoodie.metadata.index.column.stats.enable = true;
SET hoodie.enable.data.skipping = true;
CREATE DATABASE IF NOT EXISTS tpcds_hudi_1tb LOCATION
's3a://performance-benchmark-datasets-us-west-2/jenkins/benchmarks/read/tpcds_hudi_1tb';
USE tpcds_hudi_1tb;
CREATE TABLE tpcds_hudi_1tb.store_sales
USING hudi
OPTIONS (
type = 'mor',
primaryKey = 'ss_ticket_number',
precombineField = 'ss_sold_date_sk',
partitionPathComponentPrefixes = 'ss_sold_date_sk'
)
LOCATION
's3a://performance-benchmark-datasets-us-west-2/jenkins/benchmarks/read/tpcds_hudi_1tb/store_sales';
# For partition pruning
# Trigger partition pruning:
select count(1) from tpcds_hudi_1tb.store_sales where ss_sold_date_sk =
2450816;
# Without partition pruning:
select count(1) from tpcds_hudi_1tb.store_sales;
# For data skipping
# Both the following queries will skip files.
select count(1) from tpcds_hudi_1tb.store_sales where ss_customer_sk = 325;
select count(1) from tpcds_hudi_1tb.store_sales where ss_customer_sk = 100;
# To compare, recreate the table with `hoodie.enable.data.skipping` set to false.
{code}
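
The comparison run mentioned at the end of the snippet could look roughly like the sketch below. It is not part of the original report. It assumes, as the comment in the snippet says, that the setting has to be in place before the table is created, and it assumes the table is external (created with `LOCATION`), so that dropping it removes only the catalog entry. Check both assumptions before running this against shared benchmark data.

```sql
-- Sketch only: repeat the data-skipping queries with skipping disabled.
-- Assumption: store_sales is an external table, so DROP TABLE (without PURGE)
-- leaves the files under the S3 location untouched.
DROP TABLE IF EXISTS tpcds_hudi_1tb.store_sales;

-- Same session settings as the original run, except data skipping is off.
SET hoodie.metadata.enable = true;
SET hoodie.metadata.index.column.stats.enable = true;
SET hoodie.enable.data.skipping = false;

-- Table definition copied unchanged from the snippet above.
CREATE TABLE tpcds_hudi_1tb.store_sales
USING hudi
OPTIONS (
  type = 'mor',
  primaryKey = 'ss_ticket_number',
  precombineField = 'ss_sold_date_sk',
  partitionPathComponentPrefixes = 'ss_sold_date_sk'
)
LOCATION
's3a://performance-benchmark-datasets-us-west-2/jenkins/benchmarks/read/tpcds_hudi_1tb/store_sales';

-- Same predicates as the data-skipping queries, for a like-for-like comparison.
select count(1) from tpcds_hudi_1tb.store_sales where ss_customer_sk = 325;
select count(1) from tpcds_hudi_1tb.store_sales where ss_customer_sk = 100;
```

If both runs read the same number of files, the setting is probably not reaching the reader, which would match the behavior this issue describes.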
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-8665
- Type: Sub-task
- Parent: https://issues.apache.org/jira/browse/HUDI-9109
- Fix version(s):
  - 1.1.0