[I] Enable Partition Transforms and/or Spark SQL In Spark `rewrite_data_files` Procedure [iceberg]

via GitHub Mon, 16 Oct 2023 15:22:27 -0700


RLashofRegas opened a new issue, #8846:
URL: https://github.com/apache/iceberg/issues/8846


   ### Feature Request / Improvement
   
   I am using Iceberg v0.14.0 with Spark 3.3.0 on Amazon EMR 6.8.0.
   
   We are trying to implement regular table maintenance on a table that uses partition transforms. Let's say we have a table with DDL like:
   
   ```sql
   CREATE TABLE glue_catalog.my_db.my_table (
     load_date TIMESTAMP,
     ...
   )
   USING iceberg
   PARTITIONED BY (years(load_date), months(load_date), days(load_date))
   ...
   ```
   
   Ideally for this table we'd like to call one of the following:
   
   1. `CALL glue_catalog.system.rewrite_data_files(table => 'glue_catalog.my_db.my_table', where => "date(load_date)='2023-01-01'")`
   2. `CALL glue_catalog.system.rewrite_data_files(table => 'glue_catalog.my_db.my_table', where => "year(load_date)=2023 AND month(load_date)=1")`
   3. `CALL glue_catalog.system.rewrite_data_files(table => 'glue_catalog.my_db.my_table', where => "date(load_date)=DATE_TRUNC('MM', CURRENT_DATE) - INTERVAL 1 MONTH")`
   4. etc.
   
   Whenever I try one of these I get various exceptions:
   - `date(load_date)='2023-10-01'` -> `Py4JJavaError: An error occurred while calling o101.sql: java.util.NoSuchElementException: None.get`
   - `years(load_date)=2023` -> `IllegalArgumentException: Cannot parse predicates in where option: years(load_date)=2023`
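
   For comparison, here is a minimal sketch of the plain-column form of the first call: the `date(load_date)='2023-01-01'` filter rewritten as a half-open range on `load_date` itself, with the bounds computed in Python before the `CALL` string is built. The helper names `day_range_where` and `rewrite_call` are hypothetical, and whether Iceberg v0.14.0 accepts this predicate in `where` is an assumption that has not been tested here.

```python
from datetime import date, timedelta


def day_range_where(column: str, day: date) -> str:
    """Build a half-open timestamp range predicate covering one calendar day.

    Hypothetical helper: it references only the raw column, no functions.
    """
    start = day.isoformat()
    end = (day + timedelta(days=1)).isoformat()
    return f"{column} >= '{start} 00:00:00' AND {column} < '{end} 00:00:00'"


def rewrite_call(catalog: str, table: str, where: str) -> str:
    """Format the CALL statement, double-quoting the where string as above."""
    return (
        f"CALL {catalog}.system.rewrite_data_files("
        f"table => '{table}', where => \"{where}\")"
    )


where = day_range_where("load_date", date(2023, 1, 1))
sql = rewrite_call("glue_catalog", "glue_catalog.my_db.my_table", where)
print(sql)
# spark.sql(sql)  # assumes an active SparkSession with the Iceberg catalog configured
```

   This only covers fixed date ranges computed on the client side; it does not give the partition-transform or Spark SQL function support requested here.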
   
   ### Query engine
   
   Spark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

