devinjdangelo commented on code in PR #7244:
URL: https://github.com/apache/arrow-datafusion/pull/7244#discussion_r1289380024
##########
datafusion/common/src/config.rs:
##########
@@ -270,7 +270,48 @@ config_namespace! {
/// will be reordered heuristically to minimize the cost of evaluation. If false,
/// the filters are applied in the same order as written in the query
pub reorder_filters: bool, default = false
+
+ // The following map to parquet::file::properties::WriterProperties
+
+ /// Sets best effort maximum size of data page in bytes
+ pub data_pagesize_limit: usize, default = 1024 * 1024
Review Comment:
I think we will want `COPY TO` to be able to set any of these configs on a
per-statement basis.
For `insert into`, we could allow the table itself to be registered with
specific settings, e.g.:
```sql
create external table my_table(x int, y int)
stored as parquet
location '/tmp/my_table'
WITH (
DATA_PAGESIZE_LIMIT 2048,
DATA_PAGE_ROW_COUNT_LIMIT 100000,
...
);
```
`insert into my_table` would then use any table-specific settings or fall
back to the session-level configs.
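A rough sketch of that fallback, just to illustrate the idea. The struct and function names here (`SessionParquetOptions`, `TableParquetOverrides`, `resolve_writer_options`) are hypothetical and not existing DataFusion types; only the two option names come from the example above.

```rust
/// Hypothetical session-level writer settings (always populated).
#[derive(Debug, Clone, PartialEq)]
pub struct SessionParquetOptions {
    pub data_pagesize_limit: usize,
    pub data_page_row_count_limit: usize,
}

/// Hypothetical per-table overrides captured at `create external table` time.
/// `None` means the table did not set that option.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct TableParquetOverrides {
    pub data_pagesize_limit: Option<usize>,
    pub data_page_row_count_limit: Option<usize>,
}

/// Resolve the effective settings for an `insert into`:
/// take the table-level value when present, otherwise the session-level one.
pub fn resolve_writer_options(
    session: &SessionParquetOptions,
    table: &TableParquetOverrides,
) -> SessionParquetOptions {
    SessionParquetOptions {
        data_pagesize_limit: table
            .data_pagesize_limit
            .unwrap_or(session.data_pagesize_limit),
        data_page_row_count_limit: table
            .data_page_row_count_limit
            .unwrap_or(session.data_page_row_count_limit),
    }
}
```

Resolving per option (rather than all-or-nothing) means a table that only sets `DATA_PAGESIZE_LIMIT` would still pick up the session value for everything else.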