renzepost opened a new pull request, #36817:
URL: https://github.com/apache/airflow/pull/36817
closes: #36793
As mentioned in #36793, a default setting of 1 for `parquet_row_group_size`
leads to quite a few problems. For example, the output Parquet files become
huge, the
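
A rough sketch of why a row group size of 1 is costly, assuming only that a Parquet writer starts a new row group whenever the current one reaches the configured number of rows. The helper `row_group_count` and the 1,000,000-row export are hypothetical illustrations, not taken from the provider code:

```python
def row_group_count(total_rows: int, row_group_size: int) -> int:
    """Return how many row groups are needed if each holds at most row_group_size rows."""
    if row_group_size < 1:
        raise ValueError("row_group_size must be at least 1")
    # Ceiling division using integers only, so large row counts stay exact.
    return -(-total_rows // row_group_size)


# Hypothetical export size, chosen only for illustration.
total_rows = 1_000_000

for size in (1, 100, 1000):
    # With size 1, every single row becomes its own row group.
    print(f"row_group_size={size}: {row_group_count(total_rows, size)} row groups")
```

Each row group carries its own metadata in the file, so one row group per row multiplies that overhead by the number of rows, while the 100 to 1000 range discussed in this thread cuts the count by two to three orders of magnitude.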
Taragolis commented on PR #36817:
URL: https://github.com/apache/airflow/pull/36817#issuecomment-1894336872
> @Taragolis suggested a much lower value between 100 and 1000.
This was a suggestion from the pessimist inside of me. 🤣
--
This is an automated message from the Apache Git
potiuk commented on PR #36817:
URL: https://github.com/apache/airflow/pull/36817#issuecomment-1894630369
It's a borderline breaking change, but I'd hate to bump the MAJOR version of the
google provider because of it - I think, however, it would be enough if there is
a STRONG mention in the Changelog
renzepost commented on PR #36817:
URL: https://github.com/apache/airflow/pull/36817#issuecomment-1896177270
Ah, got it! I've added a more verbose description in the changelog. Let me
know if I missed anything or need to change the wording.
potiuk commented on PR #36817:
URL: https://github.com/apache/airflow/pull/36817#issuecomment-1896225993
Nice
Taragolis commented on code in PR #36817:
URL: https://github.com/apache/airflow/pull/36817#discussion_r1456304434
##
airflow/providers/google/CHANGELOG.rst:
##
@@ -26,6 +26,14 @@
Changelog
-
+The default value of ``parquet_row_group_size`` in ``BaseSQLToGCSOperator`
eladkal commented on code in PR #36817:
URL: https://github.com/apache/airflow/pull/36817#discussion_r1457350670
##
airflow/providers/google/CHANGELOG.rst:
##
@@ -27,6 +27,16 @@
Changelog
-
+.. note::
+ The default value of ``parquet_row_group_size`` in ``BaseSQLToG
eladkal merged PR #36817:
URL: https://github.com/apache/airflow/pull/36817