Taragolis commented on code in PR #36817:
URL: https://github.com/apache/airflow/pull/36817#discussion_r1456304434


##########
airflow/providers/google/CHANGELOG.rst:
##########
@@ -26,6 +26,14 @@
 
 Changelog
 ---------
+The default value of ``parquet_row_group_size`` in ``BaseSQLToGCSOperator`` has changed from 1 to
+100000, in order to have a default that provides better compression efficiency and performance of
+reading the data in the output Parquet files. In many cases, the previous value of 1 resulted in
+very large files, long task durations and out of memory issues. A default value of 100000 may require
+more memory to execute the operator, in which case users can override the ``parquet_row_group_size``
+parameter in the operator. All operators that are derived from ``BaseSQLToGCSOperator`` are affected
+when ``export_format`` is ``parquet``: ``MySQLToGCSOperator``, ``PrestoToGCSOperator``,
+``OracleToGCSOperator``, ``TrinoToGCSOperator``, ``MSSQLToGCSOperator`` and ``PostgresToGCSOperator``.

Review Comment:
   ```suggestion
   
   .. note::
     The default value of ``parquet_row_group_size`` in ``BaseSQLToGCSOperator`` has changed from 1 to
     100000, in order to have a default that provides better compression efficiency and performance of
     reading the data in the output Parquet files. In many cases, the previous value of 1 resulted in
     very large files, long task durations and out of memory issues. A default value of 100000 may require
     more memory to execute the operator, in which case users can override the ``parquet_row_group_size``
     parameter in the operator. All operators that are derived from ``BaseSQLToGCSOperator`` are affected
     when ``export_format`` is ``parquet``: ``MySQLToGCSOperator``, ``PrestoToGCSOperator``,
     ``OracleToGCSOperator``, ``TrinoToGCSOperator``, ``MSSQLToGCSOperator`` and ``PostgresToGCSOperator``.
   ```
   
   Wrapping the text in a ``.. note::`` directive would make it more noticeable in the changelog.
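
For readers weighing the old default against the new one, here is a rough sketch of how the row group size drives the number of row groups in an output file. It assumes a simplified model in which the writer closes a row group every `parquet_row_group_size` rows; the helper `row_group_count` and the one-million-row figure are hypothetical and are not part of the operator's API.

```python
# Simplified model (an assumption, not the operator's actual code path):
# one row group is written for every `row_group_size` rows, plus one
# final partial group if rows are left over.

OLD_DEFAULT = 1        # previous default of parquet_row_group_size
NEW_DEFAULT = 100_000  # new default described in the changelog entry


def row_group_count(total_rows: int, row_group_size: int) -> int:
    """Return how many row groups are needed to hold `total_rows` rows."""
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    if row_group_size < 1:
        raise ValueError("row_group_size must be at least 1")
    # Integer ceiling division, so no floating-point rounding is involved.
    return (total_rows + row_group_size - 1) // row_group_size


if __name__ == "__main__":
    rows = 1_000_000  # hypothetical export size
    print("old default:", row_group_count(rows, OLD_DEFAULT), "row groups")
    print("new default:", row_group_count(rows, NEW_DEFAULT), "row groups")
```

Under this model a value of 1 yields one row group per exported row, which is consistent with the very large files and long task durations the entry describes. A larger value yields fewer groups but keeps more rows in memory at a time.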
   
   
![image](https://github.com/apache/airflow/assets/3998685/a4a67bd7-0543-42cd-bc72-d5e4f8514192)
   


