[PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-28 Thread via GitHub
selimchergui opened a new pull request, #37055: URL: https://github.com/apache/airflow/pull/37055 **Description** `SqlToS3Operator` creates an S3 object from the output of a SQL query. Unless you specify a `groupby_kwargs` parameter, all of the data is written to a single object. This could r
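The idea behind a `max_rows_per_file` limit can be sketched as follows. This is only an illustration under stated assumptions, not the code from the PR: the helper name `split_rows` and the `part_N` suffix scheme are hypothetical, and plain Python tuples stand in for the query result that the operator handles as a dataframe.

```python
from __future__ import annotations

from typing import Iterator, Sequence


def split_rows(
    rows: Sequence[tuple], max_rows_per_file: int
) -> Iterator[tuple[str, Sequence[tuple]]]:
    """Yield (suffix, chunk) pairs; each chunk holds at most max_rows_per_file rows.

    Hypothetical helper: the suffix naming is an assumption made for this sketch.
    """
    if max_rows_per_file <= 0:
        raise ValueError("max_rows_per_file must be a positive integer")
    # Step through the rows in fixed-size windows; the last window may be shorter.
    for part, start in enumerate(range(0, len(rows), max_rows_per_file)):
        yield f"part_{part}", rows[start : start + max_rows_per_file]


# Seven rows split with a limit of three rows per file.
rows = [(i, f"name_{i}") for i in range(7)]
chunks = list(split_rows(rows, max_rows_per_file=3))
```

Each chunk would then be written to its own S3 object, so no single object grows with the full size of the query result.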

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-28 Thread via GitHub
boring-cyborg[bot] commented on PR #37055: URL: https://github.com/apache/airflow/pull/37055#issuecomment-1913559242 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything please check our Contributors'

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-28 Thread via GitHub
potiuk commented on code in PR #37055: URL: https://github.com/apache/airflow/pull/37055#discussion_r1468901595 ## airflow/providers/amazon/aws/transfers/sql_to_s3.py: ## @@ -194,13 +197,32 @@ def execute(self, context: Context) -> None: def _partition_dataframe(self, df:

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
selimchergui commented on code in PR #37055: URL: https://github.com/apache/airflow/pull/37055#discussion_r1469234261 ## airflow/providers/amazon/aws/transfers/sql_to_s3.py: ## @@ -194,13 +197,32 @@ def execute(self, context: Context) -> None: def _partition_dataframe(sel

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
potiuk commented on PR #37055: URL: https://github.com/apache/airflow/pull/37055#issuecomment-1914610384 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
potiuk commented on code in PR #37055: URL: https://github.com/apache/airflow/pull/37055#discussion_r1469545063 ## airflow/providers/amazon/aws/transfers/sql_to_s3.py: ## @@ -81,6 +81,9 @@ class SqlToS3Operator(BaseOperator): You can specify this argument if you

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
potiuk commented on PR #37055: URL: https://github.com/apache/airflow/pull/37055#issuecomment-1914630398 I reverse-quoted a grouby_keywords as spellchecking did not like it, hopefully it will get green and I will be able to merge it

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
potiuk merged PR #37055: URL: https://github.com/apache/airflow/pull/37055

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
boring-cyborg[bot] commented on PR #37055: URL: https://github.com/apache/airflow/pull/37055#issuecomment-1915824330 Awesome work, congrats on your first merged pull request! You are invited to check our [Issue Tracker](https://github.com/apache/airflow/issues) for additional contributions.

Re: [PR] SqlToS3Operator: feat/ add max_rows_per_file parameter [airflow]

2024-01-29 Thread via GitHub
potiuk commented on PR #37055: URL: https://github.com/apache/airflow/pull/37055#issuecomment-1915824822 > Now that you explicitly fail if both parameters are specified, you could potentially simplify this statement: `if self.max_rows_per_file and not self.groupby_kwargs:` But I'll leave th
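The simplification suggested in the quoted comment can be illustrated with a small sketch. It assumes that both options are validated up front, as the comment describes. The class `TransferConfig`, the method `partition_mode`, and the use of `ValueError` are hypothetical stand-ins and are not taken from the operator's source.

```python
from __future__ import annotations


class TransferConfig:
    """Hypothetical stand-in for the operator's two partitioning options."""

    def __init__(
        self, groupby_kwargs: dict | None = None, max_rows_per_file: int = 0
    ) -> None:
        # Fail early when both options are given (exception type is an assumption).
        if groupby_kwargs and max_rows_per_file:
            raise ValueError(
                "Specify either groupby_kwargs or max_rows_per_file, not both"
            )
        self.groupby_kwargs = groupby_kwargs
        self.max_rows_per_file = max_rows_per_file

    def partition_mode(self) -> str:
        # Because the constructor rejects the combination, a truthy
        # max_rows_per_file already implies that groupby_kwargs is unset,
        # so an extra "and not self.groupby_kwargs" would be redundant here.
        if self.max_rows_per_file:
            return "by_row_count"
        if self.groupby_kwargs:
            return "by_group"
        return "single_object"
```

With the early check in place, the shorter condition behaves the same as the longer one for every configuration that can actually be constructed.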