dabla commented on PR #65480: URL: https://github.com/apache/airflow/pull/65480#issuecomment-4506695963
Nice work @sunildataengineer. I have 3 remarks concerning this PR; once those are resolved I think we're good, as this will be a nice addition.

The reason I insist on moving the logic of `_do_transfer` into a `transfer` method in both hooks is that the deferred operator won't help you when you need to process multiple files. It's ideal when you only have one SFTP operation to do, since then you indeed won't block the worker. But when many files (e.g. a LOT of files) need to be processed, it's best to use the async `PythonOperator` to allow multiplexing. That will indeed block one worker while it processes all the files, but all SFTP operations run in one event loop and you can use the [SFTPClientPool](https://github.com/apache/airflow/blob/main/providers/sftp/src/airflow/providers/sftp/pools/sftp.py#L46), which improves throughput, something you cannot do with multiple deferred operators.

Of course this all depends on the use case. As I said, it's a good addition if you only need to do one specific SFTP operation.
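To make the multiplexing point concrete, here is a minimal sketch of many transfers sharing one event loop. It does not use the Airflow API: `transfer` and `transfer_all` are hypothetical names, `asyncio.sleep` stands in for the real SFTP I/O, and an `asyncio.Semaphore` only approximates the role a connection pool such as `SFTPClientPool` would play.

```python
import asyncio


async def transfer(path: str, pool_slots: asyncio.Semaphore) -> str:
    # Hypothetical stand-in for a hook-level `transfer` coroutine.
    # The semaphore caps how many transfers run at once, roughly as a
    # connection pool would cap the number of open SFTP sessions.
    async with pool_slots:
        await asyncio.sleep(0.01)  # simulated network I/O, not a real SFTP call
        return f"done:{path}"


async def transfer_all(paths: list[str], max_connections: int = 4) -> list[str]:
    # All transfers are scheduled on the same event loop, so one worker
    # can overlap the I/O waits of many files.
    pool_slots = asyncio.Semaphore(max_connections)
    return await asyncio.gather(*(transfer(p, pool_slots) for p in paths))


results = asyncio.run(transfer_all([f"/remote/file_{i}.csv" for i in range(10)]))
print(len(results), "files transferred")
```

With the logic exposed as a reusable coroutine on the hook, the same call could be made either from a single deferred operator or from a loop like this one inside an async task, which is the flexibility the remark is asking for.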
