[GitHub] [airflow] pgagnon commented on issue #6104: [AIRFLOW-4574] allow providing private_key in SSHHook
pgagnon commented on issue #6104: [AIRFLOW-4574] allow providing private_key in SSHHook
URL: https://github.com/apache/airflow/pull/6104#issuecomment-532366320

I see this has just been merged, but nevertheless... I can see why this feature could be useful, but I am concerned that it might encourage users to paste key material into their code and subsequently publish it to their VCS. To put it mildly, this is an inadvisable practice.

@dstandish @mik-laj @kaxil WDYT?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] pgagnon commented on issue #6104: [AIRFLOW-4574] allow providing private_key in SSHHook
pgagnon commented on issue #6104: [AIRFLOW-4574] allow providing private_key in SSHHook
URL: https://github.com/apache/airflow/pull/6104#issuecomment-532774200

@dstandish I see your point. My issue with this is that, from the user's standpoint, it is much less intuitive to figure out how to add a private key to a connection than to add a password, which is a standard connection field. Perhaps we could update the docs/docstring to provide an example and warn users against hardcoding keys in their DAGs?
[GitHub] [airflow] pgagnon commented on issue #6104: [AIRFLOW-4574] allow providing private_key in SSHHook
pgagnon commented on issue #6104: [AIRFLOW-4574] allow providing private_key in SSHHook
URL: https://github.com/apache/airflow/pull/6104#issuecomment-532948106

@mik-laj

> I think that it is very useful to be able to configure connections using environment variables.

I agree that it is useful, but as a side effect it creates an avenue for users to leak credentials inadvertently.

> Incorrect use is possible, but it is very limited and not very obvious.

This type of credential compromise is actually [so pervasive](https://www.zdnet.com/article/over-10-github-repos-have-leaked-api-or-cryptographic-keys/) that GitHub is now [scanning repositories for key patterns](https://help.github.com/en/articles/about-token-scanning). Google does [the same](https://cloud.google.com/source-repositories/docs/detecting-security-keys). Similarly, AWS released a [git hook to prevent users from accidentally committing key material](https://github.com/awslabs/git-secrets). I don't think it can be described as limited at all, especially since in our case we need to keep in mind that a lot of Airflow users are data scientist types who just want to get their work done and aren't necessarily expected to have an acute understanding of proper security practices.
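To make the "key pattern" idea concrete, here is a rough sketch of the kind of check such scanners and pre-commit hooks perform. It is only an illustration: the regular expression and the function name `find_private_key_headers` are my own assumptions, not how GitHub, Google, or git-secrets actually implement their scanning, and real tools cover many more credential formats.

```python
import re

# Assumption: PEM-encoded private keys begin with a header line such as
# "-----BEGIN RSA PRIVATE KEY-----". Real scanners check many more patterns.
PEM_PRIVATE_KEY_HEADER = re.compile(
    r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----"
)


def find_private_key_headers(text):
    """Return the 1-based line numbers that contain a PEM private key header."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if PEM_PRIVATE_KEY_HEADER.search(line)
    ]
```

Running something like this over a DAG file before committing would flag a pasted key by its header line, which is exactly the mistake I am worried about users making.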