Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-12-01 Thread via GitHub


amoghrajesh merged PR #44469:
URL: https://github.com/apache/airflow/pull/44469


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-30 Thread via GitHub


potiuk commented on PR #44469:
URL: https://github.com/apache/airflow/pull/44469#issuecomment-2509472199

   Rebased to account for the workaround in `main`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-30 Thread via GitHub


potiuk commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1864308815


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   > @amoghrajesh can we use the github token as optional parameter for this 
(similar to what we do with issue generation command). In case of throttling we 
can recommand using it
   
   Not a good idea. This whole part of the installation process and retrieving 
constraints is supposed to work **without** `GITHUB_TOKEN`. Basically you are 
not supposed any headers when using `--constraints https://` 
   
   And it's not needed. This is a ONE request when you install Airflow - i.e. 
run `pip install`. There is no scenario where throttling woudl be a limiting 
factor here, you would have to run 10s of pip installs at the same time to hit 
it - which even if you are behind proxy / NAT, is not going to realistically 
happen. 
   
   Airlfow repo is under "Enterprise"  account which has 15.000 requests per 
hour for primary limits and 900 points per minute (15 GET requests per second):
   
   
https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#about-secondary-rate-limits
   
   So I think why 403 on throttling is theorethically possible, it's not a 
concern for constraint retrieval.
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-29 Thread via GitHub


vincbeck commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1863810134


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,13 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
either of:\n"
+f"   1. network issues or VPN settings\n"
+f"   2. Github rate limiting limit"

Review Comment:
   The double limit looks weird (nota. native english speaker here so I might 
be wrong)
   
   ```suggestion
   f"   2. Github rate limit"
   ```



##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,13 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
either of:\n"
+f"   1. network issues or VPN settings\n"
+f"   2. Github rate limiting limit"

Review Comment:
   The double limit looks weird (not a native english speaker here so I might 
be wrong)
   
   ```suggestion
   f"   2. Github rate limit"
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-29 Thread via GitHub


eladkal commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1863739968


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   @amoghrajesh can we use the github token as optional parameter for this 
(similar to what we do with issue generation command). In case of throttling we 
can recommand using it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-29 Thread via GitHub


amoghrajesh commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1863704801


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   Its actually pretty useless to print `response.content` - it deviates 
entirely from the actual error



##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   Throttle limit, yes. Good idea



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-29 Thread via GitHub


gopidesupavan commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1863695916


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   May be worth adding this to print `response.content` so that we get to know 
what actually the error?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-29 Thread via GitHub


vincbeck commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1863614387


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   nit: We recently got some 403 from Github because we hit their throttle 
limit. Maybe update the message to also include this as possible cause?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Handle downloading github raw files gracefully for 403 status code [airflow]

2024-11-29 Thread via GitHub


vincbeck commented on code in PR #44469:
URL: https://github.com/apache/airflow/pull/44469#discussion_r1863614387


##
dev/breeze/src/airflow_breeze/utils/github.py:
##
@@ -54,6 +54,11 @@ def download_file_from_github(tag: str, path: str, 
output_file: Path, timeout: i
 if not get_dry_run():
 try:
 response = requests.get(url, timeout=timeout)
+if response.status_code == 403:
+get_console().print(
+f"[error]The {url} is not accessible.This may be caused by 
network issues or VPN settings"

Review Comment:
   nit: We recently got some 403 from Github because we hit their throttle 
limit. Maybe update the message to also this as possible cause?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org