[GitHub] [airflow] uranusjr commented on issue #27328: SFTPOperator throws object of type 'PlainXComArg' has no len() when using with Taskflow API

2022-10-27 Thread GitBox


uranusjr commented on issue #27328:
URL: https://github.com/apache/airflow/issues/27328#issuecomment-1294500097

   Value checks in operators should be done in `execute`, not `__init__`.
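
To illustrate the point, here is a minimal sketch. The classes `UnresolvedArg` and `TransferOperator` are hypothetical stand-ins, not Airflow's real `XComArg` or `SFTPOperator`. The assumption is that a value handed to `__init__` may still be an unresolved placeholder at DAG-parse time, so a check such as `len()` is only safe once `execute` runs on the resolved value.

```python
class UnresolvedArg:
    """Hypothetical placeholder for a value that is only known at run time."""

    def __init__(self, resolved_value):
        self._resolved_value = resolved_value

    def resolve(self):
        return self._resolved_value


class TransferOperator:
    """Hypothetical operator that defers value checks to execute()."""

    def __init__(self, local_filepath, remote_filepath):
        # Only store the arguments here; they may still be placeholders,
        # so calling len() on them would fail at parse time.
        self.local_filepath = local_filepath
        self.remote_filepath = remote_filepath

    @staticmethod
    def _resolve(value):
        # Stand-in for the resolution step that happens before execute().
        return value.resolve() if isinstance(value, UnresolvedArg) else value

    def execute(self):
        local = self._resolve(self.local_filepath)
        remote = self._resolve(self.remote_filepath)
        local = [local] if isinstance(local, str) else list(local)
        remote = [remote] if isinstance(remote, str) else list(remote)
        # The value check happens here, on real values.
        if len(local) != len(remote):
            raise ValueError(
                f"{len(local)} local paths given, but {len(remote)} remote paths"
            )
        return list(zip(local, remote))


# Construction succeeds even though the local paths are not known yet.
op = TransferOperator(UnresolvedArg(["/tmp/a", "/tmp/b"]), ["/in/a", "/in/b"])
pairs = op.execute()
```

With this layout a mismatch still raises `ValueError`, but only when the task runs, which is when the placeholder can actually be inspected.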


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] o-nikolas commented on issue #27338: scripts/tools/initialize_virtualenv.py calling .exists() on str

2022-10-27 Thread GitBox


o-nikolas commented on issue #27338:
URL: https://github.com/apache/airflow/issues/27338#issuecomment-1294435687

   Thanks for the bug report @rkarish!
   
   I've assigned you the task since you checked that you're willing to submit a 
PR :smiley: 
   
   >  I believe this should be os.path.exists(airflow_home) instead
   
   Looking at the function stub for `clean_up_airflow_home` it looks like that 
code is expecting a `Path` object, but it's getting a string. So I think the 
better fix is to update the calling code to pass in a Path object instead of a 
string as it is now. 
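
A rough sketch of that suggestion, under stated assumptions: the body of `clean_up_airflow_home` below is hypothetical (only the `.exists()` call is known from the traceback in the issue), and a throwaway temp directory stands in for the real Airflow home.

```python
import shutil
import tempfile
from pathlib import Path


def clean_up_airflow_home(airflow_home: Path) -> None:
    # Hypothetical body: only the `.exists()` call is known from the traceback.
    if airflow_home.exists():
        shutil.rmtree(airflow_home)


# The caller holds a plain string (here a throwaway temp directory) ...
airflow_home_dir = tempfile.mkdtemp()
# ... and converts it at the call site, so the function keeps its Path contract.
clean_up_airflow_home(Path(airflow_home_dir))
```

Converting at the call site keeps the type hint on the function honest, whereas switching to `os.path.exists` would leave the annotation and the actual argument type out of step.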
   





[GitHub] [airflow] o-nikolas commented on issue #27328: SFTPOperator throws object of type 'PlainXComArg' has no len() when using with Taskflow API

2022-10-27 Thread GitBox


o-nikolas commented on issue #27328:
URL: https://github.com/apache/airflow/issues/27328#issuecomment-1294411017

   Thanks for the bug report @jtommi!
   
   I see that you've checked that you're willing to submit a PR, so I have 
assigned the task to you :smiley: 





[GitHub] [airflow] o-nikolas commented on a diff in pull request #27184: SSHOperator ignores cmd_timeout (#27182)

2022-10-27 Thread GitBox


o-nikolas commented on code in PR #27184:
URL: https://github.com/apache/airflow/pull/27184#discussion_r1007576007


##
airflow/providers/ssh/hooks/ssh.py:
##
@@ -491,9 +491,12 @@ def exec_ssh_client_command(
 if stdout_buffer_length > 0:
     agg_stdout += stdout.channel.recv(stdout_buffer_length)
 
+timedout = False
+
 # read from both stdout and stderr
 while not channel.closed or channel.recv_ready() or channel.recv_stderr_ready():
     readq, _, _ = select([channel], [], [], timeout)
+    timedout = len(readq) == 0

Review Comment:
   I don't have a deep enough understanding of the select api and ssh to know 
for sure, but it seems harmless to check the others.
   
   If you're positive that this is correct then I'm happy to commit and 
approve.  






[GitHub] [airflow] boring-cyborg[bot] commented on issue #27338: scripts/tools/initialize_virtualenv.py calling .exists() on str

2022-10-27 Thread GitBox


boring-cyborg[bot] commented on issue #27338:
URL: https://github.com/apache/airflow/issues/27338#issuecomment-1294388090

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   





[GitHub] [airflow] rkarish opened a new issue, #27338: scripts/tools/initialize_virtualenv.py calling .exists() on str

2022-10-27 Thread GitBox


rkarish opened a new issue, #27338:
URL: https://github.com/apache/airflow/issues/27338

   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   While setting up a local development environment, I went to use the 
`scripts/tools/initialize_virtualenv.py` script and received an exception. I 
believe this should be `os.path.exists(airflow_home)` instead.
   
   ```
   Traceback (most recent call last):
     File "/Users/rkarish/Projects/airflow/scripts/tools/initialize_virtualenv.py", line 187, in <module>
       main()
     File "/Users/rkarish/Projects/airflow/scripts/tools/initialize_virtualenv.py", line 142, in main
       clean_up_airflow_home(airflow_home_dir)
     File "/Users/rkarish/Projects/airflow/scripts/tools/initialize_virtualenv.py", line 36, in clean_up_airflow_home
       if airflow_home.exists():
   AttributeError: 'str' object has no attribute 'exists'
   ```
   
   Also `LOCAL_VIRTUALENV.rst` has an incorrect path to this file.
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   _No response_
   
   ### Operating System
   
   macOS 13.0
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   





[GitHub] [airflow] potiuk commented on pull request #27337: fIx failing masking tests for python < 3.10

2022-10-27 Thread GitBox


potiuk commented on PR #27337:
URL: https://github.com/apache/airflow/pull/27337#issuecomment-1294354347

   Running with "full tests" and with a change in setup.py to get the latest 
version of exceptiongroup - apparently the test results differ slightly between 
Python versions.





[GitHub] [airflow] potiuk opened a new pull request, #27337: fIx failing masking tests for python < 3.10

2022-10-27 Thread GitBox


potiuk opened a new pull request, #27337:
URL: https://github.com/apache/airflow/pull/27337

   It seems that the number of times the user is printed in the stack trace 
depends on the Python version. The fix in #27335 seems to have worked only for 
Python 3.10. With version 1.0.0 of exceptiongroup the stack trace has fewer 
stack levels.
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] shubham22 commented on a diff in pull request #27262: Strenghten a bit and clarify importance of triaging issues

2022-10-27 Thread GitBox


shubham22 commented on code in PR #27262:
URL: https://github.com/apache/airflow/pull/27262#discussion_r1007459660


##
ISSUE_TRIAGE_PROCESS.rst:
##
@@ -30,6 +30,52 @@ to fix an issue or make an enhancement, without needing to 
open an issue first.
 This is intended to make it as easy as possible to contribute to
 the project.
 
+Another important part of our Issue reporting process are also Github 
Discussions.

Review Comment:
   - remove "also"
   - "GitHub"; also, maybe add a link to it, as this is the first time it is 
mentioned (https://github.com/apache/airflow/discussions)



##
COMMITTERS.rst:
##
@@ -75,7 +75,8 @@ Code contribution
 Community contributions
 
 
-1.  Was instrumental in triaging issues
+1.  Actively participated in `triaging issues `_ 
showing their understanding

Review Comment:
   Points 3~5 are present tense.
   Suggestion: Actively participates in...



##
ISSUE_TRIAGE_PROCESS.rst:
##
@@ -30,6 +30,52 @@ to fix an issue or make an enhancement, without needing to 
open an issue first.
 This is intended to make it as easy as possible to contribute to
 the project.
 
+Another important part of our Issue reporting process are also Github 
Discussions.
+Issues should represent clear feature requests or bugs which can/should be 
either implemented or fixed.
+Users are encouraged to open Discussions rather than Issues if there are no 
clear, reproducible
+steps, or when they have troubleshooting problems.
+
+Responding to issues/discussions (relatively) quickly
+'
+
+It is vital to provide rather quick feedback to Issues and Discussions opened 
by our users, so that they
+feel listened to rather than ignored. Even if the response is "we are not 
going to work on it because ...",
+or "converting this issue to discussion because ..." or "closing because it is 
a duplicate of #xxx", it is
+far more welcoming than leaving issues and discussions unanswered. Sometimes 
issues and discussions are
+answered by other users (and this is cool) but if an issue/discussion is not 
responded to for a few days or
+weeks, this gives an impression that the user reporting it is ignored, which 
creates an impression of a
+non-welcoming project.
+
+We strive to provide relatively quick responses to all such issues and 
discussions. Users should exercise
+patience while waiting for those (knowing that people might be busy, on 
vacations etc.) however they should
+not wait weeks until someone looks at their issues.
+
+Issue Triage team
+''
+
+While many of the issues can be responded to by other users and committers, 
the committer team is not
+big enough to handle all such requests and sometimes they are busy with 
implementing important huge features

Review Comment:
   Suggestion: with implementing important and complex features...



##
ISSUE_TRIAGE_PROCESS.rst:
##
@@ -30,6 +30,52 @@ to fix an issue or make an enhancement, without needing to 
open an issue first.
 This is intended to make it as easy as possible to contribute to
 the project.
 
+Another important part of our Issue reporting process are also Github 
Discussions.
+Issues should represent clear feature requests or bugs which can/should be 
either implemented or fixed.
+Users are encouraged to open Discussions rather than Issues if there are no 
clear, reproducible
+steps, or when they have troubleshooting problems.
+
+Responding to issues/discussions (relatively) quickly
+'
+
+It is vital to provide rather quick feedback to Issues and Discussions opened 
by our users, so that they
+feel listened to rather than ignored. Even if the response is "we are not 
going to work on it because ...",
+or "converting this issue to discussion because ..." or "closing because it is 
a duplicate of #xxx", it is
+far more welcoming than leaving issues and discussions unanswered. Sometimes 
issues and discussions are
+answered by other users (and this is cool) but if an issue/discussion is not 
responded to for a few days or
+weeks, this gives an impression that the user reporting it is ignored, which 
creates an impression of a
+non-welcoming project.

Review Comment:
   Suggestion: it gives an impression that the user was ignored and that the 
Airflow project is unwelcoming.



##
ISSUE_TRIAGE_PROCESS.rst:
##
@@ -30,6 +30,52 @@ to fix an issue or make an enhancement, without needing to 
open an issue first.
 This is intended to make it as easy as possible to contribute to
 the project.
 
+Another important part of our Issue reporting process are also Github 
Discussions.
+Issues should represent clear feature requests or bugs which can/should be 
either implemented or fixed.
+Users are encouraged to open Discussions rather than Issues if there are no 
clear, reproducible
+steps, or when they have troubleshooting problems.

Review 

[GitHub] [airflow] potiuk commented on pull request #27262: Strenghten a bit and clarify importance of triaging issues

2022-10-27 Thread GitBox


potiuk commented on PR #27262:
URL: https://github.com/apache/airflow/pull/27262#issuecomment-1294202399

   Any more comments? 





[airflow] branch main updated (acc6982770 -> 550b49b418)

2022-10-27 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from acc6982770 More resilient test for secrets masker (#27335)
 add 550b49b418 Skip Integration tests on Public runners if not full tests 
needed (#27322)

No new revisions were added by this update.

Summary of changes:
 .github/workflows/ci.yml   |   5 +
 dev/breeze/SELECTIVE_CHECKS.md |   1 +
 .../airflow_breeze/commands/testing_commands.py|  32 ++-
 .../commands/testing_commands_config.py|   1 +
 .../src/airflow_breeze/utils/selective_checks.py   |  24 +--
 dev/breeze/tests/test_selective_checks.py  |   8 +
 images/breeze/output-commands-hash.txt |   4 +-
 images/breeze/output-commands.svg  |  90 -
 images/breeze/output_testing.svg   |  22 +--
 images/breeze/output_testing_tests.svg | 216 +++--
 10 files changed, 223 insertions(+), 180 deletions(-)



[GitHub] [airflow] potiuk merged pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


potiuk merged PR #27322:
URL: https://github.com/apache/airflow/pull/27322





[GitHub] [airflow] potiuk commented on pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


potiuk commented on PR #27322:
URL: https://github.com/apache/airflow/pull/27322#issuecomment-1294202074

   Failures unrelated. All good.





[GitHub] [airflow] punx120 commented on a diff in pull request #27184: SSHOperator ignores cmd_timeout (#27182)

2022-10-27 Thread GitBox


punx120 commented on code in PR #27184:
URL: https://github.com/apache/airflow/pull/27184#discussion_r1007434797


##
airflow/providers/ssh/hooks/ssh.py:
##
@@ -491,9 +491,12 @@ def exec_ssh_client_command(
 if stdout_buffer_length > 0:
     agg_stdout += stdout.channel.recv(stdout_buffer_length)
 
+timedout = False
+
 # read from both stdout and stderr
 while not channel.closed or channel.recv_ready() or channel.recv_stderr_ready():
     readq, _, _ = select([channel], [], [], timeout)
+    timedout = len(readq) == 0

Review Comment:
   I'm not sure - from `select` doc
   ```
   the return value is a tuple of three lists corresponding to the first three 
arguments; 
   each contains the subset of the corresponding file descriptors that are 
ready.
   ```
   and we pass empty list to 2nd and 3rd args.
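
A small sketch of the behaviour described in that doc excerpt, using a local socket pair rather than an SSH channel (the helper name `wait_readable` is made up for this example). Because only the first list is populated, an empty `readq` is the only signal needed to detect a timeout.

```python
import socket
from select import select


def wait_readable(sock, timeout):
    """Return True if `sock` became readable, False if select() timed out."""
    # Only the first list is populated; the 2nd and 3rd are empty, so the
    # corresponding return values are always empty lists.
    readq, writeq, errq = select([sock], [], [], timeout)
    assert writeq == [] and errq == []
    return len(readq) > 0


left, right = socket.socketpair()
try:
    nothing_sent = wait_readable(left, 0.1)  # no data has been written yet
    right.sendall(b"x")
    after_send = wait_readable(left, 1.0)    # one byte is now pending
finally:
    left.close()
    right.close()
```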






[GitHub] [airflow] kazanzhy commented on a diff in pull request #26944: Use DbApiHook.run for DbApiHook.get_records and DbApiHook.get_first

2022-10-27 Thread GitBox


kazanzhy commented on code in PR #26944:
URL: https://github.com/apache/airflow/pull/26944#discussion_r1007433847


##
airflow/providers/common/sql/hooks/sql.py:
##
@@ -175,41 +207,26 @@ def get_pandas_df_by_chunks(self, sql, parameters=None, *, chunksize, **kwargs):
             yield from psql.read_sql(sql, con=conn, params=parameters, chunksize=chunksize, **kwargs)
 
     def get_records(
-        self,
-        sql: str | list[str],
-        parameters: Iterable | Mapping | None = None,
-        **kwargs: dict,
-    ):
+        self, sql: str | list[str], parameters: Iterable | Mapping | None = None
+    ) -> Any | list[Any]:

Review Comment:
   I changed back from `sql: str = ""` to `sql: str | list[str] = ""`.
   It seems strange but without it, I can't remove `# type: ignore[override]`
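
A likely explanation, offered as a sketch rather than a statement about this PR's class hierarchy: mypy's `override` check requires an overriding method to accept at least the argument types accepted by the method it overrides. The class names `BaseHook` and `ProviderHook` below are hypothetical.

```python
from __future__ import annotations

from typing import Any, Iterable, Mapping


class BaseHook:
    """Hypothetical parent whose `sql` parameter accepts str or list[str]."""

    def get_records(
        self, sql: str | list[str] = "", parameters: Iterable | Mapping | None = None
    ) -> Any:
        return [sql] if isinstance(sql, str) else list(sql)


class ProviderHook(BaseHook):
    """Hypothetical child that keeps the parameter type as wide as the parent's."""

    # Narrowing this to `sql: str = ""` would make mypy report an incompatible
    # override, because callers holding a BaseHook may legitimately pass a list.
    def get_records(
        self, sql: str | list[str] = "", parameters: Iterable | Mapping | None = None
    ) -> Any:
        return super().get_records(sql, parameters)


hook: BaseHook = ProviderHook()
records = hook.get_records(["SELECT 1", "SELECT 2"])
```

Keeping the wider annotation in the subclass is what lets the `# type: ignore[override]` comment go away without changing runtime behaviour.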






[GitHub] [airflow] potiuk commented on pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


potiuk commented on PR #27322:
URL: https://github.com/apache/airflow/pull/27322#issuecomment-1294152356

   Also added better messaging (colour, and showing the test types that are 
actually skipped/sequentialized).





[airflow] branch main updated (9c73b3f7fc -> acc6982770)

2022-10-27 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 9c73b3f7fc Fix typo (#27327)
 add acc6982770 More resilient test for secrets masker (#27335)

No new revisions were added by this update.

Summary of changes:
 tests/utils/log/test_secrets_masker.py | 27 ++-
 1 file changed, 2 insertions(+), 25 deletions(-)



[GitHub] [airflow] potiuk merged pull request #27335: More resilient test for secrets masker

2022-10-27 Thread GitBox


potiuk merged PR #27335:
URL: https://github.com/apache/airflow/pull/27335





[GitHub] [airflow] dejii opened a new pull request, #27336: use `id` key to retrieve the dataflow job_id

2022-10-27 Thread GitBox


dejii opened a new pull request, #27336:
URL: https://github.com/apache/airflow/pull/27336

   When using any of the DataflowJob sensors 
([example](https://github.com/apache/airflow/blob/9c73b3f7fc1d18925d0ed09e8719f53b8147b0f2/airflow/providers/google/cloud/example_dags/example_dataflow.py#L176-L181)),
 the `dataflow_job_id` key is used to extract the dataflow job id from the job 
returned by the dataflow task. This results in the error shown below
   ```
   jinja2.exceptions.UndefinedError: 'dict object' has no attribute 
'dataflow_job_id'
   ```
   The key that contains the job id is `id`. [Dataflow REST API 
reference](https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.jobs#Job)
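
As a simplified illustration, assuming the job returned by the task is a plain dict shaped like the `Job` resource in the reference above (the value is invented for this sketch):

```python
# Hypothetical job payload; only the presence of the `id` key is taken
# from the Dataflow REST API reference.
job = {"id": "example-job-id"}

job_id = job["id"]                    # the key that actually holds the job id
missing = job.get("dataflow_job_id")  # the key the sensors' examples used
```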
   
   





[GitHub] [airflow] potiuk opened a new pull request, #27335: More resilient test for secrets masker

2022-10-27 Thread GitBox


potiuk opened a new pull request, #27335:
URL: https://github.com/apache/airflow/pull/27335

   The test expected an exact stack trace, but what we really want to check is 
that the stack trace contains masked passwords at all levels of context.
   
   This PR makes the test more resilient to any changes in the stack trace.
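
A minimal sketch of what such a resilient check can look like. The `mask` helper and the secret value are hypothetical and are not Airflow's secrets masker; the point is only that the assertions look for masked and unmasked substrings instead of comparing the whole trace.

```python
import traceback

SECRET = "hunter2"  # hypothetical secret used only in this sketch


def mask(text: str) -> str:
    """Hypothetical masker: replace every occurrence of the secret."""
    return text.replace(SECRET, "***")


def raise_chained():
    try:
        raise RuntimeError(f"inner failure with {SECRET}")
    except RuntimeError as inner:
        raise ValueError(f"outer failure with {SECRET}") from inner


try:
    raise_chained()
except ValueError:
    masked_trace = mask(traceback.format_exc())

# Resilient checks: no exact-text comparison, so frame counts, line
# numbers and layout can change between Python versions.
secret_leaked = SECRET in masked_trace
inner_masked = "inner failure with ***" in masked_trace
outer_masked = "outer failure with ***" in masked_trace
```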
   
   
   
   





[GitHub] [airflow] potiuk commented on pull request #27333: Add examples and howtos about sensors

2022-10-27 Thread GitBox


potiuk commented on PR #27333:
URL: https://github.com/apache/airflow/pull/27333#issuecomment-1294060637

   Small but I think nice :)





[GitHub] [airflow] potiuk commented on pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


potiuk commented on PR #27322:
URL: https://github.com/apache/airflow/pull/27322#issuecomment-1294041835

   Actually, I found out that it did not work for Postgres, because it was 
inside 'mysql', 'mssql' - moved the `if` outside.





[GitHub] [airflow] potiuk closed issue #27332: airflow tasks are getting randomly terminated with no errors in UI and logs on worker shows module not found

2022-10-27 Thread GitBox


potiuk closed issue #27332: airflow tasks are getting randomly terminated with 
no errors in UI and logs on worker shows module not found
URL: https://github.com/apache/airflow/issues/27332





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1294034952

   This is very close to what I've heard! Good one @Taragolis! And yeah, 
`PYTHONDONTWRITEBYTECODE` is also my typical recommendation. 
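
For readers unfamiliar with the variable, a small sketch of what it does: when `PYTHONDONTWRITEBYTECODE` is set to a non-empty string, the interpreter does not write `.pyc` files, and `sys.dont_write_bytecode` reflects that. The child-process setup below is only for demonstration.

```python
import os
import subprocess
import sys

# Run a child interpreter with the variable set and ask it whether it
# will write .pyc files; sys.dont_write_bytecode reflects the setting.
env = {**os.environ, "PYTHONDONTWRITEBYTECODE": "1"}
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
child_flag = result.stdout.strip()
```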





[GitHub] [airflow] potiuk opened a new pull request, #27333: Add examples and howtos about sensors

2022-10-27 Thread GitBox


potiuk opened a new pull request, #27333:
URL: https://github.com/apache/airflow/pull/27333

   Examples and docs were missing for a number of built-in sensors. The 
documentation and examples added here do not add much, but they at least tell 
users that such sensors are available when they look at our documentation.
   
   
   
   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #27332: airflow tasks are getting randomly terminated with no errors in UI and logs on worker shows module not found

2022-10-27 Thread GitBox


boring-cyborg[bot] commented on issue #27332:
URL: https://github.com/apache/airflow/issues/27332#issuecomment-1294020775

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   





[GitHub] [airflow] gurjit-sandhu opened a new issue, #27332: airflow tasks are getting randomly terminated with no errors in UI and logs on worker shows module not found

2022-10-27 Thread GitBox


gurjit-sandhu opened a new issue, #27332:
URL: https://github.com/apache/airflow/issues/27332

   ### Official Helm Chart version
   
   1.6.0
   
   ### Apache Airflow version
   
   airflow 2
   
   ### Kubernetes Version
   
   1.22
   
   ### Helm Chart configuration
   
   I have set PYTHONPATH as below in extraEnv and verified it after logging in 
to the worker pod:

   extraEnv: |
     - name: PYTHONPATH
       value: "/opt/airflow/dags:/opt/airflow"
 
   also have setup PYTHONPATH in docker image
   
   # Setting python path for importing dag modules 
   ENV PYTHONPATH="/opt/airflow/dags:/opt/airflow"
 
   
   ### Docker Image customisations
   
   also have setup PYTHONPATH in docker image
   
   # Setting python path for importing dag modules 
   ENV PYTHONPATH="/opt/airflow/dags:/opt/airflow"
   
   ### What happened
   
   DAGs show up in the UI; however, after being executed, tasks get terminated. 
The worker logs below show that the worker is not able to find Python modules. 
If we re-run the same tasks they succeed, but they randomly fail with the 
"module not found" error below.
   
       _execute_in_fork(command_to_exec, celery_task_id)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 108, in _execute_in_fork
       raise AirflowException(msg)
   airflow.exceptions.AirflowException: Celery command failed on host: airflow-worker-1.airflow-worker.airflow.svc.cluster.local with celery_task_id f672c24c-d54c-4401-a1c1-8892d54f90f4
   [2022-10-27 18:48:32,001: INFO/MainProcess] Task airflow.executors.celery_executor.execute_command[1fdb829f-755a-4ac0-98c5-362e6b6c8a44] received
   [2022-10-27 18:48:32,011: INFO/ForkPoolWorker-15] [1fdb829f-755a-4ac0-98c5-362e6b6c8a44] Executing command in Celery: ['airflow', 'tasks', 'run', 'yipit_fsv_data_export_dag', 'verify_all_exports_were_successful', 'manual__2022-10-27T13:42:55-05:00', '--local', '--subdir', 'DAGS_FOLDER/partner_exports/data_export_dag_builder.py']
   [2022-10-27 18:48:32,087: INFO/ForkPoolWorker-15] Filling up the DagBag from /opt/airflow/dags/partner_exports/data_export_dag_builder.py
   [2022-10-27 18:48:32,144: ERROR/ForkPoolWorker-15] Failed to import: /opt/airflow/dags/partner_exports/data_export_dag_builder.py
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 317, in parse
       loader.exec_module(new_module)
     File "", line 843, in exec_module
     File "", line 219, in _call_with_frames_removed
     File "/opt/airflow/dags/partner_exports/data_export_dag_builder.py", line 20, in <module>
       from utils.generic_callbacks import on_failure_callback, on_success_callback
   ModuleNotFoundError: No module named 'utils'
   [2022-10-27 18:48:32,145: ERROR/ForkPoolWorker-15] [1fdb829f-755a-4ac0-98c5-362e6b6c8a44] Failed to execute task Dag 'yipit_fsv_data_export_dag' could not be found; either it does not exist or it failed to parse..
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 128, in _execute_in_fork
       args.func(args)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
       return func(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
       return f(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/task_command.py", line 360, in task_run
       dag = get_dag(args.subdir, args.dag_id)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 203, in get_dag
       raise AirflowException(
   airflow.exceptions.AirflowException: Dag 'yipit_fsv_data_export_dag' could not be found; either it does not exist or it failed to parse.
   
   ### What you think should happen instead
   
   Since PYTHONPATH is set in both the Docker image and the Helm chart 
environment variables, the DAG should be able to find its modules.
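
One way to narrow this down is to record what the worker process actually sees at parse time. The snippet below is only a diagnostic sketch, not part of the report: `module_visibility` is a hypothetical helper name, and `utils` is taken from the traceback above. It uses the standard-library `importlib.util.find_spec`, which returns `None` for a missing top-level module instead of raising.

```python
import importlib.util
import sys


def module_visibility(module_name):
    """Report whether a top-level module is importable and what sys.path holds."""
    # find_spec returns None (no exception) when a top-level module is missing.
    spec = importlib.util.find_spec(module_name)
    return {
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    # "utils" is the package named in the ModuleNotFoundError above.
    report = module_visibility("utils")
    print(report["found"], report["origin"])
    for entry in report["sys_path"]:
        print("  ", entry)
```

Logging this from the failing worker and from a succeeding one would show whether `/opt/airflow/dags` is missing from `sys.path` in the failing case.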
   
   ### How to reproduce
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] vincbeck opened a new pull request, #27331: Fix example_emr_eks system test. Clean trust policies from the execution role

2022-10-27 Thread GitBox


vincbeck opened a new pull request, #27331:
URL: https://github.com/apache/airflow/pull/27331

   Fix example_emr_eks system test. Clean trust policies from the execution 
role. The trust policy on the execution role grows too large because each 
system test run adds a new entry through update-role-trust-policy. See the 
error below:
   
   ```
   (LimitExceeded) when calling the UpdateAssumeRolePolicy operation: Cannot 
exceed quota for ACLSizePerRole: 2048
   ```





[GitHub] [airflow] Taragolis commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


Taragolis commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293984562

   > BTW. I've heard VERY bad things about EFS when EFS is used to share DAGs. 
It has a profound impact on the stability and performance of Airflow if you 
have a big number of DAGs unless you pay big bucks for IOPS. I've heard that 
from many people.
   > This is the moment when I usually STRONGLY recommend GitSync instead: 
https://medium.com/apache-airflow/shared-volumes-in-airflow-the-good-the-bad-and-the-ugly-22e9f681afca
   
   It always depends on configuration and monitoring. I personally had this 
issue, possibly on Airflow 2.1.x, and I do not know whether it was actually 
related to Airflow itself or to something else. Working with EFS definitely 
takes more effort than GitSync.
   
   For anyone who finds this thread in the future while dealing with EFS 
performance degradation, the following might help:
   
   **Avoid writing Python bytecode inside the NFS (AWS EFS) mount**
  + Mount it as read-only
  + Disable Python bytecode by setting `PYTHONDONTWRITEBYTECODE=x`
  + Or set a separate location for bytecode by setting `PYTHONPYCACHEPREFIX`, 
for example to `/tmp/pycaches`
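
To check that these variables take effect, a child interpreter can report what it picked up. This is a minimal sketch under stated assumptions: `probe_bytecode_settings` is a hypothetical helper, and it relies on `sys.dont_write_bytecode` and `sys.pycache_prefix` (the latter needs Python 3.8 or newer).

```python
import os
import subprocess
import sys


def probe_bytecode_settings(extra_env):
    """Start a child interpreter with extra_env and return its bytecode settings."""
    # Drop inherited values so that only extra_env decides the outcome.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX")
    }
    env.update(extra_env)
    code = "import sys; print(sys.dont_write_bytecode, sys.pycache_prefix)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(probe_bytecode_settings({"PYTHONDONTWRITEBYTECODE": "x"}))
    print(probe_bytecode_settings({"PYTHONPYCACHEPREFIX": "/tmp/pycaches"}))
```

Running the same one-liner inside the scheduler or worker container shows whether the deployment really passes the variables through.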
   
   Throughput in Bursting mode looks like a miracle at first, but when all the 
Bursting Capacity drops to zero it can turn your life into hell. Each newly 
created EFS share has about 2.1 TB of Bursting Capacity.
   
   What could be done here:
   - Switch to Provisioned Throughput mode permanently, which might cost a 
lot, something like 6 USD per 1 MiB/sec without VAT
   - Switch to Provisioned Throughput mode only when the Bursting Capacity 
falls below some amount, like 0.5 TB, and switch back when the Bursting 
Capacity recovers to the 2.1 TB limit. Unfortunately there is no autoscaling, 
so it would be manual or a combination of CloudWatch Alerting + AWS Lambda.
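
The second option amounts to a small hysteresis rule. The sketch below shows only the decision logic such a Lambda might run. The function name, the use of TiB for the thresholds, and the mode strings are assumptions for illustration. Reading the burst credit balance from CloudWatch and applying the change through the EFS API are deliberately left out.

```python
TIB = 1024 ** 4  # assumption: thresholds expressed in TiB of burst credits


def choose_throughput_mode(
    current_mode,
    burst_credit_bytes,
    low_bytes=int(0.5 * TIB),
    high_bytes=int(2.1 * TIB),
):
    """Return the throughput mode to use next, given the current burst credits."""
    # Credits are nearly exhausted: pay for provisioned throughput for a while.
    if current_mode == "bursting" and burst_credit_bytes < low_bytes:
        return "provisioned"
    # Credits have refilled to the ceiling: go back to the cheaper mode.
    if current_mode == "provisioned" and burst_credit_bytes >= high_bytes:
        return "bursting"
    # Between the two thresholds nothing changes, which avoids flapping.
    return current_mode


if __name__ == "__main__":
    print(choose_throughput_mode("bursting", int(0.4 * TIB)))
    print(choose_throughput_mode("provisioned", int(1.0 * TIB)))
```

Keeping two separate thresholds means the mode does not flip back and forth while the balance hovers around a single limit.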
   
   
![image](https://user-images.githubusercontent.com/3998685/198383225-2b101e42-726f-4f60-90e2-44ab3e4a1098.png)
   





[GitHub] [airflow] ferruzzi commented on pull request #27278: Add Andrey as member of the triage team

2022-10-27 Thread GitBox


ferruzzi commented on PR #27278:
URL: https://github.com/apache/airflow/pull/27278#issuecomment-1293961451

   Hey!  I missed this one.   Agreed, and congrats.





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293946748

   > 5, which involves educating all current/future maintainers to understand 
memory nuances :sweat_smile:
   
   As counterintuitive as it is, I know what you are talking about :)





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293945450

   BTW. I've heard VERY bad things about EFS when EFS is used to share DAGs. It 
has a profound impact on the stability and performance of Airflow if you have 
a big number of DAGs unless you pay big bucks for IOPS. I've heard that from 
many people.
   
   This is the moment when I STRONGLY recommend GitSync instead: 
https://medium.com/apache-airflow/shared-volumes-in-airflow-the-good-the-bad-and-the-ugly-22e9f681afca





[GitHub] [airflow] o-nikolas commented on a diff in pull request #27184: SSHOperator ignores cmd_timeout (#27182)

2022-10-27 Thread GitBox


o-nikolas commented on code in PR #27184:
URL: https://github.com/apache/airflow/pull/27184#discussion_r1007236645


##
airflow/providers/ssh/hooks/ssh.py:
##
@@ -491,9 +491,12 @@ def exec_ssh_client_command(
 if stdout_buffer_length > 0:
 agg_stdout += stdout.channel.recv(stdout_buffer_length)
 
+timedout = False
+
 # read from both stdout and stderr
 while not channel.closed or channel.recv_ready() or channel.recv_stderr_ready():
 readq, _, _ = select([channel], [], [], timeout)
+timedout = len(readq) == 0

Review Comment:
   I think you need to check if **all** three results are empty to be sure a 
timeout has occurred. Useful link: https://stackoverflow.com/a/15195460/1055702
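
The point about checking all three lists can be illustrated with the standard-library `select.select`, which returns three empty lists when the timeout expires without any file descriptor becoming ready. This is a standalone sketch, not the hook's code: a `socket.socketpair()` stands in for the paramiko channel, and `wait_for_data` is a hypothetical helper.

```python
import select
import socket


def wait_for_data(sock, timeout):
    """Wait for sock to become ready; report whether the wait timed out."""
    readable, writable, exceptional = select.select([sock], [], [sock], timeout)
    # A timeout is signalled by all three result lists being empty.
    timed_out = not (readable or writable or exceptional)
    return timed_out, readable


if __name__ == "__main__":
    left, right = socket.socketpair()
    print(wait_for_data(left, 0.05)[0])  # nothing was sent yet
    right.sendall(b"x")
    print(wait_for_data(left, 1.0)[0])   # data is waiting to be read
    left.close()
    right.close()
```

When only the read list is passed to `select`, an empty read list and three empty lists coincide, but testing all three stays correct if the write or exception lists are populated later.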






[GitHub] [airflow] ferruzzi commented on a diff in pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


ferruzzi commented on code in PR #27322:
URL: https://github.com/apache/airflow/pull/27322#discussion_r1007223939


##
dev/breeze/src/airflow_breeze/utils/selective_checks.py:
##
@@ -290,7 +290,7 @@ def default_constraints_branch(self) -> str:
 return self._default_constraints_branch
 
 @cached_property
-def _full_tests_needed(self) -> bool:

Review Comment:
   Congrats, little method, you've been promoted.  :P 






[GitHub] [airflow] zachliu commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


zachliu commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293904526

   i'm also using AWS EFS :handshake: 
   
   i think i'll try 1 & 2, they seem to be the easiest except 5, which involves 
educating all current/future maintainers to understand memory nuances 
:sweat_smile: 
   
   





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293897683

   5. Stop worrying about it. The fact that the Unix cache grows has only one 
drawback: it will show up in your monitoring service if you choose to track 
it. You should monitor other parameters instead. It's perfectly OK for the 
cache to grow until it fills all available memory; it has no negative 
consequences.
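
As an illustration of "monitor other parameters", the sketch below compares a naive "used" figure with one based on `MemAvailable` from Linux `/proc/meminfo`, which already accounts for reclaimable page cache. The helper names and the sample numbers are made up for the example; only the `MemTotal`, `MemFree`, `MemAvailable` and `Cached` field names come from the kernel's format.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kibibyte values."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_pressure(meminfo):
    """Fraction of memory not available to new work, per MemAvailable."""
    return 1.0 - meminfo["MemAvailable"] / meminfo["MemTotal"]


def naive_usage(meminfo):
    """Fraction of memory that is not free, which counts page cache as used."""
    return 1.0 - meminfo["MemFree"] / meminfo["MemTotal"]


# Made-up sample: most of the "used" memory is page cache.
SAMPLE = """MemTotal:       16000000 kB
MemFree:          500000 kB
MemAvailable:   12000000 kB
Cached:         11000000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(naive_usage(info), memory_pressure(info))
```

On a real host the text would come from reading `/proc/meminfo`; alerting on the `MemAvailable`-based figure avoids false alarms caused by a large file cache.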





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293895871

   3. Write a custom handler to rotate the logs outside.
   4. Use an external service for logging (CloudWatch etc.) to store them 
remotely.





[GitHub] [airflow] potiuk commented on issue #27329: Airflow with Oracle: the field dag.next_dagrun_data_interval_start shows the error ORA-00972: identifier is too long

2022-10-27 Thread GitBox


potiuk commented on issue #27329:
URL: https://github.com/apache/airflow/issues/27329#issuecomment-1293890994

   We do not support Oracle. Look at the prerequisites. There are likely many 
more errors when you try to run Airflow on an unsupported database, and even 
if you fix them, they are likely to break at any time.
   
   Simply don't use Oracle as the metadata DB. We are not going to help with 
solving any issues with it.





[GitHub] [airflow] potiuk closed issue #27329: Airflow with Oracle: the field dag.next_dagrun_data_interval_start shows the error ORA-00972: identifier is too long

2022-10-27 Thread GitBox


potiuk closed issue #27329: Airflow with Oracle: the field 
dag.next_dagrun_data_interval_start shows the error ORA-00972: identifier is 
too long
URL: https://github.com/apache/airflow/issues/27329





[GitHub] [airflow] potiuk commented on a diff in pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


potiuk commented on code in PR #27322:
URL: https://github.com/apache/airflow/pull/27322#discussion_r1007204589


##
dev/breeze/src/airflow_breeze/commands/testing_commands.py:
##
@@ -285,6 +286,13 @@ def run_tests_in_parallel(
 if test_type.startswith(heavy_test_type):
 test_types_list.remove(test_type)
 tests_to_run_sequentially.append(test_type)
+if full_tests_needed:

Review Comment:
   Obviously :facepalm: . That would actually explain why they were run last 
time :)
   






[GitHub] [airflow] jedcunningham commented on pull request #27327: Fix typo

2022-10-27 Thread GitBox


jedcunningham commented on PR #27327:
URL: https://github.com/apache/airflow/pull/27327#issuecomment-1293885338

   Thanks @bmtKIA6!





[airflow] branch main updated: Fix typo (#27327)

2022-10-27 Thread jedcunningham
This is an automated email from the ASF dual-hosted git repository.

jedcunningham pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 9c73b3f7fc Fix typo (#27327)
9c73b3f7fc is described below

commit 9c73b3f7fc1d18925d0ed09e8719f53b8147b0f2
Author: bmtKIA6 
AuthorDate: Thu Oct 27 14:04:23 2022 -0400

Fix typo (#27327)
---
 RELEASE_NOTES.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst
index 4d171f94c6..be6e0b34c4 100644
--- a/RELEASE_NOTES.rst
+++ b/RELEASE_NOTES.rst
@@ -158,7 +158,7 @@ pass a list of 1 or more Datasets:
 
 ..  code-block:: python
 
-with DAG(dag_id='dataset-consmer', schedule=[dataset]):
+with DAG(dag_id='dataset-consumer', schedule=[dataset]):
 ...
 
 And to mark a task as producing a dataset pass the dataset(s) to the 
``outlets`` attribute:



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27327: Fix typo

2022-10-27 Thread GitBox


boring-cyborg[bot] commented on PR #27327:
URL: https://github.com/apache/airflow/pull/27327#issuecomment-1293884947

   Awesome work, congrats on your first merged pull request!
   





[GitHub] [airflow] jedcunningham merged pull request #27327: Fix typo

2022-10-27 Thread GitBox


jedcunningham merged PR #27327:
URL: https://github.com/apache/airflow/pull/27327





[GitHub] [airflow] Taragolis commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


Taragolis commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293884630

   I also mount the default `logs` directory on NFS (AWS EFS), so I can only 
suggest my personal configuration, which I have used for a long time:
   1. Move the default dag processor manager log location outside of NFS, e.g. 
`AIRFLOW__LOGGING__DAG_PROCESSOR_MANAGER_LOG_LOCATION = 
"/tmp/airflow/logs/dag_processor_manager/dag_processor_manager.log"`
   2. Increase the print stats interval, e.g. 
`AIRFLOW__SCHEDULER__PRINT_STATS_INTERVAL = 300`, which can reduce the final 
size of the file
   





[GitHub] [airflow] potiuk commented on pull request #27214: Refactor amazon providers tests which use `moto`

2022-10-27 Thread GitBox


potiuk commented on PR #27214:
URL: https://github.com/apache/airflow/pull/27214#issuecomment-1293882239

   Actually that's a good one. Maybe it has something to do with a race when 
creating the network. The idea for this test was to test kerberos with the 
right address. And the problem with those kerberos tests was that they did not 
work with Docker2 (there was an issue about it), so we will have to fix it 
anyway. Maybe soon.





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293847808

   Nothing we can do about it :). But I am not sure if those are the culprits - 
according to the descriptions those should be removed when Airflow stops 
keeping the file open, unless the client crashes
   





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293846632

   silly rename :rofl: 





[GitHub] [airflow] pankajastro opened a new pull request, #27330: [Docs] Fix duplicate param in docstring RedshiftSQLHook `get_table_primary_key` method

2022-10-27 Thread GitBox


pankajastro opened a new pull request, #27330:
URL: https://github.com/apache/airflow/pull/27330

   [docs only change] The RedshiftSQLHook `get_table_primary_key` docstring 
lists the `table` param twice, but it should list `table` once and `schema` 
once.
   
   
   
   





[GitHub] [airflow] Taragolis commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


Taragolis commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293838786

   > wait i'm confused, so it is NFS design choice not to remove the cache file 
after it's written to an actual file?
   > 
   > 
![2022-10-27_12-07](https://user-images.githubusercontent.com/14293802/198342056-2a836c9b-4d02-40da-9ab2-231087e6fac6.png)
   
   https://nfs.sourceforge.net/#faq_d2
   
   





[GitHub] [airflow] jedcunningham commented on a diff in pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


jedcunningham commented on code in PR #27322:
URL: https://github.com/apache/airflow/pull/27322#discussion_r1007164990


##
dev/breeze/src/airflow_breeze/commands/testing_commands.py:
##
@@ -285,6 +286,13 @@ def run_tests_in_parallel(
 if test_type.startswith(heavy_test_type):
 test_types_list.remove(test_type)
 tests_to_run_sequentially.append(test_type)
+if full_tests_needed:

Review Comment:
   Should this be `if not full_tests_needed`? We want to still run them when 
`full_tests_needed` is true, right?






[GitHub] [airflow] eitanme commented on pull request #27190: External task sensor fail fix

2022-10-27 Thread GitBox


eitanme commented on PR #27190:
URL: https://github.com/apache/airflow/pull/27190#issuecomment-1293802461

   @potiuk because of this bug, to use the `ExternalTaskSensor` currently you 
must explicitly set a timeout on the sensor or your DAG will hang forever. To 
your point on reliance on old behavior: to work around the bug, folks may have 
set that timeout to avoid an infinite hang.
   
   In those cases, fixing this bug will change the exception they receive from 
`AirflowSensorTimeout` to the generic `AirflowException`. If they rely on 
catching the `AirflowSensorTimeout` exception subclass they may have issues, 
though if they catch the base class they'd still be OK.
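
The difference between the two kinds of handlers can be shown without Airflow installed. The two classes below are local stand-ins that only mirror the relationship described above, with `AirflowSensorTimeout` as a subclass of `AirflowException`. The helper names are hypothetical.

```python
class AirflowException(Exception):
    """Stand-in for airflow.exceptions.AirflowException."""


class AirflowSensorTimeout(AirflowException):
    """Stand-in for airflow.exceptions.AirflowSensorTimeout."""


def handled_by_base_class(exc):
    """User code that catches the base class handles both exceptions."""
    try:
        raise exc
    except AirflowException:
        return True


def handled_by_timeout_only(exc):
    """User code that catches only the subclass misses the generic exception."""
    try:
        try:
            raise exc
        except AirflowSensorTimeout:
            return True
    except AirflowException:
        # The generic exception escaped the narrow handler above.
        return False


if __name__ == "__main__":
    for exc in (AirflowSensorTimeout("old behavior"), AirflowException("after the fix")):
        print(type(exc).__name__, handled_by_base_class(exc), handled_by_timeout_only(exc))
```

Only code shaped like `handled_by_timeout_only` would see a behavior change once the sensor raises the generic exception.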
   
   Does that sound about right? What would you propose we do?
   
   I'm happy to update a changelog if I'm pointed in the right direction.
   
   Also, there are some failing checks on this PR that I don't understand. 
Specifically, in the Sqlite Py3.7: API Always CLI Core Integration Other 
Providers WWW check a test fails that I'm pretty sure I don't go anywhere near:
   
   ```
   FAILED 
tests/jobs/test_local_task_job.py::TestLocalTaskJob::test_heartbeat_failed_fast
   ```
   
   Any ideas on that front? The logs are long and I didn't see much useful in 
them while looking through so I wanted to ask before trying to dig deeper as 
I'm not super familiar with this code-base and the checks on it.
   
   
   





[GitHub] [airflow] ejstembler commented on issue #27300: Scheduler encounters database update error, then gets stuck in endless loop, yet still shows as healthy

2022-10-27 Thread GitBox


ejstembler commented on issue #27300:
URL: https://github.com/apache/airflow/issues/27300#issuecomment-1293802439

   Incidentally, two Astronomer engineers are familiar with the issue: 
@alex-astronomer and @wolfier 





[GitHub] [airflow] boring-cyborg[bot] commented on issue #27329: Airflow with Oracle: the field dag.next_dagrun_data_interval_start shows the error ORA-00972: identifier is too long

2022-10-27 Thread GitBox


boring-cyborg[bot] commented on issue #27329:
URL: https://github.com/apache/airflow/issues/27329#issuecomment-1293763131

   Thanks for opening your first issue here! Be sure to follow the issue 
template!
   





[GitHub] [airflow] Alaeddine22 opened a new issue, #27329: Airflow with Oracle: the field dag.next_dagrun_data_interval_start shows the error ORA-00972: identifier is too long

2022-10-27 Thread GitBox


Alaeddine22 opened a new issue, #27329:
URL: https://github.com/apache/airflow/issues/27329

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   Hello,
   I installed the stable version 2.4.1 with SQLAlchemy configured for an 
Oracle database.
   
   When running `airflow standalone`, I get the following error: 
   
   `sqlalchemy.exc.DatabaseError: (cx_Oracle.DatabaseError) ORA-00972: 
identifier is too long
   [SQL: SELECT dag.dag_id AS dag_dag_id, dag.root_dag_id AS dag_root_dag_id, 
dag.is_paused AS dag_is_paused, dag.is_subdag AS dag_is_subdag, dag.is_active 
AS dag_is_active, dag.last_parsed_time AS dag_last_parsed_time, 
dag.last_pickled AS dag_last_pickled, dag.last_expired AS dag_last_expired, 
dag.scheduler_lock AS dag_scheduler_lock, dag.pickle_id AS dag_pickle_id, 
dag.fileloc AS dag_fileloc, dag.processor_subdir AS dag_processor_subdir, 
dag.owners AS dag_owners, dag.description AS dag_description, dag.default_view 
AS dag_default_view, dag.schedule_interval AS dag_schedule_interval, 
dag.timetable_description AS dag_timetable_descriptio_1, dag.max_active_tasks 
AS dag_max_active_tasks, dag.max_active_runs AS dag_max_active_runs, 
dag.has_task_concurrency_limits AS dag_has_task_concurrency_2, 
dag.has_import_errors AS dag_has_import_errors, dag.next_dagrun AS 
dag_next_dagrun, dag.next_dagrun_data_interval_start AS 
dag_next_dagrun_data_int_3, dag.next_dagrun_data_interval_end AS 
dag_next_dagrun_data_int_4, dag.next_dagrun_create_after AS 
dag_next_dagrun_create_a_5, dag_tag_1.name AS dag_tag_1_name, dag_tag_1.dag_id 
AS dag_tag_1_dag_id, dag_schedule_dataset_ref_1.dataset_id AS 
dag_schedule_dataset_ref_6, dag_schedule_dataset_ref_1.dag_id AS 
dag_schedule_dataset_ref_7, dag_schedule_dataset_ref_1.created_at AS 
dag_schedule_dataset_ref_8, dag_schedule_dataset_ref_1.updated_at AS 
dag_schedule_dataset_ref_9, task_outlet_dataset_refe_2.dataset_id AS 
task_outlet_dataset_refe_a, task_outlet_dataset_refe_2.dag_id AS 
task_outlet_dataset_refe_b, task_outlet_dataset_refe_2.task_id AS 
task_outlet_dataset_refe_c, task_outlet_dataset_refe_2.created_at AS 
task_outlet_dataset_refe_d, task_outlet_dataset_refe_2.updated_at AS 
task_outlet_dataset_refe_e
   FROM dag LEFT OUTER JOIN dag_tag dag_tag_1 ON dag.dag_id = dag_tag_1.dag_id 
LEFT OUTER JOIN dag_schedule_dataset_reference dag_schedule_dataset_ref_1 ON 
dag.dag_id = dag_schedule_dataset_ref_1.dag_id LEFT OUTER JOIN 
task_outlet_dataset_reference task_outlet_dataset_refe_2 ON dag.dag_id = 
task_outlet_dataset_refe_2.dag_id
   WHERE dag.dag_id IN (:dag_id_1_1, :dag_id_1_2, :dag_id_1_3, :dag_id_1_4, 
:dag_id_1_5, :dag_id_1_6, :dag_id_1_7, :dag_id_1_8, :dag_id_1_9, :dag_id_1_10, 
:dag_id_1_11, :dag_id_1_12, :dag_id_1_13, :dag_id_1_14, :dag_id_1_15, 
:dag_id_1_16, :dag_id_1_17, :dag_id_1_18, :dag_id_1_19, :dag_id_1_20, 
:dag_id_1_21, :dag_id_1_22, :dag_id_1_23, :dag_id_1_24, :dag_id_1_25, 
:dag_id_1_26, :dag_id_1_27, :dag_id_1_28, :dag_id_1_29, :dag_id_1_30, 
:dag_id_1_31, :dag_id_1_32, :dag_id_1_33, :dag_id_1_34, :dag_id_1_35, 
:dag_id_1_36, :dag_id_1_37, :dag_id_1_38, :dag_id_1_39, :dag_id_1_40, 
:dag_id_1_41, :dag_id_1_42) FOR UPDATE OF ]
   [parameters: {'dag_id_1_1': 'latest_only', 'dag_id_1_2': 
'example_short_circuit_operator', 'dag_id_1_3': 
'example_branch_python_operator_decorator', 'dag_id_1_4': 
'example_weekday_branch_operator', 'dag_id_1_5': 
'example_short_circuit_decorator', 'dag_id_1_6': 
'example_external_task_marker_child', 'dag_id_1_7': 
'dataset_consumes_1_never_scheduled', 'dag_id_1_8': 
'example_branch_datetime_operator_3', 'dag_id_1_9': 
'example_trigger_controller_dag', 'dag_id_1_10': 
'example_subdag_operator.section-2', 'dag_id_1_11': 'example_bash_operator', 
'dag_id_1_12': 'dataset_consumes_1_and_2', 'dag_id_1_13': 'example_complex', 
'dag_id_1_14': 'dataset_produces_2', 'dag_id_1_15': 
'example_xcom_args_with_operators', 'dag_id_1_16': 'tutorial_taskflow_api', 
'dag_id_1_17': 'example_branch_datetime_operator_2', 'dag_id_1_18': 
'example_branch_datetime_operator', 'dag_id_1_19': 
'example_branch_dop_operator_v3', 'dag_id_1_20': 'latest_only_with_trigger', 
'dag_id_1_21': 'dataset_produces_1', 'dag_id_1_22':
  'example_subdag_operator.section-1', 'dag_id_1_23': 'example_skip_dag', 
'dag_id_1_24': 'tutorial_dag', 'dag_id_1_25': 'example_branch_operator', 
'dag_id_1_26': 'example_external_task_marker_parent', 'dag_id_1_27': 
'example_xcom_args', 'dag_id_1_28': 'example_trigger_target_dag', 
'dag_id_1_29': 'dataset_consumes_unknown_never_scheduled', 'dag_id_1_30': 
'dataset_consumes_1', 'dag_id_1_31': 'example_subdag_operator', 'dag_id_1_32': 
'example_nested_branch_dag', 'dag_id_1_33': 'example_dag_decorator', 
'dag_id_1_34': 'example_python_operator', 'dag_id_1_35': 'tutorial', 
'dag_id_1_36': 'example_sla_dag', 'dag_id_1_37': 
'example_task_group_decorator', 'dag_id_1_38': 
'example_passing_params_via_test_command', 'dag_id_1_39': 

[GitHub] [airflow] zachliu commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


zachliu commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293754845

   wait i'm confused, so it is NFS design choice not to remove the cache file 
after it's written to an actual file?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] zachliu commented on issue #23512: Random "duplicate key value violates unique constraint" errors when initializing the postgres database

2022-10-27 Thread GitBox


zachliu commented on issue #23512:
URL: https://github.com/apache/airflow/issues/23512#issuecomment-1293752936

   i checked out `2.4.2` and did
   ```bash
   wget -qO - https://github.com/apache/airflow/pull/27297.patch | git apply -v 
-3
   ```
   then built my own airflow
   ```bash
   breeze release-management prepare-airflow-package --package-format=wheel 
--verbose
   ```
   then installed it
   ```bash
   pip install apache_airflow-2.4.2-py3-none-any.whl[...] --constraint ...
   ```
   no more "duplicate key value violates unique constraint" errors
   :rocket: :rocket: :rocket: 
   :rocket: :rocket: 
   :rocket: 





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293750588

   Very much so. This is the choice of using NFS to store logs :)
   





[GitHub] [airflow] Taragolis commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


Taragolis commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293745606

   `.nfs*` files should be related to NFS not to Airflow.





[GitHub] [airflow] zachliu commented on pull request #27297: Fix IntegrityError during webserver startup

2022-10-27 Thread GitBox


zachliu commented on PR #27297:
URL: https://github.com/apache/airflow/pull/27297#issuecomment-1293744097

   just tested, this works!
   :+1: :+1: :+1: 
   :+1: :+1: 
   :+1: 





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27327: Fix typo

2022-10-27 Thread GitBox


boring-cyborg[bot] commented on PR #27327:
URL: https://github.com/apache/airflow/pull/27327#issuecomment-1293731189

   Congratulations on your first Pull Request and welcome to the Apache Airflow 
community! If you have any issues or are unsure about anything, please check 
our Contribution Guide 
(https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze 
environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for 
testing locally, it's a heavy docker but it ships with a working Airflow and a 
lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Please follow [ASF Code of 
Conduct](https://www.apache.org/foundation/policies/conduct) for all 
communication including (but not limited to) comments on Pull Requests, Mailing 
list and Slack.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   





[GitHub] [airflow] bmtKIA6 opened a new pull request, #27327: Fix typo

2022-10-27 Thread GitBox


bmtKIA6 opened a new pull request, #27327:
URL: https://github.com/apache/airflow/pull/27327

   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a 
newsfragment file, named `{pr_number}.significant.rst` or 
`{issue_number}.significant.rst`, in 
[newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   





[GitHub] [airflow] Taragolis commented on pull request #27322: Skip Integration tests on Public runners if not full tests needed

2022-10-27 Thread GitBox


Taragolis commented on PR #27322:
URL: https://github.com/apache/airflow/pull/27322#issuecomment-1293731595

   Fingers crossed





[GitHub] [airflow] jtommi opened a new issue, #27328: SFTPOperator throws object of type 'PlainXComArg' has no len() when using with Taskflow API

2022-10-27 Thread GitBox


jtommi opened a new issue, #27328:
URL: https://github.com/apache/airflow/issues/27328

   ### Apache Airflow Provider(s)
   
   sftp
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-sftp==4.1.0
   
   ### Apache Airflow version
   
   2.4.2 Python 3.10
   
   ### Operating System
   
   Debian 11 (Official docker image)
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Base image is apache/airflow:2.4.2-python3.10
   
   
   ### What happened
   
   When combining Taskflow API and SFTPOperator, it throws an exception that 
didn't happen with apache-airflow-providers-sftp 4.0.0
   
   ### What you think should happen instead
   
   The DAG should work as expected
   
   ### How to reproduce
   
   ```python
   import pendulum
   
   from airflow import DAG
   from airflow.decorators import task
   from airflow.providers.sftp.operators.sftp import SFTPOperator
   
   with DAG(
       "example_sftp",
       schedule="@once",
       start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
       catchup=False,
       tags=["example"],
   ) as dag:
   
       @task
       def get_file_path():
           return "test.csv"
   
       local_filepath = get_file_path()
   
       upload = SFTPOperator(
           task_id=f"upload_file_to_sftp",
           ssh_conn_id="sftp_connection",
           local_filepath=local_filepath,
           remote_filepath="test.csv",
       )
   ```
   
   ### Anything else
   
   ```logs
   [2022-10-27T15:21:38.106+] {logging_mixin.py:120} INFO - [2022-10-27T15:21:38.102+] {dagbag.py:342} ERROR - Failed to import: /opt/airflow/dags/test.py
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/dagbag.py", line 338, in parse
       loader.exec_module(new_module)
     File "", line 883, in exec_module
     File "", line 241, in _call_with_frames_removed
     File "/opt/airflow/dags/test.py", line 21, in 
       upload = SFTPOperator(
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 408, in apply_defaults
       result = func(self, **kwargs, default_args=default_args)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/sftp/operators/sftp.py", line 116, in __init__
       if len(self.local_filepath) != len(self.remote_filepath):
   TypeError: object of type 'PlainXComArg' has no len()
   ```
   
   It looks like the offending code was introduced in commit 
5f073e38dd46217b64dbc16d7b1055d89e8c3459
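
The traceback above shows a length check running in `__init__`, where a Taskflow return value is still an unresolved placeholder. A minimal sketch of the alternative (store arguments in `__init__`, validate in `execute`) might look like the following. `DeferredValue` and `UploadOperatorSketch` are hypothetical stand-ins written for this illustration, not Airflow classes, and the manual `_resolve` step only imitates the resolution that would normally happen before `execute` runs.

```python
class DeferredValue:
    """Hypothetical stand-in for a value only known at run time (it has no len())."""

    def __init__(self, producer):
        self._producer = producer

    def resolve(self):
        return self._producer()


def _as_list(value):
    # A single path is treated as a one-element list.
    return [value] if isinstance(value, str) else list(value)


class UploadOperatorSketch:
    """Hypothetical operator used for illustration; not the real SFTPOperator."""

    def __init__(self, local_filepath, remote_filepath):
        # Only store the arguments: they may still be unresolved placeholders,
        # so no len() or other value checks happen here.
        self.local_filepath = local_filepath
        self.remote_filepath = remote_filepath

    @staticmethod
    def _resolve(value):
        return value.resolve() if isinstance(value, DeferredValue) else value

    def execute(self):
        # By the time execute() runs, the real values are available.
        local = _as_list(self._resolve(self.local_filepath))
        remote = _as_list(self._resolve(self.remote_filepath))
        if len(local) != len(remote):
            raise ValueError(
                f"{len(local)} local paths but {len(remote)} remote paths"
            )
        return list(zip(local, remote))


# Constructing the operator with a deferred value no longer fails.
op = UploadOperatorSketch(DeferredValue(lambda: "test.csv"), "test.csv")
pairs = op.execute()
```

With this layout, a mismatch between the two path lists is still reported, only at run time rather than at DAG parse time.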
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   





[GitHub] [airflow] mobuchowski commented on a diff in pull request #27113: notification: add dag run state notification system

2022-10-27 Thread GitBox


mobuchowski commented on code in PR #27113:
URL: https://github.com/apache/airflow/pull/27113#discussion_r1007072665


##
airflow/config_templates/config.yml:
##
@@ -2169,6 +2169,13 @@
   type: string
   example: ~
   default: "15"
+- name: enable_dagrun_listener_notifications
+  description: |
+Enable emitting dagrun listener notifications in scheduler.
+  version_added: 2.5.0
+  type: boolean
+  example: ~
+  default: "False"

Review Comment:
   We probably can work with `if there's scheduler plugin registered, just do 
it`.






[GitHub] [airflow] Taragolis commented on pull request #27214: Refactor amazon providers tests which use `moto`

2022-10-27 Thread GitBox


Taragolis commented on PR #27214:
URL: https://github.com/apache/airflow/pull/27214#issuecomment-1293728819

   > It's interesting to see it happening in 3 jobs out of 4 as it was the case 
in your build.
   
   I'm just a "Lucky Guy"
   
   > I am chasing that one for a long time and I was never able to make a 
plausible hypothesis on why it happens and implements some workaround. But any 
ideas/inputs are more than welcome.
   
   I also tried to figure out first why it might happen, and whether the 
changes in this PR might increase the probability of this error.
   
   But I only found that `Trino` and `Kerberos` are the only ones which define 
a network configuration and a specific IP address:
   
   
https://github.com/apache/airflow/blob/12b8bc1d754ab8db1ca224cfe4ce6e34254b35d4/scripts/ci/docker-compose/integration-trino.yml#L23-L27
   
   





[GitHub] [airflow] zachliu commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


zachliu commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293721092

   the 2 `.nfs*` files at `~/logs/dag_processor_manager` do add up to my 
current cache memory usage :thinking: 
   
   
![2022-10-27_11-36](https://user-images.githubusercontent.com/14293802/198334683-656930fa-553b-4145-8874-f9fe4fe45d6a.png)





[GitHub] [airflow] ashb commented on a diff in pull request #27113: notification: add dag run state notification system

2022-10-27 Thread GitBox


ashb commented on code in PR #27113:
URL: https://github.com/apache/airflow/pull/27113#discussion_r1007040063


##
airflow/config_templates/config.yml:
##
@@ -2169,6 +2169,13 @@
   type: string
   example: ~
   default: "15"
+- name: enable_dagrun_listener_notifications
+  description: |
+Enable emitting dagrun listener notifications in scheduler.
+  version_added: 2.5.0
+  type: boolean
+  example: ~
+  default: "False"

Review Comment:
   Not sure we need a config setting for this really (Thinking: we already have 
so many config options, adding another one should be avoided unless we really 
need it)



##
airflow/listeners/listener.py:
##
@@ -47,6 +50,15 @@ def __init__(self):
 def has_listeners(self) -> bool:
 return len(self.pm.get_plugins()) > 0
 
+@property
+def has_scheduler_listeners(self) -> bool:
+for plugin in self.pm.get_plugins():
+if inspect.ismodule(plugin):

Review Comment:
   Can you explain what's going on here? Why do we need to check if it's a 
module? Pluggy supports adding classes too, I thought? (But mostly: why do we 
care?)



##
airflow/jobs/backfill_job.py:
##
@@ -18,6 +18,7 @@
 from __future__ import annotations
 
 import time
+from concurrent.futures import Executor, ThreadPoolExecutor

Review Comment:
   ```suggestion
   from concurrent.futures import Executor as FutureExecutor, ThreadPoolExecutor
   ```
   
   (and matching elsewhere: to avoid confusion with Airflow's own Executor 
class)



##
airflow/listeners/listener.py:
##
@@ -47,6 +50,15 @@ def __init__(self):
 def has_listeners(self) -> bool:
 return len(self.pm.get_plugins()) > 0
 
+@property

Review Comment:
   ```suggestion
   @cached_property
   ```
   
   Once we've computed this once per process it can't change again.
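
A small sketch of what the suggested decorator changes, assuming a simplified manager class: `ListenerManagerSketch`, its `scans` counter and the plugin objects are hypothetical names made up for this illustration, while `functools.cached_property` is the standard-library decorator. The property body runs on first access only and the result is then stored on the instance.

```python
from functools import cached_property


class ListenerManagerSketch:
    """Hypothetical, simplified manager; not Airflow's ListenerManager."""

    def __init__(self, plugins):
        self._plugins = plugins
        self.scans = 0  # counts how often the plugin list is actually scanned

    @cached_property
    def has_dagrun_listeners(self) -> bool:
        self.scans += 1
        hooks = ("on_dag_run_success", "on_dag_run_failure")
        return any(
            hasattr(plugin, hook) for plugin in self._plugins for hook in hooks
        )


class SuccessListener:
    def on_dag_run_success(self, dag_run, msg=""):
        pass


manager = ListenerManagerSketch([SuccessListener()])
first = manager.has_dagrun_listeners
second = manager.has_dagrun_listeners  # served from the cache, no second scan
```

The trade-off is that plugins registered after the first access would not be noticed, which matches the reasoning that the value cannot change once computed in a process.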



##
airflow/listeners/listener.py:
##
@@ -33,6 +34,8 @@
 
 _listener_manager = None
 
+_scheduler_hooks = ["on_dag_run_success", "on_dag_run_failure"]

Review Comment:
   Given these hooks are also called from backfill:
   
   ```suggestion
   _dagrun_hooks = ["on_dag_run_success", "on_dag_run_failure"]
   ```



##
airflow/jobs/scheduler_job.py:
##
@@ -1568,3 +1590,21 @@ def _cleanup_stale_dags(self, session: Session = 
NEW_SESSION) -> None:
 dag.is_active = False
 SerializedDagModel.remove_dag(dag_id=dag.dag_id, session=session)
 session.flush()
+
+def notify_dagrun_state_changed(self, dag_run: DagRun, msg: str = ""):
+if not self.enabled_dagrun_listener or not 
get_listener_manager().has_scheduler_listeners:
+return
+
+if dag_run.state == DagRunState.RUNNING:
+self._notification_threadpool.submit(  # type: ignore[union-attr]
+get_listener_manager().hook.on_dag_run_start, dag_run=dag_run, 
msg=msg
+)
+elif dag_run.state == DagRunState.SUCCESS:
+self._notification_threadpool.submit(  # type: ignore[union-attr]
+get_listener_manager().hook.on_dag_run_success, 
dag_run=dag_run, msg=msg
+)
+elif dag_run.state == DagRunState.FAILED:
+self._notification_threadpool.submit(  # type: ignore[union-attr]
+get_listener_manager().hook.on_dag_run_failure, 
dag_run=dag_run, msg=msg
+)

Review Comment:
   I'm not sure the threadpool belongs in here.
   
   If we make that the _plugin's_ responsibility then
   
   a) I don't think forcing a threadpool (of a fixed size, but that could be 
config driven) on users of this hook is required;
   b) We probably don't need `has_scheduler_listeners` anymore; 
   c) This PR becomes a lot smaller.
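
As a rough illustration of moving the threadpool into the plugin, the sketch below lets a listener own its executor while the caller makes a plain hook call. `AsyncDagRunListener`, `notify_dagrun_success` and the string used as a dag run are hypothetical and only show the shape of the idea; real listeners would be registered through the plugin manager rather than passed in a list.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncDagRunListener:
    """Hypothetical listener that decides for itself to do its work off-thread."""

    def __init__(self, max_workers=2):
        # The pool size is the plugin's own choice, not the scheduler's.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self.seen = []

    def _handle(self, dag_run, msg):
        # Placeholder for slow work such as sending an HTTP notification.
        self.seen.append((dag_run, msg))

    def on_dag_run_success(self, dag_run, msg=""):
        # Return quickly so the caller is not blocked by the slow work.
        return self._pool.submit(self._handle, dag_run, msg)

    def shutdown(self):
        self._pool.shutdown(wait=True)


def notify_dagrun_success(listeners, dag_run, msg=""):
    # The caller invokes the hook synchronously and knows nothing about threads.
    for listener in listeners:
        listener.on_dag_run_success(dag_run=dag_run, msg=msg)


listener = AsyncDagRunListener()
notify_dagrun_success([listener], "run_1", msg="done")
listener.shutdown()  # waits for the submitted work to finish
```

A listener that does only cheap work could skip the pool entirely, which is the flexibility the review comment is asking for.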






[GitHub] [airflow] ayushthe1 commented on issue #27200: Handle TODO: .first() is not None can be changed to .scalar()

2022-10-27 Thread GitBox


ayushthe1 commented on issue #27200:
URL: https://github.com/apache/airflow/issues/27200#issuecomment-1293687618

   hey @potiuk, i made a pr for this issue. Could you please review it?





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293680414

   You can always drop the whole cache to verify what causes it:
   
   https://linuxhint.com/clear_cache_linux/
   
   Also you can do some trial/error to see which files are in the cache as 
explained in this answer:
   
https://serverfault.com/questions/278454/is-it-possible-to-list-the-files-that-are-cached
   
   It seems it is not easy to get a list of the files which contribute to the 
cache, but if you have some guesses you might try to find out by using fntools.
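
For the trial-and-error approach, one option is to record the kernel's page cache size before and after deleting a suspect file and compare the two numbers. The sketch below reads the `Cached` field of `/proc/meminfo` on Linux; the helper names and the sample text are made up for this illustration, and the figure covers the whole system, so other activity on the machine will add noise.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style text into a dict of field name -> integer value."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def cached_kib(path="/proc/meminfo"):
    """Return the current page cache size in kB (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())["Cached"]


# Illustrative sample only; real values come from the running system.
sample = "MemTotal:       16384000 kB\nCached:          2048000 kB\n"
sample_cached = parse_meminfo(sample)["Cached"]
```

Calling `cached_kib()` before and after removing the rotated log files gives a quick check of whether those files account for the cached memory.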





[GitHub] [airflow] zachliu commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


zachliu commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293668371

   yeah, the root cause of this might be somewhere else. here are the facts:
   * setting `CONFIG_PROCESSOR_MANAGER_LOGGER=True` does make the cache increase
   * deleting files under `~/logs/dag_processor_manager` has no effect on cache 
memory usage, also there are cache files i cannot delete
   
![2022-10-27_11-02](https://user-images.githubusercontent.com/14293802/198326228-bb59c72d-fb38-4fdd-bcd7-8ec49582db86.png)
   





[GitHub] [airflow] potiuk commented on pull request #27148: Make custom env vars optional for job templates

2022-10-27 Thread GitBox


potiuk commented on PR #27148:
URL: https://github.com/apache/airflow/pull/27148#issuecomment-1293656226

   Needs conflict resolution 





[GitHub] [airflow] potiuk commented on issue #14261: Airflow Scheduler liveness probe crashing (version 2.0)

2022-10-27 Thread GitBox


potiuk commented on issue #14261:
URL: https://github.com/apache/airflow/issues/14261#issuecomment-1293655208

   If you can do some analysis - look at the hostname that you got there (maybe 
add echo) and see if it is still there in 2.4.* - that would be awesome 
@dschneiderch (and open a new issue if it is still there for you - including 
some more information - log and possibly content of your jobs table)





[GitHub] [airflow] potiuk merged pull request #27326: Fix failing coverage info test

2022-10-27 Thread GitBox


potiuk merged PR #27326:
URL: https://github.com/apache/airflow/pull/27326





[airflow] branch main updated: Fix failing coverage info test (#27326)

2022-10-27 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 12b8bc1d75 Fix failing coverage info test (#27326)
12b8bc1d75 is described below

commit 12b8bc1d754ab8db1ca224cfe4ce6e34254b35d4
Author: Jarek Potiuk 
AuthorDate: Thu Oct 27 16:49:28 2022 +0200

Fix failing coverage info test (#27326)

The #27304 was merged with failing tests (my bad) after fixing
head -> heads typo.

This PR fixes the source of tests files where the typo has been
also corrected.
---
 dev/breeze/tests/test_pr_info_files/pr_github_context.json| 2 +-
 dev/breeze/tests/test_pr_info_files/push_github_context.json  | 2 +-
 dev/breeze/tests/test_pr_info_files/schedule_github_context.json  | 2 +-
 dev/breeze/tests/test_pr_info_files/self_hosted_forced_pr.json| 2 +-
 dev/breeze/tests/test_pr_info_files/simple_pr.json| 2 +-
 dev/breeze/tests/test_pr_info_files/simple_pr_different_repo.json | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/dev/breeze/tests/test_pr_info_files/pr_github_context.json 
b/dev/breeze/tests/test_pr_info_files/pr_github_context.json
index dab869f396..af8e7a40f1 100644
--- a/dev/breeze/tests/test_pr_info_files/pr_github_context.json
+++ b/dev/breeze/tests/test_pr_info_files/pr_github_context.json
@@ -31,5 +31,5 @@
 }
 },
 "ref_name": "main",
-"ref": "refs/head/main"
+"ref": "refs/heads/main"
 }
diff --git a/dev/breeze/tests/test_pr_info_files/push_github_context.json 
b/dev/breeze/tests/test_pr_info_files/push_github_context.json
index d02cf8ac00..4e04c8d3df 100644
--- a/dev/breeze/tests/test_pr_info_files/push_github_context.json
+++ b/dev/breeze/tests/test_pr_info_files/push_github_context.json
@@ -7,5 +7,5 @@
 }
 },
 "ref_name": "main",
-"ref": "refs/head/main"
+"ref": "refs/heads/main"
 }
diff --git a/dev/breeze/tests/test_pr_info_files/schedule_github_context.json 
b/dev/breeze/tests/test_pr_info_files/schedule_github_context.json
index a66a03dfb2..9f7fc57392 100644
--- a/dev/breeze/tests/test_pr_info_files/schedule_github_context.json
+++ b/dev/breeze/tests/test_pr_info_files/schedule_github_context.json
@@ -5,5 +5,5 @@
 "schedule": "28 0 * * *"
 },
 "ref_name": "main",
-"ref": "refs/head/main"
+"ref": "refs/heads/main"
 }
diff --git a/dev/breeze/tests/test_pr_info_files/self_hosted_forced_pr.json 
b/dev/breeze/tests/test_pr_info_files/self_hosted_forced_pr.json
index f118681372..153146f769 100644
--- a/dev/breeze/tests/test_pr_info_files/self_hosted_forced_pr.json
+++ b/dev/breeze/tests/test_pr_info_files/self_hosted_forced_pr.json
@@ -25,5 +25,5 @@
 }
 },
 "ref_name": "main",
-"ref": "refs/head/main"
+"ref": "refs/heads/main"
 }
diff --git a/dev/breeze/tests/test_pr_info_files/simple_pr.json 
b/dev/breeze/tests/test_pr_info_files/simple_pr.json
index da0fb12bb7..c7a34fbc69 100644
--- a/dev/breeze/tests/test_pr_info_files/simple_pr.json
+++ b/dev/breeze/tests/test_pr_info_files/simple_pr.json
@@ -22,5 +22,5 @@
 }
 },
 "ref_name": "main",
-"ref": "refs/head/main"
+"ref": "refs/heads/main"
 }
diff --git a/dev/breeze/tests/test_pr_info_files/simple_pr_different_repo.json 
b/dev/breeze/tests/test_pr_info_files/simple_pr_different_repo.json
index 8ce2f521ec..2e78021748 100644
--- a/dev/breeze/tests/test_pr_info_files/simple_pr_different_repo.json
+++ b/dev/breeze/tests/test_pr_info_files/simple_pr_different_repo.json
@@ -22,5 +22,5 @@
 }
 },
 "ref_name": "main",
-"ref": "refs/head/main"
+"ref": "refs/heads/main"
 }



[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293640759

   Would be great contribution back :) ? 





[GitHub] [airflow] potiuk commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


potiuk commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293640011

   Maybe the rotating file handler has another place where it copies files and 
leaves them behind. Not the end of the world (as you know, this is 
no-harm-at-all and perfectly normal to happen).
   
   Maybe I will take a look soon (or maybe you can, @zachliu - you could see 
how I've done that and you could potentially iterate on it, verify it in 
your test system and make a PR after you test it?). How about that?
   
   Also there are ways you can check if this might be the cause. Just delete 
the rotated files and see if that causes a drop in cache memory used.





[GitHub] [airflow] ei-grad commented on a diff in pull request #23720: Fix backfill queued task getting reset to scheduled state.

2022-10-27 Thread GitBox


ei-grad commented on code in PR #23720:
URL: https://github.com/apache/airflow/pull/23720#discussion_r1006973733


##
airflow/executors/kubernetes_executor.py:
##
@@ -464,7 +464,9 @@ def clear_not_launched_queued_tasks(self, session=None) -> 
None:
 if not self.kube_client:
 raise AirflowException(NOT_STARTED_MESSAGE)
 
-query = session.query(TaskInstance).filter(TaskInstance.state == 
State.QUEUED)
+query = session.query(TaskInstance).filter(
+TaskInstance.state == State.QUEUED, TaskInstance.queued_by_job_id 
== self.job_id
+)

Review Comment:
   Would it be possible to have more than one backfill running then?






[GitHub] [airflow] potiuk closed issue #11085: Airflow Elasticsearch configuration log output does not contain required elements

2022-10-27 Thread GitBox


potiuk closed issue #11085: Airflow Elasticsearch configuration log output does 
not contain required elements
URL: https://github.com/apache/airflow/issues/11085





[GitHub] [airflow] potiuk commented on issue #11085: Airflow Elasticsearch configuration log output does not contain required elements

2022-10-27 Thread GitBox


potiuk commented on issue #11085:
URL: https://github.com/apache/airflow/issues/11085#issuecomment-1293633447

   Working fine in Airflow 2.4.2.





[GitHub] [airflow] potiuk commented on issue #11085: Airflow Elasticsearch configuration log output does not contain required elements

2022-10-27 Thread GitBox


potiuk commented on issue #11085:
URL: https://github.com/apache/airflow/issues/11085#issuecomment-1293632868

   Told ya :) 





[GitHub] [airflow] potiuk closed issue #26566: Have SLA docs reflect reality

2022-10-27 Thread GitBox


potiuk closed issue #26566: Have SLA docs reflect reality
URL: https://github.com/apache/airflow/issues/26566





[GitHub] [airflow] potiuk merged pull request #27111: Update SLA wording to reflect it is relative to Dag Run start

2022-10-27 Thread GitBox


potiuk merged PR #27111:
URL: https://github.com/apache/airflow/pull/27111





[airflow] branch main updated: Update SLA wording to reflect it is relative to Dag Run start. (#27111)

2022-10-27 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 639210a7e0 Update SLA wording to reflect it is relative to Dag Run 
start. (#27111)
639210a7e0 is described below

commit 639210a7e0bfc3f04f28c7d7278292d2cae7234b
Author: Damian Shaw <111310636+notatallshaw-...@users.noreply.github.com>
AuthorDate: Thu Oct 27 10:34:57 2022 -0400

Update SLA wording to reflect it is relative to Dag Run start. (#27111)
---
 docs/apache-airflow/concepts/tasks.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/apache-airflow/concepts/tasks.rst 
b/docs/apache-airflow/concepts/tasks.rst
index 63fbe818e0..c3f9d1de3b 100644
--- a/docs/apache-airflow/concepts/tasks.rst
+++ b/docs/apache-airflow/concepts/tasks.rst
@@ -158,7 +158,7 @@ If you merely want to be notified if a task runs over but 
still let it run to co
 SLAs
 
 
-An SLA, or a Service Level Agreement, is an expectation for the maximum time a Task should take. If a task takes longer than this to run, it is then visible in the "SLA Misses" part of the user interface, as well as going out in an email of all tasks that missed their SLA.
+An SLA, or a Service Level Agreement, is an expectation for the maximum time a Task should be completed relative to the Dag Run start time. If a task takes longer than this to run, it is then visible in the "SLA Misses" part of the user interface, as well as going out in an email of all tasks that missed their SLA.
 
 Tasks over their SLA are not cancelled, though - they are allowed to run to completion. If you want to cancel a task after a certain runtime is reached, you want :ref:`concepts:timeouts` instead.
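
The practical effect of the reworded sentence can be sketched with plain date arithmetic. This is only an illustration under stated assumptions: `missed_sla` and the timings are hypothetical and are not Airflow's implementation. The one thing taken from the commit is that the SLA window is measured from the Dag Run start rather than from the moment the task itself starts.

```python
from datetime import datetime, timedelta


def missed_sla(dag_run_start, task_end, sla):
    """Hypothetical helper: True if the task finished after dag_run_start + sla."""
    return task_end > dag_run_start + sla


dag_run_start = datetime(2022, 10, 27, 10, 0)
sla = timedelta(hours=1)

# The task waits 50 minutes for upstream work, then runs for only 20 minutes.
task_start = dag_run_start + timedelta(minutes=50)
task_end = task_start + timedelta(minutes=20)

# Measured against its own runtime, the task is well within one hour...
ran_longer_than_sla = (task_end - task_start) > sla

# ...but measured from the Dag Run start, it finishes after the SLA window.
is_missed = missed_sla(dag_run_start, task_end, sla)
```

Under the old wording a reader might expect no SLA miss here, since the task ran for 20 minutes; under the new wording the miss follows from the task finishing 70 minutes after the Dag Run started.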
 



[GitHub] [airflow] zachliu commented on issue #27065: Log files are still being cached causing ever-growing memory usage when scheduler is running

2022-10-27 Thread GitBox


zachliu commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293620996

   @potiuk the cache memory is still growing :crying_cat_face: 
   
   
![2022-10-27_10-32](https://user-images.githubusercontent.com/14293802/198315723-c48d12b0-314d-4459-b485-dce1e169940a.png)
   





[GitHub] [airflow] potiuk commented on pull request #27326: Fix failing coverage info test

2022-10-27 Thread GitBox


potiuk commented on PR #27326:
URL: https://github.com/apache/airflow/pull/27326#issuecomment-1293618599

   Tests are great - need an approval to unbreak main :pray:  (few lines only)
   





[airflow] branch main updated (5e6cec849a -> 671029bebc)

2022-10-27 Thread potiuk

potiuk pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


from 5e6cec849a Update google_analytics.html (#27226)
 add 671029bebc Refactor amazon providers tests which use `moto` (#27214)

No new revisions were added by this update.

Summary of changes:
 setup.py   |  2 +-
 tests/providers/amazon/aws/hooks/conftest.py   | 34 
 tests/providers/amazon/aws/hooks/test_base_aws.py  | 16 +---
 .../amazon/aws/hooks/test_cloud_formation.py   | 13 +--
 tests/providers/amazon/aws/hooks/test_ec2.py   | 96 --
 tests/providers/amazon/aws/hooks/test_ecs.py   |  6 --
 tests/providers/amazon/aws/hooks/test_eks.py   | 15 +---
 tests/providers/amazon/aws/hooks/test_emr.py   | 13 ++-
 tests/providers/amazon/aws/hooks/test_glue.py  | 11 +--
 .../amazon/aws/hooks/test_glue_catalog.py  | 28 +--
 tests/providers/amazon/aws/hooks/test_kinesis.py   | 11 +--
 .../amazon/aws/hooks/test_lambda_function.py   | 10 +--
 tests/providers/amazon/aws/hooks/test_logs.py  | 21 ++---
 .../amazon/aws/hooks/test_redshift_cluster.py  | 16 +---
 tests/providers/amazon/aws/hooks/test_s3.py| 18 ++--
 .../amazon/aws/hooks/test_secrets_manager.py   | 19 +
 tests/providers/amazon/aws/hooks/test_sns.py   | 28 ++-
 tests/providers/amazon/aws/hooks/test_sqs.py   | 10 +--
 .../amazon/aws/hooks/test_step_function.py | 14 +---
 .../amazon/aws/log/test_cloudwatch_task_handler.py | 12 +--
 .../amazon/aws/log/test_s3_task_handler.py | 14 +---
 tests/providers/amazon/aws/operators/test_ec2.py   | 34 
 tests/providers/amazon/aws/operators/test_ecs.py   | 54 +---
 tests/providers/amazon/aws/operators/test_rds.py   | 18 +---
 .../amazon/aws/sensors/test_cloud_formation.py | 21 ++---
 tests/providers/amazon/aws/sensors/test_ec2.py | 36 
 .../aws/sensors/test_glue_catalog_partition.py | 11 +--
 tests/providers/amazon/aws/sensors/test_rds.py | 10 +--
 .../amazon/aws/sensors/test_redshift_cluster.py| 13 +--
 .../amazon/aws/system/utils/test_helpers.py|  7 +-
 .../amazon/aws/transfers/test_gcs_to_s3.py | 20 +
 .../amazon/aws/utils/eks_test_constants.py |  1 -
 tests/providers/amazon/conftest.py | 61 ++
 33 files changed, 246 insertions(+), 447 deletions(-)
 delete mode 100644 tests/providers/amazon/aws/hooks/conftest.py
 create mode 100644 tests/providers/amazon/conftest.py



[GitHub] [airflow] potiuk merged pull request #27214: Refactor amazon providers tests which use `moto`

2022-10-27 Thread GitBox


potiuk merged PR #27214:
URL: https://github.com/apache/airflow/pull/27214





[GitHub] [airflow] potiuk commented on pull request #27214: Refactor amazon providers tests which use `moto`

2022-10-27 Thread GitBox


potiuk commented on PR #27214:
URL: https://github.com/apache/airflow/pull/27214#issuecomment-1293616490

   Yeah - re-running the jobs fixed it :( . Now I re-run #27322 on public 
runners to see the exclusion working. 





[GitHub] [airflow] BobDu commented on a diff in pull request #27316: [docs] best-practices add use variable with template example.

2022-10-27 Thread GitBox


BobDu commented on code in PR #27316:
URL: https://github.com/apache/airflow/pull/27316#discussion_r1006950778


##
docs/apache-airflow/best-practices.rst:
##
@@ -213,6 +213,30 @@ or if you need to deserialize a json object from the 
variable :
 
 {{ var.json. }}
 
+Ensure use variable with template in operator, not get it in top level code.
+
+Bad example:
+
+.. code-block:: python
+
+from airflow.models import Variable
+
+foo_var = Variable.get("foo")
+bash_use_variable_bad = BashOperator(
+task_id="bash_use_variable_bad", bash_command="echo variable 
foo=${foo_env}", env={"foo_env": foo_var}
+)
+
+Good example:
+
+.. code-block:: python
+
+bash_use_variable_good = BashOperator(
+task_id="bash_use_variable_good",
+bash_command="echo variable foo=${foo_env}",
+env={"foo_env": "{{ var.value.get('foo') }}"},
+)
+
+

Review Comment:
   `bash_command="echo variable foo=${Variable.get('foo')}"` is not valid 
syntax.
   Did you mean `bash_command=f"echo variable 
foo={Variable.get('foo')}"`?
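
   The difference between an f-string evaluated in top-level code and a 
template string rendered at run time can be sketched with a toy model. 
Everything here is an assumption made for illustration: `get_variable`, 
`render` and `PLACEHOLDER` are hypothetical stand-ins, not Airflow's 
`Variable.get` or its Jinja templating. The sketch only counts how often the 
lookup runs.

```python
# Toy model only: these helpers are hypothetical stand-ins, not Airflow APIs.
lookups = []  # records every simulated metadata-database query

PLACEHOLDER = "{{ var.value.get('foo') }}"


def get_variable(key):
    """Simulated variable lookup that records each call."""
    lookups.append(key)
    return {"foo": "bar"}[key]


def parse_eager():
    # f-string: the lookup runs every time the DAG file is parsed.
    return f"echo variable foo={get_variable('foo')}"


def parse_templated():
    # Plain string: parsing the file performs no lookup at all.
    return "echo variable foo=" + PLACEHOLDER


def render(command):
    # Stands in for template rendering just before the task executes.
    if PLACEHOLDER in command:
        return command.replace(PLACEHOLDER, get_variable("foo"))
    return command


# DAG files are parsed repeatedly; simulate three parses of each style.
eager_commands = [parse_eager() for _ in range(3)]
eager_lookups = len(lookups)

lookups.clear()
templated_commands = [parse_templated() for _ in range(3)]
parse_time_lookups = len(lookups)

rendered = render(templated_commands[-1])
run_time_lookups = len(lookups)
```

   Both styles end up with the same command string, but in this model the 
f-string pays one lookup per parse, while the templated string pays a single 
lookup when the task is rendered.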
   






[GitHub] [airflow] potiuk opened a new pull request, #27326: Fix failing coverage info test

2022-10-27 Thread GitBox


potiuk opened a new pull request, #27326:
URL: https://github.com/apache/airflow/pull/27326

   #27304 was merged with failing tests (my bad) after fixing the head -> heads 
typo.
   
   This PR fixes the test source files, where the typo has now also been 
corrected.
   
   
   
   





[GitHub] [airflow] caupetit-itf commented on issue #11085: Airflow Elasticsearch configuration log output does not contain required elements

2022-10-27 Thread GitBox


caupetit-itf commented on issue #11085:
URL: https://github.com/apache/airflow/issues/11085#issuecomment-1293604452

   I've just tested with the Docker image 2.4.2-python3.10.
   No more endless loop, and I can see my end_of_log with a log_id in 
Elasticsearch!
   
   So for me the problem is resolved :) I will update to the latest version 
first thing next time :)





[GitHub] [airflow] potiuk commented on pull request #27304: Fix coverage upload step

2022-10-27 Thread GitBox


potiuk commented on PR #27304:
URL: https://github.com/apache/airflow/pull/27304#issuecomment-1293598403

   Ah, I did not see the test failing. My bad. Fixing it.





[GitHub] [airflow] potiuk closed issue #27324: Add `held` to possible TaskInstanceState

2022-10-27 Thread GitBox


potiuk closed issue #27324: Add `held` to possible TaskInstanceState
URL: https://github.com/apache/airflow/issues/27324





[GitHub] [airflow] cdabella opened a new issue, #27324: Add `held` to possible TaskInstanceState

2022-10-27 Thread GitBox


cdabella opened a new issue, #27324:
URL: https://github.com/apache/airflow/issues/27324

   ### Description
   
   Add `held` as a TaskInstanceState which functions similarly to `failed` but 
represents a stop in DAG execution that is known and planned.
   
   ### Use case/motivation
   
   Many DAGs and pipelines have steps that require human intervention and 
sign-off before continuing, like manual data validation or manager approval 
before continuation. This can be functionally achieved today by marking a task 
as `failed`, but operationally overloading the meaning of `failed` can cause 
issues with Ops/monitoring/alerting that may not have the complete picture to 
know whether a task has truly failed or has been marked `failed` by design. 
Adding an additional TaskInstanceState which represents putting a task on-hold 
improves clarity in DAG design while achieving the functional goal of stopping 
a DAGRun from continuing.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #27323: Handle the todo part and replaced .first() is not None to .scalar()

2022-10-27 Thread GitBox


boring-cyborg[bot] commented on PR #27323:
URL: https://github.com/apache/airflow/pull/27323#issuecomment-1293542841

   Congratulations on your first Pull Request and welcome to the Apache Airflow 
community! If you have any issues or are unsure about anything, please check 
our Contribution Guide 
(https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type 
annotations). Our [pre-commits]( 
https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks)
 will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in 
`docs/` directory). Adding a new operator? Check this short 
[guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst)
 Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze 
environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for 
testing locally, it's a heavy docker but it ships with a working Airflow and a 
lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get 
the final approval from Committers.
   - Please follow [ASF Code of 
Conduct](https://www.apache.org/foundation/policies/conduct) for all 
communication including (but not limited to) comments on Pull Requests, Mailing 
list and Slack.
   - Be sure to read the [Airflow Coding style]( 
https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it 
better.
   In case of doubts contact the developers at:
   Mailing List: d...@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   





[GitHub] [airflow] ayushthe1 opened a new pull request, #27323: Handle the todo part and replaced .first() is not None to .scalar()

2022-10-27 Thread GitBox


ayushthe1 opened a new pull request, #27323:
URL: https://github.com/apache/airflow/pull/27323

   
   closes #27200 : Changed `.first() is not None` to `.scalar()` in the todo 
section of 
[file](https://github.com/apache/airflow/blob/d67ac5932dabbf06ae733fc57b48491a8029b8c2/airflow/models/serialized_dag.py#L156-L158)
   
   ---
   





[GitHub] [airflow] ejstembler commented on issue #27300: Scheduler encounters database update error, then gets stuck in endless loop, yet still shows as healthy

2022-10-27 Thread GitBox


ejstembler commented on issue #27300:
URL: https://github.com/apache/airflow/issues/27300#issuecomment-1293542042

   ```
   [2022-10-24 22:11:55,940] {scheduler_job.py:768} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/usr/local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 751, in _execute
       self._run_scheduler_loop()
     File "/usr/local/lib/python3.9/site-packages/astronomer/airflow/version_check/plugin.py", line 29, in run_before
       fn(*args, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 839, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File "/usr/local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 911, in _do_scheduling
       self._create_dagruns_for_dags(guard, session)
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/retries.py", line 76, in wrapped_function
       for attempt in run_with_db_retries(max_retries=retries, logger=logger, **retry_kwargs):
     File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __iter__
       do = self.iter(retry_state=retry_state)
     File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 349, in iter
       return fut.result()
     File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
       return self.__get_result()
     File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
       raise self._exception
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/retries.py", line 85, in wrapped_function
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 979, in _create_dagruns_for_dags
       self._create_dag_runs(query.all(), session)
     File "/usr/local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 1029, in _create_dag_runs
       dag.create_dagrun(
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/session.py", line 68, in wrapper
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/models/dag.py", line 2384, in create_dagrun
       session.flush()
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3255, in flush
       self._flush(objects)
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3395, in _flush
       transaction.rollback(_capture_exception=True)
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
       compat.raise_(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
       raise exception
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3355, in _flush
       flush_context.execute()
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 453, in execute
       rec.execute(self)
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 627, in execute
       util.preloaded.orm_persistence.save_obj(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 234, in save_obj
       _emit_update_statements(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 1032, in _emit_update_statements
       raise orm_exc.StaleDataError(
   sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'dag' expected to update 1 row(s); 0 were matched.
   ```





[GitHub] [airflow] notatallshaw-gts commented on pull request #27111: Update SLA wording to reflect it is relative to Dag Run start

2022-10-27 Thread GitBox


notatallshaw-gts commented on PR #27111:
URL: https://github.com/apache/airflow/pull/27111#issuecomment-1293526817

   @potiuk Whenever you get a chance, any objections?





[airflow] branch main updated: Update google_analytics.html (#27226)

2022-10-27 Thread potiuk

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
 new 5e6cec849a Update google_analytics.html (#27226)
5e6cec849a is described below

commit 5e6cec849a5fa90967df1447aba9521f1cfff3d0
Author: oleg-ruban <54796035+oleg-ru...@users.noreply.github.com>
AuthorDate: Thu Oct 27 16:25:47 2022 +0300

Update google_analytics.html (#27226)

fix bug #27225 - Tracking User Activity Issue: Google Analytics tag version 
is not up-to-date
https://github.com/apache/airflow/issues/27225
---
 airflow/www/templates/analytics/google_analytics.html | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/airflow/www/templates/analytics/google_analytics.html 
b/airflow/www/templates/analytics/google_analytics.html
index ab661a05b6..379f32f930 100644
--- a/airflow/www/templates/analytics/google_analytics.html
+++ b/airflow/www/templates/analytics/google_analytics.html
@@ -17,12 +17,12 @@
  under the License.
 #}
 
+
+https://www.googletagmanager.com/gtag/js?id={{ analytics_id 
}}">
 
-  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
-  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
-  
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
-  
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
+  window.dataLayer = window.dataLayer || [];
+  function gtag(){dataLayer.push(arguments);}
+  gtag('js', new Date());
 
-  ga('create', '{{ analytics_id }}', 'auto');
-  ga('send', 'pageview');
+  gtag('config', '{{ analytics_id }}');
 


