[jira] [Commented] (AIRFLOW-3905) Allow using parameters for sql statement in SqlSensor

2022-06-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552690#comment-17552690
 ] 

ASF GitHub Bot commented on AIRFLOW-3905:
-

malthe commented on PR #4723:
URL: https://github.com/apache/airflow/pull/4723#issuecomment-1152249671

   Shouldn't the `parameters` be templated?




> Allow using parameters for sql statement in SqlSensor
> -
>
> Key: AIRFLOW-3905
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3905
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: 1.10.2
>Reporter: Xiaodong Deng
>Assignee: Xiaodong Deng
>Priority: Minor
> Fix For: 1.10.3
>
>
> In most SQL-related operators/sensors, the `parameters` argument is available to 
> help render SQL commands conveniently. But this is not yet available in 
> SqlSensor.
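The requested behaviour can be sketched with a plain `sqlite3` parameterized query; the table, SQL, and `poke` helper below are hypothetical stand-ins rather than Airflow code, but they show what passing `parameters` separately from the SQL string buys you.

```python
import sqlite3

def poke(sql, parameters):
    """Return True when the query's first cell is truthy, sensor-style."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE runs (state TEXT)")
    conn.execute("INSERT INTO runs VALUES ('success')")
    row = conn.execute(sql, parameters).fetchone()  # driver binds the values
    conn.close()
    return bool(row and row[0])

# The SQL stays static; only the bound value changes between checks.
found = poke("SELECT COUNT(*) FROM runs WHERE state = ?", ("success",))
missing = poke("SELECT COUNT(*) FROM runs WHERE state = ?", ("failed",))
```

Making `parameters` a template field, as the comment above suggests, would additionally let the bound values be rendered from the task context.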



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-517) Metric for webserver start time

2022-06-04 Thread Teja Nanduri (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17547922#comment-17547922
 ] 

Teja Nanduri commented on AIRFLOW-517:
--

Hi [~aoen], I want to try to contribute to Airflow. Is this task still 
available to pick up?

> Metric for webserver start time
> ---
>
> Key: AIRFLOW-517
> URL: https://issues.apache.org/jira/browse/AIRFLOW-517
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Reporter: Dan Davydov
>Priority: Minor
>  Labels: beginner, starter
>
> It would be good to have a metric for webserver start time to e.g. see if 
> there are regressions due to a bad dag or some other infrastructure issue.
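A minimal sketch of such a metric, assuming a StatsD-style backend; the metric name and the `start_webserver` stand-in below are hypothetical, not existing Airflow identifiers.

```python
import time

def statsd_timing_line(metric, millis):
    """Format a StatsD timing datagram: '<metric>:<value>|ms'."""
    return f"{metric}:{int(millis)}|ms"

def start_webserver():
    """Placeholder for the real webserver startup work."""
    pass

# Time the startup and emit one timing sample for it.
t0 = time.monotonic()
start_webserver()
elapsed_ms = (time.monotonic() - t0) * 1000
line = statsd_timing_line("webserver.start_duration", elapsed_ms)
```

A regression caused by a bad dag or slow infrastructure would then show up as a jump in this timing series.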



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-05-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17540543#comment-17540543
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

potiuk commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1133832625

   > > @turbaszek Let me make a PR later~ We are doing pressure tests these 
days and this problem had appeared often.
   > 
   > Hey turbaszek, any chance to have the PR submitted? We are experiencing this 
in 2.3.0 as well.
   
   I think you wanted to call @ghostbody who wanted to submit the fix, @vanducng.




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17540527#comment-17540527
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

vanducng commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1133817633

   > @turbaszek Let me make a PR later~ We are doing pressure tests these days 
and this problem had appeared often.
   
Hey turbaszek, any chance to have the PR submitted? We are experiencing this in 
2.3.0 as well.




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-6500) dependency 'Execution Date' FAILED: Execution date is in the future

2022-05-09 Thread Lewis John McGibbney (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533948#comment-17533948
 ] 

Lewis John McGibbney commented on AIRFLOW-6500:
---

Hi [~mvshpavan] we also encountered this. I also saw the following ticket 
https://github.com/apache/airflow/issues/17481
I am using v2.2.4 via Helm on K8s 
```
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:38:33Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.5-dirty", GitCommit:"753b2dbc622f5cc417845f0ff8a77f539a4213ea", GitTreeState:"dirty", BuildDate:"2018-12-03T17:48:52Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.22) and server (1.11) exceeds the supported minor version skew of +/-1
```

> dependency 'Execution Date' FAILED: Execution date is in the future
> ---
>
> Key: AIRFLOW-6500
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6500
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.6
>Reporter: Pavan Maguluri
>Priority: Major
>
> I have the below Dag with "schedule_interval" set to None and "start_date" 
> set to airflow.utils.dates.days_ago(1). 
> Task1 --> Task2 --> End
> When I triggered the Dag from the Web-UI it failed with the reason below. 
> Other Dags which have "schedule_interval" set to some value don't have this 
> issue.
> *[2020-01-07 07:06:39,962] \{taskinstance.py:624} INFO - Dependencies not met 
> for , 
> dependency 'Execution Date' FAILED: Execution date 
> 2020-01-07T07:06:58.550778+00:00 is in the future (the current date is 
> 2020-01-07T07:06:39.962619+00:00).*
>  
> Default Pipeline Arguments: 
> {code:python}
> PIPELINE_DEFAULT_ARGS = {
>  'owner': 'owner',
>  'start_date': airflow.utils.dates.days_ago(1),
>  'depends_on_past': True,
>  'email_on_failure': False,
>  'email_on_retry': False,
> }{code}
> All the solutions that I have found online are for DAGs with schedule_interval 
> set to some value and a dynamic start_date (like datetime.now()). I have 
> neither of those settings, but I still see the problem. 
> I would appreciate it if you could shed some light on this issue.
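The failing check can be reproduced with plain `datetime` arithmetic using the timestamps from the log message above; the helper below only illustrates what the 'Execution Date' dependency verifies, it is not Airflow's implementation.

```python
from datetime import datetime, timedelta, timezone

def execution_date_dep_passed(execution_date, now):
    """The 'Execution Date' dep fails when execution_date is in the future."""
    return execution_date <= now

# Timestamps copied from the log message above.
now = datetime(2020, 1, 7, 7, 6, 39, 962619, tzinfo=timezone.utc)
execution_date = datetime(2020, 1, 7, 7, 6, 58, 550778, tzinfo=timezone.utc)
dep_passed = execution_date_dep_passed(execution_date, now)  # the date is ~19s ahead
```

One commonly reported trigger for this gap is clock or logic skew between the component that stamps the run's execution date and the one that evaluates the dependency.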



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-778) Metastore Partition Sensor Broken

2022-04-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528879#comment-17528879
 ] 

ASF GitHub Bot commented on AIRFLOW-778:


deprosun commented on PR #2005:
URL: https://github.com/apache/airflow/pull/2005#issuecomment-209271

   Has to be 
   ```python
   MetastorePartitionSensor(
       dag=dag,
       task_id="t",
       schema="some_schema",
       table="some_table",
       partition_name=(
           "visit_date=some_date"  # without quotes
       ),
       sql="",  # have to provide an empty string due to inheritance issue
       # conn_id and mysql_conn_id have to be duplicated
       conn_id="hive_conn",
       mysql_conn_id="hive_conn",
       soft_fail=True,
       mode="reschedule",
       poke_interval=60,
       timeout=300,
   )
   ```




> Metastore Partition Sensor Broken
> -
>
> Key: AIRFLOW-778
> URL: https://issues.apache.org/jira/browse/AIRFLOW-778
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Dan Davydov
>Assignee: Dan Davydov
>Priority: Blocker
>
> MetastorePartitionSensor always throws an exception on initialization due to 
> 72cc8b3006576153aa30d27643807b4ae5dfb593 .
> Looks like the tests for this are only run if an explicit flag is set which 
> is how this got past CI.
> cc [~xuanji]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-778) Metastore Partition Sensor Broken

2022-04-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527795#comment-17527795
 ] 

ASF GitHub Bot commented on AIRFLOW-778:


deprosun commented on PR #2005:
URL: https://github.com/apache/airflow/pull/2005#issuecomment-1109120353

   I got this error when I tried to use the `MetastorePartitionSensor`:
   ```
   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/sensors/sql.py", line 72, in _get_hook
     conn = BaseHook.get_connection(self.conn_id)
   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/hooks/base.py", line 68, in get_connection
     conn = Connection.get_connection_from_secrets(conn_id)
   File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/connection.py", line 410, in get_connection_from_secrets
     raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
   airflow.exceptions.AirflowNotFoundException: The conn_id `` isn't defined
   ```
   
   What is the right way of using this Sensor? I am not sure why I am getting 
the above exception. This is how I am using it:
   
   ```python
   MetastorePartitionSensor(
       dag=dag,
       task_id="t",
       schema="some_schema",
       table="some_table",
       partition_name=(
           "visit_date='some_date'"
       ),
       mysql_conn_id="hive_prod_conn",
       soft_fail=True,
       mode="reschedule",
       poke_interval=60,
       timeout=300,
   )
   ```




> Metastore Partition Sensor Broken
> -----
>
> Key: AIRFLOW-778
> URL: https://issues.apache.org/jira/browse/AIRFLOW-778
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Dan Davydov
>Assignee: Dan Davydov
>Priority: Blocker
>
> MetastorePartitionSensor always throws an exception on initialization due to 
> 72cc8b3006576153aa30d27643807b4ae5dfb593 .
> Looks like the tests for this are only run if an explicit flag is set which 
> is how this got past CI.
> cc [~xuanji]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-3739) Make start_date optional

2022-04-20 Thread Ashutosh Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525304#comment-17525304
 ] 

Ashutosh Gupta commented on AIRFLOW-3739:
-

[~abitrolly] - Wanted to check if anyone is working on it. I can pick it up.

Thx,

Ashutosh

> Make start_date optional
> 
>
> Key: AIRFLOW-3739
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3739
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 1.10.3
>Reporter: Anatoli Babenia
>Priority: Major
>
> I want to define a DAG, but not schedule it for running.
> ```
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> dag = DAG('115', schedule_interval="@daily")
> seed = BashOperator(
>     task_id='get_seed',
>     bash_command='date'
> )
> dag >> seed
> ```
> This fails with the error below.
> ```
> airflow.exceptions.AirflowException: Task is missing the start_date parameter
> zsh returned exit code 1
> ```
> Is it possible to make `start_date` optional? If not, why not?
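A rough sketch of why the exception fires: every task must resolve a `start_date`, either from its own arguments or from the DAG's `default_args`. The helper below only loosely mirrors that lookup and is not Airflow's actual code.

```python
def resolve_task_args(default_args, task_kwargs):
    """Merge DAG-level defaults with task kwargs and require a start_date."""
    merged = {**default_args, **task_kwargs}
    if merged.get("start_date") is None:
        raise ValueError("Task is missing the start_date parameter")
    return merged

# No start_date anywhere: this is the failure mode reported above.
try:
    resolve_task_args({}, {"task_id": "get_seed"})
except ValueError as exc:
    error_message = str(exc)

# Supplying start_date via default_args (a plain string here for brevity) fixes it.
resolved = resolve_task_args({"start_date": "2019-01-01"}, {"task_id": "get_seed"})
```

So making `start_date` optional would mean dropping this requirement or giving it a default, which is what the improvement asks about.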



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (AIRFLOW-714) PrestoHook - Add session properties

2022-04-17 Thread Agario Unblocked (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523383#comment-17523383
 ] 

Agario Unblocked commented on AIRFLOW-714:
--


> PrestoHook - Add session properties
> ---
>
> Key: AIRFLOW-714
> URL: https://issues.apache.org/jira/browse/AIRFLOW-714
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks
>Affects Versions: 2.0.0
>Reporter: Teresa Fontanella De Santis
>Assignee: Teresa Fontanella De Santis
>Priority: Minor
> Fix For: 2.0.0
>
>
> In presto, there are some session properties 
> (https://prestodb.io/docs/current/sql/set-session.html) that can be used in 
> order to make queries more efficient. The idea would be to add session 
> properties in the "Extra Fields" param in Presto Connection.
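One hedged way the proposal could work: parse the connection's "Extra Fields" JSON and emit one `SET SESSION` statement per property before running the query. The `session_props` key and the property names below are illustrative assumptions, not an existing PrestoHook contract.

```python
import json

def set_session_statements(props):
    """Render one Presto SET SESSION statement per session property.

    Strings become single-quoted SQL literals; numbers pass through as-is.
    """
    def literal(value):
        return str(value) if isinstance(value, (int, float)) else f"'{value}'"
    return [f"SET SESSION {name} = {literal(value)}" for name, value in props.items()]

# Hypothetical contents of the connection's "Extra Fields" box.
extra = json.loads('{"session_props": {"query_max_run_time": "10m", "hash_partition_count": 16}}')
statements = set_session_statements(extra["session_props"])
```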



--
This message was sent by Atlassian Jira
(v8.20.1#820001)



[jira] [Commented] (AIRFLOW-3537) Allow AWS ECS Operator to use templates in task_definition parameter

2022-04-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523163#comment-17523163
 ] 

ASF GitHub Bot commented on AIRFLOW-3537:
-

potiuk commented on PR #4341:
URL: https://github.com/apache/airflow/pull/4341#issuecomment-1100727118

   Take a look at other PRs where people added template fields. You will 
find a number of them if you search.




> Allow AWS ECS Operator to use templates in task_definition parameter
> 
>
> Key: AIRFLOW-3537
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3537
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: aws
>Reporter: tomoya tabata
>Assignee: tomoya tabata
>Priority: Minor
>
> The AWS ECS operator does not currently apply templates to the 
> task_definition parameter.
> I'd like to allow AWS ECS Operator to use templates in task_definition 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-3537) Allow AWS ECS Operator to use templates in task_definition parameter

2022-04-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523162#comment-17523162
 ] 

ASF GitHub Bot commented on AIRFLOW-3537:
-

potiuk commented on PR #4341:
URL: https://github.com/apache/airflow/pull/4341#issuecomment-1100726790

   Just add the field. It's enough.
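The mechanics of "just add the field" can be sketched with a toy class; `str.format` stands in for Jinja and `ToyEcsOperator` is hypothetical, but the pattern matches how `template_fields` drives rendering.

```python
class ToyEcsOperator:
    # Naming an attribute here is all it takes for it to be rendered.
    template_fields = ("overrides", "task_definition")

    def __init__(self, task_definition, overrides):
        self.task_definition = task_definition
        self.overrides = overrides

    def render_template_fields(self, context):
        """Render each attribute named in template_fields against the context."""
        for name in self.template_fields:
            setattr(self, name, getattr(self, name).format(**context))

op = ToyEcsOperator(task_definition="my-task-def-{env}", overrides="cluster-{env}")
op.render_template_fields({"env": "prod"})
```

Because the rendering loop iterates over the tuple, extending the tuple with `task_definition` is the whole change; no new rendering code is needed.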




> Allow AWS ECS Operator to use templates in task_definition parameter
> 
>
> Key: AIRFLOW-3537
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3537
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: aws
>Reporter: tomoya tabata
>Assignee: tomoya tabata
>Priority: Minor
>
> The AWS ECS operator does not currently apply templates to the 
> task_definition parameter.
> I'd like to allow AWS ECS Operator to use templates in task_definition 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-3537) Allow AWS ECS Operator to use templates in task_definition parameter

2022-04-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523158#comment-17523158
 ] 

ASF GitHub Bot commented on AIRFLOW-3537:
-

FridayPush commented on PR #4341:
URL: https://github.com/apache/airflow/pull/4341#issuecomment-1100718236

   I appreciate the invitation @potiuk, generally this PR is what I would 
produce to make the change. I would likely have removed the `template_fields` 
test as it is a value check. Additionally the [BaseOperator already has 
tests](https://github.com/apache/airflow/blob/main/tests/models/test_baseoperator.py)
 that validate the behavior of changes to the `template_fields`
   
   Per the original PR review the only template rendering test I can find in 
other [AWS Operators is the 'batch' 
operator](https://github.com/apache/airflow/blob/main/tests/providers/amazon/aws/operators/test_batch.py#L100).
   
   If I were to submit a new PR with the changes to the template_fields, how 
should tests be handled? Would you lean towards removing the test or adding the 
per-field test referenced by mik-laj?




> Allow AWS ECS Operator to use templates in task_definition parameter
> 
>
> Key: AIRFLOW-3537
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3537
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: aws
>Reporter: tomoya tabata
>Assignee: tomoya tabata
>Priority: Minor
>
> The AWS ECS operator does not currently apply templates to the 
> task_definition parameter.
> I'd like to allow AWS ECS Operator to use templates in task_definition 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-3537) Allow AWS ECS Operator to use templates in task_definition parameter

2022-04-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523038#comment-17523038
 ] 

ASF GitHub Bot commented on AIRFLOW-3537:
-

potiuk commented on PR #4341:
URL: https://github.com/apache/airflow/pull/4341#issuecomment-1100598087

   > Wish this had been merged. A pain point for us that's one field name added 
to a tuple.
   
   Feel free to contribute it as a PR. This is an open-source project with more 
than 2000 contributors - you are absolutely welcome to contribute such change.




> Allow AWS ECS Operator to use templates in task_definition parameter
> 
>
> Key: AIRFLOW-3537
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3537
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: aws
>Reporter: tomoya tabata
>Assignee: tomoya tabata
>Priority: Minor
>
> The AWS ECS operator does not currently apply templates to the 
> task_definition parameter.
> I'd like to allow AWS ECS Operator to use templates in task_definition 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-3537) Allow AWS ECS Operator to use templates in task_definition parameter

2022-04-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523004#comment-17523004
 ] 

ASF GitHub Bot commented on AIRFLOW-3537:
-

FridayPush commented on PR #4341:
URL: https://github.com/apache/airflow/pull/4341#issuecomment-1100510274

   Wish this had been merged. A pain point for us that's one field name added 
to a tuple. 




> Allow AWS ECS Operator to use templates in task_definition parameter
> 
>
> Key: AIRFLOW-3537
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3537
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: aws
>Reporter: tomoya tabata
>Assignee: tomoya tabata
>Priority: Minor
>
> The AWS ECS operator does not currently apply templates to the 
> task_definition parameter.
> I'd like to allow AWS ECS Operator to use templates in task_definition 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-04-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17520659#comment-17520659
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

woodywuuu commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1095231809

   airflow: 2.2.2 with MySQL 8, HA scheduler, Celery executor (Redis backend)
   
   From the logs, it shows that the task instances (tis) that reported the error 
`killed externally (status: success)` were rescheduled:
   1. scheduler finds a ti to schedule (ti goes from None to scheduled)
   2. scheduler queues the ti (ti goes from scheduled to queued)
   3. scheduler sends the ti to celery
   4. worker gets the ti
   5. worker finds that the ti's state in mysql is still scheduled 
https://github.com/apache/airflow/blob/2.2.2/airflow/models/taskinstance.py#L1224
   6. worker sets this ti back to None
   7. scheduler reschedules this ti
   8. scheduler cannot queue this ti again, finds the ti succeeded (in 
celery), and so sets it to failed
   
   From mysql we see that all failed tasks have no external_executor_id!
   
   We ran 5000 dags, each with 50 dummy tasks, and found that if the following two 
conditions are met, the probability of triggering this problem increases 
sharply:
   
   1. no external_executor_id is set on the queued ti in celery 
https://github.com/apache/airflow/blob/2.2.2/airflow/jobs/scheduler_job.py#L537
      * The SQL above uses skip_locked, so some queued tis in celery may miss 
this external_executor_id. 
   2. a scheduler loop takes very long (more than 60s), so 
`adopt_or_reset_orphaned_tasks` judges that the SchedulerJob failed and tries to 
adopt the orphaned tis 
https://github.com/apache/airflow/blob/9ac742885ffb83c15f7e3dc910b0cf9df073407a/airflow/executors/celery_executor.py#L442
   
   We ran these tests:
   1. patch `SchedulerJob._process_executor_events` so that it does not set 
external_executor_id on those queued tis
      * 300+ dags failed with `killed externally (status: success)`, versus 
normally fewer than 10
   2. patch `adopt_or_reset_orphaned_tasks` so that it does not adopt orphaned tis
      * no dags failed!
   
   I read the notes 
[below](https://github.com/apache/airflow/blob/9ac742885ffb83c15f7e3dc910b0cf9df073407a/airflow/executors/celery_executor.py#L442)
 , but I still don't understand the problem:
   1. why should we handle queued tis in celery and set this external id?




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-2320) Can not run DAGs since upgraded from Airflow 1.7

2022-04-10 Thread jer (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jer reassigned AIRFLOW-2320:


Assignee: jer  (was: Sajid Sajid)

> Can not run DAGs since upgraded from Airflow 1.7
> 
>
> Key: AIRFLOW-2320
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2320
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Gregory Dougherty
>Assignee: jer
>Priority: Blocker
>  Labels: newbie, run
> Attachments: example_bash_operator.py
>
>
> We installed Airflow 1.7, and used it for several months.  I used PIP to 
> uninstall airflow 1.7 and install 1.9 (gory details are 
> [here|https://stackoverflow.com/questions/49544320/airflow-initdb-failed-importerror-no-module-named-log-logging-mixin]).
>  
> Since then, I haven't had a single DAG run.  I renamed and moved log files to 
> match the 1.9 expectations, but still nothing happens.
>  
> I have a "run every 40 minutes" DAG, it hasn't run since 3/28.  When I 
> manually trigger it, no log file is created, nothing happens except I get a 
> running DAG listed under "DAG Runs" (I do NOT get anything listed under 
> "Recent Tasks", and "Last Run" does not get updated).
>  
> I have a "Run once" DAG that I created.  I triggered it, same behavior.
>  
> I have also tried running the example_bash_operator DAG.  Same behavior.  
> (I've attached the example_bash_operator.py file, so we're clear what I tried 
> to run)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507936#comment-17507936
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

kenny813x201 commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1069820469


   We also got the same error message. In our case, it turned out that we were 
using the same variable name for different dags.
   Changing the different dags from `as dag` to e.g. `as dags1` and `as dags2` 
solved the issue for us.
   ```python
   with DAG(
       "dag_name",
   ) as dag:
       ...
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-1746) Add a Nomad operator to trigger job from Airflow

2022-03-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506680#comment-17506680
 ] 

ASF GitHub Bot commented on AIRFLOW-1746:
-

shantanugadgil commented on pull request #2708:
URL: https://github.com/apache/airflow/pull/2708#issuecomment-1067538261


   Is this still planned to be merged?




> Add a Nomad operator to trigger job from Airflow
> 
>
> Key: AIRFLOW-1746
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1746
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Eyal Trabelsi
>Assignee: Eyal Trabelsi
>Priority: Major
>
> We recently faced the need to trigger Nomad jobs from Airflow, and no operator 
> is available for that. 
> The operator works by registering a Nomad job and dispatching it, 
> then checking the job's status with a waiter, similar to botocore 
> (https://github.com/boto/botocore/blob/5a07b477114b11e6dc5f676f5db810972565b113/botocore/docs/waiter.py)
> The operator uses https://github.com/jrxFive/python-nomad which is a Python 
> wrapper over Nomad's REST API.
> Link to the PR : https://github.com/apache/incubator-airflow/pull/2708
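The botocore-style waiter pattern mentioned above can be sketched generically (illustrative code only; `wait_for` and the status sequence are hypothetical, not the PR's actual API, and "dead" is assumed as Nomad's terminal status for finished jobs):

```python
import time

# Generic polling "waiter": dispatch a job, then poll its status until it
# reaches a terminal state or we give up.
def wait_for(get_status, success=("dead",), failure=("failed",),
             delay=0.01, max_attempts=5):
    for _ in range(max_attempts):
        status = get_status()
        if status in success:
            return status
        if status in failure:
            raise RuntimeError(f"job failed with status {status!r}")
        time.sleep(delay)
    raise TimeoutError("job did not finish in time")

# Simulated status sequence for a Nomad batch job:
statuses = iter(["pending", "running", "dead"])
print(wait_for(lambda: next(statuses), delay=0))  # prints: dead
```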





[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506060#comment-17506060
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

aakashanand92 edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1066506022


   > We face the same issue with tasks that stay indefinitely in a queued 
status, except that we don't see tasks as `up_for_retry`. It happens randomly 
within our DAGs. The task will stay in a queued status forever until we 
manually make it fail. We **don't use any sensors** at all. We are on an AWS 
MWAA instance (Airflow 2.0.2).
   > 
   > Example logs: Scheduler:
   > 
   > ```
   > [2022-01-14 08:03:32,868] {{scheduler_job.py:1239}} ERROR - Executor 
reports task instance  
finished (failed) although the task says its queued. (Info: None) Was the task 
killed externally?
   > [2022-01-14 08:03:32,845] {{scheduler_job.py:1210}} INFO - Executor 
reports execution of task0 execution_date=2022-01-13 07:00:00+00:00 exited with 
status failed for try_number 1
   >  in state FAILURE
   > ```
   > 
   > Worker:
   > 
   > ```
   > [2021-04-20 20:54:29,109: ERROR/ForkPoolWorker-15] Failed to execute task 
dag_id could not be found: task0. Either the dag did not exist or it failed to 
parse..`
   > This is not seen in the worker logs for every occurrence in the scheduler 
logs.
   > ```
   > 
   > Because of the MWAA autoscaling mechanism, `worker_concurrency` is not 
configurable. `worker_autoscale`: `10, 10`. `dagbag_import_timeout`: 120s 
`dag_file_processor_timeout`: 50s `parallelism` = 48 `dag_concurrency` = 1 
`max_threads` = 8
   > 
   > We currently have 2 (minWorkers) to 10 (maxWorkers) mw1.medium (2 vCPU) 
workers.
   
   @val2k Did you find a solution for this? I am also using an MWAA environment and facing the same issue.
   
   The tasks get stuck in the queued state, and when I look at the scheduler logs I can see the same error:
   
   "Executor reports task instance %s finished (%s) although the task says its %s. (Info: %s) Was the task killed externally?"
   
   I tried everything I could find in this thread, but nothing seems to be working.




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
>     Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506059#comment-17506059
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

aakashanand92 edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1066506022


   > We face the same issue with tasks that stay indefinitely in a queued 
status, except that we don't see tasks as `up_for_retry`. It happens randomly 
within our DAGs. The task will stay in a queued status forever until we 
manually make it fail. We **don't use any sensors** at all. We are on an AWS 
MWAA instance (Airflow 2.0.2).
   > 
   > Example logs: Scheduler:
   > 
   > ```
   > [2022-01-14 08:03:32,868] {{scheduler_job.py:1239}} ERROR - Executor 
reports task instance  
finished (failed) although the task says its queued. (Info: None) Was the task 
killed externally?
   > [2022-01-14 08:03:32,845] {{scheduler_job.py:1210}} INFO - Executor 
reports execution of task0 execution_date=2022-01-13 07:00:00+00:00 exited with 
status failed for try_number 1
   >  in state FAILURE
   > ```
   > 
   > Worker:
   > 
   > ```
   > [2021-04-20 20:54:29,109: ERROR/ForkPoolWorker-15] Failed to execute task 
dag_id could not be found: task0. Either the dag did not exist or it failed to 
parse..`
   > This is not seen in the worker logs for every occurrence in the scheduler 
logs.
   > ```
   > 
   > Because of the MWAA autoscaling mechanism, `worker_concurrency` is not 
configurable. `worker_autoscale`: `10, 10`. `dagbag_import_timeout`: 120s 
`dag_file_processor_timeout`: 50s `parallelism` = 48 `dag_concurrency` = 1 
`max_threads` = 8
   > 
   > We currently have 2 (minWorkers) to 10 (maxWorkers) mw1.medium (2 vCPU) 
workers.
   
   Did you find a solution for this? I am also using an MWAA environment and facing the same issue.
   
   The tasks get stuck in the queued state, and when I look at the scheduler logs I can see the same error:
   
   "Executor reports task instance %s finished (%s) although the task says its %s. (Info: %s) Was the task killed externally?"
   
   I tried everything I could find in this thread, but nothing seems to be working.




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
>     Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506058#comment-17506058
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

aakashanand92 commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1066506022


   > We face the same issue with tasks that stay indefinitely in a queued 
status, except that we don't see tasks as `up_for_retry`. It happens randomly 
within our DAGs. The task will stay in a queued status forever until we 
manually make it fail. We **don't use any sensors** at all. We are on an AWS 
MWAA instance (Airflow 2.0.2).
   > 
   > Example logs: Scheduler:
   > 
   > ```
   > [2022-01-14 08:03:32,868] {{scheduler_job.py:1239}} ERROR - Executor 
reports task instance  
finished (failed) although the task says its queued. (Info: None) Was the task 
killed externally?
   > [2022-01-14 08:03:32,845] {{scheduler_job.py:1210}} INFO - Executor 
reports execution of task0 execution_date=2022-01-13 07:00:00+00:00 exited with 
status failed for try_number 1
   >  in state FAILURE
   > ```
   > 
   > Worker:
   > 
   > ```
   > [2021-04-20 20:54:29,109: ERROR/ForkPoolWorker-15] Failed to execute task 
dag_id could not be found: task0. Either the dag did not exist or it failed to 
parse..`
   > This is not seen in the worker logs for every occurrence in the scheduler 
logs.
   > ```
   > 
   > Because of the MWAA autoscaling mechanism, `worker_concurrency` is not 
configurable. `worker_autoscale`: `10, 10`. `dagbag_import_timeout`: 120s 
`dag_file_processor_timeout`: 50s `parallelism` = 48 `dag_concurrency` = 1 
`max_threads` = 8
   > 
   > We currently have 2 (minWorkers) to 10 (maxWorkers) mw1.medium (2 vCPU) 
workers.
   
   Did you find a solution for this? I am also using an MWAA environment and facing the same issue.
   
   The tasks get stuck in the queued state, and when I look at the scheduler logs I can see the same error:
   
   "Executor reports task instance %s finished (%s) although the task says its %s. (Info: %s) Was the task killed externally?"
   
   I tried everything I could find in this thread, but nothing seems to be working.




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
>     Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506007#comment-17506007
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1066406484


   @turbaszek Let me make a PR later. We have been running stress tests these days, and this problem has appeared often.




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17505135#comment-17505135
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

turbaszek commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1065613233


   @ghostbody do you have an idea of how this can be addressed? 




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-03-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501136#comment-17501136
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1058801920


   After struggling, we found a method to reproduce this issue 100% of the time!
   
   
   tl;dr 
   
   
https://github.com/apache/airflow/blob/9ac742885ffb83c15f7e3dc910b0cf9df073407a/airflow/models/taskinstance.py#L1253
   
   Add a `raise` to simulate a DB error, which is likely to happen when the DB is under great pressure.
   
   Then you will hit this issue (`Was the task killed externally?`) every time.
   
   Conditions:
   
   - Airflow 2.2
   - Celery Executor
   
   It's because the worker uses a local task job, which spawns a child process to execute the job. The parent process sets the task from `Queued` to `Running` state. However, when the preparation work in the parent process fails, it leads directly to this error.
   
   related code is here: 
https://github.com/apache/airflow/blob/2.2.2/airflow/jobs/local_task_job.py#L89
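The failure mode described above can be sketched in plain Python (a hypothetical stand-in, not actual Airflow internals; `worker_run` and the dict-based task are illustrative only):

```python
# The worker's parent process must move a queued task to "running" before the
# child executes it. If that preparation step raises (e.g. a DB error under
# load), the process exits, the executor reports the slot as finished, and the
# task row is still "queued" -- producing the log message in question.
def worker_run(task, db_fails=False):
    if db_fails:
        raise RuntimeError("simulated DB error while flipping queued -> running")
    task["state"] = "running"
    task["state"] = "success"

task = {"state": "queued"}
try:
    worker_run(task, db_fails=True)
except RuntimeError:
    pass  # the worker process dies; nothing updated the task row

executor_view = "finished"  # the executor only sees that the process exited
print(task["state"], executor_view)  # prints: queued finished
```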
   
   
   




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Commented] (AIRFLOW-6405) Bigquery Update Table Properties Operator

2022-03-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500879#comment-17500879
 ] 

ASF GitHub Bot commented on AIRFLOW-6405:
-

rolldeep commented on pull request #7126:
URL: https://github.com/apache/airflow/pull/7126#issuecomment-1058249946


   Hi @jithin-sukumar , thanks for your operator!
   
   Unfortunately, I can't figure out how to use it. Currently I want to upsert a delta table into a main table. There are no parameters for the destination (main) table.
   
   Here is my operator:
   ```python
   upsert_table = BigQueryUpsertTableOperator(
       task_id="upsert_table",
       dataset_id='DATASET_NAME',
       table_resource={
           "tableReference": {"tableId": f"{config.get('TABLE_NAME')}"},
           "expirationTime": (int(time.time()) + 300) * 1000,
       },
   )
   ```
   The problem is that I can't choose the destination (main) table. @jithin-sukumar, can you explain how I can set up my destination table? As I see it, the current implementation uses tableReference for both source and destination. 




> Bigquery Update Table Properties Operator
> -
>
> Key: AIRFLOW-6405
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6405
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp, operators
>Affects Versions: 1.10.7
>Reporter: Jithin Sukumar
>Assignee: Jithin Sukumar
>Priority: Minor
> Fix For: 2.0.0
>
>
> Currently, Airflow doesn't seem to support BigQuery update table operations 
> [1] (specifically to update the properties of a table). Is this already under 
> development?
> (The use case is conditionally updating the `expiration time` of BQ tables.)
>  
> References:
> [1]:  [https://cloud.google.com/bigquery/docs/managing-tables]
>  





[jira] [Assigned] (AIRFLOW-5017) Cannot Parse Environment Variable Connection for Spark on K8s

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5017:


Assignee: Sajid Sajid

> Cannot Parse Environment Variable Connection for Spark on K8s
> -
>
> Key: AIRFLOW-5017
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5017
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: models
>Affects Versions: 1.10.3
>Reporter: Skyler Lehan
>Assignee: Sajid Sajid
>Priority: Critical
>
> Currently, if you create a Spark on Kubernetes based URL [connection with 
> environment 
> variables|https://airflow.apache.org/howto/connection/index.html#creating-a-connection-with-environment-variables]
>  for the Spark Submit operator, urllib is unable to parse it correctly due to 
> the double scheme that Spark expects (k8s://https://<hostname>:<port>).
> To test this, first set an environment variable for the connection:
> {code:java}
> $ export AIRFLOW_CONN_SPARK_K8S=k8s://https://localhost:8080{code}
> Add the following example DAG:
> {code:java}
> from datetime import datetime, timedelta
> from airflow import DAG
> from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator
> default_args = {
>   'owner': 'airflow',
>   'depends_on_past': False,
>   'start_date': datetime(2019, 7, 19),
>   'retries': 0
> }
> dag = DAG('spark_submit_k8s', default_args=default_args, schedule_interval=timedelta(days=1))
> run_example_jar = SparkSubmitOperator(
>   task_id='spark_submit_example',
>   application='file:///usr/local/spark-2.4.2/examples/jars/spark-examples_2.11-2.4.2.jar',
>   java_class='org.apache.spark.examples.SparkPi',
>   conn_id='spark_k8s',
>   dag=dag
> )
> {code}
> This fails to parse as per the logs:
> {code:java}
> [2019-07-22 15:05:58,925] {logging_mixin.py:95} INFO - [2019-07-22 
> 15:05:58,925] {base_hook.py:83} INFO - Using connection to: id: spark_k8s. 
> Host: https, Port: None, Schema: /localhost:8080, Login: None, Password: 
> None, extra: {}
> {code}
> Because of the preceding "k8s://" urllib fails to parse this correctly. This 
> consequently will cause the DAG task to fail.
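The parse failure can be reproduced with the standard library alone (a minimal sketch that mirrors, but is not, Airflow's own connection-parsing code):

```python
from urllib.parse import urlparse

# Spark's double-scheme URI confuses the stdlib parser: everything after
# "k8s://" is treated as authority + path, so "https:" becomes the netloc.
parsed = urlparse("k8s://https://localhost:8080")
print(parsed.scheme)    # k8s
print(parsed.hostname)  # https  <- the bogus "Host: https" seen in the log
print(parsed.port)      # None
print(parsed.path)      # //localhost:8080
```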





[jira] [Assigned] (AIRFLOW-5181) Connection reset by peer during Dataproc job

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5181:


Assignee: Sajid Sajid

> Connection reset by peer during Dataproc job
> 
>
> Key: AIRFLOW-5181
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5181
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: plugins
>Affects Versions: 1.10.2
>Reporter: Maulik Soneji
>Assignee: Sajid Sajid
>Priority: Critical
>
> Dataproc job fails intermittently saying `Connection reset by peer`
> This may be caused by a stale connection in the httplib2 cache.
> This issue happens intermittently and it is still an open issue in httplib2.
> There have been suggestions to not use the httplib2 library and use 
> httplib2shim.
> Link to conversation: 
> https://github.com/googleapis/google-cloud-python/issues/2028
> Link to open issue on httplib2: https://github.com/httplib2/httplib2/issues/40
> Should we also remove httplib2 dependency and use httplib2shim?





[jira] [Assigned] (AIRFLOW-1253) Jobs are not scheduled

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-1253:


Assignee: Sajid Sajid

> Jobs are not scheduled
> --
>
> Key: AIRFLOW-1253
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1253
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.8.1
>Reporter: Liran Y
>Assignee: Sajid Sajid
>Priority: Critical
>
> Jobs are not scheduled for me in Airflow when using a 24-hour cron (example: 0 7 * * *)
> I tried to debug jobs.py and it seems the following is causing it not to run 
> (line 857):
> {code}
> if next_run_date and period_end and period_end <= datetime.now():
> {code}
> I don't understand the logic here.
> This is what's in the above variables:
> {noformat}
> next_run_date=2017-05-30 07:00:00
> period_end=2017-05-31 07:00:00
> datetime.now()=2017-05-30 07:01:22
> {noformat}
> next_run_date reflects the next run date of the DAG which is correct (today 
> at 07:00)
> period_end reflects when the next-next run will be, which is also correct 
> (tomorrow)
> The DAG needs to run in between.
> So the IF above should be:
> {code}
> if next_run_date and period_end and period_end >= datetime.now():
> {code}
> Am I missing something?
> Thanks
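For reference, the condition is consistent with Airflow's interval semantics: a run stamped next_run_date covers the period [next_run_date, period_end) and is only created once that whole period has elapsed. A quick sketch with the values from the report above:

```python
from datetime import datetime, timedelta

# Values from the report; cron "0 7 * * *" means a daily interval.
interval = timedelta(days=1)
next_run_date = datetime(2017, 5, 30, 7, 0)   # execution_date of the run
period_end = next_run_date + interval          # 2017-05-31 07:00
now = datetime(2017, 5, 30, 7, 1, 22)

# The scheduler only creates the run once its whole period has elapsed:
should_run = period_end <= now
print(should_run)  # False: the 2017-05-30 run is created on 2017-05-31 at 07:00
```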





[jira] [Assigned] (AIRFLOW-3873) Issue with DAG dependency using ExternalTaskSensor

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-3873:


Assignee: Sajid Sajid

> Issue with DAG dependency using ExternalTaskSensor
> --
>
> Key: AIRFLOW-3873
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3873
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Affects Versions: 1.10.1
> Environment: Running on Redhat Linux box on which Airflow is 
> Installed.
>Reporter: Krishna Garapati
>Assignee: Sajid Sajid
>Priority: Critical
>
> I have two DAGs created and want to set a dependency between them using 
> ExternalTaskSensor as shown below. I am getting the error "Broken DAG: 
> [/data1/airflow/dags/testdagdependency.py] No module named snakebite.client". 
> Please help me with this.
> ==
> *DAG 1:* 
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
> default_args = {
>  'owner': 'Krishna Garapati',
>  'depends_on_past': False,
>  'start_date': datetime(2019, 2, 9),
>  'email': ['krishna.garap...@transamerica.com'],
>  'email_on_failure': True,
>  'email_on_retry': True,
>  'retries': 0,
>  'retry_delay': timedelta(minutes=5)
>  #'queue': 'finance-ingestion',
>  # 'run_as_user': 'sptfinactmodel'
>  # 'pool': 'backfill',
>  # 'priority_weight': 10,
>  # 'end_date': datetime(2016, 1, 1),
> }
> dag = DAG('pythontest', default_args=default_args, schedule_interval='27 2 * * *')
> # t1, t2 and t3 are examples of tasks created by instantiating operators
> t1 = BashOperator(
>  task_id='print_date',
>  bash_command='date',
>  dag=dag)
> t2 = BashOperator(
>  task_id='pythontest',
>  bash_command='\{{"python /preprod/finance/financedatastagedev/scripts/airflowtest/hive/test.py"}}',
>  dag=dag)
> t2.set_upstream(t1)
>  
> ==
>  
> *DAG 2 ( Keeping dependency on DAG1)*
>  
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
> from airflow.operators.sensors import ExternalTaskSensor
> default_args = {
>  'owner': 'Krishna Garapati',
>  'depends_on_past': False,
>  'start_date': datetime(2019, 2, 11),
>  'email': ['krishna.garap...@transamerica.com'],
>  'email_on_failure': True,
>  'email_on_retry': True,
>  'retries': 0,
>  'retry_delay': timedelta(minutes=5)
>  #'queue': 'finance-ingestion',
>  # 'run_as_user': 'sptfinactmodel'
>  # 'pool': 'backfill',
>  # 'priority_weight': 10,
>  # 'end_date': datetime(2016, 1, 1),
> }
> dag = DAG('testdagdependency', default_args=default_args, schedule_interval='27 15 * * *')
> wait_for_pythontest = ExternalTaskSensor(
>  task_id='wait_for_pythontest',
>  external_dag_id='pythontest',
>  external_task_id='pythontest',
>  execution_delta=None, # Same day as today
>  dag=dag)
> # t1, t2 and t3 are examples of tasks created by instantiating operators
> t1 = BashOperator(
>  task_id='print_date',
>  bash_command='date',
>  dag=dag)
> t2 = BashOperator(
>  task_id='testdependency',
>  bash_command='\{{"python /preprod/finance/financedatastagedev/scripts/airflowtest/hive/test.py"}}',
>  dag=dag)
> wait_for_pythontest >> testdagdependency
> t2.set_upstream(t1)
>  
> =





[jira] [Assigned] (AIRFLOW-4126) Graceful restart of Scheduler

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-4126:


Assignee: Sajid Sajid

> Graceful restart of Scheduler
> -
>
> Key: AIRFLOW-4126
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4126
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 1.10.0
>Reporter: Eric
>Assignee: Sajid Sajid
>Priority: Critical
>
> I am running Airflow in a 2 node (webserver and scheduler) configuration in 
> k8s using the LocalExecutor.
> Is there any way to gracefully stop the scheduler? I'd like to stop the 
> scheduler but allow any currently working tasks to finish. I'd like to do 
> this so I can deploy new dags.
> I looked through the documentation for an answer but have not found one yet.
>  
> Thanks





[jira] [Assigned] (AIRFLOW-3539) Error Importing sub-dag (Webserver module load fail, scheduler works fine)

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-3539:


Assignee: Sajid Sajid  (was: Alex Lumpov)

> Error Importing sub-dag (Webserver module load fail, scheduler works fine)
> --
>
> Key: AIRFLOW-3539
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3539
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler, webserver
>Affects Versions: 1.10.1
>Reporter: Leonardo Zanell
>Assignee: Sajid Sajid
>Priority: Critical
>  Labels: docker, import
> Fix For: 1.10.1
>
> Attachments: Seleção_011.png
>
>
> Hi, I'm deploying Airflow on a VM instance in Google Cloud Platform, using 
> the puckel/docker-airflow image with docker-compose-Celery-Executor.yaml. 
> When I run the DAGs locally the imports work fine, but with this 
> docker-compose deployment the import fails only for the webserver: the 
> scheduler runs the DAG fine, with no problems in sight, and when I turn on 
> the DAG in the UI it runs in the scheduler/worker, yet the webserver reports 
> this import error:
> {code:java}
> [2018-12-18 13:06:15,273] {{__init__.py:51}} INFO - Using executor 
> CeleryExecutor
> [2018-12-18 13:06:15,762] {{models.py:271}} INFO - Filling up the DagBag from 
> /usr/local/airflow/dags
> [2018-12-18 13:06:16,290] {{models.py:380}} ERROR - Failed to import: 
> /usr/local/airflow/dags/dags/daily_etl.py
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 377, 
> in process_file
>  m = imp.load_source(mod_name, filepath)
>  File "/usr/local/lib/python3.6/imp.py", line 172, in load_source
>  module = _load(spec)
>  File "", line 684, in _load
>  File "", line 665, in _load_unlocked
>  File "", line 678, in exec_module
>  File "", line 219, in _call_with_frames_removed
>  File "/usr/local/airflow/dags/dags/daily_etl.py", line 4, in 
>  from dags.daily_extract import dag as dag_extract
> ModuleNotFoundError: No module named 'dags.daily_extract'
> {code}
>  
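The failing `from dags.daily_extract import ...` only resolves if the parent directory of the inner `dags` package is on `sys.path` in the webserver process as well as the scheduler. A defensive sketch (the helper name is ours, not part of Airflow):

```python
# Sketch: make `from dags.daily_extract import ...` resolvable by putting
# the parent of the inner dags/ package on sys.path. Helper name is
# illustrative, not part of Airflow.
import os
import sys

def ensure_parent_on_path(dag_file):
    # /usr/local/airflow/dags/dags/daily_etl.py -> /usr/local/airflow/dags
    parent = os.path.dirname(os.path.dirname(os.path.abspath(dag_file)))
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent
```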





[jira] [Assigned] (AIRFLOW-5633) Apache Airflow Integration with HashiCorp Nomad

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5633:


Assignee: Sajid Sajid

> Apache Airflow Integration with HashiCorp Nomad
> ---
>
> Key: AIRFLOW-5633
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5633
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: executors, operators
>Affects Versions: 1.10.5
>Reporter: Yishan
>Assignee: Sajid Sajid
>Priority: Critical
>
> Nomad is an orchestrator similar to Kubernetes, used in production by 
> companies like PagerDuty, Target, Citadel, Trivago, Pandora, Roblox, etc. It 
> is built by HashiCorp (the company behind tools such as Terraform, Vault, 
> and Consul).
> We've gotten many requests from our customers and open source community for 
> an integration with Airflow.  There are a variety of community PRs from 
> supportive members who are trying to build their own (linked notable ones 
> below) - our approach would be to build one officially supported integration 
> ourselves for all our users and customers.  
> [https://github.com/apache/airflow/pull/2708]
> [https://github.com/hashicorp/nomad/issues/419]
> We'd love to understand how much interest and support we can gauge from the 
> Airflow committee around an integration like this.  Ideally, we'd aim to 
> avoid a situation where our integration is not accepted + Nomad users using 
> Airflow are subject to an incomplete/less-than-ideal experience on a lagging 
> HashiCorp-maintained-fork. 
> This has unfortunately happened with our Spark integration, so we'd love to 
> gauge the Airflow committee early-on in the process before we potentially 
> build the integration.





[jira] [Assigned] (AIRFLOW-5214) Airflow leaves too many TIME_WAIT TCP connections

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5214:


Assignee: Sajid Sajid

> Airflow leaves too many TIME_WAIT TCP connections
> -
>
> Key: AIRFLOW-5214
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5214
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun, database
>Affects Versions: 1.10.2, 1.10.4
> Environment: CentOS 7, Airflow 1.10.4, Maria DB
>Reporter: Oliver Ricken
>Assignee: Sajid Sajid
>Priority: Critical
>
> Dear experts,
> in Airflow version 1.10.2 as well as 1.10.4, we experience a severe problem 
> that limits the number of concurrent tasks.
> When more than 8 tasks are started and executed in parallel, the majority of 
> them fail with the error "Can't connect to MySQL server" and error code 
> 2006(99). This error code boils down to "Cannot bind socket to resource", 
> which is why we started looking into the TCP connections of our Airflow host 
> (a single node that hosts the webserver, scheduler and worker).
> While the 8 tasks are running simultaneously, we observe more than 15,000 
> TIME_WAIT connections while fewer than 50 are established. Given that the 
> number of available ports is somewhat below 30,000, this large number of 
> blocked but unused TCP connections would explain why further task executions 
> fail.
> Can anyone explain how so many open connections blocking ports/sockets come 
> about? Given that we have connection pooling enabled, we have no explanation 
> yet.
> Your help is very much appreciated; this issue strongly limits our current 
> performance!
> Cheers
> Oliver
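The TIME_WAIT count quoted above can be reproduced on Linux without external tools by parsing `/proc/net/tcp`, where the fourth column is the hex socket state (`06` = TIME_WAIT). A minimal sketch:

```python
# Sketch: count TCP socket states on Linux by parsing /proc/net/tcp, to
# confirm whether TIME_WAIT sockets are exhausting the ephemeral ports.
from collections import Counter

TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT"}  # subset of interest

def tcp_state_counts(path="/proc/net/tcp"):
    counts = Counter()
    with open(path) as f:
        next(f)                      # skip the header line
        for line in f:
            state = line.split()[3]  # 4th column is the hex state code
            counts[TCP_STATES.get(state, state)] += 1
    return counts
```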





[jira] [Assigned] (AIRFLOW-5871) Stopping/Clearing a running airflow instance doesn't terminate the job actually.

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5871:


Assignee: Sajid Sajid

> Stopping/Clearing a running airflow instance doesn't terminate the job 
> actually.
> -
>
> Key: AIRFLOW-5871
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5871
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.1
>Reporter: Vasudha Putta
>Assignee: Sajid Sajid
>Priority: Critical
>
> Hi Team,
> When I change the state of a running job instance to cleared/failed, it 
> doesn't completely terminate the underlying job. I tried PythonOperator and 
> BashOperator. The job connects to Oracle and executes a package, and even 
> terminating/killing the Airflow job process won't terminate the Oracle 
> sessions. This is an issue because whenever we need to compile the package, 
> we have to stop the DAGs, mark existing DAG runs as cleared, and then kill 
> the Oracle sessions manually. Is there a way to cleanly stop DAG runs in 
> Airflow?
>  
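Airflow invokes an operator's `on_kill` hook when a running task is externally stopped (cleared or marked failed), which is the natural place to tear down the external Oracle session. A sketch under that assumption; the `conn` object and `PackageRunner` class are stand-ins, not a real Oracle client or Airflow operator:

```python
# Sketch: close the external DB session in on_kill(), the hook Airflow
# calls when a running task is cleared/failed externally. The `conn`
# object here is a stand-in, not a real Oracle client.
class PackageRunner:
    def __init__(self, conn):
        self.conn = conn

    def execute(self):
        return self.conn.run("BEGIN finance_pkg.run; END;")

    def on_kill(self):
        # Without this, killing the Airflow process leaves the Oracle
        # session running server-side.
        self.conn.close()
```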





[jira] [Assigned] (AIRFLOW-6124) tmp folder gets deleted, shell-init: error retrieving current directory

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6124:


Assignee: Sajid Sajid

> tmp folder gets deleted, shell-init: error retrieving current directory
> ---
>
> Key: AIRFLOW-6124
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6124
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Affects Versions: 1.10.6
>Reporter: Ravi Kaushik
>Assignee: Sajid Sajid
>Priority: Critical
> Attachments: Error.png
>
>
> My DAG has three SparkSubmitOperators and occasionally throws this error 
> after executing one task. If I trigger the job manually, ignoring all 
> dependencies, it works well. Most likely the /tmp folder is getting deleted 
> and the process cannot find its working directory again.
>  
> {code:java}
> [2019-11-29 12:58:54,548] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,547] \{base_hook.py:84} INFO - Using connection to: id: 
> spark_default. Host: yarn, Port: None, Schema: None, Login: None, Password: 
> None, extra: \{'deploy-mode': 'cluster', 'spark-binary': 
> '/usr/bin/spark2-submit/'}
>  [2019-11-29 12:58:54,548] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,548] \{spark_submit_hook.py:302} INFO - Spark-Submit cmd: 
> ['spark-submit', '--master', 'yarn', '--conf', 'spark.yarn.maxAppAttempts=2', 
> '--files', '/home/centos/airflow/dags/conf/aws_keys.pickle', 
> '--num-executors', '4', '--executor-cores', '4', '--executor-memory', '20G', 
> '--name', 'universal_cross_matrix_joins_AUS_27-11-2019', '--deploy-mode', 
> 'cluster', 
> '/home/centos/airflow/scripts/near-airflow-repo/scripts/universal_cross_matrix/join_ifa_ncid.py',
>  '--input_date', '27-11-2019', '--country', 'AUS', '--tenant_id', '63ce7457']
>  [2019-11-29 12:58:54,553] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,553] \{spark_submit_hook.py:433} INFO - shell-init: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,555] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,555] \{spark_submit_hook.py:433} INFO - shell-init: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,557] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,557] \{spark_submit_hook.py:433} INFO - shell-init: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,560] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,560] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,563] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,563] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,565] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,564] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,567] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,567] \{spark_submit_hook.py:433} INFO - shell-init: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,569] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,569] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,571] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,571] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,574] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,574] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
>  [2019-11-29 12:58:54,576] \{logging_mixin.py:112} INFO - [2019-11-29 
> 12:58:54,576] \{spark_submit_hook.py:433} INFO - chdir: error retrieving 
> current directory: getcwd: cannot access parent directories: No such file or 
> directory
> {code}
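The repeated "getcwd: cannot access parent directories" lines mean the worker's current directory was deleted out from under it (consistent with a purged /tmp dir). A defensive sketch that re-chdirs to a stable directory before spawning spark-submit; the helper is illustrative, not part of Airflow:

```python
# Sketch: recover from a deleted current working directory before
# launching a subprocess such as spark-submit.
import os

def ensure_cwd(fallback="/"):
    try:
        os.getcwd()
        return False            # cwd is still valid
    except OSError:
        os.chdir(fallback)      # recover into a directory that exists
        return True
```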





[jira] [Assigned] (AIRFLOW-4499) scheduler process running (in ps) but not doing anything, not writing to log for 3+hrs and not processing tasks

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-4499:


Assignee: Sajid Sajid

> scheduler process running (in ps) but not doing anything, not writing to log 
> for 3+hrs and not processing tasks
> ---
>
> Key: AIRFLOW-4499
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4499
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.3
>Reporter: t oo
>Assignee: Sajid Sajid
>Priority: Critical
>
> Blogs mention this as a long-standing issue, but I could not find an open 
> JIRA for it.
> The scheduler process is running (visible in ps -ef) but not doing anything: 
> it has not written to its log for 3+ hours and is not processing tasks.
> Band-aid solution:
> new config value ---> scheduler_restart_mins = x
> Implement an auto-restart of the scheduler process if the scheduler log file 
> has not been updated within 2*x minutes and the scheduler process start time 
> is older than x minutes.
> env: LocalExecutor
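The staleness half of the proposed band-aid can be sketched with the standard library. Note `scheduler_restart_mins` is the reporter's suggested config key, not a real Airflow option:

```python
# Sketch of the proposed band-aid check: flag a scheduler whose log file
# has been silent for more than 2*x minutes (x = scheduler_restart_mins,
# the reporter's hypothetical config value).
import os
import time

def scheduler_is_stale(log_path, restart_mins):
    age_secs = time.time() - os.path.getmtime(log_path)
    return age_secs > 2 * restart_mins * 60
```

A supervisor would pair this with a check on the scheduler's process start time before restarting it.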





[jira] [Assigned] (AIRFLOW-7020) Jobs running with cron is not working as expected

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-7020:


Assignee: Sajid Sajid

> Jobs running with cron is not working as expected
> -
>
> Key: AIRFLOW-7020
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7020
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.6
>Reporter: Rohit
>Assignee: Sajid Sajid
>Priority: Critical
> Attachments: dag1_daiily.png, dag2_15min.png, dag3_daily.png, diff.png
>
>
> A job submitted with @daily finished executing. It worked as expected and 
> completed all of its tasks about an hour ago:
> [Finished|https://imgur.com/uqRz8W6]
>  
> But another DAG running with the schedule 15 * * * * is not working as 
> expected: all of its tasks are clogged (queued, not running at all).
> [Clogged|https://imgur.com/rnxvOdw]
>  
> I waited for hours but got the same result; the job's tasks stay in a null 
> state.
>  
> So I copied the code, changed only the cron expression from 15 * * * * to 
> @daily and the DAG name, and started it as a new DAG. This one works like a 
> charm:
> [DailyDag|https://imgur.com/wECr2Qg]
>  
> As you can see from the following image:
> [Difference|https://imgur.com/ibvA2W3]
>  * the 1st one is the DAG with 15 * * * *
>  * the 2nd one is the DAG with @daily
>  
> Why am I getting this? Is it a known issue?
> Is there any solution?
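For clarity on the two schedules being compared: `15 * * * *` fires at minute 15 of every hour, whereas `@daily` fires once at midnight. A toy stdlib computation of the next `15 * * * *` run time (no croniter dependency; handles only the minute field):

```python
# Toy next-run computation for the schedule "15 * * * *" (minute 15 of
# every hour), stdlib only.
import datetime

def next_quarter_past(now):
    candidate = now.replace(minute=15, second=0, microsecond=0)
    if candidate <= now:
        candidate += datetime.timedelta(hours=1)
    return candidate
```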





[jira] [Assigned] (AIRFLOW-6779) Scheduler stopped scheduling tasks with concurrency limit or depends on past

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6779:


Assignee: Sajid Sajid  (was: QP Hou)

> Scheduler stopped scheduling tasks with concurrency limit or depends on past
> ---
>
> Key: AIRFLOW-6779
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6779
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.0
>Reporter: QP Hou
>Assignee: Sajid Sajid
>Priority: Critical
>
> https://github.com/apache/airflow/pull/4751 introduced a bug that stops the 
> scheduler from scheduling tasks with the following properties:
> * depends on past
> * has custom concurrency limit





[jira] [Assigned] (AIRFLOW-6560) db password leaks to logs

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6560:


Assignee: Sajid Sajid

> db password leaks to logs
> -
>
> Key: AIRFLOW-6560
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6560
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging, security
>Affects Versions: 1.10.6
>Reporter: Marcin Jasiński
>Assignee: Sajid Sajid
>Priority: Critical
>
> I have configured the Airflow metadata db as MySQL:
> {code:java}
> sql_alchemy_conn = sqla+mysql://airflow:airflow@localhost:3306/airflow{code}
> After that I ran the initdb command:
> {code:java}
> airflow initdb{code}
> The tables in the airflow db were created.
> Then I ran:
> {code:java}
> airflow worker{code}
> which logs:
> {code:java}
> [2020-01-14 18:39:03,457: INFO/MainProcess] Connected to 
> sqla+mysql://airflow:airflow@localhost:3306/airflow{code}
> i.e. the password in plain text.
> This probably comes from Celery - 
> [https://github.com/celery/celery/blob/master/celery/worker/consumer/connection.py#L24]
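The general mitigation, wherever the log line is emitted, is to redact the password component of the URI before logging. A stdlib-only sketch (not Celery's or Airflow's actual code):

```python
# Sketch: redact the password component of a broker/DB URI before
# logging it, using only the standard library.
from urllib.parse import urlsplit, urlunsplit

def redact_uri(uri):
    parts = urlsplit(uri)
    if parts.password is None:
        return uri
    netloc = parts.netloc.replace(":" + parts.password + "@", ":***@", 1)
    return urlunsplit(parts._replace(netloc=netloc))
```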





[jira] [Assigned] (AIRFLOW-5571) Kubernetes operator's bug that get logs will make pod exit unexpectedly

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5571:


Assignee: Sajid Sajid

> Kubernetes operator's bug that get logs will make pod exit unexpectedly
> ---
>
> Key: AIRFLOW-5571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.0, 1.10.1, 1.10.2, 1.10.3
>Reporter: Liu Xuesi
>Assignee: Sajid Sajid
>Priority: Critical
>  Labels: kubernetes, operator
> Attachments: k8s_error_log.png
>
>
> In my project I manage 200+ jobs in Kubernetes, and while working with them 
> I found a critical bug in the Kubernetes operator.
> In pod_launcher.py, the *_monitor_pod* function:
> {code:python}
> if get_logs:
>     logs = self.read_pod_logs(pod)  # this call has a retry
>     for line in logs:  # but the exception is thrown from here!
>         self.log.info(line)
> {code}
> In the code above, *logs* is an HttpResponse; since it implements 
> __iter__(), you can use a for loop to print the lines. In other words, it 
> holds a long-lived HTTP connection to stream an endless log.
> There is a try/except (with retry) only around *self.read_pod_logs*; if the 
> network disconnects or jitters, the for loop itself throws an error.
> As I have 200+ jobs running every day, I get 4~5 such errors per day, and 
> each error makes the monitor think the pod is down, mark the task as 
> failed, and retry it. This eventually leads to data errors.
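The kind of fix implied is to retry the log stream itself when the HTTP connection drops, instead of treating the dropped stream as pod failure. A standalone sketch; the names only loosely mirror pod_launcher.py, and a real fix would resume from the last timestamp to avoid duplicated lines:

```python
# Sketch: retry the pod-log stream when the connection breaks mid-read,
# rather than failing the task. Not Airflow's actual pod_launcher code.
import http.client

def stream_pod_logs(read_pod_logs, pod, max_retries=3):
    lines = []
    for attempt in range(max_retries):
        try:
            for line in read_pod_logs(pod):
                lines.append(line)   # real code does self.log.info(line)
            return lines             # stream ended normally
        except (OSError, http.client.IncompleteRead):
            continue                 # connection broken mid-stream: retry
    return lines                     # give up, keep what we got
```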
>  
> Below is a typical error log:
> {code:java}
>  [2019-09-17 20:50:02,532] {logging_mixin.py:95} INFO - [2019-09-17 
> 20:50:02,532] {pod_launcher.py:105} INFO - b'19/09/17 20:50:00 INFO 
> yarn.Client: Application report for application_1565926539066_3866 (state: 
> RUNNING)\n'
> [2019-09-17 20:50:02,532] {logging_mixin.py:95} INFO - [2019-09-17 
> 20:50:02,532] {pod_launcher.py:105} INFO - b'19/09/17 20:50:01 INFO 
> yarn.Client: Application report for application_1565926539066_3866 (state: 
> RUNNING)\n'
> [2019-09-17 20:50:02,533] {taskinstance.py:1047} ERROR - ('Connection broken: 
> IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 639, in _update_chunk_length
> self.chunk_left = int(line, 16)
> ValueError: invalid literal for int() with base 16: b''
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 397, in _error_catcher
> yield
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 704, in read_chunked
> self._update_chunk_length()
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 643, in _update_chunk_length
> raise httplib.IncompleteRead(line)
> http.client.IncompleteRead: IncompleteRead(0 bytes read)
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.6/dist-packages/airflow/models/taskinstance.py", line 
> 922, in _run_raw_task
> result = task_copy.execute(context=context)
>   File 
> "/usr/local/lib/python3.6/dist-packages/airflow/contrib/operators/k8s_pod_operator.py",
>  line 45, in execute
> super().execute(context)
>   File 
> "/usr/local/lib/python3.6/dist-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
>  line 148, in execute
> get_logs=self.get_logs)
>   File 
> "/usr/local/lib/python3.6/dist-packages/airflow/contrib/kubernetes/pod_launcher.py",
>  line 97, in run_pod
> return self._monitor_pod(pod, get_logs)
>   File 
> "/usr/local/lib/python3.6/dist-packages/airflow/contrib/kubernetes/pod_launcher.py",
>  line 104, in _monitor_pod
> for line in logs:
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 747, in __iter__
> for chunk in self.stream(decode_content=True):
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 527, in stream
> for line in self.read_chunked(amt, decode_content=decode_content):
>   File "/usr/local/lib/python3.6/dist-packages/urllib3/response.py", line 
> 732, in read_chunked
> self._original_response.close()
>   ... (traceback truncated)
> {code}

[jira] [Assigned] (AIRFLOW-5686) Deleting DAG can leave scheduled/queued tasks consuming pool slots

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5686:


Assignee: Sajid Sajid

> Deleting DAG can leave scheduled/queued tasks consuming pool slots
> --
>
> Key: AIRFLOW-5686
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5686
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.5
>Reporter: Ash Berlin-Taylor
>Assignee: Sajid Sajid
>Priority: Critical
> Fix For: 2.0.0
>
>
> If you delete a dag file when it had tasks in the scheduled or queued state, 
> those tasks instances are never touched again.
> With the slight tweak in Pool (making the default pool an explicit one) this 
> now matters, and the scheduler ends up "blocked" from running new tasks with 
> this message:
> {noformat}
> Figuring out tasks to run in Pool(name=default_pool) with -9022 open slots 
> and 45 task instances ready to be queued {noformat}
> The fix should be to set any task instance in a non-terminal state (None, 
> queued, scheduled, running, up_for_retry etc.) to "removed" inside 
> DAG.deactivate_stale_dags (which is already called on scheduler shutdown).
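The proposed fix amounts to flipping any still-non-terminal task instance of a deleted DAG to "removed" so it stops holding pool slots. A sketch of that state transition; the state names follow `airflow.utils.state.State` loosely, and the dict-based "task instance" stands in for the ORM model:

```python
# Sketch of the proposed fix: mark non-terminal task instances of a
# deleted DAG as "removed" so they release their pool slots.
NON_TERMINAL = {None, "queued", "scheduled", "running", "up_for_retry"}

def release_pool_slots(task_instances):
    changed = 0
    for ti in task_instances:
        if ti["state"] in NON_TERMINAL:
            ti["state"] = "removed"
            changed += 1
    return changed
```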





[jira] [Assigned] (AIRFLOW-6772) {celery_executor.py:224} ERROR - Error sending Celery task:No module named 'MySQLdb'

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6772:


Assignee: Sajid Sajid

> {celery_executor.py:224} ERROR - Error sending Celery task:No module named 
> 'MySQLdb'
> 
>
> Key: AIRFLOW-6772
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6772
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery, DAG
>Affects Versions: 1.10.7
> Environment: Ubuntu 18.04
>Reporter: Nitin Vashisth
>Assignee: Sajid Sajid
>Priority: Critical
>
> I have installed Airflow 1.10.7 with Python 3.6.
> I made the necessary changes in airflow.cfg as below:
> {code:java}
> celery_result_backend = 
> db+postgresql://airflow:airflow@localhost/airflow{code}
> When I try to run a sample DAG (tutorial), it throws this error in the 
> scheduler:
> {code:java}
> [2020-02-11 13:20:50,892] {celery_executor.py:224} ERROR - Error sending 
> Celery task:No module named 'MySQLdb'
> Celery Task ID: ('tutorial', 'print_date', datetime.datetime(2020, 2, 10, 0, 
> 0, tzinfo=), 1)
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.6/dist-packages/airflow/executors/celery_executor.py",
>  line 118, in send_task_to_executor
> result = task.apply_async(args=[command], queue=queue)
>   File "/usr/local/lib/python3.6/dist-packages/celery/app/task.py", line 568, 
> in apply_async
> **options
>   File "/usr/local/lib/python3.6/dist-packages/celery/app/base.py", line 771, 
> in send_task
> amqp.send_task_message(P, name, message, **options)
>   File "/usr/local/lib/python3.6/dist-packages/celery/app/amqp.py", line 559, 
> in send_task_message
> **properties
>   File "/usr/local/lib/python3.6/dist-packages/kombu/messaging.py", line 181, 
> in publish
> exchange_name, declare,
>   File "/usr/local/lib/python3.6/dist-packages/kombu/connection.py", line 
> 518, in _ensured
> return fun(*args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/kombu/messaging.py", line 194, 
> in _publish
> [maybe_declare(entity) for entity in declare]
>   File "/usr/local/lib/python3.6/dist-packages/kombu/messaging.py", line 194, 
> in 
> [maybe_declare(entity) for entity in declare]
>   File "/usr/local/lib/python3.6/dist-packages/kombu/messaging.py", line 102, 
> in maybe_declare
> return maybe_declare(entity, self.channel, retry, **retry_policy)
>   File "/usr/local/lib/python3.6/dist-packages/kombu/common.py", line 121, in 
> maybe_declare
> return _maybe_declare(entity, channel)
>   File "/usr/local/lib/python3.6/dist-packages/kombu/common.py", line 161, in 
> _maybe_declare
> entity.declare(channel=channel)
>   File "/usr/local/lib/python3.6/dist-packages/kombu/entity.py", line 609, in 
> declare
> self._create_queue(nowait=nowait, channel=channel)
>   File "/usr/local/lib/python3.6/dist-packages/kombu/entity.py", line 618, in 
> _create_queue
> self.queue_declare(nowait=nowait, passive=False, channel=channel)
>   File "/usr/local/lib/python3.6/dist-packages/kombu/entity.py", line 653, in 
> queue_declare
> nowait=nowait,
>   File 
> "/usr/local/lib/python3.6/dist-packages/kombu/transport/virtual/base.py", 
> line 531, in queue_declare
> self._new_queue(queue, **kwargs)
>   File 
> "/usr/local/lib/python3.6/dist-packages/kombu/transport/sqlalchemy/__init__.py",
>  line 82, in _new_queue
> self._get_or_create(queue)
>   File 
> "/usr/local/lib/python3.6/dist-packages/kombu/transport/sqlalchemy/__init__.py",
>  line 70, in _get_or_create
> obj = self.session.query(self.queue_cls) \
>   File 
> "/usr/local/lib/python3.6/dist-packages/kombu/transport/sqlalchemy/__init__.py",
>  line 65, in session
> _, Session = self._open()
>   File 
> "/usr/local/lib/python3.6/dist-packages/kombu/transport/sqlalchemy/__init__.py",
>  line 56, in _open
> engine = self._engine_from_config()
>   File 
> "/usr/local/lib/python3.6/dist-packages/kombu/transport/sqlalchemy/__init__.py",
>  line 51, in _engine_from_config
> return create_engine(conninfo.hostname, **transport_options)
>   File 
> "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/__init__.py", line 
> 479, in create_engine
> return strategy.create(*args, **kwargs)
>   File 
> "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/strategies.py", 
> line 87, in create
> dbapi = dialect_cls.dbapi(**dbapi_args)
>   File 
> "/usr/local/lib/python3.6/dist-packages/sqlalchemy/dialects/mysql/mysqldb.py",
>  line 118, in dbapi
> return __import__("MySQLdb")
> ModuleNotFoundError: No module named 'MySQLdb'
> {code}
> What could be the issue here?
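The traceback shows kombu building a SQLAlchemy engine for a mysql:// URL, which means some broker or result-backend setting still points at MySQL even though `celery_result_backend` was switched to PostgreSQL. A quick sanity-check sketch of which DB driver a configured URL implies (the driver mapping is illustrative):

```python
# Sketch: report which DB driver a Celery/SQLAlchemy URL implies, to
# spot a leftover mysql:// broker or result-backend setting.
from urllib.parse import urlsplit

def required_driver(url):
    scheme = urlsplit(url).scheme    # e.g. "db+postgresql", "sqla+mysql"
    backend = scheme.split("+")[-1]  # strip celery's "db+"/"sqla+" prefix
    drivers = {"mysql": "MySQLdb (mysqlclient)", "postgresql": "psycopg2"}
    return drivers.get(backend, "unknown")
```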





[jira] [Assigned] (AIRFLOW-3097) Capability for nested SubDags

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-3097:


Assignee: Sajid Sajid

> Capability for nested SubDags
> -
>
> Key: AIRFLOW-3097
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3097
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: operators
>Affects Versions: 1.8.0
>Reporter: John Longo
>Assignee: Sajid Sajid
>Priority: Critical
>  Labels: subdag
>
> Unless I'm doing something incorrectly, it appears that you cannot nest 
> SubDags, which would be a very helpful feature. I've created a simple 
> pipeline to demonstrate the failure case below. It produces the following 
> error in Airflow: Broken DAG: [/home/airflow/airflow/dags/test_dag.py] 
> 'NoneType' object has no attribute 'dag_id' 
> test_dag.py
> {code:java}
> from airflow import DAG
> from airflow.operators.subdag_operator import SubDagOperator
> from airflow.operators.dummy_operator import DummyOperator
> import datetime
> from datetime import timedelta
> from test_subdag1 import TestSubDag1
> startDate = '2018-09-20'
> default_args = {
> 'owner': 'airflow',
> 'depends_on_past': False,
> 'email': ['em...@airflow.com'],
> 'start_date': datetime.datetime(2018, 3, 20, 9, 0),
> 'email_on_failure': False,
> 'email_on_retry': False,
> 'retries': 5,
> 'retry_delay': timedelta(seconds=30),
> 'run_as_user': 'airflow'
> }
> Test_DAG = DAG('Test_DAG', default_args=default_args, 
> start_date=datetime.datetime(2018, 3, 20, 9, 0), schedule_interval=None, 
> catchup=False)
> test_subdag1 = SubDagOperator(subdag=TestSubDag1('Test_DAG', 'test_subdag1', 
> startDate),
> task_id='test_subdag1',
> dag=Test_DAG)
> TestDagConsolidateTask = DummyOperator(task_id='TestDag_Consolidate', 
> dag=Test_DAG)
> test_subdag1 >> TestDagConsolidateTask
> {code}
> test_subdag1.py
> {code:java}
> from airflow import DAG
> from airflow.operators.subdag_operator import SubDagOperator
> from airflow.operators.dummy_operator import DummyOperator
> from test_subdag2 import TestSubDag2
> import datetime
> from datetime import timedelta
> def TestSubDag1(parent_dag_name, child_dag_name, startDate):
> subdag = DAG(
> '%s.%s' % (parent_dag_name, child_dag_name),
> schedule_interval=None,
> start_date=startDate)
> test_subdag2 = SubDagOperator(subdag=TestSubDag2('%s.%s' % (parent_dag_name, 
> child_dag_name), 'test_subdag2', startDate),
> task_id='test_subdag2',
> dag=subdag) 
> Subdag1ConsolidateTask = DummyOperator(task_id='Subdag1_Consolidate', 
> dag=subdag)
> test_subdag2 >> Subdag1ConsolidateTask
> {code}
>  
> test_subdag2.py
> {code:java}
> from airflow import DAG
> from airflow.operators.dummy_operator import DummyOperator
> import datetime
> from datetime import timedelta
> def TestSubDag2(parent_dag_name, child_dag_name, startDate):
> subdag = DAG(
> '%s.%s' % (parent_dag_name, child_dag_name),
> schedule_interval=None,
> start_date=startDate)
> TestTask = DummyOperator(task_id='TestTask', dag=subdag)
> Subdag2ConsolidateTask = DummyOperator(task_id='Subdag2_Consolidate', 
> dag=subdag)
> TestTask >> Subdag2ConsolidateTask
> {code}





[jira] [Assigned] (AIRFLOW-5039) xcom_push execution_date setting is broken when core.enable_xcom_pickling=False

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5039:


Assignee: Sajid Sajid  (was: Rodrigo Martins de Oliveira)

> xcom_push execution_date setting is broken when 
> core.enable_xcom_pickling=False
> ---
>
> Key: AIRFLOW-5039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: xcom
>Affects Versions: 1.10.2, 1.10.3
>Reporter: Rodrigo Martins de Oliveira
>Assignee: Sajid Sajid
>Priority: Critical
>
> The function [xcom_push from airflow.models.taskinstance automatically sets 
> the execution_date on the 
> XCom|https://github.com/apache/airflow/blob/7f66d0a6dc13a790b4d2d31694cb50737983f82e/airflow/models/taskinstance.py#L1336].
>  While this is fine when pickling is enabled, it breaks when pickling is 
> disabled, because [XCom will then attempt to JSON-serialize a `datetime` 
> object|https://github.com/apache/airflow/blob/7f66d0a6dc13a790b4d2d31694cb50737983f82e/airflow/models/xcom.py#L229],
>  which is unsupported.
>  
> This issue is critical: having pickle enabled poses security risks, and [in 
> environments such as Google Cloud Composer, pickling is disabled for 
> good|https://cloud.google.com/composer/docs/concepts/airflow-configurations].
> Attempting to run DAGs whose tasks push content to XCom when 
> `core.enable_xcom_pickling=False` is set in `airflow.cfg` results in the 
> following stacktrace (tested in version 1.10.2-composer):
> {quote}
>  TypeError: Object of type 'datetime' is not JSON serializable
>  [2019-07-24 19:14:53,193] \{models.py:1796} ERROR - Object of type 
> 'datetime' is not JSON serializable
>  Traceback (most recent call last):
>  File "/usr/local/lib/airflow/airflow/models.py", line 1668, in _run_raw_task
>  self.xcom_push(key=XCOM_RETURN_KEY, value=result)
>  File "/usr/local/lib/airflow/airflow/models.py", line 2063, in xcom_push
>  execution_date=execution_date or self.execution_date)
>  File "/usr/local/lib/airflow/airflow/utils/db.py", line 73, in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/airflow/airflow/models.py", line 4785, in set
>  value = json.dumps(value).encode('UTF-8')
>  File "/opt/python3.6/lib/python3.6/json/__init__.py", line 231, in dumps
>  return _default_encoder.encode(obj)
>  File "/opt/python3.6/lib/python3.6/json/encoder.py", line 199, in encode
>  chunks = self.iterencode(o, _one_shot=True)
>  File "/opt/python3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
>  return _iterencode(o, 0)
>  File "/opt/python3.6/lib/python3.6/json/encoder.py", line 180, in default
>  o.__class__.__name__)
>  TypeError: Object of type 'datetime' is not JSON serializable
> {quote}
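The failure mode in the stack trace above is easy to reproduce outside Airflow. The sketch below shows the plain `json.dumps` failure and a generic workaround (serializing `datetime` objects to ISO-8601 strings via the `default` hook); this is an illustration only, not the patch that was applied in Airflow, and the `payload` dict is invented for the example.

```python
import json
from datetime import datetime, timezone

# A hypothetical XCOM-like payload that carries a datetime, as in the report.
payload = {"result": 42,
           "execution_date": datetime(2019, 7, 24, tzinfo=timezone.utc)}

# Plain json.dumps fails exactly as in the stack trace above.
try:
    json.dumps(payload)
except TypeError as exc:
    print(exc)  # Object of type datetime is not JSON serializable

# Workaround: convert datetimes to ISO-8601 strings before serialization.
serialized = json.dumps(
    payload,
    default=lambda o: o.isoformat() if isinstance(o, datetime) else str(o))
print(serialized)
```

Note that the workaround only helps when the caller controls the serialization; the bug here is that Airflow itself injected the `datetime` before serializing.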



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-2946) Connection times out on airflow worker

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-2946:


Assignee: Sajid Sajid

> Connection times out on airflow worker
> --
>
> Key: AIRFLOW-2946
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2946
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery, executors, worker
>Affects Versions: 1.10.0
> Environment: ubuntu 16.04, AWS EC2 
>Reporter: Avik Aggarwal
>Assignee: Sajid Sajid
>Priority: Critical
>
> Hi,
> I have an Airflow cluster running Celery executors, with PostgreSQL 
> installed on the same machine as the webserver and scheduler.
> After some time, the remote worker shows the error 'Connection timed out'; 
> Airflow queues the configured number of tasks in the pool, and the flow hangs 
> there until the queued tasks are deleted manually after stopping the 
> scheduler service.
>  
> Logs:
> [2018-08-23 13:44:03,954: ERROR/MainProcess] Pool callback raised exception: 
> OperationalError('(psycopg2.OperationalError) could not connect to server: 
> Connection timed out\n\tIs the server running on host "" and 
> accepting\n\tTCP/IP connections on port 5432?\n',)
>  Traceback (most recent call last):
>  File "/home/ubuntu/.local/lib/python2.7/site-packages/billiard/pool.py", 
> line 1747, in safe_apply_callback
>  fun(*args, **kwargs)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/worker/request.py", 
> line 367, in on_failure
>  self.id, exc, request=self, store_result=self.store_errors,
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/base.py", 
> line 157, in mark_as_failure
>  traceback=traceback, request=request)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/base.py", 
> line 322, in store_result
>  request=request, **kwargs)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py",
>  line 53, in _inner
>  return fun(*args, **kwargs)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py",
>  line 105, in _store_result
>  session = self.ResultSession()
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py",
>  line 99, in ResultSession
>  **self.engine_options)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/session.py",
>  line 60, in session_factory
>  self.prepare_models(engine)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/session.py",
>  line 55, in prepare_models
>  ResultModelBase.metadata.create_all(engine)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", 
> line 4005, in create_all
>  tables=tables)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
> line 1939, in _run_visitor
>  with self._optional_conn_ctx_manager(connection) as conn:
>  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
>  return self.gen.next()
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
> line 1932, in _optional_conn_ctx_manager
>  with self.contextual_connect() as conn:
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
> line 2123, in contextual_connect
>  self._wrap_pool_connect(self.pool.connect, None),
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
> line 2162, in _wrap_pool_connect
>  e, dialect, self)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
> line 1476, in _handle_dbapi_exception_noconnection
>  exc_info
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", 
> line 265, in raise_from_cause
>  reraise(type(exception), exception, tb=exc_tb, cause=cause)
>  File 
> "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", 
> line 2158, in _wrap_pool_connect
>  return fn()
>  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", 
> line 403, in connect
>  return _ConnectionFairy._checkout(self)
>  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", 
> line 791, in _checkout
>  fairy

[jira] [Assigned] (AIRFLOW-2844) Airflow Logs BrokenPipeException

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-2844:


Assignee: Sajid Sajid

> Airflow Logs BrokenPipeException
> 
>
> Key: AIRFLOW-2844
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2844
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.9.0
>Reporter: Kyle Bridenstine
>Assignee: Sajid Sajid
>Priority: Critical
>
> I'm using a clustered Airflow environment where I have four AWS ec2-instances 
> for the servers.
> *ec2-instances*
>  - Server 1: Webserver, Scheduler, Redis Queue, PostgreSQL Database
>  - Server 2: Webserver
>  - Server 3: Worker
>  - Server 4: Worker
> My setup has been working perfectly fine for three months now, but 
> sporadically, about once a week, I get a BrokenPipeError when Airflow is 
> attempting to log something.
> {code:java}
> Log file isn't local.
> Fetching here: 
> http://ip-1-2-3-4:8793/log/foobar/task_1/2018-07-13T00:00:00/1.log
> [2018-07-16 00:00:15,521] {cli.py:374} INFO - Running on host ip-1-2-3-4
> [2018-07-16 00:00:15,698] {models.py:1197} INFO - Dependencies all met for 
> 
> [2018-07-16 00:00:15,710] {models.py:1197} INFO - Dependencies all met for 
> 
> [2018-07-16 00:00:15,710] {models.py:1407} INFO - 
> 
> Starting attempt 1 of 1
> 
> [2018-07-16 00:00:15,719] {models.py:1428} INFO - Executing 
>  on 2018-07-13 00:00:00
> [2018-07-16 00:00:15,720] {base_task_runner.py:115} INFO - Running: ['bash', 
> '-c', 'airflow run foobar task_1 2018-07-13T00:00:00 --job_id 1320 --raw -sd 
> DAGS_FOLDER/datalake_digitalplatform_arl_workflow_schedule_test_2.py']
> [2018-07-16 00:00:16,532] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-16 00:00:16,532] {configuration.py:206} WARNING - section/key 
> [celery/celery_ssl_active] not found in config
> [2018-07-16 00:00:16,532] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-16 00:00:16,532] {default_celery.py:41} WARNING - Celery Executor 
> will run without SSL
> [2018-07-16 00:00:16,534] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-16 00:00:16,533] {_init_.py:45} INFO - Using executor CeleryExecutor
> [2018-07-16 00:00:16,597] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-16 00:00:16,597] {models.py:189} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/dags/datalake_digitalplatform_arl_workflow_schedule_test_2.py
> [2018-07-16 00:00:16,768] {cli.py:374} INFO - Running on host ip-1-2-3-4
> [2018-07-16 00:16:24,931] {logging_mixin.py:84} WARNING - — Logging error —
> [2018-07-16 00:16:24,931] {logging_mixin.py:84} WARNING - Traceback (most 
> recent call last):
> [2018-07-16 00:16:24,931] {logging_mixin.py:84} WARNING - File 
> "/usr/lib64/python3.6/logging/_init_.py", line 996, in emit
> self.flush()
> [2018-07-16 00:16:24,932] {logging_mixin.py:84} WARNING - File 
> "/usr/lib64/python3.6/logging/_init_.py", line 976, in flush
> self.stream.flush()
> [2018-07-16 00:16:24,932] {logging_mixin.py:84} WARNING - BrokenPipeError: 
> [Errno 32] Broken pipe
> [2018-07-16 00:16:24,932] {logging_mixin.py:84} WARNING - Call stack:
> [2018-07-16 00:16:24,933] {logging_mixin.py:84} WARNING - File 
> "/usr/bin/airflow", line 27, in 
> args.func(args)
> [2018-07-16 00:16:24,934] {logging_mixin.py:84} WARNING - File 
> "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 392, in run
> pool=args.pool,
> [2018-07-16 00:16:24,934] {logging_mixin.py:84} WARNING - File 
> "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 50, in 
> wrapper
> result = func(*args, **kwargs)
> [2018-07-16 00:16:24,934] {logging_mixin.py:84} WARNING - File 
> "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1488, in 
> _run_raw_task
> result = task_copy.execute(context=context)
> [2018-07-16 00:16:24,934] {logging_mixin.py:84} WARNING - File 
> "/usr/local/lib/python3.6/site-packages/airflow/operators/sensors.py", line 
> 78, in execute
> while not self.poke(context):
> [2018-07-16 00:16:24,934] {logging_mixin.py:84} WARNING - File 
> "/home/ec2-user/airflow/plugins/custom_plugins.py", line 35, in poke
> directory = os.listdir(full_path)
> [2018-07-16 00:16:24,934] {logging_mixin.py:84} WARNING - File 
> "/usr/local/lib/python3.6/site-packages/airflo
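The BrokenPipeError in the log above is the generic POSIX EPIPE condition: writing to a pipe whose read end has gone away, which is what the logging handler's `flush()` hits. A minimal stdlib sketch of that condition (using a raw `os.pipe`, not Airflow's log shipping):

```python
import os

r, w = os.pipe()
os.close(r)  # the read end goes away, like a disconnected log consumer

err = None
try:
    os.write(w, b"log line\n")
except BrokenPipeError as exc:
    err = exc.errno  # EPIPE (32 on Linux)
finally:
    os.close(w)

print(err)
```

CPython ignores SIGPIPE at startup, so the failed write surfaces as a catchable `BrokenPipeError` rather than killing the process.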

[jira] [Assigned] (AIRFLOW-7039) Specific DAG Schedule & DST Results in Skipped DAG Run

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-7039:


Assignee: Sajid Sajid

> Specific DAG Schedule & DST Results in Skipped DAG Run
> --
>
> Key: AIRFLOW-7039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.7
> Environment: Amazon Linux 2 AMI
>Reporter: Peter Kim
>Assignee: Sajid Sajid
>Priority: Critical
>  Labels: timezone
>
> *Scenario:* 
> The EC2 instance running Airflow is in Eastern Time (America/New_York), and 
> airflow.cfg > [core] > default_timezone=America/New_York (the offset changes 
> correctly on DST transitions).
> On the Monday morning after Daylight Saving Time took effect, a handful of 
> DAG runs were not executed as expected. The strange part is that these DAGs 
> were the only jobs that misbehaved; all other DAGs ran normally. 
> Additionally, only the first expected run after DST was skipped; subsequent 
> runs later that day were scheduled successfully.
> Here is the pattern observed:
> DAG schedule that skipped its first run:  (0 , * * 1,2,3,4,5)
> e.g. schedules M-F, with two distinct runs per day.
> DAGs that run at one time M-F, and DAGs that run at two times but not M-F, 
> did not experience this issue.  
>  
> Based on the logs, it appears that the missed run was never seen by the 
> scheduler at all (see below):
>  
>  
> 2020 03 06 6:30 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 06:31:01,220] \{logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,220] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=697
> [2020-03-06 06:31:01,222] \{scheduler_job.py:153} INFO - Started process 
> (PID=697) to work on /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,228] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/.py for tasks to queue
> [2020-03-06 06:31:01,228] \{logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,228] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,238] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,305] \{scheduler_job.py:1262} INFO - Processing 
> 
> [2020-03-06 06:31:01,348] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'") result = 
> self._query(query)
> [2020-03-06 06:31:01,362] \{scheduler_job.py:1272} INFO - Created   @ 2020-03-05T15:30:00+00:00: scheduled__2020-03-05T15:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 06:31:01,366] \{scheduler_job.py:740} INFO - Examining DAG run 
>  @ 2020-03-05 15:30:00+00:00: 
> scheduled__2020-03-05T15:30:00+00:00, externally triggered: False>
> [2020-03-06 06:31:01,389] \{scheduler_job.py:440} INFO - Skipping SLA check 
> for > because no tasks in DAG have SLAs
> [2020-03-06 06:31:01,395] \{scheduler_job.py:1613} INFO - Creating / updating 
> . 2020-03-05 15:30:00+00:00 [scheduled]> 
> in ORM
> [2020-03-06 06:31:01,414] \{scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/.py took 0.192 seconds
> 20200306 10 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 10:30:00,083] \{logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,082] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=16194
> [2020-03-06 10:30:00,085] \{scheduler_job.py:153} INFO - Started process 
> (PID=16194) to work on /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,090] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/.py for tasks to queue
> [2020-03-06 10:30:00,090] \{logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,090] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,099] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,159] \{scheduler_job.py:1262} INFO - Processing 
> 
> [2020-03-06 10:30:00,193] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
>
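The offset shift behind this report can be illustrated with Python's stdlib `zoneinfo`. The dates below are chosen around the 2020-03-08 US DST transition to match the logs (10:30 America/New_York is 15:30 UTC before the change, as in the `scheduled__2020-03-05T15:30:00+00:00` run above, and 14:30 UTC after); this sketch only demonstrates the offset change, not the scheduler bug itself.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+

ny = ZoneInfo("America/New_York")

# Friday 2020-03-06 10:30 local time, before DST: UTC-5 (EST).
before = datetime(2020, 3, 6, 10, 30, tzinfo=ny)

# Monday 2020-03-09 10:30 local time, after DST: UTC-4 (EDT).
after = datetime(2020, 3, 9, 10, 30, tzinfo=ny)

print(before.utcoffset())  # -5 hours
print(after.utcoffset())   # -4 hours
```

A scheduler that computes the next run in UTC from the previous run's UTC timestamp, rather than from the local-time cron expression, will land one hour off across this boundary, which is consistent with the single skipped run described.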

[jira] [Assigned] (AIRFLOW-5881) Dag gets stuck in "Scheduled" State when scheduling a large number of tasks

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5881:


Assignee: Sajid Sajid

> Dag gets stuck in "Scheduled" State when scheduling a large number of tasks
> ---
>
> Key: AIRFLOW-5881
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5881
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.6
>Reporter: David Hartig
>Assignee: Sajid Sajid
>Priority: Critical
> Attachments: 2 (1).log, airflow.cnf
>
>
> Running with the KubernetesExecutor in an AKS cluster, when we upgraded to 
> version 1.10.6 we noticed that all the DAGs stop making progress: tasks 
> start running and immediately exit with the following message:
> "Instance State' FAILED: Task is in the 'scheduled' state which is not a 
> valid state for execution. The task must be cleared in order to be run."
> See the attached log file for the worker. Nothing seems out of the ordinary 
> in the scheduler log. 
> Reverting to 1.10.5 clears the problem.
> Note that at the time of the failure maybe 100 or so tasks are in this 
> state, with 70 coming from one highly parallelized DAG. Clearing the 
> scheduled tasks just makes them reappear shortly thereafter. Marking them 
> "up_for_retry" results in one being executed, but then the system is stuck 
> in the original zombie state. 
> Also attached is a redacted Airflow config file. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-2320) Can not run DAGs since upgraded from Airflow 1.7

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-2320:


Assignee: Sajid Sajid

> Can not run DAGs since upgraded from Airflow 1.7
> 
>
> Key: AIRFLOW-2320
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2320
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Gregory Dougherty
>Assignee: Sajid Sajid
>Priority: Blocker
>  Labels: newbie, run
> Attachments: example_bash_operator.py
>
>
> We installed Airflow 1.7 and used it for several months. I used pip to 
> uninstall Airflow 1.7 and install 1.9 (gory details are 
> [here|https://stackoverflow.com/questions/49544320/airflow-initdb-failed-importerror-no-module-named-log-logging-mixin]).
>  
> Since then, I haven't had a single DAG run. I renamed and moved log files to 
> match the 1.9 expectations, but still nothing happens.
>  
> I have a "run every 40 minutes" DAG; it hasn't run since 3/28. When I 
> manually trigger it, no log file is created and nothing happens, except that 
> I get a running DAG listed under "DAG Runs" (I do NOT get anything listed 
> under "Recent Tasks", and "Last Run" does not get updated).
>  
> I have a "run once" DAG that I created. I triggered it; same behavior.
>  
> I have also tried running the example_bash_operator DAG. Same behavior. 
> (I've attached the example_bash_operator.py file, so we're clear what I 
> tried to run.)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-5191) SubDag is marked failed

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-5191:


Assignee: Sajid Sajid

> SubDag is marked failed 
> 
>
> Key: AIRFLOW-5191
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5191
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, DagRun
>Affects Versions: 1.10.4
> Environment: CentOS 7, Maria-DB, python 3.6.7, Airflow 1.10.4
>Reporter: Oliver Ricken
>Assignee: Sajid Sajid
>Priority: Blocker
>
> Dear all,
> after upgrading from Airflow version 1.10.2 to 1.10.4, we are experiencing 
> strange and very problematic behaviour of SubDags (which are crucial for our 
> environment and used frequently).
> Tasks inside a SubDag that fail and await retry ("up_for_retry") mark the 
> SubDag "failed" (while in 1.10.2, the SubDag stayed in the "running" state). 
> This is particularly problematic for downstream tasks that depend on the 
> state of the SubDag. Since we have downstream tasks triggered on "all_done", 
> the downstream task is triggered by the "failed" SubDag although a 
> SubDag-internal task is awaiting retry and might (in our case: most likely) 
> yield successfully processed data. This data is thus not available to the 
> prematurely triggered task downstream of the SubDag.
> This is a severe problem for us, worth rolling back to 1.10.2 if there is no 
> quick solution or work-around!
> We urgently need help on this matter.
> Thanks a lot in advance; any suggestions and input are highly appreciated!
> Cheers
> Oliver



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-6580) Killing or marking a task as failed does not kill the Pod in the backend

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6580:


Assignee: Sajid Sajid  (was: Daniel Imberman)

> Killing or marking a task as failed does not kill the Pod in the backend
> 
>
> Key: AIRFLOW-6580
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6580
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, executor-kubernetes
>Affects Versions: 1.10.2
>Reporter: Ram
>Assignee: Sajid Sajid
>Priority: Blocker
>
> We're using KubernetesPodOperator in Airflow 1.10.2.
>  Our pods have NodeAffinity and Tolerations set. 
>  Sometimes a pod gets stuck in the Pending state.
>  *But when the task fails, the Pending pod is not killed.*
> Related to this, when we manually fail a task, the DAG task stops running, 
> but the Pod does not get killed and continues running.
> We have tried setting 'is_delete_operator_pod' to True, but for some reason 
> the Pod then gets killed almost instantly when the execution starts. We have 
> not been able to debug the issue behind this.
> Does the latest version of Airflow account for this?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-2697) Drop snakebite in favour of pyarrow

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-2697:


Assignee: Sajid Sajid  (was: Julian de Ruiter)

> Drop snakebite in favour of pyarrow
> ---
>
> Key: AIRFLOW-2697
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2697
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.9.0
>Reporter: Julian de Ruiter
>Assignee: Sajid Sajid
>Priority: Blocker
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>
> The current HdfsHook relies on the snakebite library, which is unfortunately 
> not compatible with Python 3. To add Python 3 support for the HdfsHook 
> requires switching to a different library for interacting with HDFS. The 
> hdfs3 library is an attractive alternative, as it supports Python 3 and seems 
> to be stable and relatively well supported.
> Update: hdfs3 no longer receives updates. The best library right now 
> seems to be pyarrow: https://arrow.apache.org/docs/python/filesystems.html
> Therefore I would like to switch to pyarrow instead of hdfs3.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-4176) [security] webui shows password - admin/log/?flt1_extra_contains=conn_password

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-4176:


Assignee: Sajid Sajid  (was: Giorgio Zoppi)

> [security] webui shows password - admin/log/?flt1_extra_contains=conn_password
> --
>
> Key: AIRFLOW-4176
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4176
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: security, ui
>Affects Versions: 1.10.2
>Reporter: t oo
>Assignee: Sajid Sajid
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: airf.png
>
>
> First setup hivecli connection:
> {noformat}
> source /home/ec2-user/venv/bin/activate; airflow connections -a \ 
>   --conn_id query_hive --conn_type hive_cli --conn_host domainhere \
>   --conn_port 1 --conn_schema default \
>   --conn_extra "{\"use_beeline\":\"true\", 
> \"ssl-options\":\"ssl=true;sslTrustStore=path-${RUNTIME_ENV}.jks;trustStorePassword=${QUERY_JKS_PASW}\"}"
>  \
>   --conn_login ${QUERY_HIVE_USER} --conn_password ${QUERY_HIVE_PASW}
> {noformat}
>  
> On the web UI, navigate to domain/admin/log/?flt1_extra_contains=conn_password
> and you will be able to see the cleartext username and password!
> See attachment.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-7043) CLONE - The mirrored data folder for BigQuery_operators can't be accessed on manual runs

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-7043:


Assignee: Sajid Sajid

> CLONE - The mirrored data folder for BigQuery_operators can't be accessed on 
> manual runs
> 
>
> Key: AIRFLOW-7043
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7043
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, DAG
>Affects Versions: 1.10.9
> Environment: Windows 10 Pro, i7-4790S Processor, 16MB RAM
>Reporter: Valerio Dacanal
>Assignee: Sajid Sajid
>Priority: Blocker
>
> Using Airflow through Google Cloud Composer, I've placed a series of text 
> files in the /data folder of the bucket, as the documentation suggests for 
> storing external data files, and have written a BigQueryOperator of the 
> following format: 
> {{ 
> with open('/home/airflow/gcs/data/{0}.txt'.format(
>         models.Variable.get('tmpcreatives')), 'r') as tmp_file:
>     tmp_transfer = tmp_file.read()
> bq_sql_tmptransfer = bigquery_operator.BigQueryOperator(
>     task_id='task1',
>     sql=""" {0} """.format(tmp_transfer.format(
>         tradata=dag.params["ClientDatabase"] + dag.params["bq_param1"],
>         rawdata=dag.params["ClientDatabase"] + dag.params["bq_param2"])),
>     use_legacy_sql=False)
> }} 
> On scheduled runs, the DAG runs fine and completes the task. However, if I 
> try to manually trigger the DAG or look at the run logs, it comes up with the 
> message 'DAG "DataCreation_DAG_" seems to be missing'. This only happens when 
> I use the open() function; if I replace that section with a hardcoded string, 
> the DAG works fine even on manual runs. I think it's a bug with mounting the 
> /data folder from the cloud shell bucket, but I'm not entirely sure.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-6960) Airflow Celery worker : command returned non-zero exit status 2

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6960:


Assignee: Sajid Sajid

> Airflow Celery worker : command returned non-zero exit status 2
> ---
>
> Key: AIRFLOW-6960
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6960
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Uragalage Thilanka Mahesh Perera
>Assignee: Sajid Sajid
>Priority: Blocker
>
> I am getting below error and trying to fix it for hours and did get any luck. 
> Below logs are from airflow celery worker.
> {code:java}
>   airflow command error: argument subcommand: invalid choice: 'tasks' (choose 
> from 'backfill', 'list_dag_runs', 'list_tasks', 'clear', 'pause', 'unpause', 
> 'trigger_dag', 'delete_dag', 'show_dag', 'pool', 'variables', 'kerberos', 
> 'render', 'run', 'initdb', 'list_dags', 'dag_state', 'task_failed_deps', 
> 'task_state', 'serve_logs', 'test', 'webserver', 'resetdb', 'upgradedb', 
> 'checkdb', 'shell', 'scheduler', 'worker', 'flower', 'version', 
> 'connections', 'create_user', 'delete_user', 'list_users', 'sync_perm', 
> 'next_execution', 'rotate_fernet_key'), see help above.
>
> usage: airflow [-h] 
> {backfill,list_dag_runs,list_tasks,clear,pause,unpause,trigger_dag,delete_dag,show_dag,pool,variables,kerberos,render,run,initdb,list_dags,dag_state,task_failed_deps,task_state,serve_logs,test,webserver,resetdb,upgradedb,checkdb,shell,scheduler,worker,flower,version,connections,create_user,delete_user,list_users,sync_perm,next_execution,rotate_fernet_key}
>  ...
>
> positional arguments:
>   {backfill,list_dag_runs,list_tasks,clear,pause,unpause,trigger_dag,delete_dag,show_dag,pool,variables,kerberos,render,run,initdb,list_dags,dag_state,task_failed_deps,task_state,serve_logs,test,webserver,resetdb,upgradedb,checkdb,shell,scheduler,worker,flower,version,connections,create_user,delete_user,list_users,sync_perm,next_execution,rotate_fernet_key}
>     sub-command help
>   backfill          Run subsections of a DAG for a specified date range. If 
>                     reset_dag_run option is used, backfill will first prompt 
>                     users whether airflow should clear all the previous 
>                     dag_run and task_instances within the backfill date 
>                     range. If rerun_failed_tasks is used, backfill will auto 
>                     re-run the previous failed task instances within the 
>                     backfill date range.
>   list_dag_runs     List dag runs given a DAG id. If state option is given, 
>                     it will only search for all the dagruns with the given 
>                     state. If no_backfill option is given, it will filter 
>                     out all backfill dagruns for given dag id.
>   list_tasks        List the tasks within a DAG
>   clear             Clear a set of task instance, as if they never ran
>   pause             Pause a DAG
>   unpause           Resume a paused DAG
>   trigger_dag       Trigger a DAG run
>   delete_dag        Delete all DB records related to the specified DAG
>   show_dag          Displays DAG's tasks with their dependencies
>   pool              CRUD operations on pools
>   variables         CRUD operations on variables
>   kerberos          Start a kerberos ticket renewer
>   render            Render a task instance's template(s)
>   run               Run a single task instance
>   initdb            Initialize the metadata database
>   list_dags         List all the DAGs
>   dag_state         Get the status of a dag run
>   task_failed_deps  Returns the unmet dependencies for a task instance from 
>                     the perspective of the scheduler. In other words, why a 
>                     task instance doesn't get scheduled and then queued by 
>                     the scheduler, and then run by an executor).
>   task_state        Get the status of a task instance
>   serve_logs        Serve logs generate by worker
>   test              Test a task instance. This will run a task without 
>                     checking for dependencies or recording its state in the 
>                     database.
>   webserver         Start a Airflow webserver instance
>   resetdb           Burn down and rebuild the metadata database
>   upgradedb         Upgrade the metadata database to latest version
>   checkdb           Check if the database can be reached.
>   shell             Runs a shell to access the database
>   scheduler         Start a scheduler instance
>   worker            Start a Celery worker node
>   flower            Start a Celery Flower
>   version           Show the version
>   connections       List/Add/Delete connections
>   create_user       Create an account for the Web UI (FAB-based)
>   delete_user       Delete an account for the Web UI
>   list_users        List accounts for the Web UI
>   sync_perm         Update permissions for existing roles and DAGs.
>   next_execution    Get the next execution datetime of a DAG.
>   rotate_fernet_key Rotate all encrypted connection credentials and 
>                     variables; see 
>                     https://airflow.readthedocs.io/en/stable/howto/secure-
>                     connections.html#rotating-encryption-keys.
>
> optional arguments:

[jira] [Assigned] (AIRFLOW-7027) The mirrored data folder for BigQuery_operators can't be accessed on manual runs

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-7027:


Assignee: Sajid Sajid

> The mirrored data folder for BigQuery_operators can't be accessed on manual 
> runs
> 
>
> Key: AIRFLOW-7027
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7027
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, DAG
>Affects Versions: 1.10.9
> Environment: Windows 10 Pro, i7-4790S Processor, 16MB RAM
>Reporter: Jonny Evans
>Assignee: Sajid Sajid
>Priority: Blocker
>
> Using Airflow through Google Cloud Composer, I've placed a series of text 
> files in the /data folder of the bucket, as suggested in the documentation for 
> where to store external data files, and have written a BigQueryOperator of the 
> following format:
> {{
> with open('/home/airflow/gcs/data/{0}.txt'.format(models.Variable.get('tmpcreatives')), 'r') as tmp_file:
>     tmp_transfer = tmp_file.read()
>
> bq_sql_tmptransfer = bigquery_operator.BigQueryOperator(
>     task_id='task1',
>     sql=""" {0} """.format(tmp_transfer.format(
>         tradata=dag.params["ClientDatabase"] + dag.params["bq_param1"],
>         rawdata=dag.params["ClientDatabase"] + dag.params["bq_param2"])),
>     use_legacy_sql=False,
> )
> }}
> On scheduled runs the DAG runs fine and completes the task. However, if I 
> try to manually trigger the DAG or look at the run logs, it comes up with the 
> message 'DAG "DataCreation_DAG_" seems to be missing'. This only happens when 
> I use the open() function; if I replace that section with a hardcoded string, 
> the DAG works fine even on manual runs. I think it's a bug with mounting the 
> /data folder from the Cloud Shell bucket, but I'm not entirely sure.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-7120) WE HAVE MOVED TO GITHUB ISSUE

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-7120:


Assignee: Sajid Sajid  (was: Daniel Imberman)

> WE HAVE MOVED TO GITHUB ISSUE
> -
>
> Key: AIRFLOW-7120
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7120
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api, authentication, aws, backfill, backport-packages, 
> breeze, build, celery, ci, cli, configuration, contrib, core, DAG, DagRun, 
> database, dependencies, documentation, examples, executor-kubernetes, 
> executors, gcp, hooks, lineage, logging, models, mysql, operators, packages, 
> plugins, pre-commit, project-management, pylint, scheduler, security, 
> serialization, tests, ui, utils, webserver, worker, xcom
>Affects Versions: 1.10.10
>Reporter: Kamil Bregula
>Assignee: Sajid Sajid
>    Priority: Blocker
>
> Hello,
> We no longer use Jira for new tickets. Please, use Github Issue.
> https://github.com/apache/airflow/issues
> Jira is just an archive for old tickets. You can no longer create new 
> tickets. You can still comment on old tickets, but you can't expect a 
> response. If the ticket is important to you, please create a new ticket in 
> Github Issue and give the link to the Jira ticket.
> Big love,



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-6778) Add a DAGs PVC Mount Point Option for Workers under Kubernetes Executor

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6778:


Assignee: Sajid Sajid  (was: Daniel Imberman)

> Add a DAGs PVC Mount Point Option for Workers under Kubernetes Executor
> ---
>
> Key: AIRFLOW-6778
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6778
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor-kubernetes, worker
>Affects Versions: 1.10.6, 1.10.7, 1.10.8, 1.10.9
>Reporter: Brandon Willard
>Assignee: Sajid Sajid
>Priority: Blocker
>  Labels: kubernetes, options
>
> The worker pods generated by the Kubernetes Executor force the DAGs PVC to be 
> mounted at the Airflow DAGs folder.  This, combined with a general inability 
> to specify arbitrary PVCs on workers (see AIRFLOW-3126 and the 
> linked/duplicated issues), severely constrains the usability of worker pods 
> and the Kubernetes Executor as a whole.
>  
> For example, if a DAGs-containing PVC is rooted at a Python package (e.g. 
> {{package/}}) that needs to be installed on each worker (e.g. DAGs in 
> {{package/dags/}}, package install point at {{package/setup.py}}, and Airflow 
> DAGs location {{/airflow/dags}}), then the current static mount point logic 
> will only allow a worker to directly mount the entire package into the 
> Airflow DAGs location  —  while the actual DAGs are in a subdirectory — or 
> exclusively mount the package's sub-path {{package/dags}} (using the existing 
> {{kubernetes.dags_volume_subpath}} config option).  While the latter is at 
> least correct, it completely foregoes the required parent directory and it 
> makes the requisite package unavailable for installation (e.g. the files 
> under {{package/}} are not available).
>  
> -In general, the only approach that seems to work for the Kubernetes Executor 
> is to specify a worker image with all DAG dependencies pre-loaded, which 
> largely voids the usefulness of a single DAGs PVC that can be dynamically 
> updated.  At best, one can include a {{requirements.txt}} in the PVC and use 
> it in tandem with an entry-point script built into the image, but that still 
> doesn't help with source installations of custom packages stored and updated 
> in a PVC.-
> Edit: This isn't even possible, because worker pods are created using [the 
> {{command}} field instead of 
> {{args}}|https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#notes]!
>  
> A quick fix for this situation is to allow one to specify the DAGs PVC mount 
> point.  With this option, one can mount the PVC anywhere and specify an 
> Airflow DAGs location that works in conjunction with the mount point (e.g. 
> mount the PVC at {{/airflow/package}} and independently set the Airflow DAGs 
> location to {{/airflow/package/dags}}).  This option would — in many cases — 
> obviate the need for the marginally useful {{kubernetes.dags_volume_subpath}} 
> options, as well.
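For concreteness, a sketch of how the proposal could look in airflow.cfg. The `dags_volume_mount_point` name is hypothetical (this is the option being requested); `dags_volume_claim`, `dags_volume_subpath`, and `dags_folder` are existing options:

```ini
[kubernetes]
dags_volume_claim = dags-pvc
; today: the PVC is forced onto the DAGs folder, optionally narrowed by a sub-path
dags_volume_subpath = package/dags
; proposed (hypothetical option name): mount the whole PVC elsewhere so the
; package root stays visible for installation
dags_volume_mount_point = /airflow/package

[core]
dags_folder = /airflow/package/dags
```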



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-6780) Use args Instead of command in Kubernetes Executor Worker Pods

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-6780:


Assignee: Sajid Sajid

> Use args Instead of command in Kubernetes Executor Worker Pods
> --
>
> Key: AIRFLOW-6780
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6780
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor-kubernetes
>Affects Versions: 1.10.7, 1.10.8, 1.10.9
>Reporter: Brandon Willard
>Assignee: Sajid Sajid
>Priority: Blocker
>
> The worker pods created by the Kubernetes Executor are hard-coded to use [the 
> {{command}} field instead of 
> {{args}}|https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#notes].
>   The result is a worker pod that ignores its image's entry-point and 
> prevents the standard means of executing container preparation code --- among 
> other things.
> This --- alongside the issues stated in 
> [AIRFLOW-6778|https://issues.apache.org/jira/browse/AIRFLOW-6778] and 
> [AIRFLOW-3126|https://issues.apache.org/jira/browse/AIRFLOW-3126] --- renders 
> worker nodes virtually incompatible with a dynamic, shared DAGs volume 
> (unless dependencies are installed in the image and never change).
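The distinction matters because of how Kubernetes combines these fields with the image's ENTRYPOINT/CMD; a small pure-Python sketch of the documented rule (no cluster needed, values illustrative):

```python
def effective_invocation(command, args, image_entrypoint, image_cmd):
    """How Kubernetes combines a container's command/args with the image
    defaults: command replaces ENTRYPOINT, args replaces CMD, and CMD is
    only used when neither command nor args is set."""
    if command is not None:
        return command + (args or [])
    if args is not None:
        return image_entrypoint + args
    return image_entrypoint + image_cmd

entry, cmd = ["/entrypoint.sh"], ["airflow", "--help"]

# Hard-coding `command` (what the executor does) skips the entry-point entirely:
hardcoded = effective_invocation(["airflow", "run", "dag", "task"], None, entry, cmd)
# Passing the task as `args` instead keeps the entry-point's preparation code:
with_args = effective_invocation(None, ["airflow", "run", "dag", "task"], entry, cmd)
```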



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-3036) Upgrading to Airflow 1.10 not possible using GCP Cloud SQL for MYSQL

2022-02-25 Thread Sajid Sajid (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sajid Sajid reassigned AIRFLOW-3036:


Assignee: Sajid Sajid  (was: Precious Hills)

> Upgrading to Airflow 1.10 not possible using GCP Cloud SQL for MYSQL
> 
>
> Key: AIRFLOW-3036
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3036
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Affects Versions: 1.10.0
> Environment: Google Cloud Platform, Google Kubernetes Engine, Airflow 
> 1.10 on Debian Stretch, Google Cloud SQL MySQL
>Reporter: Smith Mathieu
>Assignee: Sajid Sajid
>Priority: Blocker
>  Labels: 1.10, gcp, google, google-cloud-sql
>
> The upgrade path to airflow 1.10 seems impossible for users of MySQL in 
> Google's Cloud SQL service given new mysql requirements for 1.10.
>  
> When executing "airflow upgradedb"
> ```
>  INFO [alembic.runtime.migration] Running upgrade d2ae31099d61 -> 
> 0e2a74e0fc9f, Add time zone awareness
>  Traceback (most recent call last):
>  File "/usr/local/bin/airflow", line 32, in 
>  args.func(args)
>  File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 1002, 
> in initdb
>  db_utils.initdb(settings.RBAC)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 92, 
> in initdb
>  upgradedb()
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 346, 
> in upgradedb
>  command.upgrade(config, 'heads')
>  File "/usr/local/lib/python3.6/site-packages/alembic/command.py", line 174, 
> in upgrade
>  script.run_env()
>  File "/usr/local/lib/python3.6/site-packages/alembic/script/base.py", line 
> 416, in run_env
>  util.load_python_file(self.dir, 'env.py')
>  File "/usr/local/lib/python3.6/site-packages/alembic/util/pyfiles.py", line 
> 93, in load_python_file
>  module = load_module_py(module_id, path)
>  File "/usr/local/lib/python3.6/site-packages/alembic/util/compat.py", line 
> 68, in load_module_py
>  module_id, path).load_module(module_id)
>  File "", line 399, in 
> _check_name_wrapper
>  File "", line 823, in load_module
>  File "", line 682, in load_module
>  File "", line 265, in _load_module_shim
>  File "", line 684, in _load
>  File "", line 665, in _load_unlocked
>  File "", line 678, in exec_module
>  File "", line 219, in _call_with_frames_removed
>  File "/usr/local/lib/python3.6/site-packages/airflow/migrations/env.py", 
> line 91, in 
>  run_migrations_online()
>  File "/usr/local/lib/python3.6/site-packages/airflow/migrations/env.py", 
> line 86, in run_migrations_online
>  context.run_migrations()
>  File "", line 8, in run_migrations
>  File 
> "/usr/local/lib/python3.6/site-packages/alembic/runtime/environment.py", line 
> 807, in run_migrations
>  self.get_context().run_migrations(**kw)
>  File "/usr/local/lib/python3.6/site-packages/alembic/runtime/migration.py", 
> line 321, in run_migrations
>  step.migration_fn(**kw)
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/migrations/versions/0e2a74e0fc9f_add_time_zone_awareness.py",
>  line 46, in upgrade
>  raise Exception("Global variable explicit_defaults_for_timestamp needs to be 
> on (1) for mysql")
>  Exception: Global variable explicit_defaults_for_timestamp needs to be on 
> (1) for mysql
>  ```
>   
> Reading documentation for upgrading to airflow 1.10, it seems the requirement 
> for explicit_defaults_for_timestamp=1 was intentional. 
>  
> However,  MySQL on Google Cloud SQL does not support configuring this 
> variable and it is off by default. Users of MySQL and Cloud SQL do not have 
> an upgrade path to 1.10. Alas, so close to the mythical Kubernetes Executor.
> In GCP, Cloud SQL is _the_ hosted MySQL solution. 
> [https://cloud.google.com/sql/docs/mysql/flags]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-02-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17495037#comment-17495037
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

omoumniabdou commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1046086644


   The problem for us was that we had one DAG that reached 32 parallel 
runnable tasks (32 leaf tasks), which was the value of the `parallelism` 
parameter. After that, the scheduler was not able to run (or queue) any task.
   Increasing this parameter solved the problem for us.
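   The ceiling described here comes from the executor's slot accounting; roughly (a simplified sketch, not Airflow's exact code):

   ```python
   def open_slots(parallelism, running_count):
       """BaseExecutor-style slot accounting (simplified): once the number of
       running task instances reaches `parallelism`, no more queued tasks are
       picked up, no matter how many are waiting."""
       return max(0, parallelism - running_count)

   # With parallelism = 32 and 32 runnable leaf tasks occupying every slot,
   # nothing else can be scheduled until a slot frees up:
   stuck = open_slots(32, 32)
   healthy = open_slots(64, 32)
   ```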


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I update to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-02-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494898#comment-17494898
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1045871873


   @pbotros No, we have not solved this problem yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I update to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-02-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17493003#comment-17493003
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

pbotros commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1041101221


   We also run into this fairly often, despite not using any sensors. We only 
started getting this error once we moved our Airflow database to the cloud 
(AWS RDS); our Airflow webserver and scheduler run on desktop workstations 
on-premises. As others have suggested in this thread, this is a very annoying 
problem that requires manual intervention.
   
   @ghostbody any progress on determining if that's the correct root cause?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I update to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-3379) Support for AWS triggers (S3, SNS, SQS, CloudWatch, etc.)

2022-02-15 Thread Anonymous (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-3379:
--

Assignee: Brylie Christopher Oxley  (was: Rohan Bhanderi)

> Support for AWS triggers (S3, SNS, SQS, CloudWatch, etc.)
> -
>
> Key: AIRFLOW-3379
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3379
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws
>Reporter: Brylie Christopher Oxley
>Assignee: Brylie Christopher Oxley
>Priority: Major
>
> We would like to build a 'reactive', or event-driven, data pipeline. From 
> what I can gather, the primary Airflow DAG trigger is a timer (cron). 
> However, it would be useful to trigger dags on external events, namely AWS 
> events like S3 file, SQS, SNS, and/or CloudWatch.
> I note there is an experimental API, which could be triggered from an AWS 
> Lambda, but would add boilerplate and brittleness to the data pipeline.
> What are our options for triggering Airflow DAGS from external AWS events?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (AIRFLOW-3379) Support for AWS triggers (S3, SNS, SQS, CloudWatch, etc.)

2022-02-15 Thread Anonymous (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-3379:
--

Assignee: Rohan Bhanderi

> Support for AWS triggers (S3, SNS, SQS, CloudWatch, etc.)
> -
>
> Key: AIRFLOW-3379
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3379
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws
>Reporter: Brylie Christopher Oxley
>Assignee: Rohan Bhanderi
>Priority: Major
>
> We would like to build a 'reactive', or event-driven, data pipeline. From 
> what I can gather, the primary Airflow DAG trigger is a timer (cron). 
> However, it would be useful to trigger dags on external events, namely AWS 
> events like S3 file, SQS, SNS, and/or CloudWatch.
> I note there is an experimental API, which could be triggered from an AWS 
> Lambda, but would add boilerplate and brittleness to the data pipeline.
> What are our options for triggering Airflow DAGS from external AWS events?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (AIRFLOW-6602) Make "executor_config" templated field to support dynamic parameters for kubernetes executor

2022-02-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489071#comment-17489071
 ] 

ASF GitHub Bot commented on AIRFLOW-6602:
-

BayanAzima commented on pull request #7230:
URL: https://github.com/apache/airflow/pull/7230#issuecomment-1032967030


   I'd like to template out a `sub_path` of a PV I have, so that data I store 
there for one DAG run isn't exposed to other DAG runs. See the example below.
   
   I notice that this issue is pretty old. Was this added in v2, or is there 
another way I can do this now? 
   
   ```
   KUBERNETES_WORKSPACE_PVC = {
       "pod_override": k8s.V1Pod(
           spec=k8s.V1PodSpec(
               containers=[
                   k8s.V1Container(
                       name="base",
                       volume_mounts=[
                           k8s.V1VolumeMount(
                               name=Constants.KUBERNETES_WORKSPACE_PVC,
                               mount_path='/opt/airflow/mnt/workspace',
                               # sub_path="{{ ti.xcom_pull(key='run_group_id') }}",
                               read_only=False,
                           )
                       ],
                   )
               ],
               volumes=[
                   k8s.V1Volume(
                       name=Constants.KUBERNETES_WORKSPACE_PVC,
                       persistent_volume_claim=k8s.V1PersistentVolumeClaimVolumeSource(
                           claim_name=Constants.KUBERNETES_WORKSPACE_PVC
                       ),
                   )
               ],
           )
       ),
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Make "executor_config" templated field to support dynamic parameters for 
> kubernetes executor
> 
>
> Key: AIRFLOW-6602
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6602
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: executor-kubernetes, executors
>Affects Versions: 1.10.7
>Reporter: Jun Xie
>Assignee: Jun Xie
>Priority: Major
>
> When running Airflow with the Kubernetes Executor, one specifies the 
> configuration through "executor_config". At the moment, this field is not 
> templated, meaning that it cannot carry dynamic parameters. As an experiment, 
> I created MyPythonOperator, which inherits from PythonOperator but has 
> "executor_config" added to template_fields. However, that change by itself 
> isn't enough, because Airflow first creates a pod from the unrendered 
> executor_config and only then runs the task inside the pod (it is the run 
> that triggers the Jinja template rendering).
> See an example config below showing a use case where one can mount a dynamic 
> "subPath" into the container:
>  
> {code:java}
> executor_config = {
> "KubernetesExecutor": {
> "image": "some_image",
> "request_memory": "2Gi",
> 'request_cpu': '1',
> "volumes": [
> {
> "name": "data",
> "persistentVolumeClaim": {"claimName": "some_claim_name"},
> },
> ],
> "volume_mounts": [
> {
> "mountPath": "/code",
> "name": "data",
> "subPath": "/code/{{ dag_run.conf['branch_name'] }}"
> },
> ]
> }
> }
> {code}
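Rendering a templated executor_config would have to walk nested containers; a rough stand-in for that step (str.format instead of Jinja, purely illustrative of what "templating this field" means):

```python
def render_nested(value, context):
    """Recursively fill str.format-style placeholders through dicts and
    lists — a simplified stand-in for how template_fields get rendered
    (real Airflow uses Jinja, not str.format)."""
    if isinstance(value, str):
        return value.format(**context)
    if isinstance(value, dict):
        return {k: render_nested(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [render_nested(v, context) for v in value]
    return value

executor_config = {
    "KubernetesExecutor": {
        "volume_mounts": [
            {"mountPath": "/code", "name": "data", "subPath": "/code/{branch_name}"},
        ]
    }
}
rendered = render_nested(executor_config, {"branch_name": "feature-x"})
```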
>  
>  
>  
> I then did a further experiment: in trigger_tasks() from 
> airflow/executors/base_executor.py, right before execute_async() is called, 
> I called simple_ti.render_templates(), which triggers the rendering, so that 
> kubernetes_executor.execute_async() picks up the resolved parameters.
>  
> {code:java}
> # current behavior
> for i in range(min(open_slots, len(self.queued_tasks))):
>     key, (command, _, queue, simple_ti) = sorted_queue.pop(0)
>     self.queued_tasks.pop(key)
>     self.running[key] = command
>     self.execute_async(key=key,
>                        command=command,
>                        queue=queue,
>                        ex

[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-01-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476036#comment-17476036
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

val2k edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1012934214


   We face the same issue with tasks that stay indefinitely in a queued status, 
except that we don't see tasks as `up_for_retry`. It happens randomly within 
our DAGs. The task will stay in a queued status forever until we manually make 
it fail. We **don't use any sensors** at all. We are on an AWS MWAA instance 
(Airflow 2.0.2).
   
   Example logs:
   Scheduler:
   ```
   [2022-01-14 08:03:32,868] {{scheduler_job.py:1239}} ERROR - Executor reports 
task instance  finished 
(failed) although the task says its queued. (Info: None) Was the task killed 
externally?
   [2022-01-14 08:03:32,845] {{scheduler_job.py:1210}} INFO - Executor reports 
execution of task0 execution_date=2022-01-13 07:00:00+00:00 exited with status 
failed for try_number 1
in state FAILURE
   ```
   
   Worker:
   ```
   [2021-04-20 20:54:29,109: ERROR/ForkPoolWorker-15] Failed to execute task 
dag_id could not be found: task0. Either the dag did not exist or it failed to 
parse..`
   This is not seen in the worker logs for every occurrence in the scheduler 
logs.
   ```
   
   Because of the MWAA autoscaling mechanism, `worker_concurrency` is not 
configurable.
   `worker_autoscale`: `10, 10`.
   `dagbag_import_timeout`: 120s
   `dag_file_processor_timeout`: 50s
   `parallelism` = 48
   `dag_concurrency` = 1
   `max_threads` = 8
   
   We currently have 2 (minWorkers) to 10 (maxWorkers) mw1.medium (2 vCPU) 
workers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I update to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)



[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-01-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475121#comment-17475121
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1011812804


   We reviewed the code and found that in `local_task_job.py` the parent 
process has a `heartbeat_callback` that checks the state and the child-process 
return code of the `task_instance`.
   
   However, these lines may hide a bug:
   
   
![image](https://user-images.githubusercontent.com/8371330/149270821-45da67da-186e-409b-8f3e-072fe8e0491c.png)
   
   
![image](https://user-images.githubusercontent.com/8371330/149271933-4ae6c8d1-defc-45c6-ba21-89a46016c3d2.png)
   
   
   **The raw task command writing back the task instance's state (e.g. 
success) does not mean the child process has finished (returned).**
   
   So in this heartbeat callback there may be a race condition when the task 
state is written back while the child process has not yet returned.
   
   In this scenario, the local task kills the child process by mistake. The 
scheduler then detects this and reports "task instance X finished (success) 
although the task says its queued. Was the task killed externally?"
   
   This is a simple schematic diagram:
   
   
![image](https://user-images.githubusercontent.com/8371330/149273573-45700f32-079b-4b22-8dba-d6a1ce37a243.png)
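   A minimal, self-contained sketch of the race described above. This is not Airflow's actual implementation: `FakeChild` and `heartbeat_callback` are hypothetical stand-ins for the raw task subprocess and the check in `local_task_job.py`, reduced to the decision that matters here.

   ```python
   class FakeChild:
       """Stand-in for the raw task subprocess: it has already written its
       terminal state to the metadata DB, but it has not exited yet, so
       poll() still returns None."""
       def __init__(self):
           self.killed = False

       def poll(self):
           return None  # still running

       def kill(self):
           self.killed = True

   def heartbeat_callback(ti_state, child):
       # Naive check sketched from the comment: a terminal state combined
       # with "no return code yet" is treated as "task finished elsewhere",
       # so the still-running child gets killed by mistake.
       if ti_state == "success" and child.poll() is None:
           child.kill()
           return "killed child in race window"
       return "ok"

   child = FakeChild()
   print(heartbeat_callback("success", child))  # state written back, process not yet returned
   print(child.killed)
   ```

   The fix direction implied by the comment is to make the callback tolerate this window, e.g. by waiting for the child's return code before concluding the task was killed externally.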
   
   
   




[jira] [Commented] (AIRFLOW-5071) Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-01-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475117#comment-17475117
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1011812804


   We reviewed the code and found that in `local_task_job.py` the parent 
process has a `heartbeat_callback`, which checks the state and the child 
process return code of the `task_instance`.
   
   However, these lines may hide a bug:
   
   
![image](https://user-images.githubusercontent.com/8371330/149270821-45da67da-186e-409b-8f3e-072fe8e0491c.png)
   
   
![image](https://user-images.githubusercontent.com/8371330/149271933-4ae6c8d1-defc-45c6-ba21-89a46016c3d2.png)
   
   
   **The raw task command writing the task instance's state back (e.g. 
success) does not mean the child process has finished (returned).**
   
   So, in this heartbeat callback, there may be a race condition where the 
task state has been written back while the child process has not yet returned.
   
   In this scenario, the local task will kill the child process by mistake. 
The scheduler will then detect this and report "task instance X finished 
(success) although the task says its queued. Was the task killed externally?"
   




[jira] [Commented] (AIRFLOW-5071) Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-01-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475116#comment-17475116
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1011812804


   We reviewed the code and found that in `local_task_job.py` the parent 
process has a `heartbeat_callback`, which checks the state and the child 
process return code of the `task_instance`.
   
   However, these lines may hide a bug:
   
   
![image](https://user-images.githubusercontent.com/8371330/149270821-45da67da-186e-409b-8f3e-072fe8e0491c.png)
   
   
![image](https://user-images.githubusercontent.com/8371330/149271933-4ae6c8d1-defc-45c6-ba21-89a46016c3d2.png)
   
   
   **The raw task command writing the task instance's state back (e.g. 
success) does not mean the child process has finished (returned).**
   
   So, in this heartbeat callback, there may be a race condition where the 
task state has been written back while the child process has not yet returned.
   
   In this scenario, the local task will kill the child process by mistake.
   




[jira] [Commented] (AIRFLOW-5071) Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2022-01-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475115#comment-17475115
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

ghostbody commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-1011812804


   We reviewed the code and found that in `local_task_job.py` the parent 
process has a `heartbeat_callback`, which checks the state and the child 
process return code of the `task_instance`.
   
   However, these lines may hide a bug:
   
   
![image](https://user-images.githubusercontent.com/8371330/149270821-45da67da-186e-409b-8f3e-072fe8e0491c.png)
   
   
![image](https://user-images.githubusercontent.com/8371330/149271933-4ae6c8d1-defc-45c6-ba21-89a46016c3d2.png)
   
   
   **The raw task command writing the task instance's state back (e.g. 
success) does not mean the child process has finished (returned).**
   
   So, in this heartbeat callback, there may be a race condition where the 
task state has been written back while the child process has not yet returned.
   




[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2022-01-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467689#comment-17467689
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

derkuci commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-1003763219


   @xuemengran nvm.  I guess you meant the PR suggested here.  I tried that; 
the log table has changed, and I couldn't match the information easily.  The 
`execution_date` column is None when `event="cli_task_run"`, which makes 
filtering impossible.
   
   I understand why the PR was rejected.  For cases where the logs exist but 
the web UI couldn't locate the correct hostname, the issue is that the 
"task_instance" table only stores the latest `try_number`/`hostname` for a task 
run (as already indicated by @ITriangle).  The PK doesn't include `try_number`. 
 It's better to fix the task_instance table, which is more fundamental, and 
probably would intimidate most "amateurs" (like me).
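   To make the data-model point concrete, here is an illustrative sketch (not Airflow's real schema; the hostnames and the per-try mapping are hypothetical) of why a single `hostname` field on the task instance cannot resolve logs for earlier attempts:

   ```python
   # Each retry may land on a different worker, but the task_instance row
   # keeps a single hostname field, so only the latest try's host survives.
   task_instance = {"dag_id": "my_dag", "task_id": "my_task",
                    "try_number": 3, "hostname": "worker-c"}

   # What the web UI would need in order to fetch logs for every attempt is
   # a per-try record, e.g. keyed by try_number (illustrative only):
   per_try_hostnames = {1: "worker-a", 2: "worker-b", 3: "worker-c"}

   def log_url(hostname, try_number, dag="my_dag", task="my_task",
               run="2019-07-07T09:00:00+00:00"):
       # 8793 is the default Airflow worker log-server port.
       return f"http://{hostname}:8793/log/{dag}/{task}/{run}/{try_number}.log"

   # With only the task_instance row, try 1's URL is built with try 3's host:
   print(log_url(task_instance["hostname"], 1))
   # With a per-try record, each attempt resolves to the worker that ran it:
   print(log_url(per_try_hostnames[1], 1))
   ```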




> If a task crashes, host name is not committed to the database so logs aren't 
> able to be seen in the UI
> --
>
> Key: AIRFLOW-4922
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.3
>Reporter: Andrew Harmon
>Assignee: wanghong-T
>Priority: Major
>
> Sometimes when a task fails, the logs show the following:
> {code}
> *** Log file does not exist: 
> /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Fetching from: 
> http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Failed to fetch log file from worker. Invalid URL 
> 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host 
> supplied
> {code}
> I believe this is due to the fact that the row is not committed to the 
> database until after the task finishes. 
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857





[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2022-01-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17467682#comment-17467682
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

derkuci commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-1003754230


   > In the 2.1.1 version, I tried to modify the 
airflow/utils/log/file_task_handler.py file to obtain the hostname information 
by reading the log table. I confirmed through debug that I could get the host 
information in this way, 
   
   @xuemengran could you kindly point to how this could be done?
   
   With Airflow 2.2.2 + Celery, I am seeing error messages like the one below, 
because `TaskInstance.hostname` always reflects the latest attempt and is not 
keyed by `try_number`.
   ```
   "Failed to fetch log file from worker. Client error '404 NOT FOUND' for url 
..."
   ```
   If we try really hard, the logs can be found in the local storage of 
_some_ Celery workers, but that is a huge burden operationally and for 
debugging.




[jira] [Assigned] (AIRFLOW-4470) RBAC Github Enterprise OAuth provider callback URL?

2021-12-28 Thread Nidhi Chourasia (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nidhi Chourasia reassigned AIRFLOW-4470:


Assignee: Golokesh Patra

> RBAC Github Enterprise OAuth provider callback URL?
> ---
>
> Key: AIRFLOW-4470
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4470
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication, webserver
>Affects Versions: 1.10.2
>Reporter: Geez
>Assignee: Golokesh Patra
>Priority: Blocker
>  Labels: usability
> Attachments: airflow_ss0_2.PNG, airflow_sso3.PNG, airflow_sso4.PNG, 
> image-2019-10-30-16-25-14-436.png, image-2019-10-31-11-47-04-041.png
>
>
> Hi all,
> Quick question, when using RBAC with OAuth providers (1.10.2):
>  * we are not specifying the {{authenticate}} or {{auth_backend}} options in the 
> [webserver] section of {{airflow.cfg}} anymore
>  * Instead, we set the OAuth provider config in the flask-appbuilder's 
> {{webserver_config.py}}:
> {code:java}
>  
> # Adapting Google OAuth example to Github:
> OAUTH_PROVIDERS = [
> {'name':'github', 'icon':'fa-github', 'token_key':'access_token',
>  'remote_app': {
> 'base_url':'https://github.corporate-domain.com/login',
> 
> 'access_token_url':'https://github.corporate-domain.com/login/oauth/access_token',
> 
> 'authorize_url':'https://github.corporate-domain.com/login/oauth/authorize',
> 'request_token_url': None,
> 'consumer_key': '',
> 'consumer_secret': 'X',
>  }
> }
> ]
>  
> {code}
>  _Question:_
>  * so what callback URL do we specify in the app? 
> {{http:/webapp/ghe_oauth/callback}} would not work right? (example with 
> github entreprise)
> No matter what I specify for the callback url (/ghe_oauth/callback or 
> [http://webapp.com|http://webapp.com/]), I get an error message about 
> {{redirect_uri}} mismatch:
> {code:java}
> {{error=redirect_uri_mismatch&error_description=The+redirect_uri+MUST+match+the+registered+callback+URL+for+this+application}}
> {code}
> _Docs ref:_
>  Here is how you set up OAuth with GitHub Enterprise on Airflow _*without*_ 
> RBAC: 
> [https://airflow.apache.org/security.html#github-enterprise-ghe-authentication]
> And here is how you set up OAuth via the {{webserver_config.py}} of 
> flask_appbuilder used by Airflow _*with*_ RBAC:
>  
> [https://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-oauth]
> What's the *callback url* when using RBAC and OAuth with Airflow?
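For readers hitting the same `redirect_uri` mismatch: Flask-AppBuilder serves its OAuth callback at `/oauth-authorized/<provider-name>` (per the FAB security docs), so the URL registered on the OAuth app side should point there. A small sketch (the hostname is illustrative, not from the report):

```python
def fab_callback_url(base_url, provider_name):
    """Callback/redirect URL that Flask-AppBuilder registers for an OAuth
    provider; base_url is the webserver root the users reach."""
    return f"{base_url.rstrip('/')}/oauth-authorized/{provider_name}"

# For the 'github' provider configured in webserver_config.py above, the
# OAuth app on the GitHub Enterprise side would register something like:
print(fab_callback_url("https://airflow.corporate-domain.com", "github"))
```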





[jira] [Assigned] (AIRFLOW-6647) Reduce the cluttering of Airflow UI by merging create and check into a single CHECK step.

2021-12-28 Thread Nidhi Chourasia (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nidhi Chourasia reassigned AIRFLOW-6647:


Assignee: Golokesh Patra  (was: Nidhi Chourasia)

> Reduce the cluttering of Airflow UI by merging create and check into a single 
> CHECK step.
> -
>
> Key: AIRFLOW-6647
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6647
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators, ui
>Affects Versions: 1.10.0
> Environment:  Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-1027-aws x86_64)
>Reporter: Nidhi Chourasia
>Assignee: Golokesh Patra
>Priority: Minor
>
> This is another UI feature which makes the Airflow UI  
>  # Clutter Free 
>  # More readable  
>  # More intuitive  
> Presently on Airflow, for any job there will be 2 steps – CREATE step and 
> CHECK step.  
> CREATE STEP – It only creates/runs the job as per the schedule and then 
> dies off immediately  
> CHECK STEP – After the create step, this step will keep on tracking the 
> status of the job ( i.e check if the job is in RUNNING/FAILED/SUCCESSFUL 
> state)  
> We presently host approximately 40 jobs; before our enhancement there 
> would have been 80 steps created in the form of a TREE (DAG), but after our 
> enhancement the CREATE and CHECK steps have been merged on the UI side into 
> just one single CHECK step.  
>  





[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2021-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454196#comment-17454196
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

potiuk commented on pull request #12388:
URL: https://github.com/apache/airflow/pull/12388#issuecomment-987086352


   > @potiuk I have the same requirement. If I am able to implement it, I will 
raise a PR.
   
   Cool!




> Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor
> 
>
> Key: AIRFLOW-6786
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6786
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, hooks
>Affects Versions: 1.10.9
>Reporter: Daniel Ferguson
>Assignee: Daniel Ferguson
>Priority: Minor
>
> Add the KafkaProducerHook.
>  Add the KafkaConsumerHook.
>  Add the KafkaSensor which listens to messages with a specific topic.
>  Related Issue:
>  #1311 (Pre-dates Jira Migration)
> Reminder to contributors:
> You must add an Apache License header to all new files
>  Please squash your commits when possible and follow the 7 rules of good Git 
> commits
>  I am new to the community, so I am not sure whether the files are in the 
> right place or whether anything is missing.
> The sensor could be used as the first node of a dag where the second node can 
> be a TriggerDagRunOperator. The messages are polled in a batch and the dag 
> runs are dynamically generated.
> Thanks!
> Note, as per denied PR [#1415|https://github.com/apache/airflow/pull/1415], 
> it is important to mention these integrations are not suitable for 
> low-latency/high-throughput/streaming. For reference, [#1415 
> (comment)|https://github.com/apache/airflow/pull/1415#issuecomment-484429806].
> Co-authored-by: Dan Ferguson 
> [dferguson...@gmail.com|mailto:dferguson...@gmail.com]
>  Co-authored-by: YuanfΞi Zhu
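Since the hooks and sensor described above were never merged, here is a purely illustrative, library-free sketch of the proposed sensor's poke logic: poll a batch of messages for a topic and signal success once any arrive. `KafkaMessageSensor` and the consumer's `poll(topic, batch_size)` interface are hypothetical, not a real Kafka client API.

```python
class KafkaMessageSensor:
    """Sketch of the proposed KafkaSensor: succeed once a batch of
    messages exists for the topic, handing the batch downstream (e.g.
    to a TriggerDagRunOperator that fans out dag runs)."""
    def __init__(self, consumer, topic, batch_size=100):
        self.consumer = consumer
        self.topic = topic
        self.batch_size = batch_size
        self.messages = []

    def poke(self):
        # One sensor poke = one batched poll; returning False reschedules.
        batch = self.consumer.poll(self.topic, self.batch_size)
        if batch:
            self.messages = batch
            return True
        return False

class FakeConsumer:
    """Test double standing in for a Kafka consumer."""
    def __init__(self, batches):
        self.batches = batches

    def poll(self, topic, batch_size):
        return self.batches.pop(0) if self.batches else []

sensor = KafkaMessageSensor(FakeConsumer([[], ["m1", "m2"]]), topic="events")
print(sensor.poke())  # first poll: empty batch
print(sensor.poke())  # second poll: messages arrived
```

As the note above says, a poll-per-poke design like this suits periodic batch triggering, not low-latency or high-throughput streaming.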





[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2021-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454186#comment-17454186
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

debashis-das commented on pull request #12388:
URL: https://github.com/apache/airflow/pull/12388#issuecomment-987056630


   @potiuk I have the same requirement. If I am able to implement it, I will 
raise a PR.




[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-12-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453706#comment-17453706
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

potiuk edited a comment on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-986311760


   Well, @blotouta2, I think this issue discusses 3 or 4 different problems, so 
your comment is pretty meaningless. Also, the issue is closed, so it's likely 
yours is a different issue altogether.
   
   If you REALLY want to get help, just open a new issue and provide all the 
details you can (and ideally a reproducible case). Or, if you do not have a 
reproducible case, provide as much information as you can and open a GitHub 
Discussion.
   
   And BTW, if you are using an older version of Airflow, just upgrade to the 
newest and check - Airflow follows SemVer, so it should be rather safe to 
upgrade.




[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-12-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453705#comment-17453705
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

potiuk commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-986311760


   Well, @blotouta2, I think this issue discusses 3 or 4 different problems. 
The issue is closed, so it's likely yours is a different issue altogether.
   
   If you REALLY want to get help, just open a new issue and provide all the 
details you can (and ideally a reproducible case). Or, if you do not have a 
reproducible case, provide as much information as you can and open a GitHub 
Discussion.
   
   And BTW, if you are using an older version of Airflow, just upgrade to the 
newest and check - Airflow follows SemVer, so it should be rather safe to 
upgrade.




[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-12-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453287#comment-17453287
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

blotouta2 commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-985965125


   I have also got the same exception; was this issue fixed in any version?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> If a task crashes, host name is not committed to the database so logs aren't 
> able to be seen in the UI
> --
>
> Key: AIRFLOW-4922
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4922
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.3
>Reporter: Andrew Harmon
>Assignee: wanghong-T
>Priority: Major
>
> Sometimes when a task fails, the log show the following
> {code}
> *** Log file does not exist: 
> /usr/local/airflow/logs/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Fetching from: 
> http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log*** 
> Failed to fetch log file from worker. Invalid URL 
> 'http://:8793/log/my_dag/my_task/2019-07-07T09:00:00+00:00/1.log': No host 
> supplied
> {code}
> I believe this is due to the fact that the row is not committed to the 
> database until after the task finishes. 
> https://github.com/apache/airflow/blob/a1f9d9a03faecbb4ab52def2735e374b2e88b2b9/airflow/models/taskinstance.py#L857





[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2021-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452606#comment-17452606
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

potiuk commented on pull request #12388:
URL: https://github.com/apache/airflow/pull/12388#issuecomment-984991868


   unbelievable (!) you have not done it yet @serge-salamanka-1pt!




> Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor
> 
>
> Key: AIRFLOW-6786
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6786
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, hooks
>Affects Versions: 1.10.9
>Reporter: Daniel Ferguson
>Assignee: Daniel Ferguson
>Priority: Minor
>
> Add the KafkaProducerHook.
>  Add the KafkaConsumerHook.
>  Add the KafkaSensor which listens to messages with a specific topic.
>  Related Issue:
>  #1311 (Pre-dates Jira Migration)
> Reminder to contributors:
> You must add an Apache License header to all new files
>  Please squash your commits when possible and follow the 7 rules of good Git 
> commits
>  I am new to the community, I am not sure the files are at the right place or 
> missing anything.
> The sensor could be used as the first node of a dag where the second node can 
> be a TriggerDagRunOperator. The messages are polled in a batch and the dag 
> runs are dynamically generated.
> Thanks!
> Note, as per denied PR [#1415|https://github.com/apache/airflow/pull/1415], 
> it is important to mention these integrations are not suitable for 
> low-latency/high-throughput/streaming. For reference, [#1415 
> (comment)|https://github.com/apache/airflow/pull/1415#issuecomment-484429806].
> Co-authored-by: Dan Ferguson 
> [dferguson...@gmail.com|mailto:dferguson...@gmail.com]
>  Co-authored-by: YuanfΞi Zhu





[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2021-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452605#comment-17452605
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

potiuk commented on pull request #12388:
URL: https://github.com/apache/airflow/pull/12388#issuecomment-984991544


   @serge-salamanka-1pt - maybe you would like to contribute it? Airflow is 
created by >1800 contributors, and you can become one and add Kafka support! 
The OSS world works this way.









[jira] [Commented] (AIRFLOW-6786) Adding KafkaConsumerHook, KafkaProducerHook, and KafkaSensor

2021-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452280#comment-17452280
 ] 

ASF GitHub Bot commented on AIRFLOW-6786:
-

serge-salamanka-1pt commented on pull request #12388:
URL: https://github.com/apache/airflow/pull/12388#issuecomment-984473014


   unbelievable (!) Airflow does not support Kafka out of the box yet!?









[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451785#comment-17451785
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

danilocurvelo commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-983607427


   Exactly the same error as @vdusek posted in Airflow 2.1.x. Was this fixed in 
Airflow 2.2.x?
   









[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450537#comment-17450537
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

taohuzefu edited a comment on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-981768485


   > Hi, I am still seeing the issue in 2.1.1 version, my executor is celery
   
   @xuemengran
   Hi, how about now? Did you fix that? 









[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450536#comment-17450536
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

taohuzefu commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-981768485


   > Hi, I am still seeing the issue in 2.1.1 version, my executor is celery
   
   Hi, how about now? Did you fix that?









[jira] [Commented] (AIRFLOW-5024) RBAC & access_control params

2021-11-16 Thread Ugur B. (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444833#comment-17444833
 ] 

Ugur B. commented on AIRFLOW-5024:
--

In which version is this available? We are using 1.10.5, and I can see that it 
is defined in the DAG model, but I am seeing similar issues. Also, only the 
getter and setter of the attribute seem to be defined; I couldn't find any 
logic that utilizes the access_control param. 
https://github.com/apache/airflow/blob/704e48dee368d193f742e064f42461205ef587e2/airflow/models/dag.py#L210

> RBAC & access_control params
> 
>
> Key: AIRFLOW-5024
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5024
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: 1.10.4
>Reporter: Aneesh Joseph
>Priority: Major
>
> DAG level *access_control* permissions were recently setup with this PR - 
> [https://github.com/apache/airflow/pull/4642]
>  
> I created a sample role from the RBAC UI with no permissions, The role name 
> was *sample_team*. 
>  
> I created below DAG
>  
>  
> {code:java}
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
>
> default_args = {
>     'owner': 'sample_team',
>     'depends_on_past': False,
>     'start_date': datetime(2019, 2, 1),
>     'retries': 4,
>     'retry_delay': timedelta(minutes=1),
> }
>
> dag = DAG(
>     'sample-team-my-sample-pipeline',
>     default_args=default_args,
>     schedule_interval='15 0 * * *',
>     catchup=True,
>     access_control={'sample_team': ['can_dag_edit', 'can_dag_read']},
> )
>
> t1 = BashOperator(
>     task_id='sample_task',
>     bash_command="""
>     echo {{ execution_date }}
>     """,
>     retries=1,
>     dag=dag,
> )
> {code}
>  
> and was expecting that any user added to the *sample_team* role would now 
> have access to this DAG, but that wasn't the case. The user is able to log 
> in, but can't view the above DAG. I looked at the UI roles again to see the 
> permissions which were automatically added to the *sample_team* role. Below 
> are the permissions which were auto-added:
>  
>  
> {code:java}
> [menu access on About, can rendered on Airflow, can task stats on Airflow, 
> can pickle info on Airflow, can task on Airflow, can refresh on Airflow, can 
> index on Airflow, can blocked on Airflow, can log on Airflow, can duration on 
> Airflow, can landing times on Airflow, can clear on Airflow, can tree on 
> Airflow, can dag details on Airflow, can dagrun clear on Airflow, can code on 
> Airflow, can tries on Airflow, can get logs with metadata on Airflow, can run 
> on Airflow, can gantt on Airflow, can success on Airflow, can delete on 
> Airflow, can paused on Airflow, can task instances on Airflow, can trigger on 
> Airflow, can xcom on Airflow, can graph on Airflow, can dag stats on Airflow, 
> can list on DagModelView, can show on DagModelView, can edit on DagModelView, 
> can version on VersionView, can list on DagRunModelView, can add on 
> DagRunModelView, muldelete on DagRunModelView, set failed on DagRunModelView, 
> set running on DagRunModelView, set success on DagRunModelView, menu access 
> on DAG Runs, menu access on Browse, can list on JobModelView, menu access on 
> Jobs, can list on LogModelView, menu access on Logs, can list on 
> SlaMissModelView, menu access on SLA Misses, can list on 
> TaskInstanceModelView, clear on TaskInstanceModelView, set failed on 
> TaskInstanceModelView, set running on TaskInstanceModelView, set success on 
> TaskInstanceModelView, menu access on Task Instances, menu access on 
> Documentation, menu access on Docs, menu access on Version]{code}
>  
>  
> I guess this is a bug? Or is it something I have done wrong in my DAG 
> definition?
> Another note: the Admin role has 2 permissions for each DAG (dag edit and 
> dag read). Will this work alright when we scale up to 1000s of DAGs?
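The access_control mapping discussed above ties a role name to a list of DAG-level permission names. As a rough, hypothetical model of the intended semantics (illustrative only; this is not Airflow's actual RBAC implementation):

```python
# Hypothetical model of DAG-level access_control semantics: a role is granted
# only the permissions listed for it on this particular DAG.
access_control = {"sample_team": ["can_dag_edit", "can_dag_read"]}

def role_can(role, permission, dag_access_control):
    # A role holds a permission on the DAG iff it is listed in the mapping.
    return permission in dag_access_control.get(role, [])

print(role_can("sample_team", "can_dag_read", access_control))  # True
print(role_can("viewer", "can_dag_read", access_control))       # False
```

Under this model, membership in sample_team grants only the listed permissions on this one DAG; a role absent from the mapping gets nothing, which matches the expectation the reporter describes.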





[jira] [Commented] (AIRFLOW-3702) Reverse Backfilling(from current date to start date)

2021-11-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441053#comment-17441053
 ] 

ASF GitHub Bot commented on AIRFLOW-3702:
-

aashayVimeo commented on pull request #4676:
URL: https://github.com/apache/airflow/pull/4676#issuecomment-964018648


   Hi @dima-asana @feng-tao, any updates on the above comment about running a 
DAG backwards from the DAG definition?
   
   ```
   from airflow import DAG

   dag = DAG(
       dag_id="...",
       schedule_interval="@daily",
       run_backwards=True,
       ...
   )
   ```




> Reverse Backfilling(from current date to start date)
> 
>
> Key: AIRFLOW-3702
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3702
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: configuration, DAG, models
>Affects Versions: 1.10.1
> Environment: MacOS High Sierra
> python2.7
> Airflow-1.10.1
>Reporter: Shubham Gupta
>Assignee: Dima Kamalov
>Priority: Major
>  Labels: critical, improvement, priority
> Fix For: 1.10.3
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Hello,
> I think there is a need for a reverse backfilling option as well, because 
> recent jobs would take precedence over the historical jobs. We could add a 
> variable to the DAG such as dagrun_order_default = True/False. This would 
> help in many use cases in which a previous date's pipeline does not depend 
> on the current pipeline.
> I saw this page which talks about this -> 
> http://mail-archives.apache.org/mod_mbox/airflow-dev/201804.mbox/%3CCAPUwX3M7_qrn=1bqysmkdv_ifjbta6lbtq7czhhexszmdjk...@mail.gmail.com%3E
> Thanks!
> Regards,
> Shubham





[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438722#comment-17438722
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

zhengxianh removed a comment on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-957984427


   Hi, I'm trying to set up Airflow 2.1.4 with docker on multiple machines, but 
failed when the webserver is not able to access the task logs. Can anyone help 
me out please?
   
   I've created a post in stackoverflow with details:
   
https://stackoverflow.com/questions/68694805/airflow-webserver-not-able-to-access-remote-worker-logs
   
   If you need any other info, please let me know.
   
   Thank you!







--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (AIRFLOW-6395) [AIP-28] Add AsyncExecutor option

2021-11-04 Thread Ash Berlin-Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-6395.
--
Resolution: Delivered

Closed in favour of 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=177050929

> [AIP-28] Add AsyncExecutor option
> -
>
> Key: AIRFLOW-6395
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6395
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core, executors, operators, scheduler
>Affects Versions: 1.10.7
>Reporter: Darren Weber
>Priority: Minor
>
> Add an AsyncExecutor that is similar to LocalExecutor but designed to 
> optimize for high concurrency with async behavior for any blocking 
> operations.  It requires an async ecosystem and general flags for async 
> operations on hooks, operators, and sensors.
> Further details can be developed in an AIP and this description can be 
> updated with links to relevant resources and discussion(s).
> - 
> [https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-28%3A+Add+AsyncExecutor+option]





[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-11-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437616#comment-17437616
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

zhengxianh commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-957984427


   Hi, I'm trying to set up Airflow 2.1.4 with docker on multiple machines, but 
failed when the webserver is not able to access the task logs. Can anyone help 
me out please?
   
   I've created a post in stackoverflow with details:
   
https://stackoverflow.com/questions/68694805/airflow-webserver-not-able-to-access-remote-worker-logs
   
   If you need any other info, please let me know.
   
   Thank you!









[jira] [Commented] (AIRFLOW-4922) If a task crashes, host name is not committed to the database so logs aren't able to be seen in the UI

2021-11-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437528#comment-17437528
 ] 

ASF GitHub Bot commented on AIRFLOW-4922:
-

zhengxianh commented on pull request #6722:
URL: https://github.com/apache/airflow/pull/6722#issuecomment-957984427


   Hi, I'm trying to set up Airflow 2.1.4 with docker on multiple machines, but 
failed when the webserver is not able to access the task logs. Can anyone help 
me out please?
   
   I've created a post in stackoverflow with details:
   
https://stackoverflow.com/questions/68694805/airflow-webserver-not-able-to-access-remote-worker-logs
   
   If you need any other info, please let me know.
   
   Thank you!









[jira] [Commented] (AIRFLOW-5911) Simplify lineage support and improve robustness

2021-10-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17433558#comment-17433558
 ] 

ASF GitHub Bot commented on AIRFLOW-5911:
-

alokjain-01 commented on pull request #6564:
URL: https://github.com/apache/airflow/pull/6564#issuecomment-950526422


   Is this lineage problem fixed in Airflow 2.0 onward? I am using 2.1.4 and 
still not getting any lineage in Apache Atlas. 




> Simplify lineage support and improve robustness
> ---
>
> Key: AIRFLOW-5911
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5911
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: lineage
>Affects Versions: 1.10.6
>Reporter: Bolke de Bruin
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Commented] (AIRFLOW-5071) Thousand os Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2021-10-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432763#comment-17432763
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

nguyenmphu edited a comment on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-949189802


   I found that in the code of `airflow/jobs/scheduler_job.py`: 
https://github.com/apache/airflow/blob/main/airflow/jobs/scheduler_job.py#L535
   ```python
   if ti.try_number == buffer_key.try_number and ti.state == State.QUEUED:
       Stats.incr('scheduler.tasks.killed_externally')
       msg = (
           "Executor reports task instance %s finished (%s) although the "
           "task says its %s. (Info: %s) Was the task killed externally?"
       )
       self.log.error(msg, ti, state, ti.state, info)
   ```
   The scheduler checks the state of the task instance. When a task instance 
is rescheduled (e.g. an external sensor), its state transitions 
up_for_reschedule -> scheduled -> queued -> running. If its state is queued 
and it has not moved to the running state, the scheduler will raise an error.
   So I think the code needs to be changed:
   ```python
   if ti.try_number == buffer_key.try_number and (
       ti.state == State.QUEUED
       and not TaskReschedule.find_for_task_instance(ti, session=session)
   ):
       Stats.incr('scheduler.tasks.killed_externally')
       msg = (
           "Executor reports task instance %s finished (%s) although the "
           "task says its %s. (Info: %s) Was the task killed externally?"
       )
       self.log.error(msg, ti, state, ti.state, info)
   ```
   Here is my PR: https://github.com/apache/airflow/pull/19123
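The proposed guard can be exercised in isolation. The sketch below is a simplified, self-contained model (the TI class and the lookup function are stand-ins, not Airflow's real TaskInstance / TaskReschedule APIs): with the extra reschedule check, a sensor in reschedule mode that is momentarily QUEUED is no longer flagged as killed externally, while a genuinely stuck QUEUED task still is.

```python
# Simplified model of the proposed scheduler check (stand-in classes only).
QUEUED, RUNNING = "queued", "running"

class TI:
    def __init__(self, state, try_number, has_reschedule=False):
        self.state = state
        self.try_number = try_number
        self.has_reschedule = has_reschedule

def find_for_task_instance(ti):
    # Stand-in for TaskReschedule.find_for_task_instance(ti, session=...):
    # returns the (possibly empty) list of reschedule records.
    return [object()] if ti.has_reschedule else []

def killed_externally(ti, buffer_try_number):
    # Original check flagged any QUEUED task; the proposal additionally
    # requires that no reschedule record exists for the task instance.
    return (
        ti.try_number == buffer_try_number
        and ti.state == QUEUED
        and not find_for_task_instance(ti)
    )

sensor_ti = TI(QUEUED, try_number=1, has_reschedule=True)   # reschedule-mode sensor
stuck_ti = TI(QUEUED, try_number=1, has_reschedule=False)   # genuinely stuck task
print(killed_externally(sensor_ti, 1))  # False
print(killed_externally(stuck_ti, 1))   # True
```

This reproduces the intent of the patch: the reschedule record distinguishes a sensor cycling back through QUEUED from a task that was actually killed.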




> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I update to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5071) Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2021-10-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432762#comment-17432762
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

nguyenmphu commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-949189802


   I found that in the code of `airflow/jobs/scheduler_job.py`: 
https://github.com/apache/airflow/blob/main/airflow/jobs/scheduler_job.py#L535
   ``` python
   if ti.try_number == buffer_key.try_number and ti.state == State.QUEUED:
       Stats.incr('scheduler.tasks.killed_externally')
       msg = (
           "Executor reports task instance %s finished (%s) although the "
           "task says its %s. (Info: %s) Was the task killed externally?"
       )
       self.log.error(msg, ti, state, ti.state, info)
   ```
   The scheduler checks the state of the task instance. When a task instance is rescheduled (e.g. by a sensor in reschedule mode), its state transitions up_for_reschedule -> scheduled -> queued -> running. If the executor reports the task as finished while its state is still queued (i.e. it never moved to running), the scheduler logs this error.
   So I think the code needs to change:
   ``` python
   if ti.try_number == buffer_key.try_number and (
       ti.state == State.QUEUED
       and not TaskReschedule.find_for_task_instance(ti, session=session)
   ):
       Stats.incr('scheduler.tasks.killed_externally')
       msg = (
           "Executor reports task instance %s finished (%s) although the "
           "task says its %s. (Info: %s) Was the task killed externally?"
       )
       self.log.error(msg, ti, state, ti.state, info)
   ```
   Here is my PR: https://github.com/apache/airflow/pull/19123
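   For intuition, the proposed check can be simulated outside Airflow. The sketch below uses plain dataclasses as simplified stand-ins for Airflow's TaskInstance and TaskReschedule models (the names, fields, and the RESCHEDULES lookup are illustrative assumptions, not the real ORM API): a queued task with an outstanding reschedule record is not flagged as killed externally, while a queued task without one is.

   ```python
   from dataclasses import dataclass

   @dataclass
   class TaskInstance:
       """Simplified stand-in for Airflow's TaskInstance model."""
       task_id: str
       try_number: int
       state: str  # e.g. "queued", "running", "success"

   # Stand-in for the task_reschedule metadata table: task_ids of sensors
   # that are currently waiting in reschedule mode.
   RESCHEDULES = {"wait_for_partition"}

   def has_reschedule_record(ti: TaskInstance) -> bool:
       """Stand-in for TaskReschedule.find_for_task_instance(ti, session=...)."""
       return ti.task_id in RESCHEDULES

   def killed_externally(ti: TaskInstance, buffer_try_number: int) -> bool:
       # Proposed logic: a queued task only counts as killed externally when
       # no reschedule record explains why it never reached the running state.
       return (
           ti.try_number == buffer_try_number
           and ti.state == "queued"
           and not has_reschedule_record(ti)
       )

   sensor = TaskInstance("wait_for_partition", try_number=1, state="queued")
   stuck = TaskInstance("load_table", try_number=1, state="queued")
   print(killed_externally(sensor, 1))  # False: the reschedule record explains it
   print(killed_externally(stuck, 1))   # True: genuinely stuck in QUEUED
   ```

   Under this logic a rescheduled sensor no longer trips the "killed externally" branch, which is the whole point of the PR.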


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Thousands of Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Fix For: 1.10.12
>
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because, since I updated to 1.10.3, I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow set up to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5071) Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2021-10-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17430385#comment-17430385
 ] 

ASF GitHub Bot commented on AIRFLOW-5071:
-

jledru-redoute commented on issue #10790:
URL: https://github.com/apache/airflow/issues/10790#issuecomment-946459118


   Hello,
   We are still on version 1.10.12, managed by Cloud Composer, but we intend to move to Airflow 2 quite soon.
   However, it seems this issue is not really resolved in version 2 either.
   We experience it not every day, but quite often, and always on the same DAGs. Those DAGs are dynamically generated by a single Python file, based on a scan of conf files. It generally takes around 12s to parse, so I don't think parsing time is the issue. It looks like this:
   ```
   for country in DAG_PARAMS['countries']:

       for audience_type in AUDIENCES_TYPE:

           # get audience conf files to generate the dags
           conf_files = glob.glob(
               f"/home/airflow/gcs/data/CAM/{ country['country_code'] }"
               f"/COMPOSER_PARAM_SOURCES/{ audience_type['type'] }/*")

           audiences_list = []

           for conf_file in conf_files:
               string_conf = open(conf_file, 'rb').read().decode("UTF-8")
               audiences_list.append(json.loads(string_conf))

           for letter in ascii_uppercase:
               dag_aud_list = [
                   aud for aud in audiences_list
                   if aud["CATEG_CODE"][0] == letter]

               if dag_aud_list:
                   dag = create_dag(audience_type, country, dag_aud_list)
                   globals()[
                       f"{ audience_type['type'] }_{ country['country_code'] }"
                       f"_{ letter }_dag"] = dag
   ```
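   One small hazard in the parsing code above: the conf files are opened but never explicitly closed, so every scheduler parse cycle relies on garbage collection to release the handles. A minimal sketch of the same loading step using a context manager (the load_audiences helper name and the *.json glob are illustrative assumptions, not from the original DAG file):

   ```python
   import json
   from pathlib import Path

   def load_audiences(base_dir):
       """Read every *.json conf file under base_dir into a list of dicts."""
       audiences = []
       for conf_file in sorted(Path(base_dir).glob("*.json")):
           # 'with' closes the handle even if json.load raises, which matters
           # because the scheduler re-executes this file on every parse cycle.
           with conf_file.open(encoding="utf-8") as fh:
               audiences.append(json.load(fh))
       return audiences
   ```

   The sorted() call also makes DAG generation order deterministic across parse cycles, which the bare glob.glob() does not guarantee.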
   I understand this is not really recommended (though what is the recommended pattern for this type of DAG?), but that's the way it is done.
   It currently generates around 10 DAGs, each with approximately 35 init sensors in reschedule mode poking every 20 minutes.
   The worker machines are n1-standard-4, with worker_concurrency set to 24.
   Yesterday, out of 35 Celery tasks due to be rescheduled, 32 were rescheduled on the same worker (there are 3 workers) at roughly the same time (I'm not sure how to check whether worker_concurrency was respected, but I doubt it), causing 17 of them to fail with this specific issue.
   If I understand correctly, would setting worker_autoscale to "4,2" (while keeping worker_concurrency at 24) resolve the issue?
   Thanks,
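   For reference, worker_autoscale lives in the [celery] section of airflow.cfg and takes a "max,min" pair; when it is set, worker_concurrency is ignored, so "4,2" would cap each worker at 4 concurrent task slots. A sketch of the fragment in question (values taken from the comment above, not a recommendation):

   ```ini
   [celery]
   # Hard cap on concurrent tasks per worker; used only when autoscale is unset.
   worker_concurrency = 24

   # "max,min": Celery autoscaling between 2 and 4 concurrent tasks per worker.
   # When worker_autoscale is set, worker_concurrency is ignored.
   worker_autoscale = 4,2
   ```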






[jira] [Commented] (AIRFLOW-4848) MySQL warnings about aborted connections, missing engine disposal

2021-10-11 Thread Angelo Huang (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427273#comment-17427273
 ] 

Angelo Huang commented on AIRFLOW-4848:
---

[~mkjasinski] Have you found a solution yet?

> MySQL warnings about aborted connections, missing engine disposal
> -
>
> Key: AIRFLOW-4848
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4848
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Reporter: t oo
>Assignee: Daniel Huang
>Priority: Minor
>
> I am not referring to airflow logs in filesystem. I am referrring to logs in 
> the MySQL db itself. This affects airflow 1.10.3, mysql rds 5.7.25
>  
> ie
>  
> 2019-06-25T09:55:25.126187Z 54996343 [Note] Aborted connection 54996343 to 
> db: 'airflowdb' user: 'f' host: 'host' (Got an error reading communication 
> packets)
>  2019-06-25T09:55:25.392705Z 54996375 [Note] Aborted connection 54996375 to 
> db: 'airflowdb' user: 'f' host: 'host' (Got an error reading communication 
> packets)
>  2019-06-25T09:55:25.450276Z 54996240 [Note] Aborted connection 54996240 to 
> db: 'airflowdb' user: 'f' host: 'host' (Got an error reading communication 
> packets)
>  2019-06-25T09:55:25.592741Z 54996391 [Note] Aborted connection 54996391 to 
> db: 'airflowdb' user: 'f' host: 'host' (Got an error reading communication 
> packets)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (AIRFLOW-4848) MySQL warnings about aborted connections, missing engine disposal

2021-10-11 Thread Angelo Huang (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427242#comment-17427242
 ] 

Angelo Huang edited comment on AIRFLOW-4848 at 10/11/21, 4:56 PM:
--

{{[~dxhuang] }}

{{We're seeing a lot of connection warnings as well with Airflow 1.10.9 and 
MariaDB 10.4.15.}}

{{2021-10-11 16:48:03 63580 [Warning] Aborted connection 63580 to db: 'airflow' 
user: 'blabla' host: 
'airflow-scheduler-prod-64d6cf8879-gb4lz.airflow-scheduler-pr' (Got an error 
reading communication packets)}}
 {{2021-10-11 16:48:03 63578 [Warning] Aborted connection 63578 to db: 
'airflow' user: 'blabla' host: 
'airflow-scheduler-prod-64d6cf8879-gb4lz.airflow-scheduler-pr' (Got an error 
reading communication packets)}}
 {{2021-10-11 16:48:03 63581 [Warning] Aborted connection 63581 to db: 
'airflow' user: 'blabla' host: 
'airflow-scheduler-prod-64d6cf8879-gb4lz.airflow-scheduler-pr' (Got an error 
reading communication packets)}}




