nookcreed opened a new issue, #36920:
URL: https://github.com/apache/airflow/issues/36920
### Apache Airflow version
Other Airflow 2 version (please specify below)
### If "Other Airflow 2 version" selected, which one?
2.7.3
### What happened?
We are encounte
luke-goddard commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2034802986
Upgrading from v2.7.2 -> v2.8.4 did not resolve the issue for me, although
the 'DAG scheduling was skipped, ...' error is gone now.
Strangely the issue is only occurrin
Kenny1217 commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2045431878
I'm having the same type of issue running Airflow 2.8.3 on Kubernetes. We
use the Kubernetes executor and an external Postgres database. We get the
same error saying DAG r
nookcreed commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2045726156
Our setup is similar, kubernetes executor + postgres (I reported the issue
originally). I attached debug logs around the time of failure to this ticket
some time ago, but looks li
garywhiteford commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2045985471
Speaking of locks (un-locks), is there any chance this issue and discussion
https://github.com/apache/airflow/discussions/38728 could be inversely related?
--
This is an
github-actions[bot] commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1953299882
This issue has been automatically marked as stale because it has been open
for 14 days with no response from the author. It will be closed in next 7 days
if no further
rahulnbhandari commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1953350633
We are also seeing the same issue. We started getting this after switching
deployments to the official helm chart on 2.7.3
ruarfff commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1953856212
Just to add some information on our experience of this issue.
We deploy Airflow with [Astronomer](https://www.astronomer.io/). We started
seeing something like this issue in
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1954062762
Can you raise it to Astronomer's support? I believe they provide paid
service including support, and since they can have direct insight into what you
see and be able to investigate
ruarfff commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1954182898
@potiuk thank you. We did try Astronomer support but no luck there :) They
couldn't figure it out.
I just wanted to add some extra information to this issue in case it might
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1954213004
I'd suggest trying again. If they could not figure it out with access to the
system, then I am afraid it's not gonna be any easier here as people here
cannot do any more diagnosis on y
ruarfff commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1954278694
@ashb sorry, my bad. From my perspective it wasn't figured out but you're
right it was in fact someone in our internal Astronomer support team who closed
the ticket. Sorry about t
ashb commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1954283402
Yeah I get that! S'alright. We might be in touch if we need help reproducing
this.
ephraimbuddy commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1956814678
I have spent some time trying to reproduce this but I haven't been able to
do so.
I would like to suggest that we revert #31414, I looked at the issue it was
trying to sol
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1956816880
> I have spent some time trying to reproduce this but I haven't been able to
do so. I would like to suggest that we revert #31414, I looked at the issue it
was trying to solve and I
ephraimbuddy commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1957137366
> > I have spent some time trying to reproduce this but I haven't been able
to do so. I would like to suggest that we revert #31414, I looked at the issue
it was trying to sol
ephraimbuddy closed issue #36920: Scheduler fails to schedule DagRuns due to
persistent DAG record lock
URL: https://github.com/apache/airflow/issues/36920
ssudrich-soundhound commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1970642905
We've had these issues lately and upgraded to 2.8.2 yesterday, but that
didn't solve the issue.
The message regarding DAGs being locked is gone, but the actual is
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1970664788
> We've had these issues lately and upgraded to 2.8.2 yesterday, but that
didn't solve the issue.
>
> The message regarding DAGs being locked is gone, but the actual issue of
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1970668268
It does look like something external is locking your Db - maybe you have
some special DB (not real postgres) or maybe your postgres is special? I
suggest to open a new issue and des
ssudrich-soundhound commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1970685478
We're using the official helm chart (so regular postgres) and before the
upgrade to `2.8.2` it coincided with the log messages described above. Now, I
can't find anythi
gr8web commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1970685999
> It does look like something external is locking your Db - maybe you have
some special DB (not real postgres) or maybe your postgres is special? I
suggest to open a new issue and d
Schlyterr commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1973422572
We ran into this issue today and after following these steps it started
working again.
1. Pause all DAGs
2. Going into Browse -> Task Instances and deleting all tasks th
kaxil commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1976291178
Reopening this as it hasn't been fixed yet, @ephraimbuddy will take a look
lihan commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1979547584
Hi, we are heavily affected by this. We are on 2.7.2.
Some observations:
* The DAG lock-up happens to a lot of DAGs, but probably half of them never got
affected. The other
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1979751971
> Switching off SQL Alchemy Pool fixed this problem
Hmmm - that is an interesting note and might lead to some hypothesis why it
happens @ephraimbuddy @kaxil and might help with
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1979760836
Did you have any special pool configuration before when it happened @lihan ?
Can you please share it here ?
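For anyone answering this question, a minimal sketch of how the pool-related settings could be collected from the environment before posting them. It assumes the settings were supplied as `AIRFLOW__DATABASE__...` environment variables; the helper name `pool_settings` is illustrative and not part of Airflow.

```python
import os

# Pool-related options from Airflow's [database] config section,
# expressed as the environment variables that override them.
POOL_ENV_VARS = (
    "AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_SIZE",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_MAX_OVERFLOW",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_RECYCLE",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_PRE_PING",
)


def pool_settings(environ=None):
    """Return only the pool-related variables that are explicitly set."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in POOL_ENV_VARS if name in environ}


if __name__ == "__main__":
    # Print the overrides in a form that is easy to paste into a comment.
    for name, value in pool_settings().items():
        print(f"{name}={value}")
```

Settings made in `airflow.cfg` rather than through the environment would not show up here; this only reports explicit environment overrides.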
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1979783880
> Hi, the only config I changed was
AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_SIZE, which was set to 180. My Airflow has
1200 active running dags so this number was set at a rather high v
lihan commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1979797298
Thanks for the explanation, this makes sense. The workers cannot reuse this,
so it makes even less sense to use the pool when there is only 1 scheduler running.
gr8web commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1981817654
Hello people.
Sorry, it looks like I was wrong. I just saw it again, connections getting
stuck in `idle in transaction` state in the database
and jobs not progressing.
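For readers who want to check for the same symptom, a rough sketch follows. It assumes rows have already been fetched from PostgreSQL's `pg_stat_activity` view as dictionaries; the function name `stuck_sessions` and the five-minute threshold are illustrative choices, not something taken from this thread.

```python
from datetime import datetime, timedelta, timezone

# Query that could be run against the Airflow metadata database to list
# sessions sitting inside an open transaction without doing any work.
IDLE_IN_TX_QUERY = """
SELECT pid, state, state_change, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
"""


def stuck_sessions(rows, now, max_idle=timedelta(minutes=5)):
    """Return pids of sessions idle in transaction for longer than max_idle.

    rows: iterable of dicts with at least 'pid', 'state', 'state_change'.
    now:  timezone-aware datetime to compare state_change against.
    """
    return [
        row["pid"]
        for row in rows
        if row["state"] == "idle in transaction"
        and now - row["state_change"] > max_idle
    ]
```

The threshold is there to skip sessions that are only briefly idle between statements; a suitable value depends on the deployment.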
ephraimbuddy commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1981829921
> The only thing what happened to the jobs is that the tasks were cleared
for few days in the past.
Can you elaborate more on this? Like explain how it was cleared and y
gr8web commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1981908959
> > The only thing what happened to the jobs is that the tasks were cleared
for few days in the past.
>
> Can you elaborate more on this? Like explain how it was cleared and y
ephraimbuddy commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1982837483
> > > The only thing what happened to the jobs is that the tasks were
cleared for few days in the past.
> >
> >
> > Can you elaborate more on this? Like explain how
renanxx1 commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1985698998
I faced the same issue on Airflow 2.8.1.
The scheduler was skipping dags execution with the message: "DAG scheduling
was skipped, probably because the DAG record was locked"
potiuk commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1902165462
I think what could help is to show more complete logs around the time when
the deadlock occurs with logging level set to debug - that would help anyone
who would analyse the problem
nookcreed commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1902235962
[scheduler.txt](https://github.com/apache/airflow/files/13998874/scheduler.txt)
@potiuk I have attached debug logs. It is harder to recreate the scenario
with debug logging en
plmnguyen commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-1924267212
I am trying to recreate or capture logs, but what I've been noticing thus
far on 2.8.1:
- Happens only on Kubernetes spark submit operators tasks
- For DAGs showing tha
Kenny1217 commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2109162541
We were having the issue with Airflow 2.8.3 but ever since we upgraded
Airflow to 2.9.0 we haven't seen the issue again. This could be just luck since
it only happens randomly
hterik commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2114066312
After doing the following, all these errors went away for us.
1. Delete all dags from the Database that no longer exist in the dagbag,
using `airflow dags delete
loustler commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2144143618
Hello everyone
Could this issue be related to the fact that tasks are being marked as
skipped without actually being executed?
The above issue is happening to me on rare
eladkal closed issue #36920: Scheduler fails to schedule DagRuns due to
persistent DAG record lock
URL: https://github.com/apache/airflow/issues/36920
eladkal commented on issue #36920:
URL: https://github.com/apache/airflow/issues/36920#issuecomment-2156309461
There are 3 reports on this thread that upgrading to Airflow 2.9 solves the
problem, so I am closing this issue.