pjavier29 commented on issue #61453:
URL: https://github.com/apache/airflow/issues/61453#issuecomment-3871420131
@wjddn279
I don't think it's a network issue or a delay related to data volume. In my
database, the `asset_event` table is 1.3GB and the `asset` table is 6.8MB
(16,000 records).
The query reaches the database, but I think it gets blocked because it's too
long. If you check the database for locks with this query:
```sql
SELECT
    l.pid,
    l.mode,
    l.granted,
    a.query,
    a.state,
    a.query_start
FROM pg_locks l
JOIN pg_stat_activity a ON l.pid = a.pid
WHERE l.relation = 'asset_event'::regclass
ORDER BY l.granted, l.pid;
```
you'll see that the query arrives successfully but seems to get stuck
there. Even after an hour the entry is still listed; the query never finishes.
I don't think the in-memory filter is the problem either: those records are
already in memory, and filtering a list of 16,000 records is rarely an issue
for Python.
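As a rough illustration of that last point, here is a minimal sketch that times a filter over 16,000 in-memory records. The record shape (`id`, `uri`, `active`) and the filter condition are made up for the example and are not Airflow's actual asset model or filter logic.

```python
import time

# Hypothetical stand-in for the ~16,000 asset rows already loaded in memory.
# The field names are illustrative assumptions, not Airflow's schema.
assets = [
    {"id": i, "uri": f"s3://bucket/asset_{i}", "active": i % 3 != 0}
    for i in range(16_000)
]

# Time a plain list-comprehension filter over the whole list.
start = time.perf_counter()
active_assets = [a for a in assets if a["active"]]
elapsed = time.perf_counter() - start

print(f"kept {len(active_assets)} of {len(assets)} records in {elapsed * 1000:.2f} ms")
```

On typical hardware this should finish in around a millisecond, which is negligible next to a query that hangs for an hour.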
Regards