GitHub user joaofernandes5 added a comment to the discussion: Massive metadata 
table even after clean with CLI

Hi @potiuk. Sorry for the confusion: I said the _job_ table, but I was actually looking at the **log** table.
I'm using the following query to check my table size:
```
SELECT 
    schemaname || '.' || relname AS table_name,
    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS 
index_size
FROM pg_catalog.pg_statio_user_tables 
ORDER BY pg_total_relation_size(relid) DESC;
```

The results show that I really do not have any data before my cut-off date:
```
airflow-metadata=> SELECT COUNT(*) 
FROM public.log 
WHERE dttm < '2025-03-01';
 count 
-------
     0
(1 row)

airflow-metadata=> SELECT COUNT(*) 
FROM public.log 
WHERE dttm > '2025-03-01';
  count  
---------
 7896329
(1 row)
```

But the table is still around 20 GB:
```
 table_name | total_size | table_size | index_size
------------+------------+------------+------------
 public.log | 21 GB      | 17 GB      | 4108 MB
```
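One thing worth checking (an assumption on my side, not something confirmed in this thread): PostgreSQL does not return disk space to the OS after a `DELETE`; the removed rows remain as dead tuples until the table is vacuumed, and plain autovacuum only marks that space reusable rather than shrinking the file. A sketch of how to verify this, assuming direct `psql` access to the metadata database:

```
-- How many dead (deleted-but-not-reclaimed) tuples is the log table carrying?
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'log';

-- VACUUM FULL rewrites the table and its indexes and returns the space
-- to the OS, but it takes an ACCESS EXCLUSIVE lock for the whole run,
-- so it blocks anything writing to the log table (e.g. a running scheduler).
VACUUM FULL public.log;
```

If `n_dead_tup` is large, that would explain a 21 GB table holding only ~7.9M live rows.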


Do you think that is correct? It is a really large amount, but the log table is usually heavy. Maybe I should move my cut-off date forward..


GitHub link: 
https://github.com/apache/airflow/discussions/52889#discussioncomment-13695812
