GitHub user Pedrinhonitz added a comment to the discussion: Need Help with 
deleting DagFiles from FileSystem using Airflow CLI

Hello,

> From what I understand, you want to delete a DAG's .py file 48 hours after
> its run, whether that run succeeded or failed. I'm not entirely sure why
> you're doing this, or whether I've understood correctly.

I did some brief research and found a way to do this: list the DAGs from the
Airflow database, then pipe the result into a shell command that removes the
files. I don't know if this helps, as your use case isn't clear to me. If it
doesn't, please provide more details.

The Airflow version I used for testing was Airflow 3.2.1.

Basically, I copied a file called `clean.sql` into the container using the
Airflow CLI. It contains the following query:

```sql
WITH last_run AS (
  SELECT
    _dag_run.dag_id,
    _dag_run.state,
    _dag_run.start_date,
    ROW_NUMBER() OVER (
      PARTITION BY _dag_run.dag_id
      ORDER BY _dag_run.start_date DESC NULLS LAST
    ) AS rn
  FROM
    dag_run AS _dag_run
)
SELECT
  _dag.dag_id,
  _dag.fileloc,
  _last_run.state,
  _last_run.start_date
FROM
  dag AS _dag
INNER JOIN last_run AS _last_run ON
  _last_run.dag_id = _dag.dag_id
  AND _last_run.rn = 1
WHERE
  _last_run.state IN ('success', 'failed')
  AND _last_run.start_date < (NOW() - INTERVAL '48 hours')
  AND _dag.fileloc IS NOT NULL
ORDER BY
  _last_run.start_date ASC;


```

**Leave a blank line at the end of the file; without it, the command may not
read the file correctly.**

After that, the file is available inside my container as `/opt/airflow/clean.sql`.

With that, I ran the following command:

```shell
psql -At "postgresql://airflow:airflow@postgres:5432/airflow" -f /opt/airflow/clean.sql \
  | awk -F'|' 'NF{print $2}' \
  | sort -u \
  | xargs -I{} echo rm -f "{}"
```


This command includes an `echo`, so it only prints each `rm` command instead of
running it. I used it to check the output before actually deleting any DAG
files. As you can see in the screenshot, it printed the expected remove command.

<img width="566" height="57" alt="image" src="https://github.com/user-attachments/assets/f1185d4f-6519-47f4-b6b3-e571e9b472b6" />
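To make the dry run easier to follow without a database, here is a small sketch of what the `awk | sort -u | xargs` part does. The sample rows and the `extract_filelocs` helper are made up for illustration; they only mimic the `|`-separated, one-row-per-line output that `psql -At` produces for the query above.

```shell
# Hypothetical sample of `psql -At` output for the query:
# dag_id|fileloc|state|start_date, one row per line.
sample_rows='example_a|/opt/airflow/dags/example_a.py|success|2025-01-01 00:00:00+00
example_b|/opt/airflow/dags/shared.py|failed|2025-01-02 00:00:00+00
example_c|/opt/airflow/dags/shared.py|success|2025-01-03 00:00:00+00'

# Illustrative helper: keep only the second column (fileloc) and
# de-duplicate, since several DAGs can be defined in the same file.
extract_filelocs() {
  awk -F'|' 'NF{print $2}' | sort -u
}

# Dry run: print the rm commands instead of executing them.
printf '%s\n' "$sample_rows" | extract_filelocs | xargs -I{} echo rm -f "{}"
# prints:
#   rm -f /opt/airflow/dags/example_a.py
#   rm -f /opt/airflow/dags/shared.py
```

Note that `shared.py` appears only once even though two DAGs point at it, which is why the `sort -u` step is there.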

In other words, if you execute the command this way without the echo, it will 
remove the DAG Python files returned by the query in the .sql file.

```bash
psql -At "postgresql://airflow:airflow@postgres:5432/airflow" -f /opt/airflow/clean.sql \
  | awk -F'|' 'NF{print $2}' \
  | sort -u \
  | xargs -I{} rm -f "{}"
```
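If you want a bit more safety than a bare `rm`, one option is to replace the `xargs` step with a small guard. This is only a sketch under my own assumptions: `delete_dag_files` is a name I made up, and I'm assuming the DAG files live under `/opt/airflow/dags` (override `DAGS_DIR` if yours are elsewhere). It only deletes regular `.py` files whose path starts with that folder and skips everything else.

```shell
# Assumption: DAG files live under this folder; adjust for your deployment.
DAGS_DIR="${DAGS_DIR:-/opt/airflow/dags}"

# Hypothetical helper: read one path per line on stdin and delete only
# regular .py files whose path starts with DAGS_DIR. Paths with spaces are
# handled because each line is read as a whole.
delete_dag_files() {
  while IFS= read -r path; do
    [ -n "$path" ] || continue            # ignore empty lines
    case "$path" in
      "$DAGS_DIR"/*.py)
        if [ -f "$path" ]; then
          rm -f -- "$path"
          echo "removed $path"
        fi
        ;;
      *)
        echo "skipped $path" >&2          # outside DAGS_DIR or not a .py file
        ;;
    esac
  done
}

# Intended usage (not run here, it needs the database from the post):
#   psql -At "postgresql://airflow:airflow@postgres:5432/airflow" \
#     -f /opt/airflow/clean.sql | awk -F'|' 'NF{print $2}' | sort -u | delete_dag_files
```

The prefix check is purely textual, so a path containing `..` would still pass it; treat it as a sanity check, not a security boundary.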

**In my research, I haven't found a way to do this directly through the Airflow
CLI, and I don't know whether it's supported; someone with more CLI experience
may be able to help. But if I've understood your problem correctly, this should
solve it while keeping the command fast to run and easy to read.**

GitHub link: 
https://github.com/apache/airflow/discussions/60954#discussioncomment-17065810
