ephraimbuddy commented on code in PR #28256:
URL: https://github.com/apache/airflow/pull/28256#discussion_r1111619848


##########
airflow/dag_processing/manager.py:
##########
@@ -782,7 +782,11 @@ def clear_nonexistent_import_errors(file_paths: list[str] | None, session=NEW_SE
         """
         query = session.query(errors.ImportError)
         if file_paths:
-            query = query.filter(~errors.ImportError.filename.in_(file_paths))
+            for file_path in file_paths:
+                if file_path.endswith(".zip"):
+                    query = query.filter(~(errors.ImportError.filename.startswith(file_path)))
+                else:
+                    query = query.filter(errors.ImportError.filename != file_path)

Review Comment:
   Should we skip the loop when there is no zip file in `file_paths` and run
the removed query instead:
`query = query.filter(~errors.ImportError.filename.in_(file_paths))`?
As written, the loop adds a separate filter for every file, so I think we
still have a performance issue here.
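
   One way to picture the idea is to split `file_paths` into plain paths and
zip paths, keep a single `NOT IN` for the plain ones, and add a prefix filter
only for each zip. The sketch below is illustrative only: it uses the
standard-library `sqlite3` module instead of Airflow's SQLAlchemy session, and
the `import_error` table, its `filename` column, and the helper name
`stale_import_error_filter` are assumptions, not the code in this PR.

```python
import sqlite3


def stale_import_error_filter(file_paths):
    """Build a WHERE clause selecting import errors for files that no longer exist.

    Hypothetical helper: plain paths share one NOT IN clause, and only
    zip paths get their own prefix filter.
    """
    zip_paths = [p for p in file_paths if p.endswith(".zip")]
    plain_paths = [p for p in file_paths if not p.endswith(".zip")]

    clauses, params = [], []
    if plain_paths:
        placeholders = ", ".join("?" for _ in plain_paths)
        clauses.append(f"filename NOT IN ({placeholders})")
        params.extend(plain_paths)
    for zip_path in zip_paths:
        # Prefix comparison, so errors from files inside the zip are kept.
        clauses.append("substr(filename, 1, length(?)) != ?")
        params.extend([zip_path, zip_path])

    # No paths at all means no filter, matching the `if file_paths:` guard.
    return " AND ".join(clauses) or "1 = 1", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE import_error (filename TEXT)")  # assumed schema
conn.executemany(
    "INSERT INTO import_error VALUES (?)",
    [
        ("/dags/a.py",),
        ("/dags/gone.py",),
        ("/dags/pkg.zip/inner.py",),
        ("/dags/old.zip/x.py",),
    ],
)

where, params = stale_import_error_filter(["/dags/a.py", "/dags/pkg.zip"])
stale = [
    row[0]
    for row in conn.execute(
        f"SELECT filename FROM import_error WHERE {where} ORDER BY filename",
        params,
    )
]
print(stale)
```

   With no zip in `file_paths` this collapses to the single `NOT IN` filter
that the diff removed, so the common case stays at one clause regardless of how
many files there are.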



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
