rawwar commented on issue #35529:
URL: https://github.com/apache/airflow/issues/35529#issuecomment-1927358607

   > 2. Make the original DAG available by copying it to the temporary
   > directory and renaming it to the modified module name (which is
   > `unusual_prefix_*`). This temporary directory is created by the
   > `PythonVirtualenvOperator`, which also copies the inputs, pickled args,
   > kwargs, etc. into it.
   
   I don't have a clear solution for this approach, but I think I understand
   the problem and can probably explain it here.
   
   
   If we want to allow the Python callable to be defined as part of the DAG
   module, we should make the original module available whenever we are
   serializing and deserializing objects. But since we are modifying the
   module's name and re-importing it, we should try to re-import it with the
   modified name in
   [python_virtualenv_script.jinja2](https://github.com/apache/airflow/blob/51419a789acb80c822e4f74a846f71f3aa00ffe2/airflow/utils/python_virtualenv_script.jinja2).
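
To illustrate why the module has to be importable under its modified name, here is a minimal sketch using the standard-library `pickle` as a stand-in for whatever serializer is configured. `pickle` stores a plain function by reference (module name plus qualified name), so loading it fails unless that module name can be imported again. The file name `my_dag.py`, the helper `load_under_name`, and the shortened `MODIFIED_NAME` are hypothetical and only serve the demonstration; this is not Airflow's own code.

```python
import importlib.util
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical, shortened stand-in for the modified DAG module name.
MODIFIED_NAME = "unusual_prefix_0123abcd_my_dag"


def load_under_name(name, path):
    """Import the file at `path` and register it in sys.modules as `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    dag_file = Path(tmp) / "my_dag.py"
    dag_file.write_text("def my_callable():\n    return 42\n")

    func = load_under_name(MODIFIED_NAME, dag_file).my_callable
    # The payload holds only the module name and the function's qualified name.
    payload = pickle.dumps(func)

    # Simulate a fresh interpreter that has never imported the DAG file.
    del sys.modules[MODIFIED_NAME]
    try:
        pickle.loads(payload)
        unpickle_failed = False
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        unpickle_failed = True

    # Re-importing under the modified name makes the payload loadable again.
    load_under_name(MODIFIED_NAME, dag_file)
    result = pickle.loads(payload)()

    # Clean up the demo module.
    sys.modules.pop(MODIFIED_NAME, None)
```

The failing `pickle.loads` call corresponds to the situation inside the virtualenv script, where the DAG file has not been imported under the `unusual_prefix_*` name.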
   
   So, the condition to detect whether `python_callable` is part of the DAG
   module would be to check it using `check_callable_in_dag_module`, as below:
   
   ```
   import hashlib
   import inspect
   from pathlib import Path
   
   MODIFIED_DAG_MODULE_NAME = "unusual_prefix_{path_hash}_{module_name}"
   
   
   def get_unique_dag_module_name(file_path: str) -> str:
       """Return unusual_prefix_{sha1 of the module's file path}_{original module name}."""
       if isinstance(file_path, str):
           path_hash = hashlib.sha1(file_path.encode("utf-8")).hexdigest()
           org_mod_name = Path(file_path).stem
           return MODIFIED_DAG_MODULE_NAME.format(path_hash=path_hash, module_name=org_mod_name)
       raise ValueError("file_path should be a string to generate unique module name")
   
   
   def check_callable_in_dag_module(python_callable) -> bool:
       """Return True if the callable's module carries the modified DAG module name."""
       # The callable lives in the DAG file if its __module__ matches the name
       # derived from the path of the file that defines it.
       expected = get_unique_dag_module_name(inspect.getfile(python_callable))
       return expected == python_callable.__module__
   ```
   
   If the above condition is met, we should copy the original module's source
   code to the temp directory where the `PythonVirtualenvOperator` will execute
   the code.
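
A rough sketch of that copy step, assuming the temp directory ends up on `sys.path` when the generated script runs. The helper name `copy_module_for_virtualenv` and the demo file `my_dag.py` are hypothetical; how the operator would actually wire this in is not decided here.

```python
import hashlib
import importlib
import importlib.util
import inspect
import shutil
import sys
import tempfile
from pathlib import Path


def copy_module_for_virtualenv(python_callable, tmp_dir):
    """Copy the callable's source file into tmp_dir as <callable.__module__>.py."""
    source = Path(inspect.getfile(python_callable))
    target = Path(tmp_dir) / f"{python_callable.__module__}.py"
    shutil.copyfile(source, target)
    return target


with tempfile.TemporaryDirectory() as dag_dir, tempfile.TemporaryDirectory() as venv_tmp:
    # Hypothetical DAG file, loaded under a modified name as in the snippet above.
    dag_file = Path(dag_dir) / "my_dag.py"
    dag_file.write_text("def my_callable():\n    return 'ran in copy'\n")
    path_hash = hashlib.sha1(str(dag_file).encode("utf-8")).hexdigest()
    modified_name = f"unusual_prefix_{path_hash}_my_dag"

    spec = importlib.util.spec_from_file_location(modified_name, dag_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Copy the source next to the other inputs, named after the modified module.
    copied = copy_module_for_virtualenv(module.my_callable, venv_tmp)

    # With the temp directory on sys.path, the modified name is importable.
    sys.path.insert(0, venv_tmp)
    importlib.invalidate_caches()
    try:
        reimported = importlib.import_module(modified_name)
        output = reimported.my_callable()
    finally:
        sys.path.remove(venv_tmp)
        sys.modules.pop(modified_name, None)
```

Naming the copy after `python_callable.__module__` means the deserializer can resolve the callable without any change to how it was serialized.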
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.