spock-yh opened a new issue, #1484:
URL: https://github.com/apache/hamilton/issues/1484

   ## Description
   
   `is_submodule()` in `hamilton/graph_utils.py` (line 24) uses a substring 
match (`in`) instead of a proper prefix match:
   
   ```python
   def is_submodule(child: ModuleType, parent: ModuleType):
       return parent.__name__ in child.__name__
   ```
   
   This means that if a user module's name happens to be a substring of an 
imported function's module path, Hamilton will incorrectly treat that imported 
function as part of the user's DAG.
   
   ## Reproduction
   
   Create a user module named `modifiers.py`:
   
   ```python
   from hamilton.function_modifiers import source, value
   
   def my_func(input_data: int) -> int:
       return input_data * 2
   ```
   
   Then inspect what `find_functions` discovers:
   
   ```python
   import modifiers
   from hamilton.graph_utils import find_functions
   
   functions = find_functions(modifiers)
   for name, fn in functions:
       print(f'{name} (from {fn.__module__})')
   ```
   
   **Expected output:**
   ```
   my_func (from modifiers)
   ```
   
   **Actual output:**
   ```
   my_func (from modifiers)
   source (from hamilton.function_modifiers.dependencies)
   value (from hamilton.function_modifiers.dependencies)
   ```
   
   The `source` and `value` functions are included because `"modifiers" in 
"hamilton.function_modifiers.dependencies"` evaluates to `True`.
   
   ## Root Cause
   
   `is_submodule(child, parent)` checks `parent.__name__ in child.__name__`, 
which is a substring test. For the case above:
   - `parent.__name__` = `"modifiers"` (the user module)
   - `child.__name__` = `"hamilton.function_modifiers.dependencies"` (where 
`source` is defined)
   - `"modifiers" in "hamilton.function_modifiers.dependencies"` → `True` 
(incorrect)
   
   This affects any user module whose name is a substring of the module path of 
any imported function. Module names that would trigger it include `function`, 
`modifiers`, `ton`, and `ilton`.
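
The substring behaviour can be checked in isolation, without importing Hamilton. The following is a minimal sketch: `is_submodule_substring` is a local copy of the check quoted above, and the `ModuleType` objects are stand-ins that only carry a `__name__`.

```python
from types import ModuleType


def is_submodule_substring(child: ModuleType, parent: ModuleType) -> bool:
    # Local copy of the check quoted in this issue: a plain substring test.
    return parent.__name__ in child.__name__


# Stand-in for the module where `source` and `value` are defined.
defining_module = ModuleType("hamilton.function_modifiers.dependencies")

# Each of these user module names is a substring of the path above.
user_module_names = ["modifiers", "function", "ton", "ilton"]
false_positives = [
    name
    for name in user_module_names
    if is_submodule_substring(defining_module, ModuleType(name))
]

print(false_positives)  # ['modifiers', 'function', 'ton', 'ilton']
```

None of these names is a parent package of `hamilton.function_modifiers.dependencies`, yet all four pass the check.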
   
   ## Suggested Fix
   
   Replace the substring check with a proper prefix/equality check:
   
   ```python
   def is_submodule(child: ModuleType, parent: ModuleType):
       return child.__name__ == parent.__name__ or child.__name__.startswith(
           parent.__name__ + "."
       )
   ```
   
   This ensures that `child` is truly a submodule of (or the same module as) 
`parent`, not just any module whose fully-qualified name happens to contain 
`parent`'s name as a substring.
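
As a quick sanity check of the suggested predicate, the sketch below runs it over a few cases. `is_submodule_fixed` is a local copy of the proposal, and the module names `modifiers.helpers` and `modifiers_extra` are hypothetical, chosen only to exercise the true-submodule case and the shared-prefix-without-a-dot case.

```python
from types import ModuleType


def is_submodule_fixed(child: ModuleType, parent: ModuleType) -> bool:
    # Local copy of the suggested fix: same module, or a dotted descendant.
    return child.__name__ == parent.__name__ or child.__name__.startswith(
        parent.__name__ + "."
    )


# (child name, parent name, expected result)
cases = [
    ("modifiers", "modifiers", True),           # same module
    ("modifiers.helpers", "modifiers", True),   # hypothetical true submodule
    ("hamilton.function_modifiers.dependencies", "modifiers", False),
    ("modifiers_extra", "modifiers", False),    # shares a prefix, no dot
]

results = [
    is_submodule_fixed(ModuleType(child), ModuleType(parent))
    for child, parent, _ in cases
]

for (child, parent, expected), result in zip(cases, results):
    assert result == expected, (child, parent)
```

Appending `"."` before calling `startswith` is what rules out the last case; a bare `startswith(parent.__name__)` would still accept `modifiers_extra`.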
   
   ## Impact
   
   Any user module whose name appears as a substring in the module path of an 
imported symbol will have those imported symbols incorrectly added as DAG 
nodes. This can lead to:
   - Unexpected nodes appearing in the graph
   - Potential name collisions and incorrect graph topology
   - Confusing errors that are hard to diagnose
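
To illustrate how the extra nodes appear, here is a self-contained simulation. It is a simplified stand-in for function discovery, not Hamilton's actual `find_functions`: the `discover` helper, the two predicates, and the synthetic modules are all assumptions made for this sketch.

```python
import inspect
from types import ModuleType

# Synthetic library module that defines `source`.
library = ModuleType("hamilton.function_modifiers.dependencies")
exec("def source(name):\n    return name", library.__dict__)

# Synthetic user module named `modifiers` with one function of its own.
user = ModuleType("modifiers")
exec("def my_func(input_data: int) -> int:\n    return input_data * 2", user.__dict__)
user.source = library.source  # simulates `from ... import source`


def substring_check(child_name: str, parent_name: str) -> bool:
    return parent_name in child_name


def prefix_check(child_name: str, parent_name: str) -> bool:
    return child_name == parent_name or child_name.startswith(parent_name + ".")


def discover(module: ModuleType, belongs) -> list:
    # Keep public functions whose defining module passes the given check.
    return sorted(
        name
        for name, fn in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_") and belongs(fn.__module__, module.__name__)
    )


print(discover(user, substring_check))  # ['my_func', 'source']
print(discover(user, prefix_check))     # ['my_func']
```

With the substring check, the imported `source` is picked up as if the user had defined it; with the prefix check, only `my_func` remains.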


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
