spock-yh opened a new issue, #1484:
URL: https://github.com/apache/hamilton/issues/1484
## Description
`is_submodule()` in `hamilton/graph_utils.py` (line 24) uses a substring
match (`in`) instead of a proper prefix match:
```python
def is_submodule(child: ModuleType, parent: ModuleType):
    return parent.__name__ in child.__name__
```
This means that if a user module's name happens to be a substring of an
imported function's module path, Hamilton will incorrectly treat that imported
function as part of the user's DAG.
## Reproduction
Create a user module named `modifiers.py`:
```python
from hamilton.function_modifiers import source, value


def my_func(input_data: int) -> int:
    return input_data * 2
```
Then inspect what `find_functions` discovers:
```python
import modifiers
from hamilton.graph_utils import find_functions

functions = find_functions(modifiers)
for name, fn in functions:
    print(f'{name} (from {fn.__module__})')
```
**Expected output:**
```
my_func (from modifiers)
```
**Actual output:**
```
my_func (from modifiers)
source (from hamilton.function_modifiers.dependencies)
value (from hamilton.function_modifiers.dependencies)
```
The `source` and `value` functions are included because `"modifiers" in
"hamilton.function_modifiers.dependencies"` evaluates to `True`.
## Root Cause
`is_submodule(child, parent)` checks `parent.__name__ in child.__name__`,
which is a substring test. For the case above:
- `parent.__name__` = `"modifiers"` (the user module)
- `child.__name__` = `"hamilton.function_modifiers.dependencies"` (where
`source` is defined)
- `"modifiers" in "hamilton.function_modifiers.dependencies"` → `True`
(incorrect)
This affects any user module whose name is a substring of the module path of
any function it imports. For example, user modules named `function`,
`modifiers`, `ton`, or `ilton` would all match `hamilton.function_modifiers`.
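The false positives can be checked without installing Hamilton. The following is a minimal sketch that copies the substring logic quoted above into a local helper (`is_submodule_substring` is a name chosen here for illustration, not a Hamilton function) and uses bare `ModuleType` objects as stand-ins, since only `__name__` is consulted:

```python
from types import ModuleType


def is_submodule_substring(child: ModuleType, parent: ModuleType) -> bool:
    # Local copy of the check quoted in this issue: plain substring containment.
    return parent.__name__ in child.__name__


# Stand-in for the module where `source` and `value` are defined.
hamilton_deps = ModuleType("hamilton.function_modifiers.dependencies")

# Candidate user module names taken from the examples in this issue.
candidates = ["function", "modifiers", "ton", "ilton"]

false_positives = [
    name
    for name in candidates
    if is_submodule_substring(hamilton_deps, ModuleType(name))
]
print(false_positives)  # every candidate is wrongly reported as a parent
```

None of these names is an actual parent package of `hamilton.function_modifiers.dependencies`, yet the substring test accepts all four.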
## Suggested Fix
Replace the substring check with a proper prefix/equality check:
```python
def is_submodule(child: ModuleType, parent: ModuleType):
    return (
        child.__name__ == parent.__name__
        or child.__name__.startswith(parent.__name__ + ".")
    )
```
This ensures that `child` is truly a submodule of (or the same module as)
`parent`, not just any module whose fully-qualified name happens to contain
`parent`'s name as a substring.
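As a rough sanity check of the proposed behavior, the sketch below exercises the prefix/equality logic on a few cases. It is not taken from Hamilton's test suite: `is_submodule_fixed` is a local stand-in for the patched function, and the module names `modifiers.helpers` and `modifiers_extra` are hypothetical, included only to cover a true submodule and a sibling that merely shares a prefix.

```python
from types import ModuleType


def is_submodule_fixed(child: ModuleType, parent: ModuleType) -> bool:
    # Same module, or child's dotted path starts with "<parent>.".
    return (
        child.__name__ == parent.__name__
        or child.__name__.startswith(parent.__name__ + ".")
    )


# (child name, parent name, expected result)
cases = [
    ("modifiers", "modifiers", True),            # same module
    ("modifiers.helpers", "modifiers", True),    # hypothetical true submodule
    ("hamilton.function_modifiers.dependencies", "modifiers", False),  # the reported bug
    ("modifiers_extra", "modifiers", False),     # shares a prefix, but no dot boundary
]

results = [
    is_submodule_fixed(ModuleType(child), ModuleType(parent))
    for child, parent, _ in cases
]
for (child, parent, expected), actual in zip(cases, results):
    assert actual is expected, (child, parent)
```

Appending `"."` before calling `startswith` is what rules out the `modifiers_extra` case; a bare `startswith(parent.__name__)` would still accept it.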
## Impact
Any user module whose name appears as a substring in the module path of an
imported symbol will have those imported symbols incorrectly added as DAG
nodes. This can lead to:
- Unexpected nodes appearing in the graph
- Potential name collisions and incorrect graph topology
- Confusing errors that are hard to diagnose