Dev-iL opened a new pull request, #1509:
URL: https://github.com/apache/hamilton/pull/1509

   ### Motivation
   
   Hamilton's data quality validators (`DataValidator`, `BaseDefaultValidator`) 
are synchronous-only. Under `AsyncDriver`, the validation wrappers that 
`@check_output` / `@check_output_custom` create are always plain `def` 
functions, even when the underlying validator needs to do async work. The 
`AsyncGraphAdapter` calls `asyncio.iscoroutinefunction(fn)` to decide whether 
to `await` a node callable, so a sync wrapper around an async `validate()` is 
never awaited: it silently returns a coroutine object instead of a 
`ValidationResult`, corrupting downstream results.
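
The failure mode can be reproduced without Hamilton. The snippet below is a minimal sketch using only the standard library; `FakeValidator` and `sync_wrapper` are hypothetical stand-ins, not Hamilton classes, and the validator returns a plain `bool` rather than a `ValidationResult` to keep the example self-contained.

```python
import asyncio
import inspect


class FakeValidator:
    """Hypothetical stand-in for a validator whose validate() is async."""

    async def validate(self, dataset: int) -> bool:
        return dataset > 0


def sync_wrapper(validator, dataset):
    # Plain `def`: the coroutine from validate() is returned, never awaited.
    return validator.validate(dataset)


# A caller that inspects the wrapper sees a sync function and will not await it.
wrapper_is_async = inspect.iscoroutinefunction(sync_wrapper)

# What the caller actually receives is a coroutine object, not a result.
leaked = sync_wrapper(FakeValidator(), 5)
leaked_is_coroutine = inspect.iscoroutine(leaked)

# Only an explicit await (here via asyncio.run) produces the real value.
resolved = asyncio.run(leaked)
```

Here `wrapper_is_async` is `False` while `leaked_is_coroutine` is `True`, which is the mismatch described above.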
   
   #### Minimal example
   
   ```python
   # my_module.py
   from hamilton.data_quality.base import AsyncDataValidator, ValidationResult
   from hamilton.function_modifiers import check_output_custom
   
   
   class AsyncPositiveValidator(AsyncDataValidator):
       def __init__(self):
           super().__init__(importance="fail")
   
       def applies_to(self, datatype):
           return datatype == int
   
       def description(self):
           return "Value must be positive"
   
       @classmethod
       def name(cls):
           return "positive_validator"
   
       async def validate(self, dataset: int) -> ValidationResult:
           # async validation logic (e.g. await db_check(dataset))
           return ValidationResult(
               passes=dataset > 0,
               message=f"{dataset} is {'positive' if dataset > 0 else 'not positive'}",
           )
   
   
   @check_output_custom(AsyncPositiveValidator())
   async def doubled(input_value: int) -> int:
       return input_value * 2
   
   
   # main.py
   import asyncio
   
   import my_module
   from hamilton import async_driver, base
   
   
   async def main():
       dr = async_driver.AsyncDriver({}, my_module, result_builder=base.DictResult())
       result = await dr.execute(final_vars=["doubled"], inputs={"input_value": 5})
       # result == {"doubled": 10}
       return result
   
   
   asyncio.run(main())
   ```
   
   ## Changes
   **New base classes** (`hamilton/data_quality/base.py`):
   - `AsyncDataValidator` — async variant of `DataValidator` with `async def 
validate()`
   - `AsyncBaseDefaultValidator` — async variant of `BaseDefaultValidator` for 
use with `@check_output`
   - `is_async_validator()` helper using `inspect.iscoroutinefunction` for 
robust detection
   
   **Async-aware wrapper generation** 
(`hamilton/function_modifiers/validation.py`):
   - `transform_node()` now detects async validators and creates `async def` 
wrappers that `await` the validator's `validate()` call
   - Sync validators get a runtime guard that raises a clear `TypeError` if a 
coroutine is accidentally returned
   - Follows the established Hamilton pattern used in `expanders.py`, 
`macros.py`, and `recursive.py`
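
The wrapper selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `validation.py`: `make_validation_wrapper`, the three toy validator classes, the error message text, and the `close()` call on the stray coroutine are all hypothetical.

```python
import asyncio
import inspect


def make_validation_wrapper(validator):
    """Return an async or sync wrapper depending on validator.validate.

    Hypothetical helper for illustration only.
    """
    if inspect.iscoroutinefunction(validator.validate):

        async def async_wrapper(data):
            # Async validator: await validate() so the caller gets a real result.
            return await validator.validate(data)

        return async_wrapper

    def sync_wrapper(data):
        result = validator.validate(data)
        # Misuse guard: a sync validate() must not hand back a coroutine.
        if inspect.iscoroutine(result):
            result.close()  # avoid a "never awaited" RuntimeWarning
            raise TypeError(
                f"{type(validator).__name__}.validate() returned a coroutine; "
                "define it with `async def` so it can be awaited."
            )
        return result

    return sync_wrapper


class SyncPositive:
    def validate(self, data):
        return data > 0


class AsyncPositive:
    async def validate(self, data):
        return data > 0


class Misused:
    def validate(self, data):
        # Sync method that accidentally returns a coroutine.
        return AsyncPositive().validate(data)


sync_result = make_validation_wrapper(SyncPositive())(3)
async_result = asyncio.run(make_validation_wrapper(AsyncPositive())(3))
```

Calling `make_validation_wrapper(Misused())(3)` raises `TypeError` in this sketch, mirroring the runtime guard for sync validators.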
   
   **Documentation updates**:
   - `writeups/data_quality.md` — new "Async Validators" section with full 
examples
   - `docs/concepts/function-modifiers.rst` — mentions async validator base 
classes
   - `docs/reference/decorators/check_output.rst` — API reference entries for 
async classes
   - `docs/how-tos/run-data-quality-checks.rst` — async validators subsection
   - `examples/async/README.md` — data quality with async section
   
   ## How I tested this
   - Unit tests verifying async wrapper creation, mixed sync/async validators, 
and the misuse guard
   - End-to-end tests with `AsyncDriver` for async, sync, and mixed validator 
scenarios
   
   ## Notes
   - `AsyncDataValidator` inherits from `DataValidator`, so all existing 
`isinstance` checks and type hints work unchanged.
   - Detection uses `inspect.iscoroutinefunction(validator.validate)` rather 
than `isinstance`, catching any validator with an async `validate` regardless 
of class hierarchy.
   - `final_node_callable` (which aggregates validation results) remains sync — 
the `AsyncGraphAdapter` awaits all kwargs before calling it, so results are 
already resolved.
   - All changes are additive with no breaking API modifications. Existing sync 
validators continue to work identically.
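
The first two notes can be illustrated with stand-in classes. This is a sketch, assuming the helper simply inspects the bound `validate` method; the classes below are simplified placeholders that share names with the real ones but none of their behavior, and `DuckTypedAsync` is hypothetical.

```python
import inspect


class DataValidator:
    """Simplified placeholder for hamilton.data_quality.base.DataValidator."""

    def validate(self, dataset):
        raise NotImplementedError


class AsyncDataValidator(DataValidator):
    """Placeholder for the async variant: same hierarchy, async validate()."""

    async def validate(self, dataset):
        raise NotImplementedError


class DuckTypedAsync(DataValidator):
    """Subclasses the sync base directly but still defines async validate()."""

    async def validate(self, dataset):
        return dataset is not None


def is_async_validator(validator) -> bool:
    # Assumed shape of the helper: inspect the method, not the class.
    return inspect.iscoroutinefunction(validator.validate)


checks = {
    # Existing isinstance checks keep working for async validators.
    "async_is_datavalidator": isinstance(AsyncDataValidator(), DataValidator),
    # Detection succeeds for the dedicated async base class...
    "async_detected": is_async_validator(AsyncDataValidator()),
    # ...and for any other class that happens to define async validate().
    "duck_typed_detected": is_async_validator(DuckTypedAsync()),
    # Plain sync validators are not flagged.
    "sync_detected": is_async_validator(DataValidator()),
}
```

Inspecting the method rather than the class is what lets `DuckTypedAsync` be picked up even though it never inherits from the async base.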
   
   ## Checklist
   
   - [x] PR has an informative and human-readable title (this will be pulled 
into the release notes)
   - [x] Changes are limited to a single goal (no scope creep)
   - [x] Code passed the pre-commit check & code is left cleaner/nicer than 
when first encountered.
   - [x] Any _change_ in functionality is tested
   - [x] New functions are documented (with a description, list of inputs, and 
expected output)
   - [ ] Placeholder code is flagged / future TODOs are captured in comments
   - [x] Project documentation has been updated if adding/changing 
functionality.
   

