0lai0 opened a new pull request, #1120: URL: https://github.com/apache/mahout/pull/1120
### Related Issues

Closes #1109

Follow-up to [#1101 (comment)](https://github.com/apache/mahout/pull/1101#discussion_r2868916951)

### Changes

- [ ] Bug fix
- [ ] New feature
- [ ] Refactoring
- [x] Documentation
- [ ] Test
- [ ] CI/CD pipeline
- [ ] Other

### Why

This PR adds the missing documentation on null handling in the streaming reader path, as discussed in [#1101 (comment)](https://github.com/apache/mahout/pull/1101#discussion_r2868916951). Currently, `stream_encode()` hardcodes `NullHandling::FillZero` for `ParquetBlockReader` because the streaming API does not yet accept configuration options. To avoid confusion, this PR adds explicit docstrings that clarify this behavior for callers.

### How

- Added a **Null handling** section in `lib.rs` explaining that the streaming encoder replaces null values with `0.0` for backward compatibility.
- Added detailed documentation to the `stream_encode` function in `mod.rs`, noting Mahout's historical behavior and advising callers who require stricter validation to ensure their input data is free of nulls.

## Checklist

- [ ] Added or updated unit tests for all changes
- [x] Added or updated documentation for all changes
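For readers unfamiliar with the behavior being documented, here is a minimal, self-contained sketch of what a fill-zero policy does to a nullable column, together with the kind of caller-side check the new docstring recommends. It is not the Mahout implementation: the `NullHandling` enum below is a local stand-in, its `Reject` variant is an assumption added for contrast, and `resolve_nulls` and `is_null_free` are hypothetical helper names.

```rust
/// Local stand-in for the crate's null-handling policy (not the real type).
/// Only `FillZero` is named in this PR; `Reject` is an illustrative assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NullHandling {
    FillZero,
    Reject,
}

/// Hypothetical helper: flattens a nullable f64 column according to `policy`.
/// Under `FillZero`, every null becomes 0.0.
/// Under `Reject`, the index of the first null is returned as the error.
pub fn resolve_nulls(
    column: &[Option<f64>],
    policy: NullHandling,
) -> Result<Vec<f64>, usize> {
    column
        .iter()
        .enumerate()
        .map(|(i, v)| match (v, policy) {
            (Some(x), _) => Ok(*x),
            (None, NullHandling::FillZero) => Ok(0.0),
            (None, NullHandling::Reject) => Err(i),
        })
        .collect()
}

/// Hypothetical caller-side check: true when the column contains no nulls.
/// Callers who need strict validation could run a check like this on their
/// data before handing it to a streaming encoder that silently fills zeros.
pub fn is_null_free(column: &[Option<f64>]) -> bool {
    column.iter().all(Option::is_some)
}
```

With a column such as `[Some(1.5), None, Some(-2.0)]`, the fill-zero path yields `[1.5, 0.0, -2.0]`, so the null is indistinguishable from a genuine zero downstream. That is why the docstring steers callers who care about the difference toward validating their input first.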
