rahil-c opened a new pull request, #18729: URL: https://github.com/apache/hudi/pull/18729
## Summary

- Adds the fourth variant to the existing `vector_blob_demo/` family: a reproducible end-to-end certification of `hudi_vector_search_batch` (RFC-102) on a 1000-row corpus × 20-row query Hudi table.
- Result is asserted against a numpy ground-truth cosine distance matrix to within 1e-5; the script exits non-zero on any divergence.
- Ships both a standalone `.py` (canonical, scriptable, hooks into `run_demos.sh`) and a mirroring `.ipynb` notebook for live walkthroughs.

## Test plan

- [x] Smoke test: 100 × 5, k=3, parquet → `CERTIFIED ✓`, max delta 7.22e-08
- [x] Full: 1000 × 20, k=5, parquet → `CERTIFIED ✓`, max delta 1.22e-07
- [ ] Lance run (`HUDI_BASE_FILE_FORMAT=lance python hudi_vector_search_batch_demo.py`)
- [ ] Notebook Run All renders the panel + prints `CERTIFIED ✓`

🤖 Generated with [Claude Code](https://claude.com/claude-code)
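
For reviewers who want a feel for the certification step described in the summary, here is a minimal numpy-only sketch of the ground-truth comparison. It is not the code in this PR. The helper names (`cosine_distance_matrix`, `top_k`, `certify`), the embedding dimension of 32, the random data, and the float32 recomputation standing in for the engine's output are all assumptions made for illustration; the real script would compare against the rows returned by `hudi_vector_search_batch`.

```python
import sys

import numpy as np


def cosine_distance_matrix(queries, corpus):
    """Return the (n_queries, n_corpus) matrix of 1 - cosine similarity."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return 1.0 - q @ c.T


def top_k(dist, k):
    """Return (indices, distances) of the k nearest corpus rows per query, ascending."""
    idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)


def certify(reported_dist, truth_dist, tol=1e-5):
    """Return (passed, max_delta) for reported top-k distances vs. ground truth."""
    max_delta = float(np.max(np.abs(reported_dist - truth_dist)))
    return max_delta <= tol, max_delta


# Hypothetical data: 1000 corpus rows, 20 query rows, dimension 32 (assumed).
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 32))
queries = rng.normal(size=(20, 32))

# Ground truth computed in float64.
truth_idx, truth_dist = top_k(cosine_distance_matrix(queries, corpus), k=5)

# Stand-in for the engine's answer: the same computation in float32.
# In the real demo these distances would come from the search results.
reported_idx, reported_dist = top_k(
    cosine_distance_matrix(queries.astype(np.float32), corpus.astype(np.float32)),
    k=5,
)

passed, max_delta = certify(reported_dist.astype(np.float64), truth_dist)
print("CERTIFIED" if passed else "DIVERGED", f"max delta {max_delta:.2e}")
if not passed:
    sys.exit(1)  # non-zero exit on any divergence beyond the tolerance
```

Comparing the sorted top-k distances rather than the row indices keeps the check stable when two corpus rows are nearly equidistant from a query, since a swap in rank then changes the distances by less than the tolerance.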
