tub opened a new pull request, #7342: URL: https://github.com/apache/paimon/pull/7342
## Summary

PR 1 of 3 for pure-Python streaming reads. This PR adds foundational infrastructure:

- **Follow-up scanners** (delta, changelog, incremental diff) for continuous snapshot polling
- **Consumer manager** for persisting read progress to the table path
- **LRU caching** for snapshots, manifests, and manifest lists
- **Batch existence checks** for efficient file IO
- **Bucket-based sharding** params in FileScanner for parallel consumption
- **Row kind support** in table reads
- **Streaming-related core options**
- **Backtick support** for identifier parsing

25 files changed, +2701 / -31 lines

## PR Stack

1. **👉 this PR**: Streaming infrastructure (scanners, consumers, caching, sharding)
2. Core streaming (StreamReadBuilder, AsyncStreamingTableScan, table integration)
3. CLI (`paimon tail` command)

**Merge workflow:** Merge PR 1, rebase PR 2 onto updated master (PR 1 commits drop out), merge PR 2, repeat for PR 3.

## Test plan

- [x] `python -m pytest pypaimon/tests`: 537 passed (9 pre-existing lance failures)
- [x] `python -c "from pypaimon import CatalogFactory"`: no import errors
- [x] Unit tests for all new scanners, consumer manager, manifest caching, identifier parsing
- [x] Integration tests for FileScanner shard filtering

🤖 Generated with [Claude Code](https://claude.com/claude-code)
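
## Illustration: bucket-based shard filtering

To make the bucket-based sharding item in the summary concrete, here is a minimal sketch of how buckets could be split across parallel consumers. It assumes a simple modulo assignment rule; the PR description does not state the actual rule, and `belongs_to_shard`, `filter_buckets`, `shard_index`, and `shard_count` are hypothetical names, not the parameters FileScanner exposes.

```python
from typing import Iterable, List


def belongs_to_shard(bucket: int, shard_index: int, shard_count: int) -> bool:
    """Return True if `bucket` falls in the given shard.

    Assumption: buckets are assigned by `bucket % shard_count`.
    The real FileScanner rule may differ.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    if not 0 <= shard_index < shard_count:
        raise ValueError("shard_index must be in [0, shard_count)")
    return bucket % shard_count == shard_index


def filter_buckets(
    buckets: Iterable[int], shard_index: int, shard_count: int
) -> List[int]:
    """Keep only the buckets that one consumer (shard) should read."""
    return [b for b in buckets if belongs_to_shard(b, shard_index, shard_count)]


if __name__ == "__main__":
    # Three hypothetical consumers splitting eight buckets between them.
    for idx in range(3):
        print(idx, filter_buckets(range(8), idx, 3))
```

Under this assumed rule every bucket maps to exactly one shard, so parallel consumers would read disjoint sets of files and together cover the whole table.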
