Hi everyone,

I would like to start a discussion on a new FLIP (FLIP-XXX: Table Planner Source Filter Reuse).

Brief background: today, scans of the same table with different FilterPushDownSpec values produce independent source readers because their digests differ. For sources where scans are expensive (BigQuery Storage Read API sessions, JDBC query execution, etc.), this results in multiple source scans when one would suffice.

We have a public draft of the FLIP [1], as well as a working prototype on our internal fork (to be shared soon and linked in this thread).

The open question from the FLIP on which I'd most value early feedback is the optimization's configuration scope: "Should this optimization remain a job-level flag consistent with the established pattern, or should we pursue finer-grained scope (per-table or per-scan) for v1?"

Thanks a ton in advance for the feedback,
Daniel

[1] https://docs.google.com/document/d/1CcdogFWShLdybEBhRNvu4E7zSc0ep3hJQIlsDCnr_nc/edit?usp=sharing
