addu390 opened a new pull request, #2776:
URL: https://github.com/apache/fluss/pull/2776
### Purpose

Linked issue: close #2550

Add rate limit support for Spark streaming reads to control the number of offsets processed per micro-batch trigger.

### Brief change log

- Added `scan.max.offsets.per.trigger`, `scan.min.offsets.per.trigger`, and `scan.max.trigger.delay` config options in `SparkFlussConf`
- Overrode `getDefaultReadLimit` in `FlussMicroBatchStream` to return the appropriate `ReadLimit` based on the configured options
- Note: offset capping uses a proportional fair-share distribution across buckets. A simpler, more typical approach (`maxOffsets / numBuckets`) can be used instead, if that's preferred.

### Tests

- `SparkStreamingTest#read: log table with maxOffsetsPerTrigger rate limit`

### API and Format

New user-facing config options for the Spark DataFrameReader:

- `scan.max.offsets.per.trigger`
- `scan.min.offsets.per.trigger`
- `scan.max.trigger.delay`

### Documentation

N/A; the documentation update will be tracked separately.
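To illustrate the proportional fair-share idea mentioned above: the global `maxOffsets` cap is split across buckets in proportion to each bucket's available backlog, rather than evenly. The sketch below is a standalone illustration, not the PR's actual implementation; the class name `OffsetCap` and method `proportionalCap` are hypothetical, and the real code in `FlussMicroBatchStream` may differ in rounding and remainder handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OffsetCap {

    // Hypothetical sketch: distribute a global maxOffsets cap across buckets
    // proportionally to each bucket's available offsets (endOffset - startOffset).
    static Map<Integer, Long> proportionalCap(Map<Integer, Long> available, long maxOffsets) {
        long total = available.values().stream().mapToLong(Long::longValue).sum();
        Map<Integer, Long> caps = new LinkedHashMap<>();
        if (total <= maxOffsets) {
            // Backlog fits under the cap: every bucket can be read fully.
            caps.putAll(available);
            return caps;
        }
        for (Map.Entry<Integer, Long> e : available.entrySet()) {
            // Each bucket's share is proportional to its fraction of the total backlog;
            // truncation means the sum of shares may be slightly under maxOffsets.
            long share = (long) (maxOffsets * ((double) e.getValue() / total));
            caps.put(e.getKey(), Math.min(share, e.getValue()));
        }
        return caps;
    }

    public static void main(String[] args) {
        Map<Integer, Long> avail = new LinkedHashMap<>();
        avail.put(0, 100L); // bucket 0 has 100 offsets pending
        avail.put(1, 300L); // bucket 1 has 300 offsets pending
        // With a cap of 200, bucket 1 gets 3x bucket 0's share.
        System.out.println(proportionalCap(avail, 200L)); // prints {0=50, 1=150}
    }
}
```

Compared with a flat `maxOffsets / numBuckets` split, the proportional scheme avoids starving hot buckets while leaving cold buckets' unused quota on the table.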
