Currently, Samza exposes configuration of the form
"streams.%s.consumer.max.bytes.per.sec" for throttling the number of bytes per
second that a task reads from a stream. This is a feature request for
programmatic, fine-grained control over stream consumption. The use case is a
Samza task that consumes multiple streams, where some streams come from live
systems that have stricter SLA requirements and must always be prioritized over
other streams that come from batch systems. The above configuration is not an
ideal way to express this type of stream prioritization, because configuring
the "batch" streams with a low consumption rate caps their consumption, and so
reduces the overall throughput of the system, even when there is no data in
the "live" streams. Furthermore, we want to throttle each "batch" stream based
on external signals that can change over time. Because these signals are
dynamic, we would like a programmatic interface that can change the
prioritization as the signals change.
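
As a rough illustration of the kind of interface described above, here is a
minimal, language-neutral sketch written in Python. It is not an existing
Samza API: the class name PrioritizedChooser, its methods, and the stream
names are all hypothetical. It only shows the intended behaviour, namely that
the highest-priority stream with pending data is always read first, lower
priority streams run at full speed when the others are idle, and priorities
can be changed at runtime in response to an external signal.

```python
from collections import deque


class PrioritizedChooser:
    """Hypothetical sketch, not a Samza class.

    Buffers pending messages per stream and always hands out the next
    message from the highest-priority stream that has data.
    """

    def __init__(self):
        self._priority = {}  # stream name -> int, higher value wins
        self._queues = {}    # stream name -> deque of pending messages

    def set_priority(self, stream, priority):
        # May be called at any time, e.g. when an external signal changes.
        self._priority[stream] = priority
        self._queues.setdefault(stream, deque())

    def offer(self, stream, message):
        # Register a message that is available to be processed.
        if stream not in self._priority:
            raise KeyError("no priority set for stream: " + stream)
        self._queues[stream].append(message)

    def choose(self):
        # Return (stream, message) for the best non-empty stream, or None.
        ready = [s for s, q in self._queues.items() if q]
        if not ready:
            return None
        best = max(ready, key=lambda s: self._priority[s])
        return best, self._queues[best].popleft()


chooser = PrioritizedChooser()
chooser.set_priority("live-events", 10)    # hypothetical "live" stream
chooser.set_priority("batch-backfill", 1)  # hypothetical "batch" stream

chooser.offer("batch-backfill", "b1")
chooser.offer("live-events", "l1")
chooser.offer("batch-backfill", "b2")

first = chooser.choose()   # the live stream wins while it has data
second = chooser.choose()  # the batch stream is served once live is empty
```

Unlike a fixed bytes-per-second limit, nothing in this sketch slows the batch
stream down while the live stream is empty, and a later call to set_priority
is enough to reorder the streams.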

Thanks, Alan
