[ https://issues.apache.org/jira/browse/SPARK-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-19114:
------------------------------
    Priority: Minor  (was: Major)
  Issue Type: Improvement  (was: Bug)
     Summary: Backpressure could support non-integral rates (< 1)  (was: Backpressure rate is cast from double to long to double)

I think the assumption that it's a long (therefore, at least 1 event per second) is embedded throughout this code; it's not just one instance of casting to a long. It does sound like quite a corner case, though: using Spark Streaming to process well under 1 event per second. Is it really a streaming use case? That's not to say it couldn't be fixed, but it would take some surgery.

> Backpressure could support non-integral rates (< 1)
> ---------------------------------------------------
>
>                 Key: SPARK-19114
>                 URL: https://issues.apache.org/jira/browse/SPARK-19114
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Tony Novak
>            Priority: Minor
>
> We have a Spark streaming job where each record takes well over a second to
> execute, so the stable rate is under 1 element/second. We set
> spark.streaming.backpressure.enabled=true and
> spark.streaming.backpressure.pid.minRate=0.1, but backpressure did not appear
> to be effective, even though the TRACE-level logs from PIDRateEstimator
> showed that the new rate was 0.1.
>
> As it turns out, even though the minRate parameter is a Double, and the rate
> estimate generated by PIDRateEstimator is a Double as well, RateController
> casts the new rate to a Long. As a result, if the computed rate is less than
> 1, it's truncated to 0, which ends up being interpreted as "no limit".
>
> What's particularly confusing is that the Guava RateLimiter class takes a
> rate limit as a double, so the long value ends up being cast back to a double.
> Is there any reason not to keep the rate limit as a double all the way
> through? I'm happy to create a pull request if this makes sense.
>
> We encountered the bug on Spark 1.6.2, but it looks like the code in the
> master branch is still affected.
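The truncation the reporter describes can be shown in isolation. The sketch below is not Spark's actual RateController code; it is a minimal Java illustration (with hypothetical variable names) of what happens when a sub-1.0 rate estimate passes through a narrowing cast to long before being widened back to double:

```java
public class RateCastDemo {
    public static void main(String[] args) {
        // Hypothetical rate estimate below 1 element/second,
        // e.g. the configured minRate of 0.1 from the report.
        double estimatedRate = 0.1;

        // A narrowing cast to long truncates toward zero:
        // any value in (0, 1) becomes 0.
        long truncated = (long) estimatedRate;

        // Widening back to double (as when handing the limit to a
        // RateLimiter that accepts a double) cannot recover the value.
        double effectiveRate = (double) truncated;

        System.out.println(truncated);      // prints 0
        System.out.println(effectiveRate);  // prints 0.0
    }
}
```

A rate of 0 is then interpreted downstream as "no limit", which matches the observed behavior: backpressure appears disabled even though the estimator correctly computed 0.1.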
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org