[ https://issues.apache.org/jira/browse/SPARK-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-19114:
------------------------------
      Priority: Minor  (was: Major)
    Issue Type: Improvement  (was: Bug)
       Summary: Backpressure could support non-integral rates (< 1)  (was: Backpressure rate is cast from double to long to double)

I think the assumption that the rate is a long (and therefore at least 1 event 
per second) is embedded throughout this code; it's not just one instance of 
casting to a long.

It does sound like quite a corner case, though: using Spark Streaming to 
process well under 1 event per second. Is it really a streaming use case?

That's not to say it couldn't be fixed, but it would take some surgery.
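
For illustration, a minimal sketch of the truncation described in the report 
below. The object and value names are hypothetical; this is not the actual 
Spark source, just the cast behavior in isolation:

{code:scala}
object RateTruncationSketch {
  def main(args: Array[String]): Unit = {
    // The PID estimator produces a fractional rate (elements/second)...
    val estimatedRate: Double = 0.1

    // ...but storing it as a Long truncates anything below 1 down to 0.
    val publishedRate: Long = estimatedRate.toLong

    // Downstream, a published rate of 0 reads as "no limit", so
    // backpressure silently disables itself for sub-1-element/second jobs.
    val effectiveLimit: Double =
      if (publishedRate > 0) publishedRate.toDouble else Double.PositiveInfinity

    println(s"estimated=$estimatedRate published=$publishedRate effective=$effectiveLimit")
    // estimated=0.1 published=0 effective=Infinity
  }
}
{code}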

> Backpressure could support non-integral rates (< 1)
> ---------------------------------------------------
>
>                 Key: SPARK-19114
>                 URL: https://issues.apache.org/jira/browse/SPARK-19114
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Tony Novak
>            Priority: Minor
>
> We have a Spark streaming job where each record takes well over a second to 
> execute, so the stable rate is under 1 element/second. We set 
> spark.streaming.backpressure.enabled=true and 
> spark.streaming.backpressure.pid.minRate=0.1, but backpressure did not appear 
> to be effective, even though the TRACE level logs from PIDRateEstimator 
> showed that the new rate was 0.1.
>
> As it turns out, even though the minRate parameter is a Double, and the rate 
> estimate generated by PIDRateEstimator is a Double as well, RateController 
> casts the new rate to a Long. As a result, if the computed rate is less than 
> 1, it's truncated to 0, which ends up being interpreted as "no limit".
>
> What's particularly confusing is that the Guava RateLimiter class takes a 
> rate limit as a double, so the long value ends up being cast back to a double.
>
> Is there any reason not to keep the rate limit as a double all the way 
> through? I'm happy to create a pull request if this makes sense.
>
> We encountered the bug on Spark 1.6.2, but it looks like the code in the 
> master branch is still affected.
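
As an aside, the quoted point about Guava holds: RateLimiter takes its rate as 
a double, so fractional rates like 0.1 permits/second work natively, and 
keeping a Double end to end would be enough on that side. A minimal standalone 
sketch (hypothetical object name, not Spark code):

{code:scala}
import com.google.common.util.concurrent.RateLimiter

object FractionalLimitSketch {
  def main(args: Array[String]): Unit = {
    // Guava takes permits-per-second as a double, so a rate of 0.1
    // (one permit every ten seconds) is directly representable.
    val limiter = RateLimiter.create(0.1)

    limiter.acquire() // the first permit is granted immediately
    limiter.acquire() // blocks for roughly ten seconds

    println(s"configured rate = ${limiter.getRate}") // 0.1
  }
}
{code}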


