hehuiyuan commented on a change in pull request #23999: [docs]Add additional 
explanation for "Setting the max receiving rate" in 
streaming-programming-guide.md
URL: https://github.com/apache/spark/pull/23999#discussion_r264024237
 
 

 ##########
 File path: docs/streaming-programming-guide.md
 ##########
 @@ -2036,7 +2036,7 @@ To run a Spark Streaming applications, you need to have 
the following.
   `spark.streaming.receiver.maxRate` for receivers and 
`spark.streaming.kafka.maxRatePerPartition`
   for Direct Kafka approach. In Spark 1.5, we have introduced a feature called 
*backpressure* that
   eliminate the need to set this rate limit, as Spark Streaming automatically 
figures out the
-  rate limits and dynamically adjusts them if the processing conditions 
change. This backpressure
+  rate limits and dynamically adjusts them if the processing conditions 
change.If the first batch of data is very large which causes the first batch is 
processing all the time and the task can not work normally , using a maximum 
rate limit can solve the problem .This backpressure
 
 Review comment:
   First of all, thank you for your reply.
   
   The original document says that enabling backpressure removes the need to 
set this rate limit. However, in actual usage scenarios, such as Spark Streaming 
consuming from Kafka, the first batch of data is often very large. The first 
batch then keeps processing for a long time, which affects the normal 
operation of the job. Even if the first batch eventually finishes, it takes 
much longer than the batch interval, and the job as a whole is less efficient 
than it would be if the first batch had been small enough to finish within the 
batch interval before the subsequent batches were processed.
   
   In short, my point is that the statement "enabling backpressure removes the 
need to set a rate limit" is not rigorous.
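
To make the point about the first batch concrete, here is a rough sketch in plain Python (no Spark required). It assumes that `spark.streaming.kafka.maxRatePerPartition` is a records-per-second cap per Kafka partition, so the largest possible batch is rate × partitions × batch interval. The backlog size, partition count, batch interval, and the helper `max_records_per_batch` are hypothetical values and names chosen only for illustration.

```python
def max_records_per_batch(max_rate_per_partition, num_partitions, batch_interval_s):
    """Upper bound on the records in one batch under a per-partition rate cap."""
    return max_rate_per_partition * num_partitions * batch_interval_s


# Backpressure enabled together with an explicit maximum rate.
# The value 1000 is an assumed example, not a recommendation.
conf = {
    "spark.streaming.backpressure.enabled": "true",
    "spark.streaming.kafka.maxRatePerPartition": "1000",
}

# Hypothetical job parameters.
backlog = 50_000_000      # records already waiting in Kafka at startup
num_partitions = 10
batch_interval_s = 5

cap = max_records_per_batch(
    int(conf["spark.streaming.kafka.maxRatePerPartition"]),
    num_partitions,
    batch_interval_s,
)

# Without a cap the first batch could cover the whole backlog;
# with the cap it is bounded regardless of how large the backlog is.
first_batch_without_cap = backlog
first_batch_with_cap = min(backlog, cap)

print(first_batch_without_cap, first_batch_with_cap)
```

Under these assumed numbers the capped first batch is orders of magnitude smaller than the backlog, which is why a maximum rate can still be useful when backpressure is enabled.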
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
