mridulm commented on pull request #30876:
URL: https://github.com/apache/spark/pull/30876#issuecomment-751926362


   (Sigh, github prematurely posted my previous comment - fleshing it out here).
   
   As I mentioned above, the flag helps applications that are fine with paying 
the overhead of proactive replication. I have sketched some cases where 
proactive replication does not help, and others where it could be useful - 
these are examples, of course: in the end, it is specific to the application.
   
   Making it the default will impact all applications that have replication > 1. 
Given this PR is proposing to make it the default, I would like to know whether 
there was any motivating reason for this change?
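   For context, a rough sketch of the opt-in being discussed - assuming the 
current property name and the usual way replication > 1 comes about (a 
replicated `StorageLevel` such as `MEMORY_ONLY_2`):

   ```properties
   # spark-defaults.conf (sketch; today this is opt-in, not the default)
   # Re-replicate cached blocks when an executor holding a replica is lost,
   # so the configured replication factor is restored proactively.
   spark.storage.replication.proactive  true
   ```

   This only has an effect for blocks persisted with replication > 1 
(e.g. `rdd.persist(StorageLevel.MEMORY_ONLY_2)`); flipping the default would 
silently change behavior for all such applications.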
   
   If the cost of proactive replication is close to zero now (my experiments 
were from a while back), the discussion is of course moot - do we have any 
results for this?
   What is the ongoing cost when an application holds RDD references that are 
not in active use for the rest of the application (not all references can be 
cleared by GC)? That results in replication of blocks for an RDD which is 
legitimately never going to be used again.
   
   Note that the above is orthogonal to DRA evicting an executor via the 
storage timeout configuration.
   That just exacerbates the problem, since a larger number of executors could 
be lost.
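   For reference, the DRA storage-timeout path I am referring to - assuming 
the standard dynamic allocation config names:

   ```properties
   # Sketch: with DRA enabled, executors holding only cached blocks are
   # removed after the idle timeout below, triggering proactive
   # re-replication of their blocks on the surviving executors.
   spark.dynamicAllocation.enabled                    true
   spark.dynamicAllocation.cachedExecutorIdleTimeout  30min
   ```

   A short timeout here can evict many cache-holding executors at once, which 
is why it compounds the replication cost rather than mitigating it.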

