[GitHub] [spark] HeartSaVioR edited a comment on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

GitBox Sun, 23 May 2021 19:24:52 -0700


HeartSaVioR edited a comment on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-846687139



   My main point concerns the characteristics of the checkpoint location.
   
   We require the checkpoint location to be "fault-tolerant", including against 
hardware failures (local storage doesn't make sense here), and to provide "high 
availability" by itself, so that Spark can delegate that complexity to the 
checkpoint location. Admittedly, this requirement makes the underlying file 
system heavy and non-trivial to maintain, but IMHO that's not a sufficient 
reason to take the complexity back into Spark, because:
   
   1. Spark is already quite complicated
   
   Nowadays it's quite hard to bring in a major change without affecting 
existing behaviors. Personally I'd like to see the mainline of Apache Spark 
serve the majority's demands, not everyone's demands.
   
   2. The file system industry (and the like) is also making improvements
   
   Many end users tried to put the checkpoint location in object stores 
despite the eventual consistency of S3, and S3 has since become strongly 
consistent. Object stores in Azure have been providing strong consistency, if I 
understand correctly. I'm not 100% sure about GCS, but I heard they claim 
strong consistency.
   
   HDFS has exposed several shortcomings so far, but the community is also 
making improvements, such as Apache Ozone.
   
   3. The Spark community hasn't even tried to optimize the path
   
   I'd interpret the reasons as twofold:
   
   A. The majority of real-world workloads work well with current technology
   B. Some workloads don't work well, but there is no strong demand to fix 
this because the possible issues are tolerable
   
   I think we still haven't worked hard enough on this. There are still spots 
where latency can be reduced, like I did when optimizing the WAL commit phase 
(#31495). I'd like to put my efforts into helping the majority's use cases.
   
   > Technically, PVC is kinds of abstract way to look at the volume mounted on 
container running executor. It could be local storage on nodes on k8s. It 
depends where the PVC is bound to.
   
   PVC is not an abstraction that "guarantees" a fault-tolerant file system, 
so it still has to depend on the actual file system under the hood, and also 
on the guarantees of the interface used to access that file system. I imagine 
the operational cost of making a PVC meet such requirements would be 
non-trivial as well.
   
   > Using PVC as checkpoint could be huge relief on the loading of HDFS. There 
are also others like better latency, simplified streaming architecture.
   
   I'd be happy to see the overall system design and the results of the POC. 
Let's continue the discussion about PVC once we get the details.


