Even if we support per-task resource requests in the future, it would still be inconvenient for users to declare resource requirements for every single task/stage, so default values for task resource requirements must be defined somewhere. "spark.task.cpus" and "spark.task.accelerator.gpu.count" could serve this purpose without introducing breaking changes. So I'm +1 on the updated SPIP. It fairly separates the necessary GPU support from the risky scheduler changes.
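To illustrate the point about defaults, here is a minimal Python sketch of how a per-task request could fall back to application-level values. It is not Spark's implementation: `TaskResources` and `resolve_task_resources` are hypothetical names, the config is modeled as a plain dict, and the fallback of 0 GPUs when "spark.task.accelerator.gpu.count" is unset is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Config keys named in this thread. Treating them as application-level
# defaults for a future per-task API is the idea being illustrated.
TASK_CPUS_KEY = "spark.task.cpus"
TASK_GPUS_KEY = "spark.task.accelerator.gpu.count"


@dataclass(frozen=True)
class TaskResources:
    """Hypothetical container for what a single task needs."""
    cpus: int
    gpus: int


def resolve_task_resources(
    conf: Mapping[str, str],
    cpus: Optional[int] = None,
    gpus: Optional[int] = None,
) -> TaskResources:
    """Use the explicit per-task request if given, else the app-level default."""
    default_cpus = int(conf.get(TASK_CPUS_KEY, "1"))
    default_gpus = int(conf.get(TASK_GPUS_KEY, "0"))  # assumed fallback
    return TaskResources(
        cpus=default_cpus if cpus is None else cpus,
        gpus=default_gpus if gpus is None else gpus,
    )
```

With this shape, a stage that says nothing inherits the global values, while a CPU-only stage could pass `gpus=0` without touching the application config.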

On Mon, Mar 25, 2019 at 8:39 AM Mark Hamstra <m...@clearstorydata.com> wrote:

> Of course there is an issue of the perfect becoming the enemy of the good, so I can understand the impulse to get something done. I am left wanting, however, at least something more of a roadmap to a task-level future than just a vague "we may choose to do something more in the future." At the risk of repeating myself, I don't think the existing spark.task.cpus is very good, and I think that building more on that weak foundation without a clearer path or stated intention to move to something better runs the risk of leaving Spark stuck in a bad neighborhood.
>
> On Thu, Mar 21, 2019 at 10:10 AM Tom Graves <tgraves...@yahoo.com> wrote:
>
>> While I agree with you that it would be ideal to have the task-level resources and do a deeper redesign of the scheduler, I think that can be a separate enhancement, as was discussed earlier in the thread. That feature is useful without GPUs. I do realize that they overlap some, but I think the changes for this will be minimal to the scheduler, follow existing conventions, and it is an improvement over what we have now. I know many users will be happy to have this even without the task-level scheduling, as many of the conventions used now to schedule GPUs can easily be broken by one bad user. I think from the user point of view this gives many users an improvement, and we can extend it later to cover more use cases.
>>
>> Tom
>>
>> On Thursday, March 21, 2019, 9:15:05 AM PDT, Mark Hamstra <m...@clearstorydata.com> wrote:
>>
>> I understand the application-level, static, global nature of spark.task.accelerator.gpu.count and its similarity to the existing spark.task.cpus, but to me this feels like extending a weakness of Spark's scheduler, not building on its strengths.
>> That is because I consider binding the number of cores for each task to an application configuration to be far from optimal. This is already far from the desired behavior when an application is running a wide range of jobs (as in a generic job-runner style of Spark application), some of which require or can benefit from multi-core tasks, others of which will just waste the extra cores allocated to their tasks. Ideally, the number of cores allocated to tasks would get pushed to an even finer granularity than jobs, instead being a per-stage property.
>>
>> Now, of course, making allocation of general-purpose cores and domain-specific resources work in this finer-grained fashion is a lot more work than just trying to extend the existing resource allocation mechanisms to handle domain-specific resources, but it does feel to me like we should at least be considering doing that deeper redesign.
>>
>> On Thu, Mar 21, 2019 at 7:33 AM Tom Graves <tgraves...@yahoo.com.invalid> wrote:
>>
>> The proposal here is that all your resources are static and the GPU-per-task config is global per application, meaning you ask for a certain amount of memory, CPU, and GPUs for every executor up front, just like you do today, and every executor you get is that size. This means that both static and dynamic allocation still work without explicitly adding more logic at this point. Since the config for GPUs per task is global, every task will need a certain ratio of CPU to GPU. Since that is global, you can't really have the scenario you mentioned; all tasks are assumed to need a GPU. For instance, say I request 5 cores and 2 GPUs for each executor and set 1 GPU per task. That means that I could only run 2 tasks, and 3 cores would be wasted. The stage/task-level configuration of resources was removed and is something we can do in a separate SPIP.
>>
>> We thought erroring would make it more obvious to the user.
>> We could change this to a warning if everyone thinks that is better, but I personally like the error until we can implement the lower-level per-stage configuration.
>>
>> Tom
>>
>> On Thursday, March 21, 2019, 1:45:01 AM PDT, Marco Gaido <marcogaid...@gmail.com> wrote:
>>
>> Thanks for this SPIP.
>> I cannot comment on the docs, but just wanted to highlight one thing. On page 5 of the SPIP, when we talk about DRA, I see:
>>
>> "For instance, if each executor consists 4 CPUs and 2 GPUs, and each task requires 1 CPU and 1GPU, then we shall throw an error on application start because we shall always have at least 2 idle CPUs per executor"
>>
>> I am not sure this is the correct behavior. We might have tasks requiring only CPU running in parallel as well, so such a configuration may make sense. I'd rather emit a WARN or something similar. Anyway, we just said we will keep GPU scheduling at the task level out of scope for the moment, right?
>>
>> Thanks,
>> Marco
>>
>> On Thu, Mar 21, 2019 at 01:26, Xiangrui Meng <m...@databricks.com> wrote:
>>
>> Steve, the initial work would focus on GPUs, but we will keep the interfaces general to support other accelerators in the future. This was mentioned in the SPIP and draft design.
>>
>> Imran, you should have comment permission now. Thanks for making a pass! I don't think the proposed 3.0 features should block the Spark 3.0 release either. It is just an estimate of what we could deliver. I will update the doc to make it clear.
>>
>> Felix, it would be great if you can review the updated docs and let us know your feedback.
>>
>> ** How about setting a tentative vote closing time to next Tue (Mar 26)?
>>
>> On Wed, Mar 20, 2019 at 11:01 AM Imran Rashid <im...@therashids.com> wrote:
>>
>> Thanks for sending the updated docs. Can you please give everyone the ability to comment?
>> I have some comments, but overall I think this is a good proposal and addresses my prior concerns.
>>
>> My only real concern is that I notice some mention of "must dos" for Spark 3.0. I don't want to make any commitment to holding Spark 3.0 for parts of this; I think that is an entirely separate decision. However, I'm guessing this is just a minor wording issue, and you really mean that's a minimal set of features you are aiming for, which is reasonable.
>>
>> On Mon, Mar 18, 2019 at 12:56 PM Xingbo Jiang <jiangxb1...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I updated the SPIP doc <https://docs.google.com/document/d/1C4J_BPOcSCJc58HL7JfHtIzHrjU0rLRdQM3y7ejil64/edit#> and stories <https://docs.google.com/document/d/12JjloksHCdslMXhdVZ3xY5l1Nde3HRhIrqvzGnK_bNE/edit#heading=h.udyua28eu3sg>. I hope it now contains a clear scope of the changes and enough detail for the SPIP vote.
>> Please review the updated docs, thanks!
>>
>> On Wed, Mar 6, 2019 at 8:35 AM, Xiangrui Meng <men...@gmail.com> wrote:
>>
>> How about letting Xingbo make a major revision to the SPIP doc to make it clear what is proposed? I like Felix's suggestion to switch to the new Heilmeier template, which helps clarify what is proposed and what is not. Then let's review the new SPIP and resume the vote.
>>
>> On Tue, Mar 5, 2019 at 7:54 AM Imran Rashid <im...@therashids.com> wrote:
>>
>> OK, I suppose then we are getting bogged down in what a vote on an SPIP means anyway, which I guess we can set aside for now. With the level of detail in this proposal, I feel like there is a reasonable chance I'd still -1 the design or implementation.
>>
>> And the other thing you're implicitly asking the community for is to prioritize this feature for continued review and maintenance. There is already work to be done in things like making barrier mode support dynamic allocation (SPARK-24942), bugs in failure handling (e.g.
>> SPARK-25250), and general efficiency of failure handling (e.g. SPARK-25341, SPARK-20178). I'm very concerned about getting spread too thin.
>>
>> But if this is really just a vote on (1) is better GPU support important for Spark, in some form, in some release? and (2) is it *possible* to do this in a safe way? then I will vote +0.
>>
>> On Tue, Mar 5, 2019 at 8:25 AM Tom Graves <tgraves...@yahoo.com> wrote:
>>
>> So to me most of the questions here are implementation/design questions. I've had this issue in the past with SPIPs, where I expected to have more high-level design details but was basically told that belongs in the follow-on design JIRA. This makes me think we need to revisit what a SPIP really needs to contain, which should be done in a separate thread. Note that personally I would be for having more high-level details in it.
>> But the way I read our documentation on a SPIP right now, that detail is all optional. Maybe we could argue it's based on what reviewers request, but really perhaps we should make the wording of that more of a requirement. Thoughts? We should probably separate that discussion if people want to talk about that.
>>
>> For this SPIP in particular, the reason I +1 it is because it came down to 2 questions:
>>
>> 1) Do I think Spark should support this? My answer is yes. I think this would improve Spark; users have been requesting both better GPU support and support for controlling container requests at a finer granularity for a while. If Spark doesn't support this then users may go to something else, so I think we should support it.
>>
>> 2) Do I think it's possible to design and implement it without causing large instabilities? My opinion here again is yes.
>> I agree with Imran and others that the scheduler piece needs to be looked at very closely, as we have had a lot of issues there, and that is why I was asking for more details in the design JIRA: https://issues.apache.org/jira/browse/SPARK-27005. But I do believe it's possible to do.
>>
>> If others have reservations on similar questions, then I think we should resolve them here or take the discussion of what a SPIP is to a different thread and then come back to this. Thoughts?
>>
>> Note there is already a high-level design for at least the core piece, which is what people seem concerned with, so including it in the SPIP should be straightforward.
>>
>> Tom
>>
>> On Monday, March 4, 2019, 2:52:43 PM CST, Imran Rashid <im...@therashids.com> wrote:
>>
>> On Sun, Mar 3, 2019 at 6:51 PM Xiangrui Meng <men...@gmail.com> wrote:
>>
>> On Sun, Mar 3, 2019 at 10:20 AM Felix Cheung <felixcheun...@hotmail.com> wrote:
>>
>> IMO upfront allocation is less useful. Specifically too expensive for large jobs.
>>
>> This is also an API/design discussion.
>>
>> I agree with Felix -- this is more than just an API question. It has a huge impact on the complexity of what you're proposing. You might be proposing big changes to a core and brittle part of Spark, which is already short of experts.
>>
>> I don't see any value in having a vote on "does feature X sound cool?" We have to evaluate the potential benefit against the risks the feature brings and the continued maintenance cost. We don't need super low-level details, but we have to have a sketch of the design to be able to make that tradeoff.
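
The executor sizing examples quoted in this thread (5 cores with 2 GPUs, and the SPIP's 4 CPUs with 2 GPUs, each at 1 CPU and 1 GPU per task) reduce to simple arithmetic. The following is a minimal Python sketch of that arithmetic and of the error-versus-warning choice that was debated. The function names and the `strict` flag are hypothetical and are not part of Spark or the SPIP.

```python
import warnings
from typing import Tuple


def executor_task_slots(
    executor_cores: int,
    executor_gpus: int,
    task_cpus: int = 1,
    task_gpus: int = 1,
) -> Tuple[int, int]:
    """Return (concurrent_tasks, idle_cores) for a single executor."""
    if task_cpus < 1 or task_gpus < 0:
        raise ValueError("task_cpus must be >= 1 and task_gpus must be >= 0")
    slots_by_cpu = executor_cores // task_cpus
    # With no GPU requirement, only the CPU count limits concurrency.
    slots_by_gpu = executor_gpus // task_gpus if task_gpus > 0 else slots_by_cpu
    tasks = min(slots_by_cpu, slots_by_gpu)
    idle_cores = executor_cores - tasks * task_cpus
    return tasks, idle_cores


def check_executor_shape(
    executor_cores: int,
    executor_gpus: int,
    task_cpus: int = 1,
    task_gpus: int = 1,
    strict: bool = True,
) -> int:
    """Fail (strict) or warn (non-strict) when cores can never be used."""
    tasks, idle = executor_task_slots(
        executor_cores, executor_gpus, task_cpus, task_gpus
    )
    if idle > 0:
        message = f"{idle} core(s) per executor will always be idle"
        if strict:
            raise ValueError(message)
        warnings.warn(message)
    return tasks


# Tom's example: 5 cores, 2 GPUs, 1 GPU per task.
print(executor_task_slots(5, 2))  # (2, 3)
# The SPIP's example: 4 CPUs, 2 GPUs, 1 CPU and 1 GPU per task.
print(executor_task_slots(4, 2))  # (2, 2)
```

Here `strict=True` corresponds to the error on application start described in the SPIP, and `strict=False` to the WARN that Marco suggested.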