[ https://issues.apache.org/jira/browse/SPARK-29153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Graves resolved SPARK-29153. ----------------------------------- Fix Version/s: 3.1.0 Assignee: Thomas Graves Resolution: Fixed > ResourceProfile conflict resolution stage level scheduling > ---------------------------------------------------------- > > Key: SPARK-29153 > URL: https://issues.apache.org/jira/browse/SPARK-29153 > Project: Spark > Issue Type: Story > Components: Scheduler > Affects Versions: 3.0.0 > Reporter: Thomas Graves > Assignee: Thomas Graves > Priority: Major > Fix For: 3.1.0 > > > For the stage level scheduling, if a stage has ResourceProfiles from multiple > RDD that conflict we have to resolve that conflict. > We may have 2 approaches. > # default to error out if conflicting, that way user realizes what is going > on, have a config to turn this on and off. > # If config to error out if off, then resolve the conflict. See below from > the design doc on the SPIP. > For the merge strategy we can choose the max from the ResourceProfiles to > make the largest container required. This in general will work but there are > a few cases people may have intended them to be a sum. For instance lets say > one RDD needs X memory and another RDD needs Y memory. It might be when those > get combined into a stage you really need X+Y memory vs the max(X, Y). > Another example might be union, where you would want to sum the resources of > each RDD. I think we can document what we choose for now and later on add in > the ability to have other alternatives then max. Or perhaps we do need to > change what we do either per operation or per resource type. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org