[ https://issues.apache.org/jira/browse/SPARK-45527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Graves reassigned SPARK-45527:
-------------------------------------

    Assignee: Bobby Wang

> Task fraction resource request is not expected
> ----------------------------------------------
>
>                 Key: SPARK-45527
>                 URL: https://issues.apache.org/jira/browse/SPARK-45527
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.2.1, 3.3.3, 3.4.1, 3.5.0
>            Reporter: wuyi
>            Assignee: Bobby Wang
>            Priority: Major
>              Labels: pull-request-available
>
>
> {code:java}
> test("SPARK-XXX") {
>   import org.apache.spark.resource.{ResourceProfileBuilder, TaskResourceRequests}
>   withTempDir { dir =>
>     // Discovery script that reports a single GPU with address "0".
>     val scriptPath = createTempScriptWithExpectedOutput(dir, "gpuDiscoveryScript",
>       """{"name": "gpu","addresses":["0"]}""")
>     val conf = new SparkConf()
>       .setAppName("test")
>       .setMaster("local-cluster[1, 12, 1024]")
>       .set("spark.executor.cores", "12")
>     // Default profile: each task asks for 0.08 of a GPU.
>     conf.set(TASK_GPU_ID.amountConf, "0.08")
>     conf.set(WORKER_GPU_ID.amountConf, "1")
>     conf.set(WORKER_GPU_ID.discoveryScriptConf, scriptPath)
>     conf.set(EXECUTOR_GPU_ID.amountConf, "1")
>     sc = new SparkContext(conf)
>     val rdd = sc.range(0, 100, 1, 4)
>     var rdd1 = rdd.repartition(3)
>     // Custom profile: each task asks for a whole GPU.
>     val treqs = new TaskResourceRequests().cpus(1).resource("gpu", 1.0)
>     val rp = new ResourceProfileBuilder().require(treqs).build
>     rdd1 = rdd1.withResources(rp)
>     assert(rdd1.collect().size === 100)
>   }
> } {code}
> In the above test, the 3 tasks generated by rdd1 are expected to run
> sequentially, because "new TaskResourceRequests().cpus(1).resource("gpu", 1.0)"
> should override "conf.set(TASK_GPU_ID.amountConf, "0.08")". However, the 3
> tasks actually run in parallel.
> The root cause is that ExecutorData#ExecutorResourceInfo#numParts is static.
> In this case, "gpu.numParts" is initialized to 12 (1/0.08, rounded down) and
> does not change even when a new task resource request arrives (e.g.,
> resource("gpu", 1.0) in this case). Thus, the 3 tasks are able to run in
> parallel.
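
The arithmetic behind the behavior described above can be sketched in plain Scala. This is not Spark's implementation: FractionalGpuSketch and its methods are hypothetical names, and the sketch assumes the number of slots per GPU address is the floor of 1/amount, which is consistent with the value 12 quoted for 0.08.

```scala
// Hypothetical illustration only; none of these names exist in Spark.
object FractionalGpuSketch {
  // Assumed rule: a fractional amount below 1.0 yields floor(1 / amount) slots
  // per GPU address; an amount of 1.0 or more yields a single slot.
  def slotsPerAddress(taskAmount: Double): Int =
    if (taskAmount >= 1.0) 1 else math.floor(1.0 / taskAmount).toInt

  // Static model (the reported behavior): the slot count is fixed once from
  // the default task amount and never recomputed for a new resource profile.
  def concurrentTasksStatic(defaultAmount: Double, numTasks: Int): Int =
    math.min(numTasks, slotsPerAddress(defaultAmount))

  // Per-request model (the expected behavior): the slot count follows the
  // amount requested by the profile the tasks actually run with.
  def concurrentTasksPerRequest(requestedAmount: Double, numTasks: Int): Int =
    math.min(numTasks, slotsPerAddress(requestedAmount))

  def main(args: Array[String]): Unit = {
    println(slotsPerAddress(0.08))              // 12
    println(concurrentTasksStatic(0.08, 3))     // 3: all tasks run in parallel
    println(concurrentTasksPerRequest(1.0, 3))  // 1: tasks run one at a time
  }
}
```

With the slot count frozen at 12, all 3 tasks fit on the single GPU address at once; if the slot count were derived from the profile's request of 1.0, only one task would fit at a time.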