Github user galv commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21511#discussion_r198653002

    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ---
    @@ -104,6 +104,20 @@ private[spark] object Config extends Logging {
           .stringConf
           .createOptional

    +  val KUBERNETES_EXECUTOR_LIMIT_GPUS =
    --- End diff --

    Sometimes you need it. For example, to reduce data across multiple executors, you would ideally use ring all-reduce among your executors, but you cannot really do that right now, given that executors are scheduled independently. The best you can do right now is to gather all of your data to the driver and then do the reduction there. You can learn more in the SPIP for Project Hydrogen / barrier execution.
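    To make the contrast concrete, here is a minimal single-process sketch of the two reduction strategies. It is not Spark code and uses no Spark API: each "executor" is just a Python list, the function names `gather_to_driver` and `ring_all_reduce` are hypothetical, and the vector length is assumed to be divisible by the number of executors. The ring version only works if all participants are alive at the same time and can talk to their neighbour, which is the scheduling guarantee that is missing today.

```python
from typing import List


def gather_to_driver(executor_data: List[List[float]]) -> List[float]:
    """Baseline: every executor ships its whole vector to one place (the driver),
    which then sums element-wise."""
    return [sum(column) for column in zip(*executor_data)]


def ring_all_reduce(executor_data: List[List[float]]) -> List[List[float]]:
    """Simulated ring all-reduce (sum). Returns one fully reduced vector per executor.

    Assumption for this sketch: the vector length is divisible by the number
    of executors, so each chunk has the same size.
    """
    n = len(executor_data)
    length = len(executor_data[0])
    if length % n != 0:
        raise ValueError("vector length must be divisible by the number of executors")
    chunk = length // n
    bufs = [list(v) for v in executor_data]  # work on copies, leave inputs untouched

    def span(c: int) -> range:
        """Indices covered by chunk c."""
        return range(c * chunk, (c + 1) * chunk)

    def exchange(chunk_for_rank, combine) -> None:
        # Snapshot all outgoing payloads first so that the sends within one
        # step behave as if they happened simultaneously.
        outgoing = []
        for rank in range(n):
            c = chunk_for_rank(rank)
            outgoing.append((c, [bufs[rank][i] for i in span(c)]))
        for rank, (c, payload) in enumerate(outgoing):
            dst = (rank + 1) % n  # each executor only talks to its right neighbour
            for offset, i in enumerate(span(c)):
                bufs[dst][i] = combine(bufs[dst][i], payload[offset])

    # Phase 1, scatter-reduce: after n - 1 steps, executor r holds the fully
    # reduced chunk (r + 1) % n.
    for step in range(n - 1):
        exchange(lambda rank: (rank - step) % n, lambda mine, theirs: mine + theirs)

    # Phase 2, all-gather: circulate the finished chunks around the ring,
    # overwriting partial sums with final values.
    for step in range(n - 1):
        exchange(lambda rank: (rank + 1 - step) % n, lambda mine, theirs: theirs)

    return bufs


if __name__ == "__main__":
    data = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(gather_to_driver(data))
    print(ring_all_reduce(data))
```

    In the gather version the driver receives every executor's full vector. In the ring version each executor sends one chunk per step to a single neighbour, over 2 * (n - 1) steps, so no single node has to receive everything.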