[GitHub] liupc commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary
URL: https://github.com/apache/spark/pull/23580#discussion_r251289438

## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala

@@ -1162,6 +1162,10 @@ private[spark] class DAGScheduler(
           partitions = stage.rdd.partitions
       }

+      if (taskBinaryBytes.length * 1000 > TaskSetManager.TASK_SIZE_TO_WARN_KB) {

Review comment:
@HyukjinKwon Sorry about that. In the new PR we can also change 1000 to 1024, which seems more reasonable.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
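Editor's sketch of the unit handling under discussion. In the quoted diff line the factor multiplies the byte count; assuming the intent is to compare a size in bytes against a threshold expressed in KB, the factor belongs on the threshold side. The object name, the helper name, and the threshold value of 100 are illustrative assumptions, not taken from the pull request.

```scala
object TaskBinaryWarnCheck {
  // Assumption: stand-in for TaskSetManager.TASK_SIZE_TO_WARN_KB; the value is illustrative.
  val TaskSizeToWarnKb: Int = 100

  // Bytes per KB, using 1024 as proposed in the review comment instead of 1000.
  val BytesPerKb: Long = 1024L

  /** True when a task binary of `sizeBytes` bytes is larger than `warnKb` KB. */
  def exceedsWarnThreshold(sizeBytes: Long, warnKb: Int = TaskSizeToWarnKb): Boolean =
    // Scale the KB threshold up to bytes; Long arithmetic avoids Int overflow.
    sizeBytes > warnKb * BytesPerKb
}
```

With these assumed values, a binary of exactly 100 * 1024 bytes does not trigger the check, while one byte more does.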
[GitHub] liupc commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary
URL: https://github.com/apache/spark/pull/23580#discussion_r249644541

## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala

@@ -1162,6 +1162,12 @@ private[spark] class DAGScheduler(
           partitions = stage.rdd.partitions
       }

+      val taskBinarySizeKb = Utils.byteStringAsKb(s"${taskBinaryBytes.length}b")
+
+      if (taskBinarySizeKb > TaskSetManager.TASK_SIZE_TO_WARN_KB) {

Review comment:
@srowen Yes, it's warning about broadcasts, not just task size. I will update the code to avoid the conversion.
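Editor's sketch of what avoiding the conversion could look like. The quoted diff turns the byte count into a string such as "123456b" and parses it back into KB; plain integer division gives a whole-KB value without the round trip. `sizeInKb`, `largeBinaryWarning`, and the message text are hypothetical names chosen for illustration, and 1 KB is assumed to be 1024 bytes.

```scala
object TaskBinarySizeKb {
  /** Whole KB contained in `sizeBytes`, discarding any remainder. */
  def sizeInKb(sizeBytes: Long): Long = sizeBytes / 1024L

  /** Some(message) when the truncated KB size is above `warnKb`, otherwise None. */
  def largeBinaryWarning(sizeBytes: Long, warnKb: Long): Option[String] = {
    val kb = sizeInKb(sizeBytes)
    // Because of truncation, this fires only once the binary reaches warnKb + 1 whole KB.
    if (kb > warnKb) Some(s"Broadcasting large task binary with size $kb KB") else None
  }
}
```

For example, `TaskBinarySizeKb.largeBinaryWarning(taskBinaryBytes.length, 100L).foreach(println)` would print a message only for binaries of at least 101 KB under these assumptions.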