[GitHub] spark pull request #20493: [SPARK-23326][WEBUI]schedulerDelay should return ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20493

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r166197592

Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala

```
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.status
+
+import java.util.Date
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
+
+class AppStatusUtilsSuite extends SparkFunSuite {
+
+  test("schedulerDelay") {
+    val runningTask = new TaskData(
```

End diff

Yeah, I'm inclined to keep them as they are more real.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r166197254

Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala (same new-file hunk, anchored at `val runningTask = new TaskData(`)

End diff

Actually there are many different values between these 2 code blocks:

```
executorDeserializeTime = 5L,
executorDeserializeCpuTime = 3L,
executorRunTime = 90L,
executorCpuTime = 10L,
resultSize = 100L,
jvmGcTime = 10L,
resultSerializationTime = 2L,
```

I think it's OK to keep the code as it is.
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r166181600

Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala (same new-file hunk, anchored at `val runningTask = new TaskData(`)

End diff

+1
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r166181455

Diff: core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala

```
@@ -17,16 +17,23 @@
 package org.apache.spark.status

-import org.apache.spark.status.api.v1.{TaskData, TaskMetrics}
+import org.apache.spark.status.api.v1.TaskData

 private[spark] object AppStatusUtils {

+  private val TASK_FINISHED_STATES = Set("FAILED", "KILLED", "SUCCESS")
+
+  private def isTaskFinished(task: TaskData): Boolean = {
+    TASK_FINISHED_STATES.contains(task.status)
+  }
+
   def schedulerDelay(task: TaskData): Long = {
-    if (task.taskMetrics.isDefined && task.duration.isDefined) {
+    if (isTaskFinished(task) && task.taskMetrics.isDefined && task.duration.isDefined) {
```

End diff

Logically `duration` should be set for running tasks, to indicate how long a task has been running. I feel it's safer to keep `task.duration.isDefined`, as we call `task.duration.get` below.
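As a hedged, self-contained sketch of the guard being discussed (the `TaskData` and `TaskMetrics` case classes below are simplified stand-ins for the real `org.apache.spark.status.api.v1` types, and the getting-result-time term of the real subtraction is omitted):

```scala
object SchedulerDelaySketch {
  // Mirrors the finished-task states introduced by the patch.
  private val TASK_FINISHED_STATES = Set("FAILED", "KILLED", "SUCCESS")

  // Simplified stand-ins for the v1 API types (hypothetical, for illustration).
  final case class TaskMetrics(
      executorDeserializeTime: Long,
      executorRunTime: Long,
      resultSerializationTime: Long)

  final case class TaskData(
      status: String,
      duration: Option[Long],
      taskMetrics: Option[TaskMetrics])

  private def isTaskFinished(task: TaskData): Boolean =
    TASK_FINISHED_STATES.contains(task.status)

  // Running tasks report 0; finished tasks report duration minus the time
  // the executor accounted for (the real code also subtracts getting-result
  // time, omitted here for brevity).
  def schedulerDelay(task: TaskData): Long = {
    if (isTaskFinished(task) && task.taskMetrics.isDefined && task.duration.isDefined) {
      val m = task.taskMetrics.get
      math.max(0L, task.duration.get - m.executorRunTime -
        m.executorDeserializeTime - m.resultSerializationTime)
    } else {
      0L
    }
  }
}
```

With the values that appear in the test diff (duration 100, run time 90, deserialize time 5, serialization time 2), a `SUCCESS` task yields a delay of 3 and the same task in `RUNNING` state yields 0.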
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r166068708

Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala (same new-file hunk, anchored at `val runningTask = new TaskData(`)

End diff

+1
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r166064864

Diff: core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala (same hunk, anchored at the updated `if` guard in `schedulerDelay`)

End diff

`task.duration.isDefined` should be redundant now, right? (I remember the duration didn't use to be set for running tasks, so this code worked, but apparently it changed while I worked on these changes...)
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20493#discussion_r165819565

Diff: core/src/test/scala/org/apache/spark/status/AppStatusUtilsSuite.scala (same new-file hunk, anchored at `val runningTask = new TaskData(`)

End diff

Can we make this test case more concise and easy to read by deduplication? For the purpose of this test case, what about the following pattern?

```
Seq(("RUNNING", 0L), ("SUCCESS", 3L)).foreach { case (status, schedulerDelay) =>
  // the code from `finishedTask`
}
```
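A minimal, runnable sketch of the suggested table-driven shape (the local `schedulerDelay` here is a hypothetical stand-in for `AppStatusUtils.schedulerDelay` with invented `duration`/`accounted` parameters, not the real method):

```scala
// Stand-in: finished states get duration minus executor-accounted time,
// running tasks get 0.
def schedulerDelay(status: String, duration: Long, accounted: Long): Long =
  if (Set("FAILED", "KILLED", "SUCCESS").contains(status)) duration - accounted
  else 0L

// One parameterized loop replaces the two near-identical TaskData blocks.
Seq(("RUNNING", 0L), ("SUCCESS", 3L)).foreach { case (status, expected) =>
  assert(schedulerDelay(status, duration = 100L, accounted = 97L) == expected,
    s"unexpected scheduler delay for a $status task")
}
```

The pattern keeps the expected value next to the status it belongs to, so adding another state later is a one-line change.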
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/20493

[SPARK-23326][WEBUI] schedulerDelay should return 0 when the task is running

## What changes were proposed in this pull request?

When a task is still running, metrics like `executorRunTime` are not available. `schedulerDelay` is then almost the same as `duration`, which is confusing. This PR makes `schedulerDelay` return 0 while the task is running, which is the same behavior as 2.2.

## How was this patch tested?

`AppStatusUtilsSuite.schedulerDelay`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark SPARK-23326

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20493.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20493

commit 7889fb0e5e4515ade35c2a07703017e16ee6194a
Author: Shixiong Zhu
Date: 2018-02-03T00:25:34Z

    schedulerDelay should return 0 when the task is running
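To illustrate the confusion the description refers to, here is a sketch with made-up numbers (none of the values or names below come from the actual patch): while a task runs, its executor-side metrics read as zero, so an unguarded subtraction makes the entire elapsed duration look like scheduler delay.

```scala
final case class LiveTask(durationMs: Long, executorRunTimeMs: Long)

// Unguarded: subtracts metrics that are still zero for a running task,
// so the result collapses to the full duration.
def unguardedDelay(t: LiveTask): Long = t.durationMs - t.executorRunTimeMs

// Guarded (the PR's behavior): report 0 until the task finishes.
def guardedDelay(t: LiveTask, finished: Boolean): Long =
  if (finished) unguardedDelay(t) else 0L

// A task that has been running for 10 seconds with no metrics reported yet.
val running = LiveTask(durationMs = 10000L, executorRunTimeMs = 0L)
// unguardedDelay(running) reports 10000 ms of "delay" — misleading
// guardedDelay(running, finished = false) reports 0 — matches Spark 2.2
```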