[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-18 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63549732
  
Okay pulling this in - thanks davies!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-18 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63598077
  
merged.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-18 Thread davies
Github user davies closed the pull request at:

https://github.com/apache/spark/pull/3029





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467516
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
--- End diff --

Minor, but could this just take the `StatusTracker` in the constructor? 
That makes the dependencies between the components clearer.
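
To illustrate the suggestion, here is a minimal sketch of passing the tracker in directly. `StageStatusSource` and `ProgressBarSketch` are hypothetical stand-ins invented for this example, not Spark classes; the actual change would use Spark's own status tracker type.

```scala
// Hypothetical stand-in for the status-tracking dependency (not a Spark API).
trait StageStatusSource {
  def activeStageIds: Seq[Int]
}

// The bar depends only on the narrow interface it actually reads from,
// instead of reaching through a whole context object.
class ProgressBarSketch(tracker: StageStatusSource) {
  def summary: String = tracker.activeStageIds.sorted.mkString("[Stage ", ", ", "]")
}

object ProgressBarSketchDemo {
  def main(args: Array[String]): Unit = {
    // A stub is enough to exercise the class; no cluster or context is needed.
    val stub = new StageStatusSource { def activeStageIds: Seq[Int] = Seq(2, 1) }
    println(new ProgressBarSketch(stub).summary)
  }
}
```

With the dependency narrowed like this, a test can hand the class a stub rather than building a full context.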





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467643
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  var hasShowed = false
+  var lastFinishTime = 0L
+
+  // Schedule a refresh thread to run in every 200ms
+  private val timer = new Timer("refresh progress", true)
+  timer.schedule(new TimerTask{
+    override def run() {
+      refresh()
+    }
+  }, DELAY_SHOW_UP, UPDATE_PERIOD)
+
+  /**
+   * Try to refresh the progress bar in every cycle
+   */
+  private def refresh(): Unit = synchronized {
+    val now = System.currentTimeMillis()
+    if (now - lastFinishTime < DELAY_SHOW_UP) {
+      return
+    }
+    val stageIds = sc.statusTracker.getActiveStageIds()
+    val stages = stageIds.map(sc.statusTracker.getStageInfo).flatten.filter(_.numTasks() > 1)
+      .filter(now - _.submissionTime() > DELAY_SHOW_UP).sortBy(_.stageId())
+    if (stages.size > 0) {
+      show(stages.take(3))  // display at most 3 stages in same time
+      hasShowed = true
+    }
+  }
+
+  /**
+   * Show progress bar in console. The progress bar is displayed in the next line
+   * after your last output, keeps overwriting itself to hold in one line. The logging will follow
+   * the progress bar, then progress bar will be showed in next line without overwrite logs.
+   */
+  private def show(stages: Seq[SparkStageInfo]) {
+    System.err.print("\r")
+    val width = TerminalWidth / stages.size
+    stages.foreach { s =>
+      val total = s.numTasks()
+      val header = s"[Stage ${s.stageId()}:"
+      val tailer = s"(${s.numCompletedTasks()} + ${s.numActiveTasks()}) / $total]"
+      val w = width - header.size - tailer.size
+      val bar = if (w > 0) {
+        val percent = w * s.numCompletedTasks() / total
+        (0 until w).map { i =>
+          if (i < percent) "=" else if (i == percent) ">" else " "
+        }.mkString("")
+      } else {
+        ""
+      }
+      System.err.print(header + bar + tailer)
+    }
+  }
+
+  /**
+   * Clear the progress bar if showed.
+   */
+  private def clear() = {
+    if (hasShowed) {
+      System.err.printf("\r" + " " * TerminalWidth + "\r")
--- End diff --

Does this work on Windows? Is there a constant we should be using instead 
of `\r`?
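
For context, `\r` is the carriage-return control character, which terminals generally interpret as "move the cursor to the start of the current line" (to my knowledge this includes the Windows console, though that is worth verifying). It is distinct from the platform line separator, so `System.lineSeparator()` would not be a substitute. A minimal sketch of naming the character once follows; `CR` and `clearLine` are names invented for this example.

```scala
object ClearLineSketch {
  // Carriage return: moves the cursor to column 0 without starting a new line.
  val CR: Char = '\r'

  // Builds a string that blanks out `width` columns and leaves the cursor at column 0.
  def clearLine(width: Int): String = CR.toString + (" " * width) + CR

  def main(args: Array[String]): Unit = {
    System.err.print("[Stage 0:=====>     (5 + 2) / 10]")
    // print rather than printf: the text is not meant to be a format string.
    System.err.print(clearLine(80))
    System.err.println("bar cleared")
  }
}
```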





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467709
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  var hasShowed = false
+  var lastFinishTime = 0L
+
+  // Schedule a refresh thread to run in every 200ms
+  private val timer = new Timer("refresh progress", true)
+  timer.schedule(new TimerTask{
+    override def run() {
+      refresh()
+    }
+  }, DELAY_SHOW_UP, UPDATE_PERIOD)
+
+  /**
+   * Try to refresh the progress bar in every cycle
+   */
+  private def refresh(): Unit = synchronized {
+    val now = System.currentTimeMillis()
+    if (now - lastFinishTime < DELAY_SHOW_UP) {
+      return
+    }
+    val stageIds = sc.statusTracker.getActiveStageIds()
+    val stages = stageIds.map(sc.statusTracker.getStageInfo).flatten.filter(_.numTasks() > 1)
+      .filter(now - _.submissionTime() > DELAY_SHOW_UP).sortBy(_.stageId())
+    if (stages.size > 0) {
+      show(stages.take(3))  // display at most 3 stages in same time
+      hasShowed = true
+    }
+  }
+
+  /**
+   * Show progress bar in console. The progress bar is displayed in the next line
+   * after your last output, keeps overwriting itself to hold in one line. The logging will follow
+   * the progress bar, then progress bar will be showed in next line without overwrite logs.
+   */
+  private def show(stages: Seq[SparkStageInfo]) {
+    System.err.print("\r")
+    val width = TerminalWidth / stages.size
+    stages.foreach { s =>
+      val total = s.numTasks()
+      val header = s"[Stage ${s.stageId()}:"
+      val tailer = s"(${s.numCompletedTasks()} + ${s.numActiveTasks()}) / $total]"
+      val w = width - header.size - tailer.size
+      val bar = if (w > 0) {
+        val percent = w * s.numCompletedTasks() / total
+        (0 until w).map { i =>
+          if (i < percent) "=" else if (i == percent) ">" else " "
+        }.mkString("")
+      } else {
+        ""
+      }
+      System.err.print(header + bar + tailer)
+    }
+  }
+
+  /**
+   * Clear the progress bar if showed.
+   */
+  private def clear() = {
+    if (hasShowed) {
--- End diff --

It's more correct grammatically for this to be `isShown` rather than 
`hasShowed`. This refers to whether there is currently a bar printed in the 
console, right?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467763
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  var hasShowed = false
+  var lastFinishTime = 0L
+
+  // Schedule a refresh thread to run in every 200ms
+  private val timer = new Timer("refresh progress", true)
+  timer.schedule(new TimerTask{
+    override def run() {
+      refresh()
+    }
+  }, DELAY_SHOW_UP, UPDATE_PERIOD)
+
+  /**
+   * Try to refresh the progress bar in every cycle
+   */
+  private def refresh(): Unit = synchronized {
+    val now = System.currentTimeMillis()
+    if (now - lastFinishTime < DELAY_SHOW_UP) {
+      return
+    }
+    val stageIds = sc.statusTracker.getActiveStageIds()
+    val stages = stageIds.map(sc.statusTracker.getStageInfo).flatten.filter(_.numTasks() > 1)
+      .filter(now - _.submissionTime() > DELAY_SHOW_UP).sortBy(_.stageId())
+    if (stages.size > 0) {
+      show(stages.take(3))  // display at most 3 stages in same time
+      hasShowed = true
+    }
+  }
+
+  /**
+   * Show progress bar in console. The progress bar is displayed in the next line
+   * after your last output, keeps overwriting itself to hold in one line. The logging will follow
+   * the progress bar, then progress bar will be showed in next line without overwrite logs.
+   */
+  private def show(stages: Seq[SparkStageInfo]) {
+    System.err.print("\r")
+    val width = TerminalWidth / stages.size
+    stages.foreach { s =>
+      val total = s.numTasks()
+      val header = s"[Stage ${s.stageId()}:"
+      val tailer = s"(${s.numCompletedTasks()} + ${s.numActiveTasks()}) / $total]"
+      val w = width - header.size - tailer.size
+      val bar = if (w > 0) {
+        val percent = w * s.numCompletedTasks() / total
+        (0 until w).map { i =>
+          if (i < percent) "=" else if (i == percent) ">" else " "
+        }.mkString("")
+      } else {
+        ""
+      }
+      System.err.print(header + bar + tailer)
+    }
+  }
+
+  /**
+   * Clear the progress bar if showed.
+   */
+  private def clear() = {
--- End diff --

Add `: Unit` return type, here and other places
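
As a small illustration of the request: Scala infers the result type of `def f() = { ... }` from the last expression, and the older procedure syntax `def f() { ... }` omits the `=` altogether. Writing `: Unit` makes the intent explicit. `BarState` below is a hypothetical class used only for this example.

```scala
class BarState {
  private var shown = false

  // Explicit `: Unit`: the method is run for its side effect only.
  def markShown(): Unit = {
    shown = true
  }

  // Without the annotation the result type would be inferred from the `if`,
  // which is easy to change by accident when the body is edited.
  def clear(): Unit = {
    if (shown) {
      shown = false
    }
  }

  def isShown: Boolean = shown
}
```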





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467790
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
--- End diff --

milliseconds is 1 word





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467830
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
--- End diff --

Isn't this more like `SHOW_DELAY`?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467888
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
--- End diff --

Also, is `COLUMNS` something you added or something that's already there? If 
it's something you added, maybe we should rename it to `CONSOLE_COLUMN_WIDTH` 
or something.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467846
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
--- End diff --

You can just do `sys.env.contains`
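
A sketch of what that could look like. One thing worth noting is that `contains` is also true for a variable that is set but empty, which the `isEmpty` check in the diff guards against, so the version below goes through `Option` and `Try` instead. The helper name `terminalWidth`, the `DefaultWidth` constant, and the explicit `env` parameter are assumptions made so the example can be exercised without touching the real environment.

```scala
import scala.util.Try

object TerminalWidthSketch {
  val DefaultWidth = 80

  // Reads COLUMNS from the given environment map, falling back to the
  // default when it is missing, empty, non-numeric, or not positive.
  def terminalWidth(env: Map[String, String] = sys.env): Int =
    env.get("COLUMNS")
      .flatMap(v => Try(v.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(DefaultWidth)

  def main(args: Array[String]): Unit = {
    println(terminalWidth())
  }
}
```

Going through `Option` also avoids the `get(...).get` pair, at the cost of silently ignoring a malformed value rather than failing on it.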


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467949
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
--- End diff --

I wouldn't hard-code the time in the comment. Just say "periodically" here.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20467968
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
--- End diff --

I wouldn't hard-code the time in the comment. Just say "periodically" here.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20468037
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala ---
@@ -0,0 +1,115 @@
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  var hasShowed = false
+  var lastFinishTime = 0L
+
+  // Schedule a refresh thread to run in every 200ms
+  private val timer = new Timer("refresh progress", true)
+  timer.schedule(new TimerTask{
+    override def run() {
+      refresh()
+    }
+  }, DELAY_SHOW_UP, UPDATE_PERIOD)
+
+  /**
+   * Try to refresh the progress bar in every cycle
+   */
+  private def refresh(): Unit = synchronized {
+    val now = System.currentTimeMillis()
+    if (now - lastFinishTime < DELAY_SHOW_UP) {
+      return
+    }
+    val stageIds = sc.statusTracker.getActiveStageIds()
+    val stages = stageIds.map(sc.statusTracker.getStageInfo).flatten.filter(_.numTasks() > 1)
+      .filter(now - _.submissionTime() > DELAY_SHOW_UP).sortBy(_.stageId())
+    if (stages.size > 0) {
+      show(stages.take(3))  // display at most 3 stages in same time
+      hasShowed = true
+    }
+  }
+
+  /**
+   * Show progress bar in console. The progress bar is displayed in the next line
+   * after your last output, keeps overwriting itself to hold in one line. The logging will follow
+   * the progress bar, then progress bar will be showed in next line without overwrite logs.
+   */
+  private def show(stages: Seq[SparkStageInfo]) {
+    System.err.print("\r")
+    val width = TerminalWidth / stages.size
+    stages.foreach { s =>
+      val total = s.numTasks()
+      val header = s"[Stage ${s.stageId()}:"
+      val tailer = s"(${s.numCompletedTasks()} + ${s.numActiveTasks()}) / $total]"
+      val w = width - header.size - tailer.size
+      val bar = if (w > 0) {
+        val percent = w * s.numCompletedTasks() / total
+        (0 until w).map { i =>
+          if (i < percent) "=" else if (i == percent) ">" else " "
+        }.mkString("")
+      } else {
+        ""
+      }
+      System.err.print(header + bar + tailer)
+    }
+  }
+
+  /**
+   * Clear the progress bar if showed.
+   */
+  private def clear() = {
+    if (hasShowed) {
+      System.err.printf("\r" + " " * TerminalWidth + "\r")
+      hasShowed = false
+    }
+  }
+
+  /**
+   * Mark all the stages as finished, clear the progress bar if showed, then the progress will not
+   * interwave with output of jobs.
--- End diff --

interleave? interweave? I don't think interwave is a word




[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63382985
  
Super cool. I left mostly minor comments. Otherwise it LGTM.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20468318
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala 
---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
--- End diff --

COLUMNS is a variable exported by bash via:
```
export COLUMNS
```
It's not enabled by default, so we add it.
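
As a side note on the lookup in the diff: it calls `toInt` on whatever `COLUMNS` holds, which throws on a non-numeric value. A minimal sketch of a more defensive read, assuming a hypothetical helper `widthFrom` and a fallback of 80 columns (neither name is part of the pull request):

```scala
import scala.util.Try

object TerminalWidthSketch {
  // Hypothetical helper: read a COLUMNS-style entry from an environment map.
  // Falls back to `default` when the entry is missing, empty, non-numeric,
  // or not a positive number.
  def widthFrom(env: Map[String, String], default: Int = 80): Int =
    env.get("COLUMNS")
      .flatMap(v => Try(v.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(default)
}
```

It could be called as `TerminalWidthSketch.widthFrom(sys.env)`; taking the map as a parameter keeps the parsing testable without touching the real environment.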





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20468419
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala 
---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
--- End diff --

This is the delay before the progress bar first shows up; I could not figure out the right name for 
it. `UPDATE_DELAY` might be confused with `UPDATE_PERIOD`.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63383666
  
@davies @pwendell The three-stage solution looks reasonable to me!
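
For readers following along, the three-stage layout can be sketched as a pure function that splits the terminal width evenly among at most three stages. This is a rough sketch only: `StageProgress` is a hypothetical stand-in for the fields of `SparkStageInfo` used by the bar, and `render`, `terminalWidth` and `maxStages` are illustrative names rather than the pull request's code.

```scala
object ProgressBarSketch {
  // Hypothetical stand-in for the SparkStageInfo fields the bar needs.
  final case class StageProgress(stageId: Int, numTasks: Int, completed: Int, active: Int)

  // Render up to `maxStages` stages side by side on one line.
  def render(stages: Seq[StageProgress], terminalWidth: Int = 80, maxStages: Int = 3): String = {
    val shown = stages.sortBy(_.stageId).take(maxStages)
    if (shown.isEmpty) "" else {
      // Each displayed stage gets an equal share of the line.
      val width = terminalWidth / shown.size
      shown.map { s =>
        val header = s"[Stage ${s.stageId}:"
        val tailer = s"(${s.completed} + ${s.active}) / ${s.numTasks}]"
        val w = width - header.length - tailer.length
        // Draw the bar only if there is room and the stage has tasks.
        val bar = if (w > 0 && s.numTasks > 0) {
          val filled = w * s.completed / s.numTasks
          (0 until w).map { i =>
            if (i < filled) "=" else if (i == filled) ">" else " "
          }.mkString
        } else ""
        header + bar + tailer
      }.mkString
    }
  }
}
```

A caller would print `"\r" + ProgressBarSketch.render(stages)` to standard error so that each refresh overwrites the previous line.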





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20468644
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala 
---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+
+import org.apache.spark._
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
--- End diff --

Right now, it only depends on statusTracker, but it may show some metrics 
later, so it's better to have `SparkContext` here.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63385857
  
The new thing for multiple stages is really nice. I also think the new 
architecture is great. I made some minor comments, but overall it looks good.

One thing: does this work in the SQL CLI? If not, we can make that a 
follow-up task.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63388317
  
  [Test build #23506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23506/consoleFull) for PR 3029 at commit [`95336d5`](https://github.com/apache/spark/commit/95336d575f3dc2a6e277a0d8778797c106a6098f).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63389665
  
@pwendell @andrewor14 I should have addressed your comments; please have 
another look, thanks!

This should work in the SQL CLI, and also on Windows.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63389796
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23504/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63399151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23506/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63399144
  
  [Test build #23506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23506/consoleFull) for PR 3029 at commit [`95336d5`](https://github.com/apache/spark/commit/95336d575f3dc2a6e277a0d8778797c106a6098f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20414049
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala 
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+import scala.collection.mutable.HashMap
+
+import org.apache.spark._
+import org.apache.spark.scheduler.{SparkListenerStageSubmitted, SparkListenerStageCompleted, SparkListener}
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  @volatile var hasShowed = false
+
+  /**
+   * Track the life cycle of stages
+   */
+  val activeStages = new HashMap[Int, Long]()
+
+  private class StageProgressListener extends SparkListener {
+    override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted) = {
--- End diff --

Instead of building your own listener here, why don't we just add 
`submissionTime` to the `SparkStageInfo` class?
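
With a submission time available on the stage info, the listener's bookkeeping reduces to a filter over the polled stages. A minimal sketch of that idea, assuming a hypothetical `StageSnapshot` case class in place of `SparkStageInfo` and a hypothetical `eligible` helper; the 500 ms default mirrors `DELAY_SHOW_UP` from the diff:

```scala
object StageFilterSketch {
  // Hypothetical stand-in for SparkStageInfo with a submission timestamp (ms).
  final case class StageSnapshot(stageId: Int, numTasks: Int, submissionTime: Long)

  // Keep stages that have more than one task and have been running
  // for longer than `delayMs`, ordered by stage id.
  def eligible(stages: Seq[StageSnapshot], now: Long, delayMs: Long = 500L): Seq[StageSnapshot] =
    stages
      .filter(_.numTasks > 1)
      .filter(s => now - s.submissionTime > delayMs)
      .sortBy(_.stageId)
}
```

Because the function takes `now` as an argument, the timer task can pass `System.currentTimeMillis()` while tests pass a fixed value.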





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20414069
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala 
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+import scala.collection.mutable.HashMap
+
+import org.apache.spark._
+import org.apache.spark.scheduler.{SparkListenerStageSubmitted, SparkListenerStageCompleted, SparkListener}
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  @volatile var hasShowed = false
+
+  /**
+   * Track the life cycle of stages
+   */
+  val activeStages = new HashMap[Int, Long]()
+
+  private class StageProgressListener extends SparkListener {
+    override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted) = {
+      activeStages.synchronized {
+        activeStages.put(stageSubmitted.stageInfo.stageId, System.currentTimeMillis())
+      }
+    }
+    override def onStageCompleted(stageCompleted: SparkListenerStageCompleted) = {
+      activeStages.synchronized {
+        activeStages.remove(stageCompleted.stageInfo.stageId)
+        if (activeStages.isEmpty) {
+          clearProgressBar()
+        }
+      }
+    }
+  }
+  sc.listenerBus.addListener(new StageProgressListener)
+
+  // Schedule a update thread to run in every 200ms
+  private val timer = new Timer("show progress", true)
+  timer.schedule(new TimerTask{
+    override def run() {
+      var running = 0
+      var finished = 0
+      var tasks = 0
+      var failed = 0
+      val now = System.currentTimeMillis()
+      val stageIds = sc.statusTracker.getActiveStageIds()
+      stageIds.map(sc.statusTracker.getStageInfo).foreach{
+        case Some(stage) =>
+          activeStages.synchronized {
+            // Don't show progress for stage which has only one task (useless),
+            // also don't show progress for stage which had started in 500 ms
+            if (stage.numTasks > 1 && activeStages.contains(stage.stageId)
+              && now - activeStages(stage.stageId) > DELAY_SHOW_UP) {
+              tasks += stage.numTasks
+              running += stage.numActiveTasks
+              finished += stage.numCompletedTasks
+              failed += stage.numFailedTasks
+            }
+          }
+      }
+      if (tasks > 0) {
+        showProgressBar(stageIds, tasks, running, finished, failed)
+      }
+    }
+  }, DELAY_SHOW_UP, UPDATE_PERIOD)
+
+  /**
+   * Show progress in console (also in title). The progress bar is displayed in the next line
+   * after your last output, keeps overwriting itself to hold in one line. The logging will follow
+   * the progress bar, then progress bar will be showed in next line without overwrite logs.
+   */
+  private def showProgressBar(stageIds: Seq[Int], total: Int, running: Int, finished: Int,
+      failed: Int): Unit = {
+    // show progress of all stages in one line progress bar
+    val ids = stageIds.mkString("/")
+if 

[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20414182
  
--- Diff: core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala 
---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui
+
+import java.util.{Timer, TimerTask}
+import scala.collection.mutable.HashMap
+
+import org.apache.spark._
+import org.apache.spark.scheduler.{SparkListenerStageSubmitted, SparkListenerStageCompleted, SparkListener}
+
+/**
+ * ConsoleProgressBar shows the progress of stages in the next line of the console. It poll the
+ * status of active stages from `sc.statusTracker` in every 200ms, the progress bar will be showed
+ * up after the stage has ran at least 500ms. If multiple stages run in the same time, the status
+ * of them will be combined together, showed in one line.
+ */
+private[spark] class ConsoleProgressBar(sc: SparkContext) extends Logging {
+
+  // Update period of progress bar, in milli seconds
+  val UPDATE_PERIOD = 200L
+  // Delay to show up a progress bar, in milli seconds
+  val DELAY_SHOW_UP = 500L
+  // The width of terminal
+  val TerminalWidth = if (!sys.env.getOrElse("COLUMNS", "").isEmpty) {
+    sys.env.get("COLUMNS").get.toInt
+  } else {
+    80
+  }
+
+  @volatile var hasShowed = false
+
+  /**
+   * Track the life cycle of stages
+   */
+  val activeStages = new HashMap[Int, Long]()
+
+  private class StageProgressListener extends SparkListener {
+    override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted) = {
+      activeStages.synchronized {
+        activeStages.put(stageSubmitted.stageInfo.stageId, System.currentTimeMillis())
+      }
+    }
+    override def onStageCompleted(stageCompleted: SparkListenerStageCompleted) = {
+      activeStages.synchronized {
+        activeStages.remove(stageCompleted.stageInfo.stageId)
+        if (activeStages.isEmpty) {
+          clearProgressBar()
+        }
+      }
+    }
+  }
+  sc.listenerBus.addListener(new StageProgressListener)
+
+  // Schedule a update thread to run in every 200ms
+  private val timer = new Timer("show progress", true)
+  timer.schedule(new TimerTask{
+    override def run() {
+      var running = 0
+      var finished = 0
+      var tasks = 0
+      var failed = 0
+      val now = System.currentTimeMillis()
+      val stageIds = sc.statusTracker.getActiveStageIds()
+      stageIds.map(sc.statusTracker.getStageInfo).foreach{
+        case Some(stage) =>
+          activeStages.synchronized {
+            // Don't show progress for stage which has only one task (useless),
+            // also don't show progress for stage which had started in 500 ms
+            if (stage.numTasks > 1 && activeStages.contains(stage.stageId)
+              && now - activeStages(stage.stageId) > DELAY_SHOW_UP) {
+              tasks += stage.numTasks
+              running += stage.numActiveTasks
+              finished += stage.numCompletedTasks
+              failed += stage.numFailedTasks
+            }
+          }
+      }
+      if (tasks > 0) {
+        showProgressBar(stageIds, tasks, running, finished, failed)
+      }
+    }
+  }, DELAY_SHOW_UP, UPDATE_PERIOD)
+
+  /**
+   * Show progress in console (also in title). The progress bar is displayed in the next line
+   * after your last output, keeps overwriting itself to hold in one line. The logging will follow
+   * the progress bar, then progress bar will be showed in next line without overwrite logs.
+   */
+  private def showProgressBar(stageIds: Seq[Int], total: Int, running: Int, finished: Int,
+      failed: Int): Unit = {
+    // show progress of all stages in one line progress bar
+    val ids = stageIds.mkString("/")
--- End diff --

[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63244719
  
Hey @davies - I played with this a bit and I actually found the behavior 
around concurrent stages might not be great. The reason is that the set of 
active stages will change as stages complete, and then it will suddenly change 
the slider significantly once one stage completes. Here is an example workload:

```
./bin/spark-shell --conf spark.scheduler.mode=FAIR
scala> val a = sc.makeRDD(1 to 1000, 1).map(x => (x, x)).reduceByKey(_ + _)
scala> val b = sc.makeRDD(1 to 1000, 1).map(x => (x, x)).reduceByKey(_ + _)
scala> a.union(b).count()
```

Probably what we want in the longer term is to have a slider for the entire 
job rather than stages. But anyways, I'd prefer either the flip flop behavior 
or have multiple stacked progress bars. @kayousterhout didn't like the flip 
flop but I find it more understandable than what is here now. Since this is an 
opt-in feature I think it's fine to have some version that can go in now and 
then refine it later.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63244854
  
By the way, @squito, I don't think this can go through log4j; it needs to 
access the jline console interface directly. This feature will just be 
controlled by a flag and users can decide whether to use it or not. By default 
we turn it on when the log level is WARN or higher, since at INFO level it's 
hard to display progress given all the other messages that are interleaved.
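
The default described above can be written as a small predicate. A minimal sketch, assuming a hypothetical `Level` enumeration standing in for the logging framework's levels and a hypothetical `shouldShow` helper; neither is taken from the pull request:

```scala
object ConsoleProgressPolicySketch {
  // Hypothetical log levels, declared from most to least verbose so that
  // the enumeration's built-in ordering matches "WARN or higher".
  object Level extends Enumeration {
    val Debug, Info, Warn, Error = Value
  }

  // Show the bar only when the user flag is on and the log level is
  // quiet enough (WARN or higher) that log lines will not swamp it.
  def shouldShow(flagEnabled: Boolean, level: Level.Value): Boolean =
    flagEnabled && level >= Level.Warn
}
```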





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20414302
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -231,6 +231,13 @@ class SparkContext(config: SparkConf) extends Logging {
 
   val statusTracker = new SparkStatusTracker(this)
 
+  private[spark] val progressBar: Option[ConsoleProgressBar] =
+    if (conf.getBoolean("spark.ui.showConsoleProgress", true)) {
--- End diff --

should we disable this when the log level is set to info?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20414811
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -231,6 +231,13 @@ class SparkContext(config: SparkConf) extends Logging {
 
   val statusTracker = new SparkStatusTracker(this)
 
+  private[spark] val progressBar: Option[ConsoleProgressBar] =
+    if (conf.getBoolean("spark.ui.showConsoleProgress", true)) {
--- End diff --

It's disabled in ConsoleProgressBar (because we may want to keep the
progress in the title when the logging level is INFO).





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63249917
  
  [Test build #23442 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23442/consoleFull)
 for   PR 3029 at commit 
[`0081bcc`](https://github.com/apache/spark/commit/0081bcca2d67097c33ecbd0052e72cda8889935b).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63251889
  
@pwendell  @kayousterhout  It can now show multiple stages (at most 3) on
one line at the same time. It looks like:
```
[Stage 0:  (316 + 4) / 1000][Stage 1:(0 + 0) / 1000][Stage 2:(0 + 0) / 1000]]]
```
```
[Stage 2:= (294 + 4) / 1000]
```

If there are more than three concurrent stages, only the first three of
them will be shown. Once a stage is finished, it will be removed.

Does this work for both of you?
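
A rough sketch of the one-line, at-most-three-stages layout described above. It is not the code from this PR: `StageProgress`, `MultiStageBar` and the `width` parameter are assumptions made for the example.

```scala
// Hypothetical snapshot of one stage; not a Spark class.
final case class StageProgress(stageId: Int, completed: Int, active: Int, total: Int)

object MultiStageBar {
  val MaxStages = 3

  // Render up to MaxStages unfinished stages on one line of `width` columns.
  def render(stages: Seq[StageProgress], width: Int): String = {
    // Finished stages are dropped; only the first MaxStages remain.
    val shown = stages.filter(s => s.completed < s.total).sortBy(_.stageId).take(MaxStages)
    if (shown.isEmpty) ""
    else {
      val cell = width / shown.size // columns available per stage
      shown.map { s =>
        val header = s"[Stage ${s.stageId}:"
        val tailer = s"(${s.completed} + ${s.active}) / ${s.total}]"
        val room = cell - header.length - tailer.length
        val bar =
          if (room > 0) {
            val filled = room * s.completed / s.total
            (0 until room).map { i =>
              if (i < filled) "=" else if (i == filled) ">" else " "
            }.mkString
          } else ""
        header + bar + tailer
      }.mkString
    }
  }
}
```

For example, `MultiStageBar.render(Seq(StageProgress(2, 5, 1, 10)), 30)` returns `[Stage 2:====>   (5 + 1) / 10]`. When the terminal is too narrow for a bar, only the counters are kept.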





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63251878
  
  [Test build #23448 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23448/consoleFull)
 for   PR 3029 at commit 
[`a353e85`](https://github.com/apache/spark/commit/a353e8567d6b4fdcdeaabb03d8f8ca1f6b6ddbad).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63253165
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23442/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63253164
  
  [Test build #23442 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23442/consoleFull)
 for   PR 3029 at commit 
[`0081bcc`](https://github.com/apache/spark/commit/0081bcca2d67097c33ecbd0052e72cda8889935b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63254820
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23448/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63254818
  
  [Test build #23448 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23448/consoleFull)
 for   PR 3029 at commit 
[`a353e85`](https://github.com/apache/spark/commit/a353e8567d6b4fdcdeaabb03d8f8ca1f6b6ddbad).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63257525
  
  [Test build #23454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23454/consoleFull)
 for   PR 3029 at commit 
[`2e90f75`](https://github.com/apache/spark/commit/2e90f7599779fe1e51b7b39d0d39e7c77260e47a).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63261422
  
  [Test build #523 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/523/consoleFull)
 for   PR 3029 at commit 
[`a353e85`](https://github.com/apache/spark/commit/a353e8567d6b4fdcdeaabb03d8f8ca1f6b6ddbad).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63261770
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23454/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63261768
  
  [Test build #23454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23454/consoleFull)
 for   PR 3029 at commit 
[`2e90f75`](https://github.com/apache/spark/commit/2e90f7599779fe1e51b7b39d0d39e7c77260e47a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204439
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23434/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204436
  
  [Test build #23434 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23434/consoleFull)
 for   PR 3029 at commit 
[`0cee236`](https://github.com/apache/spark/commit/0cee2368b09fb8167e0a992bccea8eb17257ad35).
 * This patch **fails RAT tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204432
  
  [Test build #23434 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23434/consoleFull)
 for   PR 3029 at commit 
[`0cee236`](https://github.com/apache/spark/commit/0cee2368b09fb8167e0a992bccea8eb17257ad35).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204653
  
  [Test build #521 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/521/consoleFull)
 for   PR 3029 at commit 
[`0cee236`](https://github.com/apache/spark/commit/0cee2368b09fb8167e0a992bccea8eb17257ad35).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204768
  
@JoshRosen @pwendell  @kayousterhout  @squito  I have re-implemented it
using the new poll-based progress API. The progress bar is much simpler than
the original one: the progress in the title (which did not work well with
pyspark) and the stage summary have been removed.

Once the job/stage is finished, the progress bar will disappear.
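
To make the poll-based approach concrete, here is a minimal sketch. It assumes a hypothetical `ProgressSource` trait standing in for whatever the progress API exposes, and a `write` callback standing in for the console; none of these names come from the patch.

```scala
// Hypothetical source of progress data, invented for this sketch.
trait ProgressSource {
  // One (stageId, completedTasks, totalTasks) triple per running stage.
  def activeStages(): Seq[(Int, Int, Int)]
}

final class PollingBar(source: ProgressSource, write: String => Unit) {
  private var lastWidth = 0 // width of the bar currently on screen

  // Meant to be called periodically, e.g. from a timer thread.
  def refresh(): Unit = {
    val stages = source.activeStages()
    if (stages.isEmpty) {
      // Nothing is running: blank out the old bar so that it disappears.
      if (lastWidth > 0) {
        write("\r" + " " * lastWidth + "\r")
        lastWidth = 0
      }
    } else {
      val line = stages.map { case (id, done, total) =>
        s"[Stage $id: $done / $total]"
      }.mkString
      // Pad with spaces so a shorter line fully overwrites a longer one.
      write("\r" + line + " " * math.max(0, lastWidth - line.length))
      lastWidth = line.length
    }
  }
}
```

The point of polling is that the scheduler never calls into the bar; the bar only reads a snapshot on each tick and redraws the same console line with a carriage return.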





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204747
  
  [Test build #23436 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23436/consoleFull)
 for   PR 3029 at commit 
[`30ac852`](https://github.com/apache/spark/commit/30ac852e87cbc3d2017567c21e1a895b8828fbe1).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204911
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23435/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63204979
  
  [Test build #23437 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23437/consoleFull)
 for   PR 3029 at commit 
[`ab87958`](https://github.com/apache/spark/commit/ab879587d6d230a044afeb3789170220533bb861).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63206309
  
  [Test build #521 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/521/consoleFull)
 for   PR 3029 at commit 
[`0cee236`](https://github.com/apache/spark/commit/0cee2368b09fb8167e0a992bccea8eb17257ad35).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63206407
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23436/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63206404
  
  [Test build #23436 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23436/consoleFull)
 for   PR 3029 at commit 
[`30ac852`](https://github.com/apache/spark/commit/30ac852e87cbc3d2017567c21e1a895b8828fbe1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63206787
  
  [Test build #23438 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23438/consoleFull)
 for   PR 3029 at commit 
[`38c42f1`](https://github.com/apache/spark/commit/38c42f18ab24c8e3aecce0e39f0f2fa996627ec4).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63206812
  
  [Test build #23437 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23437/consoleFull)
 for   PR 3029 at commit 
[`ab87958`](https://github.com/apache/spark/commit/ab879587d6d230a044afeb3789170220533bb861).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63206817
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23437/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63208927
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23438/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console

2014-11-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63208925
  
  [Test build #23438 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23438/consoleFull)
 for   PR 3029 at commit 
[`38c42f1`](https://github.com/apache/spark/commit/38c42f18ab24c8e3aecce0e39f0f2fa996627ec4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-14 Thread squito
Github user squito commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-63149772
  
Sorry for my delay in responding ...

(a) I think this DOES add a lot of value over the std INFO logging.  One 
log line per task completion is *much* noisier than what I'm proposing here, 
for a job with hundreds of tasks that completes in a few seconds (at least for 
me, a very common case).

(b) I think changing the logging configuration to INFO for this, while
leaving it at WARN for everything else, is pretty easy.  The other comments
above already request that this be moved into a `SparkListener`, so you would
just add a line:

```
log4j.logger.org.apache.spark.reporter.JobProgressConsoleReporter=INFO
```

(though I realize now that I actually am not sure where the logging setup 
for the examples is configured ...)





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20112947
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -554,12 +627,11 @@ private[spark] class TaskSetManager(
 val index = info.index
 info.markSuccessful()
 removeRunningTask(tid)
-sched.dagScheduler.taskEnded(
-  tasks(index), Success, result.value(), result.accumUpdates, info, result.metrics)
 if (!successful(index)) {
   tasksSuccessful += 1
-  logInfo("Finished task %s in stage %s (TID %d) in %d ms on %s (%d/%d)".format(
-    info.id, taskSet.id, info.taskId, info.duration, info.host, tasksSuccessful, numTasks))
+  logDebug("Finished task %s in stage %s (TID %d) in %.3fs on %s (%d/%d)".format(
+    info.id, taskSet.id, info.taskId, info.duration/1000.0, info.host,
--- End diff --

Why did you change this message from ms to s?
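
For context, the two format strings in the diff render the same duration as follows. This is just a sketch: `DurationFormat` is a made-up helper, and `Locale.ROOT` is used here only to keep the decimal separator stable.

```scala
import java.util.Locale

object DurationFormat {
  // Old message style: whole milliseconds.
  def asMillis(durationMs: Long): String =
    "%d ms".formatLocal(Locale.ROOT, durationMs)

  // New message style from the diff: seconds with three decimals.
  def asSeconds(durationMs: Long): String =
    "%.3fs".formatLocal(Locale.ROOT, durationMs / 1000.0)
}
```

A 1234 ms task is reported as `1234 ms` by the first form and as `1.234s` by the second, so no precision is lost, but anything parsing the old message would need to change.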





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62454596
  
This is a very nifty feature. However, it's not great to have modifications 
to the TaskSetManager and other scheduler internals for this 
presentation-related logic. It would be good if this could instead use our new 
progress reporting api (/cc @JoshRosen) and if we need to modify one or two 
things about that API we can do it. /cc @kayousterhout who maintains the 
scheduler code. My suggestion was to move a lot of this logic outside of the 
scheduler.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20112968
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -569,6 +641,11 @@ private[spark] class TaskSetManager(
   logInfo("Ignoring task-finished event for " + info.id + " in stage " + taskSet.id +
     " because task " + index + " has already completed successfully")
 }
+if (showProgress) {
+  showProgressBar(tasksSuccessful, numTasks)
+}
+sched.dagScheduler.taskEnded(
--- End diff --

What's the reason for moving this call?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r20112906
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -528,7 +601,7 @@ private[spark] class TaskSetManager(
 sched.dagScheduler.taskGettingResult(info)
   }
 
-  /**
+  /*
--- End diff --

Can you revert this style change, which seems unrelated to what you did?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62458897
  
This is definitely a cool/useful feature!  A few things:

(1) As Patrick alluded to, I think it could and should be implemented as a 
SparkListener, which is definitely preferable to adding, as @aarondav wisely 
said, a bunch of random code in the middle of TaskSetManager.  The scheduling 
code is already quite complex and we should try to avoid adding unnecessary 
complexity there.

(2) I tried this out and it doesn't seem to work properly when a job has 
multiple concurrent stages, because of the way lines get overwritten (only 
one of the stages gets shown).

(3) This bar doesn't show up if any other messages get logged to the 
console (e.g., if info logging is turned on).  Is there a way to make this so 
that it's always the bottom thing shown in the console, after other messages?
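
To make point (1) concrete, here is a minimal sketch of keeping the progress 
bookkeeping outside the scheduler. It assumes hypothetical event types 
(`StageSubmitted`, `TaskEnded`, `StageCompleted`) as stand-ins for whatever a 
listener would receive; they are not Spark's actual classes, and 
`ProgressTracker` is an illustrative name only.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for scheduler events; NOT Spark's real listener classes.
sealed trait Event
final case class StageSubmitted(stageId: Int, numTasks: Int) extends Event
final case class TaskEnded(stageId: Int, succeeded: Boolean) extends Event
final case class StageCompleted(stageId: Int) extends Event

/** Keeps per-stage counters outside the scheduler; it only reacts to events. */
final class ProgressTracker {
  // stageId -> (finished, total), kept in submission order
  private val stages = mutable.LinkedHashMap.empty[Int, (Int, Int)]

  def onEvent(event: Event): Unit = event match {
    case StageSubmitted(id, n) =>
      stages.update(id, (0, n))
    case TaskEnded(id, true) =>
      stages.get(id).foreach { case (done, total) =>
        stages.update(id, (done + 1, total))
      }
    case TaskEnded(_, false) =>
      () // failed attempts do not advance the bar
    case StageCompleted(id) =>
      stages -= id
  }

  /** One (stageId, finished, total) entry per active stage. */
  def snapshot: Seq[(Int, Int, Int)] =
    stages.toSeq.map { case (id, (done, total)) => (id, done, total) }
}
```

Because the tracker keeps one entry per active stage, a renderer built on 
`snapshot` would have what it needs to show concurrent stages, which is the 
problem in point (2).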





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62473806
  
@pwendell @kayousterhout, thanks for reviewing this.

The console is a character device, so all graphics are implemented as a stream 
of characters. It's very hard to show a near-real-time progress bar without 
disrupting the logging (the two don't know about each other). For example: 
1) if the progress bar is shown after the logging cursor, then future logging 
will overwrite (part of) the progress bar; 2) if the progress bar is shown 
before the logging cursor (such as on the top line of the console), then old 
log lines (and also the output of results) will be overwritten by the 
progress bar.

So the current approach is that the progress bar is only shown when the 
logging level is WARN (or higher). If the logging level is DEBUG or INFO, users 
can get the progress info from the logging, and it's hard to manage the two 
together. The progress bar is shown between the call to an action API and its 
return, so it's expected that there is no output/logging in this period and 
the console will not become a mess. If we move to a listener-based 
implementation, then it's hard to clean up the progress bar before the API 
`return`s; that is also the reason I moved `sched.dagScheduler.taskEnded` 
after showProgressBar().

It does not work properly when a job has multiple concurrent stages: the 
concurrent progress bars will overwrite each other randomly. Each bar begins 
with its stage id, so it's still kind of readable.

I agree that putting the progress bar code into TaskSetManager is not a good 
idea. I will move it out after we finalize the other stuff (how to deal with 
logging, whether to use the listener API or not).
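
The in-place redraw described above can be sketched roughly as follows. This 
is only an illustration, not the code in this PR: `OneLineBar`, its parameters, 
the 20-cell default width and the choice of `System.err` are all assumptions. 
The one real mechanism it relies on is that printing `\r` returns the cursor 
to column 0 without starting a new line.

```scala
object OneLineBar {
  /** Renders a line such as "[Stage 2] [===      ] 15+5/100". Pure, so easy to test. */
  def render(stageId: Int, finished: Int, running: Int, total: Int, width: Int = 20): String = {
    require(total > 0 && width > 0, "total and width must be positive")
    val done = math.min(math.max(finished, 0), total)
    val filled = done * width / total // number of '=' cells, rounded down
    val bar = ("=" * filled) + (" " * (width - filled))
    s"[Stage $stageId] [$bar] $done+$running/$total"
  }

  /** Redraws in place: '\r' moves the cursor to column 0 without a newline. */
  def show(line: String): Unit = {
    System.err.print("\r" + line)
    System.err.flush()
  }

  /** Blanks the bar so that the next regular output starts on a clean line. */
  def clear(lastWidth: Int): Unit = {
    System.err.print("\r" + (" " * lastWidth) + "\r")
    System.err.flush()
  }
}
```

Anything else written to the same terminal between two `show` calls lands on 
the bar's line, which is the interference with logging discussed here.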





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62474857
  
I'm not sure if this is a good idea, but what if the progress bar printed a 
new line each time it advanced, and just used the normal logging 
infrastructure?  So the log would look something like:

[INFO] Stage 1 [=  
[INFO] Stage 1 [==
[INFO] Stage 1 [===

and so on?  It's a little more verbose / less pretty to look at, but I 
think it would more cleanly handle both (a) playing nice with the existing info 
logging, and (b) showing info about multiple stages.  Thoughts?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread squito
Github user squito commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62475673
  
I was just about to suggest the same thing. I admit it seemed a lot cooler to 
have the console keep updating, but I agree with their concerns.

As a slight modification of @kayousterhout's proposal, what if instead of 
logging for *every* update, you log whenever some time unit has elapsed (e.g., 
1 second) *and* some unit of work has been done (that is, both conditions must 
be true, not just either one)?  That way the logs don't get clobbered with 
lots of little updates -- if you have 1000 tasks but the whole thing finishes 
in under 1 second, you really don't need to monitor the progress in the logs. 
And by just using the normal logging mechanism, it's still controllable via 
the normal logging configuration and plays nicely.
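
A rough sketch of that rule, to show that it needs very little state. The 
class and parameter names (`UpdatePolicy`, `minIntervalMs`, `startMs`) are made 
up for illustration, and the caller is assumed to pass in the current time in 
milliseconds.

```scala
/** Emits an update only when BOTH enough time has passed AND new work has finished. */
final class UpdatePolicy(minIntervalMs: Long, startMs: Long) {
  // Starting the clock at stage start means a stage that finishes within the
  // first interval never logs any progress at all.
  private var lastEmitMs = startMs
  private var lastFinished = 0

  def shouldEmit(nowMs: Long, finished: Int): Boolean = {
    val enoughTime = nowMs - lastEmitMs >= minIntervalMs
    val newWork = finished > lastFinished
    if (enoughTime && newWork) {
      lastEmitMs = nowMs
      lastFinished = finished
      true
    } else {
      false
    }
  }
}
```

With `minIntervalMs = 1000`, a long stage would produce at most one line per 
second, and a stalled stage would produce none.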





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62476320
  
@squito that's a good idea with the minimum time interval to avoid 
unnecessary clutter!





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62480890
  
@kayousterhout @squito The motivation for this PR is to let users see the 
progress without flushing away their input/output, so it's better to have a 
one-line progress bar. For example:
```scala
scala> rdd.count()
[Stage 1] finished in xxx seconds (med/avg/)
res0: Int = 
```
If we put the progress bar into the normal logging infrastructure (at level 
WARN) and do not merge the updates into one line, then this will be much 
easier, but not as good for users as the current approach (it's still noisy).






[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread squito
Github user squito commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62499404
  
I totally see the appeal of the one-line progress bar (hence my initial 
excitement when I tried this out).  But if it doesn't play nicely with logging 
and multiple stages, this seems like a very small improvement for the initial 
user experience, but a big headache for serious users.

I don't think it's really that much worse if your example changes to

```
scala> rdd.count()
[INFO] Stage 1 [= ]
[INFO] Stage 1 [==]
[INFO] Stage 1 finished in 2 seconds (med/avg/)
res0: Int = 
```

In this case it's just a couple more lines.  If the stage took longer, then 
it would be even more lines, but that seems OK, since it's not that much noise 
per unit time.

If the code were moved to a separate SparkListener implementation, then it 
could have its own log level, and even be INFO by default (so we leave 
everything else as WARN).  INFO for everything in Spark is way too noisy for 
the average Spark user, but that doesn't mean we can't use INFO for a few 
select classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-10 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62503514
  
> But if it doesn't play nicely with logging and multiple stages, this seems 
> like a very small improvement for the initial user experience, but a big 
> headache for serious users.

If the logging level is DEBUG/INFO, the progress bar cannot contribute more 
value than the logging, so it's fine to disable it. At WARN level, there 
should be much less logging (even none). In case there is some logging, it 
will look like this:
```
[Stage 2] ==   ] 10 + 
5/100 xxx
[WARN] 
[Stage 2] = ] 15 + 
5/100 xxx
```
I think that's not bad, especially since it's not the common case.

If there are multiple stages that are not running concurrently (this is the 
common case), the progress bar will be shown stage by stage, like this:
```
[Stage 1] Finished in xxx seconds. (med/avg/xxx)
[Stage 2] = ] 15 + 
5/100 xxx
```

We can improve the case in which multiple stages run concurrently, so that it 
looks like this:
```
[Stage 1/2/3] [=  ] 
140+39/340 
```

So I cannot agree with you that the current approach does not play nicely 
with logging/multiple stages.
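
For the concurrent case, merging the active stages into a single line could be 
sketched like this. It is only an illustration: `CombinedBar`, `StageStatus` 
and the 20-cell default width are hypothetical names and values, and the 
counts in the test merely reproduce the `140+39/340` totals from the example 
above.

```scala
object CombinedBar {
  /** Counters for one active stage. */
  final case class StageStatus(stageId: Int, finished: Int, running: Int, total: Int)

  /** Sums the counters of all active stages and renders them as one bar. */
  def render(stages: Seq[StageStatus], width: Int = 20): String = {
    require(stages.nonEmpty && width > 0, "need at least one stage and a positive width")
    val ids = stages.map(_.stageId).sorted.mkString("/")
    val finished = stages.map(_.finished).sum
    val running = stages.map(_.running).sum
    val total = stages.map(_.total).sum
    // Guard against division by zero and against over-filling the bar.
    val filled = if (total == 0) 0 else math.min(width, finished * width / total)
    val bar = ("=" * filled) + (" " * (width - filled))
    s"[Stage $ids] [$bar] $finished+$running/$total"
  }
}
```

The trade-off is that a single summed bar hides which of the stages is the 
slow one.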

> In this case it's just a couple more lines. If the stage took longer, then 
> it would be even more lines, but that seems OK, since it's not that much 
> noise per unit time.

A progress bar is useful for slow jobs, so a stage is expected to take a long 
time (maybe more than 1 minute) to finish. The progress lines will then occupy 
the whole screen (more than 40 lines), and users need to scroll the screen to 
see previous output (results).

If we use INFO for the progress bar, we need a special trick to enable the 
progress bar while disabling the other logging. If we can do this, we can 
already filter the progress logging from everything else right now. It will 
not be so easy for users, especially for users who are not familiar with 
log4j.






[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-09 Thread squito
Github user squito commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-62330615
  
this is awesome!





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61581372
  
  [Test build #22845 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22845/consoleFull)
 for   PR 3029 at commit 
[`a60477c`](https://github.com/apache/spark/commit/a60477c274675a35276a6e924656db45935b6144).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61581788
  
  [Test build #22846 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22846/consoleFull)
 for   PR 3029 at commit 
[`e1f524d`](https://github.com/apache/spark/commit/e1f524d9239bd94099fdd9c227ac8a4b5dda70ba).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61583786
  
  [Test build #22850 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22850/consoleFull)
 for   PR 3029 at commit 
[`6fd30ff`](https://github.com/apache/spark/commit/6fd30ff1a1935c26d716271d729a38e26b953e49).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61587953
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22846/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61587948
  
  [Test build #22846 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22846/consoleFull)
 for   PR 3029 at commit 
[`e1f524d`](https://github.com/apache/spark/commit/e1f524d9239bd94099fdd9c227ac8a4b5dda70ba).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61588063
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22845/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61588061
  
  [Test build #22845 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22845/consoleFull)
 for   PR 3029 at commit 
[`a60477c`](https://github.com/apache/spark/commit/a60477c274675a35276a6e924656db45935b6144).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ExecutorLostFailure(execId: String) extends 
TaskFailedReason `
  * `class NullType(PrimitiveType):`
  * `class DecimalType(DataType):`
  * `//   in some cases, such as when a class is enclosed in an 
object (in which case`
  * `  case class ScalaUdfBuilder[T: TypeTag](f: AnyRef) `
  * `case class UnscaledValue(child: Expression) extends UnaryExpression `
  * `case class MakeDecimal(child: Expression, precision: Int, scale: Int) 
extends UnaryExpression `
  * `case class MutableLiteral(var value: Any, dataType: DataType, 
nullable: Boolean = true)`
  * `abstract class GenericStrategy[PhysicalPlan <: TreeNode[PhysicalPlan]] 
extends Logging `
  * `case class PrecisionInfo(precision: Int, scale: Int)`
  * `case class DecimalType(precisionInfo: Option[PrecisionInfo]) extends 
FractionalType `
  * `abstract class UserDefinedType[UserType] extends DataType with 
Serializable `
  * `final class Decimal extends Ordered[Decimal] with Serializable `
  * `  trait DecimalIsConflicted extends Numeric[Decimal] `
  * `public abstract class UserDefinedType<UserType> extends DataType 
implements Serializable `
  * `trait RunnableCommand extends logical.Command `
  * `case class ExecutedCommand(cmd: RunnableCommand) extends SparkPlan `
  * `  protected case class Keyword(str: String)`
  * `sys.error(s"Failed to load class for data source: $provider")`
  * `case class EqualTo(attribute: String, value: Any) extends Filter`
  * `case class GreaterThan(attribute: String, value: Any) extends Filter`
  * `case class GreaterThanOrEqual(attribute: String, value: Any) extends 
Filter`
  * `case class LessThan(attribute: String, value: Any) extends Filter`
  * `case class LessThanOrEqual(attribute: String, value: Any) extends 
Filter`
  * `trait RelationProvider `
  * `abstract class BaseRelation `
  * `abstract class TableScan extends BaseRelation `
  * `abstract class PrunedScan extends BaseRelation `
  * `abstract class PrunedFilteredScan extends BaseRelation `






[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61589372
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22850/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-03 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61589367
  
  [Test build #22850 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22850/consoleFull)
 for   PR 3029 at commit 
[`6fd30ff`](https://github.com/apache/spark/commit/6fd30ff1a1935c26d716271d729a38e26b953e49).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ExecutorLostFailure(execId: String) extends 
TaskFailedReason `
  * `class NullType(PrimitiveType):`
  * `class DecimalType(DataType):`
  * `//   in some cases, such as when a class is enclosed in an 
object (in which case`
  * `  case class ScalaUdfBuilder[T: TypeTag](f: AnyRef) `
  * `case class UnscaledValue(child: Expression) extends UnaryExpression `
  * `case class MakeDecimal(child: Expression, precision: Int, scale: Int) 
extends UnaryExpression `
  * `case class MutableLiteral(var value: Any, dataType: DataType, 
nullable: Boolean = true)`
  * `abstract class GenericStrategy[PhysicalPlan <: TreeNode[PhysicalPlan]] 
extends Logging `
  * `case class PrecisionInfo(precision: Int, scale: Int)`
  * `case class DecimalType(precisionInfo: Option[PrecisionInfo]) extends 
FractionalType `
  * `abstract class UserDefinedType[UserType] extends DataType with 
Serializable `
  * `final class Decimal extends Ordered[Decimal] with Serializable `
  * `  trait DecimalIsConflicted extends Numeric[Decimal] `
  * `public abstract class UserDefinedType<UserType> extends DataType 
implements Serializable `
  * `trait RunnableCommand extends logical.Command `
  * `case class ExecutedCommand(cmd: RunnableCommand) extends SparkPlan `
  * `  protected case class Keyword(str: String)`
  * `sys.error(s"Failed to load class for data source: $provider")`
  * `case class EqualTo(attribute: String, value: Any) extends Filter`
  * `case class GreaterThan(attribute: String, value: Any) extends Filter`
  * `case class GreaterThanOrEqual(attribute: String, value: Any) extends 
Filter`
  * `case class LessThan(attribute: String, value: Any) extends Filter`
  * `case class LessThanOrEqual(attribute: String, value: Any) extends 
Filter`
  * `trait RelationProvider `
  * `abstract class BaseRelation `
  * `abstract class TableScan extends BaseRelation `
  * `abstract class PrunedScan extends BaseRelation `
  * `abstract class PrunedFilteredScan extends BaseRelation `






[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61359587
  
  [Test build #22685 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22685/consoleFull)
 for   PR 3029 at commit 
[`bc53d99`](https://github.com/apache/spark/commit/bc53d99d518d6fafd607c617d0915c7a2f9eee85).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61359713
  
@JoshRosen I have made some improvements: 1) the bar is finished before the 
result is printed in the Scala shell; 2) it interleaves better with logging 
(they will not overwrite each other); 3) it will not show progress in Jenkins 
(it uses the console instead of stderr).





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61360811
  
  [Test build #22685 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22685/consoleFull)
 for   PR 3029 at commit 
[`bc53d99`](https://github.com/apache/spark/commit/bc53d99d518d6fafd607c617d0915c7a2f9eee85).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61360812
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22685/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread aarondav
Github user aarondav commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r19702287
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -521,6 +529,51 @@ private[spark] class TaskSetManager(
 sched.dagScheduler.taskGettingResult(info)
   }
 
+  private def progressBar(curr: Int, total: Int): Unit = {
+val now = clock.getTime()
+// Only update title once in one second
+if (now - lastUpdate  100  curr  total) {
--- End diff --

comment says 1 second, but this looks like 100ms





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread aarondav
Github user aarondav commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r19702293
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -521,6 +529,51 @@ private[spark] class TaskSetManager(
 sched.dagScheduler.taskGettingResult(info)
   }
 
+  private def progressBar(curr: Int, total: Int): Unit = {
--- End diff --

Please add a method comment describing what this does and what the 
parameters are (it is, after all, a bunch of random code in the middle of 
TaskSetManager).





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread aarondav
Github user aarondav commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r19702297
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -521,6 +529,51 @@ private[spark] class TaskSetManager(
 sched.dagScheduler.taskGettingResult(info)
   }
 
+  private def progressBar(curr: Int, total: Int): Unit = {
+    val now = clock.getTime()
+    // Only update title once in one second
+    if (now - lastUpdate < 100 && curr < total) {
+      return
+    }
+    lastUpdate = now
+
+    // show progress in title
+    if (Terminal.getTerminal.isANSISupported) {
+      val ESC = "\033"
+      val title = if (curr < total) {
+        s"Spark Job: $curr/$total Finished, $runningTasks are running"
+      } else {
+        s"Spark Job: Finished in ${Utils.msDurationToString(now - startTime)}"
+      }
+      console.printf(s"$ESC]0; $title \007")
+    }
+
+    // show one line progress bar
+    if (!log.isInfoEnabled) {
+      if (curr < total) {
+        val header = s"Stage $stageId: ["
+        val tailer = s"] $curr+$runningTasks/$total - ${Utils.msDurationToString(now - startTime)}"
+        val width = Terminal.getTerminal.getTerminalWidth - header.size - tailer.size
+        val percent = curr * width / total;
+        val bar = (0 until width).map { i =>
+          if (i < percent) "=" else if (i==percent) ">" else " "
+        }.mkString("")
+        console.printf("\r" + header + bar + tailer)
--- End diff --

I'm not familiar with the finer points of console, but does this overwrite 
the last log line? Or would it do so if the last log line was `print`'d instead 
of `println`'d?





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r19709497
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -521,6 +529,51 @@ private[spark] class TaskSetManager(
 sched.dagScheduler.taskGettingResult(info)
   }
 
+  private def progressBar(curr: Int, total: Int): Unit = {
+    val now = clock.getTime()
+    // Only update title once in one second
+    if (now - lastUpdate < 100 && curr < total) {
+      return
+    }
+    lastUpdate = now
+
+    // show progress in title
+    if (Terminal.getTerminal.isANSISupported) {
+      val ESC = "\033"
+      val title = if (curr < total) {
+        s"Spark Job: $curr/$total Finished, $runningTasks are running"
+      } else {
+        s"Spark Job: Finished in ${Utils.msDurationToString(now - startTime)}"
+      }
+      console.printf(s"$ESC]0; $title \007")
+    }
+
+    // show one line progress bar
+    if (!log.isInfoEnabled) {
+      if (curr < total) {
+        val header = s"Stage $stageId: ["
+        val tailer = s"] $curr+$runningTasks/$total - ${Utils.msDurationToString(now - startTime)}"
+        val width = Terminal.getTerminal.getTerminalWidth - header.size - tailer.size
+        val percent = curr * width / total;
+        val bar = (0 until width).map { i =>
+          if (i < percent) "=" else if (i==percent) ">" else " "
+        }.mkString("")
+        console.printf("\r" + header + bar + tailer)
--- End diff --

Either stdout or stderr could be redirected elsewhere, but the console is the 
real target.

If the last log line was written with `print`, the progress bar will overwrite 
it. In most cases, log lines are written with `println`.
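
For illustration, a minimal sketch of the carriage-return technique described above. `ProgressBarSketch` and `renderBar` are hypothetical names, not part of the patch, and the terminal width is passed in as a plain argument instead of being read from `Terminal.getTerminal`. Writing `"\r"` returns the cursor to column 0 without a line feed, so the next write replaces whatever is on the current line; that is harmless after a `println` (the line is empty) and destructive after a bare `print`.

```scala
object ProgressBarSketch {
  /** Builds one line such as "Stage 3: [===>     ] 2+1/6". */
  def renderBar(stageId: Int, curr: Int, running: Int, total: Int, termWidth: Int): String = {
    val header = s"Stage $stageId: ["
    val tailer = s"] $curr+$running/$total"
    // Guard against terminals narrower than header and tailer combined.
    val width = math.max(termWidth - header.length - tailer.length, 0)
    val filled = if (total > 0) curr * width / total else 0
    val bar = (0 until width).map { i =>
      if (i < filled) "=" else if (i == filled) ">" else " "
    }.mkString("")
    header + bar + tailer
  }

  def main(args: Array[String]): Unit = {
    // Each iteration redraws the same console line in place.
    for (done <- 0 to 6) {
      System.err.print("\r" + renderBar(3, done, 1, 6, 40))
      Thread.sleep(100)
    }
    System.err.println() // move off the bar so later output starts on a fresh line
  }
}
```

The rendering is kept separate from the printing so the line can be checked without a real terminal.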





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61393355
  
  [Test build #22737 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22737/consoleFull)
 for   PR 3029 at commit 
[`ea49fe0`](https://github.com/apache/spark/commit/ea49fe07d681d3821110954342f1c17cbbcf7ccc).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61393375
  
  [Test build #22737 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22737/consoleFull)
 for   PR 3029 at commit 
[`ea49fe0`](https://github.com/apache/spark/commit/ea49fe07d681d3821110954342f1c17cbbcf7ccc).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61393376
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22737/
Test FAILed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61393558
  
  [Test build #22739 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22739/consoleFull)
 for   PR 3029 at commit 
[`5cae3f2`](https://github.com/apache/spark/commit/5cae3f22bd187d56b5bd0067dd9129f22ced4941).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61394840
  
  [Test build #22739 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22739/consoleFull)
 for   PR 3029 at commit 
[`5cae3f2`](https://github.com/apache/spark/commit/5cae3f22bd187d56b5bd0067dd9129f22ced4941).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-11-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61394842
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22739/
Test PASSed.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-10-31 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3029#discussion_r19697163
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -521,6 +528,47 @@ private[spark] class TaskSetManager(
 sched.dagScheduler.taskGettingResult(info)
   }
 
+  private def progressBar(curr: Int, total: Int): Unit = {
+    val now = clock.getTime()
+    // Only update title once in one second
+    if (now - lastUpdate < 100 && curr < total) {
+      return
+    }
+    val SetTitle = "\033]0;"
+    val EndTitle = "\007"
+    if (curr < total) {
+      System.err.print(s"${SetTitle} Spark Job: $curr/$total Finished, " +
+        s"$runningTasks are running ${EndTitle}")
+
+      if (!log.isInfoEnabled) {
+        val used = (now - startTime) / 1000
+        val header = s"Stage ${stageId}: ["
+        val tailer = s"] ${curr}+${runningTasks}/${total} - ${used}s"
+        val width = Terminal.getTerminal.getTerminalWidth - header.size - tailer.size
+        val percent = curr * width / total;
+        val bar = (0 until width).map { i =>
+          if (i < percent) "=" else if (i==percent) ">" else " "
+        }.mkString("")
+        System.err.print(header + bar + tailer + s"\n${ANSICodes.up(0)}")
+      }
+    } else {
+      System.err.print(s"${SetTitle} Spark Job: All Finished ${EndTitle}")
+      if (!log.isInfoEnabled) {
+        val used = (now  - startTime) / 1000
+        val finishTimes = taskInfos.map(_._2.finishTime - startTime)
+        val avg = finishTimes.sum / finishTimes.size / 1000
+        val min = finishTimes.min / 1000
+        val max = finishTimes.max / 1000
+        val med = finishTimes.toSeq.sorted.slice(0, finishTimes.size / 2).last / 1000
+        // erase current line
+        System.err.print(" " * Terminal.getTerminal.getTerminalWidth + "\n" + ANSICodes.up(0))
+        System.err.println(s"Stage ${stageId}: Finished in ${used}s with ${total} tasks " +
+          s"(${min}/${med}/${avg}/${max}s).")
--- End diff --

Maybe we could explicitly say `min=*/median=*/avg=*/max=*` to make this 
clearer to users?
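
A rough sketch of what the labeled output could look like, assuming a hypothetical `summarize` helper (not in the patch) that takes per-task durations in milliseconds and reports whole seconds:

```scala
object StageSummarySketch {
  /** Formats task durations (milliseconds) as labeled whole-second values. */
  def summarize(durationsMs: Seq[Long]): String = {
    require(durationsMs.nonEmpty, "need at least one task duration")
    val sorted = durationsMs.sorted
    // Element at index size / 2: the upper median for even sizes,
    // and well defined even when there is a single task.
    val median = sorted(sorted.size / 2)
    val avg = sorted.sum / sorted.size
    s"min=${sorted.head / 1000}s/median=${median / 1000}s/" +
      s"avg=${avg / 1000}s/max=${sorted.last / 1000}s"
  }
}
```

For durations of 1000, 3000, 2000 and 6000 ms this yields `min=1s/median=3s/avg=3s/max=6s`. Indexing the sorted sequence directly is a deliberate difference from the `slice(0, size / 2).last` expression in the diff, which produces an empty slice when there is only one task.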





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-10-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61340878
  
  [Test build #22656 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22656/consoleFull)
 for   PR 3029 at commit 
[`7e7d4e7`](https://github.com/apache/spark/commit/7e7d4e784864baa4168819b0fc3fc01a01abc1cd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-10-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61349714
  
**[Test build #22656 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22656/consoleFull)**
 for PR 3029 at commit 
[`7e7d4e7`](https://github.com/apache/spark/commit/7e7d4e784864baa4168819b0fc3fc01a01abc1cd)
 after a configured wait of `120m`.





[GitHub] spark pull request: [SPARK-4017] show progress bar in console and ...

2014-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3029#issuecomment-61349718
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22656/
Test FAILed.




