[GitHub] spark pull request: [SPARK-4677] [WEB] Add hadoop input time in ta...

2014-12-01 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request:

https://github.com/apache/spark/pull/3539

[SPARK-4677] [WEB] Add hadoop input time in task webui

Add Hadoop input time to the task web UI, alongside GC Time, to explicitly
show the time a task spends reading input data.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/YanTangZhai/spark WebuiInputTime

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3539.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3539


commit cdef539abc5d2d42d4661373939bdd52ca8ee8e6
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-08-06T13:07:08Z

Merge pull request #1 from apache/master

update

commit cbcba66ad77b96720e58f9d893e87ae5f13b2a95
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-08-20T13:14:08Z

Merge pull request #3 from apache/master

Update

commit 8a0010691b669495b4c327cf83124cabb7da1405
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-09-12T06:54:58Z

Merge pull request #6 from apache/master

Update

commit 03b62b043ab7fd39300677df61c3d93bb9beb9e3
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-09-16T12:03:22Z

Merge pull request #7 from apache/master

Update

commit 76d40277d51f709247df1d3734093bf2c047737d
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-10-20T12:52:22Z

Merge pull request #8 from apache/master

update

commit d26d98248a1a4d0eb15336726b6f44e05dd7a05a
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-11-04T09:00:31Z

Merge pull request #9 from apache/master

Update

commit e249846d9b7967ae52ec3df0fb09e42ffd911a8a
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-11-11T03:18:24Z

Merge pull request #10 from apache/master

Update

commit 6e643f81555d75ec8ef3eb57bf5ecb6520485588
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-12-01T11:23:56Z

Merge pull request #11 from apache/master

Update

commit 3816f8540b947809cb821bcb3af36d7be0210d9c
Author: yantangzhai tyz0...@163.com
Date:   2014-12-01T14:09:24Z

add hadoop input read time in webui




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org




2014-12-01 Thread SparkQA
GitHub user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3539#issuecomment-65069992
  
  [Test build #23991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23991/consoleFull) for PR 3539 at commit [`3816f85`](https://github.com/apache/spark/commit/3816f8540b947809cb821bcb3af36d7be0210d9c).
 * This patch merges cleanly.






2014-12-01 Thread AmplabJenkins
GitHub user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3539#issuecomment-65070129
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23991/






2014-12-01 Thread SparkQA
GitHub user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3539#issuecomment-65070127
  
  [Test build #23991 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23991/consoleFull) for PR 3539 at commit [`3816f85`](https://github.com/apache/spark/commit/3816f8540b947809cb821bcb3af36d7be0210d9c).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.






2014-12-01 Thread srowen
GitHub user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3539#discussion_r21090555
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -238,10 +238,13 @@ class HadoopRDD[K, V](
       val value: V = reader.createValue()
 
       var recordsSinceMetricsUpdate = 0
+      var startTime : Long = 0L
 
       override def getNext() = {
         try {
+          startTime = System.nanoTime
           finished = !reader.next(key, value)
+          inputMetrics.readTime += (System.nanoTime - startTime)
--- End diff --

Hm, is this going to be expensive, making 2 system calls for every read?
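
One way to bound that overhead, sketched here under assumptions: time only a sample of the reads and extrapolate to the rest. `SampledTimer` and `sampleEvery` are hypothetical names, not part of Spark or of this pull request, and the real cost of `System.nanoTime` depends on the JVM and platform, so it may or may not be an actual system call.

```scala
// Hypothetical helper, not a Spark API: times only every `sampleEvery`-th call
// so that most reads pay for no clock reads at all.
final class SampledTimer(sampleEvery: Int) {
  require(sampleEvery > 0, "sampleEvery must be positive")

  private var calls = 0L        // total calls seen
  private var sampledCalls = 0L // calls that were actually timed
  private var sampledNanos = 0L // time accumulated over the timed calls

  /** Runs `body`, timing it only on every `sampleEvery`-th call. */
  def time[T](body: => T): T = {
    calls += 1
    if (calls % sampleEvery == 0) {
      val start = System.nanoTime()
      try body
      finally {
        sampledNanos += System.nanoTime() - start
        sampledCalls += 1
      }
    } else {
      body
    }
  }

  /** Extrapolates the sampled time to all calls; 0 if nothing was sampled yet. */
  def estimatedTotalNanos: Long =
    if (sampledCalls == 0) 0L
    else (sampledNanos.toDouble / sampledCalls * calls).toLong

  def totalCalls: Long = calls
  def timedCalls: Long = sampledCalls
}

object SampledTimerDemo {
  def main(args: Array[String]): Unit = {
    val timer = new SampledTimer(sampleEvery = 100)
    val records = Iterator.range(0, 1000) // stand-in for reader.next(key, value)
    var sum = 0L
    while (records.hasNext) {
      sum += timer.time(records.next())
    }
    println(s"read ${timer.totalCalls} records, timed ${timer.timedCalls}, " +
      s"estimated ${timer.estimatedTotalNanos} ns")
  }
}
```

The estimate trades accuracy for overhead: it assumes the sampled reads are representative, which may not hold if read latency varies systematically across a split.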






2014-12-01 Thread YanTangZhai
GitHub user YanTangZhai commented on a diff in the pull request:

https://github.com/apache/spark/pull/3539#discussion_r21140476
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -238,10 +238,13 @@ class HadoopRDD[K, V](
       val value: V = reader.createValue()
 
       var recordsSinceMetricsUpdate = 0
+      var startTime : Long = 0L
 
       override def getNext() = {
         try {
+          startTime = System.nanoTime
           finished = !reader.next(key, value)
+          inputMetrics.readTime += (System.nanoTime - startTime)
--- End diff --

Oh sorry. It may be expensive. Let me think about it. Thanks.






2014-12-01 Thread YanTangZhai
GitHub user YanTangZhai closed the pull request at:

https://github.com/apache/spark/pull/3539

