[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036375#comment-15036375 ]

Apache Spark commented on SPARK-11155:
--------------------------------------

User 'keypointt' has created a pull request for this issue:
https://github.com/apache/spark/pull/10107

> Stage summary json should include stage duration
> ------------------------------------------------
>
>                 Key: SPARK-11155
>                 URL: https://issues.apache.org/jira/browse/SPARK-11155
>             Project: Spark
>          Issue Type: Improvement
>          Components: Web UI
>            Reporter: Imran Rashid
>            Assignee: Xin Ren
>            Priority: Minor
>              Labels: Starter
>
> The json endpoint for stages doesn't include information on the stage
> duration that is present in the UI. This looks like a simple oversight, they
> should be included. eg., the metrics should be included at
> {{api/v1/applications//stages}}. The missing metrics are
> {{submissionTime}} and {{completionTime}} (and whatever other metrics come
> out of the discussion on SPARK-10930)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980802#comment-14980802 ]

Xin Ren commented on SPARK-11155:
---------------------------------

Hi [~imranr], thank you so much for your detailed explanation, really helps me out. I'm digging into it now. I'll try to finish it asap. Thank you again! :)
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980789#comment-14980789 ]

Imran Rashid commented on SPARK-11155:
--------------------------------------

Hi [~iamshrek],

First, start by reading these two pages:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-PreparingtoContributeCodeChanges (especially the section on "Pull Requests")
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools

Of course everyone has their own favorite development environment / techniques, but I'll tell you what I do. I've found debugging from within IDEs to not work very well for Spark (or Scala codebases in general), so I just don't bother anymore. (But I know others do debug from within IntelliJ, so it's not impossible.) Instead, I just navigate through code in IntelliJ, and I use sbt, unit tests, & printlns to test out code.

1) Run {{build/sbt -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -Pscala-2.10}}. This will start an sbt shell.

2) Inside the sbt shell, run {{project core}} (since you only care about the "spark-core" project).

3) Run {{~test:compile}}. This will compile all the main & test code for the "spark-core" project and its dependencies. It will also watch all the src files -- anytime you change them it'll *incrementally* recompile. (The "~" prefix makes sbt run the command in the incremental watch mode.)

4) Leave that sbt shell open, and then go to IntelliJ and do some coding. E.g., try adding some garbage syntax, and then you'll see the sbt shell recompile and tell you about the compile error (and it should recompile quickly this time).

5) You can run some tests with {{~test-only }}. In your case, you probably want {{~test-only *.HistoryServerSuite}}. As before, it'll run in watch mode, but this time it'll rerun the tests as you change your code.

The tests you are most interested in are these:
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala#L102
which reference expected results here:
https://github.com/apache/spark/tree/master/core/src/test/resources/HistoryServerExpectations

While debugging, you can add printlns, which will show up in the sbt shell, or you can add logging statements, which will appear in core/target/unit-tests.log (along with all the existing logging statements).

6) Before submitting your PR, run {{scalastyle}} and {{test:scalastyle}} from within sbt (or just run {{dev/scalastyle}} from bash). That'll help you track down style violations locally. (Jenkins would do this for you, but it's a lot faster if you fix them locally -- that said, I often forget to do this myself.)

7) Re-read the wiki guidelines, then submit your PR. Jenkins will then run the full set of tests for you, and reviewers will comment.

For the HistoryServer in particular, you can also just run it locally, navigate to some endpoints in your browser and see what happens.

Hope this helps!
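The expectation-file idea mentioned in step 5 can be illustrated outside Spark. What follows is a minimal Python sketch, not the code in HistoryServerSuite: it assumes that such a test compares the JSON returned by an endpoint against a stored expected document, and the helper name `json_diff` and both sample payloads are hypothetical.

```python
import json


def json_diff(expected, actual, path="$"):
    """Return human-readable differences between two parsed JSON values."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        diffs = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}"
            if key not in actual:
                diffs.append(f"{child}: missing from actual")
            elif key not in expected:
                diffs.append(f"{child}: unexpected in actual")
            else:
                diffs.extend(json_diff(expected[key], actual[key], child))
        return diffs
    if isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            return [f"{path}: expected {len(expected)} items, got {len(actual)}"]
        diffs = []
        for i, (e, a) in enumerate(zip(expected, actual)):
            diffs.extend(json_diff(e, a, f"{path}[{i}]"))
        return diffs
    if expected != actual:
        return [f"{path}: expected {expected!r}, got {actual!r}"]
    return []


# Hypothetical payloads: the stored expectation has a field the response lacks.
expected = json.loads('[{"stageId": 1, "status": "COMPLETE", "submissionTime": "t0"}]')
actual = json.loads('[{"stageId": 1, "status": "COMPLETE"}]')

for line in json_diff(expected, actual):
    print(line)
```

Reporting a path for each difference, rather than a bare pass/fail, makes it easier to see which field of a response changed when an expectation file needs updating.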
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979386#comment-14979386 ]

Xin Ren commented on SPARK-11155:
---------------------------------

Hi, sorry for being slow. This is my first attempt to change Spark code, and I have just managed to set up IntelliJ and build the source code from it successfully.

Could you please tell me how to debug the source code? As I understand it, if I make some changes in the source code and want to test them, I have to:

1. build the project after each code change (takes a long time to finish...)
2. start a Spark cluster by running "./sbin/start-all.sh"
3. test the UI by running "./bin/spark-class org.apache.spark.ui.UIWorkloadGenerator spark://mac-xr.local:7077 FAIR 1"

I tried to google how to step through the source code in a debugger but didn't find much that was useful; most results are about debugging a Spark app, not the Spark source code itself.

Could you please tell me how to debug Spark source code step by step in debug mode? Thank you very much :)
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962152#comment-14962152 ]

Imran Rashid commented on SPARK-11155:
--------------------------------------

Thanks for taking it on [~iamshrek]! I have assigned the issue to you.
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962151#comment-14962151 ]

Imran Rashid commented on SPARK-11155:
--------------------------------------

[~kayousterhout] I'm referring to the HTTP endpoints that expose the data in the UI as JSON:
http://spark.apache.org/docs/latest/monitoring.html#rest-api

E.g., if you run any job in a spark-shell, you can go to:
http://localhost:4040/api/v1/applications/Spark%20shell/stages

The JSON is produced from [o.a.s.status.api.v1.StageData|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/api/v1/api.scala#L110], which doesn't have the right fields.
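One way to check which stages lack the fields discussed in this issue is sketched below in Python. It is only an illustration under stated assumptions: the endpoint URL is the one quoted in the comment above, the field names `submissionTime` and `completionTime` come from the issue description, and the helper names and the sample stage object are hypothetical.

```python
import json
import urllib.request

# Fields named in the issue description as missing from the stage JSON.
REQUIRED_FIELDS = ("submissionTime", "completionTime")


def missing_fields(stage, required=REQUIRED_FIELDS):
    """Return the required field names that a stage object does not contain."""
    return [name for name in required if name not in stage]


def fetch_stages(base_url="http://localhost:4040", app_id="Spark%20shell"):
    """Fetch the stage list from a running application's REST API.

    The defaults mirror the URL quoted in the thread; a live Spark
    application must be serving it for this call to succeed.
    """
    url = f"{base_url}/api/v1/applications/{app_id}/stages"
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def report(stages):
    """Map each stage id to the list of required fields it lacks."""
    return {stage.get("stageId"): missing_fields(stage) for stage in stages}


# Hypothetical sample standing in for a response without the timing fields.
sample = [{"stageId": 0, "name": "count"}]
print(report(sample))
```

Against a live application, `report(fetch_stages())` would run the same check on the real response.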
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961366#comment-14961366 ]

Kay Ousterhout commented on SPARK-11155:
----------------------------------------

[~imranr] where exactly do you mean this is missing? I thought you meant the Json info for StageSubmitted / StageCompleted, but that does include the stage submission and completion time (via StageInfo), which can be used to compute the duration.
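The duration computation mentioned above is simple arithmetic once both timestamps are available. The Python sketch below assumes the two times are optional epoch-millisecond values; the function name, the `now_ms` fallback for a stage that is still running, and the sample numbers are all hypothetical and are not taken from Spark's code.

```python
from typing import Optional


def stage_duration_ms(submission_ms: Optional[int],
                      completion_ms: Optional[int],
                      now_ms: Optional[int] = None) -> Optional[int]:
    """Compute a stage duration in milliseconds from two timestamps.

    Returns None when the stage was never submitted, or when it has not
    completed and no current time is supplied to measure against.
    """
    if submission_ms is None:
        return None
    end_ms = completion_ms if completion_ms is not None else now_ms
    if end_ms is None:
        return None
    return end_ms - submission_ms


# Hypothetical timestamps in epoch milliseconds.
print(stage_duration_ms(1_000, 4_500))               # completed stage
print(stage_duration_ms(1_000, None, now_ms=2_000))  # still running
print(stage_duration_ms(None, None))                 # never submitted
```

Keeping the missing-timestamp cases explicit avoids reporting a misleading zero or negative duration for stages that were skipped or are still in flight.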
[jira] [Commented] (SPARK-11155) Stage summary json should include stage duration
[ https://issues.apache.org/jira/browse/SPARK-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961282#comment-14961282 ]

Xin Ren commented on SPARK-11155:
---------------------------------

Hi, I'd like to have a try on this one. Thanks