[GitHub] spark pull request #19698: [SPARK-20648][core] Port JobsTab and StageTab to ...

vanzin Wed, 08 Nov 2017 11:04:07 -0800

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/19698


    [SPARK-20648][core] Port JobsTab and StageTab to the new UI backend.

    This change is a little larger because there's a whole lot of logic
    behind these pages, all really tied to internal types and listeners,
    and some of that logic had to be implemented in the new listener and
    the needed data exposed through the API types.
    
    - Added missing StageData and ExecutorStageSummary fields which are
      used by the UI. Some json golden files needed to be updated to account
      for new fields.
    
    - Save RDD graph data in the store. This tries to re-use existing types as
      much as possible, so that the code doesn't need to be re-written; as a
      result, it's probably not optimal.
    
    - Some old classes (e.g. JobProgressListener) still remain, since they're
      used in other parts of the code; they're not used by the UI anymore,
      though, and will be cleaned up in a separate change.
    
    - Save information about active pools in the store. This data is not really
      used in the SHS, but it's not a lot of data so it's still recorded when
      replaying applications.
    
    - Because the new store sorts things slightly differently from the previous
      code, some json golden files had some elements within them shuffled
      around.
    
    - The retention unit test in UISeleniumSuite was disabled because the code
      to throw away old stages / tasks hasn't been added yet.
    
    - The job description field in the API tries to follow the old behavior,
      which leaves it empty most of the time, even though there's information
      to fill it in. For stages, a new field was added to hold the description
      (which is basically the job description), so that the UI can be rendered
      in the old way.
    
    - A new stage status ("SKIPPED") was added to account for the fact that the
      API couldn't represent that state before. Without this, the stage would
      show up as "PENDING" in the UI, which is now based on API types.
    
    - The API used to expose "executorRunTime" as the value of the task's
      duration, which wasn't really correct (also because that value was easily
      available from the metrics object); this change fixes that by storing the
      correct duration, which also means a few expectation files needed to be
      updated to account for the new durations and sorting differences due to
      the changed values.
    
    - Added changes to implement SPARK-20713 and SPARK-21922 in the new code.
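
To illustrate the "SKIPPED" item above, here is a minimal sketch of why a status model without that value would report a skipped stage as "PENDING". It is a hypothetical model, not the code in this change: the function `display_status` and its boolean parameters are invented for illustration, and the status names other than "SKIPPED" and "PENDING" are assumptions.

```python
from enum import Enum


class StageStatus(Enum):
    # "SKIPPED" and "PENDING" come from the PR description; the
    # remaining names are assumed for the sake of the example.
    ACTIVE = "ACTIVE"
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"
    PENDING = "PENDING"
    SKIPPED = "SKIPPED"


def display_status(submitted: bool, finished: bool, failed: bool,
                   job_finished: bool) -> StageStatus:
    """Hypothetical rule for choosing the status shown for a stage."""
    if not submitted:
        # A stage that never ran is only "pending" while its job is
        # still running. Once the job is done, the stage was skipped.
        return StageStatus.SKIPPED if job_finished else StageStatus.PENDING
    if not finished:
        return StageStatus.ACTIVE
    return StageStatus.FAILED if failed else StageStatus.COMPLETE


# A stage that was never submitted, in a job that has already finished.
print(display_status(submitted=False, finished=False, failed=False,
                     job_finished=True))
```

Without the `SKIPPED` member, the first branch could only return `PENDING`, which matches the behavior the description says the new status avoids.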
    
    Tested with existing unit tests (and by using the UI a lot).
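
The duration fix described above can be pictured with a small sketch. It assumes that the "correct duration" is the wall-clock time between a task's launch and its finish, which the description does not spell out. The record layout and the `launchTime` / `finishTime` keys are hypothetical; only "executorRunTime" is named in the description.

```python
# Hypothetical task record; times are in milliseconds.
task = {
    "launchTime": 1_000,
    "finishTime": 1_900,
    "taskMetrics": {"executorRunTime": 650},
}


def wall_clock_duration_ms(task: dict) -> int:
    """Duration as elapsed time between launch and finish (assumed definition)."""
    return task["finishTime"] - task["launchTime"]


def old_reported_duration_ms(task: dict) -> int:
    """What the description says the API used to report as the duration."""
    return task["taskMetrics"]["executorRunTime"]


print(wall_clock_duration_ms(task))    # 900
print(old_reported_duration_ms(task))  # 650
```

Under these assumptions the two values differ whenever a task spends time outside of running on the executor, which would explain why expectation files that sort or compare by duration had to change.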

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-20648

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19698
    
----
commit a22c45889d8fc0982caf4325eb729048537872bb
Author: Marcelo Vanzin <van...@cloudera.com>
Date:   2017-01-31T21:31:55Z

    [SPARK-20648][core] Port JobsTab and StageTab to the new UI backend.
    

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
