[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

JoshRosen Mon, 09 Mar 2015 11:54:50 -0700

Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/4435#issuecomment-77918134
  
    Hi @squito,
    
    I took a quick pass through this PR and design doc. I like the overall 
approach here:
    
    - Structuring the hierarchy around applications / jobs / stages / tasks is 
a good design.
    - Using our existing binary-compatibility checking tools to ensure JSON 
compatibility is a good idea.  The end-to-end "golden file" tests are good, 
too, since this guards against breakage unexpected changes in the JSON 
serialization / translation layer.
    - Building explicit versioning into the API from day one is a good idea.
    
    A couple of high-level questions:
    
    - Does this work significantly differently in the HistoryServer versus a 
live application UI?
    - How do the JSON endpoints respond to invalid requests or requests for 
stages / jobs / tasks whose information has been garbage-collected?  We should 
add a test to ensure that these endpoints return the right HTTP error codes.
    - Will adding / changing the Jackson dependency create a dependency-hell 
mess for us?  Should we pre-emptively resort to shading or a different 
packaging strategy?
    
    
    
    General code review comments:
    
    - I don't think that the public JSON response classes should be case 
classes: the addition of new fields will break binary compatiblity (since 
`apply / unapply` will have their signatures change, breaking users' ability to 
pattern-match on these objects).  We've already hit this problem with 
SparkListenerEvent, so I think we should seek to avoid it here.
    - Restrict the visibility of components as much as possible.  For example, 
`JsonRoute` and all of its subclasses should probably be `private[ui]`.
    - Imports need to be re-organized in several files.
    - The `WorkerJsonRoute.scala` file is empty.
    - I haven't been following the mailing list thread about enumerations very 
closely, but we should figure out whether the new `enum` is compatible with 
that decision.
    - The debugging `println`s should become log statements or be removed.
    - Minor code style nit, but there are a bunch of style violations here that 
aren't caught by our checker.  In general, you need extra whitespace around 
operators, braces, etc, e.g. `.map{case(k,v) => k` should be `.map { case (k, 
v) => k`.
    - I see unit tests for the individual components, but it would nice to add 
some end-to-end tests which can also serve as usage examples.  Take a look at 
the tests in `UISeleniumSuite` for inspiration.  It might be nice to add a test 
which spins up a driver, runs some long-running tasks / jobs, and polls the 
REST api to get their statuses.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-3454] [WIP] separate json endpoints for...

Reply via email to