[GitHub] spark pull request: SPARK-7889 Jobs progress of apps on complete p...

squito Fri, 26 Jun 2015 12:37:33 -0700

Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/6935#issuecomment-115849879
  
    sorry I didn't see that comment about the tests ... I wrote something up 
which mimics the existing `UISeleniumSuite`.
    
    ```scala
    From 2dddcd036a4cacf31d1ff5f15417fcdbcfdbc22e Mon Sep 17 00:00:00 2001
    From: Imran Rashid <iras...@cloudera.com>
    Date: Fri, 26 Jun 2015 14:16:16 -0500
    Subject: [PATCH] failing test case
    
    ---
     .../spark/deploy/history/HistoryServerSuite.scala  | 74 
+++++++++++++++++++++-
     1 file changed, 72 insertions(+), 2 deletions(-)
    
    diff --git 
a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala 
b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
    index e5b5e1b..a288490 100644
    --- 
a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
    +++ 
b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
    @@ -21,14 +21,24 @@ import java.net.{HttpURLConnection, URL}
     import java.util.zip.ZipInputStream
     import javax.servlet.http.{HttpServletRequest, HttpServletResponse}
     
    +import scala.concurrent.duration._
    +import scala.language.postfixOps
    +
     import com.google.common.base.Charsets
     import com.google.common.io.{ByteStreams, Files}
     import org.apache.commons.io.{FileUtils, IOUtils}
    +import org.json4s.JsonAST.JArray
    +import org.json4s.jackson.JsonMethods._
     import org.mockito.Mockito.when
    +import org.openqa.selenium.WebDriver
    +import org.openqa.selenium.htmlunit.HtmlUnitDriver
    +import org.scalatest.concurrent.Eventually
    +import org.scalatest.selenium.WebBrowser
     import org.scalatest.{BeforeAndAfter, Matchers}
     import org.scalatest.mock.MockitoSugar
     
    -import org.apache.spark.{JsonTestUtils, SecurityManager, SparkConf, 
SparkFunSuite}
    +import org.apache.spark.util.Utils
    +import org.apache.spark._
     import org.apache.spark.ui.SparkUI
     
     /**
    @@ -43,7 +53,7 @@ import org.apache.spark.ui.SparkUI
      * are considered part of Spark's public api.
      */
     class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with 
Matchers with MockitoSugar
    -  with JsonTestUtils {
    +  with JsonTestUtils with Eventually with WebBrowser {
     
       private val logDir = new File("src/test/resources/spark-events")
       private val expRoot = new 
File("src/test/resources/HistoryServerExpectations/")
    @@ -264,6 +274,66 @@ class HistoryServerSuite extends SparkFunSuite with 
BeforeAndAfter with Matchers
         justHrefs should contain(link)
       }
     
    +  test("incomplete apps get refreshed") {
    +
    +    implicit val webDriver: WebDriver = new HtmlUnitDriver
    +    implicit val formats = org.json4s.DefaultFormats
    +
    +    val testDir = Utils.createTempDir()
    +    testDir.deleteOnExit()
    +
    +    val myConf = new SparkConf()
    +      .set("spark.history.fs.logDirectory", testDir.getAbsolutePath)
    +      .set("spark.eventLog.dir", testDir.getAbsolutePath)
    +      .set("spark.history.fs.update.interval", "1")
    +      .set("spark.eventLog.enabled", "true")
    +      .set("spark.history.cache.interval", "1")
    +      .remove("spark.testing")
    +    val provider = new FsHistoryProvider(myConf)
    +    val securityManager = new SecurityManager(myConf)
    +
    +    val server = new HistoryServer(myConf, provider, securityManager, 
18080)
    +    val sc = new SparkContext("local", "test", myConf)
    +    server.initialize()
    +    server.bind()
    +    val port = server.boundPort
    +    try {
    +      val d = sc.parallelize(1 to 10)
    +      d.count()
    +      val appId = eventually(timeout(20 seconds), interval(100 
milliseconds)) {
    +        val json = getContentAndCode("applications", port)._2.get
    +        val apps = parse(json).asInstanceOf[JArray].arr
    +        apps.size should be (1)
    +        (apps(0) \ "id").extract[String]
    +      }
    +
    +      def getNumJobs(suffix: String): Int = {
    +        go to (s"http://localhost:$port/history/$appId$suffix";)
    +        findAll(cssSelector("tbody tr")).toIndexedSeq.size
    +      }
    +
    +      getNumJobs("") should be (1)
    +      d.count()
    +      eventually(timeout(10 seconds), interval(100 milliseconds)) {
    +        // this fails with patch https://github.com/apache/spark/pull/6935 
as well, I'm not sure why
    +        getNumJobs("") should be (2)
    +      }
    +
    +      getNumJobs("/jobs") should be (2)
    +      // Here is where we still have an error.  /jobs is handled by 
another servlet,
    +      // not controlled by the app cache.  We need some way for each of 
those servlets
    +      // to know that they can get expired, so they can be detached and 
then the reloaded
    +      // one can get re-attached
    +      d.count()
    +      getNumJobs("/jobs") should be (3)
    +
    +    } finally {
    +      sc.stop()
    +      server.stop()
    +    }
    +
    +  }
    +
       def getContentAndCode(path: String, port: Int = port): (Int, 
Option[String], Option[String]) = {
         HistoryServerSuite.getContentAndCode(new 
URL(s"http://localhost:$port/api/v1/$path";))
       }
    -- 
    2.2.2
    ```
    
    patch https://github.com/apache/spark/pull/6545 is closer to working (for 
some reason the cache does not seem to be getting refreshed in this PR, maybe 
my test is setup wrong).  But I do think the cache with a refresh interval is 
probably a better way to go, rather than reloading the data on every single 
request.
    
    I have a rough understanding of what is going wrong when we make requests 
to `http://localhost:18080/history/<appId>/jobs` -- after the first time, that 
is handled by a completely separate servlet, as @XuTingjun has pointed out, 
which is never detached.  We need a way to wrap each of the servlets for the UI 
so they know if they need to refresh.
    
    Also I just realized, the same problem applies to the rest api at 
`http://localhost:18080/api/v1/applications/`, but that should be much simpler 
to handle, its just one servlet.  We should add a test case and fix that too, 
though.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: SPARK-7889 Jobs progress of apps on complete p...

Reply via email to