Thanks Josh! These tests seem to cover the cases I'm looking for already =).
What's interesting, though, is that we still ran into SPARK-3736 despite these integration tests being in place to catch it. Specifically, when the master goes down and restarts, the workers should reconnect to it. Are the tests here run regularly, e.g. as part of the Jenkins build or a nightly build? If so, how did that test case pass while SPARK-3736 apparently still exists?

At any rate, I think I'll submit my fix PR without an extra automated test written for it, since it seems like FaultToleranceTest sufficiently covers what I need.

On Wed, Oct 15, 2014 at 2:42 PM, Josh Rosen <rosenvi...@gmail.com> wrote:

> There are some end-to-end integration tests of Master <-> Worker fault-tolerance in
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala
>
> I’ve actually been working to develop a more generalized Docker-based integration-testing framework for Spark in order to test Master <-> Worker interactions. I’d like to eventually clean up my code and release it publicly.
>
> On October 15, 2014 at 2:39:22 PM, Matthew Cheah (matthew.c.ch...@gmail.com) wrote:
>
> I think on a higher level I also want to ask why such unit testing has not actually been done in this codebase. If it's not a common practice to test message passing then I'm fine with leaving out the unit test, however I'm more curious as to why such testing was not done before.
>
> On Wed, Oct 15, 2014 at 2:18 PM, Chester Chen <ches...@alpinenow.com> wrote:
>
> > You can call resolve method on ActorSelection.resolveOne() to see if the actor is still there or the path is correct. The method returns a future and you can wait for it with timeout. This way, you know the actor is live or already dead or incorrect.
> >
> > Another way, is to send Identify method to ActorSystem, if it returns with correct identified message; then you can act on it, otherwise, ...
> >
> > hope this helps
> >
> > Chester
> >
> > On Wed, Oct 15, 2014 at 1:38 PM, Matthew Cheah <matthew.c.ch...@gmail.com> wrote:
> >
> >> What's happening when I do this is that the Worker tries to get the Master actor by calling context.actorSelection(), and the RegisterWorker message gets sent to the dead letters mailbox instead of being picked up by expectMsg. I'm new to Akka and I've tried various ways to registering a "mock" master to no avail.
> >>
> >> I would think there would be at least some kind of test for master - worker message passing, no?
> >>
> >> On Wed, Oct 15, 2014 at 11:28 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> >>
> >> > I don’t think there are test cases for Worker itself
> >> >
> >> > You can
> >> >
> >> > val actorRef = TestActorRef[Master](Props(classOf[Master], ...))(actorSystem)
> >> > actorRef.underlyingActor.receive(Heartbeat)
> >> >
> >> > and use expectMsg to test if Master can reply correct message by assuming Worker is absolutely correct
> >> >
> >> > Then in another test case to test if Worker can send register message to Master after receiving Master’s “re-register” instruction, (in this test case assuming that the Master is absolutely right)
> >> >
> >> > Best,
> >> >
> >> > --
> >> > Nan Zhu
> >> >
> >> > On Wednesday, October 15, 2014 at 2:04 PM, Matthew Cheah wrote:
> >> >
> >> > Thanks, the example was helpful.
> >> >
> >> > However, testing the Worker itself is a lot more complicated than WorkerWatcher, since the Worker class is quite a bit more complex. Are there any tests that inspect the Worker itself?
> >> >
> >> > Thanks,
> >> >
> >> > -Matt Cheah
> >> >
> >> > On Tue, Oct 14, 2014 at 6:40 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> >> >
> >> > You can use akka testkit
> >> >
> >> > Example:
> >> >
> >> > https://github.com/apache/spark/blob/ef4ff00f87a4e8d38866f163f01741c2673e41da/core/src/test/scala/org/apache/spark/deploy/worker/WorkerWatcherSuite.scala
> >> >
> >> > --
> >> > Nan Zhu
> >> >
> >> > On Tuesday, October 14, 2014 at 9:17 PM, Matthew Cheah wrote:
> >> >
> >> > Hi everyone,
> >> >
> >> > I’m adding some new message passing between the Master and Worker actors in order to address https://issues.apache.org/jira/browse/SPARK-3736 .
> >> >
> >> > I was wondering if these kinds of interactions are tested in the automated Jenkins test suite, and if so, where I could find some examples to help me do the same.
> >> >
> >> > Thanks!
> >> >
> >> > -Matt Cheah