Hey Cory,

Not even I'm that crazy! :) I have recycled the bootstrapped test environment, but the only nodes running are those used in this test suite.

I tried to use wait_for_messages initially and was a little confused as to what a "message" equated to (and again, in those tests I got a timeout as well, but I'm happy to retest). If I want to wait for a message, is it something from the right-hand side of status_set, for example 'Configuration has changed, restarting Carte.' at https://github.com/OSBI/layer-pdi/blob/master/reactive/pdi.py#L83?

Thanks

Tom

--------------

Director Meteorite.bi - Saiku Analytics Founder
Tel: +44(0)5603641316

(Thanks to the Saiku community we reached our Kickstart
<http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
goal, but you can always help by sponsoring the project
<http://www.meteorite.bi/products/saiku/sponsorship>)

On 15 March 2016 at 17:54, Cory Johns <cory.jo...@canonical.com> wrote:

> Tom,
>
> It's also important to note that sentry.wait() waits for *all* units in
> the deployment to settle for at least 30 seconds, so it might be possible
> that another unit that wasn't included in the status gist you provided is
> churning and causing it to time out. That's particularly possible if
> you're reusing the deployer instance and all 34+ of those machines (going
> by the machine numbers in your gist) are still extant; with that many
> machines, even the periodic update-status hooks could be overlapping enough
> to prevent the 30 second idle window from registering.
>
> I'd recommend using the wait_for_messages [1] alternative which relies on
> the charm to report its status explicitly and thus doesn't need to use
> heuristics like the 30 second idle window. It could also make your test
> case code a bit cleaner.
>
> And, of course, reusing units when possible and cleaning up between test
> cases can help, as well.
>
> [1]:
> https://pythonhosted.org/amulet/amulet.html#amulet.sentry.Talisman.wait_for_messages
>
> On Tue, Mar 15, 2016 at 1:02 PM, Tim Van Steenburgh <
> tim.van.steenbu...@canonical.com> wrote:
>
>> On Tue, Mar 15, 2016 at 12:30 PM, Tom Barber <t...@analytical-labs.com>
>> wrote:
>>
>>> Hi Tim,
>>>
>>> Why would I need to increase the timeout when the status says all the
>>> units are operational?
>>>
>>
>> The default wait time is 300s, with an "idle threshold" of 30s. Which
>> means, it waits for everything to be idle for 30s before returning from the
>> wait. This means that with the default timeout, if the env doesn't settle
>> within 4m30s, it'll time out. This may not be what's happening in your
>> case, but it's worth trying a longer timeout value to make sure.
>>
>>> The status dump came out of bundletester which said that it failed on
>>> the first wait(), I assume the status dump arrived at the same time?
>>> Bugs are allowed, the test was hacked up from a previous one, it doesn't
>>> do anything yet, I'm trying to make sure the logic works first.
>>>
>>> Tom
>>>
>>> On 15 March 2016 at 16:27, Tim Van Steenburgh <
>>> tim.van.steenbu...@canonical.com> wrote:
>>>
>>>> Hey Tom,
>>>>
>>>> 1. You can increase the wait time until it doesn't time out:
>>>> self.d.sentry.wait(timeout=1200)
>>>> 2. At what point in this sequence of commands was the status dump
>>>> captured?
>>>> 3. There is a bug here. You take a reference to the pdi/0 info dict on
>>>> line 1. It's the same object you use to get message2 and message3 later.
>>>> So, you'll get the same message that you got on line 1. You need `message3
>>>> = self.d.sentry['pdi'][0].info['workload-status'].get('message')`
>>>> instead.
>>>>
>>>> Hope this helps.
>>>>
>>>> On Tue, Mar 15, 2016 at 11:41 AM, Tom Barber <t...@analytical-labs.com>
>>>> wrote:
>>>>
>>>>> Okay back here again, so my nice leader election function looks like:
>>>>>
>>>>> def test_leader_election_failover(self):
>>>>>     unit = self.d.sentry['pdi'][0].info
>>>>>     message = unit['workload-status'].get('message')
>>>>>     ip = message.split(':', 1)[-1]
>>>>>     self.d.add_unit('pdi', 2)
>>>>>     self.d.sentry.wait()
>>>>>     message2 = unit['workload-status'].get('message')
>>>>>     ip2 = message2.split(':', 1)[-1]
>>>>>     self.assertEqual(ip, ip2)
>>>>>     self.d.remove_unit('pdi/0')
>>>>>     self.d.sentry.wait()
>>>>>     message3 = unit['workload-status'].get('message')
>>>>>     ip3 = message3.split(':', 1)[-1]
>>>>>
>>>>>     self.assertNotEqual(ip3, ip2)
>>>>>
>>>>> I know there's no logic in there, but I need to make sure the stuff
>>>>> actually functions.
>>>>>
>>>>> So Tim says wait() should work, but when I tested this last night,
>>>>> I get a timeout error on the wait right after add_unit.
>>>>>
>>>>> https://gist.github.com/buggtb/c271dd79d782af57dea6
>>>>>
>>>>> Yet in the status dump you can see all 3 units sat there seemingly
>>>>> happy.
>>>>>
>>>>> Tom
>>>>>
>>>>> On 9 March 2016 at 18:31, Tom Barber <t...@analytical-labs.com> wrote:
>>>>>
>>>>>> Oh really?
>>>>>>
>>>>>> /me strokes his invisible beard.
>>>>>>
>>>>>> Okay I'll go back and try again.
>>>>>>
>>>>>> Tom
>>>>>>
>>>>>> On 9 March 2016 at 16:56, Tim Van Steenburgh <
>>>>>> tim.van.steenbu...@canonical.com> wrote:
>>>>>>
>>>>>>> On Wed, Mar 9, 2016 at 6:31 AM, Tom Barber <t...@analytical-labs.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Stuart.
>>>>>>>>
>>>>>>>> I do put a note in my charm message indicating the leader IP
>>>>>>>> address so that users know which to connect to.
>>>>>>>>
>>>>>>>> So with juju wait, would I destroy a unit then execute juju wait?
>>>>>>>> At which point it will hang until the leader election stuff is over
>>>>>>>> and all becomes stable again?
>>>>>>>>
>>>>>>> Since you're already using amulet, there's no need to use the
>>>>>>> juju-wait plugin since d.sentry.wait() does the same thing. So yes,
>>>>>>> you would do d.remove_unit(...) and then call d.sentry.wait().
>>>>>>>
>>>>>>>> Also, will this work if I push it upstream to the charmers and the
>>>>>>>> automated tests up there?
>>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Tom
>>>>>>>>
>>>>>>>> On 9 March 2016 at 11:00, Stuart Bishop <
>>>>>>>> stuart.bis...@canonical.com> wrote:
>>>>>>>>
>>>>>>>>> On 9 March 2016 at 20:31, Tom Barber <t...@analytical-labs.com>
>>>>>>>>> wrote:
>>>>>>>>> > Morning all
>>>>>>>>> >
>>>>>>>>> > I'm trying to test for charm reconfiguration if the leader goes
>>>>>>>>> > AWOL.
>>>>>>>>>
>>>>>>>>> I put the role of the unit in its workload status, so it is easy for
>>>>>>>>> operators to see which unit is master. And this also makes it easy
>>>>>>>>> for tests to tell.
>>>>>>>>>
>>>>>>>>> > Adam suggested that I watch the status waiting for the next
>>>>>>>>> > leader election hook the wait on that and then check my service
>>>>>>>>> > configs.
>>>>>>>>>
>>>>>>>>> You are best off waiting for all the hooks to complete and a steady
>>>>>>>>> state, not just leader elected (since things will still be in flux
>>>>>>>>> when that hook fires, such as the leader-settings-changed hooks it
>>>>>>>>> will probably trigger and the relation changes those hooks will
>>>>>>>>> likely trigger). Use the juju-wait plugin, and maybe add support to
>>>>>>>>> https://bugs.launchpad.net/juju-core/+bug/1488777 to get this into
>>>>>>>>> core.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Stuart Bishop <stuart.bis...@canonical.com>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Juju mailing list
>>>>>>>> Juju@lists.ubuntu.com
>>>>>>>> Modify settings or unsubscribe at:
>>>>>>>> https://lists.ubuntu.com/mailman/listinfo/juju

--
Juju mailing list
Juju@lists.ubuntu.com
Modify settings or unsubscribe at:
https://lists.ubuntu.com/mailman/listinfo/juju
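
Tim's point 3 in the quoted thread (look the status up again after each wait instead of reusing the dict saved on the first line of the test) could be sketched roughly as below. This is only an illustration, not amulet code: the lookup chain is the one from Tim's reply, but workload_message, leader_ip and fake_deployment are hypothetical names, the 'Leader: <ip>' message format is assumed, and the stand-in deployment simply swaps in a new sentry to imitate the status changing.

```python
from types import SimpleNamespace


def workload_message(deployment, service, index=0):
    """Read the unit's workload-status message afresh on every call."""
    # Same lookup chain as in Tim's reply, just never cached between calls.
    info = deployment.sentry[service][index].info
    return info['workload-status'].get('message')


def leader_ip(message):
    """Return whatever follows the first ':' in the status message."""
    return message.split(':', 1)[-1].strip()


def fake_deployment(message):
    """Hypothetical stand-in for an amulet deployment (illustration only)."""
    unit = SimpleNamespace(info={'workload-status': {'message': message}})
    return SimpleNamespace(sentry={'pdi': [unit]})


# 'Leader: <ip>' is an assumed message format, not taken from the charm.
d = fake_deployment('Leader: 10.0.0.1')
saved_info = d.sentry['pdi'][0].info  # the dict the original test held on to

# Imitate the status changing after remove_unit() and wait(). Swapping in a
# new sentry is an assumption about how the real deployment behaves.
d.sentry = fake_deployment('Leader: 10.0.0.2').sentry

stale_ip = leader_ip(saved_info['workload-status'].get('message'))
fresh_ip = leader_ip(workload_message(d, 'pdi'))
print(stale_ip, fresh_ip)  # prints: 10.0.0.1 10.0.0.2
```

In a real test the same helper would be called with self.d after each self.d.sentry.wait(), so ip, ip2 and ip3 each come from a new lookup rather than from the first one.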