Matthew Booth wrote:
On Tue, Nov 10, 2015 at 6:46 PM, Joshua Harlow <harlo...@fastmail.com
<mailto:harlo...@fastmail.com>> wrote:

    Matthew Booth wrote:

        My patch to MessageHandlingServer is currently being reverted
        because it
        broke Nova tests:

        https://review.openstack.org/#/c/235347/

        Specifically it causes a number of tests to take a very long time to
        execute, which ultimately results in the total build time limit
        being
        exceeded. This is very easy to replicate. The
        test
        
nova.tests.functional.test_server_group.ServerGroupTest.test_boot_servers_with_affinity
        is an example test which will always hit this issue. The problem is
        that ServerGroupTest.setUp() does:

                  self.compute2 = self.start_service('compute',
        host='host2')
                  self.addCleanup(self.compute2.kill)

        The problem with this is that start_service() adds a fixture
        which also
        adds kill as a cleanup method. kill does stop(), wait(). This
        means that
        the resulting call order is: start, stop, wait, stop, wait. The
        redundant call to kill is obviously a wart, but I feel we should
        have
        handled it anyway.

        The problem is that we decided it should be possible to restart a
        server. There are some unit tests in oslo.messaging that do
        this. It's
        not clear to me that there are any projects which do this, but after
        this experience I feel like it would be good to check before
        changing it :)

        The implication of that is that after wait() the state wraps,
        and we're
        now waiting on start() again. Consequently, when the second
        cleanup call
        hangs.

        We could fix Nova (at least the usage we have seen) by removing the
        wrapping. After wait() if you want to start a server again you
        need to
        create a new one.

        So, to be specific, lets consider the following 2 call sequences:

        1. start stop wait stop wait
        2. start stop wait start stop wait

        What should they do? The behaviours with and without wrapping are:

        1. start stop wait stop wait
        WRAP: start stop wait HANG HANG
        NO WRAP: start stop wait NO-OP NO-OP

        2. start stop wait start stop wait
        WRAP: start stop wait start stop wait
        NO WRAP: start stop wait NO-OP NO-OP NO-OP

        I'll refresh my memory on what they did before my change in the
        morning.
        Perhaps it might be simpler to codify the current behaviour, but
        iirc I
        proposed this because it was previously undefined due to races.


    I personally prefer not allowing restarting, its needless code
    complexity imho and a feature that people imho probably aren't using
    anyway (just create a new server object if u are doing this), so I'd
    be fine with doing the above NO WRAP and turning those into NO-OPs
    (and for example raising a runtime error in the case of start stop
    wait start ... to denote that restarting isn't
    recommended/possible). If we have a strong enough reason to really
    to start stop wait start ...

    I might be convinced the code complexity is worth it but for now I'm
    not convinced...


I agree, and in the hopefully unlikely event that we did break anybody,
at least they would get an obvious exception rather than a hang. A
lesson from breaking nova was that the log messages were generated and
were available in the failed test runs, but nobody noticed them.

Incidentally, I think I'd also merge my second patch into the first
before resubmitting, which adds timeouts and the option not to log.


+1

Makes sense to me.

IMHO we can remove the log output later if we determine it's to noisy for folks (and/or not helping)...

Matt

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to