On 05/22/2017 05:39 AM, Matthew Booth wrote:

There are also a couple of optimisations to make which I won't bother with up front. Dan suggested in his CellsV2 talk that we would only query cells where the user actually has instances. If we find users tend to clump in a small number of cells this would be a significant optimisation, although the overhead on the api node for a query returning no rows is probably very little. Also, I think you mentioned that there's an option to tell SQLA not to batch-process rows, but that it is less efficient for total throughput? I suspect there would be a point at which we'd want that.
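To make the "only query cells where the user actually has instances" optimisation concrete, here is a minimal sketch. It assumes the API database can map a project to the cells holding its instances; the rows, the project_id column and the cells_for_project() helper below are hypothetical stand-ins, not nova code.

```python
# Hypothetical stand-in for rows mapping instances to cells:
# (instance_uuid, project_id, cell_name). Not the real nova schema.
INSTANCE_MAPPINGS = [
    ("uuid-1", "project-a", "cell1"),
    ("uuid-2", "project-a", "cell1"),
    ("uuid-3", "project-b", "cell2"),
    ("uuid-4", "project-a", "cell3"),
]
ALL_CELLS = ["cell1", "cell2", "cell3", "cell4"]


def cells_for_project(mappings, project_id):
    """Return the sorted names of cells holding an instance of project_id."""
    return sorted({cell for _uuid, project, cell in mappings
                   if project == project_id})


# Only these cells need to be queried; the rest can be skipped.
targets = cells_for_project(INSTANCE_MAPPINGS, "project-a")
skipped = [cell for cell in ALL_CELLS if cell not in targets]
```

Whether this saves anything in practice depends on how users are spread across cells, as noted above.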

It's the yield_per() option, and I think you should use it up front, just so it's there and we can hit any issues it might cause (there shouldn't be any, provided no eager loading is used). Have it yield about 50 rows at a time. I think the pymysql driver these days does not actually buffer the rows, but 50 is very little anyway.
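As an illustration of what fetching in small batches means, here is a standard-library analogue using sqlite3 and cursor.fetchmany(). It is not SQLAlchemy code: with SQLAlchemy the equivalent would be calling yield_per(50) on the query. The table, the data and the iter_rows_in_batches() helper are made up for the sketch.

```python
import sqlite3


def iter_rows_in_batches(cursor, batch_size=50):
    """Yield rows one at a time while fetching them batch_size at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield from batch


# Hypothetical in-memory table standing in for a cell's instances table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO instances (name) VALUES (?)",
    [("vm-%d" % i,) for i in range(120)],
)

cursor = conn.execute("SELECT id, name FROM instances ORDER BY id")
rows = list(iter_rows_in_batches(cursor, batch_size=50))
```

The consumer sees a plain row iterator, so it can stop early without the remaining batches ever being fetched.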




If there's a reasonable way to calculate
a tipping point, that might give us some additional life.

Bear in mind that the principal advantages to not using Searchlight are:

* It is simpler to implement
* It is simpler to manage
* It will return accurate results

Following the principle of 'as simple as possible, but no simpler', I think there's enormous benefit to this much simpler approach for anybody who doesn't need a more complex approach. However, while it reduces the urgency of something like the Searchlight solution, I expect there are going to be deployments which need that.
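For anyone who hasn't looked at the POC, the general idea of a sorted multi-cell listing can be sketched as a merge of per-cell result streams that are each already sorted. This is only an illustration under stated assumptions, not the POC code: list_instances(), the dict-shaped instances and the marker handling are hypothetical, sort keys are assumed unique, and in practice each cell's own query would apply the marker and limit as well.

```python
import heapq
from itertools import islice


def by_created(instance):
    """Sort key used by every cell and by the merge."""
    return instance["created_at"]


def list_instances(cell_results, sort_key, limit, marker=None):
    """Merge per-cell results, each already sorted by sort_key, into a page.

    marker is the sort key of the last item on the previous page.
    """
    merged = heapq.merge(*cell_results, key=sort_key)
    if marker is not None:
        merged = (inst for inst in merged if sort_key(inst) > marker)
    return list(islice(merged, limit))


# Hypothetical per-cell results, each sorted by created_at.
cell1 = [{"uuid": "a", "created_at": 1}, {"uuid": "d", "created_at": 4}]
cell2 = [{"uuid": "b", "created_at": 2}, {"uuid": "e", "created_at": 5}]
cell3 = [{"uuid": "c", "created_at": 3}]

page1 = list_instances([cell1, cell2, cell3], by_created, limit=2)
page2 = list_instances([cell1, cell2, cell3], by_created, limit=2,
                       marker=by_created(page1[-1]))
```

Because heapq.merge is lazy, only as many rows as the page needs are pulled from each cell's stream.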


        Moreover, instance operations (create, delete) may run in
        parallel with the pagination/sort query, some cells may not
        respond in time, the network connection may be broken, and
        many other abnormal cases may happen. How to deal with
        abnormal query responses from some cells is also an important
        factor to consider.


Aside: For a query operation, what's the better user experience when a single cell is failing:

1. The whole query fails.
2. The user gets incomplete results.

Either of these is simple to implement. Incomplete results would additionally be logged as an ERROR, but I can't think of any way to also tell the user that there's a problem with the data we returned without throwing an error.
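To make the two options concrete, here is a rough sketch of both behaviours behind one flag. It is an assumption-laden illustration, not nova code: query_cells(), the allow_partial flag and the per-cell callables are hypothetical, and timeouts are left out for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def query_cells(cell_queries, allow_partial=False):
    """Run one query callable per cell and combine the results.

    Returns (instances, failed_cells). With allow_partial=False the first
    cell failure is re-raised, so the whole query fails (option 1).
    With allow_partial=True failing cells are recorded and skipped
    (option 2).
    """
    instances, failed = [], []
    with ThreadPoolExecutor(max_workers=len(cell_queries) or 1) as pool:
        futures = {name: pool.submit(fn) for name, fn in cell_queries.items()}
        for name, future in futures.items():
            try:
                instances.extend(future.result())
            except Exception:
                if not allow_partial:
                    raise
                failed.append(name)
    return instances, failed


def healthy():
    return ["inst-1", "inst-2"]


def broken():
    raise ConnectionError("cell2 database unreachable")


instances, failed = query_cells({"cell1": healthy, "cell2": broken},
                                allow_partial=True)
```

The failed list is what would be logged as an ERROR in option 2; how to surface it to the API user is the open question.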

Thoughts?

Matt


        It's not a good idea to support pagination and sorting at the
        same time (it may not provide exactly the result the end user
        wants) if Searchlight is not going to be integrated.

        In fact, in Tricircle, when ports are queried from a Neutron
        where the Tricircle central plugin is installed, the central
        plugin does a similar ports query across the local Neutrons,
        and it does not support pagination and sorting together.

        Best Regards
        Chaoyi Huang (joehuang)

        ________________________________________
        From: Matt Riedemann [mriede...@gmail.com]
        Sent: 19 May 2017 5:21
        To: openstack-dev@lists.openstack.org
        Subject: [openstack-dev] [nova] Boston Forum session recap -
        searchlight integration

        Hi everyone,

        After previous summits where we had vertical tracks for Nova
        sessions I would provide a recap for each session.

        The Forum in Boston was a bit different, so here I'm only
        attempting to recap the Forum sessions that I ran. Dan Smith led
        a session on Cells v2, John Garbutt led several sessions on the
        VM and Baremetal platform concept, and Sean Dague led sessions
        on hierarchical quotas and API microversions, and I'm going to
        leave recaps for those sessions to them.

        I'll do these one at a time in separate emails.


        Using Searchlight to list instances across cells in nova-api
        ------------------------------------------------------------

        The etherpad for this session is here [1]. The goal for this
        session was to explain the problem and proposed plan from the
        spec [2] to the operators in the room and get feedback.

        Polling the room we found that not many people are deploying
        Searchlight but most everyone was using ElasticSearch.

        An immediate concern that came up was the complexity involved
        with integrating Searchlight, especially around issues with
        latency for state changes and questioning how this does not redo
        the top-level cells v1 sync issue. It admittedly does to an
        extent, but we don't have all of the weird side code paths with
        cells v1 and it should be self-healing. Kris Lindgren noted that
        the instance.usage.exists periodic notification from the
        computes hammers their notification bus; we suggested he report
        a bug so we can fix that.

        It was also noted that if data is corrupted in ElasticSearch or
        is out of sync, you could re-sync that from nova to searchlight.
        However, searchlight syncs up with nova via the compute REST
        API, so if the compute REST API is using searchlight in the
        backend, you end up getting into an infinite loop of broken.
        This could probably be fixed with bypass query options in the
        compute API, but it's not a fun problem.

        It was also suggested that we store a minimal set of data about
        instances in the top-level nova API database's instance_mappings
        table, where all we have today is the uuid. Anything that is set
        in the API would probably be OK for this, but operators in the
        room noted that they frequently need to filter instances by an
        IP, which is set in the compute. So this option turns into a
        slippery slope, and is potentially not inter-operable across
        clouds.

        Matt Booth is also skeptical of the idea that we can't have a
        multi-cell query perform well, and he's proposed a POC here [3].
        If that works out, then it defeats the main purpose for using
        Searchlight for listing instances in the compute API.

        Since sorting instances across cells is the main issue, it was
        also suggested that we allow a config option to disable sorting
        in the API. It was stated this would be without a microversion,
        and filtering/paging would still be supported. I'm personally
        skeptical about how this could be considered inter-operable or
        discoverable for API users, and it would need more thought and
        input from users like Monty Taylor and Clark Boylan.

        Next steps are going to be fleshing out Matt Booth's POC for
        efficiently listing instances across cells. I think we can still
        continue working on the versioned notifications changes we're
        making for searchlight as those are useful on their own. And we
        should still work on enabling searchlight in the nova-next CI
        job so we can get an idea of how the versioned notifications are
        working for a consumer. However, any major development for
        actually integrating searchlight into Nova is probably on hold
        at the moment until we know how Matt's POC works.

        [1] https://etherpad.openstack.org/p/BOS-forum-using-searchlight-to-list-instances
        [2] https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/list-instances-using-searchlight.html
        [3] https://review.openstack.org/#/c/463618/

        --

        Thanks,

        Matt

        
        __________________________________________________________________________
        OpenStack Development Mailing List (not for usage questions)
        Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
        http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

        




--
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)


