GitHub user supertom opened a pull request:

    https://github.com/apache/libcloud/pull/813

    [GCE] LIBCLOUD-826: Improve performance of list nodes by caching volume information

    ## [GCE] Improve performance of list nodes by caching volume information
    
    ### Description
    
    When listing nodes, the GCE driver currently calls the disk API once for
    each disk attached to the node.  This PR changes that behavior by using
    the aggregatedList call for disks (once per list_nodes request) and using
    that information to provide disk details.
    
    See sample performance info in [LIBCLOUD-826](https://issues.apache.org/jira/browse/LIBCLOUD-826)
    
    #### Implementation:
    
    For disk information, aggregated calls are now always used and the
    results are stored in a dictionary called 'volume_dict'.  If the user
    would like the most current information, they may set the use_cache
    keyword to False; the aggregated call (and the subsequent repopulation of
    volume_dict) is then made before disk information is returned.
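
A minimal sketch of the refresh behavior described above, not the driver's actual code. The class name `CachedVolumeLookup` and the `fetch_all_disks` callable are hypothetical stand-ins for the driver and its aggregated disk call; only `volume_dict` and `use_cache` come from this PR.

```python
class CachedVolumeLookup:
    """Illustrative only: holds a volume_dict filled by one bulk call."""

    def __init__(self, fetch_all_disks):
        # fetch_all_disks is a hypothetical callable standing in for the
        # aggregated disk request; it returns a dict keyed by disk name.
        self._fetch_all_disks = fetch_all_disks
        self.volume_dict = None

    def get_volume(self, name, use_cache=True):
        # Refresh when the cache was never built or the caller opts out.
        if self.volume_dict is None or not use_cache:
            self.volume_dict = self._fetch_all_disks()
        return self.volume_dict.get(name)
```

With this shape, repeated lookups cost one dictionary access each, and passing `use_cache=False` pays for exactly one fresh bulk call before answering.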
    
    Code was added or changed in two classes.  GCENodeDriver gains two methods
    and an additional parameter to build the volume cache, look up volumes in
    it, and toggle its refresh.  GCEConnection gains convenience and helper
    methods, which not only support this performance improvement but also
    support the longer-term vision of leveraging aggregatedList calls
    elsewhere.
    
    ##### GCEConnection 
    * _(new method)_ **def request_aggregated_items(self, api_name)** - makes
      all necessary calls, handling maxResults and saving the 'items' portion
      of each response.
    
    * _(new method)_ **def _merge_response_items(self, list_name,
      response_list)** - helper method to merge responses into a single
      dictionary.
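
The two helpers above could look roughly like the following sketch. It is not the code in this PR: `fetch_page` is a hypothetical callable replacing the HTTP request, and the response shape (an `items` dict keyed by scope such as `zones/us-central1-a`, plus an optional `nextPageToken`) is an assumption based on the GCE aggregatedList API.

```python
def merge_response_items(list_name, response_list):
    """Merge the 'items' of several aggregated responses into one dict."""
    merged = {}
    for response in response_list:
        for scope, payload in response.get('items', {}).items():
            # Scopes with no resources carry only a 'warning' entry, so
            # default to an empty list for them.
            target = merged.setdefault(scope, {list_name: []})
            target[list_name].extend(payload.get(list_name, []))
    return {'items': merged}


def request_aggregated_items(fetch_page, api_name):
    """Follow page tokens until exhausted, then merge every page."""
    responses = []
    page_token = None
    while True:
        # fetch_page(api_name, page_token) is assumed to return one page.
        response = fetch_page(api_name, page_token)
        responses.append(response)
        page_token = response.get('nextPageToken')
        if not page_token:
            break
    return merge_response_items(api_name, responses)
```

Keeping the merge separate from the paging loop means the same merge step can serve any other aggregated resource by passing a different `list_name`.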
    
    ##### GCENodeDriver
    * _(new member)_ **volume_dict** - dictionary keyed by disk name, then by
      zone.  The name is always available to us, but it is not unique across
      zones.  The zone, though, is only optionally supplied.  By keying on
      name first, we remove the need to search through the entire list of
      disks each time and can do a single hash lookup to reach all disks with
      that name.  If we have the zone, a second hash lookup selects the disk;
      if not, we take the first zone key alphabetically.
    
    * _(new method)_ **_build_name_zone_dict(self, zone_dict)** - internal
      method to populate volume_dict.
    
    * _(new method)_ **_ex_lookup_volume(self, name, zone=None)** - implements
      the actual lookup.  If zone is not provided, the disk with that name
      from the alphabetically first zone is returned (this only matters when
      two or more disks share the same name).
    
    * _(new parameter)_ **list_nodes(self, ex_zone=None, use_disk_cache=True)** -
      use_disk_cache parameter for list_nodes to pass through; defaults to
      True.  If set to True, no more than one call per 500 disks is made to
      populate all disk info for nodes.
    
    * _(new parameter, revision)_ **ex_get_volume(self, name, zone=None,
      use_cache=False)** - revised to check whether volume_dict has been
      populated and whether it should be used, then to return the result of
      _ex_lookup_volume.
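
The name-then-zone organization of volume_dict can be sketched as below. This is an illustration under assumptions rather than the driver's implementation: the function names drop the leading underscores and `self`, and the input is assumed to be the `items` dict of an aggregated disk response, keyed by `zones/<zone>`.

```python
def build_name_zone_dict(zone_dict):
    """Reorganize {scope: {'disks': [...]}} into {name: {zone: disk}}."""
    name_zone_dict = {}
    for scope, payload in zone_dict.items():
        zone = scope.split('/')[-1]  # 'zones/us-central1-a' -> 'us-central1-a'
        for disk in payload.get('disks', []):
            name_zone_dict.setdefault(disk['name'], {})[zone] = disk
    return name_zone_dict


def lookup_volume(volume_dict, name, zone=None):
    """Return the disk for name (and zone, if given), else None."""
    by_zone = volume_dict.get(name)
    if not by_zone:
        return None
    if zone is not None:
        return by_zone.get(zone)
    # No zone supplied: fall back to the alphabetically first zone.
    return by_zone[min(by_zone)]
```

Both branches are plain dictionary accesses, so a lookup never scans the full disk list; the alphabetical fallback only changes the result when the same name exists in several zones.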
    
    ### Status
    - done, ready for review
    
    ### Checklist (tick everything that applies)
    
    - [X] [Code linting](http://libcloud.readthedocs.org/en/latest/development.html#code-style-guide) (required, can be done after the PR checks)
    - [ ] Documentation
    - [X] [Tests](http://libcloud.readthedocs.org/en/latest/testing.html)
    - [ ] [ICLA](http://libcloud.readthedocs.org/en/latest/development.html#contributing-bigger-changes) (required for bigger changes)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/supertom/libcloud LIBCLOUD-826

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/libcloud/pull/813.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #813
    
----
commit ae8ce86eb899e7d33d558dd55b556c308b8a1449
Author: Tom Melendez <t...@supertom.com>
Date:   2016-05-31T16:57:43Z

    Merge pull request #1 from apache/trunk
    
    Update from head

commit c436531a190da59e5b1e64ba94e50e2e0bb7ddbd
Author: Tom Melendez <super...@google.com>
Date:   2016-06-06T15:57:17Z

    Merge remote-tracking branch 'upstream/trunk' into trunk

commit 3e7a5398f2621363b4ec5d5b0602fc0f4287534c
Author: Tom Melendez <super...@google.com>
Date:   2016-06-13T20:16:21Z

    Merge remote-tracking branch 'upstream/trunk' into LIBCLOUD-826

commit 66bf0a67951a5fcef31348d123973de9859e6f41
Author: Tom Melendez <super...@google.com>
Date:   2016-06-15T01:47:06Z

    Merge remote-tracking branch 'upstream/trunk' into LIBCLOUD-826

commit d4022c4d8eaf982d7924d315610705532361255f
Author: Tom Melendez <super...@google.com>
Date:   2016-06-15T02:08:33Z

    GCE list nodes performance improvement.  Resolves LIBCLOUD-826.
    
    We leverage the aggregated disk call and store the result.  For the list
    nodes operation, we've added an extra parameter to use the cached data,
    which defaults to true.
    
    Tests and fixtures updated as well.

----


