On 05/08/16 12:01, Hongbin Lu wrote:
Add [heat] to the title to get more feedback.



Best regards,

Hongbin



*From:* Ricardo Rocha [mailto:[email protected]]
*Sent:* August-05-16 5:48 AM
*To:* OpenStack Development Mailing List (not for usage questions)
*Subject:* Re: [openstack-dev] [magnum] 2 million requests / sec, 100s
of nodes



Hi.



Quick update: we're now at 1000 nodes and 7 million reqs/sec :) - the number of
requests should have been higher, but we hit some internal issues. We have a
submission for Barcelona that will provide a lot more details.



But a couple of questions came up during the exercise:



1. Do we really need a volume in the VMs? On large clusters this is a
burden; shouldn't local storage alone be enough?



2. We observe a significant delay (~10 min, half the total time to
deploy the cluster) in Heat while it seems to be crunching the
kube_minions nested stacks. Once that's done, it still adds new stacks
gradually, so it doesn't look like it precomputed all the info in advance.



Has anyone tried to scale Heat to stacks this size? We end up with a stack containing:

* 1000 nested stacks (depth 2)

* 22000 resources

* 47008 events
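For context, those nested stacks come from a Heat ResourceGroup; a minimal sketch (not Magnum's actual template, the file name and parameter are illustrative) of how a single ResourceGroup fans out into 1000 nested stacks:

```yaml
heat_template_version: 2015-10-15

parameters:
  number_of_minions:
    type: number
    default: 1000

resources:
  kube_minions:
    type: OS::Heat::ResourceGroup
    properties:
      count: { get_param: number_of_minions }
      # Each group member is instantiated as its own nested stack,
      # so count: 1000 yields 1000 nested stacks at depth 2.
      resource_def:
        type: kubeminion.yaml
```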

Wow, that's a big stack :) TripleO has certainly been pushing the boundaries of how big a stack Heat can handle, but this sounds like another step up even from there.

And we already changed most of the timeout/retry values for RPC to get
this working.



This delay is already visible in clusters of 512 nodes, but 40% of the
total time at 1000 nodes seems like something we could improve. Any hints on
Heat configuration optimizations for large stacks are very welcome.

Y'all were right to set max_resources_per_stack to -1, because actually checking the number of resources in a tree of stacks is sloooooow. (Not as slow as it used to be when it was O(n^2), but still pretty slow.)
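For reference, that's a single heat.conf option (name as of the Mitaka/Newton-era config; check your release):

```ini
[DEFAULT]
# -1 disables the per-stack resource count check entirely, avoiding
# the slow walk over the whole tree of nested stacks.
max_resources_per_stack = -1
```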

We're actively working on trying to make Heat more horizontally scalable (even at the cost of some performance penalty) so that if you need to handle this kind of scale then you'll be able to reach it by adding more heat-engines. Another big step forward on this front is coming with Newton, as (barring major bugs) the convergence_engine architecture will be enabled by default.
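A minimal heat.conf sketch of the scale-out knobs mentioned here (option names from the Newton-era heat.conf; the worker count of 8 is just an example):

```ini
[DEFAULT]
# Enabled by default from Newton on; distributes stack work so it
# can be spread across additional heat-engine processes/hosts.
convergence_engine = true
# More engine workers per host; for real scale-out, also run
# heat-engine on additional hosts.
num_engine_workers = 8
```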

RPC timeouts are caused by the synchronous work that Heat does before returning a result to the caller. Most of this is validation of the data provided by the user. We've talked about trying to reduce the amount of validation done synchronously to a minimum (just enough to guarantee that we can store and retrieve the data from the DB) and push the rest into the asynchronous part of the stack operation alongside the actual create/update. (FWIW, TripleO typically uses a 600s RPC timeout.)
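For illustration, the 600s RPC timeout TripleO uses would be set like this (`rpc_response_timeout` is the oslo.messaging option Heat reads; the default is 60):

```ini
[DEFAULT]
# Seconds to wait for an RPC response before timing out; TripleO
# typically raises this from the 60s default to 600s.
rpc_response_timeout = 600
```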

The "QueuePool limit of size ... overflow ... reached" sounds like we're pulling messages off the queue even when we don't have threads available in the pool to pass them to. If you have a fix for this it would be much appreciated. However, I don't think there's any guarantee that just leaving messages on the queue can't lead to deadlocks. The problem with very large trees of nested stacks is not so much that it's a lot of stacks (Heat doesn't have _too_ much trouble with that) but that they all have to be processed simultaneously. e.g. to validate the top-level stack you also need to validate all of the lower-level stacks before returning the result. If higher-level stacks consume all of the thread pools then you'll get a deadlock, as you'll be unable to validate any lower-level stacks. At this point you'd have maxed out the capacity of your Heat engines to process stacks simultaneously and you'd need to scale out to more Heat engines. The solution is probably to try to limit the number of nested stack validations we send out concurrently.
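The deadlock described here is the classic one where parent tasks occupy every worker while waiting on child tasks queued to the same pool. A hypothetical Python sketch (not Heat's actual code; all names are illustrative) of one way out, giving each nesting depth its own pool so parents can never starve their children:

```python
import concurrent.futures

# Hypothetical sketch, not Heat's code: one worker pool per nesting
# depth, so top-level validations that block on their children can
# never consume the workers those children need.
top_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
nested_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def validate_nested(stack_id):
    # Leaf validation: does not submit further work to any pool.
    return f"{stack_id}: ok"

def validate_top_level(stack_id, children):
    # Blocks until every child validates, but the children run in
    # nested_pool, so a saturated top_pool cannot starve them.
    futures = [nested_pool.submit(validate_nested, c) for c in children]
    return [f.result() for f in futures]

tops = [
    top_pool.submit(
        validate_top_level,
        f"stack-{i}",
        [f"stack-{i}/minion-{j}" for j in range(3)],
    )
    for i in range(8)
]
results = [t.result() for t in tops]
print(len(results))  # 8: all top-level stacks validated, no deadlock
top_pool.shutdown()
nested_pool.shutdown()
```

With a single shared pool of 4 workers, the same 8 top-level submissions could deadlock as soon as 4 parents blocked on children that could never be scheduled; capping the number of concurrent nested validations, as suggested above, achieves the same safety by bounding the blockers instead of separating the pools.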

Improving performance at scale is a priority area of focus for the Heat team at the moment. That's been mostly driven by TripleO and Sahara, but we'd be very keen to hear about the kind of loads that Magnum is putting on Heat and working with folks across the community to figure out how to improve things for those use cases.

cheers,
Zane.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
