On 08 Aug 2016, at 11:51, Ricardo Rocha <[email protected]> wrote:

> Hi.
>
> On Mon, Aug 8, 2016 at 1:52 AM, Clint Byrum <[email protected]> wrote:
>> Excerpts from Steve Baker's message of 2016-08-08 10:11:29 +1200:
>>> On 05/08/16 21:48, Ricardo Rocha wrote:
>>>> Hi.
>>>>
>>>> Quick update: it's 1000 nodes and 7 million reqs/sec :) - the number
>>>> of requests should be higher, but we had some internal issues. We
>>>> have a submission for Barcelona to provide a lot more details.
>>>>
>>>> But a couple of questions came up during the exercise:
>>>>
>>>> 1. Do we really need a volume in the VMs? On large clusters this is
>>>> a burden; shouldn't local storage alone be enough?
>>>>
>>>> 2. We observe a significant delay (~10 min, which is half the total
>>>> time to deploy the cluster) in Heat when it seems to be crunching
>>>> the kube_minions nested stacks. Once it's done, it still adds new
>>>> stacks gradually, so it doesn't look like it precomputed all the
>>>> info in advance.
>>>>
>>>> Has anyone tried to scale Heat to stacks this size? We end up with a
>>>> stack with:
>>>> * 1000 nested stacks (depth 2)
>>>> * 22000 resources
>>>> * 47008 events
>>>>
>>>> We already changed most of the timeout/retry values for RPC to get
>>>> this working.
>>>>
>>>> This delay is already visible in clusters of 512 nodes, but 40% of
>>>> the time in 1000 nodes seems like something we could improve. Any
>>>> hints on Heat configuration optimizations for large stacks are very
>>>> welcome.
>>>
>>> Yes, we recommend you set the following in /etc/heat/heat.conf [DEFAULT]:
>>>
>>>     max_resources_per_stack = -1
>>>
>>> Enforcing this limit for large stacks has a very high overhead; we
>>> make this change in the TripleO undercloud too.
>>
>> Wouldn't this necessitate having a private Heat just for Magnum? Not
>> having a resource limit per stack would leave your Heat engines
>> vulnerable to being DoS'd by malicious users, since one can create
>> many, many thousands of resources, and thus Python objects, in just a
>> couple of cleverly crafted templates (which is why I added the
>> setting).
>>
>> This makes perfect sense in the undercloud of TripleO, which is a
>> private, single-tenant OpenStack. But for Magnum... now you're talking
>> about the Heat that users have access to.
>
> We have it already at -1 for these tests. As you say, a malicious user
> could DoS it; right now this is manageable in our environment. But
> maybe move it to a per-tenant value, or some special policy?
>
> The stacks are created under a separate domain for Magnum (for
> trustees); we could also use that for separation.

If there were a quota system within Heat for items like stacks and
resources, this could be controlled through that. It looks like
https://blueprints.launchpad.net/heat/+spec/add-quota-api-for-heat did
not make it into upstream, though.

Tim

> A separate Heat instance sounds like overkill.
>
> Cheers,
> Ricardo
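
For reference, the settings discussed above all live in /etc/heat/heat.conf
under [DEFAULT]. A minimal sketch, assuming a Mitaka/Newton-era Heat:
max_resources_per_stack is the option named in the thread, while
rpc_response_timeout, num_engine_workers and max_nested_stack_depth are
standard options the thread does not name, and the numeric values are
illustrative only, not the ones actually used on the 1000-node cluster.

    [DEFAULT]
    # Drop the per-stack resource limit check (the recommendation above);
    # note Clint's caveat about DoS on a public, multi-tenant Heat.
    max_resources_per_stack = -1

    # "timeout/retry values for RPC": the usual first knob is the
    # oslo.messaging response timeout (default 60 seconds).
    rpc_response_timeout = 600

    # More engine workers help when many nested stacks are created at
    # once; the default scales with the number of CPUs on the controller.
    num_engine_workers = 8

    # The cluster above nests two levels deep, well under the default
    # limit of 5, so this normally needs no change.
    max_nested_stack_depth = 5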
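
On the quota point: Heat's existing caps in the same [DEFAULT] section are
global, i.e. the same for every project, which is roughly the gap the
add-quota-api-for-heat blueprint was meant to fill. The defaults, again
assuming a Mitaka/Newton-era release:

    [DEFAULT]
    # Global limits applied identically to every tenant - not per-project
    # quotas, so they cannot express "unlimited for the Magnum domain only".
    max_stacks_per_tenant = 100
    max_resources_per_stack = 1000    # set to -1 for the tests above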
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
