On 07/23/2013 12:32 PM, Sergey Lukjanov wrote:
Hi everyone,

We’ve started working on upgrading Savanna architecture in version
0.3 to make it horizontally scalable.

Most of the information is on the wiki page -
https://wiki.openstack.org/wiki/Savanna/NextGenArchitecture.

Additionally, several blueprints have been created for this activity -
https://blueprints.launchpad.net/savanna?searchtext=ng-

We are looking for comments / questions / suggestions.

Some comments on "Why not provision agents to Hadoop clusters to provision all other stuff?"

Re problems with scaling agents for launching large clusters - launching large clusters may be resource intensive, and those resources must be provided by someone. They're either going to be provided by (a) the hardware running the savanna infrastructure or (b) the instance hardware provided to the tenant. If they are provided by (a), then the cost of launching the cluster is incurred by all users of savanna. If (b), then the cost is incurred by the user trying to launch the large cluster. It is true that some instance recommendations may be necessary, e.g. if you want to run a 500 instance cluster then your head node should be large (vs medium or small). That sizing decision needs to happen for either (a) or (b), because enough virtual resources must be present to maintain the large cluster after it is launched. There are accounting and isolation benefits to (b).

Re problems migrating agents while the cluster is scaling - will you expand on this point?

Re unexpected resource consumers - during launch, maybe; during execution the agent should be a minimal consumer of resources. sshd may also be an unexpected resource consumer.

Re security vulnerability - the agents should only communicate within the instance network, primarily w/ the head node. The head node can relay information to the savanna infrastructure outside the instances in the same way savanna-api gets information now. So there should be no difference in vulnerability assessment.

Re supporting multiple distros - yes, but I'd argue this adds at most a small incremental complexity on top of what already exists today w/ properly creating savanna plugin compatible instances.

-

Concretely, the architecture of using instance resources for provisioning is no different from spinning up an instance w/ ambari and then telling that instance to provision the rest of the cluster and report back status.

-

Re metrics - wherever you gather Hz (# req per sec, # queries per sec, etc), also gather standard summary statistics (mean, median, std dev, quartiles, range)
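To make the metrics suggestion concrete, here is a minimal sketch of summarizing a series of rate samples. It assumes the samples are already collected as a plain list of numbers (e.g. requests per second, one value per interval). The function name summarize and the dictionary keys are made up for illustration and are not part of savanna.

```python
import statistics


def summarize(samples):
    """Return standard summary statistics for a list of rate samples.

    Hypothetical helper for illustration only; not a savanna API.
    """
    if len(samples) < 2:
        # stdev and quantiles both need at least two data points
        raise ValueError("need at least two samples")
    # quantiles(n=4) returns the three cut points: Q1, median, Q3
    q1, q2, q3 = statistics.quantiles(samples, n=4)
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),  # sample standard deviation
        "quartiles": (q1, q2, q3),
        "range": max(samples) - min(samples),
    }


if __name__ == "__main__":
    # Made-up requests-per-second samples, one per collection interval
    req_per_sec = [10, 20, 30, 40, 50]
    for name, value in summarize(req_per_sec).items():
        print(name, value)
```

Reporting the spread alongside the rate makes it easier to tell a steady load from a bursty one that happens to have the same average.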

Best,


matt

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
