Let's summarize what we are agreeing on.
One service entry with multiple bare-metal compute_node entries is registered at the start of bare-metal nova-compute. 'hypervisor_hostname' must be different for each bare-metal machine, e.g. 'bare-metal-0001.xxx.com', 'bare-metal-0002.xxx.com', etc. [I think Arata suggests using just 0001, 0002, ....]
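As a rough sketch of that registration model (illustrative Python only, not Nova code; the helper name and row layout are invented for the example):

```python
# Sketch: one nova-compute service row fronting many bare-metal
# compute_node rows, each with a unique hypervisor_hostname.
# Names like 'bare-metal-0001' are placeholders.

def register_bare_metal_nodes(service_host, node_ids):
    """Build one service entry plus one compute_node entry per node."""
    service = {"host": service_host,
               "binary": "nova-compute",
               "topic": "compute"}
    compute_nodes = [
        {"service_host": service_host,
         "hypervisor_hostname": "bare-metal-%04d" % node_id}
        for node_id in node_ids
    ]
    # hypervisor_hostname must be unique across the bare-metal nodes
    hostnames = [n["hypervisor_hostname"] for n in compute_nodes]
    assert len(hostnames) == len(set(hostnames))
    return service, compute_nodes
```

The point is that uniqueness lives in hypervisor_hostname, while all rows share one service host.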
One extension we need to make on the scheduler side is to use (host, hypervisor_hostname) instead of (host) alone in host_manager.py. [I think Arata suggests using "host/baremetal-node-id" as the key.]
'HostManager.service_states' is currently:
  { <host> : { <service> : { cap k : v }}}
Two new shapes were suggested for 'HostManager.service_states':
  { <host> : { <service> : { <hypervisor_name> : { cap k : v }}}}
  { <host>/<bm_node_id> : { <service> : { cap k : v }}}
Please correct/edit it.

Thanks,
David

----- Original Message -----
> Hi Michael,
>
> > Looking at line 203 in nova/scheduler/filter_scheduler.py, the
> > target host in the cast call is weighted_host.host_state.host
> > and not a service host. (My guess is this will likely require a fair
> > number of changes in the scheduler area to change cast calls to
> > target a service host instead of a compute node)
>
> weighted_host.host_state.host still seems to be service['host']...
> Please look at it again with me.
>
> # First, HostManager.get_all_host_states:
> # host_manager.py:264
> compute_nodes = db.compute_node_get_all(context)
> for compute in compute_nodes:
>     # service is from services table (joined-loaded with compute_nodes)
>     service = compute['service']
>     if not service:
>         LOG.warn(_("No service for compute ID %s") % compute['id'])
>         continue
>     host = service['host']
>     capabilities = self.service_states.get(host, None)
>     # go to HostState constructor:
>     # the 1st parameter 'host' is service['host']
>     host_state = self.host_state_cls(host, topic,
>                                      capabilities=capabilities,
>                                      service=dict(service.iteritems()))
>
> # host_manager.py:101
> def __init__(self, host, topic, capabilities=None, service=None):
>     self.host = host
>     self.topic = topic
>     # here, HostState.host is service['host']
>
> Then, update_from_compute_node(compute) is called but it leaves
> self.host unchanged.
> WeightedHost.host_state is this HostState. So, host at
> filter_scheduler.py:203 is service['host'].
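The two candidate shapes for HostManager.service_states can be sketched as plain dicts (illustrative only; the sample data and lookup helpers are invented for this example, they are not Nova code):

```python
# Sketch of the two candidate shapes for HostManager.service_states
# discussed above; purely illustrative data.

caps = {"free_ram_mb": 2048}

# Option A: nest a hypervisor_hostname level under the service.
nested = {"bespin101": {"compute": {"bare-metal-0001": caps}}}

# Option B: flatten to a '<host>/<bm_node_id>' composite key.
flat = {"bespin101/1": {"compute": caps}}

def lookup_nested(states, host, service, node):
    """Look up capabilities with the extra hypervisor_hostname level."""
    return states[host][service][node]

def lookup_flat(states, host, node_id, service):
    """Look up capabilities with the composite host/node key."""
    return states["%s/%s" % (host, node_id)][service]
```

Option B leaves the nesting depth unchanged, which is why it needs fewer changes elsewhere in the scheduler.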
> We can use existing code about RPC targets. Do I miss something?
>
> Thanks,
> Arata
>
>
> (2012/08/28 6:45), Michael J Fork wrote:
> > VTJ NOTSU Arata <no...@virtualtech.jp> wrote on 08/27/2012 05:19:40 PM:
> >
> > > From: VTJ NOTSU Arata <no...@virtualtech.jp>
> > > To: Michael J Fork/Rochester/IBM@IBMUS,
> > > Cc: David Kang <dk...@isi.edu>, OpenStack Development Mailing List
> > > <openstack-...@lists.openstack.org>, openstack-bounces
> > > +mjfork=us.ibm....@lists.launchpad.net,
> > > "openstack@lists.launchpad.net (openstack@lists.launchpad.net)"
> > > <openstack@lists.launchpad.net>
> > > Date: 08/27/2012 05:19 PM
> > > Subject: Re: [Openstack] [openstack-dev] Discussion about where to
> > > put database for bare-metal provisioning (review 10726)
> > >
> > > Hello all,
> > >
> > > It seems that the only requirement for the keys of
> > > HostManager.service_states is that they be unique;
> > > they do not have to be valid hostnames or queues (existing code
> > > already casts messages to <topic>.<service-hostname>, doesn't it,
> > > Michael?).
> >
> > Looking at line 203 in nova/scheduler/filter_scheduler.py, the
> > target host in the cast call is weighted_host.host_state.host
> > and not a service host. (My guess is this will likely require a fair
> > number of changes in the scheduler area to change cast calls to
> > target a service host instead of a compute node)
> >
> > > So, I tried '<host>/<bm_node_id>' as the 'host' of capabilities.
> > > Then, HostManager.service_states is:
> > > { <host>/<bm_node_id> : { <service> : { cap k : v }}}.
> > > So far, it works fine. How about this way?
> >
> > I will defer to Vish here, but seems like a reasonable solution.
> >
> > > I paste the relevant code at the bottom of this mail just to make
> > > sure.
> > > NOTE: I added a new column 'nodename' to compute_nodes to store
> > > bm_node_id, but storing it in 'hypervisor_hostname' may be the
> > > right solution.
> >
> > Again, I will defer to Vish, but seems like using the existing
> > "hypervisor_hostname" would be correct (otherwise I have no idea
> > what that field would have been intended for).
>
> > > (The whole code is in our github (NTTdocomo-openstack/nova, branch
> > > 'multinode'); multiple resource_trackers are also implemented.)
> > >
> > > Thanks,
> > > Arata
> > >
> > >
> > > diff --git a/nova/scheduler/host_manager.py b/nova/scheduler/host_manager.py
> > > index 33ba2c1..567729f 100644
> > > --- a/nova/scheduler/host_manager.py
> > > +++ b/nova/scheduler/host_manager.py
> > > @@ -98,9 +98,10 @@ class HostState(object):
> > >      previously used and lock down access.
> > >      """
> > >
> > > -    def __init__(self, host, topic, capabilities=None, service=None):
> > > +    def __init__(self, host, topic, capabilities=None, service=None, nodename=None):
> > >          self.host = host
> > >          self.topic = topic
> > > +        self.nodename = nodename
> > >
> > >          # Read-only capability dicts
> > >
> > > @@ -175,8 +176,8 @@ class HostState(object):
> > >          return True
> > >
> > >      def __repr__(self):
> > > -        return ("host '%s': free_ram_mb:%s free_disk_mb:%s" %
> > > -                (self.host, self.free_ram_mb, self.free_disk_mb))
> > > +        return ("host '%s' / nodename '%s': free_ram_mb:%s free_disk_mb:%s" %
> > > +                (self.host, self.nodename, self.free_ram_mb, self.free_disk_mb))
> > >
> > >
> > > class HostManager(object):
> > > @@ -268,11 +269,16 @@ class HostManager(object):
> > >                  LOG.warn(_("No service for compute ID %s") % compute['id'])
> > >                  continue
> > >              host = service['host']
> > > -            capabilities = self.service_states.get(host, None)
> > > +            if compute['nodename']:
> > > +                host_node = '%s/%s' % (host, compute['nodename'])
> > > +            else:
> > > +                host_node = host
> > > +            capabilities = self.service_states.get(host_node, None)
> > >              host_state = self.host_state_cls(host, topic,
> > >                      capabilities=capabilities,
> > > -                    service=dict(service.iteritems()))
> > > +                    service=dict(service.iteritems()),
> > > +                    nodename=compute['nodename'])
> > >              host_state.update_from_compute_node(compute)
> > > -            host_state_map[host] = host_state
> > > +            host_state_map[host_node] = host_state
> > >
> > >          return host_state_map
> > >
> > >
> > > diff --git a/nova/virt/baremetal/driver.py b/nova/virt/baremetal/driver.py
> > > index 087d1b6..dbcfbde 100644
> > > --- a/nova/virt/baremetal/driver.py
> > > +++ b/nova/virt/baremetal/driver.py
> > > (skip...)
> > > +    def _create_node_cap(self, node):
> > > +        dic = self._node_resources(node)
> > > +        dic['host'] = '%s/%s' % (FLAGS.host, node['id'])
> > > +        dic['cpu_arch'] = self._extra_specs.get('cpu_arch')
> > > +        dic['instance_type_extra_specs'] = self._extra_specs
> > > +        dic['supported_instances'] = self._supported_instances
> > > +        # TODO: put node's extra specs
> > > +        return dic
> > >
> > >      def get_host_stats(self, refresh=False):
> > > -        return self._get_host_stats()
> > > +        caps = []
> > > +        context = nova_context.get_admin_context()
> > > +        nodes = bmdb.bm_node_get_all(context, service_host=FLAGS.host)
> > > +        for node in nodes:
> > > +            node_cap = self._create_node_cap(node)
> > > +            caps.append(node_cap)
> > > +        return caps
> > >
> > >
> > > (2012/08/28 5:55), Michael J Fork wrote:
> > > > openstack-bounces+mjfork=us.ibm....@lists.launchpad.net wrote on 08/27/2012 02:58:56 PM:
> > > >
> > > > > From: David Kang <dk...@isi.edu>
> > > > > To: Vishvananda Ishaya <vishvana...@gmail.com>,
> > > > > Cc: OpenStack Development Mailing List
> > > > > <openstack-d...@lists.openstack.org>, "openstack@lists.launchpad.net
> > > > > (openstack@lists.launchpad.net)" <openstack@lists.launchpad.net>
> > > > > Date: 08/27/2012 03:06 PM
> > > > > Subject: Re: [Openstack] [openstack-dev] Discussion about where to
> > > > > put database for bare-metal provisioning (review 10726)
> > > > > Sent by: openstack-bounces+mjfork=us.ibm....@lists.launchpad.net
> > > >
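The key construction in the host_manager.py diff above reduces to a small helper (an illustrative sketch, not the actual Nova method):

```python
# Sketch of the '<host>/<nodename>' key construction used for both
# service_states lookups and host_state_map entries in the patch above.

def host_node_key(host, nodename):
    """Return the capabilities/host_state_map key for a compute node."""
    if nodename:
        # bare-metal node: composite key, unique per node
        return "%s/%s" % (host, nodename)
    # ordinary (non-bare-metal) host: behaviour unchanged
    return host
```

Because nodename is empty for regular compute nodes, existing deployments keep their plain host keys.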
> > > > >
> > > > > Hi Vish,
> > > > >
> > > > > I think I understand your idea.
> > > > > One service entry with multiple bare-metal compute_node entries is
> > > > > registered at the start of bare-metal nova-compute.
> > > > > 'hypervisor_hostname' must be different for each bare-metal
> > > > > machine, such as 'bare-metal-0001.xxx.com',
> > > > > 'bare-metal-0002.xxx.com', etc.
> > > > > But their IP addresses must be the IP address of the bare-metal
> > > > > nova-compute, such that an instance is cast not to the bare-metal
> > > > > machine directly but to the bare-metal nova-compute.
> > > >
> > > > I believe the change here is to cast out the message to the
> > > > <topic>.<service-hostname>. Existing code sends it to the
> > > > compute_node hostname (see line 202 of
> > > > nova/scheduler/filter_scheduler.py, specifically
> > > > host=weighted_host.host_state.host). Changing that to cast to the
> > > > service hostname would send the message to the bare-metal proxy node
> > > > and should not have an effect on current deployments, since the
> > > > service hostname and host_state.host would always be equal.
> > > > This model will also let you keep the bare-metal compute node IP in
> > > > the compute node table.
> > > >
> > > > > One extension we need to make on the scheduler side is to use
> > > > > (host, hypervisor_hostname) instead of (host) alone in
> > > > > host_manager.py.
> > > > > 'HostManager.service_states' is { <host> : { <service> : { cap k : v }}}.
> > > > > It needs to be changed to
> > > > > { <host> : { <service> : { <hypervisor_name> : { cap k : v }}}}.
> > > > > Most functions of HostState need to be changed to use the (host,
> > > > > hypervisor_name) pair to identify a compute node.
> > > >
> > > > Would an alternative here be to change the top level "host" to be
> > > > the hypervisor_hostname and enforce uniqueness?
> > > >
> > > > > Are we on the same page, now?
> > > > > > > > > > Thanks, > > > > > David > > > > > > > > > > ----- Original Message ----- > > > > > > Hi David, > > > > > > > > > > > > I just checked out the code more extensively and I don't > > > > > > see why you > > > > > > need to create a new service entry for each compute_node > > > > > > entry. The > > > > > > code in host_manager to get all host states explicitly > > > > > > gets all > > > > > > compute_node entries. I don't see any reason why multiple > > > > > > compute_node > > > > > > entries can't share the same service. I don't see any > > > > > > place in the > > > > > > scheduler that is grabbing records by "service" instead of > > > > > > by "compute > > > > > > node", but if there is one that I missed, it should be > > > > > > fairly easy to > > > > > > change it. > > > > > > > > > > > > The compute_node record is created in the > > > > > > compute/resource_tracker.py > > > > > > as of a recent commit, so I think the path forward would > > > > > > be to make > > > > > > sure that one of the records is created for each bare > > > > > > metal node by > > > > > > the bare metal compute, perhaps by having multiple > > > > > > resource_trackers. > > > > > > > > > > > > Vish > > > > > > > > > > > > On Aug 27, 2012, at 9:40 AM, David Kang <dk...@isi.edu> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Vish, > > > > > > > > > > > > > > I think I don't understand your statement fully. > > > > > > > Unless we use different hostnames, (hostname, > > > > > > > hypervisor_hostname) > > > > > > > must be the > > > > > > > same for all bare-metal nodes under a bare-metal > > > > > > > nova-compute. > > > > > > > > > > > > > > Could you elaborate the following statement a little > > > > > > > bit more? > > > > > > > > > > > > > >> You would just have to use a little more than hostname. > > > > > > >> Perhaps > > > > > > >> (hostname, hypervisor_hostname) could be used to update > > > > > > >> the entry? 
> > > > > > >> > > > > > > > > > > > > > > Thanks, > > > > > > > David > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > >> I would investigate changing the capabilities to key > > > > > > >> off of > > > > > > >> something > > > > > > >> other than hostname. It looks from the table structure > > > > > > >> like > > > > > > >> compute_nodes could be have a many-to-one relationship > > > > > > >> with > > > > > > >> services. > > > > > > >> You would just have to use a little more than hostname. > > > > > > >> Perhaps > > > > > > >> (hostname, hypervisor_hostname) could be used to update > > > > > > >> the entry? > > > > > > >> > > > > > > >> Vish > > > > > > >> > > > > > > >> On Aug 24, 2012, at 11:23 AM, David Kang > > > > > > >> <dk...@isi.edu> wrote: > > > > > > >> > > > > > > >>> > > > > > > >>> Vish, > > > > > > >>> > > > > > > >>> I've tested your code and did more testing. > > > > > > >>> There are a couple of problems. > > > > > > >>> 1. host name should be unique. If not, any repetitive > > > > > > >>> updates of > > > > > > >>> new > > > > > > >>> capabilities with the same host name are simply > > > > > > >>> overwritten. > > > > > > >>> 2. We cannot generate arbitrary host names on the fly. > > > > > > >>> The scheduler (I tested filter scheduler) gets host > > > > > > >>> names from > > > > > > >>> db. > > > > > > >>> So, if a host name is not in the 'services' table, > > > > > > >>> it is not > > > > > > >>> considered by the scheduler at all. > > > > > > >>> > > > > > > >>> So, to make your suggestions possible, nova-compute > > > > > > >>> should > > > > > > >>> register > > > > > > >>> N different host names in 'services' table, > > > > > > >>> and N corresponding entries in 'compute_nodes' table. 
> > > > > > >>> Here is an example:
> > > > > > >>>
> > > > > > >>> mysql> select id, host, binary, topic, report_count, disabled,
> > > > > > >>> availability_zone from services;
> > > > > > >>> +----+-------------+----------------+-----------+--------------+----------+-------------------+
> > > > > > >>> | id | host        | binary         | topic     | report_count | disabled | availability_zone |
> > > > > > >>> +----+-------------+----------------+-----------+--------------+----------+-------------------+
> > > > > > >>> |  1 | bespin101   | nova-scheduler | scheduler |        17145 |        0 | nova              |
> > > > > > >>> |  2 | bespin101   | nova-network   | network   |        16819 |        0 | nova              |
> > > > > > >>> |  3 | bespin101-0 | nova-compute   | compute   |        16405 |        0 | nova              |
> > > > > > >>> |  4 | bespin101-1 | nova-compute   | compute   |            1 |        0 | nova              |
> > > > > > >>> +----+-------------+----------------+-----------+--------------+----------+-------------------+
> > > > > > >>>
> > > > > > >>> mysql> select id, service_id, hypervisor_hostname from compute_nodes;
> > > > > > >>> +----+------------+------------------------+
> > > > > > >>> | id | service_id | hypervisor_hostname    |
> > > > > > >>> +----+------------+------------------------+
> > > > > > >>> |  1 |          3 | bespin101.east.isi.edu |
> > > > > > >>> |  2 |          4 | bespin101.east.isi.edu |
> > > > > > >>> +----+------------+------------------------+
> > > > > > >>>
> > > > > > >>> Then, the nova db (compute_nodes table) has entries for all
> > > > > > >>> bare-metal nodes.
> > > > > > >>> What do you think of this approach?
> > > > > > >>> Do you have any better approach?
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> David
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> ----- Original Message -----
> > > > > > >>>> To elaborate, something like the below.
> > > > > > >>>> I'm not absolutely sure you need to be able to set
> > > > > > >>>> service_name and host, but this gives you the option to do
> > > > > > >>>> so if needed.
> > > > > > >>>>
> > > > > > >>>> diff --git a/nova/manager.py b/nova/manager.py
> > > > > > >>>> index c6711aa..c0f4669 100644
> > > > > > >>>> --- a/nova/manager.py
> > > > > > >>>> +++ b/nova/manager.py
> > > > > > >>>> @@ -217,6 +217,8 @@ class SchedulerDependentManager(Manager):
> > > > > > >>>>
> > > > > > >>>>      def update_service_capabilities(self, capabilities):
> > > > > > >>>>          """Remember these capabilities to send on next periodic update."""
> > > > > > >>>> +        if not isinstance(capabilities, list):
> > > > > > >>>> +            capabilities = [capabilities]
> > > > > > >>>>          self.last_capabilities = capabilities
> > > > > > >>>>
> > > > > > >>>>      @periodic_task
> > > > > > >>>> @@ -224,5 +226,8 @@ class SchedulerDependentManager(Manager):
> > > > > > >>>>          """Pass data back to the scheduler at a periodic interval."""
> > > > > > >>>>          if self.last_capabilities:
> > > > > > >>>>              LOG.debug(_('Notifying Schedulers of capabilities ...'))
> > > > > > >>>> -            self.scheduler_rpcapi.update_service_capabilities(context,
> > > > > > >>>> -                self.service_name, self.host, self.last_capabilities)
> > > > > > >>>> +            for capability_item in self.last_capabilities:
> > > > > > >>>> +                name = capability_item.get('service_name', self.service_name)
> > > > > > >>>> +                host = capability_item.get('host', self.host)
> > > > > > >>>> +                self.scheduler_rpcapi.update_service_capabilities(context,
> > > > > > >>>> +                    name, host, capability_item)
> > > > > > >>>>
> > > > > > >>>> On Aug 21, 2012, at 1:28 PM, David Kang <dk...@isi.edu> wrote:
> > > >
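The manager.py change above, normalizing capabilities to a list and letting each entry override the service name/host, can be sketched standalone (illustrative Python only; the helper names are invented, and the RPC call itself is omitted):

```python
# Sketch of the behaviour added by the patch above: accept either a
# single capabilities dict or a list of them, and let each entry
# optionally carry its own 'service_name'/'host'.

def normalize_capabilities(capabilities):
    """Wrap a single capabilities dict in a list; pass lists through."""
    if not isinstance(capabilities, list):
        capabilities = [capabilities]
    return capabilities

def capability_targets(capabilities, default_name, default_host):
    """Yield (service_name, host, item) per capability entry, with each
    entry able to override the defaults, as in the posted patch."""
    for item in normalize_capabilities(capabilities):
        yield (item.get("service_name", default_name),
               item.get("host", default_host),
               item)
```

With this, a bare-metal nova-compute can report one entry per node (each carrying a 'host' like 'bespin101/1') while ordinary drivers keep reporting a single dict.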
> > > > > > >>>>>
> > > > > > >>>>> Hi Vish,
> > > > > > >>>>>
> > > > > > >>>>> We are trying to change our code according to your comment.
> > > > > > >>>>> I want to ask a question.
> > > > > > >>>>>
> > > > > > >>>>>>>> a) modify driver.get_host_stats to be able to return a
> > > > > > >>>>>>>> list of host stats instead of just one. Report the whole
> > > > > > >>>>>>>> list back to the scheduler. We could modify the receiving
> > > > > > >>>>>>>> end to accept a list as well or just make multiple calls
> > > > > > >>>>>>>> to self.update_service_capabilities(capabilities)
> > > > > > >>>>>
> > > > > > >>>>> Modifying driver.get_host_stats to return a list of host
> > > > > > >>>>> stats is easy.
> > > > > > >>>>> Making multiple calls to
> > > > > > >>>>> self.update_service_capabilities(capabilities) doesn't seem
> > > > > > >>>>> to work, because 'capabilities' is overwritten each time.
> > > > > > >>>>>
> > > > > > >>>>> Modifying the receiving end to accept a list seems to be
> > > > > > >>>>> easy. However, since 'capabilities' is assumed to be a
> > > > > > >>>>> dictionary by all other scheduler routines, it looks like we
> > > > > > >>>>> would have to change all of them to handle 'capabilities' as
> > > > > > >>>>> a list of dictionaries.
> > > > > > >>>>>
> > > > > > >>>>> If my understanding is correct, it would affect many parts
> > > > > > >>>>> of the scheduler.
> > > > > > >>>>> Is that what you recommended?
> > > > > > >>>>> > > > > > > >>>>> Thanks, > > > > > > >>>>> David > > > > > > >>>>> > > > > > > >>>>> > > > > > > >>>>> ----- Original Message ----- > > > > > > >>>>>> This was an immediate goal, the bare-metal > > > > > > >>>>>> nova-compute node > > > > > > >>>>>> could > > > > > > >>>>>> keep an internal database, but report capabilities > > > > > > >>>>>> through nova > > > > > > >>>>>> in > > > > > > >>>>>> the > > > > > > >>>>>> common way with the changes below. Then the > > > > > > >>>>>> scheduler wouldn't > > > > > > >>>>>> need > > > > > > >>>>>> access to the bare metal database at all. > > > > > > >>>>>> > > > > > > >>>>>> On Aug 15, 2012, at 4:23 PM, David Kang > > > > > > >>>>>> <dk...@isi.edu> wrote: > > > > > > >>>>>> > > > > > > >>>>>>> > > > > > > >>>>>>> Hi Vish, > > > > > > >>>>>>> > > > > > > >>>>>>> Is this discussion for long-term goal or for this > > > > > > >>>>>>> Folsom > > > > > > >>>>>>> release? > > > > > > >>>>>>> > > > > > > >>>>>>> We still believe that bare-metal database is > > > > > > >>>>>>> needed > > > > > > >>>>>>> because there is not an automated way how > > > > > > >>>>>>> bare-metal nodes > > > > > > >>>>>>> report > > > > > > >>>>>>> their capabilities > > > > > > >>>>>>> to their bare-metal nova-compute node. > > > > > > >>>>>>> > > > > > > >>>>>>> Thanks, > > > > > > >>>>>>> David > > > > > > >>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> I am interested in finding a solution that > > > > > > >>>>>>>> enables bare-metal > > > > > > >>>>>>>> and > > > > > > >>>>>>>> virtualized requests to be serviced through the > > > > > > >>>>>>>> same > > > > > > >>>>>>>> scheduler > > > > > > >>>>>>>> where > > > > > > >>>>>>>> the compute_nodes table has a full view of > > > > > > >>>>>>>> schedulable > > > > > > >>>>>>>> resources. 
> > > > > > >>>>>>>> This > > > > > > >>>>>>>> would seem to simplify the end-to-end flow while > > > > > > >>>>>>>> opening up > > > > > > >>>>>>>> some > > > > > > >>>>>>>> additional use cases (e.g. dynamic allocation of > > > > > > >>>>>>>> a node from > > > > > > >>>>>>>> bare-metal to hypervisor and back). > > > > > > >>>>>>>> > > > > > > >>>>>>>> One approach would be to have a proxy running a > > > > > > >>>>>>>> single > > > > > > >>>>>>>> nova-compute > > > > > > >>>>>>>> daemon fronting the bare-metal nodes . That > > > > > > >>>>>>>> nova-compute > > > > > > >>>>>>>> daemon > > > > > > >>>>>>>> would > > > > > > >>>>>>>> report up many HostState objects (1 per > > > > > > >>>>>>>> bare-metal node) to > > > > > > >>>>>>>> become > > > > > > >>>>>>>> entries in the compute_nodes table and accessible > > > > > > >>>>>>>> through the > > > > > > >>>>>>>> scheduler HostManager object. > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> The HostState object would set cpu_info, vcpus, > > > > > > >>>>>>>> member_mb and > > > > > > >>>>>>>> local_gb > > > > > > >>>>>>>> values to be used for scheduling with the > > > > > > >>>>>>>> hypervisor_host > > > > > > >>>>>>>> field > > > > > > >>>>>>>> holding the bare-metal machine address (e.g. for > > > > > > >>>>>>>> IPMI based > > > > > > >>>>>>>> commands) > > > > > > >>>>>>>> and hypervisor_type = NONE. The bare-metal > > > > > > >>>>>>>> Flavors are > > > > > > >>>>>>>> created > > > > > > >>>>>>>> with > > > > > > >>>>>>>> an > > > > > > >>>>>>>> extra_spec of hypervisor_type= NONE and the > > > > > > >>>>>>>> corresponding > > > > > > >>>>>>>> compute_capabilities_filter would reduce the > > > > > > >>>>>>>> available hosts > > > > > > >>>>>>>> to > > > > > > >>>>>>>> those > > > > > > >>>>>>>> bare_metal nodes. 
The scheduler would need to > > > > > > >>>>>>>> understand that > > > > > > >>>>>>>> hypervisor_type = NONE means you need an exact > > > > > > >>>>>>>> fit (or > > > > > > >>>>>>>> best-fit) > > > > > > >>>>>>>> host > > > > > > >>>>>>>> vs weighting them (perhaps through the > > > > > > >>>>>>>> multi-scheduler). The > > > > > > >>>>>>>> scheduler > > > > > > >>>>>>>> would cast out the message to the > > > > > > >>>>>>>> <topic>.<service-hostname> > > > > > > >>>>>>>> (code > > > > > > >>>>>>>> today uses the HostState hostname), with the > > > > > > >>>>>>>> compute driver > > > > > > >>>>>>>> having > > > > > > >>>>>>>> to > > > > > > >>>>>>>> understand if it must be serviced elsewhere (but > > > > > > >>>>>>>> does not > > > > > > >>>>>>>> break > > > > > > >>>>>>>> any > > > > > > >>>>>>>> existing implementations since it is 1 to 1). > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> Does this solution seem workable? Anything I > > > > > > >>>>>>>> missed? > > > > > > >>>>>>>> > > > > > > >>>>>>>> The bare metal driver already is proxying for the > > > > > > >>>>>>>> other nodes > > > > > > >>>>>>>> so > > > > > > >>>>>>>> it > > > > > > >>>>>>>> sounds like we need a couple of things to make > > > > > > >>>>>>>> this happen: > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> a) modify driver.get_host_stats to be able to > > > > > > >>>>>>>> return a list > > > > > > >>>>>>>> of > > > > > > >>>>>>>> host > > > > > > >>>>>>>> stats instead of just one. Report the whole list > > > > > > >>>>>>>> back to the > > > > > > >>>>>>>> scheduler. 
We could modify the receiving end to > > > > > > >>>>>>>> accept a list > > > > > > >>>>>>>> as > > > > > > >>>>>>>> well > > > > > > >>>>>>>> or just make multiple calls to > > > > > > >>>>>>>> self.update_service_capabilities(capabilities) > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> b) make a few minor changes to the scheduler to > > > > > > >>>>>>>> make sure > > > > > > >>>>>>>> filtering > > > > > > >>>>>>>> still works. Note the changes here may be very > > > > > > >>>>>>>> helpful: > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> https://review.openstack.org/10327 > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> c) we have to make sure that instances launched > > > > > > >>>>>>>> on those > > > > > > >>>>>>>> nodes > > > > > > >>>>>>>> take > > > > > > >>>>>>>> up > > > > > > >>>>>>>> the entire host state somehow. We could probably > > > > > > >>>>>>>> do this by > > > > > > >>>>>>>> making > > > > > > >>>>>>>> sure that the instance_type ram, mb, gb etc. > > > > > > >>>>>>>> matches what the > > > > > > >>>>>>>> node > > > > > > >>>>>>>> has, but we may want a new boolean field "used" > > > > > > >>>>>>>> if those > > > > > > >>>>>>>> aren't > > > > > > >>>>>>>> sufficient. > > > > > > >>>>>>>> > > > > > > >>>>>>>> > > > > > > >>>>>>>> I This approach seems pretty good. We could > > > > > > >>>>>>>> potentially get > > > > > > >>>>>>>> rid > > > > > > >>>>>>>> of > > > > > > >>>>>>>> the > > > > > > >>>>>>>> shared bare_metal_node table. I guess the only > > > > > > >>>>>>>> other concern > > > > > > >>>>>>>> is > > > > > > >>>>>>>> how > > > > > > >>>>>>>> you populate the capabilities that the bare metal > > > > > > >>>>>>>> nodes are > > > > > > >>>>>>>> reporting. > > > > > > >>>>>>>> I guess an api extension that rpcs to a baremetal > > > > > > >>>>>>>> node to add > > > > > > >>>>>>>> the > > > > > > >>>>>>>> node. 
> > > > > > >>>>>>>> Maybe someday this could be autogenerated by the bare
> > > > > > >>>>>>>> metal host looking in its arp table for dhcp requests! :)
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Vish
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> _______________________________________________
> > > > > > >>>>>>>> OpenStack-dev mailing list
> > > > > > >>>>>>>> openstack-...@lists.openstack.org
> > > > > > >>>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > > > >
> > > > > _______________________________________________
> > > > > Mailing list: https://launchpad.net/~openstack
> > > > > Post to : openstack@lists.launchpad.net
> > > > > Unsubscribe : https://launchpad.net/~openstack
> > > > > More help : https://help.launchpad.net/ListHelp
> > > >
> > > > Michael
> > > >
> > > > -------------------------------------------------
> > > > Michael Fork
> > > > Cloud Architect, Emerging Solutions
> > > > IBM Systems & Technology Group
> >
> > Michael
> >
> > -------------------------------------------------
> > Michael Fork
> > Cloud Architect, Emerging Solutions
> > IBM Systems & Technology Group
>
> --
> VirtualTech Japan Inc. (http://VirtualTech.jp)
> Engineering Dept., Development Section Manager: Arata Notsu (no...@virtualtech.jp)
>
> 3rd Nishi-Aoyama Bldg. 8F, 1-8-1 Shibuya, Shibuya-ku, Tokyo 150-0002
> TEL: 03-6419-7841 FAX: 03-5774-9462

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp