On Wed, May 25, 2011 at 06:04:05PM +0200, Michael Hanselmann wrote:
> Am 25. Mai 2011 11:19 schrieb Iustin Pop <[email protected]>:
> > On Mon, May 23, 2011 at 06:57:03PM +0200, Michael Hanselmann wrote:
> >> Commit aac4511a added CheckArguments to LUNodeMigrate with a call to
> >> _CheckIAllocatorOrNode. I haven't understood why this was done and can't
> >> see how it would've worked, at least on clusters with default settings.
> >
> > What is not clear about that?
> 
> (this is taken care of in commit f8fa4175232)
> 
> >> --- a/lib/cmdlib.py
> >> +++ b/lib/cmdlib.py
> >> +  def CheckPrereq(self):
> >> +    pass
> >> +
> >> +  def Exec(self, feedback_fn):
> >> +    # Prepare migration jobs for the node's primary instances
> >> +    jobs = [
> >> +      [opcodes.OpInstanceMigrate(instance_name=inst.name,
> >> +                                 mode=self.op.mode,
> >> +                                 live=self.op.live,
> >> +                                 iallocator=self.op.iallocator,
> >> +                                 target_node=self.op.target_node)]
> >> +      for inst in _GetNodePrimaryInstances(self.cfg, self.op.node_name)
> >> +      ]
> >> +
> >> +    assert (frozenset(self.glm.list_owned(locking.LEVEL_NODE)) ==
> >> +            frozenset([self.op.node_name]))
> >> +
> >> +    return ResultWithJobs(jobs)
> >
> > Well well. This is completely suboptimal: in the case of iallocator
> > runs and external storage, you will run the iallocator and acquire the
> > cluster-wide locks N times.
> >
> > The proper model is that this LU (LUNodeMigrate) takes an iallocator,
> > that it runs the iallocator to compute the node strategy, and then
> > submits N instance migrate jobs that use explicit nodes, without
> > iallocator decisions to be done in the individual jobs, thus allowing
> > more parallelism.
> 
> What I'm worried about here is that, for some reason (e.g. a busy
> queue), the migration jobs aren't run for a while; meanwhile,
> modifications to the cluster take place that would change the
> iallocator's decision and may lead to job errors. There are other
> places with the same situation, but here it can be avoided. With the
> upcoming changes to instance migration/failover, locks will be
> released as soon as possible. Do you think risking job failures for
> performance benefits is acceptable?

Hard to say. I think this is another case where we don't have clear
semantics for multi-job operations and what their consistency model will
be.

That said, in this particular case, I believe the above model is bad
because, by requiring N acquisitions of cluster-wide locks, it is itself
a generator of "busy queues", hence we should try to avoid it.

It's fine for now, but I want to have it fixed sometime in the future,
so please add a TODO.
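
For reference, the model proposed above (run the iallocator once in the
node LU, then submit per-instance migration jobs pinned to explicit
target nodes) could look roughly like the following sketch. The names
here (plan_node_migration, run_iallocator, the job dicts) are
hypothetical illustrations, not Ganeti's actual API:

```python
# Hypothetical sketch, not Ganeti's real API: the node-evacuation LU
# invokes the iallocator a single time to compute target nodes for all
# primary instances, so cluster-wide locks are needed only once.

def plan_node_migration(instances, run_iallocator):
    """Compute one migration plan for all primary instances of a node.

    instances: list of instance names.
    run_iallocator: callable taking the instance list and returning a
      dict mapping instance name -> chosen target node (invoked once).
    """
    allocation = run_iallocator(instances)
    # Each resulting job migrates one instance to an explicitly chosen
    # node; the individual jobs need no iallocator run of their own and
    # can therefore proceed in parallel without re-acquiring the
    # cluster-wide locks N times.
    return [
        {"op": "instance-migrate",
         "instance": name,
         "target_node": allocation[name]}
        for name in instances
    ]
```

The trade-off discussed above still applies: the plan is computed once
up front, so a long queue delay between planning and execution can make
the chosen target nodes stale.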

iustin
