On Mon, Feb 22, 2010 at 03:23:20PM +0100, Michael Hanselmann wrote:
> Jobs submitted via the standard command line utilities didn't give any
> indication that anything is happening while they were waiting in the job
> queue (e.g. due to other jobs using all worker threads) or acquiring
> locks. This could be very confusing for people not familiar with Ganeti's
> architecture. Now they'll show a message after the first WaitForJobChanges
> timeout.
> 
> Signed-off-by: Michael Hanselmann <[email protected]>
> ---
>  lib/cli.py  |   16 ++++++++++++++++
>  lib/luxi.py |   13 +++----------
>  2 files changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/cli.py b/lib/cli.py
> index f016ac4..734e433 100644
> --- a/lib/cli.py
> +++ b/lib/cli.py
> @@ -1132,12 +1132,28 @@ def PollJob(job_id, cl=None, feedback_fn=None):
>    prev_job_info = None
>    prev_logmsg_serial = None
>  
> +  status = None
> +
> +  notified_queued = False
> +  notified_waitlock = False
> +
>    while True:
>      result = cl.WaitForJobChange(job_id, ["status"], prev_job_info,
>                                   prev_logmsg_serial)
>      if not result:
>        # job not found, go away!
>        raise errors.JobLost("Job with id %s lost" % job_id)
> +    elif result == constants.JOB_NOTCHANGED:
> +      if status is not None and not callable(feedback_fn):
> +        if status == constants.JOB_STATUS_QUEUED and not notified_queued:
> +          ToStderr("Job %s is waiting in queue", job_id)
> +          notified_queued = True
> +        elif status == constants.JOB_STATUS_WAITLOCK and not notified_waitlock:
> +          ToStderr("Job %s is trying to acquire all necessary locks", job_id)
> +          notified_waitlock = True
> +
> +      # Wait again
> +      continue
>  
>      # Split result, a tuple of (field values, log entries)
>      (job_info, log_entries) = result
> diff --git a/lib/luxi.py b/lib/luxi.py
> index 97333dc..b1bace6 100644
> --- a/lib/luxi.py
> +++ b/lib/luxi.py
> @@ -370,13 +370,9 @@ class Client(object):
>  
>    def WaitForJobChange(self, job_id, fields, prev_job_info, prev_log_serial):
>      timeout = (DEF_RWTO - 1) / 2
> -    while True:
> -      result = self.CallMethod(REQ_WAIT_FOR_JOB_CHANGE,
> -                               (job_id, fields, prev_job_info,
> -                                prev_log_serial, timeout))
> -      if result != constants.JOB_NOTCHANGED:
> -        break
> -    return result
> +    return self.CallMethod(REQ_WAIT_FOR_JOB_CHANGE,
> +                             (job_id, fields, prev_job_info,
> +                              prev_log_serial, timeout))

Ah, I see now what you meant offline by 'removing the code from luxi'.

The problem is that any non-CLI client using luxi will then have to
handle JOB_NOTCHANGED by itself, which is a no-go; we don't want to
require going through cli.py every time.

On the other hand, no one is currently calling luxi.WaitForJobChange
directly... but if that changes in the future, we would have to re-add
the loop in luxi.

Maybe add a separate method 'CheckForJobChange' that does just the
self.CallMethod(), and keep 'WaitForJobChange' in luxi itself as a
wrapper that loops around it?
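
Roughly what I have in mind, as a sketch only and not a tested patch.
The constant values, DEF_RWTO and the call_method hook below are
placeholders so the snippet runs on its own; in the real tree they
would come from constants.py and the existing luxi transport.

```python
# Placeholders, assumed for illustration; the real values live in ganeti.
JOB_NOTCHANGED = "nochange"
REQ_WAIT_FOR_JOB_CHANGE = "WaitForJobChange"
DEF_RWTO = 60


class Client(object):
  def __init__(self, call_method):
    # Hypothetical hook standing in for the real socket transport
    self._call_method = call_method

  def CallMethod(self, method, args):
    return self._call_method(method, args)

  def CheckForJobChange(self, job_id, fields, prev_job_info, prev_log_serial):
    """Sends a single request; may return JOB_NOTCHANGED on timeout."""
    timeout = (DEF_RWTO - 1) // 2
    return self.CallMethod(REQ_WAIT_FOR_JOB_CHANGE,
                           (job_id, fields, prev_job_info,
                            prev_log_serial, timeout))

  def WaitForJobChange(self, job_id, fields, prev_job_info, prev_log_serial):
    """Loops over CheckForJobChange until the job actually changed."""
    while True:
      result = self.CheckForJobChange(job_id, fields, prev_job_info,
                                      prev_log_serial)
      if result != JOB_NOTCHANGED:
        return result
```

With that split, cli.PollJob could call CheckForJobChange and print its
"waiting in queue" messages on JOB_NOTCHANGED, while any other luxi
client keeps the old blocking behaviour through WaitForJobChange.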

iustin
