Re: Adding Priority Scheduling feature to the subprocess

2008-02-25 Thread Nick Craig-Wood
TimeHorse <[EMAIL PROTECTED]> wrote:
>  On Feb 22, 4:30 am, Nick Craig-Wood <[EMAIL PROTECTED]> wrote:
> > Interestingly enough this was changed in recent linux kernels.
> > Process levels in Linux kernels are logarithmic now, whereas before
> > they weren't (but I wouldn't like to say exactly what!).
> 
>  Wow!  That's a VERY good point.  I ran a similar test on Windows with
>  the 'start' command which is similar to nice but you need to specify
>  the Priority Class by name, e.g.
> 
>  start /REALTIME python.exe bench1.py
> 
>  Now, in a standard operating system you'd expect some variance between
>  runs, and I did find that.  So I wrote a script to compute the Mode
>  (but not the Standard Deviation as I didn't have time for it) for each
>  Priority Class, choosing the class at random for each run and
>  accumulating the running value for each one.  Now, when I read the
>  results, I really wish I'd
>  computed the Chi**2 to calculate the Standard Deviation because the
>  results all appeared within very close relation to one another, as if
>  the Priority Class had overall very little effect.  In fact, I would
>  be willing to guess that, say, NORMAL and ABOVENORMAL lie within one
>  Standard Deviation of one another!
> 
>  That having been said, the tests all ran in about 10 seconds so it may
>  be that the process was too simple to show any statistical results.  I
>  know for instance that running ffmpeg as NORMAL or REALTIME makes a
>  sizable difference.

You need to run N CPU-bound tasks at normal priority which just use up CPU, e.g.

  python -c "while 1: pass"

N needs to be the number of CPUs that you have.

If you don't do that then your time test prog will just run at the
full CPU speed and not test the scheduling at all!
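
For example, something along these lines (a rough sketch, Unix-flavoured;
N and bench1.py are whatever you are actually testing with):

  # Keep N CPUs busy with normal-priority spinners, run the timing test,
  # then kill the spinners.
  import os, signal, subprocess

  N = 2   # set this to the number of CPUs on the machine
  spinners = [subprocess.Popen(["python", "-c", "while 1: pass"])
              for _ in range(N)]
  os.system("nice -n 10 python bench1.py")   # the job under test
  for s in spinners:
      os.kill(s.pid, signal.SIGTERM)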

>  So, I concede the "Unified Priority" may indeed be dead in the water,
>  but I am thinking of giving it one last go with the following
>  suggestion:
> 
>  0.0 == Zero-Page (Windows, e.g. 0) / +20 (Unix)
>  1.0 == Normal (Foreground) Priority (Windows, e.g. 9) / 0 (Unix)
>  MAX_PRIORITY == Realtime / Time Critical (Windows, e.g. 31) / -20
>  (Unix)
> 
>  With the value of MAX_PRIORITY TBD.  Now, 0.0 would still represent
>  (relatively) 0% CPU usage, but now 1.0 would represent 100% of
>  'Normal' priority.  I would still map 0.0 - 1.0 linearly over the
>  scale corresponding to the given operating system (0 - 9, Windows; +20
>  - 0, Unix), but higher priorities would correspond to > 1.0 values.
> 
>  The idea here is that most users will only want to lower priority, not
>  raise it, so it makes lowering pretty intuitive.  As for the linear
>  mapping, I would leave a note in the documentation that although the
>  scale is "linear", the operating system may choose to behave as if the
>  scale is linear and that the user should consult the documentation for
>  their OS to determine specific behavior.  This is similar to the
>  documentation of the file time-stamps in os.stat, since their
>  granularity differs based on OS.  Most users, I should think, would
>  just want to make their spawn "slower" and use the scale to determine
>  "how much" in a relative fashion rather than expecting hard-and-fast
>  numbers for the actual process slowdown.
> 
>  Higher-than-Normal priorities may, OTOH, be a bit harder to deal with.
>  It strikes me that maybe the best approach is to make MAX_PRIORITY
>  operating system dependent, specifically 31 - 9 + 1.0 = +23.0 for
>  Windows and -20 - 0 + 1.0 = +21.0 for Unix.  This way, again the
>  priorities map linearly and in this case 1:1.  I think for most users,
>  they would choose a "High Priority" relative to MAX_PRIORITY or just
>  choose a small increment about 1.0 to add just a little boost.
> 
>  Of course, the 2 biggest problems with this approach are, IMHO, a) the
>  below-Normal scale is a percentage but the above-Normal scale is additive.
>  However, there is no "Simple" definition of MAX_PRIORITY, so I think
>  using the OS's definition is natural. b) This use of the priority
>  scale may be confusing to Unix users, since 1.0 now represents
>  "Normal" and +21, not +/-20 represents Max Priority.  However, the
>  definition of MAX_PRIORITY would be irrelevant to the definition of
>  setPriority and getPriority, since each would, in my proposal, compute
>  for p > 1.0:
> 
>  Windows: 9 + int((p - 1) / (MAX_PRIORITY - 1) * 22 + .5)
>  Unix: -int((p - 1) / (MAX_PRIORITY - 1) * 20 + .5)
> 
>  Anyway, that's how I'd propose to do the nitty-gritty.  But, more than
>  anything, I think the subprocess 'priority' methods should use a
>  priority scheme that is easy to explain.  And by that, I propose:
> 
>  1.0 represents normal priority, 100%.  Any priority less than 1
>  represents a below normal priority, down to 0.0, the lowest possible
>  priority or 0%.  Any priority above 1.0 represents an above normal
>  priority, with MAX_PRIORITY being the highest priority level available
>  for a given os.
> 
>  Granted, it's not much simpler than 0 is normal, -20 is highest and
>  +20 is lowest [...]

Re: Adding Priority Scheduling feature to the subprocess

2008-02-25 Thread TimeHorse
On Feb 22, 4:30 am, Nick Craig-Wood <[EMAIL PROTECTED]> wrote:
> Interestingly enough this was changed in recent linux kernels.
> Process levels in Linux kernels are logarithmic now, whereas before
> they weren't (but I wouldn't like to say exactly what!).

Wow!  That's a VERY good point.  I ran a similar test on Windows with
the 'start' command which is similar to nice but you need to specify
the Priority Class by name, e.g.

start /REALTIME python.exe bench1.py

Now, in a standard operating system you'd expect some variance between
runs, and I did find that.  So I wrote a script to compute the Mode
(but not the Standard Deviation as I didn't have time for it) for each
Priority Class, choosing the class at random for each run and
accumulating the running value for each one.  Now, when I read the
results, I really wish I'd
computed the Chi**2 to calculate the Standard Deviation because the
results all appeared within very close relation to one another, as if
the Priority Class had overall very little effect.  In fact, I would
be willing to guess that, say, NORMAL and ABOVENORMAL lie within one
Standard Deviation of one another!

That having been said, the tests all ran in about 10 seconds so it may
be that the process was too simple to show any statistical results.  I
know for instance that running ffmpeg as NORMAL or REALTIME makes a
sizable difference.

So, I concede the "Unified Priority" may indeed be dead in the water,
but I am thinking of giving it one last go with the following
suggestion:

0.0 == Zero-Page (Windows, e.g. 0) / +20 (Unix)
1.0 == Normal (Foreground) Priority (Windows, e.g. 9) / 0 (Unix)
MAX_PRIORITY == Realtime / Time Critical (Windows, e.g. 31) / -20
(Unix)

With the value of MAX_PRIORITY TBD.  Now, 0.0 would still represent
(relatively) 0% CPU usage, but now 1.0 would represent 100% of
'Normal' priority.  I would still map 0.0 - 1.0 linearly over the
scale corresponding to the given operating system (0 - 9, Windows; +20
- 0, Unix), but higher priorities would correspond to > 1.0 values.

The idea here is that most users will only want to lower priority, not
raise it, so it makes lowering pretty intuitive.  As for the linear
mapping, I would leave a note in the documentation that although the
scale is "linear", the operating system may choose to behave as if the
scale is linear and that the user should consult the documentation for
their OS to determine specific behavior.  This is similar to the
documentation of the file time-stamps in os.stat, since their
granularity differs based on OS.  Most users, I should think, would
just want to make their spawn "slower" and use the scale to determine
"how much" in a relative fashion rather than expecting hard-and-fast
numbers for the actual process slowdown.

Higher-than-Normal priorities may, OTOH, be a bit harder to deal with.
It strikes me that maybe the best approach is to make MAX_PRIORITY
operating system dependent, specifically 31 - 9 + 1.0 = +23.0 for
Windows and -20 - 0 + 1.0 = +21.0 for Unix.  This way, again the
priorities map linearly and in this case 1:1.  I think for most users,
they would choose a "High Priority" relative to MAX_PRIORITY or just
choose a small increment about 1.0 to add just a little boost.

Of course, the 2 biggest problems with this approach are, IMHO, a) the
below-Normal scale is a percentage but the above-Normal scale is additive.
However, there is no "Simple" definition of MAX_PRIORITY, so I think
using the OS's definition is natural. b) This use of the priority
scale may be confusing to Unix users, since 1.0 now represents
"Normal" and +21, not +/-20 represents Max Priority.  However, the
definition of MAX_PRIORITY would be irrelevant to the definition of
setPriority and getPriority, since each would, in my proposal, compute
for p > 1.0:

Windows: 9 + int((p - 1) / (MAX_PRIORITY - 1) * 22 + .5)
Unix: -int((p - 1) / (MAX_PRIORITY - 1) * 20 + .5)
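
For concreteness, a rough sketch of the full conversion I have in mind
(illustrative only; MAX_PRIORITY is +23.0 on Windows and +21.0 on Unix,
as defined above):

  def proposed_native_priority(p, windows):
      # 0.0 - 1.0 maps linearly onto the below-normal range:
      # Windows 0..9, Unix +20..0.
      if p <= 1.0:
          if windows:
              return int(p * 9 + .5)
          return 20 - int(p * 20 + .5)
      # Above 1.0, use the formulas above.
      MAX_PRIORITY = 23.0 if windows else 21.0
      if windows:
          return 9 + int((p - 1) / (MAX_PRIORITY - 1) * 22 + .5)
      return -int((p - 1) / (MAX_PRIORITY - 1) * 20 + .5)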

Anyway, that's how I'd propose to do the nitty-gritty.  But, more than
anything, I think the subprocess 'priority' methods should use a
priority scheme that is easy to explain.  And by that, I propose:

1.0 represents normal priority, 100%.  Any priority less than 1
represents a below normal priority, down to 0.0, the lowest possible
priority or 0%.  Any priority above 1.0 represents an above normal
priority, with MAX_PRIORITY being the highest priority level available
for a given os.

Granted, it's not much simpler than "0 is normal, -20 is highest and
+20 is lowest", except insofar as it avoids the non-intuitive convention
of a lower priority number representing a higher priority.  Certainly, we
could conform all systems to the +20.0 to -20.0 floating point system,
but I prefer not to bias the methods and honestly feel percentage is
more intuitive.

So, how does that sound to people?  Is that more palatable?

Thanks again for all the input!

Jeffrey.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Adding Priority Scheduling feature to the subprocess

2008-02-22 Thread Nick Craig-Wood
TimeHorse <[EMAIL PROTECTED]> wrote:
>  Anyway, on the one hand AmigaOS support where -128 -> p = 0.0 and +127
>  -> p = 1.0 would be a good example of why simply using a 41 point UNIX
>  scale is deficient in representing all possible priorities, but apart
>  from the support AmigaOS argument, you bring up another issue which
>  may be dangerous: are priorities linear in nature?

Interestingly enough this was changed in recent linux kernels.
Process levels in Linux kernels are logarithmic now, whereas before
they weren't (but I wouldn't like to say exactly what!).

If I run two CPU intensive jobs to busy out both my CPUs, then using
kernel 2.6.22, run

$ python -c 'from time import time
t = time()
for i in xrange(1000):
pass
print time()-t'
1.36508607864

$ nice -n 10 python -c 'from time import time
t = time()
for i in xrange(1000):
pass
print time()-t'
4.27783703804

$ nice -n 20 python -c 'from time import time
t = time()
for i in xrange(1000):
pass
print time()-t'
36.9293899536

You can see that the levels are not linear! A nice 10 job gets 32% of
the CPU of a nice 0 job, but a nice 20 job gets only 4%.
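
(Roughly: treating elapsed time as inversely proportional to CPU share,
1.37/4.28 is about 32% and 1.37/36.9 is about 4%.)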

I think you are on to a loser here trying to normalise it across
OSes unfortunately :-(

-- 
Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Adding Priority Scheduling feature to the subprocess

2008-02-21 Thread TimeHorse
On Feb 21, 1:17 pm, Dennis Lee Bieber <[EMAIL PROTECTED]> wrote:
>         Why imagine... AmigaOS ran -128..+127 (though in practice, one never
> went above +20 as the most time critical system processes ran at that
> level; User programs ran at 0, the Workbench [desktop] ran at +1... I
> think file system ran +5 and disk device handlers ran +10)
>
>         On the other side, VMS ran 0..31, with 16..31 being fixed realtime
> priorities, and 0..15 being variable (they don't drop below the process
> base, but can and are boosted higher the longer they are preempted)

Thanks for the info!  Actually, many pieces of the WinNT kernel were
based on VMS, interestingly enough (go back to the NT 3.5 days and you
can see this) and of course Windows 2000, XP and Vista are all derived
from NT.  So I guess one long-time holdover from the VMS days of NT is
just as you say, it's still true in Windows: processes run 0..15 for
anything but realtime, even if the thread is boosted to Time Critical,
but in realtime the priorities go from 16..31.

Anyway, on the one hand AmigaOS support where -128 -> p = 0.0 and +127
-> p = 1.0 would be a good example of why simply using a 41 point UNIX
scale is deficient in representing all possible priorities, but apart
from the support AmigaOS argument, you bring up another issue which
may be dangerous: are priorities linear in nature?  For instance, a
with-Focus Windows App running Normal-Normal runs at priority 9, p ~=
0.29 (as well as likely for VMS), whereas UNIX and Amiga have 0 for
normal processes, p ~= 0.50.  In many ways, the "Normal" mode being
the epoch makes sense, but this clearly cannot be done on a linear
scale.  Perhaps I should modify the PEP so that instead of the generic
priorities going from 0.0 to 1.0, they go from -1.0 to +1.0, with
0.0 considered normal priority.  Then, the negative and positive
regions can be considered linearly but not necessarily with the same
spacing since on Windows -1.0 to 0.0 spans 0 to 9 priorities and 0.0
to +1.0 spans 10 to 31 priorities.  And then, since +20 is the highest
AmigaOS priority in practice, yet the scale goes up to +127, that would
mean that from p ~= +0.16 to p ~= +1.0 you get obscenely high
priorities which do not seem practical.
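
If I did go the -1.0 to +1.0 route, the Windows side of the mapping
might look roughly like this (just a sketch of the idea, with 9 as the
normal epoch on the 0..31 internal scale):

  def to_windows(p):
      # piecewise-linear: -1.0..0.0 -> 0..9, 0.0..+1.0 -> 9..31
      if p < 0.0:
          return int(9 + p * 9 + .5)
      return int(9 + p * 22 + .5)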

I will need to think about this problem so more.  I'd hate to think
there might be a priority scale based on a Normal Distribution!  I'm
already going bald; I can't afford to lose any more hair!  :)

Anyway, thanks for info Dennis; you've given me quite a bit to think
about!

Jeffrey.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Adding Priority Scheduling feature to the subprocess

2008-02-21 Thread TimeHorse
On Feb 20, 10:15 pm, "Terry Reedy" <[EMAIL PROTECTED]> wrote:
> | Because UNIX uses priorities between +20 and -20 and Windows, via
> | Process and Thread priorities, allows settings between 0 and 31, a
> | uniform setting for each system should be derived.  This would be
> | accomplished by giving process priority in terms of a floating-point
> | value between 0.0 and 1.0 for lowest and highest possible priority,
> | respectively.
>
> I would rather that the feature use the -20 to 20 priorities and map that
> appropriately to the Windows range on Windows.

The problem as I see it is that -20 to +20 is only just over 5 bits of
precision and I can easily imagine an OS with many more than just 5
bits to specify a process priority.  Of course, the os.getpriority and
os.setpriority, being specific to UNIX, WOULD use the -20 to +20
scale, it's just the generic subprocess that would not.  But for a
generic priority, I like floating point because it gives 52 bits of
precision on most platforms.  This would allow for the most
flexibility.  Also, 0.0 to 1.0 is in some ways more intuitive to new
programmers because it can be modeled as ~0% CPU usage vs. ~100% CPU
usage, theoretically.  Users not familiar with UNIX might, OTOH, be
confused by the idea that a lower priority number constitutes a
"higher priority".

Of course, the scale used for p in Popen(...).setPriority(p) is really
not an important issue to me as long as it makes sense in the context
of priorities.  Given that os.setpriority and Popen(...).setPriority
have virtually the same name, it would probably be better to rename
the latter to something a bit less prone to confusion.  Alternatively,
it would not be unreasonable to design setPriority (and getPriority
correspondingly) such that under UNIX it takes 1 parameter, -20 to +20
and under Windows it takes 2 parameters, second one optional, where
the Windows API priorities are directly passed to it (for getPriority,
Windows would return a Tuple pair corresponding to Priority Class and
Main Thread Priority).  However, I personally prefer a unified
definition for subprocess.py's Priority since there already is or will
be direct os-level methods to accomplish the same thing in the os-
native scale.
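
To make the two alternatives concrete (names and values are purely
illustrative; none of these methods exist yet):

  # 1) OS-native signatures:
  #        p.setPriority(10)                    # UNIX: nice value, -20 to +20
  #        p.setPriority(prio_class, thr_prio)  # Windows: two parameters
  #        prio_class, thr_prio = p.getPriority()   # Windows returns a pair
  # 2) Unified signature, which I prefer:
  #        p.setPriority(0.75)                  # same 0.0 - 1.0 scale everywhere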

Anyway, thanks for the input and I will make a note of it in the PEP.
Other than the generic priority ranges, do you see any other issues
with my proposal?

Jeffrey.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Adding Priority Scheduling feature to the subprocess

2008-02-20 Thread Terry Reedy

| Because UNIX uses priorities between +20 and -20 and Windows, via
| Process and Thread priorities, allows settings between 0 and 31, a
| uniform setting for each system should be derived.  This would be
| accomplished by giving process priority in terms of a floating-point
| value between 0.0 and 1.0 for lowest and highest possible priority,
| respectively.

I would rather that the feature use the -20 to 20 priorities and map that 
appropriately to the Windows range on Windows. 



-- 
http://mail.python.org/mailman/listinfo/python-list


PEP: Adding Priority Scheduling feature to the subprocess

2008-02-20 Thread [EMAIL PROTECTED]
I am working on a PEP and would appreciate comment.  The proposal is
available at

http://python.timehorse.com/PEP_-_Application_Priority.reST

and is repeated below:

----------

:PEP: 
:Title: Adding Priority Scheduling feature to the subprocess module
:Version: $Rev: 93 $
:Last Modified: $Date: 2008-02-20 08:37:00 -0500 (Wed, 20 Feb 2008) $
:Author: Jeffrey C. Jacobs
:Discussions-To: python at timehorse dot com
:Status: Draft
:Type: Standards Track
:Content-Type: text/x-rst
:Created: 19-Feb-2008
:Python-Version: 2.5.1
:Post-History: 

----------

Introduction
~~~~~~~~~~~~

The ``subprocess`` module is intended to create a platform-independent
method of *spawning* or *popening* or *execing* an OS-level sub-
process.  This module provides the ability to spawn processes on a
variety of platforms by creating an instance of the
``subprocess.Popen`` class.

Currently, only the Windows version of the ``Popen.__init__`` method
can set the OS priority of a spawned process directly using the
``creationflags`` parameter.  Clients operating under UNIX must use some
version of the ``preexec_fn`` to either wrap a C-call to the
``setpriority`` POSIX function or do so by prepending ``nice `` to
the ``args`` or in the ``executable``, which would only work for sub-
processes spawned from a shell.  Under a Windows environment, however,
the ``preexec_fn`` parameter is ignored and there is no ``nice`` shell
program.  This thus precludes the easy setting of a spawned process's
priority in the UNIX environment, as well as creates a disconnect in
terms of the way priorities are set between Windows and UNIX
platforms, which somewhat defeats the purpose of ``subprocess.py`` in
terms of being relatively platform-independent.
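
For illustration, the current UNIX-only workaround looks roughly like
the following (``os.nice`` stands in for a ``setpriority`` wrapper; the
command and the increment of 10 are placeholders)::

    import os
    import subprocess

    # Lower the child's priority by 10 between fork() and exec().
    p = subprocess.Popen(["some_long_job", "--quiet"],
                         preexec_fn=lambda: os.nice(10))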

In addition to these issues, Windows references processes by handles
rather than by process IDs.  Although this divides the UNIX-specific
and Windows-specific code, the handle of the spawned Windows process
would nonetheless be useful with the ``win32process`` module, and it
is therefore proposed that this handle be officially exposed on the
``Popen`` instance under the name ``handle``.  Currently, this value
is unofficially accessible as ``_handle``, so the proposal is to
rename that attribute and make the name official for Python running
under Windows.  The value of ``handle`` on other platforms would be
``None``.  In addition to officially exposing the Process Handle, it
is proposed that the Main Thread Handle, which is also available
internally within ``Popen.__init__``, be exposed under the name
``hThread`` for use by ``win32process`` and the priority methods.
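
For illustration, with these names exposed one could write something
like the following (a sketch only, assuming the usual ``pywin32``
constants; the executable name is a placeholder)::

    import subprocess
    import win32process

    p = subprocess.Popen(["some_job.exe"])
    # Lower the whole process's priority class via the public handle.
    win32process.SetPriorityClass(p.handle,
                                  win32process.BELOW_NORMAL_PRIORITY_CLASS)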

Because UNIX uses priorities between +20 and -20 and Windows, via
Process and Thread priorities, allows settings between 0 and 31, a
uniform setting for each system should be derived.  This would be
accomplished by giving process priority in terms of a floating-point
value between 0.0 and 1.0 for lowest and highest possible priority,
respectively.  This priority would be added to the ``Popen``
constructor as an optional parameter, with a default value of
``None``.  If and only if something other than ``None`` is set, that
value would override any setting in ``creationflags``, and the Process
Priority flags of ``creationflags`` would be masked off under the
Windows environment.  In all cases a non-``None`` setting would cause
the spawned process to run at the indicated priority converted to the
platform-specific units.
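
To illustrate the intent (not a definitive implementation), a sketch of
the conversion from the proposed 0.0 - 1.0 value to each platform's
native units, using the linear mappings described in this document (the
helper name is hypothetical)::

    import sys

    def _to_native_priority(p):
        if not 0.0 <= p <= 1.0:
            raise ValueError("priority must be between 0.0 and 1.0")
        if sys.platform == "win32":
            return int(p * 31 + .5)        # 0 (lowest) .. 31 (highest)
        return 20 - int(p * 40 + .5)       # nice +20 (lowest) .. -20 (highest)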

The Macintosh OS X / BSD subsystem should be compatible with the UNIX-
specific sections of this proposal.

Specifics
~~~~~~~~~

The following steps would accomplish this proposal:

1) Add a C-Python wrapper for the POSIX ``getpriority`` /
``setpriority`` methods to the ``os`` module.  These methods would
only be available in a UNIX environment and would mesh well with the
existing file/process functions specified therein.  The
``win32process`` module already exposes the corresponding Windows
methods.

2) Officially expose the Process Handle as ``handle`` and the
Process's Main Thread Handle as ``hThread`` within the ``Popen``
instance interface.

3) Add ``setPriority`` / ``getPriority`` methods to the ``Popen``
class which will use the platform-specific priority function either
from ``os`` for UNIX or ``win32process`` for Windows. The input
priority to ``setPriority`` will be checked and raise a ``ValueError``
exception for values not in the range 0.0 to 1.0.

   The 0.0 to 1.0 scale will be converted to the 32-point Windows
scale by linear mapping, e.g. for a floating-point priority *p*, the
Windows priority is given by:

   ``int(p*31 + .5)``

   Windows is limited to a non-linear, 7-point scale for process
priorities, with thread priorities making up the difference relative
to the 32-point internal priority scale.  Thus, the Windows versions
of these functions will use a combination of ``SetPriorityClass`` and
``SetThreadPriority`` in order to set the requested process priority,
using the ``handle`` and