Re: python reliability with EINTR handling in general modules

2012-02-02 Thread Mel Wilson
Dennis Lee Bieber wrote:

 On Wed, 1 Feb 2012 23:25:36 -0800 (PST), oleg korenevich
 void.of.t...@gmail.com wrote:
 
 
Thanks for the help. In the first case all vars are Python integers; maybe
math.floor is redundant, but I'm afraid that the same error with a math
module call will occur in other places of the app where math is needed.
The strange thing here is that the math library call is not a system call,
that the exception is a strange ValueError (all values are valid), and that
in the parentheses I have (4, 'Interrupted system call').

 math.floor() may still be a system call of some sort if access to
 the math processor requires synchronization between processes (that is,
 the math processor/registers are maintained as a separate structure
 apart from the task status during process switches). {Yes -- that is a
 wild hypothesis}

One thing to remember about errno is that C library code will set it to a 
non-zero value when an error is encountered, but (I believe) there's no 
requirement to clear it in the absence of an error.  EINTR might just be 
left over from some long-gone I/O call, then reported just in case in 
handling an exception that didn't involve the C library at all.

As a C coder there are times when it's wise to clear errno yourself to make 
sure your code doesn't get fooled.

Mel.

-- 
http://mail.python.org/mailman/listinfo/python-list


python reliability with EINTR handling in general modules

2012-02-01 Thread oleg korenevich
I have linux board on samsung SoC s3c6410 (ARM11). I build rootfs with
buildroot: Python 2.7.1, uClibc-0.9.31. Linux kernel: Linux buildroot
2.6.28.6 #177 Mon Oct 3 12:50:57 EEST 2011 armv6l GNU/Linux

My app, written in Python, raises these exceptions under some mysterious
conditions:

1) exception:

  File "./dfbUtils.py", line 3209, in setItemData
ValueError: (4, 'Interrupted system call')
code:

currentPage = int(math.floor(float(rowId) / self.pageSize)) == self.selectedPage
2) exception:

File "./terminalGlobals.py", line 943, in getFirmawareName
OSError: [Errno 4] Interrupted system call: 'firmware'
code:

for fileName in os.listdir('firmware'):
Some info about the app: it has 3-7 threads, listens to serial ports via
the 'serial' module, and uses a GUI implemented via a C extension that
wraps directfb. I can't reproduce these exceptions; they are not predictable.

I googled for EINTR exceptions in Python, but only found that EINTR
can occur only on slow system calls, and that Python's modules socket,
subprocess and a few others already handle EINTR. So what happens in
my app? Why can a simple call of a math function interrupt the program
at any time? It's not reliable at all. I have only guesses: a uClibc
bug, or a kernel/hw handling bug. But these guesses don't show me a
solution.

For now I have created wrapper functions (that restart the operation in
case of EINTR) around some functions from the os module, but wrapping
the math module would double the execution time. There is another
question: if math can be interrupted, then other modules can be too, so
how do I get reliability?
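
A minimal sketch of such a wrapper (simplified; the name retry_on_eintr
and the use of os.listdir are just for illustration):

import errno
import os

def retry_on_eintr(func, *args, **kwargs):
    # Retry the call for as long as it keeps failing with EINTR;
    # any other error is re-raised unchanged.
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as e:
            if e.errno != errno.EINTR:
                raise

fileNames = retry_on_eintr(os.listdir, 'firmware')
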
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python reliability with EINTR handling in general modules

2012-02-01 Thread oleg korenevich
On Feb 1, 6:07 pm, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
 On Wed, 1 Feb 2012 06:15:22 -0800 (PST), oleg korenevich
 void.of.t...@gmail.com wrote:
 I have linux board on samsung SoC s3c6410 (ARM11). I build rootfs with
 buildroot: Python 2.7.1, uClibc-0.9.31. Linux kernel: Linux buildroot
 2.6.28.6 #177 Mon Oct 3 12:50:57 EEST 2011 armv6l GNU/Linux

 My app, written in Python, raises these exceptions under some mysterious
 conditions:

 1) exception:

   File "./dfbUtils.py", line 3209, in setItemData
 ValueError: (4, 'Interrupted system call')
 code:

 currentPage = int(math.floor(float(rowId) / self.pageSize)) == self.selectedPage
 2) exception:

 File "./terminalGlobals.py", line 943, in getFirmawareName
 OSError: [Errno 4] Interrupted system call: 'firmware'
 code:

 for fileName in os.listdir('firmware'):
 Some info about the app: it has 3-7 threads, listens to serial ports via
 the 'serial' module, and uses a GUI implemented via a C extension that
 wraps directfb. I can't reproduce these exceptions; they are not predictable.

 I googled for EINTR exceptions in Python, but only found that EINTR
 can occur only on slow system calls, and that Python's modules socket,
 subprocess and a few others already handle EINTR. So what happens in
 my app? Why can a simple call of a math function interrupt the program
 at any time? It's not reliable at all. I have only guesses: a uClibc
 bug, or a kernel/hw handling bug. But these guesses don't show me a
 solution.

         I see nothing in your traceback that indicates that the interrupt
 occurred in the math library call -- unless you deleted that line. In
 the first one, I'd be more likely to suspect your C extension/wrapper...
 (are the fields .pageSize and .selectedPage coming from an object
 implemented in C?)

         As for the math stuff... I presume both rowID and .pageSize are
 constrained to be 0 or positive integers. If that is the case, invoking
 math.floor() is just redundant overhead as the documented behavior of
 int() is to truncate towards 0, which for a positive value, is the same
 as floor()



 >>> neg = -3.141592654
 >>> pos = 3.141592654
 >>> int(neg)
 -3
 >>> math.floor(neg)
 -4.0
 >>> int(pos)
 3
 >>> math.floor(pos)
 3.0

         In the second case... Well, os.listdir() is most likely translated
 into some operating system call.

 http://www.gnu.org/software/libc/manual/html_node/Interrupted-Primiti...

 And, while that call is waiting for I/O to complete, some sort of signal
 is being received.
 --
         Wulfraed                 Dennis Lee Bieber         AF6VN
         wlfr...@ix.netcom.com    HTTP://wlfraed.home.netcom.com/

Thanks for the help. In the first case all vars are Python integers; maybe
math.floor is redundant, but I'm afraid that the same error with a math
module call will occur in other places of the app where math is needed.
The strange thing here is that the math library call is not a system call,
that the exception is a strange ValueError (all values are valid), and that
in the parentheses I have (4, 'Interrupted system call').

For the second case: if Python really does some slow system call in the
os module, why doesn't it handle EINTR and restart the call? Could the
SA_RESTART flag in signal be the solution? But how can I set this flag?
By setting the flag for the signal handler in a C extension (or via
ctypes manipulation)?
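
Or perhaps from pure Python -- a sketch (untested here; SIGALRM is just
an example signal) using signal.siginterrupt(), available since Python
2.6, which controls exactly this restart behaviour:

import signal

def handler(signum, frame):
    pass  # placeholder handler body

signal.signal(signal.SIGALRM, handler)
# Must come after signal.signal(); flag=False asks the kernel to
# restart system calls interrupted by this signal (the SA_RESTART
# behaviour) instead of failing them with EINTR.
signal.siginterrupt(signal.SIGALRM, False)
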
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-14 Thread Paul Rubin
[EMAIL PROTECTED] writes:
 Folks, most common GC schemes have been tried as experiments over
 the years.  None have succeeded, for various reasons.  I think one
 of the main reasons is that Python has to play nice with external
 libraries, many of which weren't written with GC beyond malloc and
 free in mind.

Most GC'd language implementations I've seen actually have friendlier
FFI's (foreign function interfaces) than Python does.  I find it very
easy to make errors with the reference counts in what little Python C
extension hacking I've done.  Maybe one gets better at it with
experience.

 Tagged integers: 
 http://mail.python.org/pipermail/python-dev/2004-July/046139.html

That actually says there was a speedup from tagged ints with no
obvious counterbalancing cost.  However, it looks like there's no
traditional GC, it stays with the same reference counts for heap objects.

 Boehm GC:
 http://mail.python.org/pipermail/python-dev/2005-January/051370.html

This mentions use of Python idioms like txt = open(filename).read()
expecting the file handle to get freed immediately.  I think there have
been some language extensions (the with statement) discussed to replace
that idiom.  Note that the idiom is already not so safe in Jython.
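
For instance, a sketch of the then-proposed with-statement form (PEP
343), which makes the cleanup explicit instead of relying on
refcounting -- the filename is just an example:

filename = 'data.txt'   # example name
with open(filename) as f:
    txt = f.read()
# f is closed deterministically here, on CPython and Jython alike.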

The Boehm GC doesn't seem like the right choice for an interpreter
being built from scratch anyway, but who knows.  Maybe I'm being
premature but I'm already thinking of CPython as a legacy
implementation and PyPy as the more likely testbed for stuff like
this.  I sorta wanted to go to the current sprint but oh well.

 http://wiki.python.org/moin/CodingProjectIdeas/PythonGarbageCollected

This correctly describes difficulties of using a copying GC in
CPython.  Note that the Boehm GC is mark-and-sweep.  As Alex mentions,
that usually means there's a pause every so often while the GC scans
the entire heap, touching all data both live and dead (maybe the Boehm
GC got around this somehow).

I haven't been keeping up with this stuff in recent years so I have a
worse concern.  I don't know whether it's founded or not.  Basically
in the past decade or so, memory has gotten 100x larger and cpu's have
gotten 100x faster, but memory is less than 10x faster once you're out
of the cpu cache.  The mark phase of mark/sweep tends to have a very
random access pattern (at least for Lisp).  In the old days that
wasn't so bad, since a random memory access took maybe a couple of cpu
cycles, but today it takes hundreds of cycles.  So for applications
that use a lot of memory, simple mark/sweep may be a much worse dog
than it was in the Vax era, even if you don't mind the pauses.

 Miscellaneous:
   http://mail.python.org/pipermail/python-dev/2002-June/026032.html

That looks potentially kind of neat, though it's not obvious what it does.

   http://mail.python.org/pipermail/python-dev/2003-November/040299.html

I'll try to read that sometime out of general interest.  I'm too
sleepy right now.

 And lest anyone here think they were the first to suggest getting rid of
 reference counting in Python:
 
 http://www.python.org/search/hypermail/python-1993/0554.html

I think the language suggestions in that message are sound in
principle, if not quite the right thing in every detail.  Lisp went
through about the same evolution in the 60's or 70's (before my time).

With PyPy, it looks like Python implementation methodology is getting
a lot more advanced, so hopefully some big performance gains are
possible, and the language will likely evolve to help things along.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-14 Thread Diez B. Roggisch
 Yeah, I noticed that, I could have been pedantic about it but chose to
 just describe how these language implementations work in the real
 world with zero exceptions that I know of.  I guess I should have
 spelled it out.

You talked about CPU architectures:


 And this presumes an architecture which byte-addresses and only
  uses aligned addresses.


Yes, that would describe just about every cpu for the past 30 years
that's a plausible Python target.


And regarding the zero exceptions - I know for sure that quite a few 
programs were crashing when the transition in 68K from 24 bit addresses 
to real 32 bit was done on popular systems like the ATARI ST - as some 
smart-asses back then used the MSByte for additional parameter space.

I can't say, though, if that was only in assembler-written code - quite 
popular back then even for larger apps - or a compiler optimization. I 
do presume the former.


Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-14 Thread Mike Meyer
Diez B. Roggisch [EMAIL PROTECTED] writes:
 And regarding the zero exceptions - I know for sure that quite a few
 programs were crashing when the transition in 68K from 24 bit
 addresses to real 32 bit was done on popular systems like the ATARI ST
 - as some smart-asses back then used the MSByte for additional
 parameter space.

 I can't say, though, if that was only in assembler-written code - quite
 popular back then even for larger apps - or a compiler optimization. I
 do presume the former.

Being one of the smart-asses in question, I'll say that in some cases
the answer was neither. It wasn't hard to convince a C compiler to
let you fool with the high-order bits of a pointer. I don't think any
of the Amiga compilers would do that on purpose, though.

   mike
-- 
Mike Meyer [EMAIL PROTECTED]  http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-14 Thread [EMAIL PROTECTED]
Paul Rubin wrote:
 This correctly describes difficulties of using a copying GC in
 CPython.  Note that the Boehm GC is mark-and-sweep.  As Alex mentions,
 that usually means there's a pause every so often while the GC scans
 the entire heap, touching all data both live and dead (maybe the Boehm
 GC got around this somehow).

From the docs:

void GC_enable_incremental(void)
Cause the garbage collector to perform a small amount of work every few
invocations of GC_MALLOC or the like, instead of performing an entire
collection at once. This is likely to increase total running time. It
will improve response on a platform that either has suitable support in
the garbage collector (Linux and most Unix versions, win32 if the
collector was suitably built) or if stubborn allocation is used (see
gc.h).

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Diez B. Roggisch
Delaney, Timothy (Tim) wrote:
 Tom Anderson wrote:
 
 
Except that in smalltalk, this isn't true: in ST, every variable
*appears* to contain a reference to an object, but implementations
may not actually work like that. In particular, SmallTalk 80 (and
some earlier smalltalks, and all subsequent smalltalks, i think)
handles small integers (those that fit in wordsize-1 bits)
differently: all variables contain a word, whose bottom bit is a tag
bit; if it's one, the word is a genuine reference, and if it's zero,
the top bits of the word contain a signed integer.
 
 
 This type of implementation has been discussed on python-dev. IIRC it
 was decided by Guido that unless anyone wanted to implement it and show
 a significant performance advantage without any regressions on any
 platform, it wasn't worth it.

AFAIK some LISPs do a similar trick to carry int values on cons-cells. 
And by this they reduce integer precision to 28 bits or something. Surely 
_not_ going to pass a regression test suite :)

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Diez B. Roggisch [EMAIL PROTECTED] writes:
 AFAIK some LISPs do a similar trick to carry int values on
 cons-cells. And by this they reduce integer precision to 28 bits or
 something. Surely _not_ going to pass a regression test suite :)

Lisps often use just one tag bit, to distinguish between an immediate
object and a heap object.  With int/long unification, Python shouldn't
be able to tell the difference between an immediate int and a heap int.

I seem to remember that KCL (and maybe GCL/AKCL) uses heap-consed ints
just like Python does.  It doesn't seem to be disastrous.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Diez B. Roggisch
Paul Rubin wrote:
 Diez B. Roggisch [EMAIL PROTECTED] writes:
 
AFAIK some LISPs do a similar trick to carry int values on
cons-cells. And by this they reduce integer precision to 28 bits or
something. Surely _not_ going to pass a regression test suite :)
 
 
 Lisps often use just one tag bit, to distinguish between an immediate
 object and a heap object.  With int/long unification, Python shouldn't
 be able to tell the difference between an immediate int and a heap int.

That particular implementation used 3 or 4 tag-bits. Of course you are 
right that nowadays python won't notice the difference, as larger nums 
get implicitly converted to a suitable representation. But then the 
efficiency goes away... Basically I think that trying to come up with 
all sorts of optimizations for rather marginal problems (number 
crunching should be - if a python domain at all - done using Numarray) 
simply distracts and complicates the code-base. Speeding up dictionary 
lookups OTOH would have a tremendous impact (and if I'm not mistaken 
was one of the reasons for the 30% speed increase between 2.2 and 2.3).

DIEZ
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Fredrik Lundh
Tom Anderson wrote:

 In both smalltalk and python, every single variable contains a reference
 to an object - there isn't the object/primitive distinction you find in
 less advanced languages like java.

 Except that in smalltalk, this isn't true: in ST, every variable *appears*
 to contain a reference to an object, but implementations may not actually
 work like that.

Python implementations don't have to work that way either.  Please don't
confuse the Python language with the CPython implementation and with
other implementations (existing as well as hypothetical).

(fwiw, switching to tagging in CPython would break just about everything.
might as well start over, and nobody's likely to do that to speed up integer-
dominated programs a little...)

/F 



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Diez B. Roggisch [EMAIL PROTECTED] writes:
 That particular implementation used 3 or 4 tag-bits. Of course you are
 right that nowadays python won't notice the difference, as larger nums
 get implicitly converted to a suitable representation. But then the
 efficiency goes away... Basically I think that trying to come up with
 all sorts of optimizations for rather marginal problems (number
 crunching should be - if a python domain at all - done using Numarray)

I don't think it's necessarily marginal.  Tagged ints can be kept in
registers, which means that even the simplest code that does stuff
with small integers becomes a lot more streamlined, easing the load on
both the Python GC and the cpu's memory cache.  Right now with the
bytecode interpreter, it probably doesn't matter, but with Pypy
generating native machine code, this kind of thing can make a real
difference.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Fredrik Lundh [EMAIL PROTECTED] writes:
 (fwiw, switching to tagging in CPython would break just about
 everything.  might as well start over, and nobody's likely to do
 that to speed up integer- dominated programs a little...)

Yeah, a change of that magnitude in CPython would be madness, but
the question is well worth visiting for PyPy.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Steve Holden
Paul Rubin wrote:
 Diez B. Roggisch [EMAIL PROTECTED] writes:
 
That particular implementation used 3 or 4 tag-bits. Of course you are
right that nowadays python won't notice the difference, as larger nums
get implicitly converted to a suitable representation. But then the
efficiency goes away... Basically I think that trying to come up with
all sorts of optimizations for rather marginal problems (number
crunching should be - if a python domain at all - done using Numarray)
 
 
 I don't think it's necessarily marginal.  Tagged ints can be kept in
 registers, which means that even the simplest code that does stuff
 with small integers becomes a lot more streamlined, easing the load on
 both the Python GC and the cpu's memory cache.  Right now with the
 bytecode interpreter, it probably doesn't matter, but with Pypy
 generating native machine code, this kind of thing can make a real
 difference.

Until someone does the experiment this stuff is bound to be speculation 
(what's that saying about premature optimization?). But I can foresee 
that there'd be problems at the outer edges of the language: for 
example, sys.maxint would have to be reduced, and this in turn would 
lead to reduction in, for example, the theoretical maximum length of 
sequences.

Even if it reduced the average execution time of the average program, 
this will involve trade-offs which can only be fully appreciated in the 
light of practical experience.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Steve Holden [EMAIL PROTECTED] writes:
 Until someone does the experiment this stuff is bound to be
 speculation (what's that saying about premature optimization?). 

40 years of practical Lisp implementation efforts around the globe
and hundreds of published papers on the subject might not be directly
Python-specific, but they're not what I'd call a total vacuum of
experimental results.

 But I can foresee that there'd be problems at the outer edges of the
 language: for example, sys.maxint would have to be reduced, and this
 in turn would lead to reduction in, for example, the theoretical
 maximum length of sequences.

if we're talking about 1 tag bit, sys.maxint would be 2**30-1 at the
lowest, which means the objects in the sequence would have to be
smaller than 4 bytes each if more than sys.maxint of them are supposed
to fit in a 32-bit address space.  Since we're using 4-byte pointers,
that can't happen.  We may have a worse problem by running out of
virtual address space if we use a copying GC.  Of course, on a 64-bit
cpu, this all becomes irrelevant.
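
A quick check of that arithmetic in Python (a throwaway sketch):

maxint = 2**30 - 1           # 31-bit signed ints: one bit lost to the tag
assert maxint * 4 > 2**32    # that many 4-byte pointers exceed a 32-bit space
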
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Alex Martelli
Tom Anderson [EMAIL PROTECTED] wrote:

 On Tue, 11 Oct 2005, Alex Martelli wrote:
 
  Tom Anderson [EMAIL PROTECTED] wrote:
...
  Has anyone looked into using a real GC for python? I realise it would be a
 
  If you mean mark-and-sweep, with generational twists,
 
 Yes, more or less.
 
  that's what gc uses for cyclic garbage.
 
 Do you mean what python uses for cyclic garbage? If so, i hadn't realised

Yes, gc (a standard library module) gives you access to the mechanism
(to some reasonable extent).

 that. There are algorithms for extending refcounting to cyclic structures
 (i forget the details, but you sort of go round and experimentally 
 decrement an object's count and see if it ends up with a negative count or
 something), so i assumed python used one of those. Mind you, those are
 probably more complex than mark-and-sweep!

Not sure about that, when you consider the generational twists, but
maybe.


  lot more complexity in the interpreter itself, but it would be faster,
  more reliable, and would reduce the complexity of extensions.
 
  ???  It adds no complexity (it's already there), it's slower,
 
 Ah. That would be why all those java, .net, LISP, smalltalk and assorted
 other VMs out there, with decades of development, hojillions of dollars
 and the serried ranks of some of the greatest figures in computer science
 behind them all use reference counting rather than garbage collection,
 then.
 
 No, wait ...

Not everybody agrees that practicality beats purity, which is one of
Python's principles.  A strategy based on PURE reference counting just
cannot deal with cyclic garbage -- you'd also need the kind of kludges
you refer to above, or a twin-barreled system like Python's.  A strategy
based on PURE mark-and-sweep *CAN* be complete and correct... at the
cost of horrid delays, of course, but what's such a practical
consideration to a real purist?-)

In practice, more has probably been written about garbage collection
implementations than about almost every issue in CS (apart from sorting
and searching;-).  Good techniques need to be incremental -- the need
to stop the world for unbounded amounts of time (particularly in a
paged virtual memory world...), typical of pure ms (even with
generational twists), is simply unacceptable in all but the most batch
type of computations, which occupy a steadily narrowing niche.
Reference counting is intrinsically reasonably incremental; the
worst-case of very long singly-linked lists (such that a dec-to-0 at the
head causes a cascade of N dec-to-0's all along) is as rare in Python as
it is frequent in LISP (and other languages that go crazy with such
lists -- Haskell, which defines *strings* as single linked lists of
characters, being a particularly egregious example) [[admittedly, the
techniques for amortizing the cost of such worst-cases are well known in
any case, though CPython has not implemented them]].

In any case, if you like Python (which is a LANGUAGE, after all) and
don't like one implementation of it, why not use a different
implementation, which uses a different virtual machine?  Jython, for the
JVM, and IronPython, for MSCLR (presumably what you call .net), are
quite usable; project pypy is producing others (an implementation based
on Common LISP was one of the first practical results, over a year ago);
not to count Parrot, and other projects yet...


  it is, if anything, LESS reliable than reference counting (which is way
  simpler!),
 
 Reliability is a red herring - in the absence of ill-behaved native 
 extensions, and with correct implementations, both refcounting and GC are
 perfectly reliable. And you can rely on the implementation being correct,
 since any incorrectness will be detected very quickly!

Not necessarily: tiny memory leaks in supposedly stable versions of
the JVM, for example, which get magnified in servers operating for
extremely long times and on very large scales, keep turning up.  So, you
can't count on subtle and complicated implementations of garbage
collection algorithms being correct, any more than you can count on that
for (for example) subtle and complicated optimizations -- corner cases
can be hidden everywhere.

There are two ways to try to make a software system reliable: make it so
simple that it obviously has no bugs, or make it so complicated that it
has no obvious bugs.  RC is definitely tilted towards the first of the
two options (and so would be mark-and-sweep in the pure form, the one
where you may need to stop everything for a LONG time once in a while),
while more sophisticated GC schemes get more and more complicated.

BTW, RC _IS_ a form of GC, just like, say, MS is.


  and (if generalized to deal with ALL garbage) it might make it almost
  impossible to write some kinds of extensions (ones which need to 
  interface existing C libraries that don't cooperate with whatever GC
  collection you choose).
 
 Lucky those existing C libraries were written to use python's refcounting!
 
 Oh, you have to 

Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Scott David Daniels
Paul Rubin wrote:
 Diez B. Roggisch [EMAIL PROTECTED] writes:
(about tag bits)
 ... Basically I think that trying to come up with all sorts of
  optimizations for rather marginal problems (number crunching
  should be - if a python domain at all - done using Numarray)
 I don't think it's necessarily marginal.  Tagged ints can be kept in
 registers, which means that even the simplest code that does stuff
 with small integers becomes a lot more streamlined, easing the load on
 both the Python GC and the cpu's memory cache
But the cost on modern computers is more problematic to characterize.
Current speeds are due to deep pipelines, and a conditional in the
INCREF code would blow a pipeline.  On machines with a conditional
increment instruction (and a C (or whatever) compiler clever enough to
 use it), saving the write saves dirty cache in the CPU, but most of
today's CPU/compiler combos will flush the pipeline, killing a number
of pending instructions.

 Right now with the bytecode interpreter, it probably doesn't matter,
 but with Pypy generating native machine code, this kind of thing can
 make a real difference.
You are right that Pypy is the place to experiment with all of this.
That project holds a lot of promise for answering questions that seem
to otherwise degenerate into "Jane, you ignorant slut" (for non-US
readers, this is a reference to an old Saturday Night Live debate
skit where the debate always degenerated into name-calling).

--Scott David Daniels
[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Scott David Daniels [EMAIL PROTECTED] writes:
 Current speeds are due to deep pipelines, and a conditional in the
 INCREF code would blow a pipeline.

I think most of the time, branch prediction will prevent the cache
flush.  Anyway, with consed integers, there's still going to be a
conditional or even a dispatch on the tag field.  Finally, when you
know you're dealing with small integers, you don't need to even check
the tags.  Some implementations use the low order bits as tags with a
tag==0 meaning an integer.  That means if you have two tagged integers,
you can add and subtract them without having to twiddle the tags.

The alternative (tag==1 means integer) means you don't have to mask
off the tag bits to dereference pointers, and you can still add a
constant to a tagged int by simply adjusting the constant
appropriately.  E.g., with one tag bit, to increment you'd add 2 to
the tagged int.
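
A toy model of both schemes (my own illustration of the arithmetic, not
code from any interpreter), using masked 32-bit words in Python:

MASK = 0xFFFFFFFF  # model 32-bit machine words

# Scheme A: tag == 0 means integer, value in the upper 31 bits.
def tag0(i):
    return (i << 1) & MASK

def add0(p, q):
    # Both tags are 0, so a plain add needs no tag twiddling.
    return (p + q) & MASK

# Scheme B: tag == 1 means integer.
def tag1(i):
    return ((i << 1) | 1) & MASK

def incr1(p):
    # Add the adjusted constant 2: the untagged value grows by 1
    # and the tag bit is untouched.
    return (p + 2) & MASK

assert add0(tag0(3), tag0(4)) == tag0(7)
assert incr1(tag1(3)) == tag1(4)
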
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread skip

Diez AFAIK some LISPs do a similar trick to carry int values on
Diez cons-cells.  And by this they reduce integer precision to 28 bits
Diez or something. Surely _not_ going to pass a regression test suite
Diez :)

I'm pretty sure this was tried a few years ago w/ Python.  I don't recall
the results, but I'm pretty sure they weren't good enough.  Had they been, we
could just look at the source.

Folks, most common GC schemes have been tried as experiments over the years.
None have succeeded, for various reasons.  I think one of the main reasons
is that Python has to play nice with external libraries, many of which
weren't written with GC beyond malloc and free in mind.

Here are some pointers interested readers might want to check out:

Tagged integers: 
http://mail.python.org/pipermail/python-dev/2004-July/046139.html

Boehm GC:
http://mail.python.org/pipermail/python-dev/2005-January/051370.html
http://www.python.org/doc/faq/general.html#how-does-python-manage-memory
http://wiki.python.org/moin/CodingProjectIdeas/PythonGarbageCollected

Miscellaneous:
http://mail.python.org/pipermail/python-dev/2002-June/026032.html
http://mail.python.org/pipermail/python-dev/2003-November/040299.html

And lest anyone here think they were the first to suggest getting rid of
reference counting in Python:

http://www.python.org/search/hypermail/python-1993/0554.html

I wouldn't be surprised if there were even earlier suggestions...

Skip
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Scott David Daniels
Paul Rubin wrote:
 Scott David Daniels [EMAIL PROTECTED] writes:
 
Current speeds are due to deep pipelines, and a conditional in the
INCREF code would blow a pipeline.
 
 
 I think most of the time, branch prediction will prevent the cache
 flush.
But, branch prediction is usually a compiler thing, based on code
that is, in this case, a spot in the interpreter that is actually
taking both sides of the branch quite often.  If you split the
interpreter to have a presumed int side, you might do OK, but
that is not how the code works at the moment.

 Some implementations use the low order bits as tags with a tag==0
 meaning an integer.  That means if you have two tagged integers,
 you can add and subtract them without having to twiddle the tags.
You can do so only once you discover you have two tagged integers.
The test for tagged integers (rather, the subsequent branch) is the
thing that blows the pipe.

 The alternative (tag==1 means integer) means you don't have to mask
 off the tag bits to dereference pointers, and you can still add a
 constant to a tagged int by simply adjusting the constant
 appropriately.  
And this presumes an architecture which byte-addresses and only
uses aligned addresses.


--Scott David Daniels
[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Scott David Daniels [EMAIL PROTECTED] writes:
  I think most of the time, branch prediction will prevent the cache
  flush.
 But, branch prediction is usually a compiler thing, based on code
 that is, in this case, a spot in the interpreter that is actually
 taking both sides of the branch quite often.  If you split the
 interpreter to have a presumed int side, you might do OK, but
 that is not how the code works at the moment.

Yes, I'm hypothesizing a native-code compiler as might come out of
the PyPy project.

  if you have two tagged integers,
  you can add and subtract them without having to twiddle the tags.
 You can do so only once you discover you have two tagged integers.
 The test for tagged integers (rather, the subsequent branch) is the
 thing that blows the pipe.

I mean in cases where the compiler can determine that it's dealing
with small ints.  That might be harder in Python than in Lisp, given
how dynamic Python is.  

  The alternative (tag==1 means integer) means you don't have to mask
  off the tag bits to dereference pointers, and you can still add a
  constant to a tagged int by simply adjusting the constant
  appropriately.
 And this presumes an architecture which byte-addresses and only
 uses aligned addresses.

Yes, that would describe just about every cpu for the past 30 years
that's a plausible Python target.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Diez B. Roggisch
 Yes, that would describe just about every cpu for the past 30 years
 that's a plausible Python target.

No. The later 68K (68020) could address on odd addresses. And AFAIK all 
x86 can because of their 8080 heritage.

Don't confuse this with 16-bit-aligned addressing - _that_ has been the 
minimum for years, and of course doing something like

move.l 1, d0   ; move the 4-byte value at (odd) absolute address 1 into data register 0

is a nightmare at runtime - but working, nonetheless.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Diez B. Roggisch [EMAIL PROTECTED] writes:
  Yes, that would describe just about every cpu for the past 30 years
  that's a plausible Python target.
 
 No. The later 68K (68020) could address on odd addresses. And AFAIK
 all x86 can because of their 8080 heritage.

Yes, "could" but not "does" in terms of what any reasonable actual
compiler implementations do.  You get a huge performance hit for
using unaligned data.  The one exception is you could conceivably have
character strings starting at odd addresses but it's no big deal to
start them all on 4-byte (or even 2-byte) boundaries.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Steve Holden
Paul Rubin wrote:
 Diez B. Roggisch [EMAIL PROTECTED] writes:
 
Yes, that would describe just about every cpu for the past 30 years
that's a plausible Python target.

No. The later 68K (68020) could address on odd addresses. And AFAIK
all x86 can because of their 8080 heritage.
 
 
 Yes, "could" but not "does" in terms of what any reasonable actual
 compiler implementations do.  You get a huge performance hit for
 using unaligned data.  The one exception is you could conceivably have
 character strings starting at odd addresses but it's no big deal to
 start them all on 4-byte (or even 2-byte) boundaries.

You made your original assertion in response to Diez saying

 And this presumes an architecture which byte-addresses and only
  uses aligned addresses.

He was talking about the architecture, for Pete's sake, not a compiler.

I personally find it irritating that you continue to try and justify 
your assertions even when you are plainly on shaky ground, which tends 
to make the threads you are involved in endless. Sorry for the ad 
hominem remarks, which I normally try and avoid, but this (ab)uses 
newsgroup bandwidth unnecessarily.

Unwillingness to admit any mistake can be rather unattractive. Can't 
you, just once, say I was wrong? Or are you perchance related to 
President Bush? Better still, just let these small things go without 
further comment.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-13 Thread Paul Rubin
Steve Holden [EMAIL PROTECTED] writes:
  And this presumes an architecture which byte-addresses and only
   uses aligned addresses.
 
 He was talking about the arachiteecture, for Pete's sake, not a compiler.

Yeah, I noticed that, I could have been pedantic about it but chose to
just describe how these language implementations work in the real
world with zero exceptions that I know of.  I guess I should have
spelled it out.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-12 Thread Jorgen Grahn
On Mon, 10 Oct 2005 20:37:03 +0100, Tom Anderson [EMAIL PROTECTED] wrote:
 On Mon, 10 Oct 2005, it was written:
...
 There is no way you can avoid making garbage.  Python conses everything, 
 even integers (small positive ones are cached).

 So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar? 

If the SmallInteger hack is something like this, it does:

>>> a = 42
>>> b = 42
>>> a is b
True
>>> a = 42000
>>> b = 42000
>>> a is b
False


... which I guess is what was referred to above as "small positive
ones are cached".

/Jorgen

-- 
  // Jorgen Grahn jgrahn@   Ph'nglui mglw'nafh Cthulhu
\X/algonet.se   R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-12 Thread Peter Hansen
John Waycott wrote:
 I wrote a simple Python program that acts as a buffer between a 
 transaction network and a database server, writing the transaction logs 
 to a file that the database reads the next day for billing. The simple 
 design decoupled the database from the network so it wasn't stressed during 
 high-volume times. The two systems (one for redundancy) that run the 
 Python program have been running for six years.

Six years?  With no downtime at all for the server?  That's a lot of 
9s of reliability...

Must still be using Python 1.5.2 as well...

-Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-12 Thread Tom Anderson
On Wed, 12 Oct 2005, Jorgen Grahn wrote:

 On Mon, 10 Oct 2005 20:37:03 +0100, Tom Anderson [EMAIL PROTECTED] wrote:
 On Mon, 10 Oct 2005, it was written:
 ...
 There is no way you can avoid making garbage.  Python conses everything,
 even integers (small positive ones are cached).

 So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?

 If the SmallInteger hack is something like this, it does:

 >>> a = 42
 >>> b = 42
 >>> a is b
 True
 >>> a = 42000
 >>> b = 42000
 >>> a is b
 False


 ... which I guess is what was referred to above as "small positive
 ones are cached".

That's not what i meant.

In both smalltalk and python, every single variable contains a reference 
to an object - there isn't the object/primitive distinction you find in 
less advanced languages like java.

Except that in smalltalk, this isn't true: in ST, every variable *appears* 
to contain a reference to an object, but implementations may not actually 
work like that. In particular, SmallTalk 80 (and some earlier smalltalks, 
and all subsequent smalltalks, i think) handles small integers (those that 
fit in wordsize-1 bits) differently: all variables contain a word, whose 
bottom bit is a tag bit; if it's one, the word is a genuine reference, and 
if it's zero, the top bits of the word contain a signed integer. The 
innards of the VM know about this (where it matters), and do the right 
thing. All this means that small (well, smallish - up to a billion!) 
integers can be handled with zero heap space and much reduced instruction 
counts. Of course, it means that references are more expensive, since they 
have to be checked for integerness before dereferencing, but since this is 
a few instructions at most, and since small integers account for a huge 
fraction of the variables in most programs (as loop counters, array 
indices, truth values, etc), this is a net win.

See the section 'Representation of Small Integers' in:

http://users.ipa.net/~dwighth/smalltalk/bluebook/bluebook_chapter26.html#TheObjectMemory26

The precise implementation is sneaky - the tag bit for an integer is zero, 
so in many cases you can do arithmetic directly on the word, with a few 
judicious shifts here and there; the tag bit for a pointer is one, and the 
pointer is stored in two's-complement form *with the bottom bit in the 
same place as the tag bit*, so you can recover a full-length pointer from 
the word by complementing the whole thing, rather than having to shift. 
Since pointers are word-aligned, the bottom bit is always a zero, so in 
the complement it's always a one, so it can also be the status bit!
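
Here's a toy model of that encoding in Python (mine, not the Blue
Book's code; reading "complementing" as bitwise NOT, non-negative
integers only, 32-bit words):

MASK = 0xFFFFFFFF  # model 32-bit words

def encode_int(i):
    return (i << 1) & MASK     # tag bit 0, value in the top 31 bits

def encode_ptr(p):
    assert p & 1 == 0          # pointers are word-aligned, low bit 0
    return ~p & MASK           # complement: the low bit becomes the tag 1

def decode(word):
    if word & 1:                          # tag 1: a genuine reference
        return ('pointer', ~word & MASK)  # complement again to recover it
    return ('integer', word >> 1)         # tag 0: a small integer

assert decode(encode_int(21)) == ('integer', 21)
assert decode(encode_ptr(0x1000)) == ('pointer', 0x1000)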

I think this came from LISP initially (most things do) and was probably 
invented by Guy Steele (most things were).

tom

-- 
That's no moon!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-12 Thread Tom Anderson
On Mon, 10 Oct 2005, it was written:

 Tom Anderson [EMAIL PROTECTED] writes:

 Has anyone looked into using a real GC for python? I realise it would 
 be a lot more complexity in the interpreter itself, but it would be 
 faster, more reliable, and would reduce the complexity of extensions.

 The next PyPy sprint (this week I think) is going to focus partly on GC.

Good stuff!

 Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd 
 still need some way of figuring out which variables in C-land held 
 pointers to objects; if anything, that might be harder, unless you want 
 to impose a horrendous JAI-like bondage-and-discipline interface.

 I'm not sure what JAI is (do you mean JNI?)

Yes. Excuse the braino - JAI is Java Advanced Imaging, a component whose 
horribleness exceeds even that of JNI, hence the confusion.

 but you might look at how Emacs Lisp does it.  You have to call a macro 
 to protect intermediate heap results in C functions from GC'd, so it's 
 possible to make errors, but it cleans up after itself and is generally 
 less fraught with hazards than Python's method is.

That makes a lot of sense.

tom

-- 
That's no moon!
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Python's garbage collection was Re: Python reliability

2005-10-12 Thread Delaney, Timothy (Tim)
Tom Anderson wrote:

 Except that in smalltalk, this isn't true: in ST, every variable
 *appears* to contain a reference to an object, but implementations
 may not actually work like that. In particular, SmallTalk 80 (and
 some earlier smalltalks, and all subsequent smalltalks, i think)
 handles small integers (those that fit in wordsize-1 bits)
 differently: all variables contain a word, whose bottom bit is a tag
 bit; if it's one, the word is a genuine reference, and if it's zero,
 the top bits of the word contain a signed integer.

This type of implementation has been discussed on python-dev. IIRC it
was decided by Guido that unless anyone wanted to implement it and show
a significant performance advantage without any regressions on any
platform, it wasn't worth it.

Basically, put up or shut up ;)

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-12 Thread jepler
On Mon, Oct 10, 2005 at 08:37:03PM +0100, Tom Anderson wrote:
 So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar? 
 Fair enough - the performance gain is nice, but the extra complexity would 
 be a huge pain, i imagine.

I tried to implement this once.  There was not a performance gain for general
code, and I also made the interpreter buggy in the process.

I wrote in 2002:
| Many Lisp interpreters use 'tagged types' to, among other things, let
| small ints reside directly in the machine registers.
| 
| Python might wish to take advantage of this by designating pointers to odd
| addresses stand for integers according to the following relationship:
| p = (i << 1) | 1
| i = p >> 1
| (due to alignment requirements on all common machines, all valid
| pointers-to-struct have 0 in their low bit)  This means that all integers
| which fit in 31 bits can be stored without actually allocating or deallocating
| anything.
| 
| I modified a Python interpreter to the point where it could run simple
| programs.  The changes are unfortunately very invasive, because they
| make any C code which simply executes
| o->ob_type
| or otherwise dereferences a PyObject* invalid when presented with a
| small int.  This would obviously affect a huge amount of existing code in
| extensions, and is probably enough to stop this from being implemented
| before Python 3000.
| 
| This also introduces another conditional branch in many pieces of code, such
| as any call to PyObject_TypeCheck().
| 
| Performance results are mixed.  A small program designed to test the
| speed of all-integer arithmetic comes out faster by 14% (3.38 vs 2.90
| user time on my machine) but pystone comes out 5% slower (14124 vs 13358
| pystones/second).
| 
| I don't know if anybody's barked up this tree before, but I think
| these results show that it's almost certainly not worth the effort to
| incorporate this performance hack in Python.  I'll keep my tree around
| for awhile, in case anybody else wants to see it, but beware that it
| still has serious issues even in the core:
| >>> 0+0j
| Traceback (most recent call last):
|   File "<stdin>", line 1, in ?
| TypeError: unsupported operand types for +: 'int' and 'complex'
| >>> (0).__class__
| Segmentation fault
| 
| 
http://mail.python.org/pipermail/python-dev/2002-August/027685.html

Note that the tree where I worked on this is long since lost.

Jeff


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Python's garbage collection was Re: Python reliability

2005-10-12 Thread Tom Anderson
On Tue, 11 Oct 2005, Alex Martelli wrote:

 Tom Anderson [EMAIL PROTECTED] wrote:
   ...
 Has anyone looked into using a real GC for python? I realise it would be a

 If you mean mark-and-sweep, with generational twists,

Yes, more or less.

 that's what gc uses for cyclic garbage.

Do you mean what python uses for cyclic garbage? If so, i hadn't realised 
that. There are algorithms for extending refcounting to cyclic structures 
(i forget the details, but you sort of go round and experimentally 
decrement an object's count and see if it ends up with a negative count or 
something), so i assumed python used one of those. Mind you, those are 
probably more complex than mark-and-sweep!

 lot more complexity in the interpreter itself, but it would be faster, 
 more reliable, and would reduce the complexity of extensions.

 ???  It adds no complexity (it's already there), it's slower,

Ah. That would be why all those java, .net, LISP, smalltalk and assorted 
other VMs out there, with decades of development, hojillions of dollars 
and the serried ranks of some of the greatest figures in computer science 
behind them all use reference counting rather than garbage collection, 
then.

No, wait ...

 it is, if anything, LESS reliable than reference counting (which is way 
 simpler!),

Reliability is a red herring - in the absence of ill-behaved native 
extensions, and with correct implementations, both refcounting and GC are 
perfectly reliable. And you can rely on the implementation being correct, 
since any incorrectness will be detected very quickly!

 and (if generalized to deal with ALL garbage) it might make it almost 
 impossible to write some kinds of extensions (ones which need to 
 interface existing C libraries that don't cooperate with whatever GC 
 collection you choose).

Lucky those existing C libraries were written to use python's refcounting!

Oh, you have to write a wrapper round the library to interface with the 
automatic memory management? Well, as it happens, the stuff you need to do 
is more or less identical for refcounting and GC - the extension has to 
tell the VM which of the VM's objects it holds references to, so that the 
VM knows that they aren't garbage.

 Are we talking about the same thing?!

Doesn't look like it, does it?

 So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
 Fair enough - the performance gain is nice, but the extra complexity would
 be a huge pain, i imagine.

 CPython currently is implemented on a strict "minimize all tricks" 
 strategy.

A very, very sound principle. If you have the aforementioned decades, 
hojillions and serried ranks, an all-tricks-turned-up-to-eleven strategy 
can be made to work. If you're a relatively small non-profit outfit like 
the python dev team, minimising tricks buys you reliability and agility, 
which is, really, what we all want.

tom

-- 
That's no moon!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-11 Thread John Waycott
Ville Voipio wrote:
 In article [EMAIL PROTECTED], Thomas Bartkus wrote:
 
All in all, it would seem that the reliability of the Python run time is the
least of your worries.  

I agree - design of the application, keeping it simple and testing it 
thoroughly is more important for reliability than implementation 
language. Indeed, I'd argue that in many cases you'd have better 
reliability using Python over C because of easier maintainability and 
higher-level data constructs.

 
 Well, let's put it this way. I have seen many computers running
 Linux with a high load of this and that (web services, etc.) with
 uptimes of years. I have not seen any recent Linux crash without
 faulty hardware or drivers.
 
 If using Python does not add significantly to the level of 
 unreliability, then I can use it. If it adds, then I cannot
 use it.
 

I wrote a simple Python program that acts as a buffer between a 
transaction network and a database server, writing the transaction logs 
to a file that the database reads the next day for billing. The simple 
design decoupled the database from the network so it wasn't stressed during 
high-volume times. The two systems (one for redundancy) that run the 
Python program have been running for six years.

-- John Waycott
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-11 Thread Alex Martelli
Tom Anderson [EMAIL PROTECTED] wrote:
   ...
 Has anyone looked into using a real GC for python? I realise it would be a

If you mean mark-and-sweep, with generational twists, that's what gc
uses for cyclic garbage.

 lot more complexity in the interpreter itself, but it would be faster,
 more reliable, and would reduce the complexity of extensions.

???  It adds no complexity (it's already there), it's slower, it is, if
anything, LESS reliable than reference counting (which is way simpler!),
and (if generalized to deal with ALL garbage) it might make it almost
impossible to write some kinds of extensions (ones which need to
interface existing C libraries that don't cooperate with whatever GC
collection you choose).  Are we talking about the same thing?!


 So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
 Fair enough - the performance gain is nice, but the extra complexity would
 be a huge pain, i imagine.

CPython currently is implemented on a strict "minimize all tricks"
strategy.  There are several other implementations of the Python
language, which may target different virtual machines -- Jython for JVM,
IronPython for MS-CLR, and (less mature) stuff for the Parrot VM, and
others yet from the pypy project.  Each implementation may use whatever
strategy is most appropriate for the VM it targets, of course -- this is
the reason behind Python's refusal to strictly specify GC semantics
(exactly WHEN some given garbage gets collected)... allowing such multiple
implementations leeway in optimizing behavior for the target VM(s).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-11 Thread Paul Rubin
[EMAIL PROTECTED] (Alex Martelli) writes:
  Has anyone looked into using a real GC for python? ...
  lot more complexity in the interpreter itself, but it would be faster,
  more reliable, and would reduce the complexity of extensions.
 
 ???  It adds no complexity (it's already there), it's slower, it is, if
 anything, LESS reliable than reference counting (which is way simpler!),
 and (if generalized to deal with ALL garbage) it might make it almost
 impossible to write some kinds of extensions (ones which need to
 interface existing C libraries that don't cooperate with whatever GC
 collection you choose).  Are we talking about the same thing?!

I've done it both ways and it seems to me that a simple mark/sweep gc
does require a lump of complexity in one place, but Python has that
anyway to deal with cyclic garbage.  Once the gc module is there, then
extensions really do seem to be simpler to write.  Having extensions
know about the gc is no harder than having them maintain reference
counts, in fact it's easier, they have to register new objects with
the gc (by pushing onto a stack) but can remove them all in one go.
Take a look at how Emacs Lisp does it.  Extensions are easy to write.
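
Modelled in Python (a toy sketch of the protocol, my own names, not the
actual elisp macros), the idea looks roughly like this:

class RootStack(object):
    # Toy model: extension code pushes objects it holds so the
    # collector can see them as roots, then pops them all at once.
    def __init__(self):
        self._roots = []
    def push(self, obj):
        self._roots.append(obj)
        return obj
    def mark(self):
        return len(self._roots)
    def release(self, mark):
        # Drop every root pushed since mark(), in one go.
        del self._roots[mark:]

roots = RootStack()
m = roots.mark()
tmp = roots.push(object())   # tmp is now protected from collection
roots.release(m)             # done: unregister everything at once
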
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Ville Voipio
In article [EMAIL PROTECTED], Paul Rubin wrote:

 I would say give the app the heaviest stress testing that you can
 before deploying it, checking carefully for leaks and crashes.  I'd
 say that regardless of the implementation language.

Goes without saying. But I would like to be confident (or as
confident as possible) that all bugs are mine. If I use plain
C, I think this is the case. Of course, bad memory management
in the underlying platform will wreak havoc. I am planning to
use Linux 2.4.somethingnew as the OS kernel, and there I have
not experienced too many problems before.

Adding the Python interpreter adds one layer on uncertainty.
On the other hand, I am after the simplicity of programming
offered by Python.

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Ville Voipio
In article [EMAIL PROTECTED], 
Steven D'Aprano wrote:

 If performance is really not such an issue, would it really matter if you
 periodically restarted Python? Starting Python takes a tiny amount of time:

Uhhh. Sounds like playing with Microsoft :) I know of a mission-
critical system which was restarted every week due to some memory
leaks. If it wasn't, it crashed after two weeks. Guess which
platform...

 $ time python -c pass
 real0m0.164s
 user0m0.021s
 sys 0m0.015s

This is on the limit of being acceptable. I'd say that a one-second
time lag is the maximum. The system is a safety system after all,
and there will be a hardware watchdog to take care of odd crashes.
The software itself is stateless in the sense that its previous
state does not affect the next round. Basically, it is just checking
a few numbers over the network. Even the network connection is
stateless (single UDP packet pairs) to avoid TCP problems with
partial closings, etc.
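
Sketched in Python (host, port, payload and the timeout are
placeholders), each query is a single send and a single receive with a
deadline:

import socket

def query(host, port, payload, timeout=1.0):
    # payload is a byte string; one datagram out, one datagram back.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)    # never wait past the deadline
    try:
        s.sendto(payload, (host, port))
        reply, addr = s.recvfrom(1024)   # the single reply datagram
        return reply
    finally:
        s.close()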

There are a gazillion things which may go wrong. A stray cosmic
ray may change the state of one bit in the wrong place of memory,
and that's it, etc. So, the system has to be able to recover from
pretty much everything. I will in any case build an independent
process which probes the state of the main process. However,
I hope it is never really needed.

 I'm not saying that you will need to restart Python once an hour, or even
 once a month. But if you did, would it matter? What's more important is
 the state of the operating system. (I'm assuming that, with a year uptime
 in the requirements, you aren't even thinking of WinCE.)

Not even in my worst nightmares! The platform will be an embedded
Linux computer running 2.4.somethingnew.

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Paul Rubin
Ville Voipio [EMAIL PROTECTED] writes:
 Goes without saying. But I would like to be confident (or as
 confident as possible) that all bugs are mine. If I use plain
 C, I think this is the case. Of course, bad memory management
 in the underlying platform will wreak havoc. I am planning to
 use Linux 2.4.somethingnew as the OS kernel, and there I have
 not experienced too many problems before.

You might be better off with a 2.6 series kernel.  If you use Python
conservatively (be careful with the most advanced features, and don't
stress anything too hard) you should be ok.  Python works pretty well
if you use it the way the implementers expected you to.  Its
shortcomings are when you try to press it to its limits.

You do want reliable hardware with ECC and all that, maybe with multiple
servers and automatic failover.  This site might be of interest:

  http://www.linux-ha.org/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Steven D'Aprano
Ville Voipio wrote:

 There are a gazillion things which may go wrong. A stray cosmic
 ray may change the state of one bit in the wrong place of memory,
 and that's it, etc. So, the system has to be able to recover from
 pretty much everything. I will in any case build an independent
 process which probes the state of the main process. However,
 I hope it is never really needed.

If you have enough hardware grunt, you could think 
about having three independent processes working in 
parallel. They vote on their output, and best out of 
three gets reported back to the user. In other words, 
only if all three results are different does the device 
throw its hands up in the air and say "I don't know!"

Of course, unless you are running each of them on an 
independent set of hardware and OS, you really aren't 
getting that much benefit. And then there is the 
question, can you trust the voting mechanism... But if 
this is so critical you are worried about cosmic rays, 
maybe it is the way to go.
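
The voting step itself is trivial; an illustrative sketch:

    def vote(a, b, c):
        # return the value at least two of the three agree on
        if a == b or a == c:
            return a
        if b == c:
            return b
        return None            # all three differ: "I don't know!"

    print(vote(42, 42, 41))    # -> 42
    print(vote(1, 2, 3))       # -> None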

If it is not a secret, what are you monitoring with 
this device?


-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Ville Voipio
In article [EMAIL PROTECTED], Steven D'Aprano wrote:

 If you have enough hardware grunt, you could think 
 about having three independent processes working in 
 parallel. They vote on their output, and best out of 
 three gets reported back to the user. In other words, 
 only if all three results are different does the device 
 throw its hands up in the air and say "I don't know!"

Ok, I will give you a bit more information, so that the
situation is a bit clearer. (Sorry, I cannot tell you
the exact application.)

The system is a safety system which supervises several
independent measurements (two or more). The measurements
are carried out by independent measurement instruments
which have their independent power supplies, etc.

The application communicates with the independent
measurement instruments through the network. Each
instrument is queried for its measurement results and
status information regularly. If the results given
by different instruments differ more than a given
amount, then an alarm is set (relay contacts opened).

Naturally, in case of equipment malfunction, the 
alarm is set. This covers a wide range of problems from
errors reported by the instrument to physical failures
or program bugs. 

The system has several weak spots. However, the basic
principle is simple: if anything goes wrong, start
yelling. A false alarm is costly, but not giving the
alarm when required is downright out of the question.
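
The core rule could be a few lines of Python -- an untested sketch
only, with invented names and tolerance, written so that any doubt
opens the contacts:

    def alarm_needed(readings, tolerance):
        # readings: list of (value, status_ok) pairs, one per instrument
        if len(readings) < 2:
            return True                      # not enough data: yell
        if any(not ok for _value, ok in readings):
            return True                      # an instrument reports a fault
        values = [value for value, _ok in readings]
        return max(values) - min(values) > tolerance

    # The relay is held closed only while everything agrees; if this
    # code stops running at all, the hardware watchdog opens it anyway.
    print(alarm_needed([(10.2, True), (10.3, True)], tolerance=0.5))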

I am not building a redundant system with independent
instruments voting. At this point I am trying to minimize
the false alarms. This is why I want to know if Python
is reliable enough to be used in this application.

By the postings I have seen in this thread it seems that
the answer is positive. At least if I do not try to
apply any adventurous programming techniques.

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Ville Voipio
In article [EMAIL PROTECTED], Paul Rubin wrote:

 You might be better off with a 2.6 series kernel.  If you use Python
 conservatively (be careful with the most advanced features, and don't
 stress anything too hard) you should be ok.  Python works pretty well
 if you use it the way the implementers expected you to.  Its
 shortcomings are when you try to press it to its limits.

Just one thing: how reliable is the garbage collecting system?
Should I try to either not produce any garbage or try to clean
up manually?

 You do want reliable hardware with ECC and all that, maybe with multiple
 servers and automatic failover.  This site might be of interest:

Well... Here the uptime benefit from using several servers is
not economically justifiable. I am right now at the phase of
trying to minimize the downtime with given hardware resources.
This is not flying; downtime does not kill anyone. I just want
to avoid choosing tools which belong more to the problem than
to the solution set.

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Paul Rubin
Ville Voipio [EMAIL PROTECTED] writes:
 Just one thing: how reliable is the garbage collecting system?
 Should I try to either not produce any garbage or try to clean
 up manually?

The GC is a simple, manually-updated reference counting system
augmented with some extra contraption to resolve cyclic dependencies.
It's extremely easy to make errors with the reference counts in C
extensions, and either leak references (causing memory leaks) or
forget to add them (causing double-free crashes).  The standard
libraries are pretty careful about managing references but if you're
using 3rd party C modules, or writing your own, then watch out.  

There is no way you can avoid making garbage.  Python conses
everything, even integers (small positive ones are cached).  But I'd
say, avoid making cyclic dependencies, be very careful if you use the
less popular C modules or any 3rd party ones, and stress test the hell
out of your app while monitoring memory usage very carefully.  If you
can pound it with as much traffic in a few hours as it's likely to see
in a year of deployment, without memory leaks or thread races or other
errors, that's a positive sign.
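
A crude harness for that kind of soak test, as an untested sketch
assuming Linux (ru_maxrss is reported in kB there); the workload
function is a stand-in:

    import gc, resource

    def workload():
        # stand-in for one round of the real work
        return len([{'i': i} for i in range(1000)])

    for round_no in range(100001):
        workload()
        if round_no % 10000 == 0:
            gc.collect()
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            print("round %d: max RSS %d kB, %d live objects"
                  % (round_no, rss, len(gc.get_objects())))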

 Well... Here the uptime benefit from using several servers is not
 economically justifiable. I am right now at the phase of trying to
 minimize the downtime with given hardware resources.  This is not
 flying; downtime does not kill anyone. I just want to avoid choosing
 tools which belong more to the problem than to the solution set.

You're probably ok with Python in this case.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Max M
Ville Voipio wrote:
 In article [EMAIL PROTECTED], Paul Rubin wrote:
 
I would say give the app the heaviest stress testing that you can
before deploying it, checking carefully for leaks and crashes.  I'd
say that regardless of the implementation language.
 
 Goes without saying. But I would like to be confident (or as
 confident as possible) that all bugs are mine. If I use plain
 C, I think this is the case. Of course, bad memory management
 in the underlying platform will wreak havoc.

Python isn't perfect, but I do believe that it is as good as the best of 
the major standard systems out there.

You will have *far* greater chances of introducing errors yourself by 
coding in C than you will encounter in Python.

You can see the bugs fixed in recent versions, and see for yourself 
whether they would have crashed your system. That should be an indicator:

http://www.python.org/2.4.2/NEWS.html


-- 

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Scott David Daniels
You might try a "take over" mode -- starting another copy runs to the
point where it tries to listen for UDP and then (if the listening
fails) tells the other process over UDP to die, taking over from it.
This scheme would reduce your time-to-switch to a much shorter window.
Whenever given the shutdown signal, you could turn on the "watcher
sleeping" light.  The remaining issue, replacing hardware or OS
(possibly due to failure), probably changes the UDP address.  That
might be trickier.  I certainly feel happier when I can do a hand-off
to another box, but it sounds like you might need the instruments to
broadcast or multicast to get to that happy place.
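
Roughly like this -- an untested sketch, the port and the DIE datagram
are invented:

    import socket, time

    PORT = 9999                      # invented service port

    def become_listener():
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            s.bind(('', PORT))
            return s                 # nobody else was running
        except socket.error:
            pass
        s.sendto(b'DIE', ('127.0.0.1', PORT))   # ask the old copy to quit
        s.close()
        time.sleep(0.5)              # give it a moment to exit
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(('', PORT))           # raises if the take-over failed
        return s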

--Scott David Daniels
[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list


Python's garbage collection was Re: Python reliability

2005-10-10 Thread Tom Anderson
On Mon, 10 Oct 2005, it was written:

 Ville Voipio [EMAIL PROTECTED] writes:

 Just one thing: how reliable is the garbage collecting system? Should I 
 try to either not produce any garbage or try to clean up manually?

 The GC is a simple, manually-updated reference counting system augmented 
 with some extra contraption to resolve cyclic dependencies. It's 
 extremely easy to make errors with the reference counts in C extensions, 
 and either leak references (causing memory leaks) or forget to add them 
 (causing double-free crashes).

Has anyone looked into using a real GC for python? I realise it would be a 
lot more complexity in the interpreter itself, but it would be faster, 
more reliable, and would reduce the complexity of extensions.

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd 
still need some way of figuring out which variables in C-land held 
pointers to objects; if anything, that might be harder, unless you want to 
impose a horrendous JAI-like bondage-and-discipline interface.

 There is no way you can avoid making garbage.  Python conses everything, 
 even integers (small positive ones are cached).

So python doesn't use the old Smalltalk-80 SmallInteger hack, or similar? 
Fair enough - the performance gain is nice, but the extra complexity would 
be a huge pain, I imagine.
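
(No tagged pointers, just a cache of preallocated small ints; easy to
observe from Python itself -- a CPython implementation detail, not a
language guarantee:)

    a = 42
    b = int('42')          # computed at run time, not a shared constant
    print(a is b)          # True: small ints come from one cache

    c = int('100000')
    d = int('100000')
    print(c is d)          # False: each call makes a fresh object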

tom

-- 
Fitter, Happier, More Productive.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-10 Thread Aahz
In article [EMAIL PROTECTED],
Tom Anderson  [EMAIL PROTECTED] wrote:

Has anyone looked into using a real GC for python? I realise it would be a 
lot more complexity in the interpreter itself, but it would be faster, 
more reliable, and would reduce the complexity of extensions.

Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd 
still need some way of figuring out which variables in C-land held 
pointers to objects; if anything, that might be harder, unless you want to 
impose a horrendous JAI-like bondage-and-discipline interface.

Bingo!  There's a reason why one Python motto is "Plays well with
others."
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur.  --Red Adair
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-10 Thread Mike Meyer
Tom Anderson [EMAIL PROTECTED] writes:
 Has anyone looked into using a real GC for python? I realise it would
 be a lot more complexity in the interpreter itself, but it would be
 faster, more reliable, and would reduce the complexity of extensions.

 Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
 still need some way of figuring out which variables in C-land held
 pointers to objects; if anything, that might be harder, unless you
 want to impose a horrendous JAI-like bondage-and-discipline interface.

Wouldn't necessarily be faster, either. I rewrote a program that
built a static data structure of a couple of hundred thousand objects
and then went traipsing through that while generating a few hundred
objects in a compiled language with a real garbage collector. The
resulting program ran about an order of magnitude slower than the
Python version.

Profiling revealed that it was spending 95% of its time in the
garbage collector, marking and sweeping that large data structure.

There's lots of research on dealing with this problem, as my usage
pattern isn't unusual - just a little extreme. Unfortunately, none of
them were applicable to compiled code without a serious performance
impact on pretty much everything. Those could probably be used in
Python without a problem.
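
(CPython can at least be steered around that pathology, since its
cyclic collector is optional and tunable on top of the reference
counting; a sketch with invented sizes and thresholds:)

    import gc

    # build the big, long-lived structure first...
    table = [{'key': i} for i in range(200000)]

    gc.collect()               # one full pass now, while we can afford it
    gc.set_threshold(100000)   # then collect far less often
    # or, if the rest of the program provably creates no cycles:
    # gc.disable()             # refcounting still frees acyclic garbage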

   mike
-- 
Mike Meyer [EMAIL PROTECTED]  http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Thomas Bartkus
Ville Voipio [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 In article [EMAIL PROTECTED], Paul Rubin wrote:
snip

 I would need to make some high-reliability software
 running on Linux in an embedded system. Performance
 (or lack of it) is not an issue, reliability is.

 The software should be running continuously for
 practically forever (at least a year without a reboot).
 is the Python interpreter (on Linux) stable and
 leak-free enough to achieve this?


 Adding the Python interpreter adds one layer of uncertainty.
 On the other hand, I am after the simplicity of programming
 offered by Python.
snip
 The software should be running continuously for
 practically forever (at least a year without a reboot).
 is the Python interpreter (on Linux) stable and
 leak-free enough to achieve this?
snip

All in all, it would seem that the reliability of the Python run time is the
least of your worries.  The best multi-tasking operating systems do a good
job of segregating different processes BUT what multitasking operating
system meets the standard you request in that last paragraph?  Assuming that
the Python interpreter itself is robust enough to meet that standard, what
about that other 99% of everything else that is competing with your Python
script for cpu, memory, and other critical resources? Under ordinary Linux,
your Python script will be interrupted frequently and regularly by processes
entirely outside of Python's control.

You may not want a multitasking OS at all but rather a single tasking OS
where nothing  happens that isn't 100% under your program control. Or if you
do need a multitasking system, you probably want something designed for the
type of rugged use you are demanding.  I would google "embedded
systems".  If you want to use Python/Linux, I might suggest you search
"Embedded Linux".

And I wouldn't be surprised if some dedicated microcontrollers aren't
showing up with Python capability.  In any case, it would seem you need more
control than a Python interpreter would receive when running under Linux.

Good Luck.
Thomas Bartkus




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python's garbage collection was Re: Python reliability

2005-10-10 Thread Paul Rubin
Tom Anderson [EMAIL PROTECTED] writes:
 Has anyone looked into using a real GC for python? I realise it would
 be a lot more complexity in the interpreter itself, but it would be
 faster, more reliable, and would reduce the complexity of extensions.

The next PyPy sprint (this week I think) is going to focus partly on GC.

 Hmm. Maybe it wouldn't make extensions easier or more reliable. You'd
 still need some way of figuring out which variables in C-land held
 pointers to objects; if anything, that might be harder, unless you
 want to impose a horrendous JAI-like bondage-and-discipline interface.

I'm not sure what JAI is (do you mean JNI?) but you might look at how
Emacs Lisp does it.  You have to call a macro to protect intermediate
heap results in C functions from being GC'd, so it's possible to make
errors, but it cleans up after itself and is generally less fraught
with hazards than Python's method is.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Peter Hansen
Ville Voipio wrote:
 I am not building a redundant system with independent
 instruments voting. At this point I am trying to minimize
 the false alarms. This is why I want to know if Python
 is reliable enough to be used in this application.
 
 By the postings I have seen in this thread it seems that
  the answer is positive. At least if I do not try to
  apply any adventurous programming techniques.

We built a system with similar requirements using an older version of 
Python (either 2.0 or 2.1 I believe).  A number of systems were shipped 
and operate without problems.  We did have a memory leak issue in an 
early version and spent ages debugging it (and actually implemented the 
suggested "reboot when necessary" feature as a stop-gap measure at one 
point) before finally discovering it.  (Then, knowing what to search 
for, we quickly found that the problem had been fixed in CVS for the 
Python version we were using, and actually released in the subsequent 
major revision.)  (The leak involved extending empty lists, or extending 
lists with empty lists, as I recall.)

Other than that, we had no real issues and definitely felt the choice of 
Python was completely justified.  I have no hesitation recommending it, 
other than to caution (as I believe Paul R did) that use of new features 
is dangerous in that they won't have as wide usage and shouldn't 
always be considered proven in long-term field use, by definition.

Another suggestion would be to carefully avoid cyclic references (if the 
app is simple enough for this to be feasible), allowing you to rely on 
reference-counting for garbage collection and the resultant more 
deterministic behaviour.
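
One way to keep back-references from ever forming cycles is the
weakref module; a minimal untested sketch:

    import weakref

    class Node(object):
        def __init__(self, parent=None):
            self.children = []
            self._parent = None
            if parent is not None:
                # weak back-reference: no cycle is ever formed, so
                # plain reference counting frees the tree promptly
                self._parent = weakref.ref(parent)

        def parent(self):
            if self._parent is None:
                return None
            return self._parent()     # None once the parent is gone

    root = Node()
    child = Node(root)
    root.children.append(child)
    del root                          # freed at once by refcounting
    print(child.parent())             # -> None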

Also test heavily.  We were using test-driven development and had 
effectively thousands of hours of run-time by the time the first system 
shipped, so we had great confidence in it.

-Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Ville Voipio
In article [EMAIL PROTECTED], Thomas Bartkus wrote:
 
 All in all, it would seem that the reliability of the Python run time is the
 least of your worries.  The best multi-tasking operating systems do a good
  job of segregating different processes BUT what multitasking operating
 system meets the standard you request in that last paragraph?

Well, let's put it this way. I have seen many computers running
Linux with a high load of this and that (web services, etc.) with
uptimes of years. I have not seen any recent Linux crash without
faulty hardware or drivers.

If using Python does not add significantly to the level of 
unreliability, then I can use it. If it adds, then I cannot
use it.

 type of rugged use you are demanding.  I would google "embedded
 systems".  If you want to use Python/Linux, I might suggest you search
 "Embedded Linux".

I am an embedded system designer by my profession :) Both hardware
and software for industrial instruments. Computers are just a
side effect of nicer things.

But here I am looking into the possibility of making something 
with embedded PC hardware (industrial PC/104 cards). The name of 
the game is "as good as possible with the given amount of money". 
In that respect this is not flying or shooting. If something goes
wrong, someone loses a bunch of dollars, not their life.

I think that in this game Python might be handy when it comes to
maintainability and legibility (vs. C). But choosing a tool which
is known to be bad for the task is not a good idea.

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-10 Thread Ville Voipio
In article [EMAIL PROTECTED], Peter Hansen wrote:

 Other than that, we had no real issues and definitely felt the choice of 
 Python was completely justified.  I have no hesitation recommending it, 
 other than to caution (as I believe Paul R did) that use of new features 
 is dangerous in that they won't have as wide usage and shouldn't 
 always be considered proven in long-term field use, by definition.

Thank you for this information. Of course, we try to be as conservative
as possible. The application fortunately allows for this; cyclic
references and new features can most probably be avoided.

 Also test heavily.  We were using test-driven development and had 
 effectively thousands of hours of run-time by the time the first system 
 shipped, so we had great confidence in it.

Yes, it is usually much nicer to debug the software in the quiet,
air-conditioned lab than somewhere in a jungle on the other side
of the globe with an extremely angry customer next to you...

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Python reliability

2005-10-09 Thread Ville Voipio
I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

The piece of software is rather simple, probably a
few hundred lines of code in Python. There is a need 
to interact with the network using the socket module, 
and then probably a need to do something hardware-
related which will get its own driver written in
C.

Threading and other more error-prone techniques can
be left aside, everything can run in one thread with
a poll loop.
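
Skeleton of such a loop, as an untested sketch (the datagram sockets
and the handler are assumed to come from elsewhere):

    import select, time

    def main_loop(sockets, handle_datagram, period=1.0):
        # one thread, one loop: wait for traffic or the next timer tick
        next_tick = time.time() + period
        while True:
            timeout = max(0.0, next_tick - time.time())
            readable, _, _ = select.select(sockets, [], [], timeout)
            for s in readable:
                data, addr = s.recvfrom(1024)
                handle_datagram(s, data, addr)
            if time.time() >= next_tick:
                next_tick += period
                # periodic work (query instruments, kick the
                # hardware watchdog) would go here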

The software should be running continuously for 
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

- Ville

-- 
Ville Voipio, Dr.Tech., M.Sc. (EE)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Paul Rubin
Ville Voipio [EMAIL PROTECTED] writes:
 The software should be running continuously for 
 practically forever (at least a year without a reboot).
 Is the Python interpreter (on Linux) stable and
 leak-free enough to achieve this?

I would say give the app the heaviest stress testing that you can
before deploying it, checking carefully for leaks and crashes.  I'd
say that regardless of the implementation language.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Steven D'Aprano
On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:

 I would need to make some high-reliability software
 running on Linux in an embedded system. Performance
 (or lack of it) is not an issue, reliability is.

[snip]

 The software should be running continuously for 
 practically forever (at least a year without a reboot).
 Is the Python interpreter (on Linux) stable and
 leak-free enough to achieve this?

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:

$ time python -c "pass"
real    0m0.164s
user    0m0.021s
sys     0m0.015s

If performance isn't an issue, your users may not even care about ten
times that delay even once an hour. In other words, build your software to
deal gracefully with restarts, and your users won't even notice or care if
it restarts.
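
For instance, persist what little state there is across restarts; an
untested sketch, path invented:

    import os, pickle

    STATE_FILE = '/var/lib/myapp/state.pck'    # invented path

    def load_state(default):
        try:
            f = open(STATE_FILE, 'rb')
        except IOError:
            return default          # first run: start clean
        try:
            return pickle.load(f)
        except Exception:
            return default          # corrupt state: start clean
        finally:
            f.close()

    def save_state(state):
        tmp = STATE_FILE + '.tmp'
        f = open(tmp, 'wb')
        pickle.dump(state, f)
        f.close()
        os.rename(tmp, STATE_FILE)  # atomic on POSIX: never half-written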

I'm not saying that you will need to restart Python once an hour, or even
once a month. But if you did, would it matter? What's more important is
the state of the operating system. (I'm assuming that, with a year uptime
in the requirements, you aren't even thinking of WinCE.)


-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Paul Rubin
Steven D'Aprano [EMAIL PROTECTED] writes:
 If performance is really not such an issue, would it really matter if you
 periodically restarted Python? Starting Python takes a tiny amount of time:

If you have to restart an application, every network peer connected to
it loses its connection.  Think of a phone switch.  Do you really want
your calls dropped every few hours of conversation time, just because
some lame application decided to restart itself?  Phone switches go to
great lengths to keep running through both hardware failures and
software upgrades, without dropping any calls.  That's the kind of
application it sounds like the OP is trying to run.

To the OP: besides Python you might also consider Erlang.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Jp Calderone
On Sun, 9 Oct 2005 23:00:04 +0300 (EEST), Ville Voipio [EMAIL PROTECTED] 
wrote:
I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

The piece of software is rather simple, probably a
few hundred lines of code in Python. There is a need
to interact with the network using the socket module,
and then probably a need to do something hardware-
related which will get its own driver written in
C.

Threading and other more error-prone techniques can
be left aside, everything can run in one thread with
a poll loop.

The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?


As a data point, I've had python programs run on linux for more than a year 
using both Python 2.1.3 and 2.2.3.  These were network apps, with both client 
and server functionality, using Twisted.

Jp
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread George Sakkis
Steven D'Aprano wrote:

 On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:

  I would need to make some high-reliability software
  running on Linux in an embedded system. Performance
  (or lack of it) is not an issue, reliability is.

 [snip]

   The software should be running continuously for
  practically forever (at least a year without a reboot).
  Is the Python interpreter (on Linux) stable and
  leak-free enough to achieve this?

 If performance is really not such an issue, would it really matter if you
 periodically restarted Python? Starting Python takes a tiny amount of time:

You must have missed or misinterpreted the "The software should be
running continuously for practically forever" part. The problem of
restarting python is not the 200 msec lost but putting reliability
at stake (e.g. for health monitoring devices, avionics, nuclear
reactor controllers, etc.) and robustness (e.g. a computation that
takes weeks of cpu time to complete being interrupted without the
possibility to restart from the point it stopped).

George

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Neal Norwitz
Ville Voipio wrote:

 The software should be running continuously for
 practically forever (at least a year without a reboot).
 Is the Python interpreter (on Linux) stable and
 leak-free enough to achieve this?

Jp gave you the answer that he has done this.

I've spent quite a bit of time since 2.1 days trying to improve the
reliability.  I think it has gotten much better.  Valgrind is run on
(nearly) every release.  We look for various kinds of problems.  I try
to review C code for these sorts of problems etc.

There are very few known issues that can crash the interpreter.  I
don't know of any memory leaks.  socket code is pretty well tested and
heavily used, so you should be in fairly safe territory, particularly
on Unix.

n

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Steven D'Aprano
George Sakkis wrote:

 Steven D'Aprano wrote:
 
 
On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:


I would need to make some high-reliability software
running on Linux in an embedded system. Performance
(or lack of it) is not an issue, reliability is.

[snip]


The software should be running continuously for
practically forever (at least a year without a reboot).
Is the Python interpreter (on Linux) stable and
leak-free enough to achieve this?

If performance is really not such an issue, would it really matter if you
periodically restarted Python? Starting Python takes a tiny amount of time:
 
 
 You must have missed or misinterpreted the "The software should be
 running continuously for practically forever" part. The problem of
 restarting python is not the 200 msec lost but putting reliability
 at stake (e.g. for health monitoring devices, avionics, nuclear
 reactor controllers, etc.) and robustness (e.g. a computation that
 takes weeks of cpu time to complete being interrupted without the
 possibility to restart from the point it stopped).


Er, no, I didn't miss that at all. I did miss that it 
needed continual network connections. I don't know if 
there is a way around that issue, although mobile 
phones move in and out of network areas, swapping 
connections when and as needed.

But as for reliability, well, tell that to Buzz Aldrin 
and Neil Armstrong. The Apollo 11 moon lander rebooted 
multiple times on the way down to the surface. It was 
designed to recover gracefully when rebooting unexpectedly:

http://www.hq.nasa.gov/office/pao/History/alsj/a11/a11.1201-pa.html

I don't have an authoritative source of how many times 
the computer rebooted during the landing, but it was 
measured in the dozens. Calculations were performed in 
an iterative fashion, with an initial estimate that was 
improved over time. If a calculation was interrupted the 
computer lost no more than one iteration.

I'm not saying that this strategy is practical or 
useful for the original poster, but it *might* be. In a 
noisy environment, it pays to design a system that can 
recover transparently from a lost connection.

If your heart monitor can reboot in 200 ms, you might 
miss one or two beats, but so long as you pick up the 
next one, that's just noise. If your calculation takes 
more than a day of CPU time to complete, you should 
design it in such a way that you can save state and 
pick it up again when you are ready. You never know 
when the cleaner will accidentally unplug the computer...
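
A checkpointed iteration along those lines, as an untested sketch
where improve() is a stand-in for the real calculation:

    import os, pickle

    CKPT = 'calc.ckpt'

    def improve(acc, step):
        return acc + 1.0 / (step + 1)     # stand-in for one iteration

    def run(total_steps):
        step, acc = 0, 0.0
        if os.path.exists(CKPT):          # resume after a crash or unplug
            f = open(CKPT, 'rb')
            step, acc = pickle.load(f)
            f.close()
        while step < total_steps:
            acc = improve(acc, step)
            step += 1
            f = open(CKPT + '.tmp', 'wb')
            pickle.dump((step, acc), f)
            f.close()
            os.rename(CKPT + '.tmp', CKPT)   # lose at most one iteration
        return acc

    print(run(1000))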


-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python reliability

2005-10-09 Thread Jp Calderone
On Mon, 10 Oct 2005 12:18:42 +1000, Steven D'Aprano [EMAIL PROTECTED] wrote:
snip
But as for reliability, well, tell that to Buzz Aldrin
and Neil Armstrong. The Apollo 11 moon lander rebooted
multiple times on the way down to the surface. It was
designed to recover gracefully when rebooting unexpectedly:

http://www.hq.nasa.gov/office/pao/History/alsj/a11/a11.1201-pa.html


This reminds me of crash-only software:

  http://www.stanford.edu/~candea/papers/crashonly/crashonly.html

Which seems to have some merits.  I have yet to attempt to develop any large 
scale software explicitly using this technique (although I have worked on 
several systems that very loosely used this approach; eg, a server which 
divided tasks into two processes, with one restarting the other whenever it 
noticed it was gone), but as you point out, there's certainly precedent.
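
The supervising half of such a pair can be tiny; an untested sketch,
worker.py invented:

    import subprocess, time

    WORKER = ['python', 'worker.py']      # invented worker script

    def supervise():
        while True:
            proc = subprocess.Popen(WORKER)
            code = proc.wait()            # block until the worker dies
            print("worker exited with status %s, restarting" % code)
            time.sleep(1)                 # avoid a tight crash loop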

Jp
-- 
http://mail.python.org/mailman/listinfo/python-list