How to get insight in the relations between tracebacks of exceptions in an exception-chain

2024-04-04 Thread Klaas van Schelven via Python-list
Hi,

This question is best introduced example-first:

Consider the following trivial program:

```
class OriginalException(Exception):
    pass


class AnotherException(Exception):
    pass


def raise_another_exception():
    raise AnotherException()


def show_something():
    try:
        raise OriginalException()
    except OriginalException:
        raise_another_exception()


show_something()
```

Running this will dump the following on screen (minus the annotations on the
right-hand side):

```
Traceback (most recent call last):
  File "./stackoverflow_single_complication.py", line 15, in show_something   # t1
    raise OriginalException()
__main__.OriginalException

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./stackoverflow_single_complication.py", line 20, in <module>         # t0
    show_something()
  File "./stackoverflow_single_complication.py", line 17, in show_something   # t2
    raise_another_exception()
  File "./stackoverflow_single_complication.py", line 10, in raise_another_exception  # t3
    raise AnotherException()
__main__.AnotherException
```

What we see here is first the `OriginalException`, with the stack frames
between the moment it was raised and the moment it was handled.
Then we see `AnotherException`, with a complete traceback from its
point of raising back to the start of the program.

In itself this is perfectly fine, but a consequence of this way of
presenting the information is that the stack frames are _not_ laid out on
the screen in the order in which they were called (and not in the reverse
order either), as per the annotations _t1_, _t0_, _t2_, _t3_. The path
leading up to _t1_ is of course the same as the path leading up to _t2_,
and the creators of Python have chosen to present it only once, in the
latter case, presumably because that exception is usually the most
interesting one, and because it allows one to read the bottom exception
bottom-up without loss of information. However, it does leave people who
want to analyze the `OriginalException` somewhat mystified: what led up to
it?

A programmer who wants to understand what led up to _t1_ would need to
[mentally] copy all the frames above the point _t2_ onto the first
traceback to get a complete view. However, in reality the point _t2_ is,
AFAIK, not automatically annotated for you as a special frame, which makes
the task of mentally copying the stack frames much harder.

Since the point _t2_ is in general "the failing line in the `except`
block", this exercise can usually be completed by cross-referencing the
source code, but this seems unnecessarily hard.

**Is it possible to automatically pinpoint _t2_ as the "handling frame"?**

(The example above is given without any outer exception-handling context;
I'm perfectly fine with answers that introduce such context and then use
`traceback` or other tools to arrive at the correct answer.)
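
For instance, with an outer handler wrapped around `show_something()`, the
following sketch pinpoints _t2_ by exploiting the fact that the frame that
was handling `OriginalException` is the very same frame object that raised
`AnotherException` (the helper name `find_handling_frame` is mine):

```
import traceback

def find_handling_frame(exc):
    """Return the traceback entry of exc whose frame also appears in
    exc.__context__'s traceback -- i.e. the "handling frame" (t2)."""
    context = exc.__context__
    if context is None:
        return None
    context_frames = set()
    tb = context.__traceback__
    while tb is not None:
        context_frames.add(tb.tb_frame)
        tb = tb.tb_next
    tb = exc.__traceback__
    while tb is not None:
        if tb.tb_frame in context_frames:
            return tb  # same frame object: this is where handling happened
        tb = tb.tb_next
    return None

try:
    show_something()
except AnotherException as e:
    tb = find_handling_frame(e)
    if tb is not None:
        print("t2 is:", "".join(traceback.format_tb(tb, limit=1)))
```

For the example above this should print the `raise_another_exception()`
line in `show_something`, i.e. exactly _t2_.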

This is the most trivial case that illustrates the problem; real cases have
many more stack frames, which makes the problem less obvious to spot but
all the more in need of the (potentially automated) clarification that this
question asks about.


regards,
Klaas


Previously asked here:
https://stackoverflow.com/questions/78270044/how-to-get-insight-in-the-relations-between-tracebacks-of-exceptions-in-an-excep
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: platform system may be Windows or Microsoft since Vista

2007-08-31 Thread Klaas
On Aug 31, 9:47 am, [EMAIL PROTECTED] wrote:
> Let's suppose you get Python for Vista Windows today from
> http://www.python.org/download/.
>
> Should you then conclude that the tests:
>
> if platform.system() in ('Windows', 'Microsoft'):
> if not (platform.system() in ('Windows', 'Microsoft')):

Good analysis.  Log a bug @ bugs.python.org

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Closures / Blocks in Python

2007-07-25 Thread Klaas
On Jul 24, 7:58 am, treble54 <[EMAIL PROTECTED]> wrote:
> Does anyone know a way to use closures or blocks in python like those
> used in Ruby? Particularly those used in the { } braces.

Inner functions allow you to define closures and (named) blocks
anywhere.  Anonymous blocks (lambdas) must consist of a single expression.
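
A quick sketch of both forms:

def adder(n):
    def add(x):        # named inner function: a closure over n
        return x + n
    return add

add5 = adder(5)
print add5(3)          # prints 8

add5 = lambda x: x + 5 # the anonymous form: one expression only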

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: class C: vs class C(object):

2007-07-23 Thread Klaas
On Jul 20, 5:47 am, Hrvoje Niksic <[EMAIL PROTECTED]> wrote:
> "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes:
> > In particular, old-style classes are noticeably faster than
> > new-style classes for some things (I think it was attribute lookup
> > that surprised me recently, possibly related to the property
> > stuff...)
>
> Can you post an example that we can benchmark?  I ask because the
> opposite is usually claimed, that (as of Python 2.4 or 2.5) new-style
> classes are measurably faster.

Why do people ask for trivial examples?

$ cat classes.py
class Classic:
def __init__(self):
self.attr = 1

class NewStyle(object):
def __init__(self):
self.attr = 1

$ python -m timeit -s 'from classes import *; c = Classic()' 'c.attr'
<string>:2: SyntaxWarning: import * only allowed at module level
1000000 loops, best of 3: 0.182 usec per loop

$ python -m timeit -s 'from classes import *; c = NewStyle()' 'c.attr'
<string>:2: SyntaxWarning: import * only allowed at module level
1000000 loops, best of 3: 0.269 usec per loop

New style classes have more machinery to process for attribute/method
lookup, and are slower.

There are very few algorithms for which attribute access is the
bottleneck, however (seeing as how easily it can be extracted out of
inner loops into locals, which are much faster than attribute access
on either type of class).
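
For example, the usual hoisting idiom:

lst = []
append = lst.append        # one attribute lookup, bound to a local
for i in xrange(1000000):
    append(i)              # local name: no per-iteration lookup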

Using old-style classes for performance is a useful hack for python
perf wizards, but is a dangerous meme to perpetuate.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The ** operator ambiguous?

2007-07-16 Thread Klaas
On Jul 16, 10:40 am, Robert Dailey <[EMAIL PROTECTED]> wrote:
> I noticed that the ** operator is used as the power operator, however
> I've seen it used when passing variables into a function. For example,
> I was researching a way to combine dictionaries. I found that if you
> do this:
>
> a = {"t1":"a", "t2":"b"}
> b = {"t3":"c"}
> dict( a, **b )
>
> This combines the two dictionaries.

Use dict.update to combine dictionaries.
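
For example:

a = {"t1": "a", "t2": "b"}
b = {"t3": "c"}
c = dict(a)    # copy, so a is left untouched
c.update(b)    # c is now {"t1": "a", "t2": "b", "t3": "c"}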

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Where does str class represent its data?

2007-07-13 Thread Klaas
On Jul 11, 4:37 pm, Miles <[EMAIL PROTECTED]> wrote:

> Since strings are immutable, you need to override the __new__ method.
> Seehttp://www.python.org/download/releases/2.2.3/descrintro/#__new__

In case this isn't clear, here is how to do it:

In [1]: class MyString(str):
   ...:     def __new__(cls, value):
   ...:         return str.__new__(cls, value.lower())

In [2]: s = MyString('Hello World')

In [3]: s
Out[3]: 'hello world'

Note that this will not do fancy stuff like automatically call
__str__() methods.  If you want that, call str() first:

In [5]: class MyString(str):
   ...:     def __new__(cls, value):
   ...:         return str.__new__(cls, str(value).lower())

-Mike


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Shed Skin Python-to-C++ Compiler 0.0.21, Help needed

2007-07-05 Thread Klaas
On Jun 29, 3:48 am, "Mark Dufour" <[EMAIL PROTECTED]> wrote:

> I have just released version 0.0.22 of Shed Skin, an experimental
> Python-to-C++ compiler. Among other things, it has the exciting new
> feature of being able to generate (simple, for now) extension modules,
> so it's much easier to compile parts of a program and use them (by
> just importing them). Here's the complete changelog:
>
> -support for generating simple extension modules (linux/windows; see README)

Great work.  You might want to advertise this on the main site
(currently it states that this is impossible).

You've said somewhere that you didn't/don't plan on working on this
aspect, but it is surely the "killer feature" shed skin needs for it
to be usable the way pyrex currently is (optimizing bits of larger
projects).

Of course, the perfect synthesis would be to combine the two projects
into something that applied type inferencing with a fallback to the
python vm when necessary.  But there is such a large gap betwixt the
twain that such dreaming is but an exercise in fantasy (there's
always pypy).

I wish I had time to help,
-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using a switch-like if/else construct versus a dictionary?

2007-06-25 Thread Klaas
On Jun 19, 12:40 pm, asincero <[EMAIL PROTECTED]> wrote:
> Which is better: using an if/else construct to simulate a C switch or
> use a dictionary?  Example:

Whichever results in the clearest code that meets the performance
requirements.

FWIW, if you define the dictionary beforehand, the dict solution is
O(1) while if/else is O(N), which can be important.
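
A sketch of the define-it-beforehand version (the handler names are
illustrative):

def handle_start(): print 'starting'
def handle_stop(): print 'stopping'
def handle_unknown(): print 'unknown command'

dispatch = {
    'start': handle_start,
    'stop': handle_stop,
}

dispatch.get('start', handle_unknown)()   # O(1) lookup, then call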

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: contextlib.closing annoyance

2007-06-25 Thread Klaas
On Jun 22, 4:54 pm, Paul Rubin  wrote:
> it looks like contextlib.closing fails to be idempotent,
> i.e. wrapping closing() around another closing() doesn't work.

> This is annoying because the idea of closing() is to let you
> use legacy file-like objects as targets of the "with" statement,
> e.g.
>
> with closing(gzip.open(filename)) as zf: ...
>
> but what happens if the gzip library gets updated the dumb way to
> support the enter and exit methods so you don't need the explicit
> closing call any more?  The dumb way of course is to just call
> closing() inside the library.  It seems to me that
> closing(closing(f)) ought to do the same thing as closing(f).
>
> Let me know if I'm overlooking something.  I'm thinking of submitting
> an RFE.

I'm not sure what "calling closing() inside the library" entails.  In
the __enter__ method?  I don't see how that could work.  Nor anywhere
else, really: an object does not have the ability to wrap itself in a
context manager (without explicitly emulating the functionality by
calling the __-methods).  Indeed, why wouldn't this be the shortest
(and dumbest) implementation?

class GzipFile:  # i.e. gzip's file class, in the library itself
    def __enter__(self):
        return self
    def __exit__(self, *args):
        self.close()

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pyrex problem with cdef'd attribute

2007-06-08 Thread Klaas
On Jun 8, 6:00 am, [EMAIL PROTECTED] wrote:
> I'm using Pyrex 0.9.5.1a.  I have this simple Pyrex module:

You might get more help on the pyrex list.

> cdef class Foo:
> cdef public char attr
>
> def __init__(self):
> self.attr = 0
>
> class Bar(Foo):
> def __init__(self):
> Foo.__init__(self)
> self.attr2 = None
>
> def mod(self):
> self.attr = c'B'
>
> f = Bar()
> f.mod()
>
> When I run Python and import it an exception is raised:

Yes, since you didn't cdef the class, it is essentially python code.
Python code cannot assign to a cdef class attribute that is not of
type 'object'.

> If the mod() method is defined in the base class it works properly.  Is this
> a Pyrex bug or am I not allowed to modify cdef'd attributes in subclasses?
> (Note that I want Bar visible outside the module, hence no "cdef".)

cdef does not affect visibility, IIRC, just whether the class is
compiled into an extension type or not.

This is just from memory though.  Greg would be able to give you a
better answer.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Running a process every N days

2007-06-07 Thread Klaas
On Jun 7, 3:27 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> What's the best way to run either an entire python process or a python
> thread every N days. I'm running Python 2.4.3 on Fedora Core 5 Linux.
> My code consists of a test and measurement system that runs 24/7 in a
> factory setting. It collects a lot of data and I'd like to remove all
> data older than 30 days. My ideal solution would be something that
> runs in the background but only wakes up to run every few days to
> check for old data.

google "cron"

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python rocks

2007-06-05 Thread Klaas
On Jun 3, 8:56 am, [EMAIL PROTECTED] (Alex Martelli) wrote:

> Allowing a trailing ! in method names has no such cost, because in no
> language I know is ! used as a "postfix unary operator"; the gain in the
> convention "mutators end with !" is not huge, but substantial.  So, the
> tradeoffs are different: small pain, substantial gain == not a bad idea.
>
> However, this is all quite theoretical, because no more PEPs will be
> accepted for Python 3000, so any language change like this would have to
> wait for Python 4000, which is no doubt quite a distant prospect:-).

Would it?  If it isn't backwards-incompatible, it could even go in 2.6.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Who uses Python?

2007-06-05 Thread Klaas
On Jun 4, 12:37 pm, walterbyrd <[EMAIL PROTECTED]> wrote:
> I mean other than sysadmins, programmers, and web-site developers?
>
> I have heard of some DBAs who use a lot of python.
>
> I suppose some scientists. I think python is used in bioinformatics. I
> think some math and physics people use python.
>
> I suppose some people use python to learn "programming" in general.
> Python would do well as a teaching language.
>
> I would think that python would be a good language for data analysis.
>
> Anything else? Finance? Web-analytics? SEO? Digital art?


Large-scale distributed systems...

-Mike


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory handling

2007-05-31 Thread Klaas
On May 31, 11:00 am, Thorsten Kampe <[EMAIL PROTECTED]> wrote:

> If it's swapped to disk than this is a big concern. If your Python app
> allocates 600 MB of RAM and does not use 550 MB after one minute and
> this unused memory gets into the page file then the Operating System
> has to allocate and write 550 MB onto your hard disk. Big deal.

You have a long-running python process that allocates 550Mb of _small_
objects and then never again uses more than a tenth of that space?

This is an abstract corner case, and points more to a multi-process
design rather than a flaw in python.

The unbounded size of python's int/float freelists is a slightly more
annoying problem, but nearly as trivial.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: File I/O

2007-05-09 Thread Klaas
On May 9, 2:43 pm, HMS Surprise <[EMAIL PROTECTED]> wrote:
> > [lst.append(list(line.split())) for line in file]
>
> Thanks, this is the direction I wanted to go, BUT I must use v2.2 so
> the line above gives me the error:
>
> AttributeError: __getitem__
>
> But the write format will be helpful.

(Change to file.xreadlines(). Btw, that snippet will fail miserably
for most data.)

Instead, use pickle:

import pickle
pickle.dump(lst_of_lst, open('outfile', 'wb'))
lst_of_lst = pickle.load(open('outfile', 'rb'))

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Towards faster Python implementations - theory

2007-05-09 Thread Klaas
On May 9, 10:02 am, John Nagle <[EMAIL PROTECTED]> wrote:

>  One option might be a class "simpleobject", from which other classes
> can inherit.  ("object" would become a subclass of "simpleobject").
> "simpleobject" classes would have the following restrictions:
>
> - New fields and functions cannot be introduced from outside
> the class.  Every field and function name must explicitly appear
> at least once in the class definition.  Subclassing is still
> allowed.
> - Unless the class itself uses "getattr" or "setattr" on itself,
> no external code can do so.  This lets the compiler eliminate the
> object's dictionary unless the class itself needs it.
>
> This lets the compiler see all the field names and assign them fixed slots
> in a fixed sized object representation.  Basically, this means simple objects
> have a C/C++ like internal representation, with the performance that comes
> with that representation.

Hey look, it already exists:

>>> class A(object):
... __slots__ = 'a b c d'.split()

>>> a = A()
>>> a.e = 2
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'A' object has no attribute 'e'
>>> hasattr(a, '__dict__')
False

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How safe is a set of floats?

2007-05-08 Thread Klaas
On May 4, 10:15 am, Paul McGuire <[EMAIL PROTECTED]> wrote:

> Just to beat this into the ground, "test for equality" appears to be
> implemented as "test for equality of hashes".  So if you want to
> implement a class for the purposes of set membership, you must
> implement a suitable __hash__ method.  It is not sufficient to
> implement __cmp__ or __eq__, which I assumed "test for equality" would
> make use of.  Not having a __hash__ method in my original class caused
> my initial confusion.

overriding __hash__ (even to raise NotImplementedError) is always wise
if you have overridden __eq__.  And of course __hash__ is necessary for
using hashtable-based structures (how else could it determine whether
objects are equal?  compare against every existing element?)

Finally, two objects which return the same __hash__ but return False
for __eq__ are, of course, unequal.  sets/dicts do not simply "test
for equality of hashes"
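
For example, a well-behaved immutable value class defines the pair
together (a sketch):

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return isinstance(other, Point) and \
               (self.x, self.y) == (other.x, other.y)
    def __hash__(self):
        # equal objects must have equal hashes
        return hash((self.x, self.y))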

> So would you suggest that any class implemented in a general-purpose
> class library should implement __hash__, since one cannot anticipate
> when a user might want to insert class instances into a set?  (It
> certainly is not on my current checklist of methods to add to well-
> behaved classes.)

a class should only be inserted into a set if it is immutable, and
thus designed as such.  Users might also execute 'del x.attr', so
perhaps you should start each method with a series of hasattr()
checks...

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python regular expressions just ain't PCRE

2007-05-08 Thread Klaas
On May 5, 6:57 pm, Wiseman <[EMAIL PROTECTED]> wrote:

> > There's also the YAGNI factor; most folk would restrict using regular
> > expressions to simple grep-like functionality and data validation --
> > e.g. re.match("[A-Z][A-Z]?[0-9]{6}[0-9A]$", idno). The few who want to
> > recognise yet another little language tend to reach for parsers, using
> > regular expressions only in the lexing phase.
>
> Well, I find these features very useful. I've used a complex, LALR
> parser to parse complex grammars, but I've solved many problems with
> just the PCRE lib. Either way seeing nobody's interested on these
> features, I'll see if I can expose PCRE to Python myself; it sounds
> like the fairest solution because it doesn't even deal with the re
> module - you can do whatever you want with it (though I'd rather have
> it stay as it is or enhance it), and I'll still have PCRE. That's if I
> find the time to do it though, even having no life.

A polished wrapper for PCRE would be a great contribution to the
python community.  If it becomes popular, then the argument for
replacing the existing re engine becomes much stronger.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: No speedup on multi-processor machine?

2007-04-23 Thread Klaas
On Apr 21, 5:14 pm, Fuzzyman <[EMAIL PROTECTED]> wrote:

> Additionally, extending IronPython from C# is orders of magnitude
> easier than extending CPython from C.

Given the existence of Pyrex, that statement is pretty difficult to
substantiate.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Select weirdness

2007-04-23 Thread Klaas
On Apr 23, 9:51 am, Ron Garret <[EMAIL PROTECTED]> wrote:
> In article <[EMAIL PROTECTED]>,
>  Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
>
> > Twisted does this out of the box, for what it's worth.
>
> Thanks.  I will look at that.

There is also asyncore in the standard library, which is a very light
pythonic wrapper around select() dispatching to handlers.  Works
great.
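
A minimal echo-server sketch along those lines:

import asyncore, socket

class EchoHandler(asyncore.dispatcher_with_send):
    def handle_read(self):
        data = self.recv(8192)
        if data:
            self.send(data)

class EchoServer(asyncore.dispatcher):
    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((host, port))
        self.listen(5)
    def handle_accept(self):
        pair = self.accept()
        if pair is not None:
            sock, addr = pair
            EchoHandler(sock)

EchoServer('localhost', 9999)
asyncore.loop()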

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Really badly structured Python Books.

2007-04-16 Thread Klaas
On Apr 14, 11:37 am, "Andre P.S Duarte" <[EMAIL PROTECTED]>
wrote:
> I started reading the beginning Python book. It is intended for people
> who are starting out in the Python world. But it is really
> complicated, because he tries to explain, then after a bad explanation
> he puts out a bad example. I really recommend NOT reading the book.
> For it will make you want not to continue in Python. This is just me
> letting the air out of my lungs. No need to reply this is just a
> recommendation. Txs for the opportunity .

I went ahead and didn't read the book, and I can feel the improvement
already!

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Queue enhancement suggestion

2007-04-16 Thread Klaas
On Apr 15, 11:12 pm, Paul Rubin  wrote:
> I'd like to suggest adding a new operation
>
>Queue.finish()
>
> This puts a special sentinel object on the queue.  The sentinel
> travels through the queue like any other object, however, when
> q.get() encounters the sentinel, it raises StopIteration instead
> of returning the sentinel.  It does not remove the sentinel from
> the queue, so further calls to q.get also raise StopIteration.
> That permits writing the typical "worker thread" as

This is a pretty good idea.  However, it needs a custom __iter__
method to work... the syntax below is wrong on many levels.

>for item in iter(q.get): ...

Once you implement __iter__, you are left with 'for item in q'.  The
main danger here is that all the threading synchro stuff is hidden in
the guts of the __iter__ implementation, which isn't terribly clear.
There is no way to handle Empty exceptions and use timeouts, for
instance.

> however that actually pops the sentinel, so if there are a lot of
> readers then the writing side has to push a separate sentinel for
> each reader.  I found my code cluttered with
>
> for i in xrange(number_of_worker_threads):
>q.put(sentinel)
>
> which certainly seems like a code smell to me.

Yeah, it kind of does.  Why not write a Queue + Worker manager that
keeps track of the number of workers, that has a .finish() method that
does this smelly task for you?
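
Something like this sketch (the class name is mine); it leaves the
sentinel in the queue so a single .finish() serves any number of
readers:

import Queue

class IterableQueue(Queue.Queue):
    _sentinel = object()

    def finish(self):
        self.put(self._sentinel)

    def __iter__(self):
        while True:
            item = self.get()
            if item is self._sentinel:
                self.put(self._sentinel)  # leave it for the other readers
                return
            yield item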

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: itertools, functools, file enhancement ideas

2007-04-10 Thread Klaas
On Apr 8, 9:34 am, Paul Rubin  wrote:
> [EMAIL PROTECTED] writes:

> > >   a) def flip(f): return lambda x,y: f(y,x)
> > Curious resemblance to:
> >itemgetter(1,0)
>
> Not sure I understand that.

I think he read it as lambda (x, y): (y, x)

More interesting would be functools.rshift/lshift, that would rotate
the positional arguments (with wrapping)

def f(a, b, c, d, e):
...
rshift(f, 3) --> g, where g(c, d, e, a, b) == f(a, b, c, d, e)
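
A sketch of such a helper:

def rshift(f, n):
    def g(*args):
        # rotate the received arguments so f sees its original order
        return f(*(args[n:] + args[:n]))
    return g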

Still don't see much advantage over writing a lambda (except perhaps
speed).

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problems with os.spawnv

2007-04-06 Thread Klaas
On Apr 5, 3:25 pm, "Henrik Lied" <[EMAIL PROTECTED]> wrote:

> > > I'd still love to get a working example of my problem using the
> > > Subprocess module. :-)
>
> > The same thing:
> > p = subprocess.Popen(["mencoder", "/users/...", "-ofps", ...])
>
> That example looked great at first, but on a closer look it didn't
> quite end up to be what I wanted. In a real environment the user still
> had to wait for the command to finish.

Then you are not using it correctly.  subprocess.Popen() returns
immediately.  Notice the order of events here:

In [2]: subprocess.Popen('sleep 2; echo foo', shell=True); print 'bar'
bar
foo

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why NOT only one class per file?

2007-04-04 Thread Klaas
On Apr 4, 2:52 pm, Thomas Krüger <[EMAIL PROTECTED]> wrote:
>
> At first: if he really like it he can place every class in a single
> file. But there are some reasons why Python "allows" you to place many
> classes in one file:
>
> - It's (a little bit) faster, no additional file system lookup is needed. ;)
> - You can define a class in a class. Django, for example, uses this for
> its data models. If you do this you are forced to have multiple classes
> in one file.
> Example: http://www.djangoproject.com/documentation/tutorial02/#make-the-poll-...

That is somewhat specious: inner classes can be defined in java too.

The main reason is that in java, classes are magical entities which
correspond to one "exportable" unit of code.  Thus it makes a great
deal of sense to limit to one _public_ class per file (java also
allows unlimited private and package-private classes defined in a
single file).

If you want to define a bunch of utility functions in java, you write
a file containing a single class with static methods.

In python, classes do not have special status.  The exportable unit of
code is a module, which, like public classes in java, can contain
functions, static variables, and classes.  Similar to java, you are
limited to a single module object per file (modulo extreme trickery).

If you want to define a bunch of utility functions in python, you
write a file containing a single module with functions.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: with timeout(...):

2007-03-27 Thread Klaas
On Mar 27, 3:28 pm, Paul Rubin  wrote:
> Nick Craig-Wood <[EMAIL PROTECTED]> writes:
> > It could be made to work I'm sure by getting the interpreter to check
> > for timeouts every few hundred bytecodes (like it does for thread
> > switching).
>
> Is there some reason not to use sigalarm for this?

 * doesn't work with threads
 * requires global state/handler
 * cross-platform?

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: with timeout(...):

2007-03-26 Thread Klaas
On Mar 26, 3:30 am, Nick Craig-Wood <[EMAIL PROTECTED]> wrote:
> Did anyone write a contextmanager implementing a timeout for
> python2.5?
>
> I'd love to be able to write something like
>
> with timeout(5.0) as exceeded:
> some_long_running_stuff()
> if exceeded:
> print "Oops - took too long!"
>
> And have it work reliably and in a cross platform way!

Doubt it.  But you could try:

import threading

class TimeoutException(BaseException):
    pass

class timeout(object):
    def __init__(self, limit_t):
        self.limit_t = limit_t
        self.timer = None
        self.timed_out = False
    def __nonzero__(self):
        return self.timed_out
    def __enter__(self):
        self.timer = threading.Timer(self.limit_t, ...)
        self.timer.start()
        return self
    def __exit__(self, exc_c, exc, tb):
        if exc_c is TimeoutException:
            self.timed_out = True
            return True   # suppress exception
        return False  # raise exception (maybe)

where '...' is a ctypes call to raise the given exception in the
current thread (the capi call PyThreadState_SetAsyncExc)
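
For reference, the usual recipe for that call looks something like this
(a sketch; the helper name is mine):

import ctypes

def async_raise(tid, exc_type):
    # ask the interpreter to raise exc_type in the thread with id 'tid'
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_long(tid), ctypes.py_object(exc_type))
    if res > 1:
        # it affected more than one thread: undo and complain
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_long(tid), None)
        raise SystemError("PyThreadState_SetAsyncExc failed")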

Definitely not fool-proof, as it relies on thread switching.  Also,
lock acquisition can't be interrupted, anyway.  Also, this style of
programming is rather unsafe.

But I bet it would work frequently.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: threading and iterator crashing interpreter

2007-03-12 Thread Klaas
On Mar 12, 1:10 pm, "Rhamphoryncus" <[EMAIL PROTECTED]> wrote:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1579370&grou...
>
> That refers to a generator crash.  You are using generators, but also
> getting a weird dict error.  Maybe related, maybe not.
>
> I'll figure out if I've got a "fixed" version or not when I get back.

I was the one who filed the first bug.  login2 is definitely the same
bug: you have a generator running in a thread, but the thread is being
garbage collected before the generator's .close() method is run (which
references the thread state).

Try the patch I posted in the bug above; if it works, then python
trunk/2.5-maint should work.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pep 3105: the end of print?

2007-02-16 Thread Klaas
On Feb 16, 2:31 pm, Sam <[EMAIL PROTECTED]> wrote:
>     pass
> except (ImportError, SyntaxError):
>     # python 3.0
>     print2 = print
> SyntaxError: invalid syntax
>
> Any and all aliasing must happen in compat26.py. My suggested solution
> is this:

Good catch.  Point is that it is not impossible.

-mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pep 3105: the end of print?

2007-02-16 Thread Klaas
On Feb 16, 6:01 am, "Edward K Ream" <[EMAIL PROTECTED]> wrote:

> That's the proof.  Can you find a flaw in it?

Casting this in terms of theorem proving only obfuscates the
discussion.

Here is how to maintain a single codebase for this feature:

1. Convert all your print statements to 3.0 print functions, named
something else (say, print2())
2. define a module called compat26 containing:

def print2(*args, **kwargs):
    # code to convert the above to print statements
    # (better still, sys.stdout.write())

3. in your code:
try:
    from compat26 import print2
except (ImportError, SyntaxError):
    # python 3.0
    print2 = print
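
For concreteness, a sketch of what compat26.print2 might contain:

import sys

def print2(*args, **kwargs):
    sep = kwargs.get('sep', ' ')
    end = kwargs.get('end', '\n')
    out = kwargs.get('file', sys.stdout)
    out.write(sep.join(str(a) for a in args) + end)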

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help with Optimization of Python software: real-time audio controller

2007-02-12 Thread Klaas
On Feb 11, 6:40 pm, [EMAIL PROTECTED] wrote:
> Currently, I have all of the above "working", although I'm running
> into some serious timing issues.  When I run the program, I get
> irregular timing for my metronome (if it sounds at all), as well as
> irregular timing in writing to the external device.  It's extremely
> crucial that threads #1 & #2 are executed as close to real-time as
> possible, as they form the "core" of the song, and their elements
> can't be delayed without "messing" the song up considerably.
>
> I've read up quite a bit on different optimization methods in Python,
> but am not sure which direction to head.  I've checked out profile,
> Psyco, Pyrex, as well as just porting everything over to C.  Since I'm
> on a Mac (Power PC), I can't use Psyco.  And doing any of the others
> seemed like a big enough project that I should really ask someone else
> before I embark.

Your problems do not necessarily stem from slow code.  Python only
performs thread context switches every 100 opcodes, and the switch
might not arrive at the right time.

You can lower this value (sys.setcheckinterval).  Your computationally-
intensive threads may effectively lower their priority by calling
time.sleep(.1) every so often.
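
For example:

import sys
sys.setcheckinterval(10)  # consider a thread switch every 10 bytecodes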

Ultimately, maintaining explicit control over the scheduling of
events is probably the way to go.

Pyrex is my preferred optimization method, but it can take some
knowledge of what's going on to get the most out of it.  numpy is
another option.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: huge amounts of pure Python code broken by Python 2.5?

2007-02-12 Thread Klaas
On Feb 10, 5:59 am, Brian Blais <[EMAIL PROTECTED]> wrote:
> Klaas wrote:
> > I have converted our 100 kloc from 2.4 to 2.5.  It was relatively
> > painless, and 2.5 has features we couldn't live without.
>
> Just out of curiosity, what features in 2.5 can you not live without?  I just
> migrated to 2.5, but haven't had much time to check out the cool new features.

Most important being the finalization of generators, which allowed me
to implement remote-rpc-yield elegantly.  It would have been possible
to do this using iterator classes, I admit.

with statements have also been a wonderful addition -- like decorators
for code blocks!

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: huge amounts of pure Python code broken by Python 2.5?

2007-02-09 Thread Klaas
On Feb 8, 6:37 pm, "kernel1983" <[EMAIL PROTECTED]> wrote:
> On Feb 9, 10:29 am, "Klaas" <[EMAIL PROTECTED]> wrote:

> > The changes listed don't seem particularly huge considering the size,
> > complexity, and boundary-pushingness of Twisted, coupled with the
> > magnitude of the 2.5 release.
>
> Just keep using python2.4

I have converted our 100 kloc from 2.4 to 2.5.  It was relatively
painless, and 2.5 has features we couldn't live without.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: huge amounts of pure Python code broken by Python 2.5?

2007-02-08 Thread Klaas
On Feb 6, 11:07 am, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> On Tue, 06 Feb 2007 08:40:40 -0700, Steven Bethard <[EMAIL PROTECTED]> wrote:
> >Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> > > Huge amounts of my pure Python code was broken by Python 2.5.
>
> >Interesting. Could you give a few illustrations of this? (I didn't run
> >into the same problem at all, so I'm curious.)
>
> There are about half a dozen examples linked from here:
>
>  http://twistedmatrix.com/trac/ticket/1867
>
> Check out the closed ticket linked from there or the changesets for more
> detail.

The changes listed don't seem particularly huge considering the size,
complexity, and boundary-pushingness of Twisted, coupled with the
magnitude of the 2.5 release.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Built-in datatypes speed

2007-02-08 Thread Klaas
On Feb 7, 2:34 am, Maël Benjamin Mettler <[EMAIL PROTECTED]>
wrote:
> Anyway, I reimplemented parts of TigerSearch 
> (http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/) in Python.
> I am currently writing the paper that goes along with this
> reimplementation. Part of the paper deals with the
> differences/similarities in the original Java implementation and my
> reimplementation. In order to superficially evaluate differences in
> speed, I used this paper 
> (http://www.ubka.uni-karlsruhe.de/cgi-bin/psview?document=ira/2000/5&f...
> ) as a reference. Now, this is not about speed differences between Java
> and Python, mind you, but about the speed built-in datatypes
> (dictionaries, lists etc.) run at. As far as I understood it from the
> articles and books I read, any method call from these objects run nearly
> at C-speed (I use this due to lack of a better term), since these parts
> are implemented in C. Now the question is:
>
> a) Is this true?
> b) Is there a correct term for C-speed and what is it?

I think the statement is highly misleading.  It is true that most of
the underlying operations on native data types are implemented in c.
If the operations themselves are expensive, they could run close to
the speed of a suitably generic c implementation of, say, a
hashtable.  But with richer data types, you run good chances of
landing back in pythonland, e.g. via __hash__, __eq__, etc.

Also, method dispatch to c is relatively slow.  A loop such as:

lst = []
for i in xrange(int(10e6)):
lst.append(i)

will spend most of its time in method dispatch and iterating, and very
little in the "guts" of append().

Those guts, mind, will be quick.

-Mike


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The reliability of python threads

2007-01-24 Thread Klaas
On Jan 24, 5:18 pm, Paul Rubin <http://[EMAIL PROTECTED]> wrote:
> "Klaas" <[EMAIL PROTECTED]> writes:
> > CPython is more that "a particular implementation" of python,

> It's precisely a particular implementation of Python.  Other
> implementations include Jython, PyPy, and IronPython.

I did not deny that it is an implementation of Python.  I deny that it
is but an implementation of Python.

Jython: several versions behind, used primarily for interfacing with
java
PyPy: years away from being a practical platform for replacing CPython
IronPython: best example you've given, but still probably three or four
orders of magnitude less significant than CPython

> >  and the GIL is more than an "artifact".  It is a central tenet of
> > threaded python programming.

> If it's a central tenet of threaded python programming, why is it not
> mentioned at all in the language or library manual?

The same reason why IE CSS quirks are not delineated in the HTML 4.01
spec.  This doesn't mean that they aren't central to css web
programming (they are).

How could the GIL, which limits the number of threads in which python
code can be run in a single process to one, NOT be a central part of
threaded python programming?

> The threading
> module documentation describes the right way to handle thread
> synchronization in Python, and that module implements traditional
> locking approaches without reference to the GIL.

No-one has argued that the GIL should be used instead of
threading-based locking.  How could they? The two concepts are not
interchangeable and while they affect each other, are two different
things entirely.  In the post you responded to and quoted I said:

> > I don't advocate relying on the GIL to manage shared data when
> > threading, 

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The reliability of python threads

2007-01-24 Thread Klaas
On Jan 24, 4:11 pm, Paul Rubin <http://[EMAIL PROTECTED]> wrote:
> "Klaas" <[EMAIL PROTECTED]> writes:
> > POSIX issues aside, Python's threading model should be less susceptible
> > to memory-barrier problems that are possible in other languages (this
> > is due to the GIL).

> But the GIL is not part of Python's threading model; it's just a
> particular implementation artifact.  Programs that rely on it are
> asking for trouble.

CPython is more that "a particular implementation" of python, and the
GIL is more than an "artifact".  It is a central tenet of threaded
python programming.

I don't advocate relying on the GIL to manage shared data when
threading, but 1) it is useful for the reasons I mention 2) the OP's
question was almost certainly about an application written for  and run
on CPython.

> > Double-checked locking, frinstance, is safe in python even though it
> > isn't in java.

> What's that?

google.com

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The reliability of python threads

2007-01-24 Thread Klaas
On Jan 24, 10:43 am, "Carl J. Van Arsdall" <[EMAIL PROTECTED]>
wrote:

> Yea, typically I would think that.  The problem I am seeing is
> incredibly intermittent.  Like a simple pyro server that gives me a
> problem maybe every three or four months.  Just something funky will
> happen to the state of the whole thing, some bad data, i'm having an
> issue tracking it down and some more experienced programmers mentioned
> that its most likely a race condition.  THe thing is, I'm really not
> doing anything too crazy, so i'm having difficult tracking it down.  I
> had heard in the past that there may be issues with threads, so I
> thought to investigate this side of things.

POSIX issues aside, Python's threading model should be less susceptible
to memory-barrier problems that are possible in other languages (this
is due to the GIL).  Double-checked locking, frinstance, is safe in
python even though it isn't in java.

Are you ever relying solely on the GIL to access shared data?

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The reliability of python threads

2007-01-24 Thread Klaas


On Jan 24, 10:43 am, "Carl J. Van Arsdall" <[EMAIL PROTECTED]>
wrote:
> Chris Mellon wrote:
> > On 24 Jan 2007 18:21:38 GMT, Nick Maclaren <[EMAIL PROTECTED]> wrote:
>
> >> [snip]
>
> > I'm aware of the issues with the POSIX threading model. I still stand
> > by my statement - bringing up the problems with the provability of
> > correctness in the POSIX model amounts to FUD in a discussion of
> > actual problems with actual code.
>
> > Logic and programming errors in user code are far more likely to be
> > the cause of random errors in a threaded program than theoretical
> > (I've never come across a case in practice) issues with the POSIX
> > standard.Yea, typically I would think that.  The problem I am seeing is
> incredibly intermittent.  Like a simple pyro server that gives me a
> problem maybe every three or four months.  Just something funky will
> happen to the state of the whole thing, some bad data, i'm having an
> issue tracking it down and some more experienced programmers mentioned
> that its most likely a race condition.  THe thing is, I'm really not
> doing anything too crazy, so i'm having difficult tracking it down.  I
> had heard in the past that there may be issues with threads, so I
> thought to investigate this side of things.
>
> It still proves difficult, but reassurance of the threading model helps
> me focus my efforts.
>
> > Emphasizing this means that people will tend to ignore bugs as being
> > "the fault of POSIX" rather than either auditing their code more
> > carefully, or avoiding threads entirely (the second being what I
> > suspect your goal is).
>
> > As a last case, I should point out that while the POSIX memory model
> > can't be proven safe, concrete implementations do not necessarily
> > suffer from this problem.Would you consider the Linux implementation of 
> > threads to be concrete?
>
> -carl
>
> --
>
> Carl J. Van Arsdall
> [EMAIL PROTECTED]
> Build and Release
> MontaVista Software

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem installing cElementTree on Python 2.5

2007-01-10 Thread Klaas
Piet van Oostrum wrote:
> I have just installed Python 2.5 on Mac OS X 10.4.8 on an iBook (PPC) from
> the dmg. Now I tried to install cElementTree -1.0.5-20 from source (no egg
> available in cheeseshop) and got the following compilation error:

python2.5 ships with cElementTree:

import xml.etree.cElementTree

cheers,
-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Traceback of hanged process

2007-01-08 Thread Klaas
Hynek Hanke wrote:
> Hello,
>
> please, how do I create a pythonic traceback from a python process that
> hangs and is not running in an interpreter that I executed manually
> or it is but doesn't react on CTRL-C etc? I'm trying to debug a server
> implemented in Python, so I need some analog of 'gdb attach' for C.
>
> Unfortunatelly, googling and reading documentation revealed nothing, so
> please excuse if this question is dumb.

In python2.5, you can run a background thread that listens on a port or
unix socket, and prints a formatted version of sys._current_frames() to
stderr.  
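
The core of such a thread might look like this sketch:

import sys, traceback

def dump_all_stacks(out):
    # write a formatted stack for every live thread (python >= 2.5)
    for tid, frame in sys._current_frames().items():
        out.write('Thread %s:\n' % tid)
        traceback.print_stack(frame, file=out)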

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Set type?

2007-01-04 Thread Klaas

Fredrik Lundh wrote:

> > if type(var) is types.SetType:
> >blah
> >
> > but that is not available in types module.  I am using 2.4
>
>  # set or subclass of set
>  if isinstance(var, set):
>  ...

or

if isinstance(var, (set, frozenset)):
    ...

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Wow, Python much faster than MatLab

2006-12-31 Thread Klaas

sturlamolden wrote:

> as well as looping over the data only once. This is one of the main
> reasons why Fortran is better than C++ for scientific computing. I.e.
> instead of
>
> for (i=0; i<n; i++)
>array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);
>
> one actually gets something like three intermediates and four loops:
>
> tmp1 = malloc(n*sizeof(whatever));
> for (i=0; i<n; i++) tmp1[i] = array1[i] + array2[i];
> tmp2 = malloc(n*sizeof(whatever));
> for (i=0; i<n; i++) tmp2[i] = array3[i] + array4[i];
> tmp3 = malloc(n*sizeof(whatever));
> for (i=0; i<n; i++) tmp3[i] = tmp1[i] + tmp2[i];
> free(tmp1);
> free(tmp2);
> for (i=0; i<n; i++) array1[i] = tmp3[i];
> free(tmp3);

C/C++ do not allocate extra arrays.  What you posted _might_ bear a
small resemblance to what numpy might produce (if using vectorized
code, not explicit loop code).  This is entirely unrelated to the
reasons why fortran can be faster than c.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pyparsing announcement?

2006-12-22 Thread Klaas
Paul McGuire wrote:
> I have tried a couple of times now to post an announcement of the latest
> version of pyparsing, but it does not seem to be making it past the news
> server, neither through my local ISP's server nor through GoogleGroups.
> Could it be because I am also posting to comp.lang.python.announce, and this
> moderated group is holding up posts to all groups?  (Doesn't really make
> sense to me, but it's all I can come up with.)

Moderated usenet is implemented by emailing the post to the moderator,
who then posts the message to usenet with special headers.  A
cross-posted message is a single message, so yes, it will get held up
until the moderator approves it for _all_ groups.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: using methods base64 module in conjunction with Crypto.Hash.SHA256

2006-12-20 Thread Klaas

[EMAIL PROTECTED] wrote:
> I am attempting to implement a process, and I'm pretty sure that a
> major roadblock is that I do not understand the nomenclature.  The
> specs indicate that the goal is to calculate a message digest using an
> SHA-256 algorithm.  There are 2 examples included with the specs.  The
> label on the 2 examples are: 'HMAC samples'.  In both examples, the
> message on which the digest is to be calculated is (the 33 chars within
> the quotes):
>
> 'This is a test of VISION services'
>
> In the first example, the value labeled 'Shared key' is the 44
> characters within the quotes:
> '6lfg2JWdrIR4qkejML0e3YtN4XevHvqowDCDu6XQEFc='

I doubt it.  That is a base64 encoded value, not the value itself.

[snip]
> My interpretation of the first example is this: when you use an SHA-256
> algorithm to calculate a message digest on the message 'This is a test
> of VISION services' where the key is
> '6lfg2JWdrIR4qkejML0e3YtN4XevHvqowDCDu6XQEFc=',

This isn't the key, but the base64-encoded key.

> the result should be:
> 'KF7GkfXkgXFNOgeRud58Oqx2equmKACAwzqQHZnZx9A=' .

This isn't the result, but the base64-encoded result.

> 2) If the interpretation of the first example is on target, do you see
> anything above in the use of the SHA256, HMAC and base64
> classes/methods that indicates that I did not correctly implement the
> process?

You should base64 decode the key before passing it to the HMAC
constructor.
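
In other words, the flow should look something like this (a sketch,
using hashlib's sha256 as the digest; compare the result against the
sample value from the specs):

import base64, hmac, hashlib

key = base64.b64decode('6lfg2JWdrIR4qkejML0e3YtN4XevHvqowDCDu6XQEFc=')
msg = 'This is a test of VISION services'
digest = hmac.new(key, msg, hashlib.sha256).digest()
print base64.b64encode(digest)
# expected: 'KF7GkfXkgXFNOgeRud58Oqx2equmKACAwzqQHZnZx9A='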

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Cpoying a PyList to a C string array

2006-12-19 Thread Klaas

Sheldon wrote:

> Thanks Mike,
>
> I am rewriting the code but I don't understand the part about the c
> struct variable called work. The function I posted is a part of a
> larger script and I just posted that part that was problamatic. I was
> under the impression that if I declared the structure as global with
> the variable in tow:
>
> struct my_struct {
> int var;
> } work;
>
> then this is visible everywhere in the function as long as everything
> is in one file. Did I miss something?

It's not important how you declare the struct.  It matters what is in
the struct (in particular, the data types of the members, and what
initialization you've done to them).

The important part was the rest of my message.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Cpoying a PyList to a C string array

2006-12-19 Thread Klaas
Sheldon wrote:
> The code below is a rookie attempt to copy a python list of strings to
> a string array in C. It works to some extent but results in memory
> problems when trying to free the C string array. Does anyone know how
> to do this properly?

You have numerous problems in this code.  The most important problem is
that you are referring to global variables which appear to be c structs
but you don't provide the definition (e.g., "work").  However, I can
guess some of the issues:

>   for (i = 0; i < work.sumscenes; i++) {
> msgop = PyList_GetItem(work.msgobj, i);
> work.msg_scenes[i] = PyString_AsString(msgop);
> ppsop = PyList_GetItem(work.ppsobj, i);
> work.pps_scenes[i] = PyString_AsString(ppsop);
>   }

PyString_AsString returns a pointer to the internal buffer of the
python string.  If you want to be able to free() it (or indeed have it
exist beyond the lifetime of the associated python string), you
need to malloc() memory and strcpy() the data.  If the strings contain
binary data, you should be using PyString_AsStringAndSize.  see
http://docs.python.org/api/stringObjects.html.

I notice that you are doing no error checking or ref counting, but my
(inexperienced python c programming) opinion is that it should work
(neither api could potentially call python code, so I don't think
threading is an issue).

>   for (i = 0; i < NumberOfTiles; i++) {
> tileop  = PyList_GetItem(work.tileobj, i);
> work.tiles[i] = PyString_AsString(tileop);
> sceneop = PyList_GetItem(work.nscenesobj, i);
> work.nscenes[i] = PyInt_AsLong(sceneop);
>   }
>   return 1;

Similarly.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: sha, PyCrypto, SHA-256

2006-12-18 Thread Klaas

Dennis Benzinger wrote:
>
> Python 2.5 comes with SHA-256 in the hashlib module.
> So you could install Python 2.5 instead of the PyCrypto module.

You can download the python2.5 hashlib module for use with python2.4

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: merits of Lisp vs Python

2006-12-08 Thread Klaas

Aahz wrote:

> As for your claims about speed, they are also nonsense; I doubt one
> would find an order of magnitude increase of speed for production
> programs created by a competent Lisp programmer compared to programs
> created by a competent Python programmer.

Lisp can be compiled into an executable that has c-like speeds.  It can
be much faster than python.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: I think Python is a OO and lite version of matlab

2006-12-08 Thread Klaas


On Dec 7, 11:48 pm, "Allen" <[EMAIL PROTECTED]> wrote:
> Does anyone agree with me?
> If you have used Matlab, welcome to discuss it.

Numpy definitely was inspired in its extended array syntax by matlab.
Besides that, I don't think two languages could be more different.
Philosophically, matlab is closer to perl.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: len() and PEP 3000

2006-12-06 Thread Klaas
Beliavsky wrote:
> Thomas Guettler wrote:
> > Hi,
> >
> > The function len() is not mentioned in the Python 3000 PEPs.
> >
> > I suggest that at least lists, tupples, sets, dictionaries and strings
> > get a len() method. I think the len function can stay, removing it
> > would break to much code. But adding the method, would bu usefull.
> >
> > Yes, I know, that I can call .__len__() but that is ugly.
>
> I agree with you -- a.__len__() is ugly compared to len(a) . I am
> surprised that such common idioms as len(a) may be going away. It is a
> virtue of Python that it supports OOP without forcing OOP syntax upon
> the user. How can one be confident that Python code one writes now has
> a future?

len() is not going away.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: What are python closures realy like?

2006-12-06 Thread Klaas

Michele Simionato wrote:

> I believe decorators are in large part responsible for that. A callable
> object does not work
> as a method unless you define a custom __get__, so in decorator
> programming it is
> often easier to use a closure. OTOH closures a not optimal if you want
> persistency
> (you cannot pickle a closure) so in that case I use a callable object
> instead.

Note that it isn't necessary to write the descriptor yourself.  The
'new' module takes care of it:

In [1]: class A(object):
   ...:     pass
In [2]: a = A()
In [3]: class Method(object):
   ...:     def __call__(mself, oself):
   ...:         print mself, oself
In [4]: import new
In [5]: a.method = new.instancemethod(Method(), a, A)
In [6]: a.method()
<__main__.Method object at 0xb7ab7f6c> <__main__.A object at 0xb7ab79ec>

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Possible to assure no "cyclic"/"uncollectible" memory leaks?

2006-12-02 Thread Klaas
Joe Peterson wrote:
> I've been doing a lot of searching on the topic of one of Python's more
> disturbing issues (at least to me): the fact that if a __del__ finalizer
> is defined and a cyclic (circular) reference is made, the garbage
> collector cannot clean it up.

It is a somewhat fundamental limitation of GCs, if you want to support:

1. __del__ that can resurrect objects and is deterministically called
when objects are destroyed
2. the "view" of alive objects by __del__ methods is consistent
3. no crashing

If there is a cycle of objects containing __del__ methods, there is
clearly no way of knowing a safe order of invoking them.

> First of all, it seems that it's best to avoid using __del__.  So far, I
> have never used it in my Python programming.  So I am safe there.  Or am
> I?  Also, to my knowledge, I have never created a cyclic reference, but
> we do not typically create bugs intentionally either (and there are
> certainly times when it is an OK thing to do).

It is good practice to avoid __del__ unless there is a compelling
reason to do so.  weakref resource management is much safer.  Note that
it is pretty much impossible to avoid creating reference cycles--they
have a tendency to sneak into unsuspecting places (for instance, bound
methods can be a subtle source of cycles).
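
For illustration, a sketch of the weakref approach (the names are mine;
the module-level set keeps the weakref objects alive so their callbacks
actually fire):

import weakref

_live_refs = set()

def _close(ref, f):
    _live_refs.discard(ref)
    f.close()

class Owner(object):
    def __init__(self, path):
        f = open(path)
        self.f = f
        # the callback closes over the file, never over self,
        # so it cannot resurrect or leak the owner
        _live_refs.add(weakref.ref(self, lambda r, f=f: _close(r, f)))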

> Still, it's not comforting to know that it is possible to create a
> situation that would create a memory leak using a language that is
> supposed to relieve us of that worry.  I understand the problem, but it
> would be nice to know that as a programmer, I could be assured that
> Python would always deal with memory management and that memory leaks
> were not something I had to think about.

It is unrealistic to ever be completely relieved of such worry, since
it is always possible to accidentally hold on to a strong reference to
data that should actually be "garbage".  But your question perhaps
precludes these kinds of memory leak.  In that case, it is a matter of
providing to the programmer sufficiently fine-grained abstractions such
that the compiler can reason about their safety.  For instance, an
included weakref-based resource cleanup scheme has been discussed and
would cover many of the current uses of __del__.  It would also be nice
to remove some of the hidden "gotchas" that are inherent in CPython,
like the integer and float object freelist (not necessarily removing
those features, but providing some mechanism for reclaiming them when
they get out of hand).

These things can reduce the possibility of a problem, but (IMO) can
never completely obviate it.

> So here's a question: if I write Python software and never use __del__,
> can I guarantee that there is no way to create a memory leak?  What
> about system libraries - do any of them use __del__, and if so, are they
> written in such a way that it is not possible to create a cyclic reference?

It is always possible to create a cyclic reference by monkeypatching a
class.  Here are the stdlib modules which use __del__:
$ find -name \*.py | xargs grep __del__ | grep -v test
./Mac/Demo/sound/morselib.py:def __del__(self):
./Lib/telnetlib.py:def __del__(self):
./Lib/plat-mac/EasyDialogs.py:def __del__(self):
./Lib/plat-mac/FrameWork.py:def __del__(self):
./Lib/plat-mac/MiniAEFrame.py:def __del__(self):
./Lib/plat-mac/Audio_mac.py:def __del__(self):
./Lib/plat-mac/videoreader.py:def __del__(self):
./Lib/fileinput.py:def __del__(self):
./Lib/subprocess.py:def __del__(self):
./Lib/gzip.py:def __del__(self):
./Lib/wave.py:def __del__(self):
./Lib/wave.py:def __del__(self):
./Lib/popen2.py:def __del__(self):
./Lib/lib-tk/Tkdnd.py:def __del__(self):
./Lib/lib-tk/tkFont.py:def __del__(self):
./Lib/lib-tk/Tkinter.py:def __del__(self):
./Lib/lib-tk/Tkinter.py:def __del__(self):
./Lib/urllib.py:def __del__(self):
./Lib/tempfile.py:# __del__ is called.
./Lib/tempfile.py:def __del__(self):
./Lib/tarfile.py:def __del__(self):
./Lib/socket.py:def __del__(self):
./Lib/zipfile.py:fp = None   # Set here since __del__ checks it
./Lib/zipfile.py:def __del__(self):
./Lib/httplib.py:def __del__(self):
./Lib/bsddb/dbshelve.py:def __del__(self):
./Lib/bsddb/dbshelve.py:def __del__(self):
./Lib/bsddb/__init__.py:def __del__(self):
./Lib/bsddb/dbtables.py:def __del__(self):
./Lib/idlelib/MultiCall.py:def __del__(self):
./Lib/idlelib/MultiCall.py:def __del__(self):
./Lib/idlelib/MultiCall.py:def __del__(self):
./Lib/sunau.py:def __del__(self):
./Lib/sunau.py:def __del__(self):
./Lib/poplib.py:#__del__ = quit
./Lib/_threading_local.py:def __del__(self):
./Lib/aifc.py:def __del__(self):
./Lib/dumbdbm.py:# gets called.  One place _commit() gets called is from __del__(),
./Lib/dumbdbm.py:# be called from __del__().  Therefore we must never reference a
./Lib/dumbdbm.py:__del__ = close
./Lib/wsgiref

Fun with with

2006-12-01 Thread Klaas
Occasionally I find myself wanting a block that I can break out of at
arbitrary depth--like java's named break statements.  Exceptions can
obviously be used for this, but it doesn't always look nice.

The with statement can be used to whip up something quite usable:

class ExitBlock(object):
    """ A context manager that can be broken out of at an arbitrary
    depth, using .exit() """
    def __init__(self):
        class UniqueException(BaseException):
            pass
        self.breakExc = UniqueException
    def exit(self):
        raise self.breakExc()
    def __enter__(self):
        return self
    def __exit__(self, t, v, tb):
        return t is self.breakExc

Now the most important thing here is that each exit block creates a
unique exception type.  If you have philosophical issues with creating
unboundedly many type objects, you can use unique instances too.

This allows named break-out-able blocks:

from __future__ import with_statement
import sys

with ExitBlock() as ex1:
    with ExitBlock() as ex2:
        with ExitBlock() as ex3:
            while True:
                ex2.exit()
                print 'not displayed'
    print 'execution proceeds here from ex2.exit()'
    while True:
        for x in xrange(sys.maxint):
            while True:
                ex1.exit()
                print 'not displayed'
print 'execution proceeds here from ex1.exit()'

The only danger is bare except (or except BaseException) inside the
block.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is python memory shared between theads?

2006-12-01 Thread Klaas

John Henry wrote:
> Wesley Henwood wrote:

> > Is this normal behavior?  Based on the little documentation I have been
> > able to find on this topic, it is normal behavior.  The only way to use
> > same-named variables in scripts is to have them run in a different
> > process, rather than different threads.
>
> Yes and No.
>
> local variables are local to each threads.   Global variables are
> global to the threads.

That is somewhat misleading.  _All_ variables accessible from two
threads are shared.  This includes globals, but also object attributes
and even local variables (you could create a closure to share a local
among threads).
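
For instance, a minimal sketch of a closed-over local mutated by
several threads:

import threading

def make_counter():
    count = [0]                    # a local of make_counter...
    def bump():
        count[0] += 1              # ...but shared by every thread calling bump
    return bump, count

bump, count = make_counter()
threads = [threading.Thread(target=bump) for i in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print count[0]                     # 10: one local, mutated by ten threads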

The only reason locals appear "thread-local" is that locals are
"invocation-local" in that they are different bindings every time a
function is executed, and generally a single invocation of a function
is confined to a single thread.

Another way to share local variables is to create a generator, and call
.next() in two different threads... the local variables are
simultaneously modifiable by both threads.

FWIW, there is also threading.local().

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: What are python closures realy like?

2006-12-01 Thread Klaas
Karl Kofnarson wrote:
> Hi,
> while writing my last program I came upon the problem
> of accessing a common local variable by a bunch of
> functions.
> I wanted to have a function which would, depending on
> some argument, return other functions all having access to
> the same variable. An OO approach would do but why not
> try out closures...
> So here is a simplified example of the idea:
> def fun_basket(f):
>     common_var = [0]
>     def f1():
>         print common_var[0]
>         common_var[0]=1
>     def f2():
>         print common_var[0]
>         common_var[0]=2
>     if f == 1:
>         return f1
>     if f == 2:
>         return f2
> If you call f1 and f2 from the inside of fun_basket, they
> behave as expected, so common_var[0] is modified by
> whatever function operates on it.
> However, calling f1=fun_basket(1); f2 = fun_basket(2) and
> then f1(); f2() returns 0 and 0. It is not the way one would
> expect closures to work, knowing e.g. Lisp make-counter.
> Any ideas what's going on behind the scene?

Python can be read quite literally.  "common_var" is a local variable
to fun_basket, hence it is independent among invocations of fun_basket.
"def" is a statement that creates a function when it is executed.  If
you execute the same def statement twice, two different functions are
created.  Running fun_basket twice creates four closures, and the first
two have no relation to the second two.  The two sets close over
different cell variables.

If you want to share data between function invocations, you need an
object which persists between calls. You can use a global variable, or
a default argument.  But since the value is shared every time the
function is called, I don't see the value in using a closure.  I don't
know lisp very well, but in my mind the whole point of closures is that
you can reference a different unique cell each time.
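
For completeness, a sketch of the make-counter-style fix: return both
functions from a single invocation, so that they close over the same
cell:

def fun_basket():
    common_var = [0]
    def f1():
        print common_var[0]
        common_var[0] = 1
    def f2():
        print common_var[0]
        common_var[0] = 2
    return f1, f2

f1, f2 = fun_basket()
f1()    # prints 0
f2()    # prints 1 -- both closures share one cell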

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Remarkable results with psyco and sieve of Eratosthenes

2006-11-30 Thread Klaas
Klaus Alexander Seistrup wrote:
> Pekka Karjalainen wrote:
>
> > You can omit the call to math.sqrt if you test this instead.
> >
> > y*y > x
> >
> > in place of if y > maxfact: .
>
> Or use
>
>   sqrt = lambda x: x ** .5

Test it:

$ python -m timeit -s "from math import sqrt" "sqrt(5.6)"
1000000 loops, best of 3: 0.445 usec per loop
$ python -m timeit -s "sqrt = lambda x: x**.5" "sqrt(5.6)"
1000000 loops, best of 3: 0.782 usec per loop

Note that this overhead is almost entirely in function calls; calling
an empty lambda is more expensive than a c-level sqrt:

$ python -m timeit -s "sqrt = lambda x: x" "sqrt(5.6)"
1000000 loops, best of 3: 0.601 usec per loop

Just math ops:
$ python -m timeit -s "x = 5.6" "x*x"
1000000 loops, best of 3: 0.215 usec per loop
$ python -m timeit -s "x = 5.6" "x**.5"
1000000 loops, best of 3: 0.438 usec per loop

Of course, who knows what psyco does with this under the hood.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to increase the speed of this program?

2006-11-28 Thread Klaas

Klaas wrote:
> Klaas wrote:
>
> > In fact, you can make it about 4x faster by balancing:
> >
> > [EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
> > "array('c','\0'*200)*500"
> > 10000 loops, best of 3: 32.4 usec per loop
>
> This is an unclean minimally-tested patch which achieves reasonable
> performance (about 10x faster than unpatched python):



Never mind, that patch is bogus.  A updated patch is here:
http://sourceforge.net/tracker/index.php?func=detail&aid=1605020&group_id=5470&atid=305470

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to increase the speed of this program?

2006-11-28 Thread Klaas

Klaas wrote:

> In fact, you can make it about 4x faster by balancing:
>
> [EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
> "array('c','\0'*200)*500"
> 10000 loops, best of 3: 32.4 usec per loop

This is an unclean minimally-tested patch which achieves reasonable
performance (about 10x faster than unpatched python):

$ ./python -m timeit -s "from array import array" "array('c',
'\0')*10"
10000 loops, best of 3: 71.6 usec per loop

You have my permission to use this code if you want to submit a patch
to sourceforge (it needs, proper benchmarking, testing, and tidying).

-Mike

Index: Modules/arraymodule.c
===================================================================
--- Modules/arraymodule.c   (revision 52849)
+++ Modules/arraymodule.c   (working copy)
@@ -680,10 +680,29 @@
return NULL;
p = np->ob_item;
nbytes = a->ob_size * a->ob_descr->itemsize;
-   for (i = 0; i < n; i++) {
-   memcpy(p, a->ob_item, nbytes);
-   p += nbytes;
-   }
+
+if (n) {
+  Py_ssize_t chunk_size = nbytes;
+  Py_ssize_t copied = 0;
+  char *src = np->ob_item;
+
+  /* copy first element */
+  memcpy(p, a->ob_item, nbytes);
+  copied += nbytes;
+
+  /* copy exponentially-increasing chunks */
+  while(chunk_size < (size - copied)) {
+memcpy(p + copied, src, chunk_size);
+copied += chunk_size;
+if(chunk_size < size/10)
+  chunk_size *= 2;
+  }
+  /* copy remainder */
+  while (copied < size) {
+memcpy(p + copied, src, nbytes);
+copied += nbytes;
+  }  
+}
return (PyObject *) np;
 }

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to increase the speed of this program?

2006-11-28 Thread Klaas
John Machin wrote:

> Thanks, that's indeed faster than array(t, [v]*n) but what I had in
> mind was something like an additional constructor:
>
> array.filledarray(typecode, repeat_value, repeat_count)
>
> which I speculate should be even faster. Looks like I'd better get a
> copy of arraymodule.c and start fiddling.
>
> Anyone who could use this? Suggestions on name? Argument order?
>
> Functionality: same as array.array(typecode, [repeat_value]) *
> repeat_count. So it would cope with array.filledarray('c', "foo", 10)

Why not just optimize array.__mul__?  The difference is clearly in the
repeated memcpy() in arraymodule.c:683.  Pseudo-unrolling the loop in
python demonstrates a speed up:

[EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
"array('c',['\0'])*10"
100 loops, best of 3: 3.14 msec per loop
[EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
"array('c',['\0','\0','\0','\0'])*25000"
1000 loops, best of 3: 732 usec per loop
[EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
"array('c','\0'*20)*5000"
10000 loops, best of 3: 148 usec per loop

Which is quite close to your fromstring solution:

[EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
"array('c').fromstring('\0'*10)"
10000 loops, best of 3: 137 usec per loop

In fact, you can make it about 4x faster by balancing:

[EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
"array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop

For the record:

[EMAIL PROTECTED] ~]$ python -m timeit -s "from array import array"
"array('c','\0'*10)"
10000 loops, best of 3: 140 usec per loop

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: The Python Papers Edition One

2006-11-23 Thread Klaas
Tennessee writes:
> * If you say LaTex, I'll eat your brain. Or my hat. Unless I'm
> seriously underrating it, but I don't think so.

Why?  It is a suitable solution to this problem.  You can produce
unformatted content, then produce pdf and html pages from it.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: utf - string translation

2006-11-22 Thread Klaas
David H Wild wrote:
> In article <[EMAIL PROTECTED]>,
>John Machin <[EMAIL PROTECTED]> wrote:
> > So why do you want to strip off accents? The history of communication
> > has several examples of significant difference in meaning caused by
> > minute differences in punctuation or accents including one of which you
> > may have heard: a will that could be read (in part) as either "a chacun
> > d'eux million francs" or "a chacun deux million francs" with the
> > remainder to a 3rd party.
>
> The difference there, though, is a punctuation character, not an accent.

It's not too hard to imagine an accentual difference, eg:

Le soldat protège avec le fusil --> the soldier protects with the gun
Le soldat protégé avec le fusil --> the soldier who is protected by
the gun (perhaps a cannon)

Contrived example, I realize, but there are scads of such instances.
(Caveat: my french is also very rusty).

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: dict.reserve and other tricks

2006-11-17 Thread Klaas
[EMAIL PROTECTED] wrote:
> Klaas:
>
> > Well, you can reduce the memory usage to virtually nothing by using a
> > generator expression rather than list comprehension.
>
> Are you sure? I don't think so. Can you show a little example?

Sorry, that was boneheaded and wrong.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: cPickle problems

2006-11-16 Thread Klaas

Jeff  Poole wrote:
> Good idea.  Well, I did that, and I found out that the object causing
> problems is a ParseResults object (a class from PyParsing) and that the
> __getstate__ member is in fact an empty string ('').  I'm not sure
> where this leaves me...  The PyParsing code clearly never creates such

Sounds like ParseResults is not intended to be pickable.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: dict.reserve and other tricks

2006-11-16 Thread Klaas
[EMAIL PROTECTED] wrote:
> I have started doing practice creating C extensions for CPython, so
> here are two ideas I have had, possibly useless.
>
> If you keep adding elements to a CPython dict/set, it periodically
> rebuilds itself. So maybe dict.reserve(n) and a set.reserve(n) methods
> may help, reserving enough (empty) memory for about n *distinct* keys
> the programmer wants to add to the dict/set in a short future. I have
> seen that the the C API of the dicts doesn't allow this, and I don't
> know if this can be implemented modifying the dicts a bit. Do you think
> this may be useful?

It has been proposed before and rejected.  How often is dict creation a
bottleneck in python apps?  I'd guess not often.  Do you know of any
examples?

Optimizing space use also isn't terribly compelling, as a "tight" dict
can be created from a "loose" dict d using dict(d).

<>
> Most of the times such functions are good enough, but sometimes the
> dicts are big, so to reduce memory used I remove keys in place:
>
> def filterdict(pred, indict):
>   todel = [k for k,v in indict.iteritems() if not pred(k,v)]
>   for key in todel:
> del indict[key]

<>
> But doing the same thing while iterating on the dict may be faster and
> use even less memory.

Well, you can reduce the memory usage to virtually nothing by using a
generator expression rather than list comprehension.

> This iteration&deletion capability is probably not safe enough to be
> used inside Python programs, but a compiled C module with a function
> that works like that filterdict (and deletes while iterating) may be
> created, and its use is safe from Python programs.

<>

> >The dictionary p should not be mutated during iteration. It is safe (since 
> >Python 2.1) to modify the values of the keys as you iterate over the 
> >dictionary, but only so long as the set of keys does not change.<
>
> Do you think it may be useful to create to create such C filterdict
> function that deletes while iterating? (To create such function it
> probably has to bypass the CPython dict API).

Such a beast would be fiendish to write, I think.  Remember, arbitrary
python code can be executed both by __hash__ and by deleting (DECREFing)
python objects.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Klaas
velotron wrote:
> On Nov 9, 8:38 pm, "Klaas" <[EMAIL PROTECTED]> wrote:
>
> > I was referring specifically to abominations like range(1000000)
>
> However, there are plenty of valid reasons to allocate huge lists of
> integers.
I'm sure there are some; I doubt there are plenty.  Care to name a few?

> This issue has been worked on:
> http://evanjones.ca/python-memory.html
> http://evanjones.ca/python-memory-part3.html
>
> My understanding is that the patch allows most objects to be released
> back to the OS, but can't help the problem for integers.  I could be

Integers use their own allocator and as such aren't affected by Evan's
patch.

> mistaken.  But on a clean Python 2.5:
>
> x=range(1000000)
> x=None
>
> The problem exists for floats too, so for a less contrived example:
>
> x=[random.weibullvariate(7.0,2.0) for i in xrange(1000000)]
> x=None
>
> Both leave the Python process bloated in my environment.   Is this
> problem a good candidate for the FAQ?

I think floats use obmalloc so I'm slightly surprised you don't see
differences.  I know that evan's patch imposes conditions on freeing
obmalloc arenas, so you could be seeing effects of that.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-09 Thread Klaas

placid wrote:

> Actually i am executing that code snippet and creating BeautifulSoup
> objects in the range()  (now xrange() ) code block.

Right; I was referring specifically to abominations like
range(1000000), not looping over an incrementing integer.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-07 Thread Klaas
placid wrote:
> Hi All,
>
> Just wondering when i run the following code;
>
> for i in range(1000000):
>  print i
>
> the memory usage of Python spikes and when the range(..) block finishes
> execution the memory usage does not drop down. Is there a way of
> freeing this memory that range(..) allocated?

Python maintains a freelist for integers which is never freed (I don't
believe this has changed in 2.5).  Normally this isn't an issue since
the number of distinct integers in simultaneous use is small (assuming
you aren't executing the above snippet).

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Defaultdict and speed

2006-11-04 Thread Klaas
[EMAIL PROTECTED] wrote:
> Klaas wrote:
> > Benchmarks?
>
> There is one (fixed in a succesive post) in the original thread I was
> referring to:
> http://groups.google.com/group/it.comp.lang.python/browse_thread/thread/aff60c644969f9b/
> If you want I can give more of them (and a bit less silly, with strings
> too, etc).
<>

Sorry, I didn't see any numbers.  I ran it myself and found the
defaultdict version to be approximately twice as slow.  This, as you
suggest, is the worst case, as you are using integers as hash keys
(essentially no hashing cost) and are accessing each key exactly once.

>
> > (and slowing down other uses of the class)
>
> All it has to do is to check if the default_factory is an int, it's
> just an "if" done only once, so I don't think it slows down the other
> cases significantly.

Once it makes that check, surely it must check a flag or some such
every time it is about to invoke the key constructor function?

> > especially when the faster alternative is so easy to code.
>
> The faster alternative is easy to create, but the best faster
> alternative can't be coded, because if you code it in Python you need
> two hash accesses, while the defaultdict can require only one of them:
>
> if n in d:
> d[n] += 1
> else:
> d[n] = 1

How do you think that defaultdict is implemented?  It must perform the
dictionary access to determine that the value is missing.  It must then
go through the method dispatch machinery to look for the __missing__
method, and execute it.  If you _really_ want to make this fast, you
should write a custom dictionary subclass which accepts an object (not
function) as default value, and assigns it directly.
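
A minimal sketch of such a subclass (hypothetical and untested here),
relying on the dict.__missing__ hook added in 2.5; note it is only
sensible for immutable defaults like 0:

class defaultvaluedict(dict):
    def __init__(self, default, *args, **kw):
        dict.__init__(self, *args, **kw)
        self.default = default      # an object, not a factory
    def __missing__(self, key):
        self[key] = self.default    # assign directly on a miss
        return self.default

d = defaultvaluedict(0)
for c in "abracadabra":
    d[c] += 1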

> >If that performance difference matters,
>
> With Python it's usually difficult to tell if some performance
> difference matters. Probably in some programs it may matter, but in
> most other programs it doesn't matter. This is probably true for all
> the performance tweaks I may invent in the future too.

In general, I agree, but in this case it is quite clear.  The only
possible speed up is for defaultdict(int).  The re-write using regular
dicts is trivial; hence, for a given piece of code it is quite clear
whether the performance gain is important.  This is not an
interpreter-wide change, after all.

Consider also that the performance gains would be relatively
unsubstantial when more complicated keys and a more realistic data
distribution is used.  Consider further that the __missing__ machinery
would still be called.  Would the resulting construct be faster than
the use of a vanilla dict?  I doubt it.

But you can prove me wrong by implementing it and benchmarking it.

> > you would likely find more fruitful
> > gains in coding it in c, using PyDict_SET
>
> I've just started creating a C lib for related purposes, I'd like to
> show it to you all on c.l.p, but first I have to find a place to put it
> on :-) (It's not easy to find a suitable place, it's a python + c +
> pyd, and it's mostly an exercise).

Would suggesting a webpage be too trite?

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Defaultdict and speed

2006-11-03 Thread Klaas
[EMAIL PROTECTED] wrote:
> This post sums some things I have written in another Python newsgroup.
> More than 40% of the times I use defaultdict like this, to count
> things:
>
> >>> from collections import defaultdict as DD
> >>> s = "abracadabra"
> >>> d = DD(int)
> >>> for c in s: d[c] += 1
> ...
> >>> d
> defaultdict(<type 'int'>, {'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1})
>
> But I have seen that if keys are quite sparse, and int() becomes called
> too much often, then code like this is faster:
>
> >>> d = {}
> >>> for c in s:
> ...   if c in d: d[c] += 1
> ...   else: d[c] = 1
> ...
> >>> d
> {'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}
>
> So to improve the speed for such special but common situation, the
> defaultdict can manage the case with default_factory=int in a different
> and faster way.

Benchmarks?  I doubt it is worth complicating defaultdict's code (and
slowing down other uses of the class) for this improvement...
especially when the faster alternative is so easy to code.  If that
performance difference matters, you would likely find more fruitful
gains in coding it in c, using PyDict_SET.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sorted and reversed on huge dict ?

2006-11-03 Thread Klaas
[EMAIL PROTECTED] wrote:
> thanks for your replies :)
>
> so i just have tried, even if i think it will not go to the end => i
> was wrong : it is around 1.400.000 entries by dict...
>
> but maybe if keys of dicts are not duplicated in memory it can be done
> (as all dicts will have the same keys, with different (count) values)?

Definitely a good strategy.  The easiest way is to use intern(key) when
storing the values.  (This will only work if you are using 8bit
strings.  You'll have to maintain your own object cache if you are
using unicode).

I've reduced the memory requirements of very similar apps this way.
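
A minimal sketch (the rows data is hypothetical; the point is that each
dict stores the single interned copy of every key string):

rows = [[('alpha%d' % i, i) for i in xrange(100)] for j in xrange(1000)]

dicts = []
for row in rows:
    d = {}
    for key, count in row:
        d[intern(key)] = count    # one shared key object across all dicts
    dicts.append(d)
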
-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Javascript is turning into Python?!

2006-11-03 Thread Klaas

Paul Rubin wrote:
> "Carl Banks" <[EMAIL PROTECTED]> writes:
> > > http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7
> > Maybe in exchange, Python can borrow the let statement.
>
> Maybe the with statement could be extended to allow binding more than
> one variable.
> with x as f(), y as g():
>blah (x, y)

from contextlib import nested

with nested(f(), g()) as (x, y):
    blah(x, y)

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: import in threads: crashes & strange exceptions on dual core machines

2006-10-31 Thread Klaas
robert wrote:
> Klaas wrote:
> > It seems clear that the import lock does not include fully-executing
> > the module contents.  To fix this, just import cookielib before the
>
> What is the exact meaning of "not include fully-executing" - regarding the 
> examples "import cookielib" ?
> Do you really mean the import statement can return without having executed 
> the cookielib module code fully?
> (As said, a simple deadlock is not at all my problem)

No, I mean that the import lock seems to not be held while the module
contents are being executed (which would be why you are getting
partially-initialized module in sys.modules).  Perhaps it _is_ held,
but released at various points of the import process.  Someone more
knowledgable of python internals will have to answer the question of
what _should_ be occurring.

> thanks. I will probably have to do the costly pre-import of things in main 
> thread and spread locks as I have also no other real idea so far.

Costly?

> Yet this costs the smoothness of app startup and corrupts my believe in 
> Python capabs of "lazy execution on demand".

If you lock your code properly, you can do the import anytime you wish.

> I'd like to get a more fundamental understanding of the real problems than 
> just a general "stay away and lock and lock everything without real 
> understanding".

Of course.  But you have so far provided no information to that
regard--not even a stack trace.  If you suspect a bug in python, have
you submitted a bug report at sourceforge?

> * I have no real explanation why the import of a module like cookielib is not 
> thread-safe. And in no way I can really explain the real OS-level crashes on 
> dual cores/fast CPU's. Python may throw this and that, Python variable states 
> maybe wrong, but how can it crash on OS-level when no extension libs are 
> (hopefully) responsible?

If you are certain (and not just hopeful) that no extension modules are
involved, this points to a bug in python.

> * The Import Lock should be a very hard lock: As soon as any thread imports 
> something, all other threads are guaranteed to be out of any imports. A dead 
> lock is not the problem here.

What do you mean by "should"?  Is this based on your knowledge of
python internals?

> * the things in my code patter are function local code except "opener = 
> urlcookie_openers.get(user)" and "urlcookie_openers[user] = opener" : Simple 
> dictionary accesses which are atomic from all my knowledge and experience. I 
> think, I have thought about enough, what could be not thread safe. The only 
> questionable things have to do with rare change of some globals,

It is very easy for dictionary accesses to be thread-unsafe, as they
can call into python-level __hash__ and __eq__ code.  If this happens,
a context switch is possible.  Are you sure this isn't the case?

> but this has  not at all to do with the severe problems here and could only 
> affect e.g wrong > url2_proxy or double/unecessary re-creation of an opener, 
> which is uncritical in my app.

Your code contains the following pattern, which can cause any number of
application errors, depending on the app:

a = getA()
if a is None:
    # ... create a ...
    setA()

If duplicating the creation of an opener isn't a problem, why not just
create one for a user to begin with?

> I'm still puzzled and suspect there is a major problem in Python, maybe in 
> win32ui or - no idea ... ?

Python does a relatively decent job of maintaining thread security for
its most basic operations, but this is no substitute for caring about
thread safety in your own application.  It is only true in the most
basic cases that a single line of code corresponds to a single opcode,
and determining that the code is correct is even more difficult than
when using explicit locking. The advantages just aren't worth it:

$ python -m timeit -s "import thread; t=thread.allocate_lock()"
"t.acquire(); t.release()"
1000000 loops, best of 3: 1.34 usec per loop

Note that this is actually less expensive than the handful of python
code that dummy_threading executes:

$ python -m timeit -s "import dummy_threading; t =
dummy_threading.Lock()" "t.acquire(); t.release()"
100000 loops, best of 3: 2.05 usec per loop

Note that this _doesn't_ mean that you should "lock everything without
real understanding", but in my experience there is very little
meaningful python code that the GIL locks adequately.

As for your crashes, those should be investigated.  But without really
any hints, I don't see that happening.  If you can't reproduce it, it
seems unlikely that anyone else will be able to.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: import in threads: crashes & strange exceptions on dual core machines

2006-10-30 Thread Klaas
It seems clear that the import lock does not include fully-executing
the module contents.  To fix this, just import cookielib before the
threads are spawned.  Better yet, use your own locks around the
acquisition of the opener instance (this code seems fraughtfully
thread-unsafe--fix that and you solve other problems besides this one).
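
A minimal sketch of what that locking might look like (make_opener is a
hypothetical helper standing in for your build_opener logic):

import threading

urlcookie_openers = {}
openers_lock = threading.Lock()

def get_opener(user):
    openers_lock.acquire()
    try:
        opener = urlcookie_openers.get(user)
        if opener is None:
            opener = make_opener(user)      # hypothetical helper
            urlcookie_openers[user] = opener
        return opener
    finally:
        openers_lock.release()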

regards,
-Mike

robert wrote:
> I get python crashes and (in better cases) strange Python exceptions when (in 
> most cases) importing and using cookielib lazy on demand in a thread.
> It is mainly with cookielib, but remember the problem also with other imports 
> (e.g. urllib2 etc.).
> And again very often in all these cases where I get weired Python exceptions, 
> the problem is around re-functions - usually during re.compile calls during 
> import (see some of the exceptions below). But not only.
>
> Very strange: The errors occur almost only on machines with dual core/multi 
> processors - and very very rarely on very fast single core machines (>3GHz).
>
> I'm using Python2.3.5 on Win with win32ui (build 210) - the cookielib taken 
> from Python 2.5.
>
> I took care that I'm not starting off thread things or main application loop 
> etc. during an import (which would cause a simple & explainable deadlock 
> freeze on the import lock)
>
> With real OS-level crashes I know from user reports (packaged app), that 
> these errors occur very likely early after app start - thus when lazy imports 
> are likely to do real execution.
>
> I researched this bug for some time. I think I can meanwhile exclude 
> (ref-count, mem.leak) problems in win32ui (the only complex extension lib I 
> use) as cause for this. All statistics point towards import problems.
>
> Any ideas?
> Are there problems known with the import lock (Python 2.3.5) ?
>
> (I cannot easily change from Python 2.3 and it takes weeks to get significant 
> feedback after random improvements)
>
> -robert
>
> PS:
>
> The basic pattern of usage is:
>
> ==
> def f():
> ...
> opener = urlcookie_openers.get(user)
> if not opener:
> import cookielib#<1
> cj=cookielib.CookieJar()#<2
> build_opener = urllib2.build_opener
> httpCookieProcessor = urllib2.HTTPCookieProcessor(cj)
> if url2_proxy:
> opener = build_opener(url2_proxy,httpCookieProcessor)
> else:
> opener = build_opener(httpCookieProcessor)
> opener.addheaders   #$pycheck_no
> opener.addheaders= app_addheaders
> urlcookie_openers[user] = opener
> ufile = opener.open(urllib2.Request(url,data,dict(headers)))
> ...
>
>
> thread.start_new(f,())
> =
>
> Symptoms:
> __
>
> sometimes ufile is None and other weired invalid states.
>
> typical Python exceptions when in better cases there is no OS-level crash:
>
> -
>
>  # Attributes randomly missing like:
>  #<2
>
> "AttributeError: \'module\' object has no attribute \'CookieJar\'\\n"]
>
>
> -
>
> # weired invalid states during computation like:
> #<1
>
> ...  File "cookielib.pyo", line 184, in ?\\n\', \'  File
> "sre.pyo", line 179, in compile\\n\', \'  File "sre.pyo", line 228, in 
> _compile\\n\', \'  File
> "sre_compile.pyo", line 467, in compile\\n\', \'  File "sre_parse.pyo", line 
> 624, in parse\\n\', \'
> File "sre_parse.pyo", line 317, in _parse_sub\\n\', \'  File "sre_parse.pyo", 
> line 588, in
> _parse\\n\', \'  File "sre_parse.pyo", line 92, in closegroup\\n\', 
> \'ValueError: list.remove(x): x
> not in list\\n\']
> ...
> 'windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")
>
>
> -
>
> #<1
>
>
> File "cookielib.pyo", line 116, in ?\\n\', \'  File "sre.pyo", line 179, in 
> compile\\n\', \'  File "sre.pyo", line 228, in _compile\\n\', \'  File 
> "sre_compile.pyo", line 467, in compile\\n\', \'  File "sre_parse.pyo", line 
> 624, in parse\\n\', \'  File "sre_parse.pyo", line 317, in _parse_sub\\n\', 
> \'  File "sre_parse.pyo", line 494, in _parse\\n\', \'  File "sre_parse.pyo", 
> line 140, in __setitem__\\n\', \'IndexError: list assignment index out of 
> range\\n\']
>
> ('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2"
>
> -
>
> # weired errors in other threads:
>
> # after dlg.DoModal() in main thread
>
> File "wintools.pyo", line 115, in PreTranslateMessage\\n\', \'TypeError: an 
> integer is required\\n\']
>
> ('windows', "(5, 1, 2600, 2, 'Service Pack 2')/NP=2")
>
> -
>
> # after win32ui.PumpWaitingMessages(wc.WM_PAINT, wc.WM_MOUSELAST) in main 
> thread
> 
> \'TypeError: argument list must be a tuple\\n\'
> 
> 
> ...

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ZODB for inverted index?

2006-10-25 Thread Klaas
[EMAIL PROTECTED] wrote:
> Hello,

Hi.  I'm not familiar with ZODB, but you might consider berkeleydb,
which behaves like a disk-backed + memcache dictionary.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: new-style classes, __hash__. is the documentation wrong?

2006-10-25 Thread Klaas
gabor wrote:

> 
> If a class defines mutable objects and implements a __cmp__() or
> __eq__() method, it should not implement __hash__(), since the
> dictionary implementation requires that a key's hash value is immutable
> (if the object's hash value changes, it will be in the wrong hash bucket).
> 

> now, with new style classes, the class will have __hash__, whatever i
> do. (well, i assume i could play with __getattribute__...).

There is a proposal to fix this, though I don't think we'll see it for
a while.

class CantHash(object):
    __hash__ = None

> so is that part of the documentation currently wrong?

yes

> because from what i see (the bug will not be fixed),
> it's still much safer to define a __hash__ even for mutable objects
> (when the object also defines __eq__).

Better is to define __hash__ in a way that prevents errors:

class Mutable(object):
    def __hash__(self):
        raise TypeError('unhashable instance')

It will behave similarly to an old-style class that defines __eq__ and
not __hash__.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python segmentation fault?

2006-10-23 Thread Klaas

Michael B. Trausch wrote:
> Is there a way to debug scripts that cause segmentation faults?  I can
> do a backtrace in gdb on Python, but that doesn't really help me all
> that much since, well, it has nothing to do with my script... :-P

Yes.  If you think it is a python interpreter bug, create a
self-contained script which reproduces the issue, and file a python bug
report.

I'd be interested to see the stack trace--I recently uncovered a
segfault bug in python2.5 and I might be able to tell you if it is the
same one.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with hot shot stats

2006-10-10 Thread Klaas
Monu wrote:
> HI All,
> I am getting problem in using hotshot profiler.
> When I hotshot with lineevents=0, it works fine,
> but when I use lineevents=1, I get error in stats

<>
> Can anybody help to figure out the problem please?

hotshot has never reached production-ready stability, imo.  Use the new
cProfile module in python2.5

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: for: else: - any practical uses for the else clause?

2006-09-29 Thread Mike Klaas
On 9/29/06, Johan Steyn <[EMAIL PROTECTED]> wrote:
> On 29 Sep 2006 11:26:10 -0700, Klaas <[EMAIL PROTECTED]> wrote:
>
> > else: does not trigger when there is no data on which to iterate, but
> > when the loop terminated normally (ie., wasn't break-ed out).  It is
> > meaningless without break.
>
> The else clause *is* executed when there is no data on which to iterate.
>  Your example even demonstrates that clearly:

Yes--there is a missing "just" in that sentence.

-Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: for: else: - any practical uses for the else clause?

2006-09-29 Thread Klaas
Klaas wrote:

> else: does not trigger when there is no data on which to iterate, but
> when the loop terminated normally (ie., wasn't break-ed out).  It is
> meaningless without break.

Sorry, this was worded confusingly.  "else: triggers when the loop
terminates normally, not simply in the case that there is no iterated
data".

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: for: else: - any practical uses for the else clause?

2006-09-29 Thread Klaas
metaperl wrote:
> Actually right after posting this I came up with a great usage. I use
> meld3 for my Python based dynamic HTML generation. Whenever I plan to
> loop over a tree section I use a for loop, but if there is no data to
> iterate over, then I simply remove that section from the tree or
> populate it with a "no data" message.

else: does not trigger when there is no data on which to iterate, but
when the loop terminated normally (ie., wasn't break-ed out).  It is
meaningless without break.

Python 2.4.3 (#1, Mar 29 2006, 15:37:23)
[GCC 4.0.2 20051125 (Red Hat 4.0.2-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> for x in []:
...     print 'nothing'
... else:
...     print 'done'
... 
done

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strange behaviour of 'is'

2006-09-21 Thread Klaas
Ben C wrote:
> On 2006-09-21, Fijoy George <[EMAIL PROTECTED]> wrote:

> > But my understanding does not explain the result of the second comparison.
> > According to the experiment, y[0] and y[1] are the same object!
>
> I'm as baffled as you, even more so its implication:

> >>> a = 2.
> >>> b = 2.
>
> >>> a is b
> False
>
> >>> a, b = 2., 2.
> >>> a is b
> True

Is it surprising that it is easier to recognize identical constants
within the same expression?

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sets and Membership Tests

2006-07-11 Thread Klaas

JKPeck wrote:
> I would like to be able use sets where the set members are objects of a
> class I wrote.
> I want the members to be distinguished by some of the object content,
> but I have not figured out how a set determines whether two (potential)
> elements are identical.  I tried implementing __eq__ and __ne__ and
> __hash__ to make objects with identical content behave as identical for
> set membership, but so far no luck.

__eq__ and __hash__ are necessary and sufficient.  Code?
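
For reference, a minimal sketch (Point is a hypothetical class) where
set membership is determined by content:

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return isinstance(other, Point) and \
               (self.x, self.y) == (other.x, other.y)
    def __ne__(self, other):
        return not self == other
    def __hash__(self):
        return hash((self.x, self.y))

print len(set([Point(1, 2), Point(1, 2)]))    # 1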

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Large Dictionaries

2006-06-12 Thread Klaas
Thomas Ganss wrote:
Klaas wrote:
>
> > 4. Insert your keys in sorted order.
> This advice is questionable -
> it depends on the at least on the db vendor and probably
> sometimes on the sort method, if inserting pre-sorted
> values is better.

The article I wrote that you quoted named a specific vendor (berkeley
db) for which my advice is unquestionable and well-known.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Large Dictionaries

2006-05-24 Thread Klaas
Chris:
> class StorageBerkeleyDB(StorageTest):
>     def runtest(self, number_hash):
>         db = bsddb.hashopen(None, flag='c', cachesize=8192)
>         for (num, wildcard_digits) in number_hash.keys():
>             key = '%d:%d' % (num, wildcard_digits)
>             db[key] = None
>         db.close()

BDBs can accomplish what you're looking to do, but they need to be
tuned carefully.  I won't get into too many details here, but you have
a few fatal flaws in that code.

1. 8Kb of cache is _pathetic_.  Give it a few hundred megs.  This is by
far your biggest problem.
2. Use BTREE unless you have a good reason to use DBHASH
3. Use proper bdb env creation instead of the hash_open apis.
4. Insert your keys in sorted order.
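
A minimal sketch of points 1-4 with the bsddb3 API (the env path and
key data are hypothetical):

from bsddb3 import db

env = db.DBEnv()
env.set_cachesize(0, 256 * 1024 * 1024)       # 1. a few hundred megs
env.open('/tmp/bdb-env', db.DB_CREATE | db.DB_INIT_MPOOL)

d = db.DB(env)                                # 3. a proper env, not hashopen
d.open('numbers.db', dbtype=db.DB_BTREE,      # 2. BTREE access method
       flags=db.DB_CREATE)

keys = sorted('%d:%d' % (num, 0) for num in xrange(100000))
for key in keys:                              # 4. insert in sorted order
    d.put(key, '')
d.close()
env.close()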

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Large Dictionaries

2006-05-24 Thread Klaas
Chris:
> Berkeley DB is great for accessing data by key for things already
> stored on disk (i.e. read access), but write performance for each
> key-value pair is slow due to it being careful about flushing
> writes to disk by default. 

This is absolutely false.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Large Dictionaries

2006-05-16 Thread Klaas
>22.2s  20m25s[3]

20m to insert 1m keys?  You are doing something wrong.

With bdb's it is crucial to insert keys in bytestring-sorted order.
Also, be sure to give it a decent amount of cache.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: list*list

2006-05-01 Thread Klaas
Diez wrote:
> First of all: it's considered bad style to use range if all you want is a
> enumeration of indices, as it will actually create a list of the size you
> specified. Use xrange in such cases.

> But maybe nicer is zip:
> c = [av * bv for av, bv in zip(a, b)]

By your logic, shouldn't it also be "bad style" to create an
unnecessary list with zip instead of using izip?

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Best way to have a for-loop index?

2006-04-05 Thread Klaas
roy spewed:
> The real question is *why* do you want the index?
>
> If you're trying to iterate through list indicies, you're probably trying
> to write C, C++, Fortran, Java, etc in Python.

Could we stop the stupid continual berating of people validly asking
about enumerate()?  Yes, we want to discourage:

for i in xrange(len(seq)):
   seq[i]

but in this case, and many other cases, that is clearly not the
question being posed.

enumerate is one of the most useful built-ins and I love the way it
reads in code.  Stop the index-hate.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pickle or Mysql

2006-04-01 Thread Klaas
> Can I use Pickle to store about 500,000 key value pairs.. or should I
> use mySql. Which one is best for performance, as the key value pair
> increases.

Pickle: absolutely out of the question.
Mysql: might work, albeit slowly.

Use berkeley DB (bsddb3), or zodb.  I have no experience with the
latter, but bdb's scale far beyond that magnitude if necessary.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Another proposed proposal: operator.bool_and / bool_or

2006-03-30 Thread Klaas
> def any(seq): return reduce(bool_or, seq, False)
> def all(seq): return reduce(bool_and, seq, True)

Any other use cases?  These will be built-in in 2.5

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: __slots__

2006-03-27 Thread Klaas
David wrote:
> 3. What is a simple example of a Pythonic use of __slots__ that does NOT
> involved the creation of **many** instances.

mu.  Your question presupposes the existence of such an example.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: bsddb3 database file, are there any unexpected file size limits occuring in practice?

2006-02-28 Thread Klaas
> In my current project I expect the total size of the indexes to exceed
> by far the size of the data indexed, but because Berkeley does not
> support multiple indexed columns (i.e. only one key value column as
> index) if I access the database files one after another (not
> simultaneously) it should work without problems with RAM, right?

You can maintain multiple secondary indices on a primary database.  BDB
isn't a "relational" database, though, so speaking of columns confuses
the issue.  But you can have one database with primary key -> value,
then multiple secondary key -> primary key databases (with bdb
transparently providing the secondary key -> value mapping if you
desire).
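
A rough sketch with the bsddb3 API (the record layout, with the name as
the prefix of the value, is hypothetical):

from bsddb3 import db

env = db.DBEnv()
env.open('/tmp/bdb-env', db.DB_CREATE | db.DB_INIT_MPOOL)

primary = db.DB(env)
primary.open('data.db', dbtype=db.DB_BTREE, flags=db.DB_CREATE)

by_name = db.DB(env)
by_name.set_flags(db.DB_DUP)          # secondary keys need not be unique
by_name.open('by_name.db', dbtype=db.DB_BTREE, flags=db.DB_CREATE)

def getname(key, value):
    # derive the secondary key from the primary record
    return value.split('|', 1)[0]

primary.associate(by_name, getname)

primary.put('id1', 'bob|rest-of-record')
print by_name.get('bob')              # -> 'bob|rest-of-record'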

> Do the data volume required to store the key values have impact on the
> size of the index pages or does the size of the index pages depend only
> on the number of records and kind of the index (btree, hash)?

For btree, it is the size of the keys that matters.  I presume the same
is true for the hashtable, but I'm not certain.

> What is the upper limit of number of records in practice?

Depends on sizes of the keys and values, page size, cache size, and
physical limitations of your machine.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: bsddb3 database file, are there any unexpected file size limits occuring in practice?

2006-02-27 Thread Klaas
Claudio writes:
> I am on a Windows using the NTFS file system, so I don't expect problems
> with too large file size.

how large can files grow on NTFS?  I know little about it.

> (I suppose it in having only 256 MB RAM available that time) as it is
> known that MySQL databases larger than 2 GByte exist and are in daily
> use :-( .

Do you have more ram now?  I've used berkeley dbs up to around 5 gigs
in size and they performed fine.  However, it is quite important that
the working set of the database (its internal index pages) can fit
into available ram.  If they are swapping in and out, there will be
problems.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: bsddb3 database file, what are the __db.001, __db.002, __db.003 files for?

2006-02-22 Thread Klaas

Claudio Grondi wrote:

> Beside the intended database file
>databaseFile.bdb
> I see in same directory also the
>__db.001
>__db.002
>__db.003
> files where
>__db.003 is ten times as larger as the databaseFile.bdb
> and
>__db.001 has the same size as the databaseFile.bdb .

I can't tell you exactly what each is, but they are the files that the
shared environment (DBEnv) uses to coordinate multi-process access to
the database.  In particular, the big one is likely the mmap'd cache
(which defaults to 5Mb, I believe).

You can safely delete them, but probably shouldn't while your program
is executing.

> Is there any _good_ documentation of the bsddb3 module around beside
> this provided with this module itself, where it is not necessary e.g. to
> guess, that C integer value of zero (0) is represented in Python by the
> value None returned in case of success by db.open() ?

This is the only documentation available, AFAIK:
http://pybsddb.sourceforge.net/bsddb3.html

For most of the important stuff it is necessary to dig into the bdb
docs themselves.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Mutable bytes type

2006-01-16 Thread Klaas
A PEP for a mutable bytes type was composed but withdrawn two years
ago:
http://www.python.org/peps/pep-0296.html

Another was composed (status unclear) in 2004:
http://www.python.org/peps/pep-0332.html

It was mentioned on python-dev in 2004
(http://python.fyxm.net/dev/summary/2004-08-16_2004-08-31.html) and
2005 (http://www.python.org/dev/summary/2005-10-01_2005-10-15.html)
with the latter discussion mentioning that is a likely thing for 3k.

Does anyone know if a well-coded and tested implementation of such a
beast exists now?  Strings make me sad.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regex anomaly

2006-01-02 Thread mike . klaas
Thanks guys, that is probably the most ridiculous mistake I've made in
years.
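
For the archives, the likely cause reconstructed (the thread never
spells it out): a compiled pattern's .match() takes a start *position*,
not a flags value, as its second argument, and re.I happens to equal 2:

import re
reCompiled = re.compile(r"([a-z]+)://")
print re.I                                                   # 2
print reCompiled.match("http://www.hello.com", 2).groups()   # ('tp',)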

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Regex anomaly

2006-01-02 Thread mike . klaas

Hello,

Has anyone had issues with compiled re's vis-a-vis the re.I (ignore
case) flag?  I can't make sense of this compiled re producing a
different match when given the flag, odd both in its difference from
the uncompiled regex (as I thought the uncompiled api was a wrapper
around a compile-and-execute block) and in its difference from the
compiled version with no flag specified.  The match given is utter
nonsense given the input re.

In [48]: import re
In [49]: reStr = r"([a-z]+)://"
In [51]: against = "http://www.hello.com"
In [53]: re.match(reStr, against).groups()
Out[53]: ('http',)
In [54]: re.match(reStr, against, re.I).groups()
Out[54]: ('http',)
In [55]: reCompiled = re.compile(reStr)
In [56]: reCompiled.match(against).groups()
Out[56]: ('http',)
In [57]: reCompiled.match(against, re.I).groups()
Out[57]: ('tp',)

cheers,
-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How simputer COULD HAVE succeeded ?

2005-12-20 Thread mike . klaas

> PS. before investing time in Python, I wanted to find out if it
> can interface low-level, by eg. calling the OS or C...etc.
> eg. could it call linux's "dd if=.." ?

In python you have full access to the shell, and excellent
interoperability with c.
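
E.g., a quick sketch with the stdlib subprocess module (the dd
arguments here are hypothetical):

import subprocess
subprocess.call(['dd', 'if=/dev/zero', 'of=/tmp/blank.img',
                 'bs=1024', 'count=1024'])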

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list

