Re: Distinguishing between functions and methods in a decorator.

2008-02-08 Thread Berteun Damman
On Thu, 07 Feb 2008 18:22:03 +0100, Diez B. Roggisch
<[EMAIL PROTECTED]> wrote:
> Can you provide an example of what you are actually after? The 
> descriptor-protocol might come to use there.

Thanks for your responses. I have read the Descriptor protocol how-to,
which clarifies method access on objects, and indeed provides a
solution.

My idea was to have @pre and @post decorators that check pre- and
postconditions. When applied to a method, the first parameter will be
the instance, and I wondered whether I could detect whether the
precondition cared about it or not.

So,

@pre(lambda x: x > 0)
def method(self, x):
    pass

Here the precondition function does not use 'self', yet 'self' will of
course be passed in a call, so I need to strip it. For a @pre applied
to a plain function, this does not happen. I'm not sure whether this
kind of magic is such a nice solution, though.

However, the descriptor protocol is indeed what I need. If I provide a
__get__ method, it will be invoked (on attribute access) instead of
__call__, which is what happens for plain functions (or for methods, if
__get__ is not provided). This way the two cases are clearly
distinguishable, and I need not worry about heuristics (such as
checking whether the first parameter is called 'self'). Roughly what I
have in mind is sketched below.
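
Something like this (made-up names; a sketch of the idea rather than
the final code):

class _PreWrapper(object):
    def __init__(self, condition, func):
        self.condition = condition
        self.func = func

    # A plain function is simply called, so there is no implicit
    # 'self' to strip from the arguments.
    def __call__(self, *args, **kwargs):
        assert self.condition(*args, **kwargs), "precondition failed"
        return self.func(*args, **kwargs)

    # Attribute access on an instance goes through __get__ instead,
    # so here we know the wrapped function's first argument will be
    # the instance and can leave it out of the precondition check.
    def __get__(self, obj, objtype=None):
        def bound(*args, **kwargs):
            assert self.condition(*args, **kwargs), "precondition failed"
            return self.func(obj, *args, **kwargs)
        return bound

def pre(condition):
    def decorate(func):
        return _PreWrapper(condition, func)
    return decorate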

Besides, it taught me a bit more about the inner design of Python. :)

Thanks,

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


Distinguishing between functions and methods in a decorator.

2008-02-07 Thread Berteun Damman
Hello,

I was wondering a bit about the differences between methods and
functions. I have the following:

def wrap(arg):
    print type(arg)
    return arg

class C:
    def f():
        pass

    @wrap
    def g():
        pass

def h():
    pass

print type(C.f)
print type(h)

Which gives the following output:

<type 'function'>
<type 'instancemethod'>
<type 'function'>
The first line is caused by the 'wrap' function of course. I had
expected the first line to be 'instancemethod' too. So, I would guess,
these methods of C are first created as functions, and only then become
methods after they are 'attached' to some classobj. (You can do that
yourself of course, by saying, for example, C.h = h, then the type of
C.h is 'instancemethod' too.)
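
For example, in the interactive interpreter:

>>> class C:
...     pass
...
>>> def h():
...     pass
...
>>> C.h = h
>>> type(C.h)
<type 'instancemethod'>
>>> type(C.__dict__['h'])
<type 'function'>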

Why does the wrapping occur before the function is 'made' into an
instancemethod?

The reason for asking is that I would like to differentiate between
wrapping a function and an instancemethod, because in the latter case,
the first parameter will be the implicit 'self', which I would like to
ignore.  However, when the wrapping occurs, the method still looks like
a function.

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Dictionary Keys question

2008-01-30 Thread Berteun Damman
On Wed, 30 Jan 2008 14:47:36 -0800 (PST), FireNWater <[EMAIL PROTECTED]> wrote:
> I'm curious why the different outputs of this code.  If I make the
> dictionary with letters as the keys, they are not listed in the
> dictionary in alphabetical order, but if I use the integers then the
> keys are in numerical order.
>
> I know that the order of the keys is not important in a dictionary,
> but I was just curious about what causes the differences.  Thanks!!

I don't know the exact way Python's hash function works, but I can take
a guess. I'm sorry if I explain something you already know.

A hash is for looking up data quickly, yet you don't want to waste too
much memory, so there is a limited number of slots allocated in which
to store objects. These slots can be thought of as a list. When you put
something into the dict, Python computes the 'hash' of the object,
which basically gives the index in the list at which to store it. So
every object is mapped onto some index within the list. (When you
retrieve it, the hash is computed again and the value at that index is
looked up; like list indexing, these are fast operations.)

Say you have 100 slots, and someone puts an integer into the dict; the
hash function used might be n % 100. The first 100 integers would then
always be placed in consecutive slots. For strings, however, a more
complicated hash function is used, which takes the individual
characters into account, so strings don't end up in alphabetical order.

For integers, if you put in values that are spread very widely apart,
they won't end up in order either (see the mod-100 example: 104 would
come before 10).

If you replace the list2 in your example by, for example:
list2 = [123456789 * x for x in range(1, 9)]

you will see that this one doesn't end up in order either. So there is
no special treatment of integers when it comes to ordering; the
particular properties of the hash function simply cause small
consecutive integers to come out in order under some circumstances.
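
To see the effect of the hash values, you could try something like this
(CPython-specific, so the exact ordering may differ between versions;
hash() of a small int is simply the int itself):

d1 = dict((x, None) for x in range(1, 9))
d2 = dict((123456789 * x, None) for x in range(1, 9))

print [hash(x) for x in range(1, 9)]  # small ints hash to themselves
print d1.keys()   # typically comes out in numerical order
print d2.keys()   # typically does not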

Berteun

PS:
What happens when two values map onto the same slot is of course an
obvious question, and the trick is to choose your hash function so that
this does not happen very often on average. When it does happen, there
are several strategies for resolving the collision; Wikipedia probably
has an explanation of how hash functions work in such cases.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Unicode literals to latin-1

2008-01-30 Thread Berteun Damman
On Wed, 30 Jan 2008 09:57:55 +0100, <[EMAIL PROTECTED]>
<[EMAIL PROTECTED]> wrote:
> How can I convert a string read from a database containing unicode
> literals, such as "Fr\u00f8ya" to the latin-1 equivalent, "Frøya"?
>
> I have tried variations around
>   "Fr\u00f8ya".decode('latin-1')
> but to no avail.

Assuming you use Unicode-strings, the following should work:
  u"Fr\u00f8ya".encode('latin-1')

That is, for a byte string s, s.decode('encoding') converts s,
interpreted in the given encoding, to a unicode string u. Conversely,
for a unicode string u, u.encode('encoding') converts u into a byte
string in the specified encoding.

You can call encode() on a byte string as well, but Python will then
first try to decode it using the default (ASCII) codec, which raises a
UnicodeDecodeError if non-ASCII characters are present, and only then
encode it.
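
A short round trip to illustrate the difference (Python 2):

u = u"Fr\u00f8ya"            # a unicode string
s = u.encode('latin-1')      # the latin-1 byte string 'Fr\xf8ya'
u2 = s.decode('latin-1')     # back to the original unicode string
assert u == u2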

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Removal of element from list while traversing causes the next element to be skipped

2008-01-29 Thread Berteun Damman
On Tue, 29 Jan 2008 09:23:16 -0800 (PST), [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
> If you're going to delete elements from
> a list while iterating over it, then do
> it in reverse order:

Why so hard? Reversing it that way creates a copy, so you might as
well do:
>>> a = [98, 99, 100]
>>> for i, x in enumerate(a[:]):
...     if x == 99: del a[i]
...     print x

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Removal of element from list while traversing causes the next element to be skipped

2008-01-29 Thread Berteun Damman
On Tue, 29 Jan 2008 16:34:17 GMT, William McBrine <[EMAIL PROTECTED]> wrote:
> Look at this -- from Python 2.5.1:
>
> >>> a = [1, 2, 3, 4, 5]
> >>> for x in a:
> ...     if x == 3:
> ...         a.remove(x)
> ...     print x
> ...
> 1
> 2
> 3
> 5
> >>> a
> [1, 2, 4, 5]

You have to iterate over a copy of 'a', so: for x in a[:]. Modifying a
list while iterating over it is a recipe for problems, as it is in many
other programming languages.
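
For example, a minimal version of the loop above, iterating over a
copy:

a = [1, 2, 3, 4, 5]
for x in a[:]:        # iterate over a shallow copy
    if x == 3:
        a.remove(x)   # modify the original list
print a               # prints [1, 2, 4, 5]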

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tracking memory usage and object life time.

2007-09-26 Thread Berteun Damman
On Sep 26, 2:31 pm, Bjoern Schliessmann  wrote:
> Did you check the return value of gc.collect? Also, try using
> other "insight" facilities provided by the gc module.
gc.collect() reports that it cannot find any unreachable objects, yet
the number of objects the garbage collector has to keep track of keeps
increasing.

> You cannot "del" structures, you only "del" names. Objects are
> deleted when they are not bound to any names when and if the
> garbage collector "wants" to delete them.
I understand, but just before I del the name, I ask for the referrers
to the object the name refers to, and there is only one. Since it is a
local variable, that seems logical to me.

This object is a dictionary with strings as keys and heaps as values;
the heaps consist of tuples. Every string is referenced more than once
(which is to be expected), but the heaps are referenced only once, so I
would expect them to be destroyed when I destroy the dictionary. I
furthermore assume that calling gc.collect() forces the garbage
collector to collect, even if it wouldn't "want" to collect otherwise?
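
To make that concrete, this is roughly the kind of check I do (a
stripped-down sketch with made-up data, not the real program):

import gc

big_dict = dict((str(i), [(i, i)] * 10) for i in range(1000))

print len(gc.get_referrers(big_dict))  # how many objects refer to it
print len(gc.get_objects())            # objects tracked by the collector

del big_dict
print gc.collect()                     # unreachable objects found
print len(gc.get_objects())            # tracked objects afterwards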

> Be sure to check for cyclic references, they can be a problem for
> the GC.
I don't see how those could occur. The data is basically a list
(possibly of lists) of ints and strings, and no list contains itself.
I'll see whether I can make a stripped-down version that exhibits the
same memory growth.

Berteun


-- 
http://mail.python.org/mailman/listinfo/python-list


Tracking memory usage and object life time.

2007-09-26 Thread Berteun Damman
Hello,

I have written a Python script that loads a graph (the mathematical
kind, with vertices and edges) into memory, does some transformations
on it, and then finds shortest paths in this graph, typically several
tens of thousands of them. This works fine.

Then I wrote a test for this, so I could time it, run it several times,
take the best time, et cetera. It turns out that the first run of the
test is always the fastest. If I watch the memory usage of Python in
top, I see that it starts out at around 80 MB and slowly grows to
500 MB. This might cause the slowdown (which is about a factor of 5 for
large graphs).

When I run a test, I disable garbage collection during the test run (as
is advised), but just before starting a test I instruct the garbage
collector to collect. Running the test without disabling the garbage
collector doesn't make any difference, though.
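
Schematically, the test setup is something like this (a simplified
sketch, not the actual test code):

import gc, time

def timed_run(test):
    gc.collect()    # clean up before the measurement
    gc.disable()    # keep the collector out of the timed region
    try:
        start = time.time()
        test()
        return time.time() - start
    finally:
        gc.enable()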

Where possible I explicitly 'del' the larger data structures once I no
longer need them. Furthermore, I don't really see why there would be
any references to these larger objects left (though I can of course be
mistaken).

I understand this might be a bit of a vague problem, but does anyone
have an idea why the memory usage keeps growing? And is there some tool
that helps me keep track of the objects currently alive and the amount
of memory they occupy?

The best I can do now is run the whole script several times (from a
shell script), but that forces Python to reparse the graph input each
time and redo some other work that only has to happen once. It also
makes it harder to examine values and results.

Berteun

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: textwrap and combining diacritical marks

2007-06-28 Thread Berteun Damman
On Thu, 28 Jun 2007 09:19:20 + (UTC), Berteun Damman
<[EMAIL PROTECTED]> wrote:
> And that leasts to another question, does Python have a function akin to
> wcwidth() which gives the number of column positions a unicode character
> needs?

After playing around a bit with unicodedata.normalize, and seeing how
it fails when there is no precomposed form, I've decided to take Markus
Kuhn's implementation [1] and make a Python version of it [2].

It tries to guess the column width of a character. Non-printable
characters report a width of -1 (this includes '\n' and '\t', for
example), except for '\0', which has width 0. Combining characters
report 0, normal Latin characters report 1, and full-width forms
report 2.
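
The core of the idea, leaving out Kuhn's interval tables, is roughly
the following (a simplified sketch based on unicodedata, not the code
in [2]):

import unicodedata

def char_width(ch):
    # Guess the number of columns a single unicode character occupies.
    if ch == u'\0':
        return 0
    if unicodedata.category(ch) in ('Cc', 'Cf'):        # control/format
        return -1
    if unicodedata.combining(ch):                       # combining marks
        return 0
    if unicodedata.east_asian_width(ch) in ('F', 'W'):  # full-width/wide
        return 2
    return 1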

Of course, the actual output depends on the capabilities of the display
device: xterm can handle combining characters, whereas OS X's
Terminal.app cannot, for Greek or Russian characters for example.

All in all, I think it is a reasonable start. There is one issue,
though, involving Plane 1 characters. On 64-bit systems, it seems,
these are stored as a single character; on 32-bit systems they are
stored as a surrogate pair. I don't know exactly how this works, but
the code should basically ignore Plane 1 characters on 32-bit systems
(i.e. always report a display width of 1, even if they are combining or
full-width).

Berteun

[1] http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
[2] http://berteun.nl/tmp/wcwidth.py
-- 
http://mail.python.org/mailman/listinfo/python-list


textwrap and combining diacritical marks

2007-06-28 Thread Berteun Damman
Hello,

When using the textwrap module, wrapping always uses len() to determine
the length of the string being wrapped. This is sensible in many
circumstances, but I think there are cases where it does not lead to
the desired result.

I assume this module is often used where text is formatted for
presentation to a user, e.g. in a console application. The number of
characters in a string, as determined by len(), might not equal the
number of columns occupied. Some of the characters might be combining
diacritical marks, which go on top of the previous character; e.g. the
string de'ge'ne're' (where the ' indicate combining acute accents)
displays with a width of only 8 columns.

The string might also include characters that switch the console to
bold or underline mode, which have zero display width. If this happens
a lot, the resulting text might look very badly formatted because of
all these zero-width character sequences.

It is of course impossible to handle all the scenarios in which
characters influence the width of the displayed string, but wouldn't it
be convenient to have a 'chunk_width' method or similar that can be
overridden in a derived class, so that a user can supply a custom
implementation? The default chunk_width would simply be len(); a sketch
of what I mean follows below.
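
Something like this, purely hypothetical of course (no chunk_width hook
exists in textwrap today, and this override only accounts for combining
marks):

import textwrap, unicodedata

class DisplayWidthWrapper(textwrap.TextWrapper):
    def chunk_width(self, chunk):
        # count only the characters that actually occupy a column
        return sum(1 for ch in chunk if not unicodedata.combining(ch))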

And that leads to another question: does Python have a function akin to
wcwidth(), which gives the number of column positions a unicode
character needs?

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PQueue and Python 2.5

2007-01-22 Thread Berteun Damman
Gabriel Genellina wrote:
> Python got in 2.3 a heapq module in its standard library; I think it is what

> Ah! then I bet:
> - There is some C code involved.
> - It carelessly mixes PyMem_Malloc with PyObject_Free or similar as
> described in
> http://docs.python.org/whatsnew/ports.html
>
> So do yourself a favor and forget about such old piece of code...

I would be happy to do so, but it does suit my needs quite well. :)
Thanks everybody for pointing out the probable cause; I have never done
anything with C extensions before, so I wasn't aware of the 2.5
changes. But I'll look into the code.
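
(For anyone finding this thread later: a bare-bones priority queue on
top of heapq, which is what was suggested, would look roughly like the
sketch below. The interface is made up and is not PQueue's.)

import heapq

class HeapPQueue(object):
    def __init__(self):
        self._heap = []

    def push(self, priority, item):
        heapq.heappush(self._heap, (priority, item))

    def pop(self):
        # return the item with the smallest priority
        priority, item = heapq.heappop(self._heap)
        return item

    def __len__(self):
        return len(self._heap)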

Berteun

-- 
http://mail.python.org/mailman/listinfo/python-list


PQueue and Python 2.5

2007-01-19 Thread Berteun Damman
Hello,

Recently I was looking for a priority queue module, and I found PQueue
by Andrew Snare [1]. With Python 2.4 everything works fine, at least on
the two systems I've tested it on (a Debian-based AMD64 machine and a
PowerPC running OS X).

However, when I use it with Python 2.5, again on the same machines,
exiting always gives a pointer error. The easiest way to demonstrate
this is:

python2.5 -c 'from pqueue import PQueue; PQueue()'

On the Debian system:
$ python2.5 -c 'from pqueue import PQueue; PQueue()'
*** glibc detected *** free(): invalid pointer: 0x2ad7b5720288 ***
Abort

And on my PowerBook:
python2.5(8124) malloc: ***  Deallocation of a pointer not malloced:
0x3b4218; This could be a double free(), or free() called with the
middle of an allocated block;

A memory fault can also be triggered immediately by applying 'del' to a
PQueue instance. As said, with Python 2.4 it seems to work without
problems.

I haven't got a clue how to investigate this, but I would be willing to
help if someone has any ideas.

Berteun

[1] http://py.vaults.ca/apyllo.py/514463245.769244789.44776582

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pyparsing and 'keywords'

2004-12-14 Thread Berteun Damman
On Tue, 14 Dec 2004 18:39:19 GMT, Paul McGuire
<[EMAIL PROTECTED]> wrote:
>> If I try however to parse the String "if test; testagain; fi;", it does
>> not work, because the fi is interpreted as an expr, not as the end of
>> the if statement, and of course, adding another fi doesn't solve this
>> either.
> The simplest way I can think of for this grammar off the top of my head is
> to use a parse action to reject keywords.

Thank you for your quick and helpful solution; that did the trick
indeed.
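
For the archives, the rejection parse action ended up looking roughly
like this (my paraphrase of the idea, not Paul's exact code):

from pyparsing import Word, alphas, ParseException

keywords = ("if", "fi")

def reject_keywords(s, loc, toks):
    if toks[0] in keywords:
        raise ParseException(s, loc, "keyword used as expression")

expr = Word(alphas).setParseAction(reject_keywords)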

But I'm wondering: wouldn't it be possible to have some sort of lexing
phase that already identifies keywords as such? Or to provide a table
of keywords that pyparsing recognizes automatically?

You would then be able to write IF = Keyword("if"), just as a Literal
is created now, but the parser would know that this word should not be
interpreted in any other way than as a keyword. Or would that be a bad
idea?

Berteun
-- 
http://mail.python.org/mailman/listinfo/python-list


pyparsing and 'keywords'

2004-12-14 Thread Berteun Damman
Hello,

I'm having some problems with pyparsing: I could not find an elegant
way to tell it to treat certain words as keywords, i.e. not as possible
variable names. For example, I have this little grammar:

from pyparsing import Literal, Word, alphas, Forward, OneOrMore

terminator = Literal(";")
expr = Word(alphas)
body = Forward()
ifstat = "if" + body + "fi"
stat = expr | ifstat
body << OneOrMore(stat + terminator)
program = body

I.e. some program which contains statements separated by semicolons. A
statement is either an if [] fi statement or simply a word.

However, if I try to parse the string "if test; testagain; fi;", it
does not work, because the 'fi' is interpreted as an expr, not as the
end of the if-statement; and of course, adding another 'fi' doesn't
solve this either.

How to fix this?

Thank you,

Berteun

-- 
http://mail.python.org/mailman/listinfo/python-list