Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread David M. Cooke
Martin v. Löwis <[EMAIL PROTECTED]> writes:

> > This is a good test for Python implementation bottlenecks.  Run
> > that tokenizer on HTML, and see where the time goes.
> 
> I looked at it with cProfile, and the top function that comes up
> for a larger document (52k) is
> ...validator.HTMLConformanceChecker.__iter__.
[...]
> With this simple optimization, I get a 20% speedup on my test
> case. In my document, there are no attributes - the same changes
> should be made to attribute validation routines.
> 
> I don't think this has anything to do with the case statement.

I agree. I ran cProfile over just the tokenizer step; essentially

tokenizer = html5lib.tokenizer.HTMLStream(htmldata)
for tok in tokenizer:
    pass

It mostly *isn't* tokenizer.py that's taking the most time, it's
inputstream.py. (There is one exception:
tokenizer.py:HTMLStream.__init__ constructs a dictionary of states
each time -- this is unnecessary, replace all expressions like
self.states["attributeName"] with self.attributeNameState.)

I've done several optimisations -- I'll upload the patch to the
html5lib issue tracker. In particular,

* The .position property of EncodingBytes is used a lot. Every
self.position +=1 calls getPosition() and setPosition(). Another
getPosition() call is done in the self.currentByte property. Most of
these can be optimised away by using methods that move the position
and return the current byte.

* In HTMLInputStream, the current line number and column are updated
every time a new character is read with .char(). The current position
is *only* used in error reporting, so I reworked it to only calculate
the position when .position() is called, by keeping track of the
number of lines in previous read chunks, and computing the number of
lines to the current offset in the current chunk.
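
To give the flavour of the changes (sketches only, not the actual patch),
the first is roughly: keep the position in a plain attribute and provide one
method that advances the position and returns the byte, rather than going
through getPosition()/setPosition() twice per character:

class EncodingBytes(str):
    def __init__(self, value):
        self._position = -1
    def next(self):
        # advance and return the current byte in a single call
        p = self._position = self._position + 1
        if p >= len(self):
            raise StopIteration
        return self[p]

and the second: count the newlines in chunks already consumed, and only walk
the current chunk when someone actually asks for the position:

class LazyPosition(object):
    def __init__(self):
        self.lines_before = 0   # newlines in fully-consumed chunks
        self.chunk = ""
        self.offset = 0         # offset of the next char within the chunk
    def new_chunk(self, chunk):
        self.lines_before += self.chunk.count("\n")
        self.chunk = chunk
        self.offset = 0
    def advance(self, n=1):
        self.offset += n
    def position(self):
        consumed = self.chunk[:self.offset]
        line = self.lines_before + consumed.count("\n")
        col = self.offset - (consumed.rfind("\n") + 1)
        return (line, col)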

These give me about a 20% speedup.

This just illustrates that the first step in optimisation is profiling :D

As other posters have said, slurping the whole document into memory
and using a regexp-based parser (such as pyparsing) would likely give
you the largest speedups. If you want to keep the chunk- based
approach, you can still use regexp's, but you'd have to think about
matching on chunk boundaries. One way would be to guarantee a minimum
number of characters available, say 10 or 50 (unless end-of-file, of
course) -- long enough such that any *constant* string you'd want to
match like 

Re: setting a breakpoint in the module

2006-08-23 Thread David M. Cooke
"Jason Jiang" <[EMAIL PROTECTED]> writes:

> Hi,
>
> I have two modules: a.py and b.py. In a.py, I have a function called
> aFunc(). I'm calling aFunc() from b.py (of course I import module a first).
> The question is how to directly set a breakpoint in aFunc().
>
> The way I'm doing now is to set a breakpoint in b.py at the line to call
> aFunc(), 'c' to it, then 's' to step in, then set the breakpoint inside
> aFunc() by 'b lineNumber'. It's too cumbersome.

You can also add in your source

import pdb; pdb.set_trace()

at the point you want the debugger to stop. Useful if you want to
break after some failing condition, for instance.
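
Something like this (process() and the records are made-up stand-ins for
whatever your code actually does):

def process(record):
    # toy stand-in for the real work
    if record < 0:
        return None
    return record * 2

for record in [1, -2, 3]:
    result = process(record)
    if result is None:                # the failing condition you care about
        import pdb; pdb.set_trace()   # drops into the debugger right here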

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Coding style

2006-07-18 Thread David M. Cooke
[EMAIL PROTECTED] (David M. Cooke) writes:
>
> Bruno's already mentioned that iterators and generators aren't
> sequences. Numpy arrays act like the other sequence types:
>
>>>> a = numpy.array([])
>>>> a
> array([], dtype=int64)
>>>> len(a)
> 0
>>>> bool(a)
> False
>
> (0-dimensional numpy arrays are pathological anyways)

*cough* as a Numpy developer I should know better. Numpy arrays that
have more than one element don't work in a boolean context:

>>> a = numpy.array([1,2])
>>> bool(a)
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: The truth value of an array with more than one element is 
ambiguous. Use a.any() or a.all()

The reason for this is that it really was a common source of errors,
because of the rich comparision semantics used. If a and b are numpy
arrays, 'a == b' is an array of booleans.
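
That is, the comparison gives you an elementwise answer, and .any()/.all()
reduce it to a single truth value:

>>> import numpy
>>> a = numpy.array([1, 2, 3])
>>> b = numpy.array([1, 0, 3])
>>> a == b
array([ True, False,  True], dtype=bool)
>>> (a == b).any(), (a == b).all()
(True, False)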

Numpy arrays of one element act like scalars in boolean contexts:

>>> a = numpy.array([0])
>>> bool(a)
False
>>> a = numpy.array([1])
>>> bool(a)
True

(this is partly because we define a comprehensive hierarchy of scalar
types to match those available in C).

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Coding style

2006-07-18 Thread David M. Cooke
"Carl Banks" <[EMAIL PROTECTED]> writes:

> Patrick Maupin wrote:
>> PTY wrote:
>>
>> > It looks like there are two crowds, terse and verbose.  I thought terse
>> > is perl style and verbose is python style.  BTW, lst = [] was not what
>> > I was interested in :-)  I was asking whether it was better style to
>> > use len() or not.
>>
>> It's not canonical Python to use len() in this case.  From PEP 8:
>>
>> - For sequences, (strings, lists, tuples), use the fact that empty
>>   sequences are false.
>>
>>   Yes: if not seq:
>>if seq:
>>
>>   No: if len(seq)
>>   if not len(seq)
>>
>> The whole reason that a sequence supports testing is exactly for this
>> scenario.  This is not an afterthought -- it's a fundamental design
>> decision of the language.
>
> That might have made sense when Python and string, list, tuple were the
> only sequence types around.
>
> Nowadays, Python has all kinds of spiffy types like numpy arrays,
> iterators, generators, etc., for which "empty sequence is false" just
> doesn't make sense.  If Python had been designed with these types in
> mind, I'm not sure "empty list is false" would have been part of the
> language, let alone recommend practice.

Bruno's already mentioned that iterators and generators aren't
sequences. Numpy arrays act like the other sequence types:

>>> a = numpy.array([])
>>> a
array([], dtype=int64)
>>> len(a)
0
>>> bool(a)
False

(0-dimensional numpy arrays are pathological anyways)

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ANN: NumPy 0.9.8 released

2006-05-17 Thread David M. Cooke
"Travis E. Oliphant" <[EMAIL PROTECTED]> writes:

> NumPy 0.9.8 has been released.  It can be downloaded from
>
> http://numeric.scipy.org
>
> The release notes are attached.
>
> Best regards,
>
> -Travis Oliphant
> NumPy 0.9.8 is a bug-fix and optimization release with a
> few new features.  The C-API was changed so that extensions compiled
> against NumPy 0.9.6 will need re-compilation to avoid errors.
>
> The C-API should be stabilizing.  The next release will be 1.0 which
> will come out in a series of release-candidates during Summer 2006.
>
> There were many users and developers who contributed to the fixes for
> this release.   They deserve much praise and thanks.  For details see 
> the Trac pages where bugs are reported and fixed.
>
> http://projects.scipy.org/scipy/numpy/
>
>
>   * numpy should install now with easy_install from setuptools

Note that you'll need to use the latest setuptools (0.6b1). The hacks
I added to get easy_install and numpy.distutils to get along are hard
enough without trying to be backward compatible :-(

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "pow" (power) function

2006-03-17 Thread David M. Cooke
"Russ" <[EMAIL PROTECTED]> writes:

> Ben Cartwright wrote:
>> Russ wrote:
>
>> > Does "pow(x,2)" simply square x, or does it first compute logarithms
>> > (as would be necessary if the exponent were not an integer)?
>>
>>
>> The former, using binary exponentiation (quite fast), assuming x is an
>> int or long.
>>
>> If x is a float, Python coerces the 2 to 2.0, and CPython's float_pow()
>> function is called.  This function calls libm's pow(), which in turn
>> uses logarithms.
>
> I just did a little time test (which I should have done *before* my
> original post!), and 2.0**2 seems to be about twice as fast as
> pow(2.0,2). That seems consistent with your claim above.
>
> I'm a bit surprised that pow() would use logarithms even if the
> exponent is an integer. I suppose that just checking for an integer
> exponent could blow away the gain that would be achieved by avoiding
> logarithms. On the other hand, I would think that using logarithms
> could introduce a tiny error (e.g., pow(2.0,2) = 3.96 <- made
> up result) that wouldn't occur with multiplication.

It depends on the libm implementation of pow() whether logarithms are
used for integer exponents. I'm looking at glibc's (the libc used on
Linux) implementation for Intel processors, and it does optimize
integers. That routine is written in assembly language, btw.

>> > Does "x**0.5" use the same algorithm as "sqrt(x)", or does it use some
>> > other (perhaps less efficient) algorithm based on logarithms?
>>
>> The latter, and that algorithm is libm's pow().  Except for a few
>> special cases that Python handles, all floating point exponentation is
>> left to libm.  Checking to see if the exponent is 0.5 is not one of
>> those special cases.
>
> I just did another little time test comparing 2.0**0.5 with sqrt(2.0).
> Surprisingly, 2.0**0.5 seems to take around a third less time.
>
> None of these differences are really significant unless one is doing
> super-heavy-duty number crunching, of course, but I was just curious.
> Thanks for the information.

And if you are, you'd likely be doing it on more than one number, in
which case you'd probably want to use numpy. We've optimized x**n so
that it does handle n=0.5 and integers specially; it makes more sense
to do this for an array of numbers where you can do the special
manipulation of the exponent, and then apply that to all the numbers
in the array at once.
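
A rough illustration (exact speedups depend on your build, of course):

import numpy
a = numpy.arange(1.0, 1e6)
b = a ** 2             # integer exponent: done by multiplication, no log/exp
c = a ** 0.5           # recognized specially; roughly the same as sqrt
d = numpy.sqrt(a)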

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python advocacy in scientific computation

2006-03-06 Thread David M. Cooke
Robert Kern <[EMAIL PROTECTED]> writes:

> sturlamolden wrote:
>
>> 5. Versioning control? For each program there is only one developer and
>> a single or a handful users.
>
> I used to think like that up until two seconds before I entered this gem:
>
>   $ rm `find . -name "*.pyc"`
>
> Okay, I didn't type it exactly like that; I was missing one character. I'll 
> let
> you guess which.

I did that once. I ended up having to update decompyle to run with
Python 2.4 :-) Lost comments and stuff, but the code came out great.

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Scientific Computing with NumPy

2006-02-10 Thread David M. Cooke
"linda.s" <[EMAIL PROTECTED]> writes:

> where to download numpy for Python 2.3 in Mac?
> Thanks!
> Linda

I don't know if anybody's specifically compiled for 2.3; I think most
of the developers on mac are using 2.4 :-)

But (assuming you have the developer tools installed) it's really easy to
compile: python setup.py build && python setup.py install.

Do you need Tiger (10.4) or Panther (10.3) compatibility?

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: good library for pdf

2006-01-26 Thread David M. Cooke
"Murali" <[EMAIL PROTECTED]> writes:

> Pulling out pages from existing PDF files can be done with Open Source
> stuff. The simplest would be pdftk (PDF Toolkit). The most fancy will
> be using latex and the pdfpages package together with pdflatex.
>
> - Murali

There's also pyPDF, at http://pybrary.net/pyPdf/. I haven't tried it,
but it looks interesting.

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: idea of building python module using pyrex

2005-12-09 Thread David M. Cooke
"[EMAIL PROTECTED]" <[EMAIL PROTECTED]> writes:

> For example stastical module like commulative probability  function for
> t distribution, or other numerical module which incorporate looping to
> get the result.
>
> I found that pyrex is very helpfull when dealing with looping
> things.

Pyrex is indeed quite helpful. If you're interested in statistical
distributions, you'll want to look at the scipy.stats module in scipy
(http://www.scipy.org/), which has lots (including the t distribution).
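
For the t distribution, something along these lines should do it (untested,
from memory):

from scipy import stats
p = stats.t.cdf(2.0, 10)   # P(T <= 2.0) with 10 degrees of freedom
q = stats.t.sf(2.0, 10)    # upper tail, 1 - cdf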

In SciPy, we use Pyrex for the random-number generator module
scipy.random. It's actually used to wrap some C code, but it does the
job well.

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Underscores in Python numbers

2005-11-19 Thread David M. Cooke
Peter Hansen <[EMAIL PROTECTED]> writes:

> Steven D'Aprano wrote:
>> Dealing with numeric literals with lots of digits is
>> a real (if not earth-shattering) human interface problem: it is hard for
>> people to parse long numeric strings.
>
> I'm totally unconvinced that this _is_ a real problem, if we define 
> "real" as being even enough to jiggle my mouse, let alone shattering the 
> planet.
>
> What examples does anyone have of where it is necessary to define a 
> large number of large numeric literals?  Isn't it the case that other 
> than the odd constants in various programs, defining a large number of 
> such values would be better done by creating a data file and parsing
> it?

One example I can think of is a large number of float constants used
for some math routine. In that case they usually be a full 16 or 17
digits. It'd be handy in that case to split into smaller groups to
make it easier to match with tables where these constants may come
from. Ex:

def sinxx(x):
    "computes sin x/x for 0 <= x <= pi/2 to 2e-9"
    a2 = -0.16666 66664
    a4 =  0.00833 33315
    a6 = -0.00019 84090
    a8 =  0.00000 27526
    a10= -0.00000 00239
    x2 = x**2
    return 1. + x2*(a2 + x2*(a4 + x2*(a6 + x2*(a8 + x2*a10))))

(or at least that's what I'd like to write). Now, if I were going to higher
precision, I'd have more digits, of course.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: authentication for xmlrpc via cgi

2005-09-22 Thread David M. Cooke
[EMAIL PROTECTED] writes:

> I'm using python 2.2 (hopefully we'll be upgrading our system to 2.3
> soon) and I'm trying to prototype some xml-rpc via cgi functionality.
> If I override the Transport class on the xmlrpclib client and add some
> random header like "Junk", then when I have my xmlrpc server log it's
> environment when running, I see the HTTP_JUNK header.  If I do this
> with AUTHORIZATION, the header is not found.
>
> Does this ring a bell for anyone?  Am I misunderstanding how to use
> this header?  I'm guessing that Apache might be eating this header, but
> I don't know why.

By default, Apache does eat that. It's a compile time default; the
Apache developers think it's a security hole. Here's a note about it:

http://httpd.apache.org/dev/apidoc/apidoc_SECURITY_HOLE_PASS_AUTHORIZATION.html

From what I can see, this is still true in Apache 2.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a way to determine -- when parsing -- if a word contains a builtin name or other imported system module name?

2005-08-04 Thread David M. Cooke
Casey Hawthorne <[EMAIL PROTECTED]> writes:

> Is there a way to determine -- when parsing -- if a word contains a
> builtin name or other imported system module name?
>
> Like "iskeyword" determines if a word is a keyword!

Look in the keyword module; there is actually an "iskeyword" function
there :)

For modules, sys.modules is a dictionary of the modules that have been
imported.
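
Putting those together, a small helper might look like this (Python 2, where
__builtin__ holds the builtin names):

import keyword
import sys
import __builtin__

def classify(word):
    if keyword.iskeyword(word):
        return "keyword"
    if hasattr(__builtin__, word):
        return "builtin"
    if word in sys.modules:
        return "imported module"
    return "something else"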

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Annoying behaviour of the != operator

2005-06-10 Thread David M. Cooke
Robert Kern <[EMAIL PROTECTED]> writes:

> greg wrote:
>> David M. Cooke wrote:
>>
>>>>To solve that, I would suggest a fourth category of "arbitrary
>>>>ordering", but that's probably Py3k material.
>>>
>>>We've got that: use hash().
>>>[1+2j, 3+4j].sort(key=hash)
>> What about objects that are not hashable?
>> The purpose of arbitrary ordering would be to provide
>> an ordering for all objects, whatever they might be.
>
> How about id(), then?
>
> And so the circle is completed...

Or something like

def uniquish_id(o):
    try:
        return hash(o)
    except TypeError:
        return id(o)

hash() should be the same across interpreter invocations, whereas id()
won't.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: without shell

2005-06-10 Thread David M. Cooke
Donn Cave <[EMAIL PROTECTED]> writes:

> In article <[EMAIL PROTECTED]>,
>  Grant Edwards <[EMAIL PROTECTED]> wrote:
>
>> On 2005-06-10, Mage <[EMAIL PROTECTED]> wrote:
>> 
>> >>py> file_list = os.popen("ls").read()
>> >>
>> >>Stores the output of ls into file_list.
>> >>
>> > These commands invoke shell indeed.
>> 
>> Under Unix, popen will not invoke a shell if it's passed a
>> sequence rather than a single string.
>
> I suspect you're thinking of the popen2 functions.
> On UNIX, os.popen is posix.popen, is a simple wrapper
> around the C library popen.  It always invokes the
> shell.
>
> The no-shell alternatives are spawnv (instead of
> system) and the popen2 family (given a sequence
> of strings.)

Don't forget the one module to rule them all, subprocess:

file_list = subprocess.Popen(['ls'], stdout=subprocess.PIPE).communicate()[0]

which by default won't use the shell (unless you pass shell=True to it).

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Annoying behaviour of the != operator

2005-06-09 Thread David M. Cooke
Greg Ewing <[EMAIL PROTECTED]> writes:

> Rocco Moretti wrote:
>
>  > This gives the wacky world where
>> "[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.
>
> To solve that, I would suggest a fourth category of "arbitrary
> ordering", but that's probably Py3k material.

We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)

Using the key= arg in sort means you can do other stuff easily of course:

by real part:
import operator
[1+2j, 3+4j].sort(key=operator.attrgetter('real'))

by size:
[1+2j, 3+4j].sort(key=abs)

and since .sort() is stable, for those numbers where the key is the
same, the order will stay the same.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: computer algebra packages

2005-06-08 Thread David M. Cooke
Fernando Perez <[EMAIL PROTECTED]> writes:

> Rahul wrote:
>
>> Hi.
>> The reason is simple enough. I plan to do some academic research
>> related to computer algebra for which i need some package which i can
>> call as a library. Since i am not going to use the package
>> myself..(rather my program will)..it will be helpful to have a python
>> package since i wanted to write the thing in python. if none is
>> available then probably i will need to work on an interface to some
>> package written in some other language or work in that language itself.
>
> I've heard of people writing a Python MathLink interface to Mathematica, which
> essentially turns Mathematica into a Python module.  But I don't have any
> references handy, sorry, and as far as I remember it was done as a private
> contract.  But it's doable.

It should also be doable with Maple, using the OpenMaple API. I've
looked at it, and it should be possible. I haven't had the time to
actually do anything, though :-)

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: editor for shelve files

2005-05-01 Thread David M. Cooke
"Amir  Michail" <[EMAIL PROTECTED]> writes:

> Hi,
>
> Is there a program for editing shelve databases?
>
> Amir

I doubt it. A shelf is basically just a file-based dictionary, where
the keys must be strings, while the values can be arbitrary objects.
An editor could handle the keys, but most likely not the values.
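
A minimal "editor" session, just to show how little there is to a shelf (the
file name and keys here are made up):

import shelve
db = shelve.open("mydata")        # hypothetical shelf file
for key in db.keys():
    print key, repr(db[key])      # keys are strings, values arbitrary objects
db["threshold"] = 0.25            # editing is just assignment
db.close()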

If you have an editor for arbitrary objects, you could probably make
an editor for shelfs easily enough :-)

Do you have a specific use in mind? That would be easier to handle
than the general case.

-- 
|>|\/|<
/------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Wrapping c functions

2005-05-01 Thread David M. Cooke
Andrew Dalke <[EMAIL PROTECTED]> writes:

> Glenn Pierce wrote:
>> if (!PyArg_ParseTuple(args, "isi", &format, filename, &flags))
>> return NULL;
>
> Shouldn't that be &filename ?  See
>   http://docs.python.org/ext/parseTuple.html
> for examples.
>
>
>> dib = FreeImage_Load(format, filename, flags);
>
>> Also I have little Idea what to return from the function. FIBITMAP * is
>> an opaque pointer
>> that I pass to other FreeImage functions, I pretty certain
>> Py_BuildValue("o", dib) is wrong.
>
> If it's truly opaque and you trust your use of the code you can
> cast it to an integer, use the integer in the Python code, and
> at the Python/C interface cast the integer back to a pointer.
> Of course if it no longer exists you'll get a segfault.
>
> If you want more type safety you can use the SWIG approach and
> encode the pointers as a string, with type information and
> pointer included.

Better yet, use a CObject. That way, a destructor can be added so as
to not leak memory. Type info could be included in the desc field.

return PyCObject_FromVoidPtr(dib, NULL)

(the NULL can be replaced with a routine that will free the image.)

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is there a package with convolution and related methods?

2005-04-21 Thread David M. Cooke
Charles Krug <[EMAIL PROTECTED]> writes:

> List:
>
> Is there a Python package with Convolution and related methods?
>
> I'm working on modeling some DSP processes in Python.  I've rolled one
> up, but don't feel much like reinventing the wheel, especially if
> there's already something like "Insanely Efficient FFT for Python"
> already.
>
> Thanks

You most certainly want to look at the numerical python packages
Numeric and numarray (http://numeric.scipy.org/) for array
manipulations, and scipy (http://scipy.org) has wraps for FFTW (Fast
Fourier Transform in the West).
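
For example, something like this (an untested sketch; scipy.signal does the
heavy lifting):

import Numeric
from scipy import signal

x = Numeric.arange(100.0)      # the signal
h = Numeric.ones(8) / 8.0      # an 8-point boxcar filter
y = signal.convolve(x, h)      # direct convolution
z = signal.fftconvolve(x, h)   # FFT-based; better for long kernels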

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python/svn issues....

2005-04-12 Thread David M. Cooke
"bruce" <[EMAIL PROTECTED]> writes:

> david...
>
> thanks for the reply...
>
> it's starting to look as though the actual /usr/lib/libdb-4.2.so from the
> rpm isn't exporting any of the symbols...
>
> when i do:
> nm /usr/lib/libdb-4.2.so | grep db_create
>
> i get
>  nm: /usr/lib/libdb-4.2.so: no symbols
>
> which is strange... because i should be getting the db_create symbol...
>
> i'll try to build berkeley db by hand and see what i get...
>
> if you could try the 'nm' command against your berkely.. i'd appreciate you
> letting me know what you get..

Not surprising; plain 'nm' doesn't work for me on shared libraries. I
need to use 'nm -D'. In that case, I get a db_create (or rather, a
versioned form, db_create_4002). Running 'nm -D -g' on the
libsvn_fs_base library shows it uses the same db_create_4002 function.

-- 
|>|\/|<
/--\
|David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python/svn issues....

2005-04-12 Thread David M. Cooke
"bruce" <[EMAIL PROTECTED]> writes:

> hi...
>
> in trying to get viewcvs up/running, i tried to do the following:
>
> [EMAIL PROTECTED] viewcvs-0.9.2]# python
> Python 2.3.3 (#1, May  7 2004, 10:31:40)
> [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import svn.repos
> Traceback (most recent call last):
>   File "", line 1, in ?
>   File
> "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages
> /svn/repos.py", line 19, in ?
>   File
> "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages
> /svn/fs.py", line 28, in ?
>   File
> "/dar/tmp/subversion-1.1.4-0.1.1.fc2.rf-root/usr/lib/python2.3/site-packages
> /libsvn/fs.py", line 4, in ?
> ImportError: /usr/lib/libsvn_fs_base-1.so.0: undefined symbol: db_create

This looks like a problem when Subversion was built: this library was
not linked against the Berkeley DB libraries.

You can check what's linked using ldd, and see the unresolved symbols
with ldd -r. For instance, on my AMD64 Debian system,

$ ldd -r /usr/lib/libsvn_fs_base-1.so.0
libsvn_delta-1.so.0 => /usr/lib/libsvn_delta-1.so.0 (0x002a95696000)
libsvn_subr-1.so.0 => /usr/lib/libsvn_subr-1.so.0 (0x002a9579f000)
libaprutil-0.so.0 => /usr/lib/libaprutil-0.so.0 (0x002a958c8000)
libldap.so.2 => /usr/lib/libldap.so.2 (0x002a959e)
liblber.so.2 => /usr/lib/liblber.so.2 (0x002a95b19000)
libdb-4.2.so => /usr/lib/libdb-4.2.so (0x002a95c27000)
libexpat.so.1 => /usr/lib/libexpat.so.1 (0x002a95e05000)
libapr-0.so.0 => /usr/lib/libapr-0.so.0 (0x002a95f29000)
librt.so.1 => /lib/librt.so.1 (0x002a9604e000)
libm.so.6 => /lib/libm.so.6 (0x002a96155000)
libnsl.so.1 => /lib/libnsl.so.1 (0x002a962dc000)
libpthread.so.0 => /lib/libpthread.so.0 (0x002a963f2000)
libc.so.6 => /lib/libc.so.6 (0x002a96506000)
libdl.so.2 => /lib/libdl.so.2 (0x002a96746000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x002a96849000)
libresolv.so.2 => /lib/libresolv.so.2 (0x002a9697c000)
libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x002a96a91000)
libgnutls.so.11 => /usr/lib/libgnutls.so.11 (0x002a96ba8000)
/lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 
(0x00552000)
libtasn1.so.2 => /usr/lib/libtasn1.so.2 (0x002a96d1b000)
libgcrypt.so.11 => /usr/lib/libgcrypt.so.11 (0x002a96e2b000)
libgpg-error.so.0 => /usr/lib/libgpg-error.so.0 (0x002a96f77000)
libz.so.1 => /usr/lib/libz.so.1 (0x002a9707b000)

If it doesn't look like that, then I'd say your Subversion package was
built badly. You may also want to run ldd on your svn binary, to see what
libraries it pulls in.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threads and variable assignment

2005-04-12 Thread David M. Cooke
Gregory Bond <[EMAIL PROTECTED]> writes:

> I've had a solid hunt through the (2.3) documentation but it seems
> silent on this issue.
>
> I have an problem that would naturally run as 2 threads:  One monitors
> a bunch of asyncrhonous external state and decides if things are
> "good" or "bad".  The second thread processes data, and the processing
> depends on the "good" or "bad" state at the time the data is processed.
>
> Sort of like this:
>
> Thread 1:
>
> global isgood
> while 1:
>   wait_for_state_change()
>   if new_state_is_good():
>   isgood = 1
>   else:
>   isgood = 0
>
> Thread 2:
>
> s = socket()
> s.connect(...)
> f = s.makefile()
> while 1:
>   l = f.readline()
>   if isgood:
>   print >> goodfile, l
>   else:
>   print >> badfile, l

You said that the processing depends on the good or bad state at the
time the data is processed: I don't know how finely-grained your state
changes will be in thread 1, but it doesn't seem that thread 2 would
notice at the right time. If the socket blocks reading a line, the
state could change i

> What guarantees (if any!) does Python make about the thread safety of
> this construct?   Is it possible for thread 2 to get an undefined
> variable if it somehow catches the microsecond when isgood is being
> updated by thread 1?

It won't be undefined, but it's possible that (in thread 1)
between the "if new_state_is_good()" and the setting of isgood that
thread 2 will execute, so if new_state_is_good() was false, then it
could still write the line to the goodfile.

It really depends on how often you have state changes, how often you
get (full) lines on your socket, and how much you care that the
correct line be logged to the right file.

If you needed this to be robust, I'd either:

- Try to rewrite wait_for_status_change()/new_state_is_good() to be
  asynchronous, particularly if wait_for_status_change() is blocking
  on some file or socket object. This way you could hook it into
  asynchat/asyncore or Twisted without any threads.

- Or, if you need to use threads, use a Queue.Queue object where
  timestamps w/ state changes are pushed on in thread 1, and popped
  off and analysed before logging in thread 2. (Or something; this
  just popped in my head.)
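
A rough sketch of that last idea, reusing the placeholder functions from your
pseudocode (so not runnable as-is):

import Queue
import time

state_q = Queue.Queue()

# Thread 1: push (timestamp, isgood) whenever the state changes
def monitor():
    while 1:
        wait_for_state_change()                    # placeholder from your code
        state_q.put((time.time(), new_state_is_good()))

# Thread 2: drain any pending state changes before deciding where to log
def logger():
    isgood = 1
    while 1:
        l = f.readline()                           # f as in your code
        while not state_q.empty():
            stamp, isgood = state_q.get()
        if isgood:
            print >> goodfile, l
        else:
            print >> badfile, l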

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: curious problem with large numbers

2005-04-07 Thread David M. Cooke
Chris Fonnesbeck <[EMAIL PROTECTED]> writes:

> I have been developing a python module for Markov chain Monte Carlo
> estimation, in which I frequently compare variable values with a very
> large number, that I arbitrarily define as:
>
> inf = 1e10000
>
> However, on Windows (have tried on Mac, Linux) I get the following behaviour:
>
>>>> inf = 1e10000
>>>> inf
> 1.0
>
> while I would have expected:
>
> 1.#INF
>
> Smaller numbers, as expected, yield:
>
>>>> inf = 1e100
>>>> inf
> 1e+100
>
> Obviously, I cannot use the former to compare against large (but not
> infinite) numbers, at least not to get the result I expect. Has anyone
> seen this behaviour?

I don't do Windows, so I can't say this will work, but try

>>> inf = 1e308*2

I think your problem is how the number is being parsed; perhaps
Windows is punting on all those zeros? Computing the infinity may or
may not work, but it gets around anything happening in parsing.

Alternatively, if you have numarray installed (which you should
probably consider if you're doing numerical stuff ;-) you could use

>>> import numarray.ieeespecial
>>> numarray.ieeespecial.plus_inf
inf

(there's minus_inf, nan, plus_zero, and minus_zero also)

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: richcmpfunc semantics

2005-04-07 Thread David M. Cooke
harold fellermann <[EMAIL PROTECTED]> writes:

> Thank you Greg,
>
> I figured most of it out in the meantime, myself. I only differ
> from you in one point.
>
>>> What has to be done, if the function is invoked for an operator
>>> I don't want to define?
>>
>> Return Py_NotImplemented. (Note that's return, *not* raise.)
>
> I used
>
> PyErr_BadArgument();
> return NULL;
>
> instead. What is the difference between the two and which one
> is to prefer.

If you do it your way you're a bad neighbour: If your object is the
first one (left-hand side) of the operator, it will prevent the other
object from handling the case if it can. This is the same advice as
for all of the other operators (__add__, etc.)

Consider the pure-python version:

class A:
    def __init__(self, what_to_do='return'):
        self.what_to_do = what_to_do
    def __eq__(self, other):
        print 'A.__eq__'
        if self.what_to_do == 'return':
            return NotImplemented
        else:
            raise Exception

class B:
    def __eq__(self, other):
        print 'B.__eq__'
        return True

>>> a = A('return')
>>> b = B()
>>> a == b
A.__eq__
B.__eq__
True
>>> b == a
B.__eq__
True
>>> a == a
A.__eq__
A.__eq__
A.__eq__
A.__eq__
True

So the B class handles the case where A doesn't know what to do. Also
note the last case, where Python falls back on id() comparisions to
determine equality.

Now, compare with this:

>>> a = A('raise')
>>> b = B()
>>> a == b
A.__eq__
Traceback (most recent call last):
  File "", line 1, in ?
  File "x.py", line 9, in __eq__
raise Exception
Exception
>>> b == a
B.__eq__
True
>>> a == a
A.__eq__
Traceback (most recent call last):
  File "", line 1, in ?
  File "x.py", line 9, in __eq__
raise Exception
Exception

So now comparing A and B objects can fail. If you *know* A and B
objects can't be compared for equality, it'd be ok to raise a
TypeError, but that should be after a type test.

> Also, do you need to increment the reference count
> of Py_NotImeplemented before returning it?

Yes; it's a singleton like Py_None.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Gnuplot.py and, _by far_, the weirdest thing I've ever seen on my computer

2005-04-04 Thread David M. Cooke
"syd" <[EMAIL PROTECTED]> writes:

> I don't even know where to begin.  This is just bizarre.  I just picked
> up the Gnuplot.py module (a light interface to gnuplot commands) and
> was messing around with it today.
>
> I've got a tiny script, but it only works from the command line about
> half the time!  In the python interpreter, 100%.   Ipython, 100%.  I'm
> not kidding.
>
> #!/bin/env python
> import Gnuplot
> g = Gnuplot.Gnuplot(debug=1)
> g.title('A simple example')
> g('set data style linespoints')
> g('set terminal png small color')
> g('set output "myGraph.png"')
> g.plot([[0,1.1], [1,5.8], [2,3.3], [3,100]])
>
> Here's just one example -- it does not work, then it works.  It seems
> totally random.  It will work a few times, then it won't for a few
> times...
>
> bash-2.05b$ ./myGnu.py
> gnuplot> set title "A simple example"
> gnuplot> set data style linespoints
> gnuplot> set terminal png small color
> gnuplot> set output "myGraph.png"
> gnuplot> plot '/tmp/tmp5LXAow' notitle
>
> gnuplot> plot '/tmp/tmp5LXAow' notitle
>   ^
>  can't read data file "/tmp/tmp5LXAow"
>  line 0: util.c: No such file or directory
>
> bash-2.05b$ ./myGnu.py
> gnuplot> set title "A simple example"
> gnuplot> set data style linespoints
> gnuplot> set terminal png small color
> gnuplot> set output "myGraph.png"
> gnuplot> plot '/tmp/tmpHMTkpL' notitle
>
> (and it makes the graph image just fine)
>
> I mean what the hell is going on?  My permissions on /tmp are wide open
> (drwxrwxrwt).  It does the same thing when I run as root.  And it
> _always_ works when I use the interpreter or interactive python.
>
> Any clues would be greatly appreciated.  I'm baffled.

What's your OS? Python version? Gnuplot.py version (I assume 1.7)?
Put a 'import sys; print sys.version' in there to make sure /bin/env
is using the same python as you expect it to.

It looks like any temporary file it's writing to is deleted too early.

Have a look at gp_unix.py in the Gnuplot source. There's some
customization options that might be helpful. In particular, I'd try

import Gnuplot
Gnuplot.GnuplotOpts.prefer_fifo_data = 0

... then the data will be save to a temporary file instead of piped
through a fifo.

Alternatively, try
Gnuplot.GnuplotOpts.prefer_inline_data = 1

... then no file will be used.

[I don't use Gnuplot myself; this is just what I came up with after a
few minutes of looking at it]

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: numeric module

2005-04-01 Thread David M. Cooke
"coffeebug" <[EMAIL PROTECTED]> writes:

> I cannot import "numarray" and I cannot import "numeric" using python
> 2.3.3

numarray and Numeric are separate modules available at 
http://numpy.sourceforge.net/

If you're doing anything numerical in Python, you'll want them :-)

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: numeric module

2005-04-01 Thread David M. Cooke
[EMAIL PROTECTED] writes:

> Hello,
> What's the problem with this code? I get the following error message:
>
>  File "test.py", line 26, in test
> print tbl[wi][bi]
> IndexError: index must be either an int or a sequence
>
> ---code snippet
>
> from Numeric import *
> tbl = zeros((32, 16))
>
> def test():
>
> val = testme()
> wi = val >> 4
> bi = val & 0xFL
[above changed to use val instead of crc, as you mentioned in another post]
> print wi
> print bi
> print tbl[wi][bi]

tbl[wi][bi] would be indexing the bi'th element of whatever tbl[wi]
returns. For Numeric arrays, you need

tbl[wi,bi]

Now, you'll have another problem as Terry Reedy mentioned: the indices
(in Numeric) need to be Python ints, not longs. You could rewrite your
test() function as

def test():
    val = testme()
    wi = int(val >> 4)
    bi = int(val & 0xF)
    print wi
    print bi
    print tbl[wi,bi]

and that'll work.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: itertools to iter transition

2005-03-30 Thread David M. Cooke
Steven Bethard <[EMAIL PROTECTED]> writes:

> Terry Reedy wrote:
>>>But if classmethods are intended to provide alternate constructors
>> But I do not remember that being given as a reason for
>> classmethod().  But I am not sure what was.
>
> Well I haven't searched thoroughly, but I know one place that it's
> referenced is in descrintro[1]:
>
> "Factoid: __new__ is a static method, not a class method. I initially
> thought it would have to be a class method, and that's why I added the
> classmethod primitive. Unfortunately, with class methods, upcalls
> don't work right in this case, so I had to make it a static method
> with an explicit class as its first argument. Ironically, there are
> now no known uses for class methods in the Python distribution (other
> than in the test suite).

Not true anymore, of course (it was in 2.2.3). In 2.3.5, UserDict,
tarfile and some the Mac-specific module use classmethod, and the
datetime extension module use the C version (the METH_CLASS flag).

And staticmethod (and METH_STATIC) aren't used at all in 2.3 or 2.4 :-)
[if you ignore __new__]
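
For reference, the alternate-constructor use being discussed looks roughly
like this (pre-decorator spelling, as in 2.3; the class is made up):

import math

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def frompolar(cls, r, theta):
        # alternate constructor; works for subclasses too, since cls is passed in
        return cls(r * math.cos(theta), r * math.sin(theta))
    frompolar = classmethod(frompolar)

p = Point.frompolar(1.0, 0.5)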

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: breaking up is hard to do

2005-03-25 Thread David M. Cooke
"bbands" <[EMAIL PROTECTED]> writes:

> For example I have a class named Indicators. If I cut it out and put it
> in a file call Ind.py then "from Ind import Indicators" the class can
> no longer see my globals. This is true even when the import occurs
> after the config file has been read and parsed.

Don't use globals? Or put all the globals into a separate module,
which you import into Ind and into whatever uses Ind.

Putting the globals into a separate namespace (module, class, class
instance, whatever) also makes it easier to know what is a global :-)
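
Something like this is what I mean (module and names made up):

# settings.py -- holds what used to be globals
period = 20
data_dir = "/tmp/data"

# Ind.py
import settings

class Indicators:
    def lookback(self):
        return settings.period    # every user of Ind sees the same value

# main program
import settings, Ind
settings.period = 50              # e.g. set from your parsed config file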

-- 
|>|\/|<
/------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using python to extend a python app

2005-03-24 Thread David M. Cooke
dataangel <[EMAIL PROTECTED]> writes:

> I'm writing a python app that works as a replacement for the menu that
> comes with most minimalist wms when you right click the root window.
> It's prettier and written completely in python.
>
> I'd like to provide hooks or some system so that people can write
> their own extensions to the app, for example adding fluxbox options,
> and then fluxbox users can choose to use that extension. But I'm not
> sure how to implement it.
>
> Right now the best idea I have is to have all desired extensions in a
> folder, import each .py file in that folder as a module using
> __import__, and then call some predetermined method, say "start", and
> pass it the menu as it exists so far so they can add to it,
> start(menu). This seems kind of hackish.

That looks pretty reasonable, and easy. There have been some recent
threads (in the past month or so) on plugins, so you might want to
search the archives. Most of it's revolved around not using exec :-)

I just had a look at pyblosxom (one program that I know that uses
plugins), and it uses exactly this approach, with some extra frills:
looking in subdirectories, for instance.
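
A bare-bones version of that approach (assuming the plugin directory is on
sys.path and each plugin defines a start(menu) function):

import os

def load_plugins(plugin_dir, menu):
    for fname in os.listdir(plugin_dir):
        if fname.endswith(".py") and not fname.startswith("_"):
            name = fname[:-3]
            module = __import__(name)    # no exec needed
            module.start(menu)           # the predetermined entry point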

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Passing arguments to python from URL

2005-03-22 Thread David M. Cooke
Casey Bralla <[EMAIL PROTECTED]> writes:

> I've got a python cgi-bin application which produces an apache web page.  I
> want to pass arguments to it on the URL line, but the parameters are not
> getting passed along to python properly.
>
> I've been using sys.argv to pick up command line arguments, and it works
> fine when I call the python program from the command line.  Unfortunately,
> when I pass data to the program from the URL, many of the parameters are
> being clobbered and **NOT** passed to python.
>
> For example:  "http://www.nobody.com/cgi-bin/program.py?sort=ascending"; only
> passes the parameter "/usr/lib/cgi-bin/program.py".

This is expected.

> However, "http://www.nobody.com/cgi-bin/program.py?sort%20ascending"; passes
> a 2-place tuple of ("/usr/lib/cgi-bin/program.py", "sort
> ascending").

I don't know why this actually works, it's not (AFAIK) defined behaviour.

> Somehow, adding the "=" in the argument list prevents **ANY** parameters
> from being passed to python.  I could re-write the python program to work
> around this, but I sure would like to understand it first.

You're going to have to rewrite. CGI scripts get their arguments
passed to them through the environment, not on the command line.
QUERY_STRING, for instance, will hold the query string (the stuff
after the ?).

Use Python's cgi module to make things easier on yourself; the
documentation has a good overview:
http://www.python.org/doc/2.4/lib/module-cgi.html

In this case, your script would look something like this:

import cgi
form = cgi.FieldStorage()
if form.getvalue('sort') == 'ascending':
    ... sort in ascending order ...

etc.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to handle repetitive regexp match checks

2005-03-17 Thread David M. Cooke
Matt Wette <[EMAIL PROTECTED]> writes:

> Over the last few years I have converted from Perl and Scheme to
> Python.  There one task that I do often that is really slick in Perl
> but escapes me in Python.  I read in a text line from a file and check
> it against several regular expressions and do something once I find a match.
> For example, in perl ...
>
>  if ($line =~ /struct {/) {
>do something
>  } elsif ($line =~ /typedef struct {/) {
>do something else
>  } elsif ($line =~ /something else/) {
>  } ...
>
> I am having difficulty doing this cleanly in python.  Can anyone help?
>
>  rx1 = re.compile(r'struct {')
>  rx2 = re.compile(r'typedef struct {')
>  rx3 = re.compile(r'something else')
>
>  m = rx1.match(line)
>  if m:
>do something
>  else:
>m = rx2.match(line)
>if m:
>  do something
>else:
>  m = rx3.match(line)
>  if m:
> do something
>   else:
> error

I usually define a class like this:

class Matcher:
    def __init__(self, text):
        self.m = None
        self.text = text
    def match(self, pat):
        self.m = pat.match(self.text)
        return self.m
    def __getitem__(self, name):
        return self.m.group(name)

Then, use it like

for line in fo:
    m = Matcher(line)
    if m.match(rx1):
        do something
    elif m.match(rx2):
        do something
    else:
        error

-- 
|>|\/|<
David M. Cooke
cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: SAX parsing problem

2005-03-15 Thread David M. Cooke
anon <[EMAIL PROTECTED]> writes:

> So I've encountered a strange behavior that I'm hoping someone can fill
> me in on.  i've written a simple handler that works with one small
> exception, when the parser encounters a line with '&' in it, it
> only returns the portion that follows the occurence.  
>
> For example, parsing a file with the line :
> mykeysome%20&%20value
>
> results in getting "%20value" back from the characters method, rather
> than "some%20&%20value".
>
> After looking into this a bit, I found that SAX supports entities and
> that it is probably believing the & to be an entity and processing
> it in some way that i'm unware of.  I'm using the default
> EntityResolver.

Are you sure you're not actually getting three chunks: "some%20", "&",
and "%20value"? The xml.sax.handler.ContentHandler.characters method
(which I presume you're using for SAX, as you don't mention!) is not
guaranteed to get all contiguous character data in one call. Also check
if .skippedEntity() methods are firing.
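
The usual fix is to accumulate the chunks and only use the text once the
element ends, e.g. (a minimal sketch):

from xml.sax import handler

class TextCollector(handler.ContentHandler):
    def startElement(self, name, attrs):
        self.buf = []
    def characters(self, data):
        # may fire several times for one text node, e.g. around '&'
        self.buf.append(data)
    def endElement(self, name):
        text = "".join(self.buf)
        # ... do something with the complete text ...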

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: binutils "strings" like functionality?

2005-03-03 Thread David M. Cooke
"cjl" <[EMAIL PROTECTED]> writes:

> Fredrik Lundh wrote:
>
>> something like this could work:
>>
>> import re
>>
>> text = open(file, "rb").read()
>>
>> for m in re.finditer("([\x20-\x7f]{4,})[\n\0]", text):
>> print m.start(), repr(m.group(1))
>
> Hey...that worked. I actually modified:
>
> for m in re.finditer("([\x20-\x7f]{4,})[\n\0]", text):
>
> to
>
> for m in re.finditer("([\x20-\x7f]{4,})", text):
>
> and now the output is nearly identical to 'strings'. One problem
> exists, in that if the binary file contains a string
> "monkey/chicken/dog/cat" it is printed as "mokey//chicken//dog//cat",
> and I don't know enough to figure out where the extra "/" is coming
> from.

Are you sure it's monkey/chicken/dog/cat, and not
monkey\chicken\dog\cat? The latter one will print monkey\\chicken...
because of the repr() call.

Also, you probably want it as [\x20-\x7e] (the DEL character \x7f
isn't printable). You're also missing tabs (\t).

The GNU binutils string utility looks for \t or [\x20-\x7e].
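
So a closer match to strings(1) would be something like:

import re

text = open(filename, "rb").read()   # same file as in the earlier snippet
for m in re.finditer(r"([\t\x20-\x7e]{4,})", text):
    print m.start(), repr(m.group(1))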

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to write python plug-ins for your own python program?

2005-03-03 Thread David M. Cooke
Simon Wittber <[EMAIL PROTECTED]> writes:

>> You mean like 'import'? :)
>
> That's how I would do it. It's the simplest thing, that works.
>
> exec("import %s as plugin" % pluginName)
> plugin.someMethod()
>
> where pluginName is the name of the python file, minus the ".py" extension.

You'd better hope someone doesn't name their plugin
'os; os.system("rm -rf /"); import sys'

Use __import__ instead.
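
That is, roughly:

plugin = __import__(pluginName)   # pluginName checked against known plugins first
plugin.someMethod()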

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [ANN] Python 2.4 Quick Reference available

2005-02-19 Thread David M. Cooke
"Pete Havens" <[EMAIL PROTECTED]> writes:

> The is awesome! Thanks. I did notice one thing while reading it. In the
> "File Object" section, it states:
>
> "Created with built-in functions open() [preferred] or its alias
> file()."
>
> ...this seems to be the opposite of the Python documentation:
>
> "The file() constructor is new in Python 2.2. The previous spelling,
> open(), is retained for compatibility, and is an alias for file()."

Except if you look at the current development docs
(http://www.python.org/dev/doc/devel/lib/built-in-funcs.html) it says

"""
The file() constructor is new in Python 2.2 and is an alias for
open(). Both spellings are equivalent. The intent is for open() to
continue to be preferred for use as a factory function which returns a
new file object. The spelling, file is more suited to type testing
(for example, writing "isinstance(f, file)").
"""

... which more accurately reflects what I believe the consensus is
about the usage of open vs. file.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: LinearAlgebra incredibly slow for eigenvalue problems

2005-01-28 Thread David M. Cooke
"drife" <[EMAIL PROTECTED]> writes:

> Hi David,
>
> I performed the above check, and sure enough, Numeric
> is --not-- linked to the ATLAS libraries.
>
> I followed each of your steps outlined above, and Numeric
> still is not linking to the ATLAS libraries.
>
> My setup.py file is attached below.

> # delete all but the first one in this list if using your own LAPACK/BLAS
> sourcelist = [os.path.join('Src', 'lapack_litemodule.c')]
> # set these to use your own BLAS;
>
> library_dirs_list = ['/d2/lib/atlas']
> libraries_list = ['lapack', 'ptcblas', 'ptf77blas', 'atlas', 'g2c']
>
> # set to true (1), if you also want BLAS optimized 
> matrixmultiply/dot/innerproduct
> use_dotblas = 1
> include_dirs = ['/d2/include']

This all look right (assuming you've got the right stuff in /d2).

When it compiles, does it look like it's actually doing the linking?
After doing python setup.py build, you can run ldd on the libraries in
the build directory (something like build/lib.linux-i386-2.3/lapack_lite.so).
If that's linked, then it's not being installed right.

You don't have a previous Numeric installation that's being picked up
instead of the one you're trying to install, do you?

At the interpreter prompt, check that
>>> import Numeric
>>> Numeric.__file__

gives you something you're expecting, and not something else.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: LinearAlgebra incredibly slow for eigenvalue problems

2005-01-28 Thread David M. Cooke
"drife" <[EMAIL PROTECTED]> writes:

> Hello,
>
> I need to calculate the eigenvectors and eigenvalues for a 3600 X 3600
> covariance matrix.
>
> The LinearAlgebra package in Python is incredibly slow to perform the
> above calculations (about 1.5 hours). This in spite of the fact that
> I have installed Numeric with the full ATLAS and LAPACK libraries.
>
> Also note that my computer has dual Pentium IV (3.1 GHz) processors
> with 2Gb ram.
>
> Every Web discussion I have seen about such issues indicates that
> one can expect huge speed ups if one compiles and installs Numeric
> linked against the ATLAS and LAPACK libraries.

Are you *sure* that Numeric is linked against these?

> Even more perplexing is that the same calculation takes a mere 7 min
> in Matlab V6.5. Matlab uses both ATLAS and LAPACK.
>
> Moreover, the above calculation takes the same amount of time for
> Numeric to complete with --and-- without ATLAS and PACK. I am certain
> that I have done the install correctly.

This is good evidence that Numeric *isn't* linked to them.

If you're on a Linux system, you can check with ldd:
[EMAIL PROTECTED] ldd /usr/lib/python2.3/site-packages/Numeric/lapack_lite.so 
liblapack.so.3 => /usr/lib/atlas/liblapack.so.3 (0x002a95677000)
libblas.so.3 => /usr/lib/atlas/libblas.so.3 (0x002a95e55000)
libg2c.so.0 => /usr/lib/libg2c.so.0 (0x002a96721000)
libpthread.so.0 => /lib/libpthread.so.0 (0x002a96842000)
libc.so.6 => /lib/libc.so.6 (0x002a96957000)
libm.so.6 => /lib/libm.so.6 (0x002a96b96000)
/lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 
(0x00552000)

You can see that lapack and blas (the Atlas versions) are linked to
the lapack_lite.so.

To install Numeric using Lapack:
- remove the build/ directory in your Numeric sources, so you don't pick
  up any old binaries
- edit setup.py and follow the comments on using Lapack (you need to
  comment out a few lines, and set some directories)
  Also set use_dotblas to 1.
- do the 'python setup.py build', 'python setup.py install' dance.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: MMTK Install Problem

2005-01-26 Thread David M. Cooke
Justin Lemkul <[EMAIL PROTECTED]> writes:

> Hello All,
>
> I am hoping that someone out there will be able to help me.  During the 
> "build" phase of MMTK installation, I receive the following series of errors:
>
> $ python setup.py build
> running build
> running build_py
> running build_ext
> building 'lapack_mmtk' extension
> gcc -fno-strict-aliasing -Wno-long-double -no-cpp-precomp -mno-fused-madd
> -fno-common
> -dynamic -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -DLIBM_HAS_ERFC
> -DEXTENDED_TYPES -IInclude
> -I/System/Library/Frameworks/Python.framework/Versions/
> 2.3/include/python2.3 -c Src/lapack_mmtk.c -o
> build/temp.darwin-7.7.0-Power_Macintosh
> -2.3/Src/lapack_mmtk.o
> Src/lapack_mmtk.c:2:33: Numeric/arrayobject.h: No such file or directory

Always look at the first error :-) GCC is awful for, when it can't
find an include file, saying it can't, then spewing millions of error
messages afterwards that are a direct result of not having stuff declared.

In this case, it's obvious that you don't have Numeric installed
correctly; the header files should be picked from one of the
directories specified by the -I flags in the gcc invocation above.

> I am attempting the install on a Mac OS X v10.3 with Python v2.3, NumPy 
> v23.1, 
> and SciPy v2.4.3

(You mean ScientificPython, not SciPy, right? Scipy is at 0.3.2)

How did you install Numeric? The newest version is 23.7. It should be
real easy to upgrade to that, as that version picks up Apple's vecLib
framework for the linear algebra routines. Just do the usual 'python
setup.py build', 'sudo python setup.py install'. That should put the
header files where the MMTK installation expects them.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pickling extension class

2005-01-18 Thread David M. Cooke
harold fellermann <[EMAIL PROTECTED]> writes:

> Hi all,
>
> I have a problem pickling an extension class. As written in the
> Extending/Embedding Manual, I
> provided a function __reduce__ that returns the appropreate tuple.
> This seams to work fine,
> but I still cannot pickle because of the following error:
>
>  >>> from model import hyper
>  >>> g = hyper.PeriodicGrid(4,4,1)
>  >>> g.__reduce__()
> (,(4.,4.,1.))
>  >>> import pickle
>  >>> pickle.dump(g,file("test","w"))
> Traceback (most recent call last):
>File "pickle_test.py", line 5, in ?
>  pickle.dump(g,file("test","w"))
>File "/sw/lib/python2.4/pickle.py", line 1382, in dump
>  Pickler(file, protocol, bin).dump(obj)
>File "/sw/lib/python2.4/pickle.py", line 231, in dump
>  self.save(obj)
>File "/sw/lib/python2.4/pickle.py", line 338, in save
>  self.save_reduce(obj=obj, *rv)
>File "/sw/lib/python2.4/pickle.py", line 414, in save_reduce
>  save(func)
>File "/sw/lib/python2.4/pickle.py", line 293, in save
>  f(self, obj) # Call unbound method with explicit self
>File "/sw/lib/python2.4/pickle.py", line 760, in save_global
>  raise PicklingError(
pickle.PicklingError: Can't pickle <type 'hyper.PeriodicGrid'>: it's
> not found as hyper.PeriodicGrid
>  >>> dir(hyper)
> ['Dir', 'Neighbors', 'PeriodicGrid', 'PeriodicPos', '__doc__',
> '__file__', '__name__', 'refcount']
>  >>> hyper.PeriodicGrid
> <type 'hyper.PeriodicGrid'>
         ^

I think that's your error. The extension type is declared to be
hyper.PeriodicGrid, where it actually is model.hyper.PeriodicGrid
(because hyper is in the model package).

Pickle stores g.__class__.__module__ (which is "hyper") and
g.__class__.__name__ (="PeriodicGrid") to find the class object for
reimporting, and on unpickling, tries to do __import__("hyper"), which
fails.
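
In Python terms, what pickle does to save the class reference is roughly:

import sys
module = g.__class__.__module__   # "hyper" here, but it should be "model.hyper"
name = g.__class__.__name__       # "PeriodicGrid"
__import__(module)                # fails: there is no top-level "hyper" module
klass = getattr(sys.modules[module], name)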

The tp_name slot of your extension type should be "model.hyper.PeriodicGrid".

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pyrex-0.9.3: definition mismatch with distutils of Python24

2005-01-13 Thread David M. Cooke
[EMAIL PROTECTED] (Martin Bless) writes:

> Now that I've got my extension building machine using the VC++ Toolkit
> 2003 up and running I'm keen on using Pyrex (Pyrex-0.9.3,
> Python-2.4.0).
>
> But the definition of the swig_sources() method seems to have changed.
>
> When I try to build the examples from Pyrex I get a TypeError:
>
>
> c:\Pyrex-0.9.3\Demos> python Setup.py build_ext --inplace
> running build_ext
> building 'primes' extension
> [...]
>   File "C:\Python24\lib\distutils\command\build_ext.py", line 442, in
> build_extension
> sources = self.swig_sources(sources, ext)
> TypeError: swig_sources() takes exactly 2 arguments (3 given)
>
>
> I can see that Pyrex.Distutils.build_ext.py subclasses
> distutils.command.build_ext.build_ext, and the number of arguments of
> the swig_sources method seems to have changed.
>
> Pyrex uses:
>
>   def swig_sources (self, sources):
>
> whereas the distutils use:
>
>   def swig_sources (self, sources, extension):
>
> If I just add the "extension" arg to the Pyrex definitions everything
> seems to work. But I have to admit that I don't really know what I'm
> doing here and I feel sorry I can't contribute more than just
> reporting the error.

Yep, that's it. Greg must know by now, it's been reported a few times.
You'll want to change it to

def swig_sources(self, sources, extension=None):

so that if you use an older python it won't complain about missing
arguments.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: OT: MoinMoin and Mediawiki?

2005-01-11 Thread David M. Cooke
Paul Rubin <http://[EMAIL PROTECTED]> writes:

> Alexander Schremmer <[EMAIL PROTECTED]> writes:

>> > lists of incoming links to wiki pages,
>> 
>> It does.
>
> Huh?  I don't see those.  How does it store them, that's resilient
> across crashes?  Or does it just get wedged if there's a crash?

Most Wiki implementations (MoinMoin included) have this, by using a
search. Usually, following the original Wiki (http://c2.com/cgi/wiki)
model, you get at it by clicking on the title of the page.

Searching instead of indexing makes it very resilient :-)

-- 
|>|\/|<
/----------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: readline, rlcompleter

2005-01-11 Thread David M. Cooke
[EMAIL PROTECTED] writes:

> This is a case where the documentation is lacking. The standard library
> documentation
> (http://www.python.org/dev/doc/devel/lib/module-rlcompleter.html) gives
> this example
> try:
>     import readline
> except ImportError:
>     print "Module readline not available."
> else:
>     import rlcompleter
>     readline.parse_and_bind("tab: complete")
>
> but I don't find a list of recognized key bindings. For instance, I
> would
> like to bind shift-tab to rlcompleter, is that possible? Can I use
> function
> keys? I did various attempts, but I did not succeed :-(
> Is there any readline-guru here with some good pointers?
> Michele Simionato

Basically, you can use any key sequence that the terminal actually sends
to the program. So shift-tab is out (the terminal doesn't send it as a
distinct key sequence).

Function keys would have to be specified as the key sequence sent by a
function key ("\e[11~" for F1, for instance).

Have a look at the readline info page, or the man page. The syntax of
readline.parse_and_bind is the same as that of an inputrc file.
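
For example, something along these lines should bind F1 to completion
(an untested sketch; the exact escape sequence depends on your
terminal):

import readline, rlcompleter
readline.parse_and_bind(r'"\e[11~": complete')   # F1 on xterm-like terminals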

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python3: on removing map, reduce, filter

2005-01-10 Thread David M. Cooke
Steven Bethard <[EMAIL PROTECTED]> writes:
> Some timings to verify this:
>
> $ python -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
> 1000 loops, best of 3: 693 usec per loop
>
> $ python -m timeit -s "[x*x for x in range(1000)]"
> 1000 loops, best of 3: 0.0505 usec per loop

Maybe you should compare apples with apples, instead of oranges :-)
You're only running the list comprehension in the setup code...

$ python2.4 -m timeit -s "def square(x): return x*x" "map(square, range(1000))"
1000 loops, best of 3: 464 usec per loop
$ python2.4 -m timeit "[x*x for x in range(1000)]"
1000 loops, best of 3: 216 usec per loop

So a factor of 2, instead of 13700 ...

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: why not datetime.strptime() ?

2005-01-10 Thread David M. Cooke
Joshua Spoerri <[EMAIL PROTECTED]> writes:

> Skip Montanaro  pobox.com> writes:
>> josh> Shouldn't datetime have strptime?
>> If someone wants to get their feet wet with extension module
>> programming
>> this might be a good place to start.  Mostly, I think nobody who has
>> needed/wanted it so far has the round tuits available to spend on the
>> task.
>
> OK, it was pretty straightforward. Thanks for the direction.
>
> To whom should I send the patch (attached)?

Submit it to the patch tracker on sourceforge.

But first, some constructive criticism:

> --- Modules/datetimemodule.c.orig 2003-10-20 10:34:46.0 -0400
> +++ Modules/datetimemodule.c  2005-01-10 20:58:38.884823296 -0500
> @@ -3774,6 +3774,32 @@
>   return result;
>  }
>  
> +/* Return new datetime from time.strptime(). */
> +static PyObject *
> +datetime_strptime(PyObject *cls, PyObject *args)
> +{
> + PyObject *result = NULL, *obj, *module;
> + const char *string, *format;
> +
> + if (!PyArg_ParseTuple(args, "ss:strptime", &string, &format))
> + return NULL;
> + if ((module = PyImport_ImportModule("time")) == NULL)
> + return NULL;
> + obj = PyObject_CallMethod(module, "strptime", "ss", string, format);
> + Py_DECREF(module);

You don't check for errors: if PyObject_CallMethod raises an exception,
it returns NULL, and you go on to use obj == NULL.

If there's a module in sys.path called time that overrides the stdlib
time, things will fail, and you should be able to catch that.

> + result = PyObject_CallFunction(cls, "iii",
> + PyInt_AsLong(PySequence_GetItem(obj, 0)),
> + PyInt_AsLong(PySequence_GetItem(obj, 1)),
> + PyInt_AsLong(PySequence_GetItem(obj, 2)),
> + PyInt_AsLong(PySequence_GetItem(obj, 3)),
> + PyInt_AsLong(PySequence_GetItem(obj, 4)),
> + PyInt_AsLong(PySequence_GetItem(obj, 5)),
> + PyInt_AsLong(PySequence_GetItem(obj, 6)));

Are you positive those PySequence_GetItem calls will succeed? That
they will return Python integers?

> + Py_DECREF(obj);
> + return result;
> +}
> +
>  /* Return new datetime from date/datetime and time arguments. */
>  static PyObject *
>  datetime_combine(PyObject *cls, PyObject *args, PyObject *kw)
> @@ -4385,6 +4411,11 @@
>PyDoc_STR("timestamp -> UTC datetime from a POSIX timestamp "
>  "(like time.time()).")},
>  
> + {"strptime", (PyCFunction)datetime_strptime,
> +  METH_VARARGS | METH_CLASS,
> +  PyDoc_STR("strptime -> new datetime parsed from a string"
> +"(like time.strptime()).")},
> +
>   {"combine", (PyCFunction)datetime_combine,
>METH_VARARGS | METH_KEYWORDS | METH_CLASS,
>PyDoc_STR("date, time -> datetime with same date and time fields")},

It would probably also help to write some documentation to add to the
datetime module documentation.
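
For comparison, a rough Python-level version of what the method should
end up doing (just a sketch of the semantics, not a substitute for the
C patch):

import time

def datetime_strptime(cls, string, format):
    # keep only year, month, day, hour, minute, second; the remaining
    # fields of the time tuple don't map onto datetime()
    return cls(*time.strptime(string, format)[:6])

e.g. datetime_strptime(datetime.datetime, "2005-01-10", "%Y-%m-%d")
would give datetime.datetime(2005, 1, 10, 0, 0).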

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: date/time

2005-01-05 Thread David M. Cooke
Thomas Guettler <[EMAIL PROTECTED]> writes:

> On Wed, 05 Jan 2005 15:08:37 +0100, Nader Emami wrote:
>
>> L.S.,
>> 
>> Could somebody help me how I can get the next format of date
>> from the time module?
>
> I don't understand your question. Do you want to have the next day?
>
> 20041231 --> 20050101 ?
>
> You can do it like this:
>  - parse the string with time.strptime
>  - timetuple[2]+=1
>  - mktime(timetuple) # --> secs
>  - strftime(localtime(secs))

Or using the datetime module:

import time, datetime

tt = time.strptime('20041231', '%Y%m%d')
t = datetime.date.fromtimestamp(time.mktime(tt))
# now in an easier-to-handle form than the time tuple
t += datetime.timedelta(days=1)
print t.strftime('%Y%m%d')


-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: getattr() woes

2004-12-28 Thread David M. Cooke
[EMAIL PROTECTED] (Aahz) writes:

> In article <[EMAIL PROTECTED]>,
> Thomas Rast  <[EMAIL PROTECTED]> wrote:
>>
>>class dispatcher:
>>    # ...
>>    def __getattr__(self, attr):
>>        return getattr(self.socket, attr)
>>
>>>>> import asyncore
>>>>> class Peer(asyncore.dispatcher):
>>...     def _get_foo(self):
>>...         # caused by a bug, several stack levels deeper
>>...         raise AttributeError('hidden!')
>>...     foo = property(_get_foo)
>>...
>
> You're not supposed to use properties with classic classes.

Even if dispatcher was a new-style class, you still get the same
behaviour (or misbehaviour) -- Peer().foo still raises AttributeError
with the wrong message.

A simple workaround is to put a try ... except AttributeError block in
his _get_foo(), which would re-raise with a different error that
wouldn't be caught by getattr. You could even write a property
replacement for that:

>>> class HiddenAttributeError(Exception):
...     pass
>>> def robustprop(fget):
...     def wrapped_fget(self):
...         try:
...             return fget(self)
...         except AttributeError, e:
...             raise HiddenAttributeError(*e.args)
...     return property(fget=wrapped_fget)

Ideally, I think the better way is if getattr, when raising
AttributeError, somehow reused the old traceback (which would point
out the original problem). I don't know how to do that, though.
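
One thing that does work is keeping the original traceback when
re-raising from wrapped_fget, using the three-argument form of raise
(a sketch along the lines of the code above):

import sys

def robustprop(fget):
    def wrapped_fget(self):
        try:
            return fget(self)
        except AttributeError, e:
            # re-raise under a different class, but keep the original
            # traceback so the real source of the failure is visible
            raise HiddenAttributeError, e.args, sys.exc_info()[2]
    return property(fget=wrapped_fget)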

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Namespaces and the timeit module

2004-12-14 Thread David M. Cooke
Roy Smith <[EMAIL PROTECTED]> writes:

> I'm playing with the timeit module, and can't figure out how to time a 
> function call.  I tried:
>
> def foo ():
>     x = 4
>     return x
>
> t = timeit.Timer ("foo()")
> print t.timeit()
>
> and quickly figured out that the environment the timed code runs under 
> is not what I expected:
>
> Traceback (most recent call last):
>   File "./d.py", line 10, in ?
> print t.timeit()
>   File "/usr/local/lib/python2.3/timeit.py", line 158, in timeit
> return self.inner(it, self.timer)
>   File "<timeit-src>", line 6, in inner
> NameError: global name 'foo' is not defined
>
> In fact, trying to time "print dir()" gets you:
>
> ['_i', '_it', '_t0', '_timer']
>
> It seems kind of surprising that I can't time functions.  Am I just not 
> seeing something obvious?

Like the documentation for Timer? :-)

class Timer([stmt='pass' [, setup='pass' [, timer=<timer function>]]])

You can't refer to names defined elsewhere in your script; you have to
define them in the setup argument (as a string). Like this:


define_foo = '''
def foo():
    x = 4
    return x
'''

t = timeit.Timer("foo()", setup=define_foo)
print t.timeit()


One common idiom I've seen is to put your definition of foo() in a
module (say x.py), then, from the command line:

$ python -m timeit -s 'from x import foo' 'foo()'

(the -m is for python 2.4 to run the timeit module; use the full path
to timeit.py instead for earlier pythons)

Alternatively, the examples in the timeit module documentation show
another way to time functions defined in a module.

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Distutils vs. Extension header files

2004-12-10 Thread David M. Cooke
Mike Meyer <[EMAIL PROTECTED]> writes:

> I've got a package that includes an extension that has a number of
> header files in the directory with the extension. They are specified
> as "depends = [...]" in the Extension class. However, Distutils
> doesn't seem to do anything with them.
>
> If I do an sdist, the include files aren't added to the tarball.
>
> If I do a bdist_rpm, the source files get copied into the build
> directory and the build starts, but the header files aren't copied
> with the source file, so the build fails with a missing header file.
>
> I find it hard to believe that this is a bug in distutils, so I'd
> appreciate it if someone could tell me what I'm doing wrong.

vincent has the solution (you need to specify them in MANIFEST.in),
but I'll add my 2 cents.

depends = [...] is used when building (it's like dependencies in make):
if one of those files changes, distutils will rebuild the extension.
But that's all distutils does with it. It's braindead about what it
puts in the source distribution: files listed in depends, data files,
and other stuff you'd think it would include all get left out. When in
doubt, add it to MANIFEST.in.
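
For instance (a hypothetical sketch; the package and file names are
made up), the Extension lists the headers in depends so rebuilds work,
but MANIFEST.in still needs its own line so sdist picks them up:

# setup.py (sketch)
from distutils.core import setup, Extension

ext = Extension('mypkg._fast',
                sources=['src/fast.c'],
                depends=['src/fast.h'])   # only triggers rebuilds

setup(name='mypkg', version='0.1', ext_modules=[ext])

# MANIFEST.in would need something like:
#   recursive-include src *.h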

-- 
|>|\/|<
/--\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
-- 
http://mail.python.org/mailman/listinfo/python-list