Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs

2011-09-27 Thread Darren Dale
What is the status of this proposal?

On Wed, Jun 22, 2011 at 6:56 PM, Mark Wiebe  wrote:
> On Wed, Jun 22, 2011 at 4:57 PM, Darren Dale  wrote:
>>
>> On Wed, Jun 22, 2011 at 1:31 PM, Mark Wiebe  wrote:
>> > On Wed, Jun 22, 2011 at 7:34 AM, Lluís  wrote:
>> >>
>> >> Darren Dale writes:
>> >>
>> >> > On Tue, Jun 21, 2011 at 1:57 PM, Mark Wiebe 
>> >> > wrote:
>> >> >> On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris
>> >> >>  wrote:
>> >> >>> How does the ufunc get called so it doesn't get caught in an
>> >> >>> endless
>> >> >>> loop?
>> >>
>> >> > [...]
>> >>
>> >> >> The function being called needs to ensure this, either by extracting
>> >> >> a
>> >> >> raw
>> >> >> ndarray from instances of its class, or adding a 'subok = False'
>> >> >> parameter
>> >> >> to the kwargs.
>> >>
>> >> > I didn't understand, could you please expand on that or show an
>> >> > example?
>> >>
>> >> As I understood the initial description and examples, the ufunc
>> >> overload
>> >> will keep being used as long as its arguments are of classes that
>> >> declare ufunc overrides (i.e., classes with the "_numpy_ufunc_"
>> >> attribute).
>> >>
>> >> Thus Mark's comment saying that you have to either transform the
>> >> arguments into raw ndarrays (either by creating new ones or passing a
>> >> view) or use the "subok = False" kwarg parameter to break a possible
>> >> overloading loop.
>> >
>> > The sequence of events is something like this:
>> > 1. You call np.sin(x)
>> > 2. The np.sin ufunc looks at x, sees the _numpy_ufunc_ attribute, and
>> > calls
>> > x._numpy_ufunc_(np.sin, x)
>> > 3. _numpy_ufunc_ uses np.sin.name (which is "sin") to get the correct
>> > my_sin
>> > function to call
>> > 4A. If my_sin called np.sin(x), we would go back to 1. and get an
>> > infinite
>> > loop
>> > 4B. If x is a subclass of ndarray, my_sin can call np.sin(x,
>> > subok=False),
>> > as this disables the subclass overloading mechanism.
>> > 4C. If x is not a subclass of ndarray, x needs to produce an ndarray,
>> > for
>> > instance it might have an x.arr property. Then it can call np.sin(x.arr)
>>
>> Ok, that seems straightforward and, for what it's worth, it looks like
>> it would meet my needs. However, I wonder if the _numpy_ufunc_
>> mechanism is the best approach. This is a bit sketchy, but why not do
>> something like:
>>
>> class Proxy:
>>
>>    def __init__(self, ufunc, *args):
>>        self._ufunc = ufunc
>>        self._args = args
>>
>>    def __call__(self, func):
>>        self._ufunc._registry[tuple(type(arg) for arg in self._args)] = func
>>        return func
>>
>>
>> class UfuncObject:
>>
>>    ...
>>
>>    def __call__(self, *args, **kwargs):
>>        func = self._registry.get(tuple(type(arg) for arg in args), None)
>>        if func is None:
>>            raise TypeError
>>        return func(*args, **kwargs)
>>
>>    def register(self, *args):
>>        return Proxy(self, *args)
>>
>>
>> @np.sin.register(Quantity)
>> def sin(pq):
>>    if pq.units != degrees:
>>        pq = pq.rescale(degrees)
>>    temp = np.sin(pq.view(np.ndarray))
>>    return Quantity(temp, copy=False)
>>
>> This way, classes don't have to implement special methods to support
>> ufunc registration, special attributes to claim primacy in ufunc
>> registration lookup, special versions of the functions for each numpy
>> ufunc, *and* the logic to determine whether the combination of
>> arguments is supported. By that I mean, if I call np.sum with a
>> quantity and a masked array, and Quantity wins the __array_priority__
>> competition, then I also need to check that my special sum function(s)
>> know how to operate on that combination of inputs. With the decorator
>> approach, I just need to implement the special versions of the ufuncs,
>> and the decorators handle the logic of knowing what combinations of
>> arguments are supported.
>>
>> It might be worth considering using ABCs for registration and have
>> UfuncObject use isinstance to determine the appropriate special
>> function to call.

Re: [Numpy-discussion] X11 system info

2011-07-21 Thread Darren Dale
On Wed, Jul 20, 2011 at 4:58 AM, Pauli Virtanen  wrote:
> Tue, 19 Jul 2011 21:55:28 +0200, Ralf Gommers wrote:
>> On Sun, Jul 17, 2011 at 11:48 PM, Darren Dale 
>> wrote:
>>> In numpy.distutils.system info:
>>>
>>>    default_x11_lib_dirs = libpaths(['/usr/X11R6/lib','/usr/X11/lib',
>>>                                     '/usr/lib'], platform_bits)
>>>    default_x11_include_dirs = ['/usr/X11R6/include','/usr/X11/include',
>>>                                '/usr/include']
>>>
>>> These defaults won't work on the forthcoming Ubuntu 11.10, which
>>> installs X into /usr/lib/X11 and /usr/include/X11.
>
> Did you check that some compilation fails because of this?
> If not, how did you find the information that the location is changed?

I discovered the problem when I tried to build the entire Enthought
Tool Suite from source on a Kubuntu-11.10 pre-release system. Even
after changing the paths to point at the right location, there are
other problems, as seen from this traceback for building Enable:

/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525:
UserWarning: Specified path /usr/local/include/python2.7 is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525:
UserWarning: Specified path /usr/include/suitesparse/python2.7 is
invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525:
UserWarning: Specified path  is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py:525:
UserWarning: Specified path /usr/lib/X1164 is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
Traceback (most recent call last):
  File "setup.py", line 56, in 
config = configuration().todict()
  File "setup.py", line 48, in configuration
config.add_subpackage('kiva')
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py",
line 972, in add_subpackage
caller_level = 2)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py",
line 941, in get_subpackage
caller_level = caller_level + 1)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py",
line 878, in _get_configuration_from_setup_py
config = setup_module.configuration(*args)
  File "kiva/setup.py", line 27, in configuration
config.add_subpackage('agg')
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py",
line 972, in add_subpackage
caller_level = 2)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py",
line 941, in get_subpackage
caller_level = caller_level + 1)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/misc_util.py",
line 878, in _get_configuration_from_setup_py
config = setup_module.configuration(*args)
  File "kiva/agg/setup.py", line 235, in configuration
x11_info = get_info('x11', notfound_action=2)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py",
line 308, in get_info
return cl().get_info(notfound_action)
  File "/usr/lib/pymodules/python2.7/numpy/distutils/system_info.py",
line 459, in get_info
raise self.notfounderror(self.notfounderror.__doc__)
numpy.distutils.system_info.X11NotFoundError: X11 libraries not found.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] X11 system info

2011-07-17 Thread Darren Dale
In numpy.distutils.system info:

default_x11_lib_dirs = libpaths(['/usr/X11R6/lib','/usr/X11/lib',
 '/usr/lib'], platform_bits)
default_x11_include_dirs = ['/usr/X11R6/include','/usr/X11/include',
'/usr/include']

These defaults won't work on the forthcoming Ubuntu 11.10, which
installs X into /usr/lib/X11 and /usr/include/X11.
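
Perhaps it would be enough to append the new locations to the existing
defaults -- untested, and the extra entries are only my guess at a fix
(appending outside the libpaths call, so the platform-bits suffix does
not get tacked onto the new directory):

    default_x11_lib_dirs = libpaths(['/usr/X11R6/lib', '/usr/X11/lib',
                                     '/usr/lib'], platform_bits)
    default_x11_lib_dirs.append('/usr/lib/X11')
    default_x11_include_dirs = ['/usr/X11R6/include', '/usr/X11/include',
                                '/usr/include', '/usr/include/X11']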

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs

2011-06-22 Thread Darren Dale
On Wed, Jun 22, 2011 at 1:31 PM, Mark Wiebe  wrote:
> On Wed, Jun 22, 2011 at 7:34 AM, Lluís  wrote:
>>
>> Darren Dale writes:
>>
>> > On Tue, Jun 21, 2011 at 1:57 PM, Mark Wiebe  wrote:
>> >> On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris
>> >>  wrote:
>> >>> How does the ufunc get called so it doesn't get caught in an endless
>> >>> loop?
>>
>> > [...]
>>
>> >> The function being called needs to ensure this, either by extracting a
>> >> raw
>> >> ndarray from instances of its class, or adding a 'subok = False'
>> >> parameter
>> >> to the kwargs.
>>
>> > I didn't understand, could you please expand on that or show an example?
>>
>> As I understood the initial description and examples, the ufunc overload
>> will keep being used as long as its arguments are of classes that
>> declare ufunc overrides (i.e., classes with the "_numpy_ufunc_"
>> attribute).
>>
>> Thus Mark's comment saying that you have to either transform the
>> arguments into raw ndarrays (either by creating new ones or passing a
>> view) or use the "subok = False" kwarg parameter to break a possible
>> overloading loop.
>
> The sequence of events is something like this:
> 1. You call np.sin(x)
> 2. The np.sin ufunc looks at x, sees the _numpy_ufunc_ attribute, and calls
> x._numpy_ufunc_(np.sin, x)
> 3. _numpy_ufunc_ uses np.sin.name (which is "sin") to get the correct my_sin
> function to call
> 4A. If my_sin called np.sin(x), we would go back to 1. and get an infinite
> loop
> 4B. If x is a subclass of ndarray, my_sin can call np.sin(x, subok=False),
> as this disables the subclass overloading mechanism.
> 4C. If x is not a subclass of ndarray, x needs to produce an ndarray, for
> instance it might have an x.arr property. Then it can call np.sin(x.arr)
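
To make the sequence concrete, here is a minimal sketch of steps 1-4C
(the Degrees class is invented for illustration, and _numpy_ufunc_ is
the proposed hook, not something numpy calls today):

    import numpy as np

    class Degrees(object):
        def __init__(self, arr):
            # wrap a plain ndarray holding values in degrees
            self.arr = np.asarray(arr, dtype=float)

        def _numpy_ufunc_(self, ufunc, *args, **kwargs):
            # step 3: dispatch on the ufunc's name
            # (the proposal spells this np.sin.name)
            if ufunc.__name__ == 'sin':
                return self._sin(*args, **kwargs)
            raise TypeError('no override for %s' % ufunc.__name__)

        def _sin(self, x):
            # step 4C: hand a raw ndarray back to np.sin, converting
            # to radians first, so the override cannot recurse
            return np.sin(np.radians(x.arr))

Under the proposal, np.sin(Degrees([90.0])) would reach _sin via step
2, and the inner np.sin call sees only a plain ndarray.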

Ok, that seems straightforward and, for what it's worth, it looks like
it would meet my needs. However, I wonder if the _numpy_ufunc_
mechanism is the best approach. This is a bit sketchy, but why not do
something like:

class Proxy:

    def __init__(self, ufunc, *args):
        self._ufunc = ufunc
        self._args = args

    def __call__(self, func):
        self._ufunc._registry[tuple(type(arg) for arg in self._args)] = func
        return func


class UfuncObject:

    ...

    def __call__(self, *args, **kwargs):
        func = self._registry.get(tuple(type(arg) for arg in args), None)
        if func is None:
            raise TypeError
        return func(*args, **kwargs)

    def register(self, *args):
        return Proxy(self, *args)


@np.sin.register(Quantity)
def sin(pq):
    if pq.units != degrees:
        pq = pq.rescale(degrees)
    temp = np.sin(pq.view(np.ndarray))
    return Quantity(temp, copy=False)

This way, classes don't have to implement special methods to support
ufunc registration, special attributes to claim primacy in ufunc
registration lookup, special versions of the functions for each numpy
ufunc, *and* the logic to determine whether the combination of
arguments is supported. By that I mean, if I call np.sum with a
quantity and a masked array, and Quantity wins the __array_priority__
competition, then I also need to check that my special sum function(s)
know how to operate on that combination of inputs. With the decorator
approach, I just need to implement the special versions of the ufuncs,
and the decorators handle the logic of knowing what combinations of
arguments are supported.

It might be worth considering using ABCs for registration and have
UfuncObject use isinstance to determine the appropriate special
function to call.
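
For instance, a minimal sketch of the isinstance-based lookup, reusing
the hypothetical _registry from the code above:

    class UfuncObject:

        ...

        def __call__(self, *args, **kwargs):
            # take the first registered signature whose types match
            # the arguments via isinstance, so ABCs and subclasses
            # are picked up without an exact-type registry entry
            for types, func in self._registry.items():
                if len(types) == len(args) and all(
                        isinstance(a, t) for a, t in zip(args, types)):
                    return func(*args, **kwargs)
            raise TypeError('no registered implementation')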

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs

2011-06-22 Thread Darren Dale
On Tue, Jun 21, 2011 at 1:57 PM, Mark Wiebe  wrote:
> On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris
>  wrote:
>> How does the ufunc get called so it doesn't get caught in an endless loop?

[...]

> The function being called needs to ensure this, either by extracting a raw
> ndarray from instances of its class, or adding a 'subok = False' parameter
> to the kwargs.

I didn't understand, could you please expand on that or show an example?

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] replacing the mechanism for dispatching ufuncs

2011-06-21 Thread Darren Dale
On Tue, Jun 21, 2011 at 2:28 PM, Charles R Harris
 wrote:
>
>
> On Tue, Jun 21, 2011 at 11:57 AM, Mark Wiebe  wrote:
>>
>> On Tue, Jun 21, 2011 at 12:36 PM, Charles R Harris
>>  wrote:
>>>
>>>
>>> On Mon, Jun 20, 2011 at 12:32 PM, Mark Wiebe  wrote:

 NumPy has a mechanism built in to allow subclasses to adjust or override
 aspects of the ufunc behavior. While this goal is important, this mechanism
 only allows for very limited customization, making for instance the masked
 arrays unable to work with the native ufuncs in a full and proper way. I
 would like to deprecate the current mechanism, in particular
 __array_prepare__ and __array_wrap__, and introduce a new method I will
 describe below. If you've ever used these mechanisms, please review this
 design to see if it meets your needs.

>>>
>>> The current approach is at a dead end, so something better needs to be
>>> done.
>>>

 Any class type which would like to override its behavior in ufuncs would
 define a method called _numpy_ufunc_, and optionally an attribute
 __array_priority__ as can already be done. The class which wins the 
 priority
 battle gets its _numpy_ufunc_ function called as follows:

 return arr._numpy_ufunc_(current_ufunc, *args, **kwargs)

 To support this overloading, the ufunc would get a new support method,
 result_type, and there would be a new global function, 
 broadcast_empty_like.
 The function ufunc.result_type behaves like the global np.result_type,
 but produces the output type or a tuple of output types specific to the
 ufunc, which may follow a different convention than regular arithmetic type
 promotion. This allows for a class to create an output array of the correct
 type to pass to the ufunc if it needs to be different than the default.
 The function broadcast_empty_like is just like empty_like, but takes a
 list or tuple of arrays which are to be broadcast together for producing 
 the
 output, instead of just one.
>>>
>>> How does the ufunc get called so it doesn't get caught in an endless
>>> loop? I like the proposed method if it can also be used for classes that
>>> don't subclass ndarray. Masked array, for instance, should probably not
>>> subclass ndarray.
>>
>> The function being called needs to ensure this, either by extracting a raw
>> ndarray from instances of its class, or adding a 'subok = False' parameter
>> to the kwargs. Supporting objects that aren't ndarray subclasses is one of
>> the purposes for this approach, and neither of my two example cases
>> subclassed ndarray.
>
> Sounds good. Many of the current uses of __array_wrap__ that I am aware of
> are in the wrappers in the linalg module and don't go through the ufunc
> machinery. How would that be handled?

I contributed the __array_prepare__ method a while back so classes
could raise errors before the array data is modified in place.
Specifically, I was concerned about units support in my quantities
package (http://pypi.python.org/pypi/quantities). But I agree that
this approach needs to be reconsidered. It would be nice for
subclasses to have an opportunity to intercept and process the values
passed to a ufunc on their way in. For example, it would be nice if
when I did np.cos(1.5 degrees), my subclass could intercept the value
and pass a new one on to the ufunc machinery that is expressed in
radians. I thought PJ Eby's generic functions PEP would be a really
good way to handle ufuncs, but the PEP has stagnated.

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] python3 setup.py install fails with git maint/1.6.x

2011-04-04 Thread Darren Dale
On Mon, Apr 4, 2011 at 3:31 PM, Ralf Gommers
 wrote:
> On Mon, Apr 4, 2011 at 9:15 PM, Darren Dale  wrote:
>> I just checked out the 1.6 branch and attempted to install with python3:
>
> I hope you mean the 1.6.0b1 tarball, not the current branch head? This
> problem is (or should have been) fixed.

It was the branch HEAD...

> Just tried again with python3.2 and 1.6.0b2, installs fine. The line
> it fails on is only reached when a numpy/version.py exists, which is
> the case for source releases or if you did not clean your local git
> repo before building.

... but I think this was the case. I just deleted numpy/version.py,
and build/, and now everything is ok.

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] python3 setup.py install fails with git maint/1.6.x

2011-04-04 Thread Darren Dale
I just checked out the 1.6 branch and attempted to install with python3:

RefactoringTool: Line 695: You should use a for loop here
Running from numpy source directory.Traceback (most recent call last):
  File "setup.py", line 196, in 
setup_package()
  File "setup.py", line 170, in setup_package
write_version_py()
  File "setup.py", line 117, in write_version_py
from numpy.version import git_revision as GIT_REVISION
ImportError: cannot import name git_revision

Next, I built and installed with python2, with no problems. Then I
attempted to install with python3 again, at which point git_revision
was importable, presumably because it was provided during the python2
build.

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nearing a milestone

2011-04-01 Thread Darren Dale
On Fri, Apr 1, 2011 at 4:08 PM, Benjamin Root  wrote:
> Whoosh!
>
> Ben Root
>
> P.S. -- In case it is needed to be said, that is 1e6 downloads from
> sourceforge only.  NumPy is now on github...

The releases are still distributed through sourceforge.

Maybe the SciPy2011 Program Committee could prepare a mounted gold
record to mark the occasion. I could probably dig up and donate an old
copy of the Footloose soundtrack.

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] nearing a milestone

2011-04-01 Thread Darren Dale
Numpy is nearing a milestone:
http://sourceforge.net/projects/numpy/files/NumPy/stats/timeline?dates=2007-09-25+to+2011-04-01
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] When was the ddof kwarg added to std()?

2011-03-16 Thread Darren Dale
On Wed, Mar 16, 2011 at 9:10 AM, Scott Sinclair
 wrote:
> On 16 March 2011 14:52, Darren Dale  wrote:
>> Does anyone know when the ddof kwarg was added to std()? Has it always
>> been there?
>
> Does 'git log --grep=ddof' help?

Yes: March 7, 2008

Thanks
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] When was the ddof kwarg added to std()?

2011-03-16 Thread Darren Dale
Does anyone know when the ddof kwarg was added to std()? Has it always
been there?

Thanks,
Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] core library structure

2011-02-04 Thread Darren Dale
On Fri, Feb 4, 2011 at 2:23 PM, Lluís  wrote:
> Darren Dale writes:
>
>> With generic functions, you wouldn't have to remember to use the ufunc
>> provided by masked array for one type, or the default numpy for
>> another type.
>
> Sorry, but I don't see how generic functions should be a better approach
> compared to redefining methods on masked_array [1]. In both cases you
> have to define them one-by-one.
>
> [1] assuming 'np.foo' and 'ma.foo' (which would now be obsolete) simply
>    call 'instance.foo', which in the ndarray level is the 'foo' ufunc
>    object.

That's a bad assumption. np.ndarray.__add__ actually calls the np.add
ufunc, not the other way around. For example, np.arcsin is a ufunc
that operates on ndarray, yet there is no ndarray.arcsin method.
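
A quick session makes the direction of the call explicit:

    >>> import numpy as np
    >>> a = np.arange(3)
    >>> a + a              # ndarray.__add__ delegates to the np.add ufunc
    array([0, 2, 4])
    >>> np.add(a, a)       # the same ufunc, called directly
    array([0, 2, 4])
    >>> hasattr(np.ndarray, 'arcsin')
    False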

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] core library structure

2011-02-03 Thread Darren Dale
On Thu, Feb 3, 2011 at 2:07 PM, Mark Wiebe  wrote:
> Moving this to a new thread.
> On Thu, Feb 3, 2011 at 10:50 AM, Charles R
> Harris  wrote:
>>
>> On Thu, Feb 3, 2011 at 11:07 AM, Mark Wiebe  wrote:
[...]
>>> Yeah, I understand it's the result of organic growth and merging from
>>> many different sources. The core library should probably become layered in a
>>> manner roughly as follows, with each layer depending only on the previous
>>> APIs.  This is what Travis was getting at, I believe, with the generator
>>> array idea affecting mainly the Storage and Iteration APIs, generalizing
>>> them so that new storage and iteration methods can be plugged in.
>>> Data Type API: data type numbers, casting, byte-swapping, etc.
>>> Array Storage API: array creation/manipulation/deletion.
>>> Array Iteration API: array iterators, copying routines.
>>> Array Calculation API: typedefs for various types of calculation
>>> functions, common calculation idioms, ufunc creation API, etc.
>>> Then, the ufunc module would use the Array Calculation API to implement
>>> all the ufuncs and other routines like inner, dot, trace, diag, tensordot,
>>> einsum, etc.
>>
>> I like the lower two levels if, as I assume, they are basically aimed at
>> allocating, deallocating blocks of memory (or equivalent) and doing basic
>> manipulations such as dealing with endianess and casting. Where do you see
>> array methods making an appearance?
>
> That's correct. Currently, for example, the cast functions take array
> objects as parameters, something that would no longer be the case.  The
> array methods vs functions only shows up in the Python exposure, I believe.
>  The above structure only affects the C library, and how its exposed to
> Python could remain as it is now.
>>
>> The original Numeric only had three (IIRC) rather basic methods and
>> everything else was function based, an approach which is probably easier to
>> maintain. The extensive use of methods came from numarray and might be
>> something that could be added at a higher level so that the current ndarrays
>> would be objects combining low level arrays and ufuncs.

Concerning ufuncs: I wonder if we could consider generic functions as
a replacement for the current __array_prepare__/__array_wrap__
mechanism. For example, if I have an ndarray, a masked array, and
quantity, and I want to multiply the three together, it would be great
to be able to do so with two calls to a single mul ufunc.

Also, a generic function approach might provide a better mechanism to
allow changes to the arrays on their way into the ufunc:

import numpy as np
import quantities as pq
a = [1,2,3]*pq.deg # yields a subclass of ndarray
np.sin(a)

This is not currently possible with __array_prepare__/__array_wrap__,
because __array_prepare__ is called too late by the ufunc to process
the input array and rescale it to radians. I suggested on this
list that it might be possible to so with the addition of a *third*
method, call it __input_prepare__... at which point Chuck rightly
complained that things were getting way out of hand.

Imagine if we could do something like

@np.sin.register(pq.Quantity)
def my_sin(q):
    return np.sin.default(q.rescale('radians'))

np.sin(a)

With generic functions, you wouldn't have to remember to use the ufunc
provided by masked array for one type, or the default numpy for
another type.
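
A rough sketch of what such a generic function wrapper might look like
(np.sin.register and np.sin.default above are wishful thinking, so the
names and the class below are hypothetical too):

    import numpy as np

    class GenericUfunc(object):
        def __init__(self, default):
            self.default = default      # the plain numpy ufunc
            self._registry = {}

        def register(self, cls):
            # decorator that records an override for instances of cls
            def decorator(func):
                self._registry[cls] = func
                return func
            return decorator

        def __call__(self, x, *args, **kwargs):
            # fall back to the plain ufunc for unregistered types
            func = self._registry.get(type(x), self.default)
            return func(x, *args, **kwargs)

    sin = GenericUfunc(np.sin)

With something like that, the my_sin example above would work
unchanged via sin.register(pq.Quantity).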

This is something I have been meaning to revisit on the list for a
while (along with the possibility of merging quantities into numpy),
but keep forgetting to do so.

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Strange problem with h5py and numpy

2010-12-28 Thread Darren Dale
On Mon, Dec 27, 2010 at 12:58 PM, Johannes Korn  wrote:
> Hi,
>
> I have a strange problem with h5py or with numpy.

I think this question belongs on the h5py mailing list.

> I try to read a bunch of hdf files in a loop. The problem is that I get
> an error at the second file because the file handle is of type
> <Closed HDF5 file>

The code you posted only involves one file.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] problem with numpy/cython on python3, ok with python2

2010-11-20 Thread Darren Dale
I just installed numpy for both python2 and 3 from an up-to-date
checkout of the 1.5.x branch.

I am attempting to cythonize the following code with cython-0.13:

---
cimport numpy as np
import numpy as np

def test():
   cdef np.ndarray[np.float64_t, ndim=1] ret
   ret_arr = np.zeros((20,), dtype=np.float64)
   ret = ret_arr
---

I have this setup.py file:

---
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

import numpy

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [
        Extension(
            "test_open", ["test_open.pyx"], include_dirs=[numpy.get_include()]
        )
    ]
)
---

When I run "python setup.py build_ext --inplace", everything is fine.
When I run "python3 setup.py build_ext --inplace", I get an error:

running build_ext
cythoning test_open.pyx to test_open.c

Error converting Pyrex file to C:

...
# For use in situations where ndarray can't replace PyArrayObject*,
# like PyArrayObject**.
pass

ctypedef class numpy.ndarray [object PyArrayObject]:
cdef __cythonbufferdefaults__ = {"mode": "strided"}
^


/home/darren/.local/lib/python3.1/site-packages/Cython/Includes/numpy.pxd:173:49:
"mode" is not a buffer option

Error converting Pyrex file to C:

...
cimport numpy as np
import numpy as np


def test():
   cdef np.ndarray[np.float64_t, ndim=1] ret
   ^


/home/darren/temp/test/test_open.pyx:6:8: 'ndarray' is not a type identifier
building 'test_open' extension
gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC
-I/home/darren/.local/lib/python3.1/site-packages/numpy/core/include
-I/usr/include/python3.1 -c test_open.c -o
build/temp.linux-x86_64-3.1/test_open.o
test_open.c:1: error: #error Do not use this file, it is the result of
a failed Cython compilation.
error: command 'gcc' failed with exit status 1


Is this a bug, or is there a problem with my example?

Thanks,
Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] seeking advice on a fast string->array conversion

2010-11-19 Thread Darren Dale
I am wrapping up a small package to parse a particular ascii-encoded
file format generated by a program we use heavily here at the lab. (In
the unlikely event that you work at a synchrotron, and use Certified
Scientific's "spec" program, and are actually interested, the code is
currently available at
https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
.)

I have been benchmarking the project against another python package
developed by a colleague, which is an extension module written in pure
C. My python/cython project takes about twice as long to parse and
index a file (~0.8 seconds for 100MB), which is acceptable. However,
actually converting ascii strings to numpy arrays, which is done using
numpy.fromstring,  takes a factor of 10 longer than the extension
module. So I am wondering about the performance of np.fromstring:

import time
import numpy as np
s = b'1 ' * 2048 *1200
d = time.time()
x = np.fromstring(s)
print time.time() - d
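
(I accidentally hit send too soon -- as the follow-up below shows, the
intended text-parsing call is

    x = np.fromstring(s, dtype='d', sep=' ')

since np.fromstring(s) with no sep reads the buffer as binary doubles.)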
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] seeking advice on a fast string->array conversion

2010-11-19 Thread Darren Dale
Apologies, I accidentally hit send...

On Tue, Nov 16, 2010 at 9:20 AM, Darren Dale  wrote:
> I am wrapping up a small package to parse a particular ascii-encoded
> file format generated by a program we use heavily here at the lab. (In
> the unlikely event that you work at a synchrotron, and use Certified
> Scientific's "spec" program, and are actually interested, the code is
> currently available at
> https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
> .)
>
> I have been benchmarking the project against another python package
> developed by a colleague, which is an extension module written in pure
> C. My python/cython project takes about twice as long to parse and
> index a file (~0.8 seconds for 100MB), which is acceptable. However,
> actually converting ascii strings to numpy arrays, which is done using
> numpy.fromstring,  takes a factor of 10 longer than the extension
> module. So I am wondering about the performance of np.fromstring:

import time
import numpy as np
s = b'1 ' * 2048 *1200
d = time.time()
x = np.fromstring(s, dtype='d', sep=b' ')
print time.time() - d

That takes about 1.3 seconds on my machine. A similar metric for the
extension module is to load 1200 of these 2048-element arrays from the
file:

d=time.time()
x=[s.mca(i+1) for i in xrange(1200)]
print time.time()-d

That takes about 0.127 seconds on my machine. This discrepancy is
unacceptable for my usecase, so I need to develop an alternative to
fromstring. Here is bit of testing with cython:

import time

cdef extern from 'stdlib.h':
    double atof(char*)

py_string = '100'
cdef char* c_string = py_string
cdef int i, j
i=0
j=2048*1200

d = time.time()
while i < j:
    val = atof(c_string)
    i += 1
print time.time() - d

That loop takes 0.33 seconds to execute, which is a good start. I need
some help converting this example to return an actual numpy array.
Could anyone please offer a suggestion?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] seeking advice on a fast string->array conversion

2010-11-19 Thread Darren Dale
On Tue, Nov 16, 2010 at 10:31 AM, Darren Dale  wrote:
> On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen  wrote:
>> Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
>> [clip]
>>> That loop takes 0.33 seconds to execute, which is a good start. I need
>>> some help converting this example to return an actual numpy array. Could
>>> anyone please offer a suggestion?
>>
>> Easiest way is probably to use ndarray buffers and resize them when
>> needed. For example:
>>
>> https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980
>
> Thank you Pauli. That makes it *incredibly* simple:
>
> import time
> cimport numpy as np
> import numpy as np
>
> cdef extern from 'stdlib.h':
>    double atof(char*)
>
>
> def test():
>    py_string = '100'
>    cdef char* c_string = py_string
>    cdef int i, j
>    cdef double val
>    i = 0
>    j = 2048*1200
>    cdef np.ndarray[np.float64_t, ndim=1] ret
>
>    ret_arr = np.empty((2048*1200,), dtype=np.float64)
>    ret = ret_arr
>
>    d = time.time()
>    while i < j:
>        c_string = py_string
>        ret[i] = atof(c_string)
>        i += 1
>    ret_arr.shape = (1200, 2048)
>    print ret_arr, ret_arr.shape, time.time()-d
>
> The loop now takes only 0.11 seconds to execute. Thanks again.
>

One follow-up issue: I can't cythonize this code for python-3. I've
installed numpy with the most recent changes to the 1.5.x maintenance
branch, then re-installed cython-0.13, but when I run "python3
setup.py build_ext --inplace" with this setup script:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

import numpy

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [
        Extension(
            "test_open", ["test_open.pyx"], include_dirs=[numpy.get_include()]
        )
    ]
)


I get the following error. Any suggestions what I need to fix, or
should I report it to the cython list?

$ python3 setup.py build_ext --inplace
running build_ext
cythoning test_open.pyx to test_open.c

Error converting Pyrex file to C:

...
# For use in situations where ndarray can't replace PyArrayObject*,
# like PyArrayObject**.
pass

ctypedef class numpy.ndarray [object PyArrayObject]:
cdef __cythonbufferdefaults__ = {"mode": "strided"}
^


/Users/darren/.local/lib/python3.1/site-packages/Cython/Includes/numpy.pxd:173:49:
"mode" is not a buffer option

Error converting Pyrex file to C:

...
   cdef char* c_string = py_string
   cdef int i, j
   cdef double val
   i = 0
   j = 2048*1200
   cdef np.ndarray[np.float64_t, ndim=1] ret
   ^


/Users/darren/temp/test/test_open.pyx:16:8: 'ndarray' is not a type identifier
building 'test_open' extension
/usr/bin/gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g
-fwrapv -O3 -Wall -Wstrict-prototypes
-I/Users/darren/.local/lib/python3.1/site-packages/numpy/core/include
-I/opt/local/Library/Frameworks/Python.framework/Versions/3.1/include/python3.1
-c test_open.c -o build/temp.macosx-10.6-x86_64-3.1/test_open.o
test_open.c:1:2: error: #error Do not use this file, it is the result
of a failed Cython compilation.
error: command '/usr/bin/gcc-4.2' failed with exit status 1
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] seeking advice on a fast string->array conversion

2010-11-19 Thread Darren Dale
Sorry, I accidentally hit send long before I was finished writing. But
to answer your question, they contain many 2048-element multi-channel
analyzer spectra.

Darren

On Tue, Nov 16, 2010 at 9:26 AM, william ratcliff
 wrote:
> Actually,
> I do use spec when I have synchotron experiments.  But why are your files so
> large?
>
> On Nov 16, 2010 9:20 AM, "Darren Dale"  wrote:
>> I am wrapping up a small package to parse a particular ascii-encoded
>> file format generated by a program we use heavily here at the lab. (In
>> the unlikely event that you work at a synchrotron, and use Certified
>> Scientific's "spec" program, and are actually interested, the code is
>> currently available at
>> https://github.com/darrendale/praxes/tree/specformat/praxes/io/spec/
>> .)
>>
>> I have been benchmarking the project against another python package
>> developed by a colleague, which is an extension module written in pure
>> C. My python/cython project takes about twice as long to parse and
>> index a file (~0.8 seconds for 100MB), which is acceptable. However,
>> actually converting ascii strings to numpy arrays, which is done using
>> numpy.fromstring, takes a factor of 10 longer than the extension
>> module. So I am wondering about the performance of np.fromstring:
>>
>> import time
>> import numpy as np
>> s = b'1 ' * 2048 *1200
>> d = time.time()
>> x = np.fromstring(s)
>> print time.time() - d
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] seeking advice on a fast string->array conversion

2010-11-16 Thread Darren Dale
On Tue, Nov 16, 2010 at 11:46 AM, Christopher Barker
 wrote:
> On 11/16/10 7:31 AM, Darren Dale wrote:
>> On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen  wrote:
>>> Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
>>> [clip]
>>>> That loop takes 0.33 seconds to execute, which is a good start. I need
>>>> some help converting this example to return an actual numpy array. Could
>>>> anyone please offer a suggestion?
>
> Darren,
>
> It's interesting that you found fromstring() so slow -- I've put some
> time into trying to get fromfile() and fromstring() to be a bit more
> robust and featurefull, but found it to be some really painful code to
> work on -- but it didn't dawn on my that it would be slow too! I saw all
> the layers of function calls, but I still thought that would be minimal
> compared to the actual string parsing. I guess not. Shows that you never
> know where your bottlenecks are without profiling.
>
> "Slow" is relative, of course, but since the whole point of
> fromfile/string is performance (otherwise, we'd just parse with python),
> it would be nice to get them as fast as possible.
>
> I had been thinking that the way to make a good fromfile was Cython, so
> you've inspired me to think about it some more. Would you be interested
> in extending what you're doing to a more general purpose tool?
>
> Anyway,  a comment or two:
>> cdef extern from 'stdlib.h':
>>      double atof(char*)
>
> One thing I found with the current numpy code is that the use of the
> ato* functions is a source of a lot of bugs (all of them?) the core
> problem is error handling -- you have to do a lot of pointer checking to
> see if a call was successful, and with the fromfile code, that error
> handling is not done in all the layers of calls.

In my case, I am making an assumption about the integrity of the file.

> Anyone know what the advantage of ato* is over scanf()/fscanf()?
>
> Also, why are you doing string parsing rather than parsing the files
> directly, wouldn't that be a bit faster?

Rank inexperience, I guess. I don't understand what you have in mind.
scanf/fscanf don't actually convert strings to numbers, do they?

> I've got some C extension code for simple parsing of text files into
> arrays of floats or doubles (using fscanf). I'd be curious how the
> performance compares to what you've got. Let me know if you're interested.

I'm curious, yes.

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] seeking advice on a fast string->array conversion

2010-11-16 Thread Darren Dale
On Tue, Nov 16, 2010 at 9:55 AM, Pauli Virtanen  wrote:
> Tue, 16 Nov 2010 09:41:04 -0500, Darren Dale wrote:
> [clip]
>> That loop takes 0.33 seconds to execute, which is a good start. I need
>> some help converting this example to return an actual numpy array. Could
>> anyone please offer a suggestion?
>
> Easiest way is probably to use ndarray buffers and resize them when
> needed. For example:
>
> https://github.com/pv/scipy-work/blob/enh/interpnd-smooth/scipy/spatial/qhull.pyx#L980

Thank you Pauli. That makes it *incredibly* simple:

import time
cimport numpy as np
import numpy as np

cdef extern from 'stdlib.h':
    double atof(char*)


def test():
    py_string = '100'
    cdef char* c_string = py_string
    cdef int i, j
    cdef double val
    i = 0
    j = 2048*1200
    cdef np.ndarray[np.float64_t, ndim=1] ret

    ret_arr = np.empty((2048*1200,), dtype=np.float64)
    ret = ret_arr

    d = time.time()
    while i < j:
        c_string = py_string
        ret[i] = atof(c_string)
        i += 1
    ret_arr.shape = (1200, 2048)
    print ret_arr, ret_arr.shape, time.time()-d

The loop now takes only 0.11 seconds to execute. Thanks again.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bzr mirror

2010-11-13 Thread Darren Dale
On Sat, Nov 13, 2010 at 7:27 AM, Darren Dale  wrote:
> On Fri, Nov 12, 2010 at 9:42 PM, Ralf Gommers
>  wrote:
>> Hi,
>>
>> While cleaning up the numpy wiki start page I came across a bzr mirror
>> that still pointed to svn, https://launchpad.net/numpy, originally
>> registered by Jarrod. It would be good to either point that to git or
>> delete it. I couldn't see how to report or do anything about that on
>> Launchpad, but that's maybe just me - I can never find anything there.
>>
>> For now I've removed the link to it on Trac, if the mirror gets
>> updated please put it back.


Comment 8 at https://bugs.launchpad.net/launchpad-registry/+bug/38349 :

"Ask to deactivate the project at
https://answers.launchpad.net/launchpad/+addquestion

If the project has no data that is useful to the community, it will be
deactivated. If the project has code or bugs, the community may still
use the project even if the maintainers are no interested in it.
Launchpad admins will not deactivate projects that the community can
use. Consider transferring maintainership to another user."

Note the continued use of "deactivate" throughout the answer to
repeated inquiries of how to delete a project. From
https://help.launchpad.net/PrivacyPolicy :

"Launchpad retains all data submitted by users permanently.

Except in the circumstances listed below, Launchpad will only delete
data if required to do so by law or if data (including files, PPA
submissions, bug reports, bug comments, bug attachments, and
translations) is inappropriate. Canonical reserves the right to
determine whether data is inappropriate. Spam, malicious code, and
defamation are considered inappropriate. Where data is deleted, it
will be removed from the Launchpad database but it may continue to
exist in backup archives which are maintained by Canonical."
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bzr mirror

2010-11-13 Thread Darren Dale
On Fri, Nov 12, 2010 at 9:42 PM, Ralf Gommers
 wrote:
> Hi,
>
> While cleaning up the numpy wiki start page I came across a bzr mirror
> that still pointed to svn, https://launchpad.net/numpy, originally
> registered by Jarrod. It would be good to either point that to git or
> delete it. I couldn't see how to report or do anything about that on
> Launchpad, but that's maybe just me - I can never find anything there.
>
> For now I've removed the link to it on Trac, if the mirror gets
> updated please put it back.

From
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] whitespace in git repo

2010-10-30 Thread Darren Dale
On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris
 wrote:
>
>
> On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale  wrote:
>>
>> Hi Chuck,
>>
>> On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris
>>  wrote:
>> >
>> > I'd like to do something here, but I'm waiting for a consensus and for
>> > someone to test things out, maybe with a test repo, to make sure things
>> > operate correctly. The documentation isn't that clear...
>>
>> I am getting ready to test on windows and mac. In the process of
>> upgrading git on windows to 1.7.3.1, the following dialog appeared:
>>
>> Configuring line ending conversions
>>  How should Git treat line endings in text files?
>>
>> x Checkout Windows-style, commit Unix-style line endings
>>  Git will convert LF to CRLF when checking out text files. When
>> committing text files, CRLF will be converted to LF. For
>> cross-platform projects, this is the recommended setting on Windows
>> ("core.autocrlf" is set to "true")
>>
>> o Checkout as-is, commit Unix-style line endings
>>  Git will not perform any conversion when checking out text files.
>> When committing text files, CRLF will be converted to LF. For
>> cross-platform projects this is the recommended setting on Unix
>> ("core.autocrlf" is set to "input").
>>
>> o Checkout as-is, commit as-is
>>  Git will not perform any conversions when checking out or committing
>> text files. Choosing this option is not recommended for cross-platform
>> projects ("core.autocrlf" is set to "false")
>>
>> This might warrant a very brief mention in the docs, for helping
>> people set up their environment. It's too bad core.autocrlf cannot be
>> set on a per-project basis in a file that gets committed to the
>
> Yes, this would be good information to have in the notes.
>
>>
>> repository. As far as I can tell, it can only be set in ~/.gitconfig
>> or numpy/.git/config. Which is why I suggested adding .gitattributes,
>> which can be committed to the repository, and the line "* text=auto"
>> ensures that EOLs in text files are committed as LF, so we don't have
>> to worry about somebody's config settings having unwanted impact on
>> the repository.
>
> Might be worth trying in a numpy/.gitconfig just to see what happens.
> Documentation isn't always complete.

Now that I understand the situation a little better, I don't think we
would want such a .gitconfig in the repository itself. Most windows
users would probably opt for autcrlf=true, but that is definitely not
the case for mac and linux users.

I've been testing the changes in the pull request this morning on
linux, mac and windows, all using git-1.7.3.1. I made a testing branch
from whitespace-cleanup and added two files created on windows:
temp.txt and tmp.txt. One of them was added to .gitattributes to
preserve the crlf in the repo.

windows: with autocrlf=true, all files in the working directory are
crlf. With autocrlf=false, files marked in .gitattributes for crlf do
have crlf, the other files are lf. Check.

mac: tested with autocrlf=input. files marked in .gitattributes for
crlf have crlf, others are lf. Check.

linux (kubuntu 10.10): tested with autocrlf=input and false. All files
in the working directory have lf, even those marked for crlf. This is
confusing. I copied temp.txt from windows, verified that it still had
crlf endings, and copied it into the working directory. Git warns that
crlf will be converted to lf, but attempting a commit yields "nothing
to do". I had to do "git add temp.txt", at which point git status
tells me the working directory is clean and there is nothing to
commit. I'm not too worried about this, it's a situation that is
unlikely to ever occur in practice.

I think I have convinced myself that the pull request is satisfactory.
Devs should bear in mind, though, that there is a small risk when
committing changes to binary files that git will corrupt such a file
by incorrectly identifying and converting crlf to lf. Git should warn
when line conversions are going to take place, so they can be disabled
for a binary file in .gitattributes:

  mybinaryfile.dat -text
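
For reference, a minimal .gitattributes combining the pieces discussed
in this thread might look like the following (the patterns are only
examples, not a proposal for the actual file):

    # normalize all text files to LF in the repository
    * text=auto
    # force CRLF in the working directory for the installer script
    tools/win32build/nsis_scripts/*.nsi.in eol=crlf
    # never run line-ending conversion on this binary file
    mybinaryfile.dat -text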

That is all,

Darren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] whitespace in git repo

2010-10-28 Thread Darren Dale
On Thu, Oct 28, 2010 at 3:23 PM,   wrote:
> On Thu, Oct 28, 2010 at 2:40 PM, Darren Dale  wrote:
>> On Thu, Oct 28, 2010 at 12:23 PM,   wrote:
>>> On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris
>>>> On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale  wrote:
>>>>> And now the bad news: I have not been able to verify that Git respects
>>>>> the autocrlf setting or the eol setting in .gitattributes on my
>>>>> windows 7 computer: I made a new clone and the line endings are LF in
>>>>> the working directory, both on master and in my whitespace-cleanup
>>>>> branch (even the nsi.in file!). ("git config -l" confirms that
>>>>> "core.autocrlf" is "true".) To check my sanity, I tried writing files
>>>>> using wordpad and notepad to confirm that they are at least using
>>>>> CRLF, and they are *not*, according to both python's open() and grep
>>>>> "\r\n". If it were after noon where I live, I would be looking for a
>>>
>>> maybe just something obvious: Did you read the files in python as binary 
>>> 'rb' ?
>>
>> No, I did not. You are right, this shows \r\n. Why is it necessary to
>> open them as binary? IIUC (OIDUC), one should use 'rU' to unify line
>> endings.
>
> The python default for open(filename).read()  or open(filename,
> 'r').read() is to standardize line endings to \n.

Although, on a mac:


In [1]: open('tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in').readlines()[0]
Out[1]: ';\r\n'
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] whitespace in git repo

2010-10-28 Thread Darren Dale
On Thu, Oct 28, 2010 at 12:23 PM,   wrote:
> On Thu, Oct 28, 2010 at 12:11 PM, Charles R Harris
>> On Thu, Oct 28, 2010 at 9:23 AM, Darren Dale  wrote:
>>> And now the bad news: I have not been able to verify that Git respects
>>> the autocrlf setting or the eol setting in .gitattributes on my
>>> windows 7 computer: I made a new clone and the line endings are LF in
>>> the working directory, both on master and in my whitespace-cleanup
>>> branch (even the nsi.in file!). ("git config -l" confirms that
>>> "core.autocrlf" is "true".) To check my sanity, I tried writing files
>>> using wordpad and notepad to confirm that they are at least using
>>> CRLF, and they are *not*, according to both python's open() and grep
>>> "\r\n". If it were after noon where I live, I would be looking for a
>
> maybe just something obvious: Did you read the files in python as binary 'rb' 
> ?

No, I did not. You are right, this shows \r\n. Why is it necessary to
open them as binary? IIUC (OIDUC), one should use 'rU' to unify line
endings.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] whitespace in git repo

2010-10-28 Thread Darren Dale
Hi Chuck,

On Wed, Oct 27, 2010 at 1:30 PM, Charles R Harris
 wrote:
>
> I'd like to do something here, but I'm waiting for a consensus and for
> someone to test things out, maybe with a test repo, to make sure things
> operate correctly. The documentation isn't that clear...

I am getting ready to test on windows and mac. In the process of
upgrading git on windows to 1.7.3.1, the following dialog appeared:

Configuring line ending conversions
  How should Git treat line endings in text files?

x Checkout Windows-style, commit Unix-style line endings
  Git will convert LF to CRLF when checking out text files. When
committing text files, CRLF will be converted to LF. For
cross-platform projects, this is the recommended setting on Windows
("core.autocrlf" is set to "true")

o Checkout as-is, commit Unix-style line endings
  Git will not perform any conversion when checking out text files.
When committing text files, CRLF will be converted to LF. For
cross-platform projects this is the recommended setting on Unix
("core.autocrlf" is set to "input").

o Checkout as-is, commit as-is
  Git will not perform any conversions when checking out or committing
text files. Choosing this option is not recommended for cross-platform
projects ("core.autocrlf" is set to "false")

This might warrant a very brief mention in the docs, for helping
people set up their environment. It's too bad core.autocrlf cannot be
set on a per-project basis in a file that gets committed to the
repository. As far as I can tell, it can only be set in ~/.gitconfig
or numpy/.git/config. Which is why I suggested adding .gitattributes,
which can be committed to the repository, and the line "* text=auto"
ensures that EOLs in text files are committed as LF, so we don't have
to worry about somebody's config settings having unwanted impact on
the repository.

And now the bad news: I have not been able to verify that Git respects
the autocrlf setting or the eol setting in .gitattributes on my
windows 7 computer: I made a new clone and the line endings are LF in
the working directory, both on master and in my whitespace-cleanup
branch (even the nsi.in file!). ("git config -l" confirms that
"core.autocrlf" is "true".) To check my sanity, I tried writing files
using wordpad and notepad to confirm that they are at least using
CRLF, and they are *not*, according to both python's open() and grep
"\r\n". If it were after noon where I live, I would be looking for a
bottle of whiskey. But its not, so I'll just beat my head against my
desk until I've forgotten about this whole episode.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] whitespace in git repo

2010-10-27 Thread Darren Dale
On Wed, Oct 27, 2010 at 11:31 AM, Friedrich Romstedt
 wrote:
> Hi Darren,
>
> 2010/10/27 Darren Dale :
>>> So the svg changes must come from the 'fix' value for the whitespace action.
>>>
>>> I don't think it is a good idea to let whitespace be fixed by git and
>>> not by your editor :-)  Or do you disagree?
>>
>> "What are considered whitespace errors is controlled by
>> core.whitespace configuration. By default, trailing whitespaces
>> (including lines that solely consist of whitespaces) and a space
>> character that is immediately followed by a tab character inside the
>> initial indent of the line are considered whitespace errors."
>>
>> No mention of EOL conversions there. But yes, I guess we disagree. I
>> prefer to have git automatically strip any trailing whitespace that I
>> might have accidentally introduced.
>
> I agree.  But I just guess that the changes of the svgs in your pull
> request might be not due to eols but due to whitespace fixes.

No, it was not. I explicitly checked the svg files before and after,
using open("foo.svg").readlines[0], and saw that the files were CRLF
before the commit on my branch, and LF after.

> I think
> so because in my numpy (current master branch) I cannot see any CRLF
> there in the repo.  Checked with ``* text=auto``, which also affects
> non-normalised files in the repo.
>
> But it might be that the conversion is done silently, although I don't
> know how to do it like that.  So no "changed" showing up implies "no
> non-LF eol".
>
>>> This whitespace & newline thing is really painful, I suggest you set
>>> in your .gitconfig:
>>>
>>> [core]
>>>    autocrlf = true
>>
>> I don't think so: "Use this setting if you want to have CRLF line
>> endings in your working directory even though the repository does not
>> have normalized line endings." I don't want CRLF in my working
>> directory. Did you read
>> http://help.github.com/dealing-with-lineendings/ ?
>
> Aha, this is a misunderstanding.  Somehow I thought you're working on
> Windows.  Is there then a specific reason not to use CRLF?  I mean,
> you can check it in with LF anyway.
>
> The page you mentioned is very brief and like a recipe, not my taste.
> I want to know what's going on in detail.
>
>>> and in our numpy .gitattributes:
>>>
>>> * text=auto
>>
>> That is already included in the pull request.
>
> Yes, I know.  I meant to leave the line with the eol=crlf alone.  All
> based on the assumtion that you're working with crlf anyway, so might
> be wrong.
>
>>> while the text=auto is more strong and a superset of autocrlf=true.
>>>
>>> I came across this when trying if text=auto marks any files as
>>> changed, and it didn't so everything IS already LF in the repo.
>>>
>>> Can you check this please?
>>
>> Check what?
>
> My conclusions above.  We both know that in this subject all
> conclusions are pretty error-prone ...
>
>>> I was near to leaving a comment like
>>> "asap" on github, but since this is so horribly complicated and
>>> error-prone ...
>>
>> I'm starting to consider canceling the pull request.
>
> At least we should check if it's really what we intend.
>
> I understand now better why at all you wanted to force the .nsi.in
> file to be crlf.  From your previous posts, i.e. that it would be the
> default for Win users anyway, I see now that I should have asked.
>
> To my understanding the strategy should be two things:
> 1)  LF force in the repo.  This is independent from the .nsi.in thing,
> but missing currently in the official branches.  We can do that at the
> same time.
> 2)  Forcing the .nsi.in file to be crlf in the check-out (and only
> there) at all times.  There is one higher level in $GITDIR, but I
> think we can ignore that.
>
> To (1): The default Win user would check-in *newly created* files
> currently in CRLF, at least this is what I did with a not-so-recent
> git some time ago (other repos)  When I switched to Mac, all my
> files were marked "changed".  afaik git does not do normalisation if
> you do not tell it to do so. "While git normally leaves file contents
> alone, it can be configured to normalize line endings to LF in the
> repository and, optionally, to convert them to CRLF when files are
> checked out." 
> (http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html)
>  I still do not understand why my files showed up changed.

Re: [Numpy-discussion] whitespace in git repo

2010-10-27 Thread Darren Dale
On Wed, Oct 27, 2010 at 8:36 AM, Friedrich Romstedt
 wrote:
> Hi Darren,
>
> 2010/10/19 Darren Dale :
>> I have the following set in my ~/.gitconfig file:
>>
>>    [apply]
>>        whitespace = fix
>>
>>    [core]
>>        autocrlf = input
>>
>> which is attempting to correct some changes in:
>>
>> branding/icons/numpylogo.svg
>> branding/icons/numpylogoicon.svg
>> tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in
>
> Here an excerpt from git-config:
>
> core.autocrlf
>
>    Setting this variable to "true" is almost the same as setting the
> text attribute to "auto" on all files except that text files are not
> guaranteed to be normalized: files that contain CRLF in the repository
> will not be touched. Use this setting if you want to have CRLF line
> endings in your working directory even though the repository does not
> have normalized line endings. This variable can be set to input, in
> which case no output conversion is performed.
>
> From git-apply:
>
> ``fix`` outputs warnings for a few such errors, and applies the patch
> after fixing them (strip is a synonym --- the tool used to consider
> only trailing whitespace characters as errors, and the fix involved
> stripping them, but modern gits do more).
>
> So I think your "autocrlf=input" makes the .nsi.in file checked out as
> LF since it's in LF in the repo, and "no output conversion is
> performed" due to core.autocrlf=input in your .gitconfigure.
>
> So the svg changes must come from the 'fix' value for the whitespace action.
>
> I don't think it is a good idea to let whitespace be fixed by git and
> not by your editor :-)  Or do you disagree?

"What are considered whitespace errors is controlled by
core.whitespace configuration. By default, trailing whitespaces
(including lines that solely consist of whitespaces) and a space
character that is immediately followed by a tab character inside the
initial indent of the line are considered whitespace errors."

No mention of EOL conversions there. But yes, I guess we disagree. I
prefer to have git automatically strip any trailing whitespace that I
might have accidentally introduced.
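
In config terms, that preference amounts to something like this (a
minimal ~/.gitconfig sketch; the core.whitespace line just spells out,
roughly, the documented defaults):

    [core]
        whitespace = trailing-space,space-before-tab
    [apply]
        whitespace = fix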

> This whitespace & newline thing is really painful, I suggest you set
> in your .gitconfig:
>
> [core]
>    autocrlf = true

I don't think so: "Use this setting if you want to have CRLF line
endings in your working directory even though the repository does not
have normalized line endings." I don't want CRLF in my working
directory. Did you read
http://help.github.com/dealing-with-lineendings/ ?

> and in our numpy .gitattributes:
>
> * text=auto

That is already included in the pull request.

> while text=auto is stronger than, and a superset of, autocrlf=true.
>
> I came across this when testing whether text=auto marks any files as
> changed, and it didn't, so everything IS already LF in the repo.
>
> Can you check this please?

Check what?

> I was near to leaving a comment like
> "asap" on github, but since this is so horribly complicated and
> error-prone ...

I'm starting to consider canceling the pull request.

Darren


Re: [Numpy-discussion] ANN: NumPy 1.5.1 release candidate 1

2010-10-24 Thread Darren Dale
On Sun, Oct 24, 2010 at 11:29 AM, Charles R Harris
 wrote:
>
>
> On Sun, Oct 24, 2010 at 9:22 AM, Darren Dale  wrote:
>>
>> On Sun, Oct 17, 2010 at 7:35 AM, Ralf Gommers
>>  wrote:
>> > Hi,
>> >
>> > I am pleased to announce the availability of the first release
>> > candidate of NumPy 1.5.1. This is a bug-fix release with no new
>> > features compared to 1.5.0.
>> [...]
>> > Please report any other issues on the Numpy-discussion mailing list.
>>
>> Just installed on kubuntu-10.10, python-2.7 and python-3.1.2. Tests
>> look fine for py2.7, but I see datetime errors with py3k:
[...]
>
> You may have left over tests in the installation directory. Can you try
> deleting it and installing again?

You're right. Tests are passing.

Darren


Re: [Numpy-discussion] ANN: NumPy 1.5.1 release candidate 1

2010-10-24 Thread Darren Dale
On Sun, Oct 17, 2010 at 7:35 AM, Ralf Gommers
 wrote:
> Hi,
>
> I am pleased to announce the availability of the first release
> candidate of NumPy 1.5.1. This is a bug-fix release with no new
> features compared to 1.5.0.
[...]
> Please report any other issues on the Numpy-discussion mailing list.

Just installed on kubuntu-10.10, python-2.7 and python-3.1.2. Tests
look fine for py2.7, but I see datetime errors with py3k:

.
==
ERROR: test_creation (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 10, in test_creation
dt1 = np.dtype('M8[750%s]'%unit)
TypeError: data type not understood

==
ERROR: test_creation_overflow (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 62, in test_creation_overflow
timesteps = np.array([date], dtype='datetime64[s]')[0].astype(np.int64)
TypeError: data type not understood

==
ERROR: test_divisor_conversion_as (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 58, in test_divisor_conversion_as
self.assertRaises(ValueError, lambda : np.dtype('M8[as/10]'))
  File "/usr/lib/python3.1/unittest.py", line 589, in assertRaises
callableObj(*args, **kwargs)
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 58, in 
self.assertRaises(ValueError, lambda : np.dtype('M8[as/10]'))
TypeError: data type not understood

==
ERROR: test_divisor_conversion_bday (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 32, in test_divisor_conversion_bday
assert np.dtype('M8[B/12]') == np.dtype('M8[2h]')
TypeError: data type not understood

==
ERROR: test_divisor_conversion_day (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 37, in test_divisor_conversion_day
assert np.dtype('M8[D/12]') == np.dtype('M8[2h]')
TypeError: data type not understood

==
ERROR: test_divisor_conversion_fs (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 54, in test_divisor_conversion_fs
assert np.dtype('M8[fs/100]') == np.dtype('M8[10as]')
TypeError: data type not understood

==
ERROR: test_divisor_conversion_hour (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 42, in test_divisor_conversion_hour
assert np.dtype('m8[h/30]') == np.dtype('m8[2m]')
TypeError: data type not understood

==
ERROR: test_divisor_conversion_minute (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 46, in test_divisor_conversion_minute
assert np.dtype('m8[m/30]') == np.dtype('m8[2s]')
TypeError: data type not understood

==
ERROR: test_divisor_conversion_month (test_datetime.TestDateTime)
--
Traceback (most recent call last):
  File 
"/home/darren/.local/lib/python3.1/site-packages/numpy/core/tests/test_datetime.py",
line 21, in test_divisor_conversion_month
assert np.dtype('M8[M/2]') == np.dtype('M8[2W]')
TypeError: data type not understood

==
ERROR: test_divisor_conversion_second (test_datetime.TestDateTime)

Re: [Numpy-discussion] whitespace in git repo

2010-10-21 Thread Darren Dale
On Thu, Oct 21, 2010 at 4:48 PM, Friedrich Romstedt
 wrote:
> 2010/10/21 Darren Dale :
>> I filed a new pull request, http://github.com/numpy/numpy/pull/7 .
>> This should enforce LF on all text files, with the current exception
>> of the nsi.in file, which is CRLF. The svgs have been converted to LF.
>> Additional, confusing reading can be found at
>> http://help.github.com/dealing-with-lineendings/ ,
>> http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and
>> http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html .
>
> Hm, I like your pull request more than my own branch, but I think your
> conclusions might be incorrect.
>
> ``* text=auto`` forces git to normalise *all* text files, including
> the .nsi.in file, to LF *in the repo only*.  But it says nothing about
> how to set eol in the working dir.
>
> ``[...].nsi.in eol=crlf`` forces git to check out the .nsi.in file with CRLF.

I see. Thank you for clarifying. It probably is not necessary then to
have the exception for the nsi.in file, since git will create files
with CRLF eols in the working directory on windows by default. The
eols in the working directory can be controlled by the core.eol
setting, which defaults to "native". But unless David C gives his
blessing, I will leave the pull request as is. Pretty confusing.
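
In config terms, the knob in question looks like this (a sketch of the
default, not something we would need to ship):

    [core]
        eol = native   ; the default: LF on unix/mac, CRLF on windows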

Darren


Re: [Numpy-discussion] whitespace in git repo

2010-10-21 Thread Darren Dale
On Thu, Oct 21, 2010 at 9:26 AM, David Cournapeau  wrote:
> On Thu, Oct 21, 2010 at 8:47 PM, Friedrich Romstedt
>  wrote:
>> 2010/10/21 David Cournapeau :
>>> On Thu, Oct 21, 2010 at 12:56 AM, Friedrich Romstedt
>>>  wrote:
>>>> 2010/10/20 Darren Dale :
>>>>> On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt
>>>>>  wrote:
>>>>>> Due to Darren's config file the .nsi.in file made it with CRLF into the 
>>>>>> repo.
>>>>>
>>>>> Uh, no.
>>>>
>>>> You mean I'm wrong?
>>>
>>> Yes, the file has always used CRLF, and needs to stay that way.
>>
>> I see, misunderstanding, for me I used "made it" in the sense
>> "succeeded in" :-)  So to be clear, I meant that I understood your
>> config file.
>>
>> Btw, it has \n\r, so it's LFCR and not CRLF as it should be on Windows
>> (ref: de.wikipedia).  I double-checked both my understanding of CR/LF
>> and my use of $grep -PU '$\n\r'.
>>
>> See also http://de.wikipedia.org/wiki/Zeilenumbruch (german, the en
>> version doesn't have the table).  So either:
>> 1)  You encoded for whatever reason the file with CR and LF swapped
>
> Nobody encoded the file in a special manner. It just happens to be a
> file used on windows, by a windows program, and as such should stay in
> CR/LF format. I am not sure why you say LF and CR are swapped, I don't
> see it myself, and vim tells me it is in DOS (e.g. CR/LF) format.
>
>> 2)  It doesn't matter what the order is
>
> It does matter. Although text editors are generally smart about line
> endings, other Windows software is not.

I filed a new pull request, http://github.com/numpy/numpy/pull/7 .
This should enforce LF on all text files, with the current exception
of the nsi.in file, which is CRLF. The svgs have been converted to LF.
Additional, confusing reading can be found at
http://help.github.com/dealing-with-lineendings/ ,
http://www.kernel.org/pub/software/scm/git/docs/git-config.html, and
http://www.kernel.org/pub/software/scm/git/docs/gitattributes.html .

Darren


Re: [Numpy-discussion] whitespace in git repo

2010-10-20 Thread Darren Dale
On Wed, Oct 20, 2010 at 11:56 AM, Friedrich Romstedt
 wrote:
> 2010/10/20 Darren Dale :
>> On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt
>>  wrote:
>>> Due to Darren's config file the .nsi.in file made it with CRLF into the 
>>> repo.
>>
>> Uh, no.
>
> You mean I'm wrong?

Due to my config file... nothing. I simply noticed the
already-existing CRLF line endings in the repository.


Re: [Numpy-discussion] whitespace in git repo

2010-10-20 Thread Darren Dale
On Wed, Oct 20, 2010 at 6:12 AM, Friedrich Romstedt
 wrote:
> Due to Darren's config file the .nsi.in file made it with CRLF into the repo.

Uh, no.


[Numpy-discussion] whitespace in git repo

2010-10-19 Thread Darren Dale
We have been discussing whitespace and line endings at the following
pull request: http://github.com/numpy/numpy/pull/4 . Chuck suggested
we discuss it here on the list.

I have the following set in my ~/.gitconfig file:

[apply]
whitespace = fix

[core]
autocrlf = input

which is attempting to correct some changes in:

branding/icons/numpylogo.svg
branding/icons/numpylogoicon.svg
tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in

David C. suggested that the nsi.in file should not be changed. I
suggested adding a .gitattributes file along with the existing
.gitignore file in the numpy repo. This would enforce windows line
endings for the nsi.in file:

tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in eol=crlf

alternatively this would disable any attempt to convert line endings:

tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in -text

I think the former is preferable. But it seems like a good idea to
include some git config files in the repo to ensure trailing
whitespace is stripped and line endings are appropriate to the numpy
project, regardless of what people may have in their ~/.gitconfig
file. Comments?
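
Concretely, the .gitattributes I have in mind would amount to something
like this (a sketch; attribute lines cannot carry trailing comments, so
the comments sit on their own lines):

    # normalize every text file to LF in the repository
    * text=auto
    # but always check out the installer script with CRLF
    tools/win32build/nsis_scripts/numpy-superinstaller.nsi.in eol=crlf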

Darren


Re: [Numpy-discussion] test errors in the trunk

2010-07-31 Thread Darren Dale
On Sat, Jul 31, 2010 at 7:22 AM, Ralf Gommers
 wrote:
>
>
> On Sat, Jul 31, 2010 at 4:55 AM, Robert Kern  wrote:
>>
>> On Fri, Jul 30, 2010 at 13:22, Darren Dale  wrote:
>> > I just upgraded my svn checkout and did a fresh install. When I try to
>> > run the test suite, I get a ton of errors:
>> >
>> >
>> > np.test()
>> > Running unit tests for numpy
>> > NumPy version 2.0.0.dev8550
>> > NumPy is installed in
>> > /Users/darren/.local/lib/python2.6/site-packages/numpy
>> > Python version 2.6.5 (r265:79063, Jul 19 2010, 09:08:11) [GCC 4.2.1
>> > (Apple Inc. build 5659)]
>> > nose version 0.11.3
>> >
>> > Reloading
>> > numpy.lib
>> > Reloading numpy.lib.info
>> > Reloading numpy.lib.numpy
>> > Reloading numpy
>> > Reloading numpy.numpy
>> > Reloading numpy.show
>> > 
>> > ==
>> >
>> > [...]
>> >
>> >  File
>> > "/Users/darren/.local/lib/python2.6/site-packages/numpy/lib/__init__.py",
>> > line 23, in 
>> >    __all__ += type_check.__all__
>> > NameError: name 'type_check' is not defined
>> >
>> >
>> > I checked numpy/lib/__init__.py, and it does a bunch of imports like
>> > "from type_check import *" but not "import type_check", which are
>> > needed to append to __all__.
>>
>> Not quite. The code does work, as-is, in most situations thanks to a
>> detail of Python's import system. When a submodule is imported in a
>> package, whether through a direct "import package.submodule" or "from
>> submodule import *", Python will take the created module object and
>> assign it into the package.__init__'s namespace with the appropriate
>> name. So while the code doesn't look correct, it usually is correct.
>>
>> The problem is test_getlimits.py:
>>
>> import numpy.lib
>> try:
>>    reload(numpy.lib)
>> except NameError:
>>    # Py3K
>>    import imp
>>    imp.reload(numpy.lib)
>>
>> These are causing reloads of the hierarchy under numpy.lib and are
>> presumably interfering with the normal import process (for some
>> reason). Does anyone know why we reload(numpy.lib) here? The log
>> history is unhelpful. It goes back to when this code was in scipy. I
>> suspect that we can just remove it.
>
> If no one remembers, can we remove this before the 1.5.0 beta (i.e.
> tomorrow) so it gets tested enough before the final release?
>
> Tested on OS X with python 2.6.5 and 3.1, no problems after removing it.

I just committed the change in svn 8568.
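
For anyone puzzled by why the __init__.py idiom works at all, here is a
small self-contained sketch of the import detail Robert described (a
throwaway package built on the fly; all names are made up):

import os, sys, tempfile

# build a throwaway package: pkg/submod.py and pkg/__init__.py
tmp = tempfile.mkdtemp()
os.mkdir(os.path.join(tmp, 'pkg'))
with open(os.path.join(tmp, 'pkg', 'submod.py'), 'w') as f:
    f.write("__all__ = ['spam']\ndef spam(): return 42\n")
with open(os.path.join(tmp, 'pkg', '__init__.py'), 'w') as f:
    # __init__ never says "import submod", yet the name is usable on
    # the second line: importing pkg.submod binds "submod" in pkg's
    # namespace, which is exactly __init__'s global namespace
    f.write("from pkg.submod import *\n__all__ = list(submod.__all__)\n")

sys.path.insert(0, tmp)
import pkg
print(pkg.__all__)    # ['spam']
print(pkg.submod)     # the module object, bound implicitly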

Darren


[Numpy-discussion] test errors in the trunk

2010-07-30 Thread Darren Dale
I just upgraded my svn checkout and did a fresh install. When I try to
run the test suite, I get a ton of errors:


np.test()
Running unit tests for numpy
NumPy version 2.0.0.dev8550
NumPy is installed in /Users/darren/.local/lib/python2.6/site-packages/numpy
Python version 2.6.5 (r265:79063, Jul 19 2010, 09:08:11) [GCC 4.2.1
(Apple Inc. build 5659)]
nose version 0.11.3
Reloading
numpy.lib
Reloading numpy.lib.info
Reloading numpy.lib.numpy
Reloading numpy
Reloading numpy.numpy
Reloading numpy.show

==

[...]

  File "/Users/darren/.local/lib/python2.6/site-packages/numpy/lib/__init__.py",
line 23, in 
__all__ += type_check.__all__
NameError: name 'type_check' is not defined


I checked numpy/lib/__init__.py, and it does a bunch of imports like
"from type_check import *" but not "import type_check", which are
needed to append to __all__.

Darren


Re: [Numpy-discussion] question about creating numpy arrays

2010-05-20 Thread Darren Dale
On Thu, May 20, 2010 at 12:07 PM, Bruce Southey  wrote:
> np.array is an array-creating function: numpy.array takes an
> array_like input and it *will* try to convert that input into an array.
> (This also occurs when you give np.array a masked array as an input.)
> This a 'feature' especially when you don't use the dtype argument and
> applies to any numpy function that takes array_like inputs.

Ok. I can accept that.

> I do not use quantities, but you either have to get the user to use the
> appropriate quantities functions or let it remain 'user beware' when
> they do not use the appropriate functions. In the longer term you have
> to get numpy to 'do the right thing' with quantities objects.

I have done a bit of development on numpy to try to extend the
__array_wrap__ mechanism so quantities could tell numpy how to do the
right thing in many situations. That has been largely successful, but
this issue we are discussing is demonstrating some unanticipated
limitations. You may be right that this is a "user-beware" situation,
since in this case there appears to be no way for an ndarray subclass
to step in and influence what numpy will do with a list of those
instances.
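
A stripped-down illustration of the limitation, with a toy subclass
standing in for Quantity (all names here are made up):

import numpy as np

class Tagged(np.ndarray):
    def __new__(cls, data, tag=None):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.tag = tag
        return obj
    def __array_finalize__(self, obj):
        self.tag = getattr(obj, 'tag', None)

a = Tagged([1.0], tag='m')
b = Tagged([2.0], tag='s')
c = np.array([a, b])
print(type(c))   # numpy.ndarray -- the subclass is gone
print(c)         # [[ 1.] [ 2.]] -- and the tags went with it

The subclass is never consulted: np.array sees array_like input and
builds a plain ndarray, so the tag metadata is silently dropped.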

Darren


Re: [Numpy-discussion] question about creating numpy arrays

2010-05-20 Thread Darren Dale
[sorry, my last got cut off]

On Thu, May 20, 2010 at 11:37 AM, Darren Dale  wrote:
> On Thu, May 20, 2010 at 10:44 AM, Benjamin Root  wrote:
>>> I gave two counterexamples of why.
>>
>> The examples you gave aren't counterexamples.  See below...
>
> I'm not interested in arguing over semantics. I've discovered an issue
> with how numpy deals with lists of objects that derive from ndarray,
> and am concerned about the implications for classes that extend
> ndarray.
>
>> On Wed, May 19, 2010 at 7:06 PM, Darren Dale  wrote:
>>>
>>> On Wed, May 19, 2010 at 4:19 PM,   wrote:
>>> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale  wrote:
>>> >> I have a question about creation of numpy arrays from a list of
>>> >> objects, which bears on the Quantities project and also on masked
>>> >> arrays:
>>> >>
>>> >>>>> import quantities as pq
>>> >>>>> import numpy as np
>>> >>>>> a, b = 2*pq.m,1*pq.s
>>> >>>>> np.array([a, b])
>>> >> array([ 12.,   1.])
>>> >>
>>> >> Why doesn't that create an object array? Similarly:
>>> >>
>>
>>
>> Consider the use case of a person creating a 1-D numpy array:
>>  > np.array([12.0, 1.0])
>> array([ 12.,  1.])
>>
>> How is python supposed to tell the difference between
>>  > np.array([a, b])
>> and
>>  > np.array([12.0, 1.0])
>> ?
>>
>> It can't, and there are plenty of times when one wants to explicitly
>> initialize a small numpy array with a few discrete variables.
>>
>>
>>>
>>> >>>>> m = np.ma.array([1], mask=[True])
>>> >>>>> m
>>> >> masked_array(data = [--],
>>> >>             mask = [ True],
>>> >>       fill_value = 99)
>>> >>
>>> >>>>> np.array([m])
>>> >> array([[1]])
>>> >>
>>
>> Again, this is expected behavior.  Numpy saw an array of an array,
>> therefore, it produced a 2-D array. Consider the following:
>>
>>  > np.array([[12, 4, 1], [32, 51, 9]])
>>
>> I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from
>> that array of arrays.
>>
>>>
>>> >> This has broader implications than just creating arrays, for example:
>>> >>
>>> >>>>> np.sum([m, m])
>>> >> 2
>>> >>>>> np.sum([a, b])
>>> >> 13.0
>>> >>
>>
>>
>> If you wanted sums from each object, there are some better (i.e., more
>> clear) ways to go about it.  If you have a predetermined number of
>> numpy-compatible objects, say a, b, c, then you can explicitly call the sum
>> for each one:
>>  > a_sum = np.sum(a)
>>  > b_sum = np.sum(b)
>>  > c_sum = np.sum(c)
>>
>> Which I think communicates the programmer's intention better than (for a
>> numpy array, x, composed of a, b, c):
>>  > object_sums = np.sum(x)   # <--- As a numpy user, I would expect a
>> scalar out of this, not an array
>>
>> If you have an arbitrary number of objects (which is what I suspect you
>> have), then one could easily produce an array of sums (for a list, x, of
>> numpy-compatible objects) like so:
>>  > object_sums = [np.sum(anObject) for anObject in x]
>>
>> Performance-wise, it should be no more or less efficient than having numpy
>> somehow produce an array of sums from a single call to sum.
>> Readability-wise, it makes more sense because when you are treating objects
>> separately, a *list* of them is more intuitive than a numpy.array, which is
>> more-or-less treated as a single mathematical entity.
>>
>> I hope that addresses your concerns.
>
> I appreciate the response, but you are arguing that it is not a
> problem, and I'm certain that it is. It may not be numpy

It may not be numpy's problem, I can accept that. But it is definitely
a problem for quantities. I'm trying to determine just how big a
problem it is. I had hoped that one day quantities might become a part
of numpy or scipy, but this appears to be a fundamental issue and it
makes me doubt that inclusion would be appropriate.

Thank you for the suggestion about calling the sum method instead of
numpy's function. That is a reasonable workaround.

Darren


Re: [Numpy-discussion] question about creating numpy arrays

2010-05-20 Thread Darren Dale
On Thu, May 20, 2010 at 10:44 AM, Benjamin Root  wrote:
>> I gave two counterexamples of why.
>
> The examples you gave aren't counterexamples.  See below...

I'm not interested in arguing over semantics. I've discovered an issue
with how numpy deals with lists of objects that derive from ndarray,
and am concerned about the implications for classes that extend
ndarray.

> On Wed, May 19, 2010 at 7:06 PM, Darren Dale  wrote:
>>
>> On Wed, May 19, 2010 at 4:19 PM,   wrote:
>> > On Wed, May 19, 2010 at 4:08 PM, Darren Dale  wrote:
>> >> I have a question about creation of numpy arrays from a list of
>> >> objects, which bears on the Quantities project and also on masked
>> >> arrays:
>> >>
>> >>>>> import quantities as pq
>> >>>>> import numpy as np
>> >>>>> a, b = 2*pq.m,1*pq.s
>> >>>>> np.array([a, b])
>> >> array([ 12.,   1.])
>> >>
>> >> Why doesn't that create an object array? Similarly:
>> >>
>
>
> Consider the use case of a person creating a 1-D numpy array:
>  > np.array([12.0, 1.0])
> array([ 12.,  1.])
>
> How is python supposed to tell the difference between
>  > np.array([a, b])
> and
>  > np.array([12.0, 1.0])
> ?
>
> It can't, and there are plenty of times when one wants to explicitly
> initialize a small numpy array with a few discrete variables.
>
>
>>
>> >>>>> m = np.ma.array([1], mask=[True])
>> >>>>> m
>> >> masked_array(data = [--],
>> >>             mask = [ True],
>> >>       fill_value = 99)
>> >>
>> >>>>> np.array([m])
>> >> array([[1]])
>> >>
>
> Again, this is expected behavior.  Numpy saw an array of an array,
> therefore, it produced a 2-D array. Consider the following:
>
>  > np.array([[12, 4, 1], [32, 51, 9]])
>
> I, as a user, expect numpy to create a 2-D array (2 rows, 3 columns) from
> that array of arrays.
>
>>
>> >> This has broader implications than just creating arrays, for example:
>> >>
>> >>>>> np.sum([m, m])
>> >> 2
>> >>>>> np.sum([a, b])
>> >> 13.0
>> >>
>
>
> If you wanted sums from each object, there are some better (i.e., more
> clear) ways to go about it.  If you have a predetermined number of
> numpy-compatible objects, say a, b, c, then you can explicitly call the sum
> for each one:
>  > a_sum = np.sum(a)
>  > b_sum = np.sum(b)
>  > c_sum = np.sum(c)
>
> Which I think communicates the programmer's intention better than (for a
> numpy array, x, composed of a, b, c):
>  > object_sums = np.sum(x)   # <--- As a numpy user, I would expect a
> scalar out of this, not an array
>
> If you have an arbitrary number of objects (which is what I suspect you
> have), then one could easily produce an array of sums (for a list, x, of
> numpy-compatible objects) like so:
>  > object_sums = [np.sum(anObject) for anObject in x]
>
> Performance-wise, it should be no more or less efficient than having numpy
> somehow produce an array of sums from a single call to sum.
> Readability-wise, it makes more sense because when you are treating objects
> separately, a *list* of them is more intuitive than a numpy.array, which is
> more-or-less treated as a single mathematical entity.
>
> I hope that addresses your concerns.

I appreciate the response, but you are arguing that it is not a
problem, and I'm certain that it is. It may not be numpy


Re: [Numpy-discussion] question about creating numpy arrays

2010-05-19 Thread Darren Dale
On Wed, May 19, 2010 at 4:19 PM,   wrote:
> On Wed, May 19, 2010 at 4:08 PM, Darren Dale  wrote:
>> I have a question about creation of numpy arrays from a list of
>> objects, which bears on the Quantities project and also on masked
>> arrays:
>>
>>>>> import quantities as pq
>>>>> import numpy as np
>>>>> a, b = 2*pq.m,1*pq.s
>>>>> np.array([a, b])
>> array([ 12.,   1.])
>>
>> Why doesn't that create an object array? Similarly:
>>
>>>>> m = np.ma.array([1], mask=[True])
>>>>> m
>> masked_array(data = [--],
>>             mask = [ True],
>>       fill_value = 99)
>>
>>>>> np.array([m])
>> array([[1]])
>>
>> This has broader implications than just creating arrays, for example:
>>
>>>>> np.sum([m, m])
>> 2
>>>>> np.sum([a, b])
>> 13.0
>>
>> Any thoughts?
>
> These are "array_like" of floats, so why should it create anything
> else than an array of floats.

I gave two counterexamples of why.


[Numpy-discussion] question about creating numpy arrays

2010-05-19 Thread Darren Dale
I have a question about creation of numpy arrays from a list of
objects, which bears on the Quantities project and also on masked
arrays:

>>> import quantities as pq
>>> import numpy as np
>>> a, b = 2*pq.m,1*pq.s
>>> np.array([a, b])
array([ 12.,   1.])

Why doesn't that create an object array? Similarly:

>>> m = np.ma.array([1], mask=[True])
>>> m
masked_array(data = [--],
 mask = [ True],
   fill_value = 99)

>>> np.array([m])
array([[1]])

This has broader implications than just creating arrays, for example:

>>> np.sum([m, m])
2
>>> np.sum([a, b])
13.0

Any thoughts?

Thanks,
Darren


Re: [Numpy-discussion] Bug in numpy.fix(): broken for scalar arguments

2010-04-18 Thread Darren Dale
On Sun, Apr 18, 2010 at 9:28 AM, Darren Dale  wrote:
> On Sun, Apr 18, 2010 at 9:08 AM, Darren Dale  wrote:
>> On Sat, Apr 17, 2010 at 4:16 PM, Charles R Harris
>>  wrote:
>>>
>>>
>>> On Sat, Apr 17, 2010 at 2:01 PM, Eric Firing  wrote:
>>>>
>>>> np.fix() no longer works for scalar arguments:
>>>>
>>>>
>>>> In [1]:import numpy as np
>>>>
>>>> In [2]:np.version.version
>>>> Out[2]:'2.0.0.dev8334'
>>>>
>>>> In [3]:np.fix(3.14)
>>>>
>>>> ---
>>>> TypeError                                 Traceback (most recent call
>>>> last)
>>>>
>>>> /home/efiring/ in ()
>>>>
>>>> /usr/local/lib/python2.6/dist-packages/numpy/lib/ufunclike.pyc in fix(x,
>>>> y)
>>>>      46     if y is None:
>>>>      47         y = y1
>>>> ---> 48     y[...] = nx.where(x >= 0, y1, y2)
>>>>      49     return y
>>>>      50
>>>>
>>>> TypeError: 'numpy.float64' object does not support item assignment
>>>>
>>>>
>>>
>>> Looks like r8293. Darren?
>>
>> Thanks, I'm looking into it.
>
> The old np.fix behavior is different from np.floor and np.ceil.
> np.fix(3.14) would return array(3.0), while np.floor(3.14) would
> return 3.0. Shall I fix it to conform with the old but inconsistent
> behavior of fix?

I think this is the underlying issue: np.floor(np.array(3.14)) returns
3.0, not array(3.0). The current implementation of fix had already
taken care to ensure that it was working with an array for the input.
What is numpy's policy here? np.fix returned a 0-d ndarray even for
scalar input; floor and ceil return scalars even for 0-d ndarrays.
This inconsistency makes it difficult to make even small modifications
to the numpy codebase.

r8351 includes a one-line change that addresses Eric's report and is
consistent with the previous behavior of fix.
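
For the record, the pattern needed is roughly the following (a sketch
of the idea only, not the actual r8351 diff):

import numpy as np

def fix(x, y=None):
    x = np.asanyarray(x)
    if y is None:
        # allocate a writable buffer; wrapping in asanyarray keeps a
        # 0-d result as a 0-d array instead of a scalar, so the item
        # assignment below cannot fail for scalar input
        y = np.asanyarray(np.ceil(x))
    y[...] = np.where(x >= 0, np.floor(x), np.ceil(x))
    return y

print(fix(3.14))    # 3.0, held in a 0-d array like the old np.fix
print(fix(-3.14))   # -3.0, truncation toward zero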

Darren


Re: [Numpy-discussion] Bug in numpy.fix(): broken for scalar arguments

2010-04-18 Thread Darren Dale
On Sun, Apr 18, 2010 at 9:08 AM, Darren Dale  wrote:
> On Sat, Apr 17, 2010 at 4:16 PM, Charles R Harris
>  wrote:
>>
>>
>> On Sat, Apr 17, 2010 at 2:01 PM, Eric Firing  wrote:
>>>
>>> np.fix() no longer works for scalar arguments:
>>>
>>>
>>> In [1]:import numpy as np
>>>
>>> In [2]:np.version.version
>>> Out[2]:'2.0.0.dev8334'
>>>
>>> In [3]:np.fix(3.14)
>>>
>>> ---
>>> TypeError                                 Traceback (most recent call
>>> last)
>>>
>>> /home/efiring/ in ()
>>>
>>> /usr/local/lib/python2.6/dist-packages/numpy/lib/ufunclike.pyc in fix(x,
>>> y)
>>>      46     if y is None:
>>>      47         y = y1
>>> ---> 48     y[...] = nx.where(x >= 0, y1, y2)
>>>      49     return y
>>>      50
>>>
>>> TypeError: 'numpy.float64' object does not support item assignment
>>>
>>>
>>
>> Looks like r8293. Darren?
>
> Thanks, I'm looking into it.

The old np.fix behavior is different from np.floor and np.ceil.
np.fix(3.14) would return array(3.0), while np.floor(3.14) would
return 3.0. Shall I fix it to conform with the old but inconsistent
behavior of fix?

Darren


Re: [Numpy-discussion] Bug in numpy.fix(): broken for scalar arguments

2010-04-18 Thread Darren Dale
On Sat, Apr 17, 2010 at 4:16 PM, Charles R Harris
 wrote:
>
>
> On Sat, Apr 17, 2010 at 2:01 PM, Eric Firing  wrote:
>>
>> np.fix() no longer works for scalar arguments:
>>
>>
>> In [1]:import numpy as np
>>
>> In [2]:np.version.version
>> Out[2]:'2.0.0.dev8334'
>>
>> In [3]:np.fix(3.14)
>>
>> ---
>> TypeError                                 Traceback (most recent call
>> last)
>>
>> /home/efiring/ in ()
>>
>> /usr/local/lib/python2.6/dist-packages/numpy/lib/ufunclike.pyc in fix(x,
>> y)
>>      46     if y is None:
>>      47         y = y1
>> ---> 48     y[...] = nx.where(x >= 0, y1, y2)
>>      49     return y
>>      50
>>
>> TypeError: 'numpy.float64' object does not support item assignment
>>
>>
>
> Looks like r8293. Darren?

Thanks, I'm looking into it.


Re: [Numpy-discussion] ufunc improvements [Was: Warnings in numpy.ma.test()]

2010-03-28 Thread Darren Dale
I'd like to use this thread to discuss possible improvements to
generalize numpys functions. Sorry for double posting, but we will
have a hard time keeping track of discussion about how to improve
functions to deal with subclasses if they are spread across threads
talking about warnings in masked arrays or masked arrays not dealing
well with trapz. There is an additional bit at the end that was not
discussed elsewhere.

On Thu, Mar 18, 2010 at 8:14 AM, Darren Dale  wrote:
> On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris
>  wrote:
>> Just *one* function to rule them all and on the subtype dump it. No
>> __array_wrap__, __input_prepare__, or __array_prepare__, just something like
>> __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing
>> having the ufunc upper layer do nothing but decide which argument type will
>> do all the rest of the work, casting, calling the low level ufunc base,
>> providing buffers, wrapping, etc. Instead of pasting bits and pieces into
>> the existing framework I would like to lay out a line of attack that ends up
>> separating ufuncs into smaller pieces that provide low level routines that
>> work on strided memory while leaving policy implementation to the subtype.
>> There would need to be some default type (ndarray) when the functions are
>> called on nested lists and scalars and I'm not sure of the best way to
>> handle that.
>>
>> I'm just sort of thinking out loud, don't take it too seriously.
>
> Thanks for the clarification. I think I see how this could work: if
> ufuncs were callable instances of classes, __call__ would find the
> input with highest priority and pass itself and the input to that
> object's __handle_ufunc__. Now it is up to __handle_ufunc__ to
> determine whether and how to modify the input, call some method on the
> ufunc (like execute)
> to perform the buffer operation, then __handle_ufunc__ performs the
> cast, deals with metadata and returns the result.
>
> I skipped a step: initializing the output buffer. Would that be rolled
> into the ufunc execution, or should it be possible for
> __handle_ufunc__ to access the initialized buffer before execution
> occurs(__array_prepare__)? I think it is important to be able to
> perform the cast and calculate metadata before ufunc execution. If an
> error occurs, an exception can be raised before the ufunc operates on
> the arrays, which can modifies the data in place.

We discussed the possibility of simplifying the wrapping scheme with a
method like __handle_gfunc__. (I don't think this necessarily has to
be limited to ufuncs.) I think a second method like __prepare_input__
is also necessary. Imagine something like:

class GenericFunction:
    @property
    def executable(self):
        return self._executable

    def __init__(self, executable):
        self._executable = executable

    def __call__(self, *args, **kwargs):
        # find the input with highest priority, and then:
        input = max(args, key=lambda a: getattr(a, '__array_priority__', 0))
        args, kwargs = input.__prepare_input__(self, *args, **kwargs)
        return input.__handle_gfunc__(self, *args, **kwargs)

# this is the core function to be passed to the generic class:
def _add(a, b, out=None):
    # the generic, ndarray implementation.
    ...

# here is the publicly exposed interface:
add = GenericFunction(_add)

# now my subclasses (mod_input, mod_output and friends stand for
# per-gfunc lookup tables maintained by the subclass)
class MyArray(ndarray):
    # My class tweaks the execution of the function in __handle_gfunc__
    def __prepare_input__(self, gfunc, *args, **kwargs):
        return mod_input[gfunc](*args, **kwargs)

    def __handle_gfunc__(self, gfunc, *args, **kwargs):
        res = gfunc.executable(*args, **kwargs)
        # you could have called a different core func there
        return mod_output[gfunc](res, *args, **kwargs)

class MyNextArray(MyArray):
    def __prepare_input__(self, gfunc, *args, **kwargs):
        # let the superclass do its thing:
        args, kwargs = MyArray.__prepare_input__(self, gfunc, *args, **kwargs)
        # now I can tweak it further:
        return mod_input_further[gfunc](*args, **kwargs)

    def __handle_gfunc__(self, gfunc, *args, **kwargs):
        # let's defer to the superclass to handle calling the core function:
        res = MyArray.__handle_gfunc__(self, gfunc, *args, **kwargs)
        # and now we have one more crack at the result before passing it back:
        return mod_output_further[gfunc](res, *args, **kwargs)

If a gfunc is not recognized, the subclass might raise a
NotImplementedError or it might just pass the original args, kwargs on
through. I didn't write that part out because the example was already
running long. But the point is that a single entry point could be used
for any subclass, without having to worry about how to support every
subclass. It may still be necessary to be mindful to use asanyarray in
the core functions, but if a subclass alters 

Re: [Numpy-discussion] numpy.trapz() doesn't respect subclass

2010-03-28 Thread Darren Dale
On Sat, Mar 27, 2010 at 10:23 PM,   wrote:
> subclasses of ndarray, like masked_arrays and quantities, and classes
> that delegate to array calculations, like pandas, can redefine
> anything. So there is not much that can be relied on if any subclass
> is allowed to be used inside a function
>
> e.g. quantities redefines sin, cos,...
> http://packages.python.org/quantities/user/issues.html#umath-functions

Those functions were only intended to be used in the short term, until
the ufuncs that ship with numpy included a mechanism that allowed
quantity arrays to propagate the units. It would be nice to have a
mechanism (like we have discussed briefly just recently on this list)
where there is a single entry point to a given function like add, but
subclasses can tweak the execution.

We discussed the possibility of simplifying the wrapping scheme with a
method like __handle_gfunc__. (I don't think this necessarily has to
be limited to ufuncs.) I think a second method like __prepare_input__
is also necessary. Imagine something like:

class GenericFunction:
    @property
    def executable(self):
        return self._executable

    def __init__(self, executable):
        self._executable = executable

    def __call__(self, *args, **kwargs):
        # find the input with highest priority, and then:
        input = max(args, key=lambda a: getattr(a, '__array_priority__', 0))
        args, kwargs = input.__prepare_input__(self, *args, **kwargs)
        return input.__handle_gfunc__(self, *args, **kwargs)

# this is the core function to be passed to the generic class:
def _add(a, b, out=None):
    # the generic, ndarray implementation.
    ...

# here is the publicly exposed interface:
add = GenericFunction(_add)

# now my subclasses (mod_input, mod_output and friends stand for
# per-gfunc lookup tables maintained by the subclass)
class MyArray(ndarray):
    # My class tweaks the execution of the function in __handle_gfunc__
    def __prepare_input__(self, gfunc, *args, **kwargs):
        return mod_input[gfunc](*args, **kwargs)

    def __handle_gfunc__(self, gfunc, *args, **kwargs):
        res = gfunc.executable(*args, **kwargs)
        # you could have called a different core func there
        return mod_output[gfunc](res, *args, **kwargs)

class MyNextArray(MyArray):
    def __prepare_input__(self, gfunc, *args, **kwargs):
        # let the superclass do its thing:
        args, kwargs = MyArray.__prepare_input__(self, gfunc, *args, **kwargs)
        # now I can tweak it further:
        return mod_input_further[gfunc](*args, **kwargs)

    def __handle_gfunc__(self, gfunc, *args, **kwargs):
        # let's defer to the superclass to handle calling the core function:
        res = MyArray.__handle_gfunc__(self, gfunc, *args, **kwargs)
        # and now we have one more crack at the result before passing it back:
        return mod_output_further[gfunc](res, *args, **kwargs)

If a gfunc is not recognized, the subclass might raise a
NotImplementedError or it might just pass the original args, kwargs on
through. I didn't write that part out because the example was already
running long. But the point is that a single entry point could be used
for any subclass, without having to worry about how to support every
subclass.

Darren


[Numpy-discussion] should ndarray implement __round__ for py3k?

2010-03-25 Thread Darren Dale
A simple test in python 3:

>>> import numpy as np
>>> round(np.arange(10))
Traceback (most recent call last):
  File "", line 1, in 
TypeError: type numpy.ndarray doesn't define __round__ method

Here is some additional context: http://bugs.python.org/issue7261
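
For context, the method being asked about would look something like
this on a toy subclass (illustrative only, not a proposed numpy
implementation):

import numpy as np

class Rounded(np.ndarray):
    def __round__(self, ndigits=None):
        # let the builtin round() defer to np.round
        return np.round(self, 0 if ndigits is None else ndigits)

x = np.array([0.5, 1.5, 2.25]).view(Rounded)
print(round(x))      # [ 0.  2.  2.] -- works under py3k
print(round(x, 1))   # [ 0.5  1.5  2.2]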

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-19 Thread Darren Dale
On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris
 wrote:
> On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale  wrote:
>> On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris
>> > What bothers me here is the opposing desire to separate ufuncs from
>> > their
>> > ndarray dependency, having them operate on buffer objects instead. As I
>> > see
>> > it ufuncs would be split into layers, with a lower layer operating on
>> > buffer
>> > objects, and an upper layer tying them together with ndarrays where the
>> > "business" logic -- kinds, casting, etc -- resides. It is in that upper
>> > layer that what you are proposing would reside. Mind, I'm not sure that
>> > having matrices and masked arrays subclassing ndarray was the way to go,
>> > but
>> > given that they do one possible solution is to dump the whole mess onto
>> > the
>> > subtype with the highest priority. That subtype would then be
>> > responsible
>> > for casts and all the other stuff needed for the call and wrapping the
>> > result. There could be library routines to help with that. It seems to
>> > me
>> > that that would be the most general way to go. In that sense ndarrays
>> > themselves would just be another subtype with especially low priority.
>>
>> I'm sorry, I didn't understand your point. What you described sounds
>> identical to how things are currently done. What distinction are you
>> making, aside from operating on the buffer object? How would adding a
>> method to modify the input to a ufunc complicate the situation?
>>
>
> Just *one* function to rule them all and on the subtype dump it. No
> __array_wrap__, __input_prepare__, or __array_prepare__, just something like
> __handle_ufunc__. So it is similar but perhaps more radical. I'm proposing
> having the ufunc upper layer do nothing but decide which argument type will
> do all the rest of the work, casting, calling the low level ufunc base,
> providing buffers, wrapping, etc. Instead of pasting bits and pieces into
> the existing framework I would like to lay out a line of attack that ends up
> separating ufuncs into smaller pieces that provide low level routines that
> work on strided memory while leaving policy implementation to the subtype.
> There would need to be some default type (ndarray) when the functions are
> called on nested lists and scalars and I'm not sure of the best way to
> handle that.
>
> I'm just sort of thinking out loud, don't take it too seriously.

This is a seemingly simplified approach. I was taken with it last
night but then I remembered that it will make subclassing difficult. A
simple example can illustrate the problem. We have MaskedArray, which
needs to customize some functions that operate on arrays or buffers,
so we pass the function and the arguments to __handle_ufunc__ and it
takes care of the whole shebang. But now I develop a MaskedQuantity
that takes masked arrays and gives them the ability to handle units,
and so it needs to customize those functions further. Maybe
MaskedQuantity can modify the input passed to its __handle_ufunc__ and
then pass everything on to super().__handle_ufunc__, such that
MaskedQuantity does not have to reimplement MaskedArray's
customizations to that particular function, but that is not enough
flexibility for the general case. If my subclass needs to call the
low-level ufunc base, it can't rely on the superclass.__handle_ufunc__
because it *also* calls the ufunc base, so my subclass has to
reimplement all of the superclass function customizations.

The current scheme (__input_prepare__, ...) is better able to handle
subclassing, although I agree that it could be improved. If the
subclasses were responsible for calling the ufunc base, alternative
bases could be provided (like the c routines for masked arrays). That
still seems to require the high-level function to provide three or
four entry points: 1) modify the input, 2) initialize the output
(chance to deal with metadata), 3) call the function base, 4) finalize
the output (deal with metadata that requires the ufunc results).
Perhaps 2 and 4 would not both be needed, I'm not sure.
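
To make the objection concrete, here is a toy sketch (plain classes
standing in for the array subclasses; every name is made up):

def base_add(a, b):
    # stand-in for the low-level ufunc loop
    return a + b

class Masked(object):
    def __handle_ufunc__(self, base, a, b):
        # the mask bookkeeping is welded to this call of the base:
        result = base(a, b)
        return ('masked', result)

class MaskedQuantity(Masked):
    def rescale(self, v):
        return v  # placeholder for unit conversion
    def __handle_ufunc__(self, base, a, b):
        a, b = self.rescale(a), self.rescale(b)
        # If units only required tweaking the inputs, deferring to the
        # superclass would be fine.  But if this class needs to call a
        # *different* low-level base, Masked.__handle_ufunc__ is useless
        # -- it always calls its own -- so all of Masked's bookkeeping
        # would have to be re-implemented here.
        return Masked.__handle_ufunc__(self, base, a, b) + ('with units',)

print(MaskedQuantity().__handle_ufunc__(base_add, 1, 2))
# ('masked', 3, 'with units')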

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-18 Thread Darren Dale
On Thu, Mar 18, 2010 at 5:12 PM, Eric Firing  wrote:
> Ryan May wrote:
>> On Thu, Mar 18, 2010 at 2:46 PM, Christopher Barker
>>  wrote:
>>> Gael Varoquaux wrote:
 On Thu, Mar 18, 2010 at 12:12:10PM -0700, Christopher Barker wrote:
> sure -- that's kind of my point -- if EVERY numpy array were
> (potentially) masked, then folks would write code to deal with them
> appropriately.
 That's pretty much saying: "I have a complicated problem and I want
 everyone else to have to deal with the full complexity of it, even if they
 have a simple problem".
>>> Well -- I did say it was a fantasy...
>>>
>>> But I disagree -- having invalid data is a very common case. What we
>>> have now is a situation where we have two parallel systems, masked
>>> arrays and regular arrays. Each time someone does something new with
>>> masked arrays, they often find another missing feature, and have to
>>> solve that. Also, the fact that masked arrays are tacked on means that
>>> performance suffers.
>>
>> Case in point, I just found a bug in np.gradient where it forces the
>> output to be an ndarray.
>> (http://projects.scipy.org/numpy/ticket/1435).  Easy fix that doesn't
>> actually require any special casing for masked arrays, just making
>> sure to use the proper function to create a new array of the same
>> subclass as the input.  However, now for any place that I can't patch
>> I have to use a custom function until a fixed numpy is released.
>>
>> Maybe universal support for masked arrays (and masking invalid points)
>> is a pipe dream, but every function in numpy should IMO deal properly
>> with subclasses of ndarray.
>
> 1) This can't be done in general because subclasses can change things to
> the point where there is little one can count on.  The matrix subclass,
> for example, redefines multiplication and iteration, making it difficult
> to write functions that will work for ndarrays or matrices.

I'm more optimistic that it can be done in general, if we provide a
mechanism where the subclass with highest priority can customize the
execution of the function (ufunc or not). In principle, the subclass
could even override the buffer operation, like in the case of
matrices. It still can put a lot of responsibility on the authors of
the subclass, but what is gained is a framework where np.add (for
example) could yield the appropriate result for any subclass, as
opposed to the current situation of needing to know which add function
can be used for a particular type of input.

All speculative, of course. I'll start throwing some examples together
when I get a chance.

Darren


[Numpy-discussion] ufunc improvements [Was: Warnings in numpy.ma.test()]

2010-03-18 Thread Darren Dale
On Wed, Mar 17, 2010 at 10:16 PM, Charles R Harris
 wrote:
>
>
> On Wed, Mar 17, 2010 at 7:39 PM, Darren Dale  wrote:
>>
>> On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris
>>  wrote:
>> >
>> >
>> > On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale  wrote:
>> >>
>> >> On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris
>> >>  wrote:
>> >> > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale 
>> >> > wrote:
>> >> >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM 
>> >> >> wrote:
>> >> >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote:
>> >> >> >>
>> >> >> >> I started thinking about a third method called __input_prepare__
>> >> >> >> that
>> >> >> >> would be called on the way into the ufunc, which would allow you
>> >> >> >> to
>> >> >> >> intercept the input and pass a somehow modified copy back to the
>> >> >> >> ufunc. The total flow would be:
>> >> >> >>
>> >> >> >> 1) Call myufunc(x, y[, z])
>> >> >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which
>> >> >> >> returns
>> >> >> >> x',
>> >> >> >> y' (or simply passes through x,y by default)
>> >> >> >> 3) myufunc creates the output array z (if not specified) and
>> >> >> >> calls
>> >> >> >> ?.__array_prepare__(z, (myufunc, x, y, ...))
>> >> >> >> 4) myufunc finally gets around to performing the calculation
>> >> >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and
>> >> >> >> returns
>> >> >> >> the result to the caller
>> >> >> >>
>> >> >> >> Is this general enough for your use case? I haven't tried to
>> >> >> >> think
>> >> >> >> about how to change some global state at one point and change it
>> >> >> >> back
>> >> >> >> at another, that seems like a bad idea and difficult to support.
>> >> >> >
>> >> >> >
>> >> >> > Sounds like a good plan. If we could find a way to merge the first
>> >> >> > two
>> >> >> > (__input_prepare__ and __array_prepare__), that'd be ideal.
>> >> >>
>> >> >> I think it is better to keep them separate, so we don't have one
>> >> >> method that is trying to do too much. It would be easier to explain
>> >> >> in
>> >> >> the documentation.
>> >> >>
>> >> >> I may not have much time to look into this until after Monday. Is
>> >> >> there a deadline we need to consider?
>> >> >>
>> >> >
>> >> > I don't think this should go into 2.0, I think it needs more thought.
>> >>
>> >> Now that you mention it, I agree that it would be too rushed to try to
>> >> get it in for 2.0. Concerning a later release, is there anything in
>> >> particular that you think needs to be clarified or reconsidered?
>> >>
>> >> > And
>> >> > 2.0 already has significant code churn. Is there any reason beyond a
>> >> > big
>> >> > hassle not to set/restore the error state around all the ufunc calls
>> >> > in
>> >> > ma?
>> >> > Beyond that, the PEP that you pointed to looks interesting. Maybe
>> >> > some
>> >> > sort
>> >> > of decorator around ufunc calls could also be made to work.
>> >>
>> >> I think the PEP is interesting, but it is languishing. There were some
>> >> questions and criticisms on the mailing list that I do not think were
>> >> satisfactorily addressed, and as far as I know the author of the PEP
>> >> has not pursued the matter further. There was some interest on the
>> >> python-dev mailing list in the numpy community's use case, but I think
>> >> we need to consider what can be done now to meet the needs of ndarray
>> >> subclasses. I don't see PEP 3124 happening in the near future.
>> >>
>> >> What I am proposing is a simple extension to our existing framework to
>>

Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-17 Thread Darren Dale
On Wed, Mar 17, 2010 at 8:22 PM, Charles R Harris
 wrote:
>
>
> On Wed, Mar 17, 2010 at 5:26 PM, Darren Dale  wrote:
>>
>> On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris
>>  wrote:
>> > On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale  wrote:
>> >> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM 
>> >> wrote:
>> >> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote:
>> >> >>
>> >> >> I started thinking about a third method called __input_prepare__
>> >> >> that
>> >> >> would be called on the way into the ufunc, which would allow you to
>> >> >> intercept the input and pass a somehow modified copy back to the
>> >> >> ufunc. The total flow would be:
>> >> >>
>> >> >> 1) Call myufunc(x, y[, z])
>> >> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns
>> >> >> x',
>> >> >> y' (or simply passes through x,y by default)
>> >> >> 3) myufunc creates the output array z (if not specified) and calls
>> >> >> ?.__array_prepare__(z, (myufunc, x, y, ...))
>> >> >> 4) myufunc finally gets around to performing the calculation
>> >> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and
>> >> >> returns
>> >> >> the result to the caller
>> >> >>
>> >> >> Is this general enough for your use case? I haven't tried to think
>> >> >> about how to change some global state at one point and change it
>> >> >> back
>> >> >> at another, that seems like a bad idea and difficult to support.
>> >> >
>> >> >
>> >> > Sounds like a good plan. If we could find a way to merge the first
>> >> > two
>> >> > (__input_prepare__ and __array_prepare__), that'd be ideal.
>> >>
>> >> I think it is better to keep them separate, so we don't have one
>> >> method that is trying to do too much. It would be easier to explain in
>> >> the documentation.
>> >>
>> >> I may not have much time to look into this until after Monday. Is
>> >> there a deadline we need to consider?
>> >>
>> >
>> > I don't think this should go into 2.0, I think it needs more thought.
>>
>> Now that you mention it, I agree that it would be too rushed to try to
>> get it in for 2.0. Concerning a later release, is there anything in
>> particular that you think needs to be clarified or reconsidered?
>>
>> > And
>> > 2.0 already has significant code churn. Is there any reason beyond a big
>> > hassle not to set/restore the error state around all the ufunc calls in
>> > ma?
>> > Beyond that, the PEP that you pointed to looks interesting. Maybe some
>> > sort
>> > of decorator around ufunc calls could also be made to work.
>>
>> I think the PEP is interesting, but it is languishing. There were some
>> questions and criticisms on the mailing list that I do not think were
>> satisfactorily addressed, and as far as I know the author of the PEP
>> has not pursued the matter further. There was some interest on the
>> python-dev mailing list in the numpy community's use case, but I think
>> we need to consider what can be done now to meet the needs of ndarray
>> subclasses. I don't see PEP 3124 happening in the near future.
>>
>> What I am proposing is a simple extension to our existing framework to
>> let subclasses hook into ufuncs and customize their behavior based on
>> the context of the operation (using the __array_priority__ of the
>> inputs and/or outputs, and the identity of the ufunc). The steps I
>> listed allow customization at the critical steps: prepare the input,
>> prepare the output, populate the output (currently no proposal for
>> customization here), and finalize the output. The only additional step
>> proposed is to prepare the input.
>>
>
> What bothers me here is the opposing desire to separate ufuncs from their
> ndarray dependency, having them operate on buffer objects instead. As I see
> it ufuncs would be split into layers, with a lower layer operating on buffer
> objects, and an upper layer tying them together with ndarrays where the
> "business" logic -- kinds, casting, etc -- resides. It is in that upper
> layer that what you are proposing would reside. Mind, I'm not sure that
> having matrices and masked arrays subclassing ndarray was the way to go, but
> given that they do one possible solution is to dump the whole mess onto the
> subtype with the highest priority. That subtype would then be responsible
> for casts and all the other stuff needed for the call and wrapping the
> result. There could be library routines to help with that. It seems to me
> that that would be the most general way to go. In that sense ndarrays
> themselves would just be another subtype with especially low priority.

I'm sorry, I didn't understand your point. What you described sounds
identical to how things are currently done. What distinction are you
making, aside from operating on the buffer object? How would adding a
method to modify the input to a ufunc complicate the situation?

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-17 Thread Darren Dale
On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris
 wrote:
> On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale  wrote:
>> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM  wrote:
>> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote:
>> >>
>> >> I started thinking about a third method called __input_prepare__ that
>> >> would be called on the way into the ufunc, which would allow you to
>> >> intercept the input and pass a somehow modified copy back to the
>> >> ufunc. The total flow would be:
>> >>
>> >> 1) Call myufunc(x, y[, z])
>> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x',
>> >> y' (or simply passes through x,y by default)
>> >> 3) myufunc creates the output array z (if not specified) and calls
>> >> ?.__array_prepare__(z, (myufunc, x, y, ...))
>> >> 4) myufunc finally gets around to performing the calculation
>> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns
>> >> the result to the caller
>> >>
>> >> Is this general enough for your use case? I haven't tried to think
>> >> about how to change some global state at one point and change it back
>> >> at another, that seems like a bad idea and difficult to support.
>> >
>> >
>> > Sounds like a good plan. If we could find a way to merge the first two
>> > (__input_prepare__ and __array_prepare__), that'd be ideal.
>>
>> I think it is better to keep them separate, so we don't have one
>> method that is trying to do too much. It would be easier to explain in
>> the documentation.
>>
>> I may not have much time to look into this until after Monday. Is
>> there a deadline we need to consider?
>>
>
> I don't think this should go into 2.0, I think it needs more thought.

Now that you mention it, I agree that it would be too rushed to try to
get it in for 2.0. Concerning a later release, is there anything in
particular that you think needs to be clarified or reconsidered?

> And
> 2.0 already has significant code churn. Is there any reason beyond a big
> hassle not to set/restore the error state around all the ufunc calls in ma?
> Beyond that, the PEP that you pointed to looks interesting. Maybe some sort
> of decorator around ufunc calls could also be made to work.

I think the PEP is interesting, but it is languishing. There were some
questions and criticisms on the mailing list that I do not think were
satisfactorily addressed, and as far as I know the author of the PEP
has not pursued the matter further. There was some interest on the
python-dev mailing list in the numpy community's use case, but I think
we need to consider what can be done now to meet the needs of ndarray
subclasses. I don't see PEP 3124 happening in the near future.

What I am proposing is a simple extension to our existing framework to
let subclasses hook into ufuncs and customize their behavior based on
the context of the operation (using the __array_priority__ of the
inputs and/or outputs, and the identity of the ufunc). The steps I
listed allow customization at the critical steps: prepare the input,
prepare the output, populate the output (currently no proposal for
customization here), and finalize the output. The only additional step
proposed is to prepare the input.

In the long run, we could consider if ufuncs should be instances of a
class, perhaps implemented in Cython. This way the ufunc will be able
to pass itself to the special array methods as part of the context
tuple, as is currently done. Maybe an alternative approach would be
for ufuncs to provide methods where subclasses could register routines
for the various steps I specified based on the types of the inputs,
similar to the PEP. This way, the ufunc would determine the context
based on the types of the inputs (rather than the current scheme, where
the ufunc determines part of the context by inspecting
__array_priority__, and the input with the highest priority then
determines the rest of the context from the identity of the ufunc and
the remaining inputs). This new (half-baked) approach could be
backward-compatible with the old one: if the combination of inputs
isn't found in the registry, it would fall back on the existing
__input_prepare__/__array_prepare__/__array_wrap__ mechanisms (which in
principle could then be deprecated, and at that point
__array_priority__ might no longer be necessary). I don't see anything
to indicate that we would regret implementing a special
__input_prepare__ method down the road.
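
Concretely, the fallback might look something like this sketch
(call_ufunc, registry and compute are illustrative names, not actual
numpy API; __input_prepare__ is the proposed hook):

def call_ufunc(ufunc, registry, compute, *inputs):
    # try the type-based registry first, in the spirit of the PEP
    func = registry.get(tuple(type(x) for x in inputs))
    if func is not None:
        return func(*inputs)
    # fall back on the existing __array_priority__-driven protocol
    top = max(inputs, key=lambda x: getattr(x, '__array_priority__', 0.0))
    if hasattr(top, '__input_prepare__'):  # the proposed new hook
        inputs = top.__input_prepare__(ufunc, *inputs)
    out = compute(*inputs)  # the raw element-wise loop
    if hasattr(top, '__array_wrap__'):
        out = top.__array_wrap__(out, (ufunc, inputs, 0))
    return out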

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-17 Thread Darren Dale
On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM  wrote:
> On Mar 17, 2010, at 8:19 AM, Darren Dale wrote:
>>
>> I started thinking about a third method called __input_prepare__ that
>> would be called on the way into the ufunc, which would allow you to
>> intercept the input and pass a somehow modified copy back to the
>> ufunc. The total flow would be:
>>
>> 1) Call myufunc(x, y[, z])
>> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x',
>> y' (or simply passes through x,y by default)
>> 3) myufunc creates the output array z (if not specified) and calls
>> ?.__array_prepare__(z, (myufunc, x, y, ...))
>> 4) myufunc finally gets around to performing the calculation
>> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns
>> the result to the caller
>>
>> Is this general enough for your use case? I haven't tried to think
>> about how to change some global state at one point and change it back
>> at another, that seems like a bad idea and difficult to support.
>
>
> Sounds like a good plan. If we could find a way to merge the first two 
> (__input_prepare__ and __array_prepare__), that'd be ideal.

I think it is better to keep them separate, so we don't have one
method that is trying to do too much. It would be easier to explain in
the documentation.

I may not have much time to look into this until after Monday. Is
there a deadline we need to consider?

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-17 Thread Darren Dale
On Wed, Mar 17, 2010 at 10:45 AM, Charles R Harris
 wrote:
>
>
> On Wed, Mar 17, 2010 at 6:19 AM, Darren Dale  wrote:
>>
>> On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM  wrote:
>> > All,
>> > As you're probably aware, the current test suite for numpy.ma raises
>> > some nagging warnings such as "invalid value in ...". These warnings are
>> > only issued when a standard numpy ufunc (e.g., np.sqrt) is called on a
>> > MaskedArray, instead of its numpy.ma equivalent (e.g., np.ma.sqrt). The
>> > reason is that the masked versions of the ufuncs temporarily set the numpy
>> > error status to 'ignore' before the operation takes place, and reset the
>> > status to its original value.
>>
>> > I thought I could use the new __array_prepare__ method to intercept the
>> > call of a standard ufunc. After actual testing, that can't work.
>> > __array_prepare__ only helps to prepare the *output* of the operation, not
>> > to change the input on the fly just for this operation. Actually, you can
>> > modify the input in place, but it's usually not what you want.
>>
>> That is correct, __array_prepare__ is called just after the output
>> array is created, but before the ufunc actually gets down to business.
>> I have the same limitation in quantities that you are now seeing with
>> masked arrays; in my case, I want the opportunity to rescale different
>> but compatible quantities for the operation (without changing the
>> original arrays in place, of course).
>>
>> > Then, I tried to use __array_prepare__ to store the current error
>> > status in the input, force it to ignore divide/invalid errors and send the
>> > input to the ufunc. Doesn't work either: np.seterr in __array_prepare__
>> > does change the error status, but as far as I understand, the ufunc is
>> > still called with the original error status. That means that if something
>> > goes wrong, your error status can stay stuck. Not a good idea either.
>> > I'm running out of ideas at this point. For the test suite, I'd suggest
>> > to disable the warnings in test_fix_invalid and test_basic_arithmetic.
>> > An additional issue is that if one of the error statuses is set to
>> > 'raise', the numpy ufunc will raise the exception (as expected), while its
>> > numpy.ma version will not. I'll also put a warning in the docs to that
>> > effect.
>> > Please send me your comments before I commit any changes.
>>
>> I started thinking about a third method called __input_prepare__ that
>> would be called on the way into the ufunc, which would allow you to
>> intercept the input and pass a somehow modified copy back to the
>> ufunc. The total flow would be:
>>
>> 1) Call myufunc(x, y[, z])
>> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x',
>> y' (or simply passes through x,y by default)
>> 3) myufunc creates the output array z (if not specified) and calls
>> ?.__array_prepare__(z, (myufunc, x, y, ...))
>> 4) myufunc finally gets around to performing the calculation
>> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns
>> the result to the caller
>>
>> Is this general enough for your use case? I haven't tried to think
>> about how to change some global state at one point and change it back
>> at another, that seems like a bad idea and difficult to support.
>>
>
> I'm not a masked array user and not familiar with the specific problems
> here, but as an outsider it's beginning to look like one little fix after
> another.

Yeah, I was concerned that criticism would come up.

> Is there some larger framework that would help here?

I think there is: http://www.python.org/dev/peps/pep-3124/

> Changes to the ufuncs themselves?

Perhaps, if ufuncs were instances of a class that implemented
__call__, it would be easier to include context management. Maybe this
approach could be coupled with input_prepare, array_prepare and
array_wrap to provide everything we need.
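
Roughly, something like this sketch (a hypothetical design, not
numpy's actual ufunc implementation):

import numpy as np

class wrapped_ufunc:
    """Hypothetical class-based ufunc that manages error state itself."""
    def __init__(self, inner, errstate=None):
        self.inner = inner        # the underlying computation
        self.errstate = errstate  # e.g. {'divide': 'ignore'}
    def __call__(self, *args, **kwargs):
        if self.errstate is None:
            return self.inner(*args, **kwargs)
        saved = np.seterr(**self.errstate)  # adjust state for this call only
        try:
            return self.inner(*args, **kwargs)
        finally:
            np.seterr(**saved)  # always restore the caller's state

# e.g., a sqrt that never warns about invalid values:
ma_sqrt = wrapped_ufunc(np.sqrt, {'invalid': 'ignore'})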

> There was some code for masked ufuncs on the c level
> posted a while back that I thought was interesting; would it help to have
> masked versions of the ufuncs?

I think we need a solution that avoids implementing an entirely new
set of ufuncs for specific subclasses.

> So on and so forth. It just looks like a larger design issue needs to be 
> addressed here.

I'm interested to hear other people's perspectives or suggestions.

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-17 Thread Darren Dale
On Wed, Mar 17, 2010 at 10:11 AM, Ryan May  wrote:
> On Wed, Mar 17, 2010 at 7:19 AM, Darren Dale  wrote:
>> Is this general enough for your use case? I haven't tried to think
>> about how to change some global state at one point and change it back
>> at another, that seems like a bad idea and difficult to support.
>
> Sounds like the textbook use case for the python 2.5/2.6 context
> manager. Pity we can't use it yet... (and I'm not sure it'd be easy
> to wrap around the calls here.)

I don't think context managers would work. They would be implemented
in one of the subclass's special methods and would thus go out of
scope before the ufunc got around to performing the calculation that
required the change in state.
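
To illustrate (using np.errstate, and ignoring for a moment that numpy
itself couldn't rely on the with statement yet):

import numpy as np

class Masked(np.ndarray):
    def __array_prepare__(self, obj, context=None):
        with np.errstate(invalid='ignore'):
            pass
        # the with-block has already exited here, restoring the error
        # state -- but the ufunc's inner loop only runs *after* this
        # method returns, so the 'ignore' setting never applies to it
        return obj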

Darren


Re: [Numpy-discussion] Warnings in numpy.ma.test()

2010-03-17 Thread Darren Dale
On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM  wrote:
> All,
> As you're probably aware, the current test suite for numpy.ma raises some 
> nagging warnings such as "invalid value in ...". These warnings are only 
> issued when a standard numpy ufunc (eg., np.sqrt) is called on a MaskedArray, 
> instead of its numpy.ma (eg., np.ma.sqrt) equivalent. The reason is that the 
> masked versions of the ufuncs temporarily set the numpy error status to 
> 'ignore' before the operation takes place, and reset the status to its 
> original value.

> I thought I could use the new __array_prepare__ method to intercept the call 
> of a standard ufunc. After actual testing, that can't work. __array_prepare__
> only helps to prepare the *output* of the operation, not to change the input
> on the fly just for this operation. Actually, you can modify the input in
> place, but it's usually not what you want.

That is correct, __array_prepare__ is called just after the output
array is created, but before the ufunc actually gets down to business.
I have the same limitation in quantities that you are now seeing with
masked arrays; in my case, I want the opportunity to rescale different
but compatible quantities for the operation (without changing the
original arrays in place, of course).
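
For example, with the __input_prepare__ method proposed below,
quantities could hook in with something like this sketch (rescale()
and units stand in for the quantities package's API):

import numpy as np

class Quantity(np.ndarray):
    # proposed hook: receives the ufunc and all inputs, and returns
    # (possibly modified) replacements, leaving the originals alone
    def __input_prepare__(self, ufunc, *inputs):
        if ufunc in (np.add, np.subtract):
            return tuple(x.rescale(self.units) if isinstance(x, Quantity)
                         else x for x in inputs)
        return inputs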

> Then, I tried to use __array_prepare__ to store the current error status in
> the input, force it to ignore divide/invalid errors and send the input to the
> ufunc. Doesn't work either: np.seterr in __array_prepare__ does change the
> error status, but as far as I understand, the ufunc is still called
> with the original error status. That means that if something goes wrong, your 
> error status can stay stuck. Not a good idea either.
> I'm running out of ideas at this point. For the test suite, I'd suggest to 
> disable the warnings in test_fix_invalid and test_basic_arithmetic.
> An additional issue is that if one of the error statuses is set to 'raise', the
> numpy ufunc will raise the exception (as expected), while its numpy.ma
> version will not. I'll also put a warning in the docs to that effect.
> Please send me your comments before I commit any changes.

I started thinking about a third method called __input_prepare__ that
would be called on the way into the ufunc, which would allow you to
intercept the input and pass a somehow modified copy back to the
ufunc. The total flow would be:

1) Call myufunc(x, y[, z])
2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x',
y' (or simply passes through x,y by default)
3) myufunc creates the output array z (if not specified) and calls
?.__array_prepare__(z, (myufunc, x, y, ...))
4) myufunc finally gets around to performing the calculation
5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns
the result to the caller

Is this general enough for your use case? I haven't tried to think
about how to change some global state at one point and change it back
at another, that seems like a bad idea and difficult to support.
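
Spelled out as code, the flow might look roughly like this sketch
(myufunc, the hook lookup and the compute step are all placeholders;
__input_prepare__ is the proposed addition):

import numpy as np

def myufunc(x, y, compute=np.add):
    # whichever input claims the highest __array_priority__ supplies the hooks
    hook = max((x, y), key=lambda a: getattr(a, '__array_priority__', 0.0))
    # 2) intercept and possibly replace the inputs (proposed)
    if hasattr(hook, '__input_prepare__'):
        x, y = hook.__input_prepare__(myufunc, x, y)
    # 3) create the output array and let the subclass prepare it
    z = np.empty(np.broadcast(x, y).shape)
    if hasattr(hook, '__array_prepare__'):
        z = hook.__array_prepare__(z, (myufunc, (x, y), 0))
    # 4) perform the calculation, writing into z
    compute(np.asarray(x), np.asarray(y), z.view(np.ndarray))
    # 5) give the subclass a last chance to wrap the result
    if hasattr(hook, '__array_wrap__'):
        z = hook.__array_wrap__(z, (myufunc, (x, y), 0))
    return z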

Darren


Re: [Numpy-discussion] subclassing ndarray in python3

2010-03-11 Thread Darren Dale
Hi Pauli,

On Thu, Mar 11, 2010 at 3:38 PM, Pauli Virtanen  wrote:
> Thanks for testing. I wish the test suite was more complete (hint!
> hint! :)

I'll be happy to contribute, but lately I get a few 15-30 minute
blocks a week for this kind of work (hence the short attempt to work
on Quantities this morning), and it's not likely to let up for about 3
weeks.

> Yes, probably explicitly defining __rmul__ for ndarray could be the
> right solution. Please file a bug report on this.

Done: http://projects.scipy.org/numpy/ticket/1426
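
In the meantime, one possible workaround for the script below is to
route the reflected operations through the ufuncs directly, bypassing
ndarray's reflected methods (an untested sketch):

import numpy as np

class A(np.ndarray):
    pass

class B(A):
    # ndarray.__rmul__/__radd__ return NotImplemented here, so call
    # the corresponding ufuncs directly instead
    def __rmul__(self, other):
        return np.multiply(other, self.view(A))
    def __radd__(self, other):
        return np.add(other, self.view(A))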

Cheers, and *thank you* for all you have already done to support python-3,
Darren


[Numpy-discussion] subclassing ndarray in python3

2010-03-11 Thread Darren Dale
Now that the trunk has some support for python3, I am working on
making Quantities work with python3 as well. I'm running into some
problems related to subclassing ndarray that can be illustrated with a
simple script, reproduced below. It looks like there is a problem with
the reflected operations, I see problems with __rmul__ and __radd__,
but not with __mul__ and __add__:

import numpy as np


class A(np.ndarray):
    def __new__(cls, *args, **kwargs):
        return np.ndarray.__new__(cls, *args, **kwargs)

class B(A):
    def __mul__(self, other):
        return self.view(A).__mul__(other)
    def __rmul__(self, other):
        return self.view(A).__rmul__(other)
    def __add__(self, other):
        return self.view(A).__add__(other)
    def __radd__(self, other):
        return self.view(A).__radd__(other)

a = A((10,))
b = B((10,))

print('A __mul__:')
print(a.__mul__(2))
# ok
print(a.view(np.ndarray).__mul__(2))
# ok
print(a*2)
# ok

print('A __rmul__:')
print(a.__rmul__(2))
# yields NotImplemented
print(a.view(np.ndarray).__rmul__(2))
# yields NotImplemented
print(2*a)
# ok !!??

print('B __mul__:')
print(b.__mul__(2))
# ok
print(b.view(A).__mul__(2))
# ok
print(b.view(np.ndarray).__mul__(2))
# ok
print(b*2)
# ok

print('B __add__:')
print(b.__add__(2))
# ok
print(b.view(A).__add__(2))
# ok
print(b.view(np.ndarray).__add__(2))
# ok
print(b+2)
# ok

print('B __rmul__:')
print(b.__rmul__(2))
# yields NotImplemented
print(b.view(A).__rmul__(2))
# yields NotImplemented
print(b.view(np.ndarray).__rmul__(2))
# yields NotImplemented
print(2*b)
# yields: TypeError: unsupported operand type(s) for *: 'int' and 'B'

print('B __radd__:')
print(b.__radd__(2))
# yields NotImplemented
print(b.view(A).__radd__(2))
# yields NotImplemented
print(b.view(np.ndarray).__radd__(2))
# yields NotImplemented
print(2+b)
# yields: TypeError: unsupported operand type(s) for +: 'int' and 'B'


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-12 Thread Darren Dale
On Fri, Feb 12, 2010 at 12:16 AM, David Cournapeau
 wrote:
> Charles R Harris wrote:
>
>>
>>
>> I don't see any struct definitions there, it looks clean.
>
> Any struct defined outside numpy/core/include is fine to change at will
> as far as ABI is concerned anyway, so no need to check anything :)

Thanks for the clarification. I just double checked the svn diff
(r7308), and I did not touch anything in numpy/core/include.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-11 Thread Darren Dale
On Thu, Feb 11, 2010 at 11:57 PM, Charles R Harris
 wrote:
>
>
> On Thu, Feb 11, 2010 at 9:39 PM, Darren Dale  wrote:
>>
>> On Thu, Feb 11, 2010 at 11:22 PM, Charles R Harris
>>  wrote:
>> >
>> >
>> > On Thu, Feb 11, 2010 at 8:12 PM, David Cournapeau
>> > 
>> > wrote:
>> >>
>> >> Charles R Harris wrote:
>> >> >
>> >> >
>> >> > On Thu, Feb 11, 2010 at 7:00 PM, David Cournapeau
>> >> > wrote:
>> >> >
>> >> >     josef.p...@gmail.com wrote:
>> >> >
>> >> >      > scipy is relatively easy to compile, I was thinking also of
>> >> > h5py,
>> >> >      > pytables and pymc (b/c of pytables), none of them are
>> >> > importing
>> >> > with
>> >> >      > numpy 1.4.0 because of the cython issue.
>> >> >
>> >> >     As I said, all of them will have to be regenerated with cython
>> >> > 0.12.1.
>> >> >     There is no other solution,
>> >> >
>> >> >
>> >> > Wait, won't the structures be the same size? If they are then the
>> >> > cython
>> >> > check won't fail.
>> >>
>> >> Yes, but the structures are bigger (even after removing the datetime
>> >> stuff, I had the cython warning when I did some tests).
>> >>
>> >
>> > That's curious. It sounds like it isn't ABI compatible yet. Any idea of
>> > what
>> > was added? It would be helpful if the cython message gave a bit more
>> > information...
>>
>> Could it be related to __array_prepare__?
>
> Didn't __array_prepare__  go into 1.3? Did you add anything to a structure?

No, it was included in 1.4:
http://svn.scipy.org/svn/numpy/trunk/doc/release/1.4.0-notes.rst

No, I don't think so. I added __array_prepare__ to array_methods[] in this file:
http://svn.scipy.org/svn/numpy/trunk/numpy/core/src/multiarray/methods.c

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-11 Thread Darren Dale
On Thu, Feb 11, 2010 at 11:22 PM, Charles R Harris
 wrote:
>
>
> On Thu, Feb 11, 2010 at 8:12 PM, David Cournapeau 
> wrote:
>>
>> Charles R Harris wrote:
>> >
>> >
>> > On Thu, Feb 11, 2010 at 7:00 PM, David Cournapeau wrote:
>> >
>> >     josef.p...@gmail.com  wrote:
>> >
>> >      > scipy is relatively easy to compile, I was thinking also of h5py,
>> >      > pytables and pymc (b/c of pytables), none of them are importing
>> > with
>> >      > numpy 1.4.0 because of the cython issue.
>> >
>> >     As I said, all of them will have to be regenerated with cython
>> > 0.12.1.
>> >     There is no other solution,
>> >
>> >
>> > Wait, won't the structures be the same size? If they are then the cython
>> > check won't fail.
>>
>> Yes, but the structures are bigger (even after removing the datetime
>> stuff, I had the cython warning when I did some tests).
>>
>
> That's curious. It sounds like it isn't ABI compatible yet. Any idea of what
> was added? It would be helpful if the cython message gave a bit more
> information...

Could it be related to __array_prepare__?


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-11 Thread Darren Dale
2010/2/11 Stéfan van der Walt :
> On 11 February 2010 09:52, Charles R Harris  wrote:
>> Simple, eh. The version should be 2.0.
>
> I'm going with the element of least surprise: no one will be surprised
> when 1.5 is released with ABI changes

I'll buy you a doughnut if that turns out to be correct.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-10 Thread Darren Dale
On Wed, Feb 10, 2010 at 3:31 PM, Travis Oliphant  wrote:
> On Feb 8, 2010, at 4:08 PM, Darren Dale wrote:
>> I definitely should have counted to 100 before sending that. It wasn't
>> helpful and I apologize.
>
> I actually found this quite funny.    I need to apologize if my previous
> email sounded like I was trying to silence other opinions, somehow.   As
> Robert alluded to in a rather well-written email that touched on resolving
> disagreements, it can be hard to communicate that you are listening to
> opposing views despite the fact that your opinion has not changed.

For what it's worth, I feel I have had ample opportunity to make my
concerns known, and at this point will leave it to others to do right
by the numpy user community.

> We have a SciPy steering committee that should be reviewed again this year
> at the SciPy conference.   As Robert said, we prefer not to have to use it
> to decide questions.   I think it has been trotted out as a place holder for
> a NumPy steering committee which has never really existed as far as I know.
>   NumPy decisions in the past have been made by me and other people who are
> writing the code.   I think we have tried pretty hard to listen to all
> points of view before doing anything.

Just a comment: I would like to point out that there is (necessarily)
some arbitrary threshold for who is recognized as "people who are
actively writing the code". Over the last year, I have posted fixes
for multiple bugs and extended the ufunc wrapping mechanisms
(__array_prepare__) which were included in numpy-1.4.0, and have also
been developing the quantities package, which is intimately tied up
with numpy's development. I don't think that makes me a major
contributor like you or Chuck etc., but I am heavily invested in
numpy's development and an active contributor.

Maybe it would be worth considering an approach where the numpy user
community occasionally nominates a few people to serve on some kind of
steering committee along with the developers. Although if there is
interest in or criticism of this idea, I don't think this is the right
thread to discuss it.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 10:53 PM, Charles R Harris
 wrote:
>
>
> On Mon, Feb 8, 2010 at 8:40 PM, Darren Dale  wrote:
>>
>> On Mon, Feb 8, 2010 at 10:35 PM, Charles R Harris
>>  wrote:
>> >
>> >
>> > On Mon, Feb 8, 2010 at 8:27 PM, Darren Dale  wrote:
>> >>
>> >> On Mon, Feb 8, 2010 at 10:24 PM, Robert Kern 
>> >> wrote:
>> >> > On Mon, Feb 8, 2010 at 21:23, Darren Dale  wrote:
>> >> >> On Mon, Feb 8, 2010 at 10:10 PM, Robert Kern 
>> >> >> wrote:
>> >> >>> On Mon, Feb 8, 2010 at 20:50, Darren Dale 
>> >> >>> wrote:
>> >> >>>> On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern
>> >> >>>> 
>> >> >>>> wrote:
>> >> >>>>> On Mon, Feb 8, 2010 at 18:43, Darren Dale 
>> >> >>>>> wrote:
>> >> >>>>>> On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern
>> >> >>>>>> 
>> >> >>>>>> wrote:
>> >> >>>>>>> Here's the problem that I don't think many people appreciate:
>> >> >>>>>>> logical
>> >> >>>>>>> arguments suck just as much as personal experience in answering
>> >> >>>>>>> these
>> >> >>>>>>> questions. You can make perfectly structured arguments until
>> >> >>>>>>> you
>> >> >>>>>>> are
>> >> >>>>>>> blue in the face, but without real data to premise them on,
>> >> >>>>>>> they
>> >> >>>>>>> are
>> >> >>>>>>> no better than the gut feelings. They can often be
>> >> >>>>>>> significantly
>> >> >>>>>>> worse
>> >> >>>>>>> if the strength of the logic gets confused with the strength of
>> >> >>>>>>> the
>> >> >>>>>>> premise.
>> >> >>>>>>
>> >> >>>>>> If I recall correctly, the convention of not breaking ABI
>> >> >>>>>> compatibility in minor releases was established in response to
>> >> >>>>>> the
>> >> >>>>>> last ABI compatibility break. Am I wrong?
>> >> >>>>>
>> >> >>>>> I'm not sure how this relates to the material quoted of me, but
>> >> >>>>> no,
>> >> >>>>> you're not wrong.
>> >> >>>>
>> >> >>>> Just trying to provide historical context to support the strength
>> >> >>>> of
>> >> >>>> the premise.
>> >> >>>
>> >> >>> The existence of the policy is not under question (anymore; I
>> >> >>> settled
>> >> >>> that with old email a while ago). The question is whether to change
>> >> >>> the policy.
>> >> >>
>> >> >> So I have gathered. I question whether the concerns that led to
>> >> >> that
>> >> >> decision in the first place are somehow less important now.
>> >> >
>> >> > And we're back to gut feeling territory again.
>> >>
>> >> That's unfair. I can't win based on gut, you know how skinny I am.
>> >
>> > We haven't reached the extreme of the two physicists at SLAC who stepped
>> > outside to settle a point with fisticuffs. But with any luck we will get
>> > there ;)
>>
>> Really? That also happened here at CHESS a long time ago, only they
>> didn't go outside to fight over who got to use the conference room.
>
> Heh. I can't vouch for the story personally, I got it from a guy who was a
> grad student back in the day working on a detector at Fermilab along with a
> cast of hundreds.

Yeah, same here. Although, one of the combatants at CHESS, after he
retired, beat an intruder into submission with a fireplace poker. That
story made the local papers.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 10:35 PM, Charles R Harris
 wrote:
>
>
> On Mon, Feb 8, 2010 at 8:27 PM, Darren Dale  wrote:
>>
>> On Mon, Feb 8, 2010 at 10:24 PM, Robert Kern 
>> wrote:
>> > On Mon, Feb 8, 2010 at 21:23, Darren Dale  wrote:
>> >> On Mon, Feb 8, 2010 at 10:10 PM, Robert Kern 
>> >> wrote:
>> >>> On Mon, Feb 8, 2010 at 20:50, Darren Dale  wrote:
>> >>>> On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern 
>> >>>> wrote:
>> >>>>> On Mon, Feb 8, 2010 at 18:43, Darren Dale 
>> >>>>> wrote:
>> >>>>>> On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern 
>> >>>>>> wrote:
>> >>>>>>> Here's the problem that I don't think many people appreciate:
>> >>>>>>> logical
>> >>>>>>> arguments suck just as much as personal experience in answering
>> >>>>>>> these
>> >>>>>>> questions. You can make perfectly structured arguments until you
>> >>>>>>> are
>> >>>>>>> blue in the face, but without real data to premise them on, they
>> >>>>>>> are
>> >>>>>>> no better than the gut feelings. They can often be significantly
>> >>>>>>> worse
>> >>>>>>> if the strength of the logic gets confused with the strength of
>> >>>>>>> the
>> >>>>>>> premise.
>> >>>>>>
>> >>>>>> If I recall correctly, the convention of not breaking ABI
>> >>>>>> compatibility in minor releases was established in response to the
>> >>>>>> last ABI compatibility break. Am I wrong?
>> >>>>>
>> >>>>> I'm not sure how this relates to the material quoted of me, but no,
>> >>>>> you're not wrong.
>> >>>>
>> >>>> Just trying to provide historical context to support the strength of
>> >>>> the premise.
>> >>>
>> >>> The existence of the policy is not under question (anymore; I settled
>> >>> that with old email a while ago). The question is whether to change
>> >>> the policy.
>> >>
>> >> So I have gathered. I question whether the concerns that led to that
>> >> decision in the first place are somehow less important now.
>> >
>> > And we're back to gut feeling territory again.
>>
>> That's unfair. I can't win based on gut, you know how skinny I am.
>
> We haven't reached the extreme of the two physicists at SLAC who stepped
> outside to settle a point with fisticuffs. But with any luck we will get
> there ;)

Really? That also happened here at CHESS a long time ago, only they
didn't go outside to fight over who got to use the conference room.


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 10:24 PM, Robert Kern  wrote:
> On Mon, Feb 8, 2010 at 21:23, Darren Dale  wrote:
>> On Mon, Feb 8, 2010 at 10:10 PM, Robert Kern  wrote:
>>> On Mon, Feb 8, 2010 at 20:50, Darren Dale  wrote:
>>>> On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern  wrote:
>>>>> On Mon, Feb 8, 2010 at 18:43, Darren Dale  wrote:
>>>>>> On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern  
>>>>>> wrote:
>>>>>>> Here's the problem that I don't think many people appreciate: logical
>>>>>>> arguments suck just as much as personal experience in answering these
>>>>>>> questions. You can make perfectly structured arguments until you are
>>>>>>> blue in the face, but without real data to premise them on, they are
>>>>>>> no better than the gut feelings. They can often be significantly worse
>>>>>>> if the strength of the logic gets confused with the strength of the
>>>>>>> premise.
>>>>>>
>>>>>> If I recall correctly, the convention of not breaking ABI
>>>>>> compatibility in minor releases was established in response to the
>>>>>> last ABI compatibility break. Am I wrong?
>>>>>
>>>>> I'm not sure how this relates to the material quoted of me, but no,
>>>>> you're not wrong.
>>>>
>>>> Just trying to provide historical context to support the strength of
>>>> the premise.
>>>
>>> The existence of the policy is not under question (anymore; I settled
>>> that with old email a while ago). The question is whether to change
>>> the policy.
>>
>> So I have gathered. I question whether the concerns that led to that
>> decision in the first place are somehow less important now.
>
> And we're back to gut feeling territory again.

That's unfair. I can't win based on gut, you know how skinny I am.


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 10:10 PM, Robert Kern  wrote:
> On Mon, Feb 8, 2010 at 20:50, Darren Dale  wrote:
>> On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern  wrote:
>>> On Mon, Feb 8, 2010 at 18:43, Darren Dale  wrote:
>>>> On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern  wrote:
>>>>> Here's the problem that I don't think many people appreciate: logical
>>>>> arguments suck just as much as personal experience in answering these
>>>>> questions. You can make perfectly structured arguments until you are
>>>>> blue in the face, but without real data to premise them on, they are
>>>>> no better than the gut feelings. They can often be significantly worse
>>>>> if the strength of the logic gets confused with the strength of the
>>>>> premise.
>>>>
>>>> If I recall correctly, the convention of not breaking ABI
>>>> compatibility in minor releases was established in response to the
>>>> last ABI compatibility break. Am I wrong?
>>>
>>> I'm not sure how this relates to the material quoted of me, but no,
>>> you're not wrong.
>>
>> Just trying to provide historical context to support the strength of
>> the premise.
>
> The existence of the policy is not under question (anymore; I settled
> that with old email a while ago). The question is whether to change
> the policy.

So I have gathered. I question whether the concerns that led to that
decision in the first place are somehow less important now.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 7:52 PM, Robert Kern  wrote:
> On Mon, Feb 8, 2010 at 18:43, Darren Dale  wrote:
>> On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern  wrote:
>>> Here's the problem that I don't think many people appreciate: logical
>>> arguments suck just as much as personal experience in answering these
>>> questions. You can make perfectly structured arguments until you are
>>> blue in the face, but without real data to premise them on, they are
>>> no better than the gut feelings. They can often be significantly worse
>>> if the strength of the logic gets confused with the strength of the
>>> premise.
>>
>> If I recall correctly, the convention of not breaking ABI
>> compatibility in minor releases was established in response to the
>> last ABI compatibility break. Am I wrong?
>
> I'm not sure how this relates to the material quoted of me, but no,
> you're not wrong.

Just trying to provide historical context to support the strength of
the premise.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 7:25 PM, Robert Kern  wrote:
> Here's the problem that I don't think many people appreciate: logical
> arguments suck just as much as personal experience in answering these
> questions. You can make perfectly structured arguments until you are
> blue in the face, but without real data to premise them on, they are
> no better than the gut feelings. They can often be significantly worse
> if the strength of the logic gets confused with the strength of the
> premise.

If I recall correctly, the convention of not breaking ABI
compatibility in minor releases was established in response to the
last ABI compatibility break. Am I wrong?

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 5:05 PM, Darren Dale  wrote:
> On Mon, Feb 8, 2010 at 5:05 PM, Jarrod Millman  wrote:
>> On Mon, Feb 8, 2010 at 1:57 PM, Charles R Harris
>>  wrote:
>>> Should the release containing the datetime/hasobject changes be called
>>>
>>> a) 1.5.0
>>> b) 2.0.0
>>
>> My vote goes to b.
>
> You don't matter. Nor do I.

I definitely should have counted to 100 before sending that. It wasn't
helpful and I apologize.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-08 Thread Darren Dale
On Mon, Feb 8, 2010 at 5:05 PM, Jarrod Millman  wrote:
> On Mon, Feb 8, 2010 at 1:57 PM, Charles R Harris
>  wrote:
>> Should the release containing the datetime/hasobject changes be called
>>
>> a) 1.5.0
>> b) 2.0.0
>
> My vote goes to b.

You don't matter. Nor do I.


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-07 Thread Darren Dale
I'm breaking my promise, after people wrote me offlist encouraging me
to keep pushing my point of view.

On Sun, Feb 7, 2010 at 8:23 PM, David Cournapeau  wrote:
> Jarrod Millman wrote:
>>  Just
>> to be clear, I would prefer to see the ABI-breaking release be called
>> 2.0.  I don't see why we have to get the release out in three weeks,
>> though.  I think it would be better to use this opportunity to take
>> some time to make sure we get it right.
>
> As a compromise, what about the following:
>        - remove ABI-incompatible changes for 1.4.x
>        - release a 1.5.0 marked as experimental, with everything that Travis
> wants to put in. It would be a preview for python 3k as well, so it
> conveys the idea that it is experimental pretty well.

Why can't this be called 2.0beta, with a __version__ like 1.9.96? I
don't understand the reluctance to follow numpy's own established
conventions.

>        - the 1.6.x branch would be a polished 1.5.x.

This could be called 2.0.x instead of 1.6.x.

> The advantage is that 1.5.0

... or 2.0beta ...

> can be pushed relatively early, but we would
> still keep 1.4.0 as the "stable" release, against which every other
> binary installer should be built (scipy, mpl).

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-07 Thread Darren Dale
On Sat, Feb 6, 2010 at 10:16 PM, Travis Oliphant  wrote:
> I will just work on trunk and assume that the next release will be ABI
> incompatible.   At this point I would rather call the next version 1.5
> than 2.0, though.  When the date-time work is completed, then we could
> release an ABI-compatible-with-1.5  version 2.0.

There may be repercussions if numpy starts deviating from its own
conventions for what versions may introduce ABI incompatibilities.

I attended a workshop recently where a number of scientists approached
me and expressed interest in switching from IDL to python. Two of
these were senior scientists leading large research groups and
collaborations, both of whom had looked at python several years ago
and decided they did not like "the wild west nature" (direct quote) of
the scientific python community. I assured them that both the projects
and community were maturing. At the time, I did not have to explain
the situation concerning numpy-1.4.0, which, if it causes problems
when they try to set up an environment to assess python, could put
them off python for another 3 years, maybe even for good. It would be
a lot easier to justify the disruption if one could say "numpy-2.0
added support for some important features, so this disruption was
unfortunate but necessary. Such disruptions are specified by major
version changes, which as you can see are rare. In fact, there are no
further major version changes envisioned at this time." That kind of
statement might reassure a lot of people, including package
maintainers etc.

Regards,
Darren

P.S. I promise this will be my last post on the subject.


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-06 Thread Darren Dale
On Sat, Feb 6, 2010 at 8:39 AM, David Cournapeau  wrote:
> On Sat, Feb 6, 2010 at 10:36 PM, Darren Dale  wrote:
>>
>> I don't understand why there is any debate about what to call a
>> release that breaks ABI compatibility.
>
> Because it means datetime support will come late (in 2.0), and Travis
> wanted to get it early in.

Why does something called 2.0 have to come late? Why can't whatever
near-term numpy release that breaks ABI compatibility and includes
datetime be called 2.0?

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-06 Thread Darren Dale
On Sat, Feb 6, 2010 at 8:29 AM,   wrote:
> On Sat, Feb 6, 2010 at 8:07 AM, Francesc Alted  wrote:
>> A Saturday 06 February 2010 13:17:22 David Cournapeau escrigué:
>>> On Sat, Feb 6, 2010 at 4:07 PM, Travis Oliphant 
>>> wrote:
>>> > I think this plan is the least disruptive and satisfies the concerns
>>> > of all parties in the discussion.  The other plans that have been
>>> > proposed do not address my concerns of keeping the date-time changes
>>>
>>> In that regard, your proposal is very similar to what was suggested at
>>> the beginning - the difference is only whether breaking at 1.4.x or
>>> 1.5.x.
>>
>> I'm thinking, why should we be so conservative in raising version numbers?  Why
>> not relabel 1.4.0 to 2.0 and mark 1.4.0 as a broken release?  Then, we can
>> continue by putting everything except ABI breaking features in 1.4.1.  With
>> this, NumPy 2.0 will remain available for people wanting to be more on-the-
>> bleeding-edge.  Something similar to what has happened with Python 3.0, which
>> has not prevented the 2.x series to evolve.
>>
>> How this sounds?
>
> I think breaking with 1.5 sounds good because it starts the second
> part of the 1.x series.
> 2.0 could be for the big overhaul that David has in mind, unless it
> will not be necessary anymore

I don't understand why there is any debate about what to call a
release that breaks ABI compatibility.  Robert Kern already reminded
the list of the "Report from SciPy" dated 2008-08-23:

"""
 * The releases will be numbered major.minor.bugfix
 * There will be no ABI changes in minor releases
 * There will be no API changes in bugfix releases
"""

If numpy-2.0 suddenly shows up at sourceforge, people will either
already be aware of the above convention, or if not they at least will
be more likely to wonder what precipitated the jump and be more likely
to read the release notes.

Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-05 Thread Darren Dale
On Fri, Feb 5, 2010 at 10:25 PM, Travis Oliphant  wrote:
>
> On Feb 5, 2010, at 2:32 PM, Christopher Barker wrote:
>
>> Hi folks,
>>
>> It sounds like a consensus has been reached to put out a 1.4.1 that is
>> ABI compatible with 1.3.*
>
> This is not true.   Consensus has not been reached.

How many have registered opposition to the above proposal?

> I think 1.3.9 should be released and 1.4.1 should be ABI incompatible.

And then another planned break in numpy ABI compatibility in the
foreseeable future, for the other items that have been discussed in
this thread? I am still inclined to agree with David and Chuck in this
instance.

Regards,
Darren


Re: [Numpy-discussion] Removing datetime support for 1.4.x series ?

2010-02-04 Thread Darren Dale
On Thu, Feb 4, 2010 at 3:21 AM, Francesc Alted  wrote:
> A Thursday 04 February 2010 08:46:01 Charles R Harris escrigué:
>> > Perhaps one way to articulate my perspective is the following:
>> >
>> > There are currently 2 groups of NumPy users:
>> >
>> >  1)  those who have re-compiled all of their code for 1.4.0
>> >  2)  those who haven't
>>
>> I think David has a better grip on that. There really are a lot of people
>> who depend on binaries, and those binaries in turn depend on numpy. I would
>> even say those folks are a majority, they are those who download the Mac
>>  and Windows versions of numpy.
>
> Yes, I think this is precisely the problem: people who are used to fetching
> binaries and want to use the new NumPy will be forced to upgrade all the other
> binary packages that depend on it.  And these binary packagers (including me)
> are being forced to regenerate their binaries as soon as possible if they
> don't want their users to despair.  I'm not saying that regenerating binaries
> is not possible, but that it would require a minimum of advance notice.  I'd be
> more comfortable with ABI-breaking releases being announced at least 6
> months in advance.
>
> Then, a user is not likely to change an *already* working environment
> until *all* the binary packages he depends on (scipy, matplotlib, pytables,
> h5py, numexpr, sympy...) have been updated to deal with the new-ABI
> numpy, and that could take a really long time.  Ironically, an
> attempt to quickly introduce a new feature (in this case datetime, but it
> could have been anything) in a release, to allow wider testing and
> adoption, will almost certainly result in a release that takes much longer to
> spread widely and, what is worse, generates a lot of frustration among users.

Also, there was some discussion about wanting to make some other
changes in numpy that would break ABI once, but allow new dtypes in
the future without additional ABI breakage. Since ABI breakage is so
disruptive, could we try to coordinate so a number of them can happen
all at once, with plenty of warning to the community? Then this
change, datetime, and hasobject can all be handled at the same time,
and it could/should be released as numpy-2.0. Then when numpy for
py-3.0 is ready, which will presumably require ABI breakage, it could
be called numpy-3.0.

Darren


[Numpy-discussion] numpy.test(): invalid value encountered in {isinf, divide, power, ...}

2010-01-20 Thread Darren Dale
I haven't been following development on the trunk closely, so I
apologize if this is a known issue. I didn't see anything relevant
when I searched the list.

I just updated my checkout of the trunk, cleaned out the old
installation and build/, and reinstalled. When I run the test suite
(without specifying the verbosity), I get a slew of warnings like:

Warning: invalid value encountered in isinf
Warning: invalid value encountered in isfinite

I checked on both OS X 10.6 and gentoo linux, with similar results.
The test suite reports "ok" at the end with 5 known failures and 4
skipped tests.
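
For the record, the warnings can at least be silenced locally while
debugging, since np.errstate restores the previous error state on
exit:

import numpy as np

with np.errstate(invalid='ignore'):
    # no "invalid value encountered in isinf" warning in here
    np.isinf(np.array([np.nan, 1.0, np.inf]))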

Darren


Re: [Numpy-discussion] [SciPy-dev] Announcing toydist, improving distribution and packaging situation

2009-12-30 Thread Darren Dale
On Wed, Dec 30, 2009 at 11:16 AM, David Cournapeau  wrote:
> On Wed, Dec 30, 2009 at 11:26 PM, Darren Dale  wrote:
>> Hi David,
>>
>> On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau  wrote:
>>> Executable: grin
>>>    module: grin
>>>    function: grin_main
>>>
>>> Executable: grind
>>>    module: grin
>>>    function: grind_main
>>
>> Have you thought at all about operations that are currently performed
>> by post-installation scripts? For example, it might be desirable for
>> the ipython or MayaVi windows installers to create a folder in the
>> Start menu that contains links to the executable and the
>> documentation. This is probably a secondary issue at this point in
>> toydist's development, but I think it is an important feature in the
>> long run.
>>
>> Also, have you considered support for package extras (package variants
>> in Ports, allowing you to specify features that pull in additional
>> dependencies like traits[qt4])? Enthought makes good use of them in
>> ETS, and I think they would be worth keeping.
>
> Does this example covers what you have in mind ? I am not so familiar
> with this feature of setuptools:
>
> Name: hello
> Version: 1.0
>
> Library:
>    BuildRequires: paver, sphinx, numpy
>    if os(windows)
>        BuildRequires: pywin32
>    Packages:
>        hello
>    Extension: hello._bar
>        sources:
>            src/hellomodule.c
>    if os(linux)
>        Extension: hello._linux_backend
>            sources:
>                src/linbackend.c
>
> Note that instead of os(os_name), you can use flag(flag_name), where
> flag are boolean variables which can be user defined:
>
> http://github.com/cournape/toydist/blob/master/examples/simples/conditional/toysetup.info
>
> http://github.com/cournape/toydist/blob/master/examples/var_example/toysetup.info

I should defer to the description of extras in the setuptools
documentation. It is only a few paragraphs long:

http://peak.telecommunity.com/DevCenter/setuptools#declaring-extras-optional-features-with-their-own-dependencies

Darren


Re: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-30 Thread Darren Dale
On Wed, Dec 30, 2009 at 9:26 AM, Ravi  wrote:
> On Wednesday 30 December 2009 06:15:45 René Dudfield wrote:
>
>> I agree with many things in that post.  Except your conclusion on
>> multiple versions of packages in isolation.  Package isolation is like
>> processes, and package sharing is like threads - and threads are evil!

I don't think this is an appropriate analogy, and hyperbolic
statements like "threads are evil!" are unlikely to persuade a
scientific audience.

> You have stated this several times, but is there any evidence that this is the
> desire of the majority of users? In the scientific community, interactive
> experimentation is critical and users are typically not seasoned systems
> administrators. For such users, almost all packages installed after installing
> python itself are packages they use. In particular, all I want to do is to use
> apt/yum to get the packages (or ask my sysadmin, who rightfully has no
> interest in learning the intricacies of python package installation, to do so)
> and continue with my work. "Packages-in-isolation" is for people whose job is
> to run server farms, not interactive experimenters.

I agree.

>>  Leave my python site-packages directory alone I say... especially
>> don't let setuptools infect it :)

There are already mechanisms in place for this. "python setup.py
install --user" or "easy_install --prefix=/usr/local" for example.

Darren


Re: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-30 Thread Darren Dale
Hi David,

On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau  wrote:
> Executable: grin
>    module: grin
>    function: grin_main
>
> Executable: grind
>    module: grin
>    function: grind_main

Have you thought at all about operations that are currently performed
by post-installation scripts? For example, it might be desirable for
the ipython or MayaVi windows installers to create a folder in the
Start menu that contains links the the executable and the
documentation. This is probably a secondary issue at this point in
toydist's development, but I think it is an important feature in the
long run.

Also, have you considered support for package extras (package variants
in Ports, allowing you to specify features that pull in additional
dependencies like traits[qt4])? Enthought makes good use of them in
ETS, and I think they would be worth keeping.
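
For reference, a minimal example of declaring an extra with setuptools
(the package and dependency names here are made up):

# setup.py for a hypothetical package with an optional qt4 feature
from setuptools import setup

setup(
    name='example',
    version='0.1',
    packages=['example'],
    extras_require={
        # "easy_install example[qt4]" additionally pulls in PyQt4
        'qt4': ['PyQt4'],
    },
)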

Darren


Re: [Numpy-discussion] Cython issues w/ 1.4.0

2009-12-08 Thread Darren Dale
On Tue, Dec 8, 2009 at 12:02 PM, Pauli Virtanen  wrote:
> Sun, 06 Dec 2009 14:53:58 +0100, Gael Varoquaux wrote:
>> I have a lot of code that has stopped working with my latest SVN pull to
>> numpy.
>>
>> * Some compiled code yields an error looking like (from memory):
>>
>>     "incorrect type 'numpy.ndarray'"
>
> This, by the way, also affects the 1.4.x branch. Because of the datetime
> branch merge, a new field was added to ArrayDescr -- and this breaks
> previously compiled Cython modules.

Will the datetime changes affect other previously-compiled python
modules, like PyQwt?


Re: [Numpy-discussion] Py3 merge

2009-12-06 Thread Darren Dale
On Sat, Dec 5, 2009 at 10:54 PM, David Cournapeau  wrote:
> On Sun, Dec 6, 2009 at 9:41 AM, Pauli Virtanen  wrote:
>> Hi,
>>
>> I'd like to commit my Py3 Numpy branch to SVN trunk soon:
>>
>>        http://github.com/pv/numpy-work/commits/py3k
>
> Awesome - I think we should merge this ASAP. In particular, I would
> like to start fixing platforms-specific issues.
>
> Concerning nose, will there be any version which works on both py2 and py3 ?

There is a development branch for python-3 here:

svn checkout http://python-nose.googlecode.com/svn/branches/py3k

Darren


Re: [Numpy-discussion] REMINDER: trunk is about to be frozen for 1.4.0

2009-11-18 Thread Darren Dale
On Tue, Nov 17, 2009 at 8:55 PM, David Cournapeau  wrote:
> already done in r7743 :) Did you report it as a bug on trac, so that
> I can close it as well?

Oh, thanks! No, I forgot to report it on trac, I'll try to remember
that in the future.


Re: [Numpy-discussion] REMINDER: trunk is about to be frozen for 1.4.0

2009-11-17 Thread Darren Dale
Please consider applying this patch before freezing, or you can't do
"python setup.py develop" with Distribute (at least not with
Enthought's Enable):

Index: numpy/distutils/command/build_ext.py
===================================================================
--- numpy/distutils/command/build_ext.py    (revision 7734)
+++ numpy/distutils/command/build_ext.py    (working copy)
@@ -61,6 +61,7 @@
         if self.distribution.have_run.get('build_clib'):
             log.warn('build_clib already run, it is too late to ' \
                      'ensure in-place build of build_clib')
+            build_clib = self.distribution.get_command_obj('build_clib')
         else:
             build_clib = self.distribution.get_command_obj('build_clib')
             build_clib.inplace = 1

On Mon, Nov 16, 2009 at 4:29 AM, David Cournapeau
 wrote:
> Hi,
>
>    A quick remainder: the trunk will be closed for 1.4.0 changes within
> a few hours. After that time, the trunk should only contain things which
> will be in 1.5.0, and the 1.4.0 changes will be in the 1.4.0 branch,
> which should contain only bug fixes.
>
> cheers,
>
> David


Re: [Numpy-discussion] numpy distutils and distribute

2009-11-14 Thread Darren Dale
On Sat, Nov 14, 2009 at 10:42 AM, Gökhan Sever  wrote:
> On Sat, Nov 14, 2009 at 9:29 AM, Darren Dale  wrote:
>>
>> Please excuse the cross-post. I have installed distribute-0.6.8 and
>> numpy-svn into my ~/.local/lib/python2.6/site-packages (using "python
>> setup.py install --user"). I am now trying to install Enthought's
>> Enable from a fresh svn checkout on ubuntu karmic:
>>
>> $ python setup.py develop --user
>> [...]
>> building library "agg24_src" sources
>> building library "kiva_src" sources
>> building extension "enthought.kiva.agg._agg" sources
>> building extension "enthought.kiva.agg._plat_support" sources
>> building data_files sources
>> build_src: building npy-pkg config files
>> running build_clib
>> customize UnixCCompiler
>> customize UnixCCompiler using build_clib
>> running build_ext
>> build_clib already run, it is too late to ensure in-place build of
>> build_clib
>> Traceback (most recent call last):
>>   File "setup.py", line 327, in <module>
>>     **config
>>   File "/home/darren/.local/lib/python2.6/site-packages/numpy/distutils/core.py", line 186, in setup
>>     return old_setup(**new_attr)
>>   File "/usr/lib/python2.6/distutils/core.py", line 152, in setup
>>     dist.run_commands()
>>   File "/usr/lib/python2.6/distutils/dist.py", line 975, in run_commands
>>     self.run_command(cmd)
>>   File "/usr/lib/python2.6/distutils/dist.py", line 995, in run_command
>>     cmd_obj.run()
>>   File "/home/darren/.local/lib/python2.6/site-packages/numpy/distutils/command/build_ext.py", line 74, in run
>>     self.library_dirs.append(build_clib.build_clib)
>> UnboundLocalError: local variable 'build_clib' referenced before assignment
>>
>
> Darren,
>
> I had a similar installation error. Could you try the solution that was
> given in this thread?
>
> http://www.mail-archive.com/numpy-discussion@scipy.org/msg19798.html

Thanks!

Here is the diff; could someone with knowledge of numpy's distutils
have a look and consider committing it?

Index: numpy/distutils/command/build_ext.py
===================================================================
--- numpy/distutils/command/build_ext.py    (revision 7734)
+++ numpy/distutils/command/build_ext.py    (working copy)
@@ -61,6 +61,7 @@
         if self.distribution.have_run.get('build_clib'):
             log.warn('build_clib already run, it is too late to ' \
                     'ensure in-place build of build_clib')
+            build_clib = self.distribution.get_command_obj('build_clib')
         else:
             build_clib = self.distribution.get_command_obj('build_clib')
             build_clib.inplace = 1


[Numpy-discussion] numpy distutils and distribute

2009-11-14 Thread Darren Dale
Please excuse the cross-post. I have installed distribute-0.6.8 and
numpy-svn into my ~/.local/lib/python2.6/site-packages (using "python
setup.py install --user"). I am now trying to install Enthought's
Enable from a fresh svn checkout on ubuntu karmic:

$ python setup.py develop --user
[...]
building library "agg24_src" sources
building library "kiva_src" sources
building extension "enthought.kiva.agg._agg" sources
building extension "enthought.kiva.agg._plat_support" sources
building data_files sources
build_src: building npy-pkg config files
running build_clib
customize UnixCCompiler
customize UnixCCompiler using build_clib
running build_ext
build_clib already run, it is too late to ensure in-place build of build_clib
Traceback (most recent call last):
  File "setup.py", line 327, in <module>
**config
  File 
"/home/darren/.local/lib/python2.6/site-packages/numpy/distutils/core.py",
line 186, in setup
return old_setup(**new_attr)
  File "/usr/lib/python2.6/distutils/core.py", line 152, in setup
dist.run_commands()
  File "/usr/lib/python2.6/distutils/dist.py", line 975, in run_commands
self.run_command(cmd)
  File "/usr/lib/python2.6/distutils/dist.py", line 995, in run_command
cmd_obj.run()
  File 
"/home/darren/.local/lib/python2.6/site-packages/numpy/distutils/command/build_ext.py",
line 74, in run
self.library_dirs.append(build_clib.build_clib)
UnboundLocalError: local variable 'build_clib' referenced before assignment


I am able to run "python setup.py install --user". Incidentally,
"python setup.py develop --user" worked for TraitsGui, EnthoughtBase,
TraitsBackendQt4.

I have been (sort of) following the discussion on distutils-sig. Thank
you Robert, David, Pauli, for all your effort.


Re: [Numpy-discussion] 1.4.0: Setting a firm release date for 1st December.

2009-11-02 Thread Darren Dale
On Mon, Nov 2, 2009 at 3:29 AM, David Cournapeau  wrote:
> Hi,
>
> I think it is about time to release 1.4.0. Instead of proposing a
> release date, I am setting a firm date of 1st December, and 16th
> November to freeze the trunk. If someone wants a different date, you
> have to speak now.
>
> There are a few issues I would like to clear up:
>  - Documentation for datetime, in particular for the public C API
>  - Snow Leopard issues, if any
>
> Otherwise, I think there have been quite a lot of new features. If
> people want to add new functionalities or features, please do it soon,

I wanted to get __input_prepare__ in for the 1.4 release, but I don't
think I can get it in and tested by November 16.

Darren


Re: [Numpy-discussion] numpy and C99

2009-10-23 Thread Darren Dale
On Fri, Oct 23, 2009 at 9:29 AM, Pauli Virtanen  wrote:
> Fri, 23 Oct 2009 09:21:17 -0400, Darren Dale wrote:
>> Can we use features of C99 in numpy? For example, can we use "//" style
>> comments, and C99 for statements "for (int i=0, ...) "?
>
> It would be much easier if we could, but so far we have strived for C89
> compliance. So I guess the answer is "no".

Out of curiosity (I am relatively new to C), what is holding numpy
back from embracing C99? Why adhere to a 20-year-old standard?

Darren


[Numpy-discussion] numpy and C99

2009-10-23 Thread Darren Dale
Can we use features of C99 in numpy? For example, can we use "//"
style comments, and C99 for statements "for (int i=0, ...) "?

Darren


Re: [Numpy-discussion] Another suggestion for making numpy's functions generic

2009-10-20 Thread Darren Dale
Hi Travis,

On Mon, Oct 19, 2009 at 6:29 PM, Travis Oliphant  wrote:
>
> On Oct 17, 2009, at 7:49 AM, Darren Dale wrote:
[...]
>> When calling numpy functions:
>>
>> 1) __input_prepare__ provides an opportunity to operate on the inputs
>> to yield versions that are compatible with the operation (they should
>> obviously not be modified in place)
>>
>> 2) the output array is established
>>
>> 3) __array_prepare__ is used to determine the class of the output
>> array, as well as any metadata that needs to be established before the
>> operation proceeds
>>
>> 4) the ufunc performs its operations
>>
>> 5) __array_wrap__ provides an opportunity to update the output array
>> based on the results of the computation
>>
>> Comments, criticisms? If PEP 3124^ were already a part of the standard
>> library, that could serve as the basis for generalizing numpy's
>> functions. But I think the PEP will not be approved in its current
>> form, and it is unclear when and if the author will revisit the
>> proposal. The scheme I'm imagining might be sufficient for our
>> purposes.
>
> This seems like it could work. So, basically ufuncs will take any
> object as input and call its __input_prepare__ method? This should
> return a sub-class of an ndarray?

ufuncs would call __input_prepare__ on the input declaring the highest
__array_priority__, just like ufuncs do with __array_wrap__, passing a
tuple of inputs and the ufunc itself (provided for context).
__input_prepare__ would return a tuple of inputs that the ufunc would
use for computation. I'm not sure whether those need to be arrays; I
think I can give a better answer once I start the implementation (in
the next few days, I think).
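To make that concrete, here is a rough sketch of what the hook might
look like on a Quantity class. The signature is only my reading of the
description above, and the conversion table and rescaling logic are toy
stand-ins, not the quantities package:

import numpy as np

_TO_METERS = {'meter': 1.0, 'feet': 0.3048}  # toy conversion factors

class Quantity(np.ndarray):
    __array_priority__ = 10.0

    def __new__(cls, data, units='meter'):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.units = units
        return obj

    def __input_prepare__(self, inputs, ufunc):
        # receive the tuple of inputs plus the ufunc for context, and
        # return the inputs the ufunc should actually operate on
        if ufunc is np.add:
            a, b = inputs
            if isinstance(b, Quantity) and b.units != a.units:
                factor = _TO_METERS[b.units] / _TO_METERS[a.units]
                b = Quantity(np.asarray(b) * factor, a.units)
            return a, b
        return inputs

q1 = Quantity(1.0, 'meter')
q2 = Quantity(2.0, 'feet')
a, b = q1.__input_prepare__((q1, q2), np.add)  # called by hand for now
print(np.add(a, b))  # 1.6096, both operands rescaled to meters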

Darren


Re: [Numpy-discussion] Another suggestion for making numpy's functions generic

2009-10-20 Thread Darren Dale
On Tue, Oct 20, 2009 at 5:24 AM, Sebastian Walter
 wrote:
> I'm not very familiar with the underlying C-API of numpy, so this has
> to be taken with a grain of salt.
>
> The reason why I'm curious about the genericity is that it would be
> awesome to have:
> 1) ufuncs like sin, cos, exp... to work on arrays of any object (this
> works already)
> 2) funcs like dot, eig, etc, to work on arrays of objects( works for
> dot already, but not for eig)
> 3) ufuncs and funcs to work on any objects

I think if you want to work on any object, you need something like the
PEP I mentioned earlier. What I am proposing is to use the existing
mechanism in numpy: check __array_priority__ to determine which
input's __input_prepare__ to call.
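A sketch of that selection rule, mirroring how __array_wrap__ is chosen
today (the hook name is the proposed one; the real rule would live in
numpy's C ufunc machinery, not in Python):

def select_input_prepare(inputs):
    # return the __input_prepare__ of the input with the highest
    # __array_priority__; the first input wins ties
    best_hook, best_priority = None, None
    for obj in inputs:
        hook = getattr(obj, '__input_prepare__', None)
        if hook is None:
            continue
        priority = getattr(obj, '__array_priority__', 0.0)
        if best_priority is None or priority > best_priority:
            best_hook, best_priority = hook, priority
    return best_hook

class Low(object):
    __array_priority__ = 1.0
    def __input_prepare__(self, inputs, ufunc):
        return inputs

class High(object):
    __array_priority__ = 5.0
    def __input_prepare__(self, inputs, ufunc):
        return inputs

hook = select_input_prepare((Low(), High()))
print(type(hook.__self__).__name__)  # High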

> examples that would be nice to work are among others:
> * arrays of polynomials, i.e. arrays of objects
> * polynomials with tensor coefficients, object with underlying array structure
>
> I thought that the most elegant way to implement that would be to have
> all numpy functions try  to call either
> 1)  the class function with the same name as the numpy function
> 2) or if the class function is not implemented, the member function
> with the same name as the numpy function
> 3) if none exists, raise an exception
>
> E.g.
>
> 1)
> if isinstance(x, Foo)
> then numpy.sin(x)
> would call Foo.sin(x) if it doesn't know how to handle Foo

How does numpy.sin know whether it can handle Foo? numpy.sin
will happily process the data of subclasses of ndarray, but if you
give it a quantity with units of degrees it is going to return garbage
and not care.
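A minimal illustration (Degrees is a toy stand-in for a quantity class,
not anything from the quantities package):

import numpy as np

class Degrees(np.ndarray):
    # carries the *intent* of degree units, but np.sin never sees it
    def __new__(cls, data):
        return np.asarray(data, dtype=float).view(cls)

angle = Degrees([90.0])
print(np.sin(angle))              # ~0.894: 90 was treated as radians
print(np.sin(np.deg2rad(angle)))  # 1.0: what the user presumably meant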

> 2)
> similarly, for arrays of objects of type Foo:
>  x = np.array([Foo(1), Foo(2)])
>
> Then numpy.sin(x)
> should try to return np.array([Foo.sin(xi) for xi in x])
> or in case Foo.sin is not implemented as class function,
> return : np.array([xi.sin() for xi in x])

I'm not going to comment on this, except to say that it is outside the
scope of my proposal.

> Therefore, I somehow expected something like this:
> Quantity would derive from numpy.ndarray.
> Calling Quantity.__new__(cls) would create the member functions
> __add__, __imul__, sin, exp, ...,
> where each function has a preprocessing part and a postprocessing part.
> After the preprocessing, the original ufuncs would be called on the
> base class object, e.g. __add__

It is more complicated than that. Ufuncs don't call array methods; it's
the other way around. ndarray.__add__ calls numpy.add. If you have a
custom operation to perform on numpy arrays, you write a ufunc, not a
subclass. What you are proposing is a very significant change to
numpy.
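For instance, a custom elementwise operation is packaged as a ufunc
like this (np.frompyfunc builds a slow, object-dtype ufunc from any
Python scalar function; C is the faster route, but the principle is
the same):

import numpy as np

def clip_unit(x):
    # custom scalar operation: clamp into [0, 1]
    return min(max(x, 0.0), 1.0)

clip_unit_uf = np.frompyfunc(clip_unit, 1, 1)  # 1 input, 1 output
print(clip_unit_uf(np.array([-0.5, 0.25, 1.75])))  # [0.0 0.25 1.0]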

Darren


Re: [Numpy-discussion] Another suggestion for making numpy's functions generic

2009-10-19 Thread Darren Dale
On Mon, Oct 19, 2009 at 3:10 AM, Sebastian Walter
 wrote:
> On Sat, Oct 17, 2009 at 2:49 PM, Darren Dale  wrote:
>> numpy's functions, especially ufuncs, have had some ability to support
>> subclasses through the ndarray.__array_wrap__ method, which provides
>> masked arrays or quantities (for example) with an opportunity to set
>> the class and metadata of the output array at the end of an operation.
>> An example is
>>
>> q1 = Quantity(1, 'meter')
>> q2 = Quantity(2, 'meters')
>> numpy.add(q1, q2) # yields Quantity(3, 'meters')
>>
>> At SciPy2009 we committed a change to the numpy trunk that provides a
>> chance to determine the class and some metadata of the output *before*
>> the ufunc performs its calculation, but after output array has been
>> established (and its data is still uninitialized). Consider:
>>
>> q1 = Quantity(1, 'meter')
>> q2 = Quantity(2, 'J')
>> numpy.add(q1, q2, q1)
>> # or equivalently:
>> # q1 += q2
>>
>> With only __array_wrap__, the attempt to propagate the units happens
>> after q1's data was updated in place, too late to raise an error: the
>> data is already corrupted. __array_prepare__ solves that problem; an
>> exception can be raised in time.
>>
>> Now I'd like to suggest one more improvement to numpy to make its
>> functions more generic. Consider one more example:
>>
>> q1 = Quantity(1, 'meter')
>> q2 = Quantity(2, 'feet')
>> numpy.add(q1, q2)
>>
>> In this case, I'd like an opportunity to operate on the input arrays
>> on the way in to the ufunc, to rescale the second input to meters. I
>> think it would be a hack to try to stuff this capability into
>> __array_prepare__. One form of this particular example is already
>> supported in quantities, "q1 + q2", by overriding the __add__ method
>> to rescale the second input, but there are ufuncs that do not have an
>> associated special method. So I'd like to look into adding another
>> check for a special method, perhaps called __input_prepare__. My time
>> is really tight for the next month, so I'd rather not start if there
>> are strong objections, but otherwise, I'd like to try to get it in,
>> in time for numpy-1.4. (Has a timeline been established?)
>>
>> I think it will not be too difficult to document this overall scheme:
>>
>> When calling numpy functions:
>>
>> 1) __input_prepare__ provides an opportunity to operate on the inputs
>> to yield versions that are compatible with the operation (they should
>> obviously not be modified in place)
>>
>> 2) the output array is established
>>
>> 3) __array_prepare__ is used to determine the class of the output
>> array, as well as any metadata that needs to be established before the
>> operation proceeds
>>
>> 4) the ufunc performs its operations
>>
>> 5) __array_wrap__ provides an opportunity to update the output array
>> based on the results of the computation
>>
>> Comments, criticisms? If PEP 3124^ were already a part of the standard
>> library, that could serve as the basis for generalizing numpy's
>> functions. But I think the PEP will not be approved in its current
>> form, and it is unclear when and if the author will revisit the
>> proposal. The scheme I'm imagining might be sufficient for our
>> purposes.
>
> I'm all for generic (u)funcs, since they might come in handy for me:
> I'm doing lots of operations on arrays of polynomials.
>  I don't quite get the reasoning though.
> Could you correct me where I get it wrong?
> * the class Quantity derives from numpy.ndarray
> * Quantity overrides __add__, __mul__ etc. and you get the correct behaviour 
> for
> q1 = Quantity(1, 'meter')
> q2 = Quantity(2, 'J')
> by raising an exception when performing q1+=q2

No, Quantity does not override __iadd__ to catch this. Quantity
implements __array_prepare__ to perform the dimensional analysis based
on the identity of the ufunc and the inputs, and set the class and
dimensionality of the output array, or raise an error when dimensional
analysis fails. This approach lets quantities support all ufuncs (in
principle), not just built in numerical operations. It should also
make it easier to subclass from MaskedArray, so we could have a
MaskedQuantity without having to establish yet another suite of ufuncs
specific to quantities or masked quantities.
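In outline, something like the following toy sketch, written against
the numpy 1.x era __array_prepare__ hook described above. This is a
drastic simplification of what quantities does, with a fake units
check standing in for real dimensional analysis:

import numpy as np

class Quantity(np.ndarray):
    def __new__(cls, data, units=''):
        obj = np.asarray(data, dtype=float).view(cls)
        obj.units = units
        return obj

    def __array_prepare__(self, out_arr, context=None):
        if context is not None:
            ufunc, inputs, _ = context
            if ufunc is np.add:
                units = [getattr(x, 'units', None) for x in inputs]
                if len(set(units)) > 1:
                    # raised before the inner loop runs, so an
                    # in-place output is still uncorrupted
                    raise ValueError('cannot add incompatible units: %s'
                                     % ', '.join(map(str, units)))
                out_arr = out_arr.view(Quantity)
                out_arr.units = units[0]
        return out_arr

q1 = Quantity(1.0, 'meter')
q2 = Quantity(2.0, 'J')
np.add(q1, q2, q1)  # ValueError raised, q1's data left intact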

> * The problem is that numpy.add(q1, q2, q1) would corrupt q1 before
> raising an exception

That was solved by the addition of __array_prepare__ to numpy back in
August. What I am proposing now is supporting operations on arrays
that would be compatible if we had a chance to transform them on the
way into the ufunc, like "meter + foot".

Darren


Re: [Numpy-discussion] Subclassing record array

2009-10-18 Thread Darren Dale
On Sun, Oct 18, 2009 at 12:22 AM, Charles R Harris
 wrote:
>
>
> On Sat, Oct 17, 2009 at 9:13 AM, Loïc BERTHE  wrote:
>>
>>   Hi,
>>
>> I would like to create my own class of record array to deal with units.
>>
>> Here is the code I used, inspired from
>>
>> http://docs.scipy.org/doc/numpy-1.3.x/user/basics.subclassing.html#slightly-more-realistic-example-attribute-added-to-existing-array
>> :
>>
>>
>> [code]
>> from numpy import *
>>
>> class BlocArray(rec.recarray):
>>    """ Recarray with units and pretty print """
>>
>>    fmt_dict = {'S' : '%10s', 'f' : '%10.6G', 'i': '%10d'}
>>
>>    def __new__(cls, data, titles=None, units=None):
>>
>>        # guess format for each column
>>        data2 = []
>>        for line in zip(*data) :
>>            try : data2.append(cast[int](line))         # integers
>>            except ValueError :
>>                try : data2.append(cast[float](line))   # reals
>>                except ValueError :
>>                    data2.append(cast[str](line))       # characters
>>
>>        # create the array
>>        dt = dtype(zip(titles, [line.dtype for line in data2]))
>>        obj = rec.array(data2, dtype=dt).view(cls)
>>
>>        # add custom attributes
>>        obj.units = units or []
>>        obj._fmt = " ".join(obj.fmt_dict[d[1][1]] for d in dt.descr) + '\n'
>>        obj._head = "%10s "*len(dt.names) % dt.names +'\n'
>>        obj._head += "%10s "*len(dt.names) % tuple('(%s)' % u for u in
>> units) +'\n'
>>
>>        # Finally, we must return the newly created object:
>>        return obj
>>
>> titles =  ['Name', 'Nb', 'Price']
>> units = ['/', '/', 'Eur']
>> data = [['fish', '1', '12.25'], ['egg', '6', '0.85'], ['TV', 1, '125']]
>> bloc = BlocArray(data, titles=titles, units=units)
>>
>> In [544]: bloc
>> Out[544]:
>>      Name         Nb      Price
>>       (/)        (/)      (Eur)
>>      fish          1      12.25
>>       egg          6       0.85
>>        TV          1        125
>> [/code]
>>
>> It's almost working, but I have some issues:
>>
>>   - I can't access data through indexing
>> In [563]: bloc['Price']
>> /home/loic/Python/numpy/test.py in <genexpr>((r,))
>>     50
>>     51     def __repr__(self):
>> ---> 52         return self._head + ''.join(self._fmt % tuple(r) for r in
>> self)
>>
>> TypeError: 'numpy.float64' object is not iterable
>>
>> So I think that overloading the __repr__ method is not that easy
>>
>>   - I can't access data through attributes now:
>> In [564]: bloc.Nb
>> AttributeError: 'BlocArray' object has no attribute 'Nb'
>>
>>   - I can't use 'T' as a field in these arrays, as the T method is
>> already taken as a shortcut for transpose
>>
>>
>> Do you have any hints to make this work?
>>
>>
>
> On adding units in general, you might want to contact Darren Dale who has
> been working in that direction also and has added some infrastructure in svn
> to make it easier. He also gave a short presentation at scipy2009 on that
> problem, which has been worked on before. No sense in reinventing the wheel
> here.

The units package I have been working on is called quantities. It is
available at the python package index, and the project is hosted at
launchpad as python-quantities. If quantities isn't a good fit, please
let me know why. At least the code can provide some example of how to
subclass ndarray.

Darren


Re: [Numpy-discussion] Another suggestion for making numpy's functions generic

2009-10-18 Thread Darren Dale
On Sat, Oct 17, 2009 at 6:45 PM, Charles R Harris
 wrote:
>
>
> On Sat, Oct 17, 2009 at 6:49 AM, Darren Dale  wrote:
[...]
>> I think it will not be too difficult to document this overall scheme:
>>
>> When calling numpy functions:
>>
>> 1) __input_prepare__ provides an opportunity to operate on the inputs
>> to yield versions that are compatible with the operation (they should
>> obviously not be modified in place)
>>
>> 2) the output array is established
>>
>> 3) __array_prepare__ is used to determine the class of the output
>> array, as well as any metadata that needs to be established before the
>> operation proceeds
>>
>> 4) the ufunc performs its operations
>>
>> 5) __array_wrap__ provides an opportunity to update the output array
>> based on the results of the computation
>>
>> Comments, criticisms? If PEP 3124^ were already a part of the standard
>> library, that could serve as the basis for generalizing numpy's
>> functions. But I think the PEP will not be approved in its current
>> form, and it is unclear when and if the author will revisit the
>> proposal. The scheme I'm imagining might be sufficient for our
>> purposes.
>>
>
> This sounds interesting to me, as it would push the use of array wrap down
> into a common function and make it easier to use.

Sorry, I don't understand what you mean.

> I wonder what the impact
> would be on the current subclasses of ndarray?

I don't think it will have any impact. The only change would be the
addition of __input_prepare__, which by default would simply return
the unmodified inputs.
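In other words, the proposed default would be a no-op, something like
(hypothetical, for illustration only):

class ndarray(object):
    # sketch of the proposed default hook on the base array class
    def __input_prepare__(self, inputs, ufunc):
        return inputs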

> On a side note, I wonder if you could look into adding your reduce loop
> optimizations into the generic loops? It would be interesting to see if that
> speeded up some common operations. In any case, it can't hurt.

I think you are confusing me with someone else.

Darren


[Numpy-discussion] Another suggestion for making numpy's functions generic

2009-10-17 Thread Darren Dale
numpy's functions, especially ufuncs, have had some ability to support
subclasses through the ndarray.__array_wrap__ method, which provides
masked arrays or quantities (for example) with an opportunity to set
the class and metadata of the output array at the end of an operation.
An example is

q1 = Quantity(1, 'meter')
q2 = Quantity(2, 'meters')
numpy.add(q1, q2) # yields Quantity(3, 'meters')

At SciPy2009 we committed a change to the numpy trunk that provides a
chance to determine the class and some metadata of the output *before*
the ufunc performs its calculation, but after the output array has been
established (and its data is still uninitialized). Consider:

q1 = Quantity(1, 'meter')
q2 = Quantity(2, 'J')
numpy.add(q1, q2, q1)
# or equivalently:
# q1 += q2

With only __array_wrap__, the attempt to propagate the units happens
after q1's data was updated in place, too late to raise an error: the
data is already corrupted. __array_prepare__ solves that problem; an
exception can be raised in time.

Now I'd like to suggest one more improvement to numpy to make its
functions more generic. Consider one more example:

q1 = Quantity(1, 'meter')
q2 = Quantity(2, 'feet')
numpy.add(q1, q2)

In this case, I'd like an opportunity to operate on the input arrays
on the way in to the ufunc, to rescale the second input to meters. I
think it would be a hack to try to stuff this capability into
__array_prepare__. One form of this particular example is already
supported in quantities, "q1 + q2", by overriding the __add__ method
to rescale the second input, but there are ufuncs that do not have an
associated special method. So I'd like to look into adding another
check for a special method, perhaps called __input_prepare__. My time
is really tight for the next month, so I'd rather not start if there
are strong objections, but otherwise, I'd like to try to get it in,
in time for numpy-1.4. (Has a timeline been established?)

I think it will not be too difficult to document this overall scheme:

When calling numpy functions:

1) __input_prepare__ provides an opportunity to operate on the inputs
to yield versions that are compatible with the operation (they should
obviously not be modified in place)

2) the output array is established

3) __array_prepare__ is used to determine the class of the output
array, as well as any metadata that needs to be established before the
operation proceeds

4) the ufunc performs its operations

5) __array_wrap__ provides an opportunity to update the output array
based on the results of the computation
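Taken together, a toy end-to-end sketch of that sequence, as if it were
written in Python (the helpers stand in for what actually happens in C,
and __input_prepare__ is the proposed, not-yet-existing hook):

import numpy as np

def _hook(inputs, name):
    # pick the named hook from the input with the highest
    # __array_priority__ (a stand-in for numpy's real selection rule)
    best, best_priority = None, None
    for obj in inputs:
        fn = getattr(obj, name, None)
        if fn is None:
            continue
        priority = getattr(obj, '__array_priority__', 0.0)
        if best_priority is None or priority > best_priority:
            best, best_priority = fn, priority
    return best

def generic_ufunc_call(ufunc, x, y):
    inputs = (x, y)
    # 1) inputs may be transformed on the way in (the proposed hook)
    prep = _hook(inputs, '__input_prepare__')
    if prep is not None:
        inputs = prep(inputs, ufunc)
    # 2) the output array is established
    out = np.empty(np.broadcast(*inputs).shape)
    # 3) class/metadata of the output fixed up before computation
    prepare = _hook(inputs, '__array_prepare__')
    if prepare is not None:
        out = prepare(out, (ufunc, inputs, 0))
    # 4) the ufunc performs its operation (stands in for the raw loop)
    ufunc(inputs[0], inputs[1], out)
    # 5) the finished result is post-processed
    wrap = _hook(inputs, '__array_wrap__')
    if wrap is not None:
        out = wrap(out, (ufunc, inputs, 0))
    return out

print(generic_ufunc_call(np.add, np.arange(3.0), 1.0))  # [1. 2. 3.]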

Comments, criticisms? If PEP 3124^ were already a part of the standard
library, that could serve as the basis for generalizing numpy's
functions. But I think the PEP will not be approved in its current
form, and it is unclear when and if the author will revisit the
proposal. The scheme I'm imagining might be sufficient for our
purposes.

Darren

^ http://www.python.org/dev/peps/pep-3124/


Re: [Numpy-discussion] __array_wrap__

2009-09-30 Thread Darren Dale
On Wed, Sep 30, 2009 at 2:57 AM, Pauli Virtanen  wrote:
> Tue, 29 Sep 2009 14:55:44 -0400, Neal Becker wrote:
>
>> This seems to work now, but I'm wondering if Charles is correct, that
>> inheritance isn't such a great idea here.
>>
>> The advantage of inheritance is I don't have to implement forwarding all
>> the functions, a pretty big advantage. (I wonder if there is some way to
>> do some of these as a generic 'mixin'?)
>
> The usual approach is to use __getattr__, to forward many routines with
> little extra work.

... with a side effect of making the API opaque and breaking tab
completion in ipython.
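For what it's worth, a forwarding sketch that also defines __dir__,
which is what dir() and ipython's tab completion consult (a mitigation
of that side effect, not an endorsement of the approach):

import numpy as np

class Wrapper(object):
    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __getattr__(self, name):
        # forward anything we don't define ourselves to the ndarray
        return getattr(self._arr, name)

    def __dir__(self):
        # advertise the forwarded names so introspection sees them
        return sorted(set(dir(type(self)) + list(self.__dict__) +
                          dir(self._arr)))

w = Wrapper([1, 2, 3])
print(w.sum())           # 6, forwarded to ndarray.sum
print('mean' in dir(w))  # True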

Darren


  1   2   3   >