On Wed, 2013-03-06 at 12:42 -0600, Kurt Smith wrote:
> On Wed, Mar 6, 2013 at 12:12 PM, Kurt Smith <kwmsm...@gmail.com> wrote:
> > On Wed, Mar 6, 2013 at 4:29 AM, Francesc Alted <franc...@continuum.io>
> > wrote:
> >>
> >> I would not rush this.  The example above takes 9 bytes to host the
> >> structure, while `align=True` will take 16 bytes.  I'd rather leave
> >> the default as it is, and in case performance is critical, you can
> >> always copy the unaligned field to a new (homogeneous) array.
> >
> > Yes, I can absolutely see the case you're making here, and I made my
> > "vote" with the understanding that `align=False` will almost
> > certainly stay the default.  Adding `align=True` is simple for me to
> > do, so no harm done.
> >
> > My case is based on what's the least surprising behavior: C structs /
> > all C compilers, the builtin `struct` module, and ctypes `Structure`
> > subclasses all use padding to ensure aligned fields by default.  You
> > can turn this off to get packed structures, but the default behavior
> > in these other places is alignment, which is why I was surprised when
> > I first saw that NumPy structured dtypes are packed by default.
> >
>
> Some surprises with aligned / unaligned arrays:
>
> #-----------------------------
>
> import numpy as np
>
> # The field specification must be a list of (name, format) tuples.
> packed_dt = np.dtype([('a', 'u1'), ('b', 'u8')], align=False)
> aligned_dt = np.dtype([('a', 'u1'), ('b', 'u8')], align=True)
>
> packed_arr = np.ones((10**6,), dtype=packed_dt)
> aligned_arr = np.ones((10**6,), dtype=aligned_dt)
>
> print("all(packed_arr['a'] == aligned_arr['a'])",
>       np.all(packed_arr['a'] == aligned_arr['a']))  # True
> print("all(packed_arr['b'] == aligned_arr['b'])",
>       np.all(packed_arr['b'] == aligned_arr['b']))  # True
> print("all(packed_arr == aligned_arr)",
>       np.all(packed_arr == aligned_arr))  # False (!!)
>
> #-----------------------------
>
> I can understand what's likely going on under the covers that makes
> these arrays not compare equal, but I'd expect that if all columns of
> two structured arrays are everywhere equal, then the arrays themselves
> would be everywhere equal.  Bug?
>
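For reference, a minimal sketch of the layout difference under discussion,
plus a field-by-field comparison that does not depend on how `==` treats
structured dtypes. The helper name `fields_equal` is hypothetical (it is not
a NumPy function), and the 16-byte figure assumes a platform where `u8` has
8-byte alignment:

```python
import numpy as np

fields = [('a', 'u1'), ('b', 'u8')]
packed_dt = np.dtype(fields, align=False)
aligned_dt = np.dtype(fields, align=True)

# Record size and the byte offset of field 'b' in each layout.
# Packed: 9 bytes, 'b' at offset 1. Aligned: padded so that 'b'
# starts on a multiple of its alignment (16 bytes total on typical
# 64-bit platforms).
print(packed_dt.itemsize, packed_dt.fields['b'][1])
print(aligned_dt.itemsize, aligned_dt.fields['b'][1])


def fields_equal(x, y):
    """Hypothetical helper: elementwise AND of per-field equality."""
    if x.dtype.names != y.dtype.names:
        raise TypeError("structured arrays have different field names")
    result = np.ones(x.shape, dtype=bool)
    for name in x.dtype.names:
        result &= (x[name] == y[name])
    return result


n = 1000
packed_arr = np.zeros(n, dtype=packed_dt)
aligned_arr = np.zeros(n, dtype=aligned_dt)
for arr in (packed_arr, aligned_arr):
    arr['a'] = 1
    arr['b'] = 1

print(fields_equal(packed_arr, aligned_arr).all())
```

Comparing per field sidesteps the question of whether the two dtypes are
considered equivalent, at the cost of one temporary boolean array per field.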
Yes and no... equality for structured types does not seem to be
implemented; you get the same (wrong) False with
(packed_arr == packed_arr) as well.  But if the types are equivalent and
np.equal is simply not implemented, just returning False is a bit
dangerous, I agree.  I am not sure what the solution is exactly, but I
think the == operator should probably raise an error instead of
swallowing all of them.

- Sebastian

> And regarding performance, doing simple timings shows a 30%-ish
> slowdown for unaligned operations:
>
> In [36]: %timeit packed_arr['b']**2
> 100 loops, best of 3: 2.48 ms per loop
>
> In [37]: %timeit aligned_arr['b']**2
> 1000 loops, best of 3: 1.9 ms per loop
>
> Whereas summing shows just a 10%-ish slowdown:
>
> In [38]: %timeit packed_arr['b'].sum()
> 1000 loops, best of 3: 1.29 ms per loop
>
> In [39]: %timeit aligned_arr['b'].sum()
> 1000 loops, best of 3: 1.14 ms per loop
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
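
The workaround suggested earlier in the thread (copying the unaligned field
into a new homogeneous array before doing heavy computation) could look
roughly like the sketch below. The array size and the variable names
`b_view` and `b_copy` are illustrative choices, and whether the copy pays
off depends on how many operations follow it:

```python
import numpy as np

packed_dt = np.dtype([('a', 'u1'), ('b', 'u8')], align=False)
packed_arr = np.zeros(1000, dtype=packed_dt)
packed_arr['b'] = np.arange(1000)

# Field access returns a view that steps through the 9-byte records,
# so consecutive 'b' values are 9 bytes apart and start at offset 1.
b_view = packed_arr['b']

# Copy into a fresh contiguous buffer of plain uint64 values.
b_copy = np.ascontiguousarray(b_view)

print(b_view.strides, b_view.flags.aligned)
print(b_copy.strides, b_copy.flags.aligned)

# Later arithmetic then runs on the contiguous copy.
squared = b_copy ** 2
```

The copy costs one extra pass over the data, so it mainly helps when the
field is reused across several operations.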