On Mon, Oct 19, 2015 at 2:14 AM, Nathaniel Smith <n...@pobox.com> wrote:
> On Sun, Oct 18, 2015 at 9:35 PM, <josef.p...@gmail.com> wrote: > >>>> np.column_stack((np.ones(10), np.ones(10))).flags > > C_CONTIGUOUS : True > > F_CONTIGUOUS : False > > > >>>> np.__version__ > > '1.9.2rc1' > > > > > > on my notebook which has numpy 1.6.1 it is f_contiguous > > > > > > I was just trying to optimize a loop over variable adjustment in > regression, > > and found out that we lost fortran contiguity. > > > > I always thought column_stack is for fortran usage (linalg) > > > > What's the alternative? > > column_stack was one of my favorite commands, and I always assumed we > have > > in statsmodels the right memory layout to call the linalg libraries. > > > > ("assumed" means we don't have timing nor unit tests for it.) > > In general practice no numpy functions make any guarantee about memory > layout, unless that's explicitly a documented part of their contract > (e.g. 'ascontiguous', or some functions that take an order= argument > -- I say "some" b/c there are functions like 'reshape' that take an > argument called order= that doesn't actually refer to memory layout). > This isn't so much an official policy as just a fact of life -- if > no-one has any idea that the someone is depending on some memory > layout detail then there's no way to realize that we've broken > something. (But it is a good policy IMO.) > I understand that in general. However, I always thought column_stack is a array creation function which have guaranteed memory layout. And since it's stacking by columns I thought that order is always Fortran. And the fact that it doesn't have an order keyword yet, I thought is just a missing extension. > > If this kind of problem gets caught during a pre-release cycle then we > generally do try to fix it, because we try not to break code, but if > it's been broken for 2 full releases then there's no much we can do -- > we can't go back in time to fix it so it sounds like you're stuck > working around the problem no matter what (unless you want to refuse > to support 1.9.0 through 1.10.1, which I assume you don't... worst > case, you just have to do a global search replace of np.column_stack > with statsmodels.utils.column_stack_f, right?). > > And the regression issue seems like the only real argument for > changing it back -- we'd never guarantee f-contiguity here if starting > from a blank slate, I think? > When the cat is out of the bag, the down stream developer writes compatibility code or helper functions. I will do that at at least the parts I know are intentionally designed for F memory order. --- statsmodels doesn't really check or consistently optimize the memory order, except in some cython functions. But, I thought we should be doing quite well with getting Fortran ordered arrays. I only paid attention where we have more extensive loops internally. Nathniel, Does patsy guarantee memory layout (F-contiguous) when creating design matrices? Josef > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion