Re: [Numpy-discussion] Why do mgrid and meshgrid not return broadcast arrays?

2017-03-08 Thread Warren Weckesser
On Wed, Mar 8, 2017 at 9:48 PM, Juan Nunez-Iglesias 
wrote:

> I was a bit surprised to discover that both meshgrid and mgrid return
> fully instantiated arrays, when simple broadcasting (i.e., with stride=0
> for the other axes) is functionally identical and happens much, much faster.
>
>

Take a look at ogrid:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ogrid.html
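For example, a quick sketch of what `ogrid` gives you (`meshgrid` can also
return open grids via its `sparse=True` option):

import numpy as np

# ogrid returns "open" grids: each array has size 1 along every axis
# except its own, so the pair broadcasts to the full grid without
# allocating it.
y, x = np.ogrid[:512, :512]
# y.shape == (512, 1), x.shape == (1, 512)
dist = np.hypot(y - 256, x - 256)   # broadcasting yields a (512, 512) array

# meshgrid's sparse option has the same effect:
xs, ys = np.meshgrid(np.arange(512), np.arange(512), sparse=True)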

Warren


> I wrote my own function to do this:
>
>
> def broadcast_mgrid(arrays):
>     shape = tuple(map(len, arrays))
>     ndim = len(shape)
>     result = []
>     for i, arr in enumerate(arrays, start=1):
>         # index with a tuple (not a list) of Ellipsis/newaxis entries;
>         # this is the non-deprecated spelling
>         reshaped = np.broadcast_to(arr[(...,) + (np.newaxis,) * (ndim - i)],
>                                    shape)
>         result.append(reshaped)
>     return result
>
>
> For even a modest-sized 512 x 512 grid, this version is close to 100x
> faster:
>
>
> In [154]: %timeit th.broadcast_mgrid((np.arange(512), np.arange(512)))
> 10000 loops, best of 3: 25.9 µs per loop
>
> In [156]: %timeit np.meshgrid(np.arange(512), np.arange(512))
> 100 loops, best of 3: 2.02 ms per loop
>
> In [157]: %timeit np.mgrid[:512, :512]
> 100 loops, best of 3: 4.84 ms per loop
>
>
> Is there a conscious design decision as to why this isn’t what
> meshgrid/mgrid do already? Or would a PR be welcome to do this?
>
> Thanks,
>
> Juan.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Intel random number package

2016-10-26 Thread Warren Weckesser
On Wed, Oct 26, 2016 at 3:24 PM, Nathaniel Smith  wrote:

> On Wed, Oct 26, 2016 at 9:10 AM, Julian Taylor
>  wrote:
> > On 10/26/2016 06:00 PM, Julian Taylor wrote:
> >>
> >> On 10/26/2016 10:59 AM, Ralf Gommers wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Oct 26, 2016 at 8:33 PM, Julian Taylor
> >>>  wrote:
> >>>
> >>> On 26.10.2016 06:34, Charles R Harris wrote:
> >>> > Hi All,
> >>> >
> >>> > There is a proposed random number package PR now up on github:
> >>> > https://github.com/numpy/numpy/pull/8209. It is from
> >>> > oleksandr-pavlyk and implements the numpy random number package
> >>> > using MKL for increased speed. I think we are definitely interested
> >>> > in the improved speed, but I'm not sure numpy is the best place to
> >>> > put the package. I'd welcome any comments on the PR itself, as well
> >>> > as any thoughts on the best way to organize or use this work.
> >>> > Maybe scikit-random
> >>>
> >>>
> >>> Note that this thread is a continuation of
> >>> https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075822.html
> >>>
> >>>
> >>>
> >>> I'm not a fan of putting code depending on a proprietary library
> >>> into numpy.
> >>> This should be a standalone package which may provide the same
> >>> interface
> >>> as numpy.
> >>>
> >>>
> >>> I don't really see a problem with that in principle. Numpy can use
> Intel
> >>> MKL (and Accelerate) as well if it's available. It needs some thought
> >>> put into the API though - a ``numpy.random_intel`` module is certainly
> >>> not what we want.
> >>>
> >>
> >> For me there is a difference between being able to optionally use a
> >> proprietary library as an alternative to free software libraries if the
> >> user wishes to do so and offering functionality that only works with
> >> non-free software.
> >> We are providing a form of advertisement for them by allowing it (hey if
> >> you buy this black box that you cannot modify or use freely you get this
> >> neat numpy feature!).
> >>
> >> I prefer for the full functionality of numpy to stay available with a
> >> stack of community owned software, even if it may be less powerful that
> >> way.
> >
> > But then if this is really just the same random numbers numpy already
> > provides just faster, it is probably acceptable in principle. I haven't
> > actually looked at the PR yet.
>
> The RNG stream is totally different, so yeah, it can't just be a
> silent drop-in replacement like BLAS/LAPACK.
>
> The patch also adds ~10,000 lines of code; here's an example of what
> some of it looks like:
>
> https://github.com/oleksandr-pavlyk/numpy/blob/b53880432c19356f4e54b520958272516bf391a2/numpy/random_intel/mklrand/mkl_distributions.cpp#L1724-L1833
>
> I don't see how we can realistically commit to maintaining this.
>
>

FYI:  numpy already maintains code exactly like that:
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L262-L397

Perhaps the point should be that the numpy devs won't want to maintain two
nearly identical versions of that code.

Warren




> I'm also not really seeing how shipping it as part of numpy provides
> extra benefits to maintainers or users? AFAICT right now it's
> basically structured as a standalone library that's been dropped into
> the numpy source tree, and it would be just as easy to ship separately
> (or am I wrong?). And since the public API is that all the
> functionality comes from importing this specific new module
> ('numpy.random_intel'), it'd be a one-line change for users to import
> from a non-numpy namespace, like 'mkl.random' or whatever. If it were
> more integrated with the rest of numpy then the trade-offs would be
> more complicated, but in its present form this seems like an easy
> call.
>
> The other question is whether it could/should change to *become* more
> integrated... that's more tricky. There's been some work towards
> supporting swappable backends inside np.random; but the focus has
> mostly been on allowing new core generators, though, and this code
> seems to want to take over the whole thing (core generator +
> distributions), so even once the swappable backends stuff is working
> I'm not sure it would be relevant here. The one case I can think of
> that does seem promising is that if we get an API for users to say "I
> don't care about stream compatibility, just give me un-reproducible
> variates as fast as you can", then it might make sense for that to
> silently use MKL if available -- this would be pretty analogous to the
> use of MKL in np.linalg. But we don't have that API yet, I'm not sure
> how the MKL fallback could be maintainably implemented given that it
> would require somehow swapping the entire RandomState implementation,
> 

Re: [Numpy-discussion] Integers to integer powers

2016-05-20 Thread Warren Weckesser
On Fri, May 20, 2016 at 4:22 PM, Alan Isaac  wrote:

> On 5/19/2016 11:30 PM, Nathaniel Smith wrote:
>
>> the last bad
>> option IMHO would be that we make int ** (negative int) an error in
>> all cases, and the error message can suggest that instead of writing
>>
>> np.array(2) ** -2
>>
>> they should instead write
>>
>> np.array(2) ** -2.0
>>
>> (And similarly for np.int64(2) ** -2 versus np.int64(2) ** -2.0.)
>>
>
>
>
> Fwiw, Haskell has three exponentiation operators
> because of such ambiguities.  I don't use C, but
> I think the contrasting decision there was to
> always return a double, which has a clear attraction
> since for any fixed-width integral type, most of the
> possible input pairs overflow the type.
>
> My core inclination would be to use (what I understand to be)
> the C convention that integer exponentiation always produces
> a double, but to support dtype-specific exponentiation with
> a function.



C doesn't have an exponentiation operator.  The C math library has pow,
powf and powl, which (like any C functions) are explicitly typed.
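For context, a small sketch of the distinction under discussion (numpy later
adopted the error for integer ** negative integer):

import numpy as np

# An integer result for 2 ** -2 does not exist (the value is 0.25),
# so numpy raises instead of guessing a result dtype:
try:
    np.array(2) ** -2
except ValueError as e:
    print(e)  # "Integers to negative integer powers are not allowed."

# A float exponent makes the intent explicit and yields a float:
print(np.array(2) ** -2.0)  # 0.25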

Warren


  But this is just a user's perspective.
>
> Cheers,
> Alan Isaac
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] reshaping empty array bug?

2016-02-23 Thread Warren Weckesser
On Tue, Feb 23, 2016 at 11:32 AM, Benjamin Root 
wrote:

> Not exactly sure if this should be a bug or not. This came up in a fairly
> general function of mine to process satellite data. Unexpectedly, one of
> the satellite files had no scans in it, triggering an exception when I
> tried to reshape the data from it.
>
> >>> import numpy as np
> >>> a = np.zeros((0, 5*64))
> >>> a.shape
> (0, 320)
> >>> a.shape = (0, 5, 64)
> >>> a.shape
> (0, 5, 64)
> >>> a.shape = (0, 5*64)
> >>> a.shape = (0, 5, -1)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: total size of new array must be unchanged
>
> So, if I know all of the dimensions, I can reshape just fine. But if I
> wanted to use the nifty -1 semantic, it completely falls apart. I can see
> arguments going either way for whether this is a bug or not.
>


When you try `a.shape = (0, 5, -1)`, the size of the third dimension is
ambiguous.  From the Zen of Python:  "In the face of ambiguity, refuse the
temptation to guess."
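A minimal illustration of the ambiguity:

import numpy as np

a = np.zeros((0, 320))
b = a.reshape((0, 5, 64))   # fine: fully specified, total size still 0
# With (0, 5, -1), every candidate size n satisfies 0 * 5 * n == 0, so
# the -1 cannot be resolved uniquely and numpy raises ValueError:
# a.reshape((0, 5, -1))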

Warren




> Thoughts?
>
> Ben Root
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] array of random numbers fails to construct

2015-12-07 Thread Warren Weckesser
On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane 
wrote:

>
> I've also often wanted to generate large datasets of random uint8 and
> uint16. As a workaround, this is something I have used:
>
> np.ndarray(100, 'u1', np.random.bytes(100))
>
> It has also crossed my mind that np.random.randint and np.random.rand
> could use an extra 'dtype' keyword.



+1.  Not a high priority, but it would be nice.
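For reference, a sketch of both approaches: the bytes-based workaround, and
the `dtype` keyword that `randint` did later gain (in numpy 1.11):

import numpy as np

# Workaround: reinterpret random bytes as uint8 (the result is a
# read-only view of the bytes object).
a = np.frombuffer(np.random.bytes(100), dtype=np.uint8)

# With the dtype keyword, no intermediate int64 array is created.
# Note the exclusive upper bound: 256 gives the full uint8 range.
b = np.random.randint(0, 256, size=(4, 5), dtype=np.uint8)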

Warren



> It didn't look easy to implement though.
>
> Allan
>
> On 12/06/2015 04:55 PM, DAVID SAROFF (RIT Student) wrote:
>
>> Matthew,
>>
>> That looks right. I'm concluding that the .astype(np.uint8) is applied
>> after the array is constructed, instead of during the process. This
>> random array is a test case. In the production analysis of radio
>> telescope data this is how the data comes in, and there is no  problem
>> with 10GBy files.
>> linearInputData = np.fromfile(dataFile, dtype = np.uint8, count = -1)
>> spectrumArray = linearInputData.reshape(nSpectra,sizeSpectrum)
>>
>>
>> On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett wrote:
>>
>> Hi,
>>
>> On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student)
>> wrote:
>> > This works. A big array of eight bit random numbers is constructed:
>> >
>> > import numpy as np
>> >
>> > spectrumArray = np.random.randint(0,255,
>> (2**20,2**12)).astype(np.uint8)
>> >
>> >
>> >
>> > This fails. It eats up all 64GBy of RAM:
>> >
>> > spectrumArray = np.random.randint(0,255,
>> (2**21,2**12)).astype(np.uint8)
>> >
>> >
>> > The difference is a factor of two, 2**21 rather than 2**20, for the
>> extent
>> > of the first axis.
>>
>> I think what's happening is that this:
>>
>> np.random.randint(0,255, (2**21,2**12))
>>
>> creates 2**33 random integers, which (on 64-bit) will be of dtype
>> int64 = 8 bytes, giving total size 2 ** (21 + 12 + 3) = 2 ** 36 bytes
>> = 64 GiB.
>>
>> Cheers,
>>
>> Matthew
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org 
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>>
>>
>> --
>> David P. Saroff
>> Rochester Institute of Technology
>> 54 Lomb Memorial Dr, Rochester, NY 14623
>> david.sar...@mail.rit.edu  | (434)
>> 227-6242
>>
>>
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead

2015-10-29 Thread Warren Weckesser
On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith  wrote:

> Hi all,
>
> Apparently it is not well known that if you have a Python project
> source tree (e.g., a numpy checkout), then the correct way to install
> it is NOT to type
>
>   python setup.py install   # bad and broken!
>
> but rather to type
>
>   pip install .
>
>

FWIW, I don't see any mention of this in the numpy docs, but I do see a lot
of instructions involving `setup.py build` and `setup.py install`.  See,
for example, INSTALL.txt.  Also see
http://docs.scipy.org/doc/numpy/user/install.html#building-from-source
So I guess it is not surprising that it is not well known.

Warren



> (I.e., pip install isn't just for packages on pypi -- you can also
> pass it the path to an arbitrary source directory or the URL of a
> source tarball and it will do its thing. In this case "install ."
> means "install the project in the current directory".)
>
> These don't quite have identical results -- the main difference is
> that the latter makes sure that proper metadata gets installed so that
> later on it will be possible to upgrade or uninstall correctly. If you
> call setup.py directly, and then later you try to upgrade your
> package, then it's entirely possible to end up with a mixture of old
> and new versions of the package installed in your PYTHONPATH. (One
> common effect in numpy's case is that we get people sending us
> mysterious bug reports about failing tests in files that don't even exist
> (!) -- because nose is finding tests in files from one version of
> numpy and running them against a different version of numpy.)
>
> But this isn't the only issue -- using pip also avoids a bunch of
> weird corner cases in distutils/setuptools. E.g., if setup.py uses
> plain distutils, then it turns out this will mangle numpy version
> numbers in ways that cause weird horribleness -- see [1] for a bug
> report of the form "matplotlib doesn't build anymore" which turned out
> to be because of using 'setup.py install' to install numpy. OTOH if
> setup.py uses setuptools then you get different weirdnesses, like you
> can easily end up with multiple versions of the same library installed
> simultaneously.
>
> And finally, an advantage of getting used to using 'pip install .' now
> is that you'll be prepared for the glorious future when we kill
> distutils and get rid of setup.py entirely in favor of something less
> terrible [2].
>
> So a proposal that came out of the discussion in [1] is that we modify
> numpy's setup.py now so that if you try running
>
> python setup.py install
>
> you get
>
> Error: Calling 'setup.py install' directly is NOT SUPPORTED!
> Instead, do:
>
> pip install .
>
> Alternatively, if you want to proceed at your own risk, you
> can try 'setup.py install --force-raw-setup.py'
> For more information see http://...
>
> (Other setup.py commands would continue to work as normal.)
>
> I believe that this would also break both 'easy_install numpy', and
> attempts to install numpy via the setup_requires= argument to
> setuptools.setup (because setup_requires= implicitly calls
> easy_install). install_requires= would *not* be affected, and
> setup_requires= would still be fine in cases where numpy was already
> installed.
>
> This would hopefully cut down on the amount of time everyone spends
> trying to track down these stupid weird bugs, but it will also require
> some adjustment in people's workflows, so... objections? concerns?
>
> -n
>
> [1] https://github.com/numpy/numpy/issues/6551
> [2]
> https://mail.python.org/pipermail/distutils-sig/2015-October/027360.html
>
> --
> Nathaniel J. Smith -- http://vorpus.org
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Any interest in a 'heaviside' ufunc?

2015-02-03 Thread Warren Weckesser
On Tue, Feb 3, 2015 at 11:14 PM, Sturla Molden 
wrote:

> Warren Weckesser  wrote:
>
> >                 0    if x < 0
> > heaviside(x) =  0.5  if x == 0
> >                 1    if x > 0
> >
>
> This is not correct. The discrete form of the Heaviside step function has
> the value 1 for x == 0.
>
> heaviside = lambda x : 1 - (x < 0).astype(int)
>
>
>


By "discrete form", do you mean discrete time (i.e. a function defined on
the integers)?  Then I agree, the discrete time unit step function is
defined as

u(k) = 0    if k < 0
       1    if k >= 0

for integer k.

The domain of the proposed Heaviside function is not discrete; it is
defined for arbitrary floating point (real) arguments.  In this case, the
choice heaviside(0) = 0.5 is a common convention. See for example,

* http://mathworld.wolfram.com/HeavisideStepFunction.html
* http://www.mathworks.com/help/symbolic/heaviside.html
* http://en.wikipedia.org/wiki/Heaviside_step_function, in particular
http://en.wikipedia.org/wiki/Heaviside_step_function#Zero_argument

Other common conventions are the right-continuous version that you prefer
(heaviside(0) = 1), or the left-continuous version (heaviside(0) = 0).

We can accommodate the alternatives with an additional argument that sets
the value at 0:

heaviside(x, zero_value=0.5)
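A minimal vectorized sketch of the proposal (numpy later added
`np.heaviside(x1, x2)`, where the second argument supplies the value at zero):

import numpy as np

def heaviside(x, zero_value=0.5):
    # 0 for x < 0, `zero_value` at x == 0, 1 for x > 0.
    x = np.asarray(x)
    return np.where(x > 0, 1.0, np.where(x < 0, 0.0, zero_value))

print(heaviside([-1.5, 0.0, 2.0]))                  # [ 0.   0.5  1. ]
print(heaviside([-1.5, 0.0, 2.0], zero_value=1.0))  # right-continuous version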


Warren



>
> Sturla
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Any interest in a 'heaviside' ufunc?

2015-02-03 Thread Warren Weckesser
I have an implementation of the Heaviside function as numpy ufunc.  Is
there any interest in adding this to numpy?  The function is simply:

                0    if x < 0
heaviside(x) =  0.5  if x == 0
                1    if x > 0


Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] F2PY cannot see module-scope variables

2015-01-26 Thread Warren Weckesser
On 1/26/15, Yuxiang Wang  wrote:
> Dear all,
>
> Sorry about being new to both Fortran 90 and f2py.
>
> I have a module in fortran, written as follows, with a module-scope variable
> dp:
>
> 
> ! testf2py.f90
> module testf2py
> implicit none
> private
> public dp, i1
> integer, parameter :: dp=kind(0.d0)
> contains
> real(dp) function i1(m)
> real(dp), intent(in) :: m(3, 3)
> i1 = m(1, 1) + m(2, 2) + m(3, 3)
> return
> end function i1
> end module testf2py
> 
>
> Then, if I run f2py -c testf2py.f90 -m testf2py
>
> It would report an error, stating that dp was not declared.
>
> If I copy the module-scope to the function-scope, it would work.
>
> 
> ! testf2py.f90
> module testf2py
> implicit none
> private
> public i1
> integer, parameter :: dp=kind(0.d0)
> contains
> real(dp) function i1(m)
> integer, parameter :: dp=kind(0.d0)
> real(dp), intent(in) :: m(3, 3)
> i1 = m(1, 1) + m(2, 2) + m(3, 3)
> return
> end function i1
> end module testf2py
> 
>
> However, this does not look like the best coding practice though, as
> it is pretty "wet".
>
> Any ideas?
>
> Thanks,
>
> Shawn
>


Shawn,

I posted a suggestion as an answer to your question on stackoverflow:
http://stackoverflow.com/questions/28162922/f2py-cannot-see-module-scope-variables

For the mailing-list-only folks, here's what I wrote:

Here's a work-around, in which `dp` is moved to a `types` module, and
the `use types` statement is added to the function `i1`.

! testf2py.f90

module types
implicit none
integer, parameter :: dp=kind(0.d0)
end module types

module testf2py
implicit none
private
public i1
contains
real(dp) function i1(m)
use types
real(dp), intent(in) :: m(3, 3)
i1 = m(1, 1) + m(2, 2) + m(3, 3)
return
end function i1
end module testf2py

In action:

In [6]: import numpy as np

In [7]: m = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

In [8]: import testf2py

In [9]: testf2py.testf2py.i1(m)
Out[9]: 150.0

The change is similar to the third option that I described in this
answer: 
http://stackoverflow.com/questions/12523524/f2py-specifying-real-precision-in-fortran-when-interfacing-with-python/12524403#12524403


Warren



> --
> Yuxiang "Shawn" Wang
> Gerling Research Lab
> University of Virginia
> yw...@virginia.edu
> +1 (434) 284-0836
> https://sites.google.com/a/virginia.edu/yw5aj/
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2015-01-25 Thread Warren Weckesser
On Wed, Aug 13, 2014 at 6:17 PM, Eelco Hoogendoorn <
hoogendoorn.ee...@gmail.com> wrote:

> Its pretty easy to implement this table functionality and more on top of
> the code I linked above. I still think such a comprehensive overhaul of
> arraysetops is worth discussing.
>
> import numpy as np
> import grouping
> x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
> y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
> z = np.random.randint(0,2,(9,2))
> def table(*keys):
>     """
>     desired table implementation, building on the index object
>     cleaner, and more functionality
>     performance should be the same
>     """
>     indices  = [grouping.as_index(k, axis=0) for k in keys]
>     uniques  = [i.unique  for i in indices]
>     inverses = [i.inverse for i in indices]
>     shape    = [i.groups  for i in indices]
>     t = np.zeros(shape, np.int)
>     np.add.at(t, inverses, 1)
>     return tuple(uniques), t
> #here is how to use
> print table(x,y)
> #but we can use fancy keys as well; here a composite key and a row-key
> print table((x,y), z)
> #this effectively creates a sparse matrix equivalent of your desired table
> print grouping.count((x,y))
>
>
> On Wed, Aug 13, 2014 at 11:25 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>>
>>
>>
>> On Wed, Aug 13, 2014 at 5:15 PM, Benjamin Root  wrote:
>>
>>> The ever-wonderful pylab mode in matplotlib has a table function for
>>> plotting a table of text in a plot. If I remember correctly, what would
>>> happen is that matplotlib's table() function will simply obliterate the
>>> numpy's table function. This isn't a show-stopper, I just wanted to point
>>> that out.
>>>
>>> Personally, while I wasn't a particular fan of "count_unique" because I
>>> wouldn't necessarially think of it when needing a contingency table, I do
>>> like that it is verb-ish. "table()", in this sense, is not a verb. That
>>> said, I am perfectly fine with it if you are fine with the name collision
>>> in pylab mode.
>>>
>>>
>>
>> Thanks for pointing that out.  I only changed it to have something that
>> sounded more table-ish, like the Pandas, R and Matlab functions.   I won't
>> update it right now, but if there is interest in putting it into numpy,
>> I'll rename it to avoid the pylab conflict.  Anything along the lines of
>> `crosstab`, `xtable`, etc., would be fine with me.
>>
>> Warren
>>
>>
>>
>>> On Wed, Aug 13, 2014 at 4:57 PM, Warren Weckesser <
>>> warren.weckes...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn <
>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>
>>>>> ah yes, that's also an issue I was trying to deal with. the semantics
>>>>> I prefer in these type of operators, is (as a default), to have every 
>>>>> array
>>>>> be treated as a sequence of keys, so if calling unique(arr_2d), youd get
>>>>> unique rows, unless you pass axis=None, in which case the array is
>>>>> flattened.
>>>>>
>>>>> I also agree that the extension you propose here is useful; but
>>>>> ideally, with a little more discussion on these subjects we can converge 
>>>>> on
>>>>> an even more comprehensive overhaul
>>>>>
>>>>>
>>>>> On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn <
>>>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks. Prompted by that stackoverflow question, and similar
>>>>>>> problems I had to deal with myself, I started working on a much more
>>>>>>> general extension to numpy's functionality in this space. Like you 
>>>>>>> noted,
>>>>>>> things get a little panda-y, but I think there is a lot of panda's
>>>>>>> functionality that could or should be part of the numpy core, a robust 
>>>>>>> set
>>>>>>> of grouping operations in particular.
>>>>>>>
>>>>>>> see pastebin here:
>>>>>>> http://pastebin.com/c5WLWPbp
>>>>>>>
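For reference, the `grouping` module in Eelco's snippet above is his own
package, not part of numpy. A minimal self-contained sketch of the same
contingency-table idea using only numpy (the name `crosstab` is just one of
the candidates floated above):

import numpy as np

def crosstab(*keys):
    # Each key is a 1-d sequence; the result counts co-occurrences of
    # unique values, like the table functions in R, Matlab and Pandas.
    uniques, inverses = zip(*(np.unique(k, return_inverse=True) for k in keys))
    t = np.zeros([len(u) for u in uniques], dtype=int)
    np.add.at(t, inverses, 1)   # unbuffered in-place accumulation
    return uniques, t

x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
(ux, uy), t = crosstab(x, y)
# ux -> array([1, 2]), uy -> array([3, 4, 5])
# t  -> array([[3, 1, 0],
#              [1, 1, 3]])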

Re: [Numpy-discussion] Characteristic of a Matrix.

2015-01-05 Thread Warren Weckesser
On Mon, Jan 5, 2015 at 1:58 PM, Nathaniel Smith  wrote:

> I'm afraid that I really don't understand what you're trying to say. Is
> there something that you think numpy should be doing differently?
>
>

This is a case similar to the issue discussed in
https://github.com/numpy/numpy/issues/5303.  Instead of getting an error
(because the arguments don't create the expected 2-d matrix), a matrix with
dtype object and shape (1, 3) is created.
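For reference, a minimal reproduction (behavior as reported here; newer
numpy raises an error for such ragged inputs instead):

import numpy as np

# The missing comma makes the last row [4, 2 - 6] == [4, -4]: two
# elements instead of three, so the rows are ragged.  At the time,
# numpy silently built an object-dtype matrix of shape (1, 3) whose
# entries are the three row lists, and the error only surfaced later.
A2 = np.matrix([[1, 2, -2], [-3, -1, 4], [4, 2 - 6]])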

Warren



> On Mon, Jan 5, 2015 at 6:40 PM, Colin J. Williams 
> wrote:
>
>> One of the essential characteristics of a matrix is that it be
>> rectangular.
>>
>> This is neither spelt out nor checked currently.
>>
>> The Doc description refers to a class:
>>
>> - class numpy.matrix [source]
>>
>> Returns a matrix from an array-like object, or from a string of data. A
>> matrix is a specialized 2-D array that retains its 2-D nature through
>> operations. It has certain special operators, such as * (matrix
>> multiplication) and ** (matrix power).
>>
>> This illustrates a failure, which is reported later in the calculation:
>>
>> A2 = np.matrix([[1, 2, -2], [-3, -1, 4], [4, 2 -6]])
>>
>> Here 2 - 6 is treated as an expression.
>>
>> Wikipedia offers:
>>
>> In mathematics, a matrix (plural matrices) is a rectangular array of
>> numbers, symbols, or expressions, arranged in rows and columns. The
>> individual items in a matrix are called its elements or entries. An
>> example of a matrix with 2 rows and 3 columns is
>>
>>     [  1   9  -13 ]
>>     [ 20   5   -6 ]
>>
>> In the Numpy context, the symbols or expressions need to be evaluable.
>>
>> Colin W.
>>
>>
>>
>>
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-02 Thread Warren Weckesser
On 11/2/14, Alexander Belopolsky  wrote:
> On Sun, Nov 2, 2014 at 2:32 PM, Warren Weckesser
> wrote:
>
>>
>>> Still, the case of dtype=None, name=None is problematic.   Suppose I
>>> want
>>> genfromtxt()  to detect the column names from the 1-st row and data
>>> types
>>> from the 3-rd.  How would you do that?
>>>
>>>
>>
>> This may sound like a cop out, but at some point, I stop trying to make
>> genfromtxt() handle every possible case, and instead I would write a
>> custom
>> header reader to handle this.
>>
>
> In the abstract, I would agree with you.  It is often the case that 2-3
> lines of clear Python code is better than a terse function call with half a
> dozen non-obvious options.  Specifically, I would be against the proposed
> slice_rows because it is either equivalent to  genfromtxt(islice(..), ..)
> or hard to specify.


I don't have much more to add to the API discussion at the moment, but
I want to make sure one aspect is clear. (Sorry for the noise if the
following is obvious.)

In an earlier email, I gave my interpretation of the semantics of
`slice_rows` (and `max_rows`), which is that `genfromtxt(f, ...,
slice_rows=arg)` produces the same result as `genfromtxt(f,
...)[arg]`. (The difference is that it only consumes items from the
input iterator f as required by `arg`).  This isn't the same as
`genfromtxt(islice(f, n), ...)`, because `genfromtxt` skips
comments and blank lines.  (It also skips invalid lines if the
argument `invalid_raise=False` is used.)  So if the input file was

-
 1  10
# A comment.
 2  20

 3  30
 4  40
 5  50
-

Then `genfromtxt(f, dtype=int, slice_rows=slice(4))` would produce
`array([[1, 10], [2, 20], [3, 30], [4, 40]])`, while
`genfromtxt(islice(f, 4), dtype=int)` would produce `array([[1, 10],
[2, 20]])`.

That's my interpretation of how `max_rows` or `slice_rows` should
work.  If that is not what other folks expect, than that should also
be part of the discussion.

Warren



>
> On the other hand, skip_rows is different for two reasons:
>
> 1. It is not a new option.  It is currently a deprecated alias to
> skip_header, so a change is expected - either removal or redefinition.
> 2. The intended use-case - inferring column names and type information from
> a file where data is separated from the column names is hard to code
> explicitly.  (Try it!)
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-02 Thread Warren Weckesser
On Sun, Nov 2, 2014 at 2:18 PM, Alexander Belopolsky 
wrote:

>
> On Sun, Nov 2, 2014 at 1:56 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> Or you could just call genfromtxt() once with `max_rows=1` to skip a
>> row.  (I'm assuming that the first argument to genfromtxt is the open file
>> object--or some other iterator--and not the filename.)
>
>
> That's hackish.  If I have to resort to something like this, I would just
> call next() on the open file object or iterator.
>


I agree, calling genfromtxt to skip a line is silly.  Calling next() makes
much more sense.
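A short sketch of that approach (the file name here is hypothetical):

import numpy as np

with open("data.txt", "r") as f:   # hypothetical data file
    next(f)                        # consume (skip) a single line
    data = np.genfromtxt(f, dtype=None, names=True)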



>
> Still, the case of dtype=None, name=None is problematic.   Suppose I want
> genfromtxt()  to detect the column names from the 1-st row and data types
> from the 3-rd.  How would you do that?
>
>

This may sound like a cop out, but at some point, I stop trying to make
genfromtxt() handle every possible case, and instead I would write a custom
header reader to handle this.

Warren



> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-02 Thread Warren Weckesser
On Sat, Nov 1, 2014 at 4:41 PM, Alexander Belopolsky 
wrote:

>
> On Sat, Nov 1, 2014 at 3:15 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> Is there wider interest in such an argument to `genfromtxt`?  For my
>> use-cases, `max_rows` is sufficient.  I can't recall ever needing the full
>> generality of a slice for pulling apart a text file.  Does anyone have
>> compelling use-cases that are not handled by `max_rows`?
>>
>
> It is occasionally useful to be able to skip rows after the header.  Maybe
> we should de-deprecate skip_rows and give it the meaning different from
> skip_header in case of names = None?  For example,
>
> genfromtxt(fname,  skip_header= 3, skip_rows = 1, max_rows = 100)
>
> would mean skip 3 lines, read column names from the 4-th, skip 5-th,
> process up to 100 more lines.  This may be useful if the file contains some
> meta-data about the column below the header line.  For example, it is
> common to put units of measurement below the column names.
>


Or you could just call genfromtxt() once with `max_rows=1` to skip a row.
(I'm assuming that the first argument to genfromtxt is the open file
object--or some other iterator--and not the filename.)



>
> Another application could be processing a large text file in chunks, which
> again can be covered nicely by  skip_rows/max_rows.
>


You don't really need `skip_rows` for this.  In a previous email (and in
https://github.com/numpy/numpy/pull/5103) I gave an example of using
`max_rows` for handling a file that doesn't have a header.  If the file has
a header, you could process the file in batches using something like the
following example, where the dtype determined in the first batch is used
when reading the subsequent batches:

In [12]: !cat foo.dat
  a    b     c
1.0  2.0  -9.0
3.0  4.0  -7.6
5.0  6.0  -1.0
7.0  8.0  -3.3
9.0  0.0  -3.4

In [13]: f = open("foo.dat", "r")

In [14]: batch1 = genfromtxt(f, dtype=None, names=True, max_rows=2)

In [15]: batch1
Out[15]:
array([(1.0, 2.0, -9.0), (3.0, 4.0, -7.6)],
      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])

> I cannot think of a situation where I would need more generality such as
> reading every 3rd row or rows with the given numbers.  Such processing is
> normally done after the text data is loaded into an array.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-01 Thread Warren Weckesser
On 11/1/14, Alan G Isaac  wrote:
> On 11/1/2014 3:15 PM, Warren Weckesser wrote:
>> I intended the result of `genfromtxt(..., max_rows=n)` to produce the same
>> array as produced by `genfromtxt(...)[:n]`.
>
> I find that counterintuitive.
> I would first honor skip_header.


Sorry for the terse explanation.  I meant for `...` to indicate any
other arguments, including skip_header.

Warren


> Cheers,
> Alan
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-01 Thread Warren Weckesser
On 11/1/14, Alan G Isaac  wrote:
> On 11/1/2014 4:41 PM, Alexander Belopolsky wrote:
>> I cannot think of a situation where I would need more generality such as
>> reading every 3rd row or rows with the given numbers.  Such processing is
>> normally done after the text data is loaded into an array.
>
>
> I have done this as cheaper than random selection for a quick and dirty
> look at large data sets.   Setting maxrows can be very different if the
> data has been stored in some structured manner.
>
> I suppose my view is something like this.  We are considering adding a
> keyword.
> If we can get greater functionality at about the same cost, why not?
> In that case, it is not really useful to speculate about use cases.
> If the costs are substantially greater, then that should be stated.
> Cost is a good reason not to do something.
>


`slice_rows` is a generalization of `max_rows`.  It will probably take
a bit more code to implement, and it will require more tests and more
documentation.  So the cost isn't really the same.  But if it solves
real problems for users, the cost may be worth it.

Warren


> fwiw,
> Alan Isaac
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-01 Thread Warren Weckesser
On Sat, Nov 1, 2014 at 10:54 AM, Alan G Isaac  wrote:

> On 11/1/2014 10:31 AM, Warren Weckesser wrote:
> > Alan's suggestion to use a slice is interesting, but I'd like to
> > see a more concrete proposal for the API.  For example, how does
> > it interact with `skip_header` and `skip_footer`?  How would one
> > use it to read a file in batches?
>
>
> I'm probably just not understanding the question, but the initial
> answer I will give is, "just like the proposal for `max_rows`".
>
> That is, skip_header and skip_footer are honored, and the remainder
> of the file is sliced. For the equivalent of say `max_rows=500`,
> one would say `slice_rows=slice(500)`.
>
> Perhaps you could provide an example illustrating the issues this
> reply overlooks.
>
> Cheers,
> Alan
>


OK, so `slice_rows=slice(n)` should behave the same as `max_rows=n`.
Here's my take on how `slice_rows` could be handled.

I intended the result of `genfromtxt(..., max_rows=n)` to produce the same
array as produced by `genfromtxt(...)[:n]`.  So a reasonable way to define
the behavior of `slice_rows` is that `gengromtxt(..., slice_rows=arg)`
returns the same array as `genfromtxt(...)[arg]`.   With that
specification, it is natural for `slice_rows` to accept any object that is
valid for indexing, e.g. `slice_rows=[0,2,3]` or `slice_rows=10`. (But that
wouldn't necessarily have to be implemented.)

The two differences between `genfromtxt(..., slice_rows=arg)` and
`genfromtxt(...)[arg]` are (1) the former is more efficient--it can simply
ignore the rows that won't be part of the final result; and (2) the former
doesn't consume the input iterator beyond what is requested by `arg`.  For
example, `slice_rows=slice(2, 10, 2)` would consume 10 items from the input (or
fewer, if there aren't 10 items in the input). Note that the actual indices
for that slice are [2, 4, 6, 8]; even though index 9 is not included in the
result, the corresponding item is consumed from the input iterator.
(That's how I would interpret it, anyway.)

Because the input argument to `genfromtxt` can be an arbitrary iterator,
the use of `slice_rows=slice(n)` is not compatible with the use of
`skip_footer=m`.  Handling `skip_footer=m` requires looking ahead in the
iterator to see if the end of the input is within `m` items, but in
general, looking ahead is not possible without consuming the items. (The
`max_rows` argument has the same problem.  In the current PR, a ValueError
is raised if both `skip_footer` and `max_rows` are given.)

Related to this is how to handle `slice_rows=slice(-3)`.   Either this is
not allowed (for the same reason that `slice_rows=slice(n), skip_footer=m`
is disallowed), or it results in the entire iterator being consumed (and it
is explained in the docstring that this is the effect of a negative `stop`
value in a slice).

Is there wider interest in such an argument to `genfromtxt`?  For my
use-cases, `max_rows` is sufficient.  I can't recall ever needing the full
generality of a slice for pulling apart a text file.  Does anyone have
compelling use-cases that are not handled by `max_rows`?

Warren




>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-01 Thread Warren Weckesser
On 9/24/14, Alan G Isaac  wrote:
> On 9/24/2014 2:52 PM, Jaime Fernández del Río wrote:
>> There is a PR in github that adds a new keyword to the genfromtxt
>> function, to limit the number of rows that actually get read in:
>> https://github.com/numpy/numpy/pull/5103
>
> Sorry to come late to this party, but it seems to me that
> more versatile than an `nrows` keyword for the number of rows
> would be a "rows" keyword for a slice argument.
>
> fwiw,
> Alan Isaac
>

I've continued the PR for the addition of the `nrows` (now
`max_rows`) argument to `genfromtxt` here:
https://github.com/numpy/numpy/pull/5253

Alan's suggestion to use a slice is interesting, but I'd like to
see a more concrete proposal for the API.  For example, how does
it interact with `skip_header` and `skip_footer`?  How would one
use it to read a file in batches?

The following are a couple use-cases for `max_rows` (originally
added as comments at https://github.com/numpy/numpy/pull/5103):


(1) Read a file in batches:

Suppose the file "a.csv" contains:

 0 10
 1 11
 2 12
 3 13
 4 14
 5 15
 6 16
 7 17
 8 18
 9 19

With `max_rows`, the file can be read in batches of, say, 4:

In [31]: f = open("a.csv", "r")

In [32]: genfromtxt(f, dtype=None, max_rows=4)
Out[32]:
array([[ 0, 10],
       [ 1, 11],
       [ 2, 12],
       [ 3, 13]])

In [33]: genfromtxt(f, dtype=None, max_rows=4)
Out[33]:
array([[ 4, 14],
       [ 5, 15],
       [ 6, 16],
       [ 7, 17]])

In [33]: genfromtxt(f, dtype=None, max_rows=4)
Out[33]:
array([[ 8, 18],
       [ 9, 19]])


(2) Multiple arrays in a single file:

I've seen a file format of the form

3 5
1.0 1.5 2.1 2.5 4.8
3.5 1.0 8.7 6.0 2.0
4.2 0.7 4.4 5.3 2.0
2 3
89.1 66.3 42.1
12.3 19.0 56.6

The file contains multiple arrays. Each array is
preceded by a line containing the number of rows
and columns in that array. The `max_rows` argument
would make it easy to read this file with genfromtxt:

In [7]: f = open("b.dat", "r")

In [8]: nrows, ncols = genfromtxt(f, dtype=None, max_rows=1)

In [9]: A = genfromtxt(f, max_rows=nrows)

In [10]: nrows, ncols = genfromtxt(f, dtype=None, max_rows=1)

In [11]: B = genfromtxt(f, max_rows=nrows)

In [12]: A
Out[12]:
array([[ 1. ,  1.5,  2.1,  2.5,  4.8],
       [ 3.5,  1. ,  8.7,  6. ,  2. ],
       [ 4.2,  0.7,  4.4,  5.3,  2. ]])

In [13]: B
Out[13]:
array([[ 89.1,  66.3,  42.1],
       [ 12.3,  19. ,  56.6]])


Warren


> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-16 Thread Warren Weckesser
On Thu, Oct 16, 2014 at 12:40 PM, Nathaniel Smith  wrote:

> On Thu, Oct 16, 2014 at 4:39 PM, Warren Weckesser
>  wrote:
> >
> > On Sun, Oct 12, 2014 at 9:13 PM, Nathaniel Smith  wrote:
> >>
> >> Regarding names: shuffle/permutation is a terrible naming convention
> >> IMHO and shouldn't be propagated further. We already have a good
> >> naming convention for inplace-vs-sorted: sort vs. sorted, reverse vs.
> >> reversed, etc.
> >>
> >> So, how about:
> >>
> >> scramble + scrambled: shuffle individual entries within each
> >> row/column/..., as in Warren's suggestion.
> >>
> >> shuffle + shuffled: do what shuffle, permutation do now (mnemonic:
> >> these break a 2d array into a bunch of 1d "cards", and then shuffle
> >> those cards).
> >>
> >> permuted remains indefinitely, with the docstring: "Deprecated alias
> >> for 'shuffled'."
> >
> > That sounds good to me.  (I might go with 'randomize' instead of
> 'scramble',
> > but that's a second-order decision for the API.)
>
> I hesitate to use names like "randomize" because they're less
> informative than they seem -- if asked what this operation does
> to an array, then it would be natural to say "it randomizes the
> array". But if told that the random module has a function called
> randomize, then that's not very informative -- everything in random
> randomizes something somehow.
>
>

I had some similar concerns (hence my original "disarrange"), but
"randomize" seemed more likely to be found when searching or browsing the
docs, and while it might be a bit too generic-sounding, it does feel like a
natural verb for the process.   On the other hand, "permute" and "permuted"
are even more natural and unambiguous.  Any objections to those?  (The
existing function is "permutation".)

Whatever the names, the docstrings for the four functions should be
cross-referenced in their "See Also" sections to help users find the
appropriate function.

By the way, "permutation" has a feature not yet mentioned here: if the
argument is an integer 'n', it generates a permutation of arange(n).  In
this case, it acts like matlab's "randperm" function.  Unless we replicate
that in the new function, we shouldn't deprecate "permutation".
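For reference, the existing behavior:

import numpy as np

print(np.random.permutation(5))             # a permutation of arange(5),
                                            # like matlab's randperm
print(np.random.permutation([10, 20, 30]))  # shuffled copy; input unchanged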

Warren



> -n
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-16 Thread Warren Weckesser
On Sun, Oct 12, 2014 at 9:13 PM, Nathaniel Smith  wrote:

> On Sun, Oct 12, 2014 at 5:14 PM, Sebastian  wrote:
> >
> > On 2014-10-12 16:54, Warren Weckesser wrote:
> >>
> >>
> >> On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern wrote:
> >>
> >> On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
> >> wrote:
> >>
> >> > A small wart in this API is the meaning of
> >> >
> >> >   shuffle(a, independent=False, axis=None)
> >> >
> >> > It could be argued that the correct behavior is to leave the
> >> > array unchanged. (The current behavior can be interpreted as
> >> > shuffling a 1-d sequence of monolithic blobs; the axis argument
> >> > specifies which axis of the array corresponds to the
> >> > sequence index.  Then `axis=None` means the argument is
> >> > a single monolithic blob, so there is nothing to shuffle.)
> >> > Or an error could be raised.
> >> >
> >> > What do you think?
> >>
> >> It seems to me a perfectly good reason to have two methods instead
> of
> >> one. I can't imagine when I wouldn't be using a literal True or
> False
> >> for this, so it really should be two different methods.
> >>
> >>
> >>
> >> I agree, and my first inclination was to propose a different method
> >> (and I had the bikeshedding conversation with myself about the name:
> >> "disarrange", "scramble", "disorder", "randomize", "ashuffle", some
> >> other variation of the word "shuffle", ...), but I figured the first
> >> thing folks would say is "Why not just add options to shuffle?"  So,
> >> choose your battles and all that.
> >>
> >> What do other folks think of making a separate method?
> > I'm not a fan of more methods with similar functionality in Numpy. It's
> > already hard to overlook the existing functions and all their possible
> > applications and variants. The axis=None proposal for shuffling all
> > items is very intuitive.
> >
> > I think we don't want to take the path of matlab: a huge amount of
> > powerful functions, but few people know of their powerful possibilities.
>
> I totally agree with this principle, but I think this is an exception
> to the rule, b/c unfortunately in this case the function that we *do*
> have is weird and inconsistent with how most other functions in numpy
> work. It doesn't vectorize! Cf. 'sort' or how a 'shuffle' gufunc
> (k,)->(k,) would work. Also, it's easy to implement the current
> 'shuffle' in terms of any 1d shuffle function, with no explicit loops,
> while Warren's disarrange requires an explicit loop. So, we really
> implemented the wrong one, oops. What this means going forward,
> though, is that our only options are either to implement both
> behaviours with two functions, or else to give up on have the more
> natural behaviour altogether. I think the former is the lesser of two
> evils.
>
> Regarding names: shuffle/permutation is a terrible naming convention
> IMHO and shouldn't be propagated further. We already have a good
> naming convention for inplace-vs-sorted: sort vs. sorted, reverse vs.
> reversed, etc.
>
> So, how about:
>
> scramble + scrambled: shuffle individual entries within each
> row/column/..., as in Warren's suggestion.
>
> shuffle + shuffled: do what shuffle, permutation do now (mnemonic:
> these break a 2d array into a bunch of 1d "cards", and then shuffle
> those cards).
>
> permuted remains indefinitely, with the docstring: "Deprecated alias
> for 'shuffled'."
>
>

That sounds good to me.  (I might go with 'randomize' instead of
'scramble', but that's a second-order decision for the API.)

Warren


-n
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sat, Oct 11, 2014 at 6:51 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

> I created an issue on github for an enhancement
> to numpy.random.shuffle:
> https://github.com/numpy/numpy/issues/5173
> I'd like to get some feedback on the idea.
>
> Currently, `shuffle` shuffles the first dimension of an array
> in-place.  For example, shuffling a 2D array shuffles the rows:
>
> In [227]: a
> Out[227]:
> array([[ 0,  1,  2],
>[ 3,  4,  5],
>[ 6,  7,  8],
>[ 9, 10, 11]])
>
> In [228]: np.random.shuffle(a)
>
> In [229]: a
> Out[229]:
> array([[ 0,  1,  2],
>[ 9, 10, 11],
>[ 3,  4,  5],
>[ 6,  7,  8]])
>
>
> To add an axis keyword, we could (in effect) apply `shuffle` to
> `a.swapaxes(axis, 0)`.  For a 2-D array, `axis=1` would shuffle
> the columns:
>
> In [232]: a = np.arange(15).reshape(3,5)
>
> In [233]: a
> Out[233]:
> array([[ 0,  1,  2,  3,  4],
>[ 5,  6,  7,  8,  9],
>[10, 11, 12, 13, 14]])
>
> In [234]: axis = 1
>
> In [235]: np.random.shuffle(a.swapaxes(axis, 0))
>
> In [236]: a
> Out[236]:
> array([[ 3,  2,  4,  0,  1],
>[ 8,  7,  9,  5,  6],
>[13, 12, 14, 10, 11]])
>
> So that's the first part--adding an `axis` keyword.
>
> The other part of the enhancement request is to add a shuffle
> behavior that shuffles the 1-d slices *independently*.  That is,
> for a 2-d array, shuffling with `axis=0` would apply a different
> shuffle to each column.  In the github issue, I defined a
> function called `disarrange` that implements this behavior:
>
> In [240]: a
> Out[240]:
> array([[ 0,  1,  2],
>[ 3,  4,  5],
>[ 6,  7,  8],
>[ 9, 10, 11],
>[12, 13, 14]])
>
> In [241]: disarrange(a, axis=0)
>
> In [242]: a
> Out[242]:
> array([[ 6,  1,  2],
>[ 3, 13, 14],
>[ 9, 10,  5],
>[12,  7,  8],
>[ 0,  4, 11]])
>
> Note that each column has been shuffled independently.
>
> This behavior is analogous to how `sort` handles the `axis`
> keyword.  `sort` sorts the 1-d slices along the given axis
> independently.
>
> In the github issue, I suggested the following signature
> for `shuffle` (but I'm not too fond of the name `independent`):
>
>   def shuffle(a, independent=False, axis=0)
>
> If `independent` is False, the current behavior of `shuffle`
> is used.  If `independent` is True, each 1-d slice is shuffled
> independently (in the same way that `sort` sorts each 1-d
> slice).
>
> Like most functions that take an `axis` argument, `axis=None`
> means to shuffle the flattened array.  With `independent=True`,
> it would act like `np.random.shuffle(a.flat)`, e.g.
>
> In [247]: a
> Out[247]:
> array([[ 0,  1,  2,  3,  4],
>[ 5,  6,  7,  8,  9],
>[10, 11, 12, 13, 14]])
>
> In [248]: np.random.shuffle(a.flat)
>
> In [249]: a
> Out[249]:
> array([[ 0, 14,  9,  1, 13],
>[ 2,  8,  5,  3,  4],
>[ 6, 10,  7, 12, 11]])
>
>
> A small wart in this API is the meaning of
>
>   shuffle(a, independent=False, axis=None)
>
> It could be argued that the correct behavior is to leave the
> array unchanged. (The current behavior can be interpreted as
> shuffling a 1-d sequence of monolithic blobs; the axis argument
> specifies which axis of the array corresponds to the
> sequence index.  Then `axis=None` means the argument is
> a single monolithic blob, so there is nothing to shuffle.)
> Or an error could be raised.
>
> What do you think?
>
> Warren
>
>


It is clear from the comments so far that, when `axis` is None, the result
should be a shuffle of all the elements in the array, for both methods of
shuffling (whether implemented as a new method or with a boolean argument
to `shuffle`).  Forget I ever suggested doing nothing or raising an error.
:)

Josef's comment reminded me that `numpy.random.permutation` returns a
shuffled copy of the array (when its argument is an array).  This function
should also get an `axis` argument.  `permutation` shuffles the same way
`shuffle` does--it simply makes a copy and then calls `shuffle` on the
copy.  If a new method is added for the new shuffling style, then it would
be consistent to also add a new method that uses the new shuffling style
and returns a copy of the shuffled array.   Then we would then have four
methods:

                       In-place     Copy
Current shuffle style  shuffle      permutation
New shuffle style      (name TBD)   (name TBD)

(All of them will have an `axis` argument.)
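For concreteness, each "Copy" method can be layered on its in-place
counterpart.  A minimal sketch (the name here is hypothetical, purely for
illustration):

import numpy as np

def permutation_along_axis(a, axis=0):
    # Return a shuffled copy of `a`, shuffled along `axis` in the
    # current (slice-preserving) style.
    b = np.array(a)                    # make a copy
    # swapaxes returns a view, so shuffling the view shuffles the
    # copy `b` along `axis`.
    np.random.shuffle(b.swapaxes(axis, 0))
    return b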

I suspect this will make some folks prefer the approach of adding a boolean
argument to `shuffle` and `permutation`.

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sun, Oct 12, 2014 at 11:20 AM,  wrote:

> On Sun, Oct 12, 2014 at 10:54 AM, Warren Weckesser
>  wrote:
> >
> >
> > On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern 
> wrote:
> >>
> >> On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
> >>  wrote:
> >>
> >> > A small wart in this API is the meaning of
> >> >
> >> >   shuffle(a, independent=False, axis=None)
> >> >
> >> > It could be argued that the correct behavior is to leave the
> >> > array unchanged. (The current behavior can be interpreted as
> >> > shuffling a 1-d sequence of monolithic blobs; the axis argument
> >> > specifies which axis of the array corresponds to the
> >> > sequence index.  Then `axis=None` means the argument is
> >> > a single monolithic blob, so there is nothing to shuffle.)
> >> > Or an error could be raised.
> >> >
> >> > What do you think?
> >>
> >> It seems to me a perfectly good reason to have two methods instead of
> >> one. I can't imagine when I wouldn't be using a literal True or False
> >> for this, so it really should be two different methods.
> >>
> >
> >
> > I agree, and my first inclination was to propose a different method (and
> I
> > had the bikeshedding conversation with myself about the name:
> "disarrange",
> > "scramble", "disorder", "randomize", "ashuffle", some other variation of
> the
> > word "shuffle", ...), but I figured the first thing folks would say is
> "Why
> > not just add options to shuffle?"  So, choose your battles and all that.
> >
> > What do other folks think of making a separate method?
>
> I'm not a fan of many similar functions.
>
> What's the difference between permute, shuffle and scramble?
>


The difference between `shuffle` and the new method being proposed is
explained in the first email in this thread.
`np.random.permutation` with an array argument returns a shuffled copy of
the array; it does not modify its argument. (It should also get an `axis`
argument when `shuffle` gets an `axis` argument.)


And how do I find or remember which is which?
>


You could start with `doc(np.random)` (or `np.random?` in ipython).

Warren



>
>
> >
> >
> >>
> >> That said, I would just make the axis=None behavior the same for both
> >> methods. axis=None does *not* mean "treat this like a single
> >> monolithic blob" in any of the axis=-having methods; it means "flatten
> >> the array and do the operation on the single flattened axis". I think
> >> the latter behavior is a reasonable interpretation of axis=None for
> >> both methods.
> >
> >
> >
> > Sounds good to me.
>
> +1 (since all the arguments have already been given)
>
>
> Josef
> - Why does sort treat columns independently instead of sorting rows?
> - because there is lexsort
> - Oh, lexsort, I haven't thought about it in 5 years. It's not even next
> to sort in the pop up code completion
>
>
> >
> > Warren
> >
> >
> >>
> >>
> >> --
> >> Robert Kern
> >> ___
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern  wrote:

> On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
>  wrote:
>
> > A small wart in this API is the meaning of
> >
> >   shuffle(a, independent=False, axis=None)
> >
> > It could be argued that the correct behavior is to leave the
> > array unchanged. (The current behavior can be interpreted as
> > shuffling a 1-d sequence of monolithic blobs; the axis argument
> > specifies which axis of the array corresponds to the
> > sequence index.  Then `axis=None` means the argument is
> > a single monolithic blob, so there is nothing to shuffle.)
> > Or an error could be raised.
> >
> > What do you think?
>
> It seems to me a perfectly good reason to have two methods instead of
> one. I can't imagine when I wouldn't be using a literal True or False
> for this, so it really should be two different methods.
>
>

I agree, and my first inclination was to propose a different method (and I
had the bikeshedding conversation with myself about the name: "disarrange",
"scramble", "disorder", "randomize", "ashuffle", some other variation of
the word "shuffle", ...), but I figured the first thing folks would say is
"Why not just add options to shuffle?"  So, choose your battles and all
that.

What do other folks think of making a separate method?



> That said, I would just make the axis=None behavior the same for both
> methods. axis=None does *not* mean "treat this like a single
> monolithic blob" in any of the axis=-having methods; it means "flatten
> the array and do the operation on the single flattened axis". I think
> the latter behavior is a reasonable interpretation of axis=None for
> both methods.
>


Sounds good to me.

Warren



>
> --
> Robert Kern
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-11 Thread Warren Weckesser
I created an issue on github for an enhancement
to numpy.random.shuffle:
https://github.com/numpy/numpy/issues/5173
I'd like to get some feedback on the idea.

Currently, `shuffle` shuffles the first dimension of an array
in-place.  For example, shuffling a 2D array shuffles the rows:

In [227]: a
Out[227]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [228]: np.random.shuffle(a)

In [229]: a
Out[229]:
array([[ 0,  1,  2],
       [ 9, 10, 11],
       [ 3,  4,  5],
       [ 6,  7,  8]])


To add an axis keyword, we could (in effect) apply `shuffle` to
`a.swapaxes(axis, 0)`.  For a 2-D array, `axis=1` would shuffle
the columns:

In [232]: a = np.arange(15).reshape(3,5)

In [233]: a
Out[233]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [234]: axis = 1

In [235]: np.random.shuffle(a.swapaxes(axis, 0))

In [236]: a
Out[236]:
array([[ 3,  2,  4,  0,  1],
       [ 8,  7,  9,  5,  6],
       [13, 12, 14, 10, 11]])

So that's the first part--adding an `axis` keyword.

The other part of the enhancement request is to add a shuffle
behavior that shuffles the 1-d slices *independently*.  That is,
for a 2-d array, shuffling with `axis=0` would apply a different
shuffle to each column.  In the github issue, I defined a
function called `disarrange` that implements this behavior:

In [240]: a
Out[240]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [241]: disarrange(a, axis=0)

In [242]: a
Out[242]:
array([[ 6,  1,  2],
       [ 3, 13, 14],
       [ 9, 10,  5],
       [12,  7,  8],
       [ 0,  4, 11]])

Note that each column has been shuffled independently.

This behavior is analogous to how `sort` handles the `axis`
keyword.  `sort` sorts the 1-d slices along the given axis
independently.
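
For reference, here is a minimal sketch of such a function, based only on
the description above (the `disarrange` in the github issue may be
implemented differently):

import numpy as np

def disarrange(a, axis=0):
    # Shuffle `a` in place along `axis`, shuffling each 1-d slice
    # independently (analogous to how `sort` treats `axis`).
    b = a.swapaxes(axis, -1)
    # b is a view of a, so shuffling the trailing axis of b shuffles a.
    for ndx in np.ndindex(*b.shape[:-1]):
        np.random.shuffle(b[ndx])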

In the github issue, I suggested the following signature
for `shuffle` (but I'm not too fond of the name `independent`):

  def shuffle(a, independent=False, axis=0)

If `independent` is False, the current behavior of `shuffle`
is used.  If `independent` is True, each 1-d slice is shuffled
independently (in the same way that `sort` sorts each 1-d
slice).

Like most functions that take an `axis` argument, `axis=None`
means to shuffle the flattened array.  With `independent=True`,
it would act like `np.random.shuffle(a.flat)`, e.g.

In [247]: a
Out[247]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [248]: np.random.shuffle(a.flat)

In [249]: a
Out[249]:
array([[ 0, 14,  9,  1, 13],
       [ 2,  8,  5,  3,  4],
       [ 6, 10,  7, 12, 11]])


A small wart in this API is the meaning of

  shuffle(a, independent=False, axis=None)

It could be argued that the correct behavior is to leave the
array unchanged. (The current behavior can be interpreted as
shuffling a 1-d sequence of monolithic blobs; the axis argument
specifies which axis of the array corresponds to the
sequence index.  Then `axis=None` means the argument is
a single monolithic blob, so there is nothing to shuffle.)
Or an error could be raised.

What do you think?

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Online docs for numpy are for version 1.8

2014-09-25 Thread Warren Weckesser
Pinging the webmeisters: numpy 1.9 is released, but the docs at
http://docs.scipy.org/doc/numpy/ are still for version 1.8.

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-13 Thread Warren Weckesser
On Wed, Aug 13, 2014 at 5:15 PM, Benjamin Root  wrote:

> The ever-wonderful pylab mode in matplotlib has a table function for
> plotting a table of text in a plot. If I remember correctly, what would
> happen is that matplotlib's table() function will simply obliterate
> numpy's table function. This isn't a show-stopper; I just wanted to point
> that out.
>
> Personally, while I wasn't a particular fan of "count_unique" because I
> wouldn't necessarily think of it when needing a contingency table, I do
> like that it is verb-ish. "table()", in this sense, is not a verb. That
> said, I am perfectly fine with it if you are fine with the name collision
> in pylab mode.
>
>

Thanks for pointing that out.  I only changed it to have something that
sounded more table-ish, like the Pandas, R and Matlab functions.   I won't
update it right now, but if there is interest in putting it into numpy,
I'll rename it to avoid the pylab conflict.  Anything along the lines of
`crosstab`, `xtable`, etc., would be fine with me.

Warren



> On Wed, Aug 13, 2014 at 4:57 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn <
>> hoogendoorn.ee...@gmail.com> wrote:
>>
>>> ah yes, that's also an issue I was trying to deal with. the semantics I
>>> prefer in these types of operators is (as a default) to have every array
>>> be treated as a sequence of keys, so if you call unique(arr_2d), you'd get
>>> unique rows, unless you pass axis=None, in which case the array is
>>> flattened.
>>>
>>> I also agree that the extension you propose here is useful; but ideally,
>>> with a little more discussion on these subjects we can converge on an
>>> even more comprehensive overhaul.
>>>
>>>
>>> On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington 
>>> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn <
>>>> hoogendoorn.ee...@gmail.com> wrote:
>>>>
>>>>> Thanks. Prompted by that stackoverflow question, and similar problems
>>>>> I had to deal with myself, I started working on a much more general
>>>>> extension to numpy's functionality in this space. Like you noted, things
>>>>> get a little panda-y, but I think there is a lot of panda's functionality
>>>>> that could or should be part of the numpy core, a robust set of grouping
>>>>> operations in particular.
>>>>>
>>>>> see pastebin here:
>>>>> http://pastebin.com/c5WLWPbp
>>>>>
>>>>
>>>> On a side note, this is related to a pull request of mine from awhile
>>>> back: https://github.com/numpy/numpy/pull/3584
>>>>
>>>> There was a lot of disagreement on the mailing list about what to call
>>>> a "unique slices along a given axis" function, so I wound up closing the
>>>> pull request pending more discussion.
>>>>
>>>> At any rate, I think it's a useful thing to have in "base" numpy.
>>>>
>>>> ___
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion@scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>>
>>>>
>>>
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>
>> Update: I renamed the function to `table` in the pull request:
>> https://github.com/numpy/numpy/pull/4958
>>
>>
>> Warren
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-13 Thread Warren Weckesser
On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn <
hoogendoorn.ee...@gmail.com> wrote:

> ah yes, that's also an issue I was trying to deal with. the semantics I
> prefer in these types of operators is (as a default) to have every array
> be treated as a sequence of keys, so if you call unique(arr_2d), you'd get
> unique rows, unless you pass axis=None, in which case the array is
> flattened.
>
> I also agree that the extension you propose here is useful; but ideally,
> with a little more discussion on these subjects we can converge on an
> even more comprehensive overhaul.
>
>
> On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington 
> wrote:
>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn <
>> hoogendoorn.ee...@gmail.com> wrote:
>>
>>> Thanks. Prompted by that stackoverflow question, and similar problems I
>>> had to deal with myself, I started working on a much more general extension
>>> to numpy's functionality in this space. Like you noted, things get a little
>>> panda-y, but I think there is a lot of panda's functionality that could or
>>> should be part of the numpy core, a robust set of grouping operations in
>>> particular.
>>>
>>> see pastebin here:
>>> http://pastebin.com/c5WLWPbp
>>>
>>
>> On a side note, this is related to a pull request of mine from awhile
>> back: https://github.com/numpy/numpy/pull/3584
>>
>> There was a lot of disagreement on the mailing list about what to call a
>> "unique slices along a given axis" function, so I wound up closing the pull
>> request pending more discussion.
>>
>> At any rate, I think it's a useful thing to have in "base" numpy.
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

Update: I renamed the function to `table` in the pull request:
https://github.com/numpy/numpy/pull/4958


Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-12 Thread Warren Weckesser
On Tue, Aug 12, 2014 at 11:35 AM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

> I created a pull request (https://github.com/numpy/numpy/pull/4958) that
> defines the function `count_unique`.  `count_unique` generates a
> contingency table from a collection of sequences.  For example,
>
> In [7]: x = [1, 1, 1, 1, 2, 2, 2, 2, 2]
>
> In [8]: y = [3, 4, 3, 3, 3, 4, 5, 5, 5]
>
> In [9]: (xvals, yvals), counts = count_unique(x, y)
>
> In [10]: xvals
> Out[10]: array([1, 2])
>
> In [11]: yvals
> Out[11]: array([3, 4, 5])
>
> In [12]: counts
> Out[12]:
> array([[3, 1, 0],
>[1, 1, 3]])
>
>
> It can be interpreted as a multi-argument generalization of `np.unique(x,
> return_counts=True)`.
>
> It overlaps with Pandas' `crosstab`, but I think this is a pretty
> fundamental counting operation that fits in numpy.
>
> Matlab's `crosstab` (http://www.mathworks.com/help/stats/crosstab.html)
> and R's `table` perform the same calculation (with a few more bells and
> whistles).
>
>
> For comparison, here's Pandas' `crosstab` (same `x` and `y` as above):
>
> In [28]: import pandas as pd
>
> In [29]: xs = pd.Series(x)
>
> In [30]: ys = pd.Series(y)
>
> In [31]: pd.crosstab(xs, ys)
> Out[31]:
> col_0  3  4  5
> row_0
> 1  3  1  0
> 2  1  1  3
>
>
> And here is R's `table`:
>
> > x <- c(1,1,1,1,2,2,2,2,2)
> > y <- c(3,4,3,3,3,4,5,5,5)
> > table(x, y)
>y
> x   3 4 5
>   1 3 1 0
>   2 1 1 3
>
>
> Is there any interest in adding this (or some variation of it) to numpy?
>
>
> Warren
>
>

While searching StackOverflow in the numpy tag for "count unique", I just
discovered that I basically reinvented Eelco Hoogendoorn's code in his
answer to
http://stackoverflow.com/questions/10741346/numpy-frequency-counts-for-unique-values-in-an-array.
Nice one, Eelco!

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] New function `count_unique` to generate contingency tables.

2014-08-12 Thread Warren Weckesser
I created a pull request (https://github.com/numpy/numpy/pull/4958) that
defines the function `count_unique`.  `count_unique` generates a
contingency table from a collection of sequences.  For example,

In [7]: x = [1, 1, 1, 1, 2, 2, 2, 2, 2]

In [8]: y = [3, 4, 3, 3, 3, 4, 5, 5, 5]

In [9]: (xvals, yvals), counts = count_unique(x, y)

In [10]: xvals
Out[10]: array([1, 2])

In [11]: yvals
Out[11]: array([3, 4, 5])

In [12]: counts
Out[12]:
array([[3, 1, 0],
       [1, 1, 3]])


It can be interpreted as a multi-argument generalization of `np.unique(x,
return_counts=True)`.
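
In fact, a minimal two-sequence version can be sketched with exactly that
machinery (an illustration only; the code in the pull request handles any
number of sequences and may differ in the details):

import numpy as np

def count_unique2(x, y):
    # Map each value to its index among the unique values, then count
    # how often each (x, y) index pair occurs.
    xvals, xinv = np.unique(x, return_inverse=True)
    yvals, yinv = np.unique(y, return_inverse=True)
    counts = np.zeros((len(xvals), len(yvals)), dtype=int)
    np.add.at(counts, (xinv, yinv), 1)
    return (xvals, yvals), counts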

It overlaps with Pandas' `crosstab`, but I think this is a pretty
fundamental counting operation that fits in numpy.

Matlab's `crosstab` (http://www.mathworks.com/help/stats/crosstab.html) and
R's `table` perform the same calculation (with a few more bells and
whistles).


For comparison, here's Pandas' `crosstab` (same `x` and `y` as above):

In [28]: import pandas as pd

In [29]: xs = pd.Series(x)

In [30]: ys = pd.Series(y)

In [31]: pd.crosstab(xs, ys)
Out[31]:
col_0  3  4  5
row_0
1  3  1  0
2  1  1  3


And here is R's `table`:

> x <- c(1,1,1,1,2,2,2,2,2)
> y <- c(3,4,3,3,3,4,5,5,5)
> table(x, y)
   y
x   3 4 5
  1 3 1 0
  2 1 1 3


Is there any interest in adding this (or some variation of it) to numpy?


Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Easter Egg or what I am missing here?

2014-05-21 Thread Warren Weckesser
On 5/21/14, Siegfried Gonzi  wrote:
> Please, would anyone tell me the following is an undocumented bug?
> Otherwise I will lose faith in everything:
>
> ==
> import numpy as np
>
>
> years = [2004,2005,2006,2007]
>
> dates = [20040501,20050601,20060801,20071001]
>
> for x in years:
>
>  print 'year ',x
>
>  xy =  np.array([x*1.0e-4 for x in dates]).astype(np.int)
>
>  print 'year ',x
> ==
>
> Or is this a recipe to blow up a power plant?
>


This is a "wart" of Python 2.x.  The dummy variable used in a list
comprehension remains defined with its final value in the enclosing
scope.  For example, this is Python 2.7:

>>> x = 100
>>> w = [x*x for x in range(4)]
>>> x
3


This behavior has been changed in Python 3.  Here's the same sequence
in Python 3.4:

>>> x = 100
>>> w = [x*x for x in range(4)]
>>> x
100


Guido van Rossum gives a summary of this issue near the end of this
blog: 
http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html
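
(In the meantime, the workaround under Python 2 is simply to not reuse the
outer name as the comprehension variable--a sketch based on the code above:)

import numpy as np

years = [2004, 2005, 2006, 2007]
dates = [20040501, 20050601, 20060801, 20071001]

for year in years:
    # The comprehension variable `d` no longer clobbers the loop
    # variable, so each iteration prints the expected year.
    xy = np.array([d*1.0e-4 for d in dates]).astype(np.int)
    print 'year', year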

Warren



> Thanks,
> Siegfried
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Test error with ATLAS, Windows 64 bit

2014-04-14 Thread Warren Weckesser
On Mon, Apr 14, 2014 at 2:59 PM, Matthew Brett wrote:

> Hi,
>
> With Carl Kleffner, I am trying to build a numpy 1.8.1 wheel for
> Windows 64-bit, and latest stable ATLAS.
>
> It works fine, apart from the following test failure:
>
> ==
> FAIL: test_special (test_umath.TestExpm1)
> --
> Traceback (most recent call last):
>   File "C:\Python27\lib\site-packages\numpy\core\tests\test_umath.py",
> line 329, in test_special
> assert_equal(ncu.expm1(-0.), -0.)
>   File "C:\Python27\lib\site-packages\numpy\testing\utils.py", line
> 311, in assert_equal
> raise AssertionError(msg)
> AssertionError:
> Items are not equal:
>  ACTUAL: 0.0
>  DESIRED: -0.0
>
> Has anyone seen this?  Is it in fact necessary that expm1(-0.) return
> -0 instead of 0?
>
>

What a cowinky dink.  This morning I ran into this issue in a scipy pull
request (https://github.com/scipy/scipy/pull/3547), and I asked about this
comparison failing on the mailing list a few hours ago.  In the pull
request, the modified function returns -0.0 where it used to return 0.0,
and the test for the value 0.0 failed.  My work-around was to use
`assert_array_equal` instead of `assert_equal`.  The array comparison
functions treat the values -0.0 and 0.0 as equal.  `assert_equal` has code
that checks for signed zeros, and fails if they are not the same sign.

Warren



Thanks a lot,
>
> Matthew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] assert_equal(-0.0, 0.0) fails.

2014-04-14 Thread Warren Weckesser
The test function numpy.testing.assert_equal fails when comparing -0.0 and 0.0:

In [16]: np.testing.assert_equal(-0.0, 0.0)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-16-...> in <module>()
----> 1 np.testing.assert_equal(-0.0, 0.0)

/Users/warren/anaconda/lib/python2.7/site-packages/numpy/testing/utils.pyc in assert_equal(actual, desired, err_msg, verbose)
    309     elif desired == 0 and actual == 0:
    310         if not signbit(desired) == signbit(actual):
--> 311             raise AssertionError(msg)
    312     # If TypeError or ValueError raised while using isnan and co, just handle
    313     # as before

AssertionError:
Items are not equal:
 ACTUAL: -0.0
 DESIRED: 0.0

There is code that checks for this specific case, so this is
intentional.  But this is not consistent with how negative zeros in
arrays are compared:

In [22]: np.testing.assert_equal(np.array(-0.0), np.array(0.0))  # PASS

In [23]: a = np.array([-0.0])

In [24]: b = np.array([0.0])

In [25]: np.testing.assert_array_equal(a, b)  # PASS


Is there a reason the values are considered equal in an array, but not
when compared as scalars?
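
For reference, the scalar path distinguishes the zeros with `signbit`,
while the plain IEEE 754 comparison (which the array code effectively
relies on) treats them as equal:

In [26]: np.signbit(-0.0), np.signbit(0.0)
Out[26]: (True, False)

In [27]: -0.0 == 0.0
Out[27]: True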

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@?

2014-03-15 Thread Warren Weckesser
On Sat, Mar 15, 2014 at 8:38 PM,  wrote:

> I think I wouldn't use anything like @@ often enough to remember it's
> meaning. I'd rather see english names for anything that is not **very**
> common.
>
> I find A@@-1 pretty ugly compared to inv(A)
> A@@(-0.5)  might be nice   (do we have matrix_sqrt ?)
>


scipy.linalg.sqrtm:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html
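
For example (a quick sketch; A is just an arbitrary symmetric
positive-definite matrix):

In [1]: import numpy as np

In [2]: from scipy.linalg import sqrtm

In [3]: A = np.array([[4.0, 1.0], [1.0, 3.0]])

In [4]: B = sqrtm(A)

In [5]: np.allclose(np.dot(B, B), A)
Out[5]: True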

Warren



> Josef
>
>
>
> On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer  wrote:
>
>> Speaking only for myself (and as someone who has regularly used matrix
>> powers), I would not expect matrix power as @@ to follow from matrix
>> multiplication as @. I do agree that matrix power is the only reasonable
>> use for @@ (given @), but it's still not something I would be confident
>> enough to know without looking up.
>>
>> We should keep in mind that each new operator imposes some (small)
>> cognitive burden on everyone who encounters them for the first time, and,
>> in this case, this will include a large fraction of all Python users,
>> whether they do numerical computation or not.
>>
>> Guido has given us a tremendous gift in the form of @. Let's not insist
>> on @@, when it is unclear if the burden of figuring out what @@ means it
>> would be worth using, even for heavily numeric code. I would certainly
>> prefer to encounter norm(A), inv(A), matrix_power(A, n),
>> fractional_matrix_power(A, n) and expm(A) rather than their infix
>> equivalents. It will certainly not be obvious which of these @@ will
>> support for objects from any given library.
>>
>> One useful data point might be to consider whether matrix power is
>> available as an infix operator in other languages commonly used for
>> numerical work. AFAICT from some quick searches:
>> MATLAB: Yes
>> R: No
>> IDL: No
>>
>> All of these languages do, of course, implement infix matrix
>> multiplication, but it is apparently not clear at all whether the matrix
>> power is useful.
>>
>> Best,
>> Stephan
>>
>>
>>
>>
>> On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau  wrote:
>>
>>> 2014-03-15 11:18 GMT-04:00 Charles R Harris :
>>>
>>>


 On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote:

> Hi all,
>
> Here's the second thread for discussion about Guido's concerns about
> PEP 465. The issue here is that PEP 465 as currently written proposes
> two new operators, @ for matrix multiplication and @@ for matrix power
> (analogous to * and **):
>   http://legacy.python.org/dev/peps/pep-0465/
>
> The main thing we care about of course is @; I pushed for including @@
> because I thought it was nicer to have than not, and I thought the
> analogy between * and ** might make the overall package more appealing
> to Guido's aesthetic sense.
>
> It turns out I was wrong :-). Guido is -0 on @@, but willing to be
> swayed if we think it's worth the trouble to make a solid case.
>
> Note that question now is *not*, how will @@ affect the reception of
> @. @ itself is AFAICT a done deal, regardless of what happens with @@.
> For this discussion let's assume @ can be taken for granted, and that
> we can freely choose to either add @@ or not add @@ to the language.
> The question is: which do we think makes Python a better language (for
> us and in general)?
>
> Some thoughts to start us off:
>
> Here are the interesting use cases for @@ that I can think of:
> - 'vector @@ 2' gives the squared Euclidean length (because it's the
> same as vector @ vector). Kind of handy.
> - 'matrix @@ n' of course gives the matrix power, which is of marginal
> use but does come in handy sometimes, e.g., when looking at graph
> connectivity.
> - 'matrix @@ -1' provides a very transparent notation for translating
> textbook formulas (with all their inverses) into code. It's a bit
> unhelpful in practice, because (a) usually you should use solve(), and
> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But
> sometimes transparent notation may be important. (And in some cases,
> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be
> compiled into a call to solve() anyway.)
>
> (Did I miss any?)
>
> In practice it seems to me that the last use case is the one that's
> might matter a lot practice, but then again, it might not -- I'm not
> sure. For example, does anyone who teaches programming with numpy have
> a feeling about whether the existence of '@@ -1' would make a big
> difference to you and your students? (Alan? I know you were worried
> about losing the .I attribute on matrices if switching to ndarrays for
> teaching -- given that ndarray will probably not get a .I attribute,
> how much would the existence of @@ -1 affect you?)
>
> On a more technical level, Guido is worried about how @@'s precedence
> should work (and this is som

Re: [Numpy-discussion] New (old) function proposal.

2014-02-19 Thread Warren Weckesser
On Tue, Feb 18, 2014 at 11:27 AM, Sebastian Berg  wrote:

> On Tue, 2014-02-18 at 09:05 -0700, Charles R Harris wrote:
> > Hi All,
> >
> >
> > There is an old ticket, #1499, that suggests adding a segment_axis
> > function.
> >
> > def segment_axis(a, length, overlap=0, axis=None, end='cut', endvalue=0):
> >     """Generate a new array that chops the given array along the given
> >     axis into overlapping frames.
> >
> >     Parameters
> >     ----------
> >     a : array-like
> >         The array to segment
> >     length : int
> >         The length of each frame
> >     overlap : int, optional
> >         The number of array elements by which the frames should overlap
> >     axis : int, optional
> >         The axis to operate on; if None, act on the flattened array
> >     end : {'cut', 'wrap', 'pad'}, optional
> >         What to do with the last frame, if the array is not evenly
> >         divisible into pieces.
> >
> >         - 'cut'   Simply discard the extra values
> >         - 'wrap'  Copy values from the beginning of the array
> >         - 'pad'   Pad with a constant value
> >
> >     endvalue : object
> >         The value to use for end='pad'
> >
> >
> >     Examples
> >     --------
> >     >>> segment_axis(arange(10), 4, 2)
> >     array([[0, 1, 2, 3],
> >            [2, 3, 4, 5],
> >            [4, 5, 6, 7],
> >            [6, 7, 8, 9]])
> >
> >
> > Is there any interest in having this function available?
> >
>
> Just to note, there have been similar proposals with a rolling_window
> function. It could be made ND aware, too (though maybe this one is
> also).
>


For example:  https://github.com/numpy/numpy/pull/31
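
For anyone who wants to experiment in the meantime, here is a minimal
sketch of the 1-d, end='cut' case using stride tricks (my own
illustration--not the code from the ticket or that PR):

import numpy as np
from numpy.lib.stride_tricks import as_strided

def segment_axis_1d(a, length, overlap=0):
    # Overlapping frames of a 1-d array as a strided view (no copy).
    # Extra trailing values are simply discarded (i.e. end='cut').
    a = np.ascontiguousarray(a)
    step = length - overlap
    nseg = 1 + (len(a) - length) // step
    stride = a.strides[0]
    return as_strided(a, shape=(nseg, length),
                      strides=(step*stride, stride))

With a = np.arange(10), length=4 and overlap=2, this reproduces the
example in the docstring above.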

Warren



> >
> > Chuck
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in comparing object arrays to None (?)

2014-01-27 Thread Warren Weckesser
On Mon, Jan 27, 2014 at 3:43 PM, Charles G. Waldman wrote:

> Hi Numpy folks.
>
> I just noticed that comparing an array of type 'object' to None does
> not behave as I expected.  Is this a feature or a bug?  (I can take a
> stab at fixing it if it's a bug, as I believe it is).
>
> >>> np.version.full_version
> '1.8.0'
>
> >>> a = np.array(['Frank', None, 'Nancy'])
>
> >>> a
> array(['Frank', None, 'Nancy'], dtype=object)
>
> >>> a == 'Frank'
> array([ True, False, False], dtype=bool)
> # Return value is an array
>
> >>> a == None
> False
> # Return value is scalar (BUG?)
>


Looks like a fix is in progress:  https://github.com/numpy/numpy/pull/3514

Warren

___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] git tag for version 1.8?

2013-12-20 Thread Warren Weckesser
On 12/20/13, Charles R Harris  wrote:
> On Thu, Dec 19, 2013 at 10:16 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> Is version 1.8.0 tagged in git?  I see tags up to 1.7.1.  I suspect
>> the tagging convention has changed in the git repo.  How do I checkout
>> v1.8.0?
>>
>>
> It's tagged. You can see it on github under branches/tags if you hit the
> tags tab. To get tags from upstream do `git fetch upstream --tags`
>


Great, thanks.

Warren

> Chuck
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] git tag for version 1.8?

2013-12-19 Thread Warren Weckesser
Is version 1.8.0 tagged in git?  I see tags up to 1.7.1.  I suspect
the tagging convention has changed in the git repo.  How do I checkout
v1.8.0?

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] (no subject)

2013-11-22 Thread Warren Weckesser
On Fri, Nov 22, 2013 at 4:23 PM, Matthew Brett wrote:

> Hi,
>
> I'm sorry if I missed something obvious - but is there a vectorized
> way to look for None in an array?
>
> In [3]: a = np.array([1, 1])
>
> In [4]: a == object()
> Out[4]: array([False, False], dtype=bool)
>
> In [6]: a == None
> Out[6]: False
>
> (same for object arrays),
>



Looks like using a "scalar array" that holds the value None will work:

In [8]: a
Out[8]: array([[1, 2], 'foo', None], dtype=object)

In [9]: a == np.array(None)
Out[9]: array([False, False,  True], dtype=bool)

Warren



> Thanks a lot,
>
> Matthew
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.savetxt to string?

2013-11-06 Thread Warren Weckesser
Which version of numpy are you using?  I just tried it with 1.7.1, and it
accepted a StringIO instance.  The docstring says the first argument may be
a filename or file handle (
http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html#numpy.savetxt
).
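
For example, this works for me (Python 2 style, to match the numpy 1.7
era):

import numpy as np
from StringIO import StringIO

s = StringIO()
np.savetxt(s, np.arange(6).reshape(2, 3), fmt='%d')
print s.getvalue()
# 0 1 2
# 3 4 5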

Warren




On Wed, Nov 6, 2013 at 1:17 PM, Neal Becker  wrote:

> According to the docs, savetxt only allows a file name.  I'm surprised it
> doesn't allow a file-like object.  How can I format text into a string?
> I would like savetxt to accept StringIO for this.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Valid algorithm for generating a 3D Wiener Process?

2013-09-25 Thread Warren Weckesser
On Wed, Sep 25, 2013 at 1:41 PM, David Goldsmith wrote:

> Thanks, guys.  Yeah, I realized the problem w/ the
> uniform-increment-variable-direction approach this morning: physically, it
> ignores the fact that the particles hitting the particle being tracked are
> going to have a distribution of momentum, not all the same, just varying in
> direction.  But I don't quite understand Warren's observation: "the
> 'angles' that describe the position undergo a random walk [actually, it
> would seem that they don't, since they too fail the varying-as-white-noise
> test], so the particle tends to move in the same direction over short
> intervals"--is this just another way of saying that, since I was varying
> the angles by -1, 0, or 1 unit each time, the simulation is susceptible to
> "unnaturally" long strings of -1, 0, or 1 increments?  Thanks again,
>


Note: I was interpreting your code as the discretization of a stochastic
process, and I was experimenting with values of `incr` that were small,
e.g. `incr = 0.01`.

This code

t = 2*np.pi*incr*(R.randint(3, size=(N,))-1)
t[0] = 0
t = t.cumsum()

makes `t` a (discrete) random walk.  At each time step, t either remains
the same, or changes by +/- 2*np.pi*incr.  If `incr` is small, then `t[1]`
is a small step from `t[0]`.  Similarly, `p[1]` will be close to `p[0]`.
So the particle "remembers" its direction.  A particle undergoing Brownian
motion does not have this memory.


Warren




> DG
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Valid algorithm for generating a 3D Wiener Process?

2013-09-25 Thread Warren Weckesser
On Wed, Sep 25, 2013 at 12:51 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

>
> On Wed, Sep 25, 2013 at 9:36 AM, Neal Becker  wrote:
>
>> David Goldsmith wrote:
>>
>> > Is this a valid algorithm for generating a 3D Wiener process?  (When I
>> > graph the results, they certainly look like potential Brownian motion
>> > tracks.)
>> >
>> > def Wiener3D(incr, N):
>> >     r = incr*(R.randint(3, size=(N,))-1)
>> >     r[0] = 0
>> >     r = r.cumsum()
>> >     t = 2*np.pi*incr*(R.randint(3, size=(N,))-1)
>> >     t[0] = 0
>> >     t = t.cumsum()
>> >     p = np.pi*incr*(R.randint(3, size=(N,))-1)
>> >     p[0] = 0
>> >     p = p.cumsum()
>> >     x = r*np.cos(t)*np.sin(p)
>> >     y = r*np.sin(t)*np.sin(p)
>> >     z = r*np.cos(p)
>> >     return np.array((x,y,z)).T
>> >
>> > Thanks!
>> >
>> > DG
>>
>> Not the kind of Wiener process I learned of.  This would be the integral
>> of
>> white noise.  Here you have used:
>>
>> 1. discrete increments
>> 2. spherical coordinates
>>
>>
>
> I agree with Neal: that is not a Wiener process.  In your process, the
> *angles* that describe the position undergo a random walk, so the particle
> tends to move in the same direction over short intervals.
>
> To simulate a Wiener process (i.e. Brownian motion) in 3D, you can simply
> evolve each coordinate independently as a 1D process.
>
> Here's a simple function to generate a sample from a Wiener process.  The
> dimension is determined by the shape of the starting point x0.
>
>
> import numpy as np
>
>
> def wiener(x0, n, dt, delta):
> """Generate an n-dimensional random walk.
>
>

Whoops--that's a misleading docstring.  The `n` in "an n-dimensional
random walk" is not the same `n` that is the second argument of the
function (which is the number of steps to compute).

Warren



>     The array of values generated by this function simulates a Wiener
>     process.
>
>     Arguments
>     ---------
>     x0 : float or array
>         The starting point of the random walk.
>     n : int
>         The number of steps to take.
>     dt : float
>         The time step.
>     delta : float
>         delta determines the "speed" of the random walk.  The random
>         variable of the position at time t, X(t), has a normal
>         distribution whose mean is the position at time t=0 and whose
>         variance is delta**2*t.
>
>     Returns
>     -------
>     x : numpy array
>         The shape of `x` is (n+1,) + x0.shape.
>         The first element in the array is x0.
>     """
>     x0 = np.asfarray(x0)
>     shp = (n+1,) + x0.shape
>
>     # Generate sample numbers from a normal distribution.
>     r = np.random.normal(size=shp, scale=delta*np.sqrt(dt))
>
>     # Replace the first element with 0.0, so that x0 + r.cumsum() results
>     # in the first element being x0.
>     r[0] = 0.0
>
>     # This computes the random walk by forming the cumulative sum of
>     # the random sample.
>     x = r.cumsum(axis=0)
>     x += x0
>
>     return x
>
>
>
>
> Warren
>
>
> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Valid algorithm for generating a 3D Wiener Process?

2013-09-25 Thread Warren Weckesser
On Wed, Sep 25, 2013 at 9:36 AM, Neal Becker  wrote:

> David Goldsmith wrote:
>
> > Is this a valid algorithm for generating a 3D Wiener process?  (When I
> > graph the results, they certainly look like potential Brownian motion
> > tracks.)
> >
> > def Wiener3D(incr, N):
> >     r = incr*(R.randint(3, size=(N,))-1)
> >     r[0] = 0
> >     r = r.cumsum()
> >     t = 2*np.pi*incr*(R.randint(3, size=(N,))-1)
> >     t[0] = 0
> >     t = t.cumsum()
> >     p = np.pi*incr*(R.randint(3, size=(N,))-1)
> >     p[0] = 0
> >     p = p.cumsum()
> >     x = r*np.cos(t)*np.sin(p)
> >     y = r*np.sin(t)*np.sin(p)
> >     z = r*np.cos(p)
> >     return np.array((x,y,z)).T
> >
> > Thanks!
> >
> > DG
>
> Not the kind of Wiener process I learned of.  This would be the integral of
> white noise.  Here you have used:
>
> 1. discrete increments
> 2. spherical coordinates
>
>

I agree with Neal: that is not a Wiener process.  In your process, the
*angles* that describe the position undergo a random walk, so the particle
tends to move in the same direction over short intervals.

To simulate a Wiener process (i.e. Brownian motion) in 3D, you can simply
evolve each coordinate independently as a 1D process.

Here's a simple function to generate a sample from a Wiener process.  The
dimension is determined by the shape of the starting point x0.


import numpy as np


def wiener(x0, n, dt, delta):
    """Generate an n-dimensional random walk.

    The array of values generated by this function simulates a Wiener
    process.

    Arguments
    ---------
    x0 : float or array
        The starting point of the random walk.
    n : int
        The number of steps to take.
    dt : float
        The time step.
    delta : float
        delta determines the "speed" of the random walk.  The random
        variable of the position at time t, X(t), has a normal
        distribution whose mean is the position at time t=0 and whose
        variance is delta**2*t.

    Returns
    -------
    x : numpy array
        The shape of `x` is (n+1,) + x0.shape.
        The first element in the array is x0.
    """
    x0 = np.asfarray(x0)
    shp = (n+1,) + x0.shape

    # Generate sample numbers from a normal distribution.
    r = np.random.normal(size=shp, scale=delta*np.sqrt(dt))

    # Replace the first element with 0.0, so that x0 + r.cumsum() results
    # in the first element being x0.
    r[0] = 0.0

    # This computes the random walk by forming the cumulative sum of
    # the random sample.
    x = r.cumsum(axis=0)
    x += x0

    return x
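
For example, a 3-d sample path using the function above:

x = wiener(x0=[0.0, 0.0, 0.0], n=1000, dt=0.01, delta=1.0)
print(x.shape)    # (1001, 3): the starting point plus 1000 steps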




Warren


___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Unexpected casting result

2013-09-16 Thread Warren Weckesser
On Mon, Sep 16, 2013 at 1:54 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

> An unexpected casting result was just reported on stackoverflow:
>
> http://stackoverflow.com/questions/18833639/attributeerror-in-python-numpy-when-constructing-function-for-certain-values
>
> The following show the essence of the issue:
>
> In [1]: np.__version__
> Out[1]: '1.9.0.dev-6ce65d8'
>
> In [2]: type(np.array(1.) * (2**64-1))
> Out[2]: numpy.float64
>
> In [3]: type(np.array(1.) * (2**64))
> Out[3]: float
>
> Note that the result of `np.array(1.0) * 2**64` is a Python float, not a
> numpy float64.  Is this intentional?
>
> (As pointed out in the stackoverflow question, the issue
> https://github.com/numpy/numpy/issues/3409 is at least tangentially
> related.)
>
> Warren
>
>


The original poster of the stackoverflow question has reported the issue on
github: https://github.com/numpy/numpy/issues/3756

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Unexpected casting result

2013-09-16 Thread Warren Weckesser
An unexpected casting result was just reported on stackoverflow:
http://stackoverflow.com/questions/18833639/attributeerror-in-python-numpy-when-constructing-function-for-certain-values

The following show the essence of the issue:

In [1]: np.__version__
Out[1]: '1.9.0.dev-6ce65d8'

In [2]: type(np.array(1.) * (2**64-1))
Out[2]: numpy.float64

In [3]: type(np.array(1.) * (2**64))
Out[3]: float

Note that the result of `np.array(1.0) * 2**64` is a Python float, not a
numpy float64.  Is this intentional?

(As pointed out in the stackoverflow question, the issue
https://github.com/numpy/numpy/issues/3409 is at least tangentially
related.)

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] A bug in numpy.random.shuffle?

2013-09-05 Thread Warren Weckesser
On Thu, Sep 5, 2013 at 2:11 PM, Fernando Perez  wrote:

> Hi all,
>
> I just ran into this rather weird behavior:
>
> http://nbviewer.ipython.org/6453869
>
> In summary, as far as I can tell, shuffle is misbehaving when acting
> on arrays that have structured dtypes. I've seen the problem on 1.7.1
> (official on ubuntu 13.04) as well as master as of a few minutes ago.
>
> Is this my misuse? It really looks like a bug to me...
>
>

Definitely a bug:

In [1]: np.__version__
Out[1]: '1.9.0.dev-573b3b0'

In [2]: z = np.array([(0,),(1,),(2,),(3,),(4,)], dtype=[('a',int)])

In [3]: z
Out[3]:
array([(0,), (1,), (2,), (3,), (4,)],
      dtype=[('a', '<i8')])
> f
>
> --
> Fernando Perez (@fperez_org; http://fperez.org)
> fperez.net-at-gmail: mailing lists only (I ignore this when swamped!)
> fernando.perez-at-berkeley: contact me here for any direct mail
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy dot returns [nan nan nan]

2013-08-24 Thread Warren Weckesser
On 8/24/13, Tom Bennett  wrote:
> Hi Warren,
>
> Yes you are absolutely right. I had some values close to log(x), where x is
> almost 0. That caused the problem.
>
> Thanks,
> Tom


Now the question is: why does `np.dot` mask the overflow warning?

In numpy 1.7.1, the default is that overflow should generate a warning:

In [1]: np.seterr()
Out[1]: {'divide': 'warn', 'invalid': 'warn', 'over': 'warn', 'under': 'ignore'}


But `np.dot` does not generate a warning:

In [2]: x = np.array([1e300])

In [3]: y = np.array([1e10])

In [4]: np.dot(x, y)
Out[4]: inf


Multiplying `x` and `y` generates the warning, as expected:

In [5]: x*y
/home/warren/anaconda/bin/ipython:1: RuntimeWarning: overflow
encountered in multiply
  #!/home/warren/anaconda/bin/python
Out[5]: array([ inf])


Warren


>
>
> On Sat, Aug 24, 2013 at 12:39 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> On 8/24/13, Warren Weckesser  wrote:
>> > On 8/24/13, Tom Bennett  wrote:
>> >> Hi All,
>> >>
>> >> I have two arrays, A and B.  A is 3 x 100,000 and B is 100,000. If I do
>> >> np.dot(A,B), I get [nan, nan, nan].
>> >>
>> >> However, np.any(np.isnan(A))==False and np.any(np.isnan(B))==False.  And
>> >> also np.seterr(all='print') does not print anything.
>> >>
>> >> I am not wondering what is going on and how to avoid.
>> >>
>> >> In case it is important, A and B are from the normal equation of doing
>> >> regression. I am regressing 100,000 observations on three 100,000-long
>> >> factors.
>> >>
>> >> Thanks,
>> >> Tom
>> >>
>> >
>> > What are the data types of the arrays, and what are the typical sizes
>> > of the values in these arrays?  I can get all nans from np.dot if the
>> > values are huge floating point values:
>> >
>> > ```
>> > In [79]: x = 1e160*np.random.randn(3, 10)
>> >
>> > In [80]: y = 1e160*np.random.randn(10)
>> >
>> > In [81]: np.dot(x, y)
>> > Out[81]: array([ nan,  nan,  nan])
>> > ```
>>
>> ...and that happens because some intermediate terms overflow to inf or
>> -inf, and adding these gives nan:
>>
>> ```
>> In [89]: x = np.array([1e300])
>>
>> In [90]: y = np.array([1e10])
>>
>> In [91]: np.dot(x,y)
>> Out[91]: inf
>>
>> In [92]: x2 = np.array([1e300, 1e300])
>>
>> In [93]: y2 = np.array([1e10,-1e10])
>>
>> In [94]: np.dot(x2, y2)
>> Out[94]: nan
>> ```
>>
>> Warren
>>
>>
>> >
>> > Warren
>> >
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy dot returns [nan nan nan]

2013-08-24 Thread Warren Weckesser
On 8/24/13, Warren Weckesser  wrote:
> On 8/24/13, Tom Bennett  wrote:
>> Hi All,
>>
>> I have two arrays, A and B.  A is 3 x 100,000 and B is 100,000. If I do
>> np.dot(A,B), I get [nan, nan, nan].
>>
>> However, np.any(np.isnan(A))==False and np.any(np.isnan(B))==False. And
>> also np.seterr(all='print') does not print anything.
>>
>> I am now wondering what is going on and how to avoid it.
>>
>> In case it is important, A and B are from the normal equation of doing
>> regression. I am regressing 100,000 observations on three 100,000-long
>> factors.
>>
>> Thanks,
>> Tom
>>
>
> What are the data types of the arrays, and what are the typical sizes
> of the values in these arrays?  I can get all nans from np.dot if the
> values are huge floating point values:
>
> ```
> In [79]: x = 1e160*np.random.randn(3, 10)
>
> In [80]: y = 1e160*np.random.randn(10)
>
> In [81]: np.dot(x, y)
> Out[81]: array([ nan,  nan,  nan])
> ```

...and that happens because some intermediate terms overflow to inf or
-inf, and adding these gives nan:

```
In [89]: x = np.array([1e300])

In [90]: y = np.array([1e10])

In [91]: np.dot(x,y)
Out[91]: inf

In [92]: x2 = np.array([1e300, 1e300])

In [93]: y2 = np.array([1e10,-1e10])

In [94]: np.dot(x2, y2)
Out[94]: nan
```

Warren


>
> Warren
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy dot returns [nan nan nan]

2013-08-24 Thread Warren Weckesser
On 8/24/13, Tom Bennett  wrote:
> Hi All,
>
> I have two arrays, A and B.  A is 3 x 100,000 and B is 100,000. If I do
> np.dot(A,B), I get [nan, nan, nan].
>
> However, np.any(np.isnan(A))==False and np.any(np.isnan(B))==False. And
> also np.seterr(all='print') does not print anything.
>
> I am now wondering what is going on and how to avoid it.
>
> In case it is important, A and B are from the normal equation of doing
> regression. I am regressing 100,000 observations on three 100,000-long factors.
>
> Thanks,
> Tom
>

What are the data types of the arrays, and what are the typical sizes
of the values in these arrays?  I can get all nans from np.dot if the
values are huge floating point values:

```
In [79]: x = 1e160*np.random.randn(3, 10)

In [80]: y = 1e160*np.random.randn(10)

In [81]: np.dot(x, y)
Out[81]: array([ nan,  nan,  nan])
```
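
A quick way to check whether that is what is happening with your arrays (a
sketch; this catches overflow in the individual products--intermediate
partial sums can in principle still overflow even when every product is
finite):

```
In [82]: p = x * y      # the elementwise products that np.dot sums

In [83]: np.isinf(p).any()
Out[83]: True
```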

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Warnings not raised by np.log in 32 bit build on Windows

2013-08-22 Thread Warren Weckesser
I'm investigating a test error in scipy 0.13.0 beta 1 that was
reported by Christoph Gohlke.  The scipy issue is here:
https://github.com/scipy/scipy/issues/2771

I don't have a Windows environment to test it myself, but Christoph
reported that this code:

```
import numpy as np

data = np.array([-0.375, -0.25, 0.0])
s = np.log(data)
```

does not generate two RuntimeWarnings when it is run with numpy 1.7.1
in a 32 bit Windows 8 environment (numpy 1.7.1 compiled with Visual
Studio compilers and Intel's MKL).  In 64 bit Windows, and in 64 bit
linux, it generates two RuntimeWarnings.
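
(For anyone testing: on a build where the floating-point status flags are
detected correctly, the errors can be made loud--a sketch:)

```
import numpy as np

data = np.array([-0.375, -0.25, 0.0])
with np.errstate(invalid='raise', divide='raise'):
    np.log(data)   # expected: FloatingPointError on a working build
```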

The inconsistency seems like a bug, possibly this one:
https://github.com/numpy/numpy/issues/1958.

Can anyone check if this also occurs in the development branch?

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy-Discussion Digest, Vol 83, Issue 33

2013-08-20 Thread Warren Weckesser
On 8/20/13, rodrigo koblitz  wrote:
> Hi,
> How can I do this:
> int(scipy.comb(20314,117))
> ...
> OverflowError: cannot convert float infinity to integer
>


I assume you mean `scipy.misc.comb`.  If you give `comb` the argument
`exact=True`, it will give the exact result as a Python long integer:

>>> comb(20314, 117, exact=True)
185322125435964088726782059829379016108985668708943661610691107922797953622540795237396216566443123945647349065209794249915331405960154995668715672694861752881279482861934217563789733636501993781318815128638676831358831027763891670979664077780116887804168965068781398413964827499948504201570364142600031022445600L
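
(The default floating-point path overflows because the result is far too
large for a float64; a rough check with the log-gamma function:)

>>> from scipy.special import gammaln
>>> import numpy as np
>>> # log10 of C(20314, 117): about 311.7, i.e. a 312-digit number,
>>> # while float64 overflows near 1.8e308.
>>> (gammaln(20315) - gammaln(118) - gammaln(20198))/np.log(10)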

By the way, scipy questions like this should be asked on the
scipy-user mailing list: mail.scipy.org/mailman/listinfo/scipy-user
(although I can't seem to connect to that site at the moment).

Warren


> abs,
> Koblitz
>
>
> 2013/8/20 
>
>> Send NumPy-Discussion mailing list submissions to
>> numpy-discussion@scipy.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>> or, via email, send a message with subject or body 'help' to
>> numpy-discussion-requ...@scipy.org
>>
>> You can reach the person managing the list at
>> numpy-discussion-ow...@scipy.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of NumPy-Discussion digest..."
>>
>>
>> Today's Topics:
>>
>>1. OS X binaries for releases (Ralf Gommers)
>>2. Re: OS X binaries for releases (David Cournapeau)
>>3. Re: OS X binaries for releases (KACVINSKY Tom)
>>4. Re: OS X binaries for releases (David Cournapeau)
>>
>>
>> --
>>
>> Message: 1
>> Date: Tue, 20 Aug 2013 22:48:51 +0200
>> From: Ralf Gommers 
>> Subject: [Numpy-discussion] OS X binaries for releases
>> To: Discussion of Numerical Python 
>> Message-ID:
>> <
>> cabl7cqjacxp2grtt8hvmayajrm0xmtn1qt71wkdnbgq7dlu...@mail.gmail.com>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Hi all,
>>
>> Building binaries for releases is currently quite complex and
>> time-consuming. For OS X we need two different machines, because we still
>> provide binaries for OS X 10.5 and PPC machines. I propose to not do this
>> anymore. It doesn't mean we completely drop support for 10.5 and PPC,
>> just
>> that we don't produce binaries. PPC was phased out in 2006 and OS X 10.6
>> came out in 2009, so there can't be a lot of demand for it (and the
>> download stats at
>> http://sourceforge.net/projects/numpy/files/NumPy/1.7.1/ confirm this).
>>
>> Furthermore I propose to not provide 2.6 binaries anymore. Downloads of
>> 2.6
>> OS X binaries were <5% of the 2.7 ones. We did the same with 2.4 for a
>> long
>> time - support it but no binaries.
>>
>> So what we'd have left at the moment is only the 64-bit/32-bit universal
>> binary for 10.6 and up. What we finally need to add is 3.x OS X binaries.
>> We can make an attempt to build these on 10.8 - since we have access to a
>> hosted 10.8 Mac Mini it would allow all devs to easily do a release
>> (leaving aside the Windows issue). If anyone has tried the 10.6 SDK on
>> 10.8
>> and knows if it actually works, that would be helpful.
>>
>> Any concerns, objections?
>>
>> Cheers,
>> Ralf
>>
>> P.S. the same proposal applies of course also to scipy
>> -- next part --
>> An HTML attachment was scrubbed...
>> URL:
>> http://mail.scipy.org/pipermail/numpy-discussion/attachments/20130820/4d74e9d0/attachment-0001.html
>>
>> --
>>
>> Message: 2
>> Date: Tue, 20 Aug 2013 23:17:19 +0100
>> From: David Cournapeau 
>> Subject: Re: [Numpy-discussion] OS X binaries for releases
>> To: Discussion of Numerical Python 
>> Message-ID:
>> > 09qp4tjdczpxkywstbkfexodzumo9mqe9xyy5...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> On Tue, Aug 20, 2013 at 9:48 PM, Ralf Gommers > >wrote:
>>
>> > Hi all,
>> >
>> > Building binaries for releases is currently quite complex and
>> > time-consuming. For OS X we need two different machines, because we
>> > still
>> > provide binaries for OS X 10.5 and PPC machines. I propose to not do
>> > this
>> > anymore. It doesn't mean we completely drop support for 10.5 and PPC,
>> just
>> > that we don't produce binaries. PPC was phased out in 2006 and OS X
>> > 10.6
>> > came out in 2009, so there can't be a lot of demand for it (and the
>> > download stats at
>> http://sourceforge.net/projects/numpy/files/NumPy/1.7.1/ confirm this).
>> >
>> > Furthermore I propose to not provide 2.6 binaries anymore. Downloads of
>> > 2.6 OS X binaries were <5% of the 2.7 ones. We did the same with 2.4
>> > for
>> a
>> > long time - support it but no binaries.
>> >
>> > So what we'd have left at the moment is only the 64-bit/32-bit
>> > universal
>> > binary for 10.6 and up. What we finally need to add is 3.x OS X
>> 

Re: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown

2013-07-19 Thread Warren Weckesser
On 7/19/13, Yaroslav Halchenko  wrote:
> I have just added a few more benchmarks, and here they come
> http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pinv-a-float32
> it seems to be very recent so my only check based on 10 commits
> didn't pick it up yet so they are not present in the summary table.
>
> could well be related to 80% faster det()? ;)
>
> norm was hit as well a bit earlier,


Well, this is embarrassing: https://github.com/numpy/numpy/pull/3539

Thanks for the benchmarks!  I'm now an even bigger fan. :)

Warren


> might well be within these commits:
> https://github.com/numpy/numpy/compare/24a0aa5...29dcc54
> I will rerun now benchmarking for the rest of commits (was running last
> in the day iirc)
>
> Cheers,
>
> On Tue, 16 Jul 2013, Yaroslav Halchenko wrote:
>
>> and to put so far reported findings into some kind of automated form,
>> please welcome
>
>> http://www.onerussian.com/tmp/numpy-vbench/#benchmarks-performance-analysis
>
>> This is based on a simple 1-way anova of last 10 commits and some point
>> in the past where 10 other commits had smallest timing and were
>> significantly
>> different from the last 10 commits.
>
>> "Possible recent" is probably too noisy and not sure if useful -- it
>> should
>> point to a closest in time (to the latest commits) diff where a
>> significant excursion from current performance was detected.  So per se it
>> has
>> nothing to do with the initial detected performance hit, but in some
>> cases
>> seems still to reasonably locate commits hitting on performance.
>
>> Enjoy,
> --
> Yaroslav O. Halchenko, Ph.D.
> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
> Senior Research Associate, Psychological and Brain Sciences Dept.
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] retrieving original array locations from 2d argsort

2013-07-15 Thread Warren Weckesser
On 7/15/13, Moroney, Catherine M (398D)
 wrote:
> I know that there's an easy way to solve this problem, but I'm not
> sufficiently knowledgeable
> about numpy indexing to figure it out.
>
> Here is the problem:
>
> Take a 2-d array a, of any size.
> Sort it in ascending order using, I presume, argsort.
> Step through the sorted array in order, and for each element in the sorted
> array,
> retrieve what the corresponding (line, sample) indices in the original array
> are.
>
> For instance:
>
> a = numpy.arange(0, 16).reshape(4,4)
> a[0,:] = -1*numpy.arange(0,4)
> a[2,:] = -1*numpy.arange(4,8)
>
> asort = numpy.sort(a, axis=None)
> for idx in xrange(0, asort.size):
>   element = asort[idx]
> !! Find the line and sample location in a that corresponds to the
> i-th element in asort
>


One way is to use argsort and  `numpy.unravel_index` to recover the
original 2D indices:


import numpy

a = numpy.arange(0, 16).reshape(4,4)
a[0,:] = -1*numpy.arange(0,4)
a[2,:] = -1*numpy.arange(4,8)

flat_sort_indices = numpy.argsort(a, axis=None)
original_indices = numpy.unravel_index(flat_sort_indices, a.shape)

print "  i   j  a[i,j]"
for i, j in zip(*original_indices):
    element = a[i,j]
    print "%3d %3d %6d" % (i, j, element)




Warren



> Thank-you for your help,
>
> Catherine
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] What should be the result in some statistics corner cases?

2013-07-14 Thread Warren Weckesser
On 7/14/13, Charles R Harris  wrote:
> Some corner cases in the mean, var, std.
>
> *Empty arrays*
>
> I think these cases should either raise an error or just return nan.
> Warnings seem ineffective to me as they are only issued once by default.
>
> In [3]: ones(0).mean()
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61:
> RuntimeWarning: invalid value encountered in double_scalars
>   ret = ret / float(rcount)
> Out[3]: nan
>
> In [4]: ones(0).var()
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76:
> RuntimeWarning: invalid value encountered in true_divide
>   out=arrmean, casting='unsafe', subok=False)
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> RuntimeWarning: invalid value encountered in double_scalars
>   ret = ret / float(rcount)
> Out[4]: nan
>
> In [5]: ones(0).std()
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76:
> RuntimeWarning: invalid value encountered in true_divide
>   out=arrmean, casting='unsafe', subok=False)
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> RuntimeWarning: invalid value encountered in double_scalars
>   ret = ret / float(rcount)
> Out[5]: nan
>
> *ddof >= number of elements*
>
> I think these should just raise errors. The results for ddof >= #elements
> is happenstance, and certainly negative numbers should never be returned.
>
> In [6]: ones(2).var(ddof=2)
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> RuntimeWarning: invalid value encountered in double_scalars
>   ret = ret / float(rcount)
> Out[6]: nan
>
> In [7]: ones(2).var(ddof=3)
> Out[7]: -0.0
>
> *nansum*
>
> Currently returns nan for empty arrays. I suspect it should return nan for
> slices that are all nan, but 0 for empty slices. That would make it
> consistent with sum in the empty case.
>


For nansum, I would expect 0 even in the case of all nans.  The point
of these functions is to simply ignore nans, correct?  So I would aim
for this behaviour:  nanfunc(x) behaves the same as func(x[~isnan(x)])
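
For example, under that rule an all-nan slice reduces to an empty slice, and
summing an empty array gives 0:

```
In [1]: import numpy as np

In [2]: x = np.array([np.nan, np.nan])

In [3]: np.sum(x[~np.isnan(x)])   # all-nan -> empty slice -> sum is 0.0
Out[3]: 0.0
```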

Warren


> Chuck
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] flip array on axis

2013-07-10 Thread Warren Weckesser
On Wed, Jul 10, 2013 at 12:03 PM, Andreas Hilboll  wrote:

> On 10.07.2013 17:06, Matthew Brett wrote:
> > Hi,
> >
> > On Wed, Jul 10, 2013 at 11:02 AM, Andreas Hilboll 
> wrote:
> >> Hi,
> >>
> >> there are np.flipud and np.fliplr methods to flip 2d arrays on the first
> >> and second dimension, respectively. What can I do to flip an array on an
> >> axis which I don't know before runtime? I'd really like to see a
> >> np.flip(arr, axis) method which lets me specify which axis to flip on.
> >
> > I have something like that that's a few lines long:
> >
> > https://github.com/nipy/nibabel/blob/master/nibabel/orientations.py#L231
> >
> > Cheers,
> >
> > Matthew
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
> Thanks, Matthew! Should this go into numpy itself? If so, I could
> prepare a PR, if you point me to the right place (file) to put it.
>
>

Something like this would be nice to have in numpy, so we don't continue to
reinvent it (e.g.
https://github.com/scipy/scipy/blob/master/scipy/signal/_arraytools.py; see
`axis_slice` and `axis_reverse`).
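
For reference, a generic flip only needs a reversed slice along the chosen
axis.  A minimal sketch of that trick (the `flip` helper below is
hypothetical, essentially what `axis_reverse` does):

```
import numpy as np

def flip(arr, axis):
    """Reverse `arr` along an arbitrary axis, returned as a view."""
    index = [slice(None)] * arr.ndim
    index[axis] = slice(None, None, -1)   # i.e. arr[..., ::-1, ...]
    return arr[tuple(index)]

a = np.arange(6).reshape(2, 3)
print(flip(a, 1))   # [[2 1 0]
                    #  [5 4 3]]
```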

Warren



> Cheers, Andreas.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] TypeError when multiplying float64 and a big integer in Python 3.

2013-06-16 Thread Warren Weckesser
On Sun, Jun 16, 2013 at 12:56 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

> With Python 3.3.2 (64 bit), and numpy master:
>
> >>> import numpy as np
> >>> np.__version__
> '1.8.0.dev-2a5c2c8'
>
> >>> f = np.float64(1.0)
> >>> i = 2**65
> >>> f*i
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: unsupported operand type(s) for *: 'numpy.float64' and 'int'
>
> Is this the expected behavior?
>
> The error does not occur with integers that fit in 64 bits:
>
> >>> f*10
> 10.0
>
> It also does not occur in numpy 1.7.1.
>
>

I should have checked the issues on github before mailing the list:
https://github.com/numpy/numpy/issues/3442

Warren



> Warren
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] TypeError when multiplying float64 and a big integer in Python 3.

2013-06-16 Thread Warren Weckesser
With Python 3.3.2 (64 bit), and numpy master:

>>> import numpy as np
>>> np.__version__
'1.8.0.dev-2a5c2c8'

>>> f = np.float64(1.0)
>>> i = 2**65
>>> f*i
Traceback (most recent call last):
  File "", line 1, in 
TypeError: unsupported operand type(s) for *: 'numpy.float64' and 'int'

Is this the expected behavior?

The error does not occur with integers that fit in 64 bits:

>>> f*10
10.0

It also does not occur in numpy 1.7.1.

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Seg. fault when running tests

2013-06-15 Thread Warren Weckesser
On Sat, Jun 15, 2013 at 4:03 PM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:

> On 15.06.2013 21:57, Warren Weckesser wrote:
> >
> > On Sat, Jun 15, 2013 at 3:15 PM, Julian Taylor
>
> > @warren, can you please bisect the commit causing this?
> >
> >
> >
> > Here's the culprit:
> >
> > aef286debfd11a62f1c337dea55624cee7fd4d9e is the first bad commit
> > commit aef286debfd11a62f1c337dea55624cee7fd4d9e
> > Author: Julian Taylor <jtaylor.deb...@googlemail.com>
> > Date:   Mon Jun 10 19:38:58 2013 +0200
> >
> > ENH: enable unaligned loads on x86
> >
> > x86 can handle unaligned load and there is no hand vectorized code in
> > this file. It would be a serious compiler bug if it adds
> vectorization
> > without checking for alignment.
> > Enables fast complex128 copies which are unaligned on 32 bit gcc
> unless
> > compiled with -malign-double.
> >
> > :04 04 d0948d1e1d942d41d50ce9e57bdc430db9a16f9e
> > 45a48f383353857b8d0dd24e542c7ab6f137448c M  numpy
> >
> > Link:
> >
> https://github.com/numpy/numpy/commit/aef286debfd11a62f1c337dea55624cee7fd4d9e
> >
>
> Interesting, possibly there is some inconsistencies in the macros in
> this file.
> But I don't understand why I can't reproduce it.
> Does it happen with python 3.2 too?
>


Yes, it happens with 3.3.2, 3.3.1, and 3.2.5.

Warren


> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Seg. fault when running tests

2013-06-15 Thread Warren Weckesser
On Sat, Jun 15, 2013 at 3:57 PM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

>
> On Sat, Jun 15, 2013 at 3:15 PM, Julian Taylor <
> jtaylor.deb...@googlemail.com> wrote:
>
>> On 15.06.2013 21:12, Charles R Harris wrote:
>> >
>> >
>> > On Sat, Jun 15, 2013 at 9:50 AM, Warren Weckesser
>> > mailto:warren.weckes...@gmail.com>> wrote:
>> >
>> >
>> > On Sat, Jun 15, 2013 at 11:43 AM, Warren Weckesser
>> > mailto:warren.weckes...@gmail.com>>
>> wrote:
>> >
>> > I'm getting a seg. fault in master when I run the tests.  I'm on
>> > Ubuntu 12.04 64 bit, with Python 3.3.2 (64 bits):
>> >
>> > $ python3 -c "import numpy as np; np.test('full')"
>> > Running unit tests for numpy
>> > NumPy version 1.8.0.dev-fa5bc1c
>> > NumPy is installed in
>> > /home/warren/local_py332/lib/python3.3/site-packages/numpy
>> > Python version 3.3.2 (default, Jun 14 2013, 12:12:22) [GCC
>> 4.6.3]
>> > nose version 1.3.0
>> >
>> .S.S...S
>>  .
>>
>> .KSSS.
>>  .
>>
>> ...K.Segmentation
>> > fault
>> >
>> > The seg. fault is occurring in ma/tests/test_mrecords.py:
>> >
>> > $ nosetests test_mrecords.py
>> > .Segmentation fault
>> >
>> > More info:
>> >
>> > $ python3
>> > Python 3.3.2 (default, Jun 14 2013, 12:12:22)
>> > [GCC 4.6.3] on linux
>> > Type "help", "copyright", "credits" or "license" for more
>> > information.
>> > >>> import numpy as np
>> > >>> np.show_config()
>> > atlas_threads_info:
>> > library_dirs = ['/usr/lib/atlas-base/atlas',
>> > '/usr/lib/atlas-base']
>> > include_dirs = ['/usr/include/atlas']
>> > language = f77
>> > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>> > define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
>> > atlas_blas_threads_info:
>> > library_dirs = ['/usr/lib/atlas-base']
>> > include_dirs = ['/

Re: [Numpy-discussion] Seg. fault when running tests

2013-06-15 Thread Warren Weckesser
On Sat, Jun 15, 2013 at 3:15 PM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:

> On 15.06.2013 21:12, Charles R Harris wrote:
> >
> >
> > On Sat, Jun 15, 2013 at 9:50 AM, Warren Weckesser
> > mailto:warren.weckes...@gmail.com>> wrote:
> >
> >
> > On Sat, Jun 15, 2013 at 11:43 AM, Warren Weckesser
> > mailto:warren.weckes...@gmail.com>>
> wrote:
> >
> > I'm getting a seg. fault in master when I run the tests.  I'm on
> > Ubuntu 12.04 64 bit, with Python 3.3.2 (64 bits):
> >
> > $ python3 -c "import numpy as np; np.test('full')"
> > Running unit tests for numpy
> > NumPy version 1.8.0.dev-fa5bc1c
> > NumPy is installed in
> > /home/warren/local_py332/lib/python3.3/site-packages/numpy
> > Python version 3.3.2 (default, Jun 14 2013, 12:12:22) [GCC 4.6.3]
> > nose version 1.3.0
> >
> .S.S...S
>  .
>
> .KSSS.
>  .
>
> ...K.Segmentation
> > fault
> >
> > The seg. fault is occurring in ma/tests/test_mrecords.py:
> >
> > $ nosetests test_mrecords.py
> > .Segmentation fault
> >
> > More info:
> >
> > $ python3
> > Python 3.3.2 (default, Jun 14 2013, 12:12:22)
> > [GCC 4.6.3] on linux
> > Type "help", "copyright", "credits" or "license" for more
> > information.
> > >>> import numpy as np
> > >>> np.show_config()
> > atlas_threads_info:
> > library_dirs = ['/usr/lib/atlas-base/atlas',
> > '/usr/lib/atlas-base']
> > include_dirs = ['/usr/include/atlas']
> > language = f77
> > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
> > define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
> > atlas_blas_threads_info:
> > library_dirs = ['/usr/lib/atlas-base']
> > include_dirs = ['/usr/include/atlas']
> > language = c
> > libraries = ['ptf77blas', 'ptcblas', 'atlas']
> > define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
> > mkl_info:
> >   NOT AVAI

Re: [Numpy-discussion] Seg. fault when running tests

2013-06-15 Thread Warren Weckesser
On Sat, Jun 15, 2013 at 3:26 PM, Charles R Harris  wrote:

>
>
> On Sat, Jun 15, 2013 at 1:15 PM, Julian Taylor <
> jtaylor.deb...@googlemail.com> wrote:
>
>> On 15.06.2013 21:12, Charles R Harris wrote:
>> >
>> >
>> > On Sat, Jun 15, 2013 at 9:50 AM, Warren Weckesser
>> > mailto:warren.weckes...@gmail.com>> wrote:
>> >
>> >
>> > On Sat, Jun 15, 2013 at 11:43 AM, Warren Weckesser
>> > mailto:warren.weckes...@gmail.com>>
>> wrote:
>> >
>> > I'm getting a seg. fault in master when I run the tests.  I'm on
>> > Ubuntu 12.04 64 bit, with Python 3.3.2 (64 bits):
>> >
>> > $ python3 -c "import numpy as np; np.test('full')"
>> > Running unit tests for numpy
>> > NumPy version 1.8.0.dev-fa5bc1c
>> > NumPy is installed in
>> > /home/warren/local_py332/lib/python3.3/site-packages/numpy
>> > Python version 3.3.2 (default, Jun 14 2013, 12:12:22) [GCC
>> 4.6.3]
>> > nose version 1.3.0
>> >
>> .S.S...S
>>  .
>>
>> .KSSS.
>>  .
>>
>> ...K.Segmentation
>> > fault
>> >
>> > The seg. fault is occurring in ma/tests/test_mrecords.py:
>> >
>> > $ nosetests test_mrecords.py
>> > .Segmentation fault
>> >
>> > More info:
>> >
>> > $ python3
>> > Python 3.3.2 (default, Jun 14 2013, 12:12:22)
>> > [GCC 4.6.3] on linux
>> > Type "help", "copyright", "credits" or "license" for more
>> > information.
>> > >>> import numpy as np
>> > >>> np.show_config()
>> > atlas_threads_info:
>> > library_dirs = ['/usr/lib/atlas-base/atlas',
>> > '/usr/lib/atlas-base']
>> > include_dirs = ['/usr/include/atlas']
>> > language = f77
>> > libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>> > define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
>> > atlas_blas_threads_info:
>> > library_dirs = ['/usr/lib/atlas-base']
>> > include_dirs = ['/usr/include/atlas']
>>

Re: [Numpy-discussion] Seg. fault when running tests

2013-06-15 Thread Warren Weckesser
On Sat, Jun 15, 2013 at 11:43 AM, Warren Weckesser <
warren.weckes...@gmail.com> wrote:

> I'm getting a seg. fault in master when I run the tests.  I'm on Ubuntu
> 12.04 64 bit, with Python 3.3.2 (64 bits):
>
> $ python3 -c "import numpy as np; np.test('full')"
> Running unit tests for numpy
> NumPy version 1.8.0.dev-fa5bc1c
> NumPy is installed in
> /home/warren/local_py332/lib/python3.3/site-packages/numpy
> Python version 3.3.2 (default, Jun 14 2013, 12:12:22) [GCC 4.6.3]
> nose version 1.3.0
> .S.S...S..KSSS.K.Segmentation
> fault
>
> The seg. fault is occurring in ma/tests/test_mrecords.py:
>
> $ nosetests test_mrecords.py
> .Segmentation fault
>
> More info:
>
> $ python3
> Python 3.3.2 (default, Jun 14 2013, 12:12:22)
> [GCC 4.6.3] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> np.show_config()
> atlas_threads_info:
> library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base']
> include_dirs = ['/usr/include/atlas']
> language = f77
> libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
> define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
> atlas_blas_threads_info:
> library_dirs = ['/usr/lib/atlas-base']
> include_dirs = ['/usr/include/atlas']
> language = c
> libraries = ['ptf77blas', 'ptcblas', 'atlas']
> define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
> mkl_info:
>   NOT AVAILABLE
> lapack_opt_info:
> library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base']
> include_dirs = ['/usr/include/atlas']
> language = f77
> libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
> define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
> blas_opt_info:
> library_dirs = ['/usr/lib/atlas-base']
> include_dirs = ['/usr/include/atlas']
> language = c
> libraries = ['ptf77blas', 'ptcblas', 'atlas']
> define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
> lapack_mkl_info:
>   NOT AVAILABLE
> blas_mkl_info:
>   NOT AVAILABLE
> >>>
>
> gdb:
>
> $ gdb python3
> GNU gdb (Ubuntu/

[Numpy-discussion] Seg. fault when running tests

2013-06-15 Thread Warren Weckesser
I'm getting a seg. fault in master when I run the tests.  I'm on Ubuntu
12.04 64 bit, with Python 3.3.2 (64 bits):

$ python3 -c "import numpy as np; np.test('full')"
Running unit tests for numpy
NumPy version 1.8.0.dev-fa5bc1c
NumPy is installed in
/home/warren/local_py332/lib/python3.3/site-packages/numpy
Python version 3.3.2 (default, Jun 14 2013, 12:12:22) [GCC 4.6.3]
nose version 1.3.0
.S.S...S..KSSS.K.Segmentation
fault

The seg. fault is occurring in ma/tests/test_mrecords.py:

$ nosetests test_mrecords.py
.Segmentation fault

More info:

$ python3
Python 3.3.2 (default, Jun 14 2013, 12:12:22)
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.show_config()
atlas_threads_info:
library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base']
include_dirs = ['/usr/include/atlas']
language = f77
libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
atlas_blas_threads_info:
library_dirs = ['/usr/lib/atlas-base']
include_dirs = ['/usr/include/atlas']
language = c
libraries = ['ptf77blas', 'ptcblas', 'atlas']
define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
mkl_info:
  NOT AVAILABLE
lapack_opt_info:
library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base']
include_dirs = ['/usr/include/atlas']
language = f77
libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
blas_opt_info:
library_dirs = ['/usr/lib/atlas-base']
include_dirs = ['/usr/include/atlas']
language = c
libraries = ['ptf77blas', 'ptcblas', 'atlas']
define_macros = [('ATLAS_INFO', '"\\"3.8.4\\""')]
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
>>>

gdb:

$ gdb python3
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/warren/local_py332/bin/python3...done.
(gdb) run test_mrecords.py
Starting program: /home/warren/local_py332/bin/python3 test_mrecords.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
.
Program received signal SIGSEGV, Segmentation fault.
0x75f080a4 in _aligned_strided_to_contig_size8_srcstride0
(dst=,
ds

Re: [Numpy-discussion] weird problem with subtracting ndarrays

2013-06-12 Thread Warren Weckesser
On Wed, Jun 12, 2013 at 3:25 PM, Moroney, Catherine M (398D) <
catherine.m.moro...@jpl.nasa.gov> wrote:

> Hello,
>
> I've got two arrays of the same shape that I read in from a file, and I'm
> trying to
> difference them.  Very simple stuff, but I'm getting weird answers.
>
> Here is the code:
>
> >>> counts1 = hfile1.read_grid_field("CFbA",
> "TerrainReferencedRCCMFraction_Num")
> >>> counts2 = hfile2.read_grid_field("CFbA",
> "TerrainReferencedRCCMFraction_Num")
> >>> counts1.max(), counts2.max()
> (13, 13)
> >>> counts1.min(), counts2.min()
> (0, 0)
> >>> numpy.all(counts1 == counts2)
> False
> >>> diff = counts1 - counts2
> >>> diff.max()
> 4294967295  !! WHAT IS HAPPENING HERE ??
> >>> sum = counts1 + counts2
> >>> sum.max()
> 26
>
> As you can see, the range of values in both arrays is 0 to 13, and the sum
> behaves normally, but the difference gives this weird number.
>
> When I create dummy arrays, the subtraction works fine.  So there must be
> some funny value
> lurking in either the counts1 or counts2 array, but the numpy.isnan() test
> returns False.
>
> Any ideas for how I debug this?
>
> Catherine
>
>
Check the dtype of the arrays.  They are probably unsigned integers, and
the subtraction leads to wrap-around in some cases.

For example:

In [1]: x = np.array([0, 1, 2], dtype=np.uint32)

In [2]: y = np.array([1, 1, 1], dtype=np.uint32)

In [3]: x - y
Out[3]: array([4294967295,  0,  1], dtype=uint32)
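
If the arrays really do need an unsigned dtype, one way to see the intended
differences is to cast to a signed type before subtracting, e.g. with the x
and y above:

```
In [4]: x.astype(np.int64) - y.astype(np.int64)
Out[4]: array([-1,  0,  1])
```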


Warren




> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.filled, again

2013-06-12 Thread Warren Weckesser
On Wed, Jun 12, 2013 at 2:00 PM, Nathaniel Smith  wrote:

> On 12 Jun 2013 18:20, "Ralf Gommers"  wrote:
> >
> >
> >
> >
> > On Wed, Jun 12, 2013 at 6:36 PM, Chris Barker - NOAA Federal <
> chris.bar...@noaa.gov> wrote:
> >>
> >> On Wed, Jun 12, 2013 at 5:10 AM, Nathaniel Smith  wrote:
> >>
> >> > Personally I think that overloading np.empty is horribly ugly, will
> >> > continue confusing newbies and everyone else indefinitely, and I'm
> >> > 100% convinced that we'll regret implementing such a warty interface
> >> > for something that should be so idiomatic.
> >
> >
> > I agree.
>
> Sounds like we're pretty much reaching consensus. Phew.
>
> >>
> >> ...
> >>  deprecate np.ma.filled
> >
> >
> > Please don't. Rather just live with the inconsistency between numpy and
> numpy.ma APIs. If that bothers you, just tell yourself that we'll get an
> NA dtype at some point and that that will make numpy.ma much less
> important:)
>
> Oh, I do tell myself that :-). With my committer/consensus-building hat
> on, np.ma has users, so I want something they can live with, and was
> suggesting some options. For myself I don't really care what np.ma does
> though since I don't use it...
>
> >> in favor
> >> > of masked_array.filled (which does exactly the same thing) and
> >> > eventually switch np.ma.filled to be consistent with the new
> >> > np.filled.
> >>
> >> +1
> >>
> >> > I also don't really see why an np.empty() constructor exists, it seems
> >> > to do the same thing that np.ndarray() does.
> >>
>> >> I had always assumed that np.ndarray() was a "low-level" interface that
> >> you really don't want to use in regular code (maybe for subclassing
> >> array...), as the docs say:
> >>
> >> """
> >> Arrays should be constructed using `array`, `zeros` or `empty` (refer
> >> to the See Also section below).  The parameters given here refer to
> >> a low-level method (`ndarray(...)`) for instantiating an array.
> >> """
> >>
>> >> Am I wrong? Is there any reason (other than history) to have np.empty()?
> >>
> >> But in any case, I like np.filled(), as being analogous to ones(),
> >> zeros() and empty()...
> >
> >
> > I like np.filled as well. np.fill_with sounds fine too.
>
> Grammatically, fill_with is an imperative, which suggests it needs an
> array to operate on; it's synonymous with just plain 'fill'. Having 'fill'
> and 'fill_with' as different functions with different semantics would be
> pretty confusing!
>
>
That's why I suggested 'filledwith' (add the underscore if you like).  This
also allows a corresponding masked implementation, 'ma.filledwith', without
clobbering the existing 'ma.filled'.

Warren

-n
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.filled, again

2013-06-12 Thread Warren Weckesser
On Wed, Jun 12, 2013 at 10:18 AM, Nathaniel Smith  wrote:

> On Wed, Jun 12, 2013 at 1:28 PM, Matthew Brett 
> wrote:
> > On Wed, Jun 12, 2013 at 1:10 PM, Nathaniel Smith  wrote:
> >> Personally I think that overloading np.empty is horribly ugly, will
> >> continue confusing newbies and everyone else indefinitely, and I'm
> >> 100% convinced that we'll regret implementing such a warty interface
> >> for something that should be so idiomatic. (Unfortunately I got busy
> >> and didn't actually say this in the previous thread though.)
> >
> > Maybe you could unpack this, as I seem to remember this was the option
> > with the most support previously.
>
> Indeed it was, which is why I brought it up :-).
>
> I'm not sure what more there is to unpack, though. It's just...
> offensive to every sense of API design I have, I don't know how to
> explain more than I have. I speculate that its only attraction is
> that it showed up at the end of a 50 email thread and offered the
> promise of ending things, but I don't know.
>
> Well, here's maybe another way of getting at the ugliness.
>
> Here's the current doc page listing all the options for creating an
> array -- a very useful reference, esp. while learning:
>   http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html
>
> Now imagine a new version of this page, if we add 'filled'. There will
> be a list at the top with functions named:
>   empty
>   filled
>   ones
>   zeros
> It's immediately obvious what all of these things do, and how they
> differ from each other, and in which situation you might want each,
> just from the names, even before you read the one-line synopses. Even
> more so if you know about the existence of np.fill(). The synopses for
> 'ones' and 'zeros' don't even have to change, they already use the
> word 'filled' to describe what they do. It all just works.
>
> Now imagine a different new version of this page, if we overload
> 'empty' to add a fill= option. I don't even know how we document that
> on this page. The list will remain:
>   empty
>   ones
>   zeros
> So there will be no clue there how you get an array filled with NaNs
> or whatever, or even any hint that it's possible. Then there's the
> prose on the right. Right now the synopsis for 'empty' is:
>   Return a new array of given shape and type, without initializing entries.
> I guess we change this to
>   Return a new array of given shape and type, without initializing
> entries, OR return a new array of given shape and type, with values
> initialized to some specific value.
> ? IMO probably the single best criterion for judging whether your API
> is good, is whether you can write clean and pretty docs for it. This
> fails that test horribly...
>
> We probably should advertise the ndarray constructor more, and
> possibly make it more generally useful, but the current situation for
> better or worse is that we've spent many years telling people that
> it's a weird low-level thing that they shouldn't use. (I didn't even
> know how it worked until 10 minutes ago!) Adding this functionality
> there means it'll still be hidden away, so it's not a great solution
> to the 'filled' problem, and it doesn't really move us any closer to
> having a coherent story on when you should use the ndarray constructor
> either.
>
> So IMO the best (least bad) solution on offer is still to just add a
> 'filled' function, and live with the np.ma inconsistency.
>
>

But then what do you call the ma version of the new `filled` and
`filled_like` functions?  Presumably they should be implemented as part of
the pull request.
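
(For concreteness: whatever the names end up being, the function under
discussion is essentially this short sketch.  `filled` here is the proposed
name, not an existing numpy API.)

```
import numpy as np

def filled(shape, fill_value, dtype=None, order='C'):
    # like np.empty, but with every entry initialized to fill_value
    a = np.empty(shape, dtype=dtype, order=order)
    a.fill(fill_value)
    return a
```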

Warren



> -n
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.filled, again

2013-06-12 Thread Warren Weckesser
On Wed, Jun 12, 2013 at 10:18 AM, Nathaniel Smith  wrote:

> On Wed, Jun 12, 2013 at 1:28 PM, Matthew Brett 
> wrote:
> > On Wed, Jun 12, 2013 at 1:10 PM, Nathaniel Smith  wrote:
> >> Personally I think that overloading np.empty is horribly ugly, will
> >> continue confusing newbies and everyone else indefinitely, and I'm
> >> 100% convinced that we'll regret implementing such a warty interface
> >> for something that should be so idiomatic. (Unfortunately I got busy
> >> and didn't actually say this in the previous thread though.)
> >
> > Maybe you could unpack this, as I seem to remember this was the option
> > with the most support previously.
>
> Indeed it was, which is why I brought it up :-).
>
> I'm not sure what more there is to unpack, though. It's just...
> offensive to every sense of API design I have, I don't know how to
> explain more than I have. I speculate that its only attraction is
> that it showed up at the end of a 50 email thread and offered the
> promise of ending things, but I don't know.
>
> Well, here's maybe another way of getting at the ugliness.
>
> Here's the current doc page listing all the options for creating an
> array -- a very useful reference, esp. while learning:
>   http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html
>
> Now imagine a new version of this page, if we add 'filled'. There will
> be a list at the top with functions named:
>   empty
>   filled
>   ones
>   zeros
> It's immediately obvious what all of these things do, and how they
> differ from each other, and in which situation you might want each,
> just from the names, even before you read the one-line synopses. Even
> more so if you know about the existence of np.fill(). The synopses for
> 'ones' and 'zeros' don't even have to change, they already use the
> word 'filled' to describe what they do. It all just works.
>
> Now imagine a different new version of this page, if we overload
> 'empty' to add a fill= option. I don't even know how we document that
> on this page. The list will remain:
>   empty
>   ones
>   zeros
> So there will be no clue there how you get an array filled with NaNs
> or whatever, or even any hint that it's possible. Then there's the
> prose on the right. Right now the synopsis for 'empty' is:
>   Return a new array of given shape and type, without initializing entries.
> I guess we change this to
>   Return a new array of given shape and type, without initializing
> entries, OR return a new array of given shape and type, with values
> initialized to some specific value.
> ? IMO probably the single best criterion for judging whether your API
> is good, is whether you can write clean and pretty docs for it. This
> fails that test horribly...
>
> We probably should advertise the ndarray constructor more, and
> possibly make it more generally useful, but the current situation for
> better or worse is that we've spent many years telling people that
> it's a weird low-level thing that they shouldn't use. (I didn't even
> know how it worked until 10 minutes ago!) Adding this functionality
> there means it'll still be hidden away, so it's not a great solution
> to the 'filled' problem, and it doesn't really move us any closer to
> having a coherent story on when you should use the ndarray constructor
> either.
>
> So IMO the best (least bad) solution on offer is still to just add a
> 'filled' function, and live with the np.ma inconsistency.
>
>

Another idea (also imperfect): call the new functions `filledwith` and
`filledwith_like`.  Not as concise as `filled`, but the meaning is still
clear, and it avoids the clash with `ma.filled`.

Warren



> -n
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy tests errors and failures

2013-06-04 Thread Warren Weckesser
On Tue, Jun 4, 2013 at 7:52 AM, Warren Weckesser  wrote:

>
> On Tue, Jun 4, 2013 at 1:20 AM, Tim Burgess  wrote:
>
>> On Sat, 2013-06-01 at 20:09 -0400, Warren Weckesser wrote:
>>
>>
>> > I'm using Ubuntu 12.04, so I suspect I won't be the only one who sees
>> > these.
>> >
>> gcc on 12.04 (precise) should be 4.6.3
>>
>
>> See
>>
>> http://packages.ubuntu.com/search?keywords=gcc&searchon=names&suite=precise&section=all
>>
>>
>
> Yes, that's what it is.   The python from the Anaconda package includes
> "[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2" in its banner, so I
> guess Anaconda is built with an even older gcc.
>
> If I build python myself (with ubuntu's gcc 4.6.3), I get the same failure
> and two errors that I originally reported.  In this case, I used python
> 3.3.1:
>
> $ python3 -c "import numpy as np; np.test('full')"
>
> Running unit tests for numpy
> NumPy version 1.8.0.dev-e9e490a
> NumPy is installed in
> /home/warren/local_numpy/lib/python3.3/site-packages/numpy
> Python version 3.3.1 (default, Apr 13 2013, 13:42:07) [GCC 4.6.3]
> nose version 1.2.1
>
> 
>
> ==
> ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose([1e-08, 1,
> 120.99], [0, nan, 100.0])
> --
> Traceback (most recent call last):
>   File
> "/home/warren/local_py331/lib/python3.3/site-packages/nose-1.2.1-py3.3.egg/nose/case.py",
> line 198, in runTest
> self.test(*self.arg)
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/tests/test_numeric.py",
> line 1253, in tst_isclose_allclose
>
> assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/numeric.py",
> line 2008, in allclose
>
> return all(less_equal(abs(x-y), atol + rtol * abs(y)))
> RuntimeWarning: invalid value encountered in absolute
>
> ==
> ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose(nan, [nan, nan,
> nan])
> --
> Traceback (most recent call last):
>   File
> "/home/warren/local_py331/lib/python3.3/site-packages/nose-1.2.1-py3.3.egg/nose/case.py",
> line 198, in runTest
> self.test(*self.arg)
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/tests/test_numeric.py",
> line 1253, in tst_isclose_allclose
>
> assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/numeric.py",
> line 2008, in allclose
>
> return all(less_equal(abs(x-y), atol + rtol * abs(y)))
> RuntimeWarning: invalid value encountered in absolute
>
> ==
> FAIL: Test numpy dot with different order C, F
> --
> Traceback (most recent call last):
>   File
> "/home/warren/local_py331/lib/python3.3/site-packages/nose-1.2.1-py3.3.egg/nose/case.py",
> line 198, in runTest
> self.test(*self.arg)
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/tests/test_blasdot.py",
> line 114, in test_dot_array_order
> assert_almost_equal(a.dot(a), a.T.dot(a.T).T, decimal=30)
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/testing/utils.py",
> line 458, in assert_almost_equal
>
> return assert_array_almost_equal(actual, desired, decimal, err_msg)
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/testing/utils.py",
> line 819, in assert_array_almost_equal
>
> header=('Arrays are not almost equal to %d decimals' % decimal))
>   File
> "/home/warren/local_numpy/lib/python3.3/site-packages/numpy/testing/utils.py",
> line 652, in assert_array_compare
>
> raise AssertionError(msg)
> AssertionError:
> Arrays are not almost equal to 30 decimals
>
> (mismatch 10.0%)
>  x: array([[ 0.60970883,  1.6909554 , -1.0885194 , -1.82058004,
> -3.95746616,
>  1.52435604, -0.59853062, -3.72278631,  3.82375941,  5.51367039],
>[-3.58154905, -2.0623123 , -0.06567267,  1.47373436,  2.60687462,...
>  y: array([[ 0.60970883,  1.6909554 , -1.0885194 , -1.82058004,
> -3.95746616,
>  1.52

Re: [Numpy-discussion] numpy tests errors and failures

2013-06-04 Thread Warren Weckesser
On Tue, Jun 4, 2013 at 1:20 AM, Tim Burgess  wrote:

> On Sat, 2013-06-01 at 20:09 -0400, Warren Weckesser wrote:
>
>
> > I'm using Ubuntu 12.04, so I suspect I won't be the only one who sees
> > these.
> >
> gcc on 12.04 (precise) should be 4.6.3
>

> See
>
> http://packages.ubuntu.com/search?keywords=gcc&searchon=names&suite=precise&section=all
>
>

Yes, that's what it is.   The python from the Anaconda package includes
"[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2" in its banner, so I
guess Anaconda is built with an even older gcc.

If I build python myself (with ubuntu's gcc 4.6.3), I get the same failure
and two errors that I originally reported.  In this case, I used python
3.3.1:

$ python3 -c "import numpy as np; np.test('full')"
Running unit tests for numpy
NumPy version 1.8.0.dev-e9e490a
NumPy is installed in
/home/warren/local_numpy/lib/python3.3/site-packages/numpy
Python version 3.3.1 (default, Apr 13 2013, 13:42:07) [GCC 4.6.3]
nose version 1.2.1



==
ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose([1e-08, 1,
120.99], [0, nan, 100.0])
--
Traceback (most recent call last):
  File
"/home/warren/local_py331/lib/python3.3/site-packages/nose-1.2.1-py3.3.egg/nose/case.py",
line 198, in runTest
self.test(*self.arg)
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/tests/test_numeric.py",
line 1253, in tst_isclose_allclose
assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/numeric.py",
line 2008, in allclose
return all(less_equal(abs(x-y), atol + rtol * abs(y)))
RuntimeWarning: invalid value encountered in absolute

==
ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose(nan, [nan, nan,
nan])
--
Traceback (most recent call last):
  File
"/home/warren/local_py331/lib/python3.3/site-packages/nose-1.2.1-py3.3.egg/nose/case.py",
line 198, in runTest
self.test(*self.arg)
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/tests/test_numeric.py",
line 1253, in tst_isclose_allclose
assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/numeric.py",
line 2008, in allclose
return all(less_equal(abs(x-y), atol + rtol * abs(y)))
RuntimeWarning: invalid value encountered in absolute

==
FAIL: Test numpy dot with different order C, F
--
Traceback (most recent call last):
  File
"/home/warren/local_py331/lib/python3.3/site-packages/nose-1.2.1-py3.3.egg/nose/case.py",
line 198, in runTest
self.test(*self.arg)
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/core/tests/test_blasdot.py",
line 114, in test_dot_array_order
assert_almost_equal(a.dot(a), a.T.dot(a.T).T, decimal=30)
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/testing/utils.py",
line 458, in assert_almost_equal
return assert_array_almost_equal(actual, desired, decimal, err_msg)
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/testing/utils.py",
line 819, in assert_array_almost_equal
header=('Arrays are not almost equal to %d decimals' % decimal))
  File
"/home/warren/local_numpy/lib/python3.3/site-packages/numpy/testing/utils.py",
line 652, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 30 decimals

(mismatch 10.0%)
 x: array([[ 0.60970883,  1.6909554 , -1.0885194 , -1.82058004, -3.95746616,
 1.52435604, -0.59853062, -3.72278631,  3.82375941,  5.51367039],
   [-3.58154905, -2.0623123 , -0.06567267,  1.47373436,  2.60687462,...
 y: array([[ 0.60970883,  1.6909554 , -1.0885194 , -1.82058004, -3.95746616,
 1.52435604, -0.59853062, -3.72278631,  3.82375941,  5.51367039],
   [-3.58154905, -2.0623123 , -0.06567267,  1.47373436,  2.60687462,...

--
Ran 5150 tests in 33.772s

FAILED (KNOWNFAIL=6, SKIP=18, errors=2, failures=1)


Warren



>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy tests errors and failures

2013-06-01 Thread Warren Weckesser
On Sat, Jun 1, 2013 at 8:56 PM, Warren Weckesser  wrote:

>
>
>
> On Sat, Jun 1, 2013 at 7:47 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Sat, Jun 1, 2013 at 4:50 PM, Warren Weckesser <
>> warren.weckes...@gmail.com> wrote:
>>
>>> I'm getting a failure and two errors with the latest master branch:
>>>
>>> $ python -c "import numpy; numpy.test('full')"
>>> Running unit tests for numpy
>>> NumPy version 1.8.0.dev-dff8c94
>>> NumPy is installed in
>>> /home/warren/local_numpy/lib/python2.7/site-packages/numpy
>>> Python version 2.7.4 |Anaconda 1.5.0 (64-bit)| (default, Apr 21 2013,
>>> 17:43:08) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
>>> nose version 1.3.0
>>>
>>> .F...S...S..KE.E.SSS...KK..K.

Re: [Numpy-discussion] numpy tests errors and failures

2013-06-01 Thread Warren Weckesser
On Sat, Jun 1, 2013 at 7:47 PM, Charles R Harris
wrote:

>
>
> On Sat, Jun 1, 2013 at 4:50 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> I'm getting a failure and two errors with the latest master branch:
>>
>> $ python -c "import numpy; numpy.test('full')"
>> Running unit tests for numpy
>> NumPy version 1.8.0.dev-dff8c94
>> NumPy is installed in
>> /home/warren/local_numpy/lib/python2.7/site-packages/numpy
>> Python version 2.7.4 |Anaconda 1.5.0 (64-bit)| (default, Apr 21 2013,
>> 17:43:08) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
>> nose version 1.3.0
>>
>> .F...S...S..KE.E.SSS...KK..K

Re: [Numpy-discussion] numpy tests errors and failures

2013-06-01 Thread Warren Weckesser
On Sat, Jun 1, 2013 at 7:47 PM, Charles R Harris
wrote:

>
>
> On Sat, Jun 1, 2013 at 4:50 PM, Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> I'm getting a failure and two errors with the latest master branch:
>>
>> $ python -c "import numpy; numpy.test('full')"
>> Running unit tests for numpy
>> NumPy version 1.8.0.dev-dff8c94
>> NumPy is installed in
>> /home/warren/local_numpy/lib/python2.7/site-packages/numpy
>> Python version 2.7.4 |Anaconda 1.5.0 (64-bit)| (default, Apr 21 2013,
>> 17:43:08) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
>> nose version 1.3.0
>>
>> .F...S...S..KE.E.SSS...KK..K

[Numpy-discussion] numpy tests errors and failures

2013-06-01 Thread Warren Weckesser
I'm getting a failure and two errors with the latest master branch:

$ python -c "import numpy; numpy.test('full')"
Running unit tests for numpy
NumPy version 1.8.0.dev-dff8c94
NumPy is installed in
/home/warren/local_numpy/lib/python2.7/site-packages/numpy
Python version 2.7.4 |Anaconda 1.5.0 (64-bit)| (default, Apr 21 2013,
17:43:08) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
nose version 1.3.0
.F...S...S..KE.E.SSS...KK..K...SS..SS.

Re: [Numpy-discussion] multivariate_normal issue with 'size' argument

2013-05-24 Thread Warren Weckesser
On 5/24/13, Peter Cock  wrote:
> On Fri, May 24, 2013 at 3:02 PM, Warren Weckesser
>  wrote:
>> On 5/24/13, Peter Cock  wrote:
>>>Warren wrote:
>>>> Two more data points:
>>>> On Ubuntu 12.04, using 64 bit builds of Python 2.7.4 (from Anaconda
>>>> 1.5.0), and numpy built from source: numpy 1.6.1 gives the error, but
>>>> 1.6.2 does not.
>>>>
>>>> Warren
>>>
>>> That's interesting - and matches my only success being with NumPy 1.6.2
>>>
>>> This suggests this was broken up to 1.6.1, but fixed in the 1.6.2 branch
>>> and not the 1.7 branch. Has anyone tried the current master branch?
>>>
>>
>> Sorry, I should have repeated my earlier report about 1.7.1.  My current
>> summary
>> (all using 64 bit python 2.7.4 from Anaconda 1.5):
>>
>> numpy 1.6.1 (built from source) fails.
>> numpy 1.6.2 (built from source) succeeds.
>> numpy 1.7.1 (Anaconda package) succeeds.
>>
>> Warren
>
> Was this the same numpy 1.7.1 you used earlier, or a different setup?
>

With python 2.7.4, numpy 1.7.1 from Anaconda and built from source succeeds.
But with python 3.3.1, numpy 1.7.1 built from source fails.

Warren


> 64 bit Linux machine, Python 2.7.4 compiled from source (recently),
> numpy freshly compiled from source today, making sure to remove
> the old numpy installation fully between installs - note run in this order,
> but I have tested some of these multiple times:
>
> numpy 1.4.1 (built from source) succeeds.
> numpy 1.5.1 (built from source) succeeds.
> numpy 1.6.0 (built from source) fails.
> numpy 1.6.1 (built from source) fails.
> numpy 1.6.2 (built from source) succeeds.
> numpy 1.7.0b2 (built from source) succeeds.
> numpy 1.7.0 (built from source) succeeds.
> numpy 1.7.1 (built from source) succeeds.
> numpy 1.8.0.dev-e11cd9b (built from source) succeeds.
>
> That all looks nice and neat, but according to my notes (and the
> earlier email) the old numpy 1.4.1 on this machine was failing.
> (I've erased that install now, but it was likely built with a
> older Python 2.7.x and/or slightly older gcc).
>
> Peter
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multivariate_normal issue with 'size' argument

2013-05-24 Thread Warren Weckesser
On 5/24/13, Warren Weckesser  wrote:
> On 5/24/13, Peter Cock  wrote:
>> On Fri, May 24, 2013 at 2:47 PM, Warren Weckesser
>>  wrote:
>>>
>>>Peter wrote:
>>>> ---
>>>> Successes
>>>> ---
>>>>
>>>> 64 bit Linux:
>>>>
>>>> $ python2.6
>>>> Python 2.6.6 (r266:84292, Sep 11 2012, 08:34:23)
>>>> [GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
>>>> ('7fff', True)
>>>>>>> import numpy as np
>>>>>>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
>>>>>>> size=1))
>>>> [[-0.27469218 -2.12911784]]
>>>>>>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
>>>>>>> size=np.int64(1)))
>>>> [[ 0.02609307  0.32485211]]
>>>>>>> np.__version__
>>>> '1.6.2'
>>>>>>> quit()
>>>>
>>>
>>> Peter: wow, that's a lot of tests!
>>
>> I try to keep a broad range on hand for testing my own code.
>>
>>> Two more data points:
>>> On Ubuntu 12.04, using 64 bit builds of Python 2.7.4 (from Anaconda
>>> 1.5.0), and numpy built from source: numpy 1.6.1 gives the error, but
>>> 1.6.2 does not.
>>>
>>> Warren
>>
>> That's interesting - and matches my only success being with NumPy 1.6.2
>>
> This suggests this was broken up to 1.6.1, but fixed in the 1.6.2 branch
> and not the 1.7 branch. Has anyone tried the current master branch?
>>
>
> Sorry, I should have repeated my earlier report about 1.7.1.  My current
> summary
> (all using 64 bit python 2.7.4 from Anaconda 1.5):
>
> numpy 1.6.1 (built from source) fails.
> numpy 1.6.2 (built from source) succeeds.
> numpy 1.7.1 (Anaconda package) succeeds.
>


Latest summary (all on 64 bit Ubuntu 12.04, and all Numpy packages are
built from source)

64 bit Python 2.7.4 (from Anaconda 1.5.0):
numpy 1.6.1 fails.
numpy 1.6.2 succeeds.
numpy 1.7.0 succeeds.
numpy 1.7.1 succeeds.

64 bit Python 3.3.1 (built from source):
numpy 1.7.1 fails.


Warren


> Warren
>
>> Peter
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multivariate_normal issue with 'size' argument

2013-05-24 Thread Warren Weckesser
On 5/24/13, Peter Cock  wrote:
> On Fri, May 24, 2013 at 2:47 PM, Warren Weckesser
>  wrote:
>>
>>Peter wrote:
>>> ---
>>> Successes
>>> ---
>>>
>>> 64 bit Linux:
>>>
>>> $ python2.6
>>> Python 2.6.6 (r266:84292, Sep 11 2012, 08:34:23)
>>> [GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
>>> ('7fff', True)
>>>>>> import numpy as np
>>>>>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
>>>>>> size=1))
>>> [[-0.27469218 -2.12911784]]
>>>>>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
>>>>>> size=np.int64(1)))
>>> [[ 0.02609307  0.32485211]]
>>>>>> np.__version__
>>> '1.6.2'
>>>>>> quit()
>>>
>>
>> Peter: wow, that's a lot of tests!
>
> I try to keep a broad range on hand for testing my own code.
>
>> Two more data points:
>> On Ubuntu 12.04, using 64 bit builds of Python 2.7.4 (from Anaconda
>> 1.5.0), and numpy built from source: numpy 1.6.1 gives the error, but
>> 1.6.2 does not.
>>
>> Warren
>
> That's interesting - and matches my only success being with NumPy 1.6.2
>
> This suggests this was broken up to 1.6.1, but fixed in the 1.6.2 branch
> and not the 1.7 branch. Has anyone tried the current master branch?
>

Sorry, I should have repeated my earlier report about 1.7.1.  My current summary
(all using 64 bit python 2.7.4 from Anaconda 1.5):

numpy 1.6.1 (built from source) fails.
numpy 1.6.2 (built from source) succeeds.
numpy 1.7.1 (Anaconda package) succeeds.

Warren

> Peter
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] multivariate_normal issue with 'size' argument

2013-05-24 Thread Warren Weckesser
On 5/24/13, Peter Cock  wrote:
> On Fri, May 24, 2013 at 2:15 PM, Robert Kern  wrote:
>> On Fri, May 24, 2013 at 9:12 AM, Peter Cock 
>> wrote:
>>> On Fri, May 24, 2013 at 1:59 PM, Emanuele Olivetti
>>>  wrote:
>>>> Interesting. Anyone able to reproduce what I observe?
>>>>
>>>> Emanuele
>>>
>>>
>>> Yes, I can reproduce this IndexError under Mac OS X:
>>>
>>> $ which python2.7
>>> /usr/bin/python2.7
>>> $ python2.7
>>> Python 2.7.2 (default, Oct 11 2012, 20:14:37)
>>> [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on
>>> darwin
>>> Type "help", "copyright", "credits" or "license" for more information.
>>
>> Can everyone please report whether they have a 32-bit build of Python
>> or a 64-bit build? That's probably the most relevant factor.
>
> It seems to affect all of 32 bit Windows XP, 64 bit Mac, 32 bit Linux,
> and 64 bit Linux
> for some versions of NumPy...  Thus far the only non-failure I've seen
> is 64 bit Linux,
> Python 2.6.6 with NumPy 1.6.2 (other Python/NumPy installs on this
> machine do fail).
>
> It's a bit strange - I don't see any obvious pattern.
>
> Peter
>
> ---
>
> Failures:
>
> My Python installs on this Mac all seem to be 64bit (and fail),
>
> $ python3.3
> Python 3.3.1 (default, Apr  8 2013, 17:54:08)
> [GCC 4.2.1 Compatible Apple Clang 4.0 ((tags/Apple/clang-421.0.57))] on
> darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
> 7fff True
> >>> import numpy as np
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> >>> size=1))
> [[ 1.80932387  0.85894164]]
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> >>> size=np.int64(1)))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mtrand.pyx", line 4161, in
> mtrand.RandomState.multivariate_normal
> (numpy/random/mtrand/mtrand.c:19140)
> IndexError: invalid index to scalar variable.
> >>> np.__version__
> '1.7.1'
> >>> quit()
>
> This also affects NumPy 1.5 so this isn't a recent regression:
>
> $ python3.2
> Python 3.2 (r32:88445, Feb 28 2011, 17:04:33)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
> 7fffffffffffffff True
> >>> import numpy as np
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=1))
> [[ 1.11403341 -1.67856405]]
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=np.int64(1)))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mtrand.pyx", line 3954, in
> mtrand.RandomState.multivariate_normal
> (numpy/random/mtrand/mtrand.c:17234)
> IndexError: invalid index to scalar variable.
> >>> np.__version__
> '1.5.0'
>
> $ python3.1
> Python 3.1.2 (r312:79147, Nov 15 2010, 16:28:52)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
> 7fffffffffffffff True
> >>> import numpy as np
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=1))
> [[ 0.3834108  -0.31124203]]
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=np.int64(1)))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mtrand.pyx", line 3954, in
> mtrand.RandomState.multivariate_normal
> (numpy/random/mtrand/mtrand.c:17234)
> IndexError: invalid index to scalar variable.
> >>> np.__version__
> '1.5.0'
> >>> quit()
>
> And on my 32 bit Windows XP box,
>
> Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys;print("%x" % sys.maxsize, sys.maxsize > 2**32)
> ('7fffffff', False)
> >>> import numpy as np
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=1))
> [[-0.35072523 -0.58046885]]
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=np.int64(1)))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mtrand.pyx", line 3954, in
> mtrand.RandomState.multivariate_normal
> (numpy\random\mtrand\mtrand.c:17234)
> IndexError: invalid index to scalar variable.
> >>> np.__version__
> '1.5.0'

>
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=1))
> [[-0.00453374  0.2210342 ]]
> >>> print(np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> ... size=np.int64(1)))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mtrand.pyx", line 4142, in
> mtrand.RandomState.mu

Re: [Numpy-discussion] multivariate_normal issue with 'size' argument

2013-05-24 Thread Warren Weckesser
On 5/24/13, Emanuele Olivetti  wrote:
> Interesting. Anyone able to reproduce what I observe?


Yes.  I'm also using Ubuntu 12.04.  With numpy 1.6.1, I get the same
error, but it works fine with numpy 1.7.1.

Warren


>
> Emanuele
>
> On 05/24/2013 02:09 PM, Nicolas Rougier wrote:
>>
>>
>> Works for me (numpy 1.7.1, osx 10.8.3):
>>
> import numpy as np
> print np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> size=1)
>> [[-0.55854737 -1.82631485]]
> print np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
> size=np.int64(1))
>> [[ 0.40274243 -0.33922682]]
>>
>>
>>
>> Nicolas
>>
>> On May 24, 2013, at 2:02 PM, Emanuele Olivetti 
>> wrote:
>>
>>> import numpy as np
>>> print np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
>>> size=1)
>>> print np.random.multivariate_normal(mean=np.zeros(2), cov=np.eye(2),
>>> size=np.int64(1))
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "Lists" and "Join" function needed

2013-05-04 Thread Warren Weckesser
On 5/4/13, Bakhtiyor Zokhidov  wrote:
>
> Hi,
> I have the following code which represents intersected point of each cell in
> the given two points, A(x0,y0) and B(x1,y1).
>
> def intersected_points(x0, x1, y0, y1):
>     # (assumes: from math import ceil, floor; from numpy import arange)
>     # slope
>     m = (y1 - y0) / (x1 - x0)
>     # Boundary of the selected points
>     x_ceil = ceil(min(x0, x1))
>     x_floor = floor(max(x0, x1))
>     y_ceil = ceil(min(y0, y1))
>     y_floor = floor(max(y0, y1))
>     # calculate all intersected x coordinates
>     ax = []
>     for x in arange(x_ceil, x_floor + 1, 1):
>         ax.append([x, m * (x - x0) + y0])
>     # calculate all intersected y coordinates
>     for y in arange(y_ceil, y_floor + 1, 1):
>         ax.append([(y - y0) * (1./m) + x0, y])
>     return ax
>
> Sample values: intersected_points(1.5,4.4,0.5,4.1)
> Output: [[2.0, 1.1206896551724137], [3.0, 2.3620689655172411], [4.0,
> 3.6034482758620685], [1.9029, 1.0], [2.7085, 2.0],
> [3.51388893, 3.0], [4.3196, 4.0]]
>
> The output I got is unsorted: the coordinates where the line crosses
> each cell boundary.
> BUT, the result I want should be in increasing order,
> like: (x0,y0), (x1,y1), (x2,y2), (x3,y3)
> where x0, y0 - initial, x1,y1 - final point. Other values are intersected
> line coordinates!
>
> Any answers will be appreciated,
>
> --  Bakhtiyor Zokhidov


You also asked this question on stackoverflow
(http://stackoverflow.com/questions/16377826/distance-for-each-intersected-points-of-a-line-in-increased-order-in-2d-coordina).
 I've posted an answer there.

Warren
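
For the archive, the sorting step is just ordering the points by their
distance from the start point; a minimal sketch, assuming the
intersected_points function quoted above:

import numpy as np
pts = np.array(intersected_points(1.5, 4.4, 0.5, 4.1))
d = np.hypot(pts[:, 0] - 1.5, pts[:, 1] - 0.5)   # distance from (x0, y0)
pts_sorted = pts[np.argsort(d)]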
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] nanmean(), nanstd() and other "missing" functions for 1.8

2013-05-01 Thread Warren Weckesser
On Wed, May 1, 2013 at 10:14 AM, Daπid  wrote:

> On 1 May 2013 03:36, Benjamin Root  wrote:
> > Are there any other functions that others feel are "missing" from numpy
> and
> > would like to see for v1.8?  Let's discuss them here.
>
> I would like to have sincos, to compute sin and cos of the same number
> faster. According to some benchmarks, it is barely slower than just
> computing one of them.
>
>

+1

Warren



>
> On 1 May 2013 07:13, Chris Barker - NOAA Federal 
> wrote:
> >> Of course, there's the np.minmax() discussed before.  My
> thinking is that it would return a 2xN array
> >
> > How about a tuple: (min, max)?
>
> Consider the case of np.minmax(matrix, axis=1), you will end up with a
> tuple of two arrays. In that scenario, you probably want to do
> computations with both numbers, so having them in an array seems more
> convenient.
>
> If there is enough reason, we could always add a "unpack=True" flag
> and then return a tuple.
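
For concreteness, a minimal sketch of the 2xN behavior being discussed
(this is a proposal, not an existing numpy function):

import numpy as np

def minmax(a, axis=None):
    # Stack the two reductions; the result has shape (2,) + reduced shape.
    return np.array([np.min(a, axis=axis), np.max(a, axis=axis)])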
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] int to binary

2013-04-29 Thread Warren Weckesser
On 4/29/13, josef.p...@gmail.com  wrote:
>  Is there an available function to convert an int to a binary
> representation as a sequence of 0s and 1s?
>
>
>  binary_repr produces strings and is not vectorized
>
> >>> np.binary_repr(5)
> '101'
> >>> np.binary_repr(5, width=4)
> '0101'
> >>> np.binary_repr(np.arange(5), width=4)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Python26\lib\site-packages\numpy\core\numeric.py", line
> 1732, in binary_repr
> if num < 0:
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all()
>
> 
> That's the best I could come up with in a few minutes:
>
>
> >>> k = 3;  int2bin(np.arange(2**k), k, roll=False)
> array([[ 0.,  0.,  0.],
>[ 1.,  0.,  0.],
>[ 0.,  0.,  1.],
>[ 1.,  0.,  1.],
>[ 0.,  1.,  0.],
>[ 1.,  1.,  0.],
>[ 0.,  1.,  1.],
>[ 1.,  1.,  1.]])
> >>> k = 3;  int2bin(np.arange(2**k), k, roll=True)
> array([[ 0.,  0.,  0.],
>[ 0.,  0.,  1.],
>[ 0.,  1.,  0.],
>[ 0.,  1.,  1.],
>[ 1.,  0.,  0.],
>[ 1.,  0.,  1.],
>[ 1.,  1.,  0.],
>[ 1.,  1.,  1.]])
>
> ---
> def int2bin(x, width, roll=True):
> x = np.atleast_1d(x)
> res = np.zeros(x.shape + (width,) )
> for i in range(width):
> x, r = divmod(x, 2)
> res[..., -i] = r
> if roll:
> res = np.roll(res, width-1, axis=-1)
> return res
>

Here's one way, in which each value is and'ed (with broadcasting) with
an array of values having a 1 in each consecutive bit.  The comparison
`!= 0` converts the values from powers of 2 to bools, and then
`astype(int)` converts those to 0s and 1s.  You'll probably want to
adjust how the reshaping is done to get the result the way you want it.

In [1]: x = array([0, 1, 2, 3, 15, 16])

In [2]: width = 5

In [3]: ((x.reshape(-1,1) & (2**arange(width))) != 0).astype(int)
Out[3]:
array([[0, 0, 0, 0, 0],
   [1, 0, 0, 0, 0],
   [0, 1, 0, 0, 0],
   [1, 1, 0, 0, 0],
   [1, 1, 1, 1, 0],
   [0, 0, 0, 0, 1]])
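
If you want the conventional most-significant-bit-first layout instead,
reversing the bit weights does it (same session):

In [4]: ((x.reshape(-1,1) & (2**arange(width)[::-1])) != 0).astype(int)
Out[4]:
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 1, 1],
       [0, 1, 1, 1, 1],
       [1, 0, 0, 0, 0]])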


Warren


>
> Josef
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Warren Weckesser
On 4/3/13, Benjamin Root  wrote:
> On Wed, Apr 3, 2013 at 7:52 PM, Chris Barker - NOAA Federal <
> chris.bar...@noaa.gov> wrote:
>
>>
>> Personally, I never need finer resolution than seconds, nor more than
>> a century, so it's no big deal to me, but just wondering
>>
>>
> A use case for finer resolution than seconds (in our field, no less!) is
> lightning data.  At the last SciPy conference,  a fellow meteorologist
> mentioned how difficult it was to plot out lightning data at resolutions
> finer than microseconds (which is the resolution of the python datetime
> objects).  Matplotlib has not supported the datetime64 object yet (John
> passed before he could write up that patch).
>
> Cheers!
> Ben
>
> By the way, my 12th Rule of Programming is "Never roll your own datetime"


A rule on par with "never get involved in a land war in Asia": both
equally Fraught With Peril. :)


Warren

>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add ability to disable the autogeneration of the function signature in a ufunc docstring.

2013-03-20 Thread Warren Weckesser
On Fri, Mar 15, 2013 at 4:39 PM, Nathaniel Smith  wrote:

> On Fri, Mar 15, 2013 at 6:47 PM, Warren Weckesser
>  wrote:
> > Hi all,
> >
> > In a recent scipy pull request (https://github.com/scipy/scipy/pull/459),
> I
> > ran into the problem of ufuncs automatically generating a signature in
> the
> > docstring using arguments such as 'x' or 'x1, x2'.  scipy.special has a
> lot
> > of ufuncs, and for most of them, there are much more descriptive or
> > conventional argument names than 'x'.  For now, we will include a nicer
> > signature in the added docstring, and grudgingly put up with the one
> > generated by the ufunc.  In the long term, it would be nice to be able to
> > disable the automatic generation of the signature.  I submitted a pull
> > request to numpy to allow that: https://github.com/numpy/numpy/pull/3149
> >
> > Comments on the pull request would be appreciated.
>
> The functionality seems obviously useful, but adding a magic public
> attribute to all ufuncs seems like a somewhat clumsy way to expose it?
> Esp. since ufuncs are always created through the C API, including
> docstring specification, but this can only be set at the Python level?
> Maybe it's the best option but it seems worth taking a few minutes to
> consider alternatives.
>


Agreed;  exposing the flag as part of the public Python ufunc API is
unnecessary, since this is something that would rarely, if ever, be changed
during the life of the ufunc.



> Brainstorming:
>
> - If the first line of the docstring starts with "(" and
> ends with ")", then that's a signature and we skip adding one (I think
> sphinx does something like this?) Kinda magic and implicit, but highly
> backwards compatible.
>
> - Declare that henceforth, the signature generation will be disabled
> by default, and go through and add a special marker like
> "__SIGNATURE__" to all the existing ufunc docstrings, which gets
> replaced (if present) by the automagically generated signature.
>
> - Give ufunc arguments actual names in general, that work for things
> like kwargs, and then use those in the automagically generated
> signature. This is the most work, but it would mean that people don't
> have to remember to update their non-magic signatures whenever numpy
> adds a new feature like out= or where=, and would make the docstrings
> actually accurate, which right now they aren't:
>
>
I'm leaning towards this option.  I don't know if there would still be a
need to disable the automatic generation of the docstring if it was good
enough.


In [7]: np.add.__doc__.split("\n")[0]
> Out[7]: 'add(x1, x2[, out])'
>
> In [8]: np.add(x1=1, x2=2)
> ValueError: invalid number of arguments
>
> - Allow some special syntax to describe the argument names in the
> docstring: "__ARGNAMES__: a b\n" -> "add(a, b[, out])"
>
> - Something else...
>
> -n
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Add ability to disable the autogeneration of the function signature in a ufunc docstring.

2013-03-15 Thread Warren Weckesser
Hi all,

In a recent scipy pull request (https://github.com/scipy/scipy/pull/459), I
ran into the problem of ufuncs automatically generating a signature in the
docstring using arguments such as 'x' or 'x1, x2'.  scipy.special has a lot
of ufuncs, and for most of them, there are much more descriptive or
conventional argument names than 'x'.  For now, we will include a nicer
signature in the added docstring, and grudgingly put up with the one
generated by the ufunc.  In the long term, it would be nice to be able to
disable the automatic generation of the signature.  I submitted a pull
request to numpy to allow that: https://github.com/numpy/numpy/pull/3149

Comments on the pull request would be appreciated.

Thanks,

Warren
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy 1.7.0 with Intel MKL 11.0.2.146

2013-03-10 Thread Warren Weckesser
On 3/10/13, Warren Weckesser  wrote:
> On 3/10/13, QT  wrote:
>> Dear all,
>>
>> I'm at my wits end.  I've followed Intel's own
>> instructions <http://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl>
>> on how to compile Numpy with Intel MKL.  Everything compiled and linked
>> fine and I've installed it locally in my user folder... There is one nasty
>> problem.  When one calls the numpy library to do some computation, it
>> does
>> not use all of the available threads.  I have 8 "cores" on my machine and
>> it only uses 4 of them.  The MKL_NUM_THREADS environmental variable can
>> be
>> set to tune the number of threads but setting it to 8 does not change
>> anything.  Indeed, setting it to 3 does limit the threads to 3... What is
>> going on?
>
>
> Does your computer have 8 physical cores, or 4 cores that look like 8
> because of hyperthreading?
>


Here's why I ask this: http://software.intel.com/en-us/forums/topic/294954
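
A quick way to see whether hyperthreading is in play (a sketch; the
multiprocessing count is of logical CPUs):

import multiprocessing
print multiprocessing.cpu_count()   # logical CPUs, includes hyperthreads
# On Linux, compare against the physical count, e.g.:
#     grep "cpu cores" /proc/cpuinfo | uniq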


> Warren
>
>
>>
>> As a comparison, the numpy (version 1.4.1, installed from yum, which uses
>> BLAS+ATLAS) uses all 8 threads.  I do not get this.
>>
>> You can run this test program
>>
>> python -mtimeit -s'import numpy as np; a = np.random.randn(1e3,1e3)'
>> 'np.dot(a, a)'
>>
>> There is one saving grace, the local numpy built with MKL is much faster
>> than the system's numpy.
>>
>> I hope someone can help me.  Searching the internet has been fruitless.
>>
>> Best,
>> Quyen
>>
>> My site.cfg for numpy (1.7.0)
>> [mkl]
>> library_dirs = /opt/intel/mkl/lib/intel64
>> include_dirs = /opt/intel/mkl/include
>> mkl_libs = mkl_rt
>> lapack_libs =
>>
>> I've edited line 37 of numpy/distutils/intelcompiler.py
>> self.cc_exe = 'icc -O3 -fPIC -fp-model strict -fomit-frame-pointer
>> -openmp
>> -parallel -DMKL_ILP64'
>>
>> Also line 54 of numpy/distutils/fcompiler/intel.py
>> return ['-i8 -xhost -openmp -fp-model strict']
>>
>> My .bash_profile also contains the lines:
>> source /opt/intel/bin/compilervars.sh intel64
>> source /opt/intel/mkl/bin/mklvars.sh intel64
>>
>> The above is needed to set the LD_LIBRARY_PATH so that Python can source
>> the intel dynamic library when numpy is called.
>>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy 1.7.0 with Intel MKL 11.0.2.146

2013-03-10 Thread Warren Weckesser
On 3/10/13, QT  wrote:
> Dear all,
>
> I'm at my wits end.  I've followed Intel's own
> instructions on
> how to compile Numpy with Intel MKL.  Everything compiled and linked
> fine and I've installed it locally in my user folder... There is one nasty
> problem.  When one calls the numpy library to do some computation, it does
> not use all of the available threads.  I have 8 "cores" on my machine and
> it only uses 4 of them.  The MKL_NUM_THREADS environmental variable can be
> set to tune the number of threads but setting it to 8 does not change
> anything.  Indeed, setting it to 3 does limit the threads to 3... What is
> going on?


Does your computer have 8 physical cores, or 4 cores that look like 8
because of hyperthreading?

Warren


>
> As a comparison, the numpy (version 1.4.1, installed from yum, which uses
> BLAS+ATLAS) uses all 8 threads.  I do not get this.
>
> You can run this test program
>
> python -mtimeit -s'import numpy as np; a = np.random.randn(1e3,1e3)'
> 'np.dot(a, a)'
>
> There is one saving grace, the local numpy built with MKL is much faster
> than the system's numpy.
>
> I hope someone can help me.  Searching the internet has been fruitless.
>
> Best,
> Quyen
>
> My site.cfg for numpy (1.7.0)
> [mkl]
> library_dirs = /opt/intel/mkl/lib/intel64
> include_dirs = /opt/intel/mkl/include
> mkl_libs = mkl_rt
> lapack_libs =
>
> I've edited line 37 of numpy/distutils/intelcompiler.py
> self.cc_exe = 'icc -O3 -fPIC -fp-model strict -fomit-frame-pointer -openmp
> -parallel -DMKL_ILP64'
>
> Also line 54 of numpy/distutils/fcompiler/intel.py
> return ['-i8 -xhost -openmp -fp-model strict']
>
> My .bash_profile also contains the lines:
> source /opt/intel/bin/compilervars.sh intel64
> source /opt/intel/mkl/bin/mklvars.sh intel64
>
> The above is needed to set the LD_LIBRARY_PATH so that Python can source
> the intel dynamic library when numpy is called.
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] step paramter for linspace

2013-03-01 Thread Warren Weckesser
On 3/1/13, Henry Gomersall  wrote:
> On Fri, 2013-03-01 at 13:34 +, Nathaniel Smith wrote:
>> > My usual hack to deal with the numerical bounds issue is to
>> add/subtract
>> > half the step.
>>
>> Right. Which is exactly the sort of annoying, content-free code that a
>> library is supposed to handle for you, so you can save mental energy
>> for more important things :-).
>
> I agree with the sentiment (I sometimes wish a library could read my
> mind ;) but putting this sort of logic into the library seems dangerous
> to me.
>
> The point is that the coder _should_ understand the subtleties of
> floating point numbers. IMO arange _should_ be well specified and
> actually operate on the half open interval; continuing to add a step
> until >= the limit is clear and always unambiguous.
>
> Unfortunately, the docs tell me that this isn't the case:
> "For floating point arguments, the length of the result is
>  ``ceil((stop - start)/step)``.  Because of floating point overflow,
>  this rule may result in the last element of `out` being greater
>  than `stop`."
>
> In my jet-lag-addled state, I can't see when this out[-1] > stop case
> will occur, but I can take it as true. It does seem to be problematic
> though.


Here you go:

In [32]: end = 2.2

In [33]: x = arange(0.1, end, 0.3)

In [34]: x[-1]
Out[34]: 2.2000000000000006

In [35]: x[-1] > end
Out[35]: True
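
And the "add/subtract half the step" hack mentioned earlier, as a sketch:

In [36]: x = arange(0.1, end + 0.3/2, 0.3)   # reliably *include* end

In [37]: len(x)
Out[37]: 8

In [38]: x = arange(0.1, end - 0.3/2, 0.3)   # reliably *exclude* end

In [39]: len(x)
Out[39]: 7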



Warren



>
> As soon as you allow freeform setting of the stop value, problems are
> going to be encountered. Who's to say that the stop - delta is actually
> _meant_ to be below the limit, or is meant to be the limit? Certainly
> not the library!
>
> It just seems to me that this will lead to lots of bad code in which the
> writer has glossed over an ambiguous case.
>
> Henry
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to Keep An Array Two Dimensional

2012-11-25 Thread Warren Weckesser
On Sun, Nov 25, 2012 at 8:24 PM, Tom Bennett wrote:

> Hi,
>
> I am trying to extract n columns from a 2D array and then operate on the
> extracted columns. Below is the code:
>
> A is an MxN 2D array.
>
> u = A[:,:n] #extract the first n columns from A
>
> B = np.dot(u, u.T) #take outer product.
>
> This code works when n>1. However, when n=1, u becomes a 1D array instead
> of an Mx1 2D array and the code breaks down.
>
> I wonder if there is any way to keep u=A[:,:n] an Mxn array no matter what
> value n takes. I do not want to use matrix because array is more convenient
> in other places.
>
>
Tom,

Your example works for me:

In [1]: np.__version__
Out[1]: '1.6.2'

In [2]: A = arange(15).reshape(3,5)

In [3]: A
Out[3]:
array([[ 0,  1,  2,  3,  4],
   [ 5,  6,  7,  8,  9],
   [10, 11, 12, 13, 14]])

In [4]: u = A[:,:1]

In [5]: u
Out[5]:
array([[ 0],
   [ 5],
   [10]])

In [6]: B = np.dot(u, u.T)

In [7]: B
Out[7]:
array([[  0,   0,   0],
   [  0,  25,  50],
   [  0,  50, 100]])
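
For what it's worth, the 1-D collapse usually comes from integer indexing
rather than slicing, so check for that; a quick illustration with the same A:

In [8]: A[:, 0].shape        # integer index: the dimension is dropped
Out[8]: (3,)

In [9]: A[:, 0:1].shape      # length-1 slice: still 2D
Out[9]: (3, 1)

In [10]: A[:, 0, np.newaxis].shape   # or re-add the axis explicitly
Out[10]: (3, 1)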



Warren



> Thanks,
> Tom
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] strange behavior of numpy.unique

2012-11-07 Thread Warren Weckesser
On Wed, Nov 7, 2012 at 11:24 AM,  wrote:

> On Tue, Nov 6, 2012 at 9:52 PM, Warren Weckesser
>  wrote:
> >
> >
> > On Tue, Nov 6, 2012 at 8:27 PM, Phillip Feldman
> >  wrote:
> >>
> >> numpy.unique behaves as I would expect for small inputs like the
> >> following:
> >>
> >> In [12]: x= [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
> >>
> >> In [13]: unique(x, return_index=True)
> >> Out[13]: (array([0, 1, 2, 3]), array([0, 2, 5, 9], dtype=int64))
> >>
> >> But, when I give it something larger, the return index values do not
> >> always correspond to the first occurrences in the input. The
> documentation
> >> is silent on the question of how the return index values are chosen
> when a
> >> given element of x appears more than once. Either the documentation
> should
> >> be
> >> clarified, or better yet, the behavior should be changed.
> >
> >
> >
> > In fact, it was changed (in the master branch on github) several months
> ago,
> > but there has not yet been a release with the changes.  The sort method
> that
> > np.unique passes to np.argsort is now 'mergesort', and the docstring
> states
> > that the indices returned are for the first occurrences of the unique
> > elements.  The new docstring is here:
> >
> http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.unique.html#numpy.unique
> >
> > See
> >
> https://github.com/numpy/numpy/commit/dbf235169ed3386b359caaa9217f5280bf1d6749
> > for the commit, and
> > https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py for
> the
> > latest version of the source.
>
> I think it's in 1.6.2 and it broke return_index for structured dtypes,
> IIRC.
>
>

You are correct, Josef, that change is in 1.6.2.  Thanks.

Warren


Josef
>
>
> >
> > Warren
> >
> >
> >>
> >> ___
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] strange behavior of numpy.unique

2012-11-06 Thread Warren Weckesser
On Tue, Nov 6, 2012 at 8:27 PM, Phillip Feldman  wrote:

> numpy.unique behaves as I would expect for small inputs like the following:
>
> In [12]: x= [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
>
> In [13]: unique(x, return_index=True)
> Out[13]: (array([0, 1, 2, 3]), array([0, 2, 5, 9], dtype=int64))
>
> But, when I give it something larger, the return index values do not
> always correspond to the first occurrences in the input. The documentation
> is silent on the question of how the return index values are chosen when a
> given element of x appears more than once. Either the documentation should
> be
> clarified, or better yet, the behavior should be changed.
>


In fact, it was changed (in the master branch on github) several months
ago, but there has not yet been a release with the changes.  The sort
method that np.unique passes to np.argsort is now 'mergesort', and the
docstring states that the indices returned are for the first occurrences of
the unique elements.  The new docstring is here:
http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.unique.html#numpy.unique

See
https://github.com/numpy/numpy/commit/dbf235169ed3386b359caaa9217f5280bf1d6749for
the commit, and
https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py for the
latest version of the source.
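
For releases without that change, the first-occurrence indices can be
computed directly with a stable sort; a sketch:

import numpy as np

x = np.array([0, 0, 1, 0, 1, 2, 0, 1, 2, 3])
order = np.argsort(x, kind='mergesort')    # stable, so ties keep input order
xs = x[order]
flag = np.concatenate(([True], xs[1:] != xs[:-1]))
values, first_index = xs[flag], order[flag]   # [0 1 2 3], [0 2 5 9]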

Warren



> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix norm

2012-10-22 Thread Warren Weckesser
On Mon, Oct 22, 2012 at 10:56 AM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

>
>
> On Mon, Oct 22, 2012 at 9:44 AM, Jason Grout 
> wrote:
>
>> I'm curious why scipy/numpy defaults to calculating the Frobenius norm
>> for matrices [1], when Matlab, Octave, and Mathematica all default to
>> calculating the induced 2-norm [2].  Is it solely because the Frobenius
>> norm is easier to calculate, or is there some other good mathematical
>> reason for doing things differently?
>>
>>
> Looks to me like Matlab, Octave, and Mathematica all default to the
> Frobenius norm.
>


Not octave:

octave-3.4.0:26> a = [1 2; 3 4]
a =

   1   2
   3   4


The default norm (for a 2x2 matrix) is the spectral norm:

octave-3.4.0:27> norm(a)
ans =  5.4650
octave-3.4.0:28> norm(a, 2)
ans =  5.4650
octave-3.4.0:29> svd(a)(1)
ans =  5.4650

not the Frobenius norm:

octave-3.4.0:30> norm(a, 'fro')
ans =  5.4772
octave-3.4.0:31> sqrt(sum(sum(a.**2)))
ans =  5.4772
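
For comparison, numpy's linalg.norm defaults to Frobenius, with ord=2
giving the spectral norm; a sketch with the same matrix:

import numpy as np
a = np.array([[1., 2.], [3., 4.]])
np.linalg.norm(a)        # Frobenius: sqrt(30), about 5.4772
np.linalg.norm(a, 2)     # largest singular value, about 5.4650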


Warren



> Chuck
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] set_printoptions precision and single floats

2012-10-06 Thread Warren Weckesser
On Sat, Oct 6, 2012 at 12:17 PM, Ralf Gommers wrote:

>
>
> On Fri, Oct 5, 2012 at 5:17 PM, Dan Goodman wrote:
>
>> Hi,
>>
>> numpy.set_printoptions(precision=...) doesn't affect single floats, even
>> if they are numpy floats rather than Python floats. Is this a bug or is
>> there some reason for this behaviour? I ask because I have a class that
>> derives from numpy.float64 and adds some extra information, and I'd like
>> to be able to control the precision. I could fix it to use the precision
>> set by numpy.set_printoptions, but then it would be inconsistent with
>> how numpy itself handles precision. Thoughts?
>>
>
> Do you mean scalars or arrays? For me set_printoptions only affects arrays
> and not scalars. Both float32 and float64 arrays work as advertised:
>
> In [28]: np.set_printoptions(precision=4)
>
> In [29]: np.array([np.float32(1.23456789101101110012345679)])
> Out[29]: array([ 1.2346], dtype=float32)
>
> In [30]: np.array([np.float64(1.23456789101101110012345679)])
> Out[30]: array([ 1.2346])
>
> In [31]: np.set_printoptions(precision=8)
>
> In [32]: np.array([np.float32(1.23456789101101110012345679)])
> Out[32]: array([ 1.23456788], dtype=float32)
>
> In [33]: np.array([np.float64(1.23456789101101110012345679)])
> Out[33]: array([ 1.23456789])
>
>
> But for scalars it doesn't work:
>
> In [34]: np.float32(1.23456789101101110012345679)
> Out[34]: 1.2345679
>
> In [35]: np.float64(1.23456789101101110012345679)
> Out[35]: 1.2345678910110112
>
> In [36]: np.set_printoptions(precision=4)
>
> In [37]: np.float32(1.23456789101101110012345679)
> Out[37]: 1.2345679
>
> In [38]: np.float64(1.23456789101101110012345679)
> Out[38]: 1.2345678910110112
>
>
> Ralf
>


It also does not affect zero-dimensional (i.e. scalar) arrays (e.g.
array(1.2345)):

In [1]: x = array(1./3)

In [2]: x
Out[2]: array(0.33333333333333331)

In [3]: set_printoptions(precision=3)

In [4]: x
Out[4]: array(0.33333333333333331)

In [5]: type(x)
Out[5]: numpy.ndarray


`y` is a 1-d array, so this works as expected:


In [6]: y = array([1./3])

In [7]: y
Out[7]: array([ 0.333])
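
A workaround for scalars and 0-d arrays is ordinary string formatting,
which bypasses the array printing machinery entirely:

In [8]: '%.3f' % x
Out[8]: '0.333'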


Warren



>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Obscure code in concatenate code path?

2012-09-13 Thread Warren Weckesser
On Thu, Sep 13, 2012 at 9:01 AM, Travis Oliphant wrote:

>
> On Sep 13, 2012, at 8:40 AM, Nathaniel Smith wrote:
>
> > On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett 
> wrote:
> >> Hi,
> >>
> >> While writing some tests for np.concatenate, I ran foul of this code:
> >>
> >>if (axis >= NPY_MAXDIMS) {
> >>ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays,
> NPY_CORDER);
> >>}
> >>else {
> >>ret = PyArray_ConcatenateArrays(narrays, arrays, axis);
> >>}
> >>
> >> in multiarraymodule.c
> >
> > How deeply weird
>
>
> This is expected behavior.



Heh, I guess "expected" is subjective:

In [23]: np.__version__
Out[23]: '1.6.1'

In [24]: a = zeros((2,2))

In [25]: b = ones((2,3))

In [26]: concatenate((a, b), axis=0)  # Expected error.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/Users/warren/gitwork/class-material/demo/pytables/
in ()
----> 1 concatenate((a, b), axis=0)  # Expected error.

ValueError: array dimensions must agree except for d_0

In [27]: concatenate((a, b), axis=1)   # Normal behavior.
Out[27]:
array([[ 0.,  0.,  1.,  1.,  1.],
   [ 0.,  0.,  1.,  1.,  1.]])

In [28]: concatenate((a, b), axis=2)   # Cryptic error message.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/Users/warren/gitwork/class-material/demo/pytables/
in ()
----> 1 concatenate((a, b), axis=2)   # Cryptic error message.

ValueError: bad axis1 argument to swapaxes

In [29]: concatenate((a, b), axis=32)   # What the... ?
Out[29]: array([ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.])


I would expect an error, consistent with the behavior when 1 < axis < 32.
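
For context, axis=32 is NPY_MAXDIMS, which is also how the Python-level
axis=None gets encoded, so the last call is presumably equivalent to the
documented flattening form:

In [30]: concatenate((a, b), axis=None)
Out[30]: array([ 0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.])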


Warren




> It's how the concatenate Python function manages to handle axis=None to
> flatten the arrays before concatenation.This has been in NumPy since
> 1.0 and should not be changed without deprecation warnings which I am -0 on.
>
> Now, it is true that the C-API could have been written differently (I
> think this is what Mark was trying to encourage) so that there are two
> C-API functions and they are dispatched separately from the
> array_concatenate method depending on whether or not a None is passed in.
> But, the behavior is documented and has been for a long time.
>
> Reference PyArray_AxisConverter (which turns a "None" Python argument into
> an axis=MAX_DIMS).   This is consistent behavior throughout the C-API.
>
> -Travis
>
>
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] sum and prod

2012-09-08 Thread Warren Weckesser
On Sat, Sep 8, 2012 at 4:56 PM, nicky van foreest wrote:

> Hi,
>
> I ran the following code:
>
> args = np.array([4,8])
> print np.sum( (arg > 0) for arg in args)
> print np.sum([(arg > 0) for arg in args])
> print np.prod( (arg > 0) for arg in args)
> print np.prod([(arg > 0) for arg in args])
>
> with this result:
>
> 2
> 1
>


I get 2 here, not 1 (numpy version 1.6.1).



> <generator object <genexpr> at 0x1c70410>
> 1
>
> Is the difference between prod and sum intentional? I would expect
> that  numpy.prod would also work on a generator, just like numpy.sum.
>


Whatever the correct result may be, I would expect them to have the same
behavior with respect to a generator argument.



> BTW: the last line does what I need: the product over the truth values
> of all elements of args. Is there perhaps a nicer (more concise) way to
> achieve this?  Thanks.
>


How about:

In [15]: np.all(args > 0)
Out[15]: True


 Warren




> Nicky
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.fromfunction() doesn't work as expected?

2012-07-20 Thread Warren Weckesser
On Thu, Jul 19, 2012 at 5:52 AM, Cheng Li wrote:

> Hi All,
>
>
>
> I have spotted some strange behavior of numpy.fromfunction(). The sample codes
> are as follows:
>
> >>>  import numpy as np
>
> >>>  def myOnes(i,j):
>
> ...  return 1.0
>
> >>>  a = np.fromfunction(myOnes,(2000,2000))
>
> >>>  a
>
> 1.0
>
>
>
> Actually what I expected was that the function would return a 2000*2000 2d
> array with unit values. The returned single float value really confused me.
> Is this a known bug? The numpy version I used is 1.6.1.
>
>
>



Your function will be called *once*, with arguments that are *arrays* of
coordinate values.  It must handle these arrays when it computes the values
of the array to be created.

To see what is happening, print the values of i and j from within your
function, e.g.:


In [57]: def ijsum(i, j):
   ....:     print "i =", i
   ....:     print "j =", j
   ....:     return i + j
   ....:

In [58]: fromfunction(ijsum, (3, 4))
i = [[ 0.  0.  0.  0.]
 [ 1.  1.  1.  1.]
 [ 2.  2.  2.  2.]]
j = [[ 0.  1.  2.  3.]
 [ 0.  1.  2.  3.]
 [ 0.  1.  2.  3.]]
Out[58]:
array([[ 0.,  1.,  2.,  3.],
   [ 1.,  2.,  3.,  4.],
   [ 2.,  3.,  4.,  5.]])


Your `myOnes` function will work if you modify it something like this:


In [59]: def myOnes(i, j):
   ....:     return np.ones(i.shape)
   ....:

In [60]: fromfunction(myOnes, (3, 4))
Out[60]:
array([[ 1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.],
   [ 1.,  1.,  1.,  1.]])



The bug is in the docstring for fromfunction.  In the description of the
`function` argument, it says "`function` must be capable of operating on
arrays, and should return a scalar value."  But the function should *not*
return a scalar value.  It should return an array of values appropriate for
the given arguments.
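
(And for the constant-array example that started the thread, no callback is
needed at all: np.ones((2000, 2000)) builds the array directly.)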

Warren




>
>
> Regards,
>
> Cheng
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] dot() function question

2012-06-27 Thread Warren Weckesser
On Wed, Jun 27, 2012 at 4:38 PM,  wrote:

> Hi list.
> I have got completely confused with the numpy.dot() function.
> dot(A,B) does:
> - matrix multiplication if A and B are of MxN and NxK size
> - dot product if A and B are of size M
> How can I perform matrix multiplication of two vectors?
> (in matlab I do it like a*a')
>


If 'a' is a 1D numpy array, you can use numpy.outer:


In [6]: a = array([1, -2, 3])

In [7]: outer(a, a)
Out[7]:
array([[ 1, -2,  3],
   [-2,  4, -6],
   [ 3, -6,  9]])
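
If you want to spell it as a matrix product, like Matlab's a*a', reshaping
to a column vector also works; a sketch with the same a:

In [8]: a2 = a.reshape(-1, 1)   # column vector, shape (3, 1)

In [9]: dot(a2, a2.T)
Out[9]:
array([[ 1, -2,  3],
       [-2,  4, -6],
       [ 3, -6,  9]])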


Warren



> Thanks.
> Petro.
>
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] not expected output of fill_diagonal

2012-06-09 Thread Warren Weckesser
On Fri, Jun 8, 2012 at 7:45 PM, Frédéric Bastien  wrote:

> Hi,
>
> While reviewing the Theano op that wraps numpy.fill_diagonal, we found
> an unexpected behavior of it:
>
> # as expected for square matrix
> >>> a=numpy.zeros((5,5))
> >>> numpy.fill_diagonal(a, 10)
> >>> print a
>
> # as expected long rectangular matrix
> >>> a=numpy.zeros((3,5))
> >>> numpy.fill_diagonal(a, 10)
> >>> print a
> [[ 10.   0.   0.   0.   0.]
>  [  0.  10.   0.   0.   0.]
>  [  0.   0.  10.   0.   0.]]
>
> # Not as expected
> >>> a=numpy.zeros((5,3))
> >>> numpy.fill_diagonal(a, 10)
> >>> print a
> [[ 10.   0.   0.]
>  [  0.  10.   0.]
>  [  0.   0.  10.]
>  [  0.   0.   0.]
>  [ 10.   0.   0.]]
>
>
> I can make a PR that will add a parameter wrap that allows control over
> whether it returns the old behavior or what I would expect in the last case:
> [[ 10.   0.   0.]
>  [  0.  10.   0.]
>  [  0.   0.  10.]
>  [  0.   0.   0.]
>  [  0.   0.   0.]]
>
> My question is: does someone else expect the current behavior? Should
> we change the default to be what I expect? Do you want us to warn if
> the user didn't specify which behavior, and change it in the future?
>
>

There is a ticket for this:

http://projects.scipy.org/numpy/ticket/1953

I agree that the behavior is unexpected and should be fixed.
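
In the meantime, a non-wrapping fill for the tall case can be sketched with
explicit index arrays:

import numpy as np
a = np.zeros((5, 3))
n = min(a.shape)
a[np.arange(n), np.arange(n)] = 10   # fills only the first min(m, k) entries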

Warren



> Anything else I didn't think of?
>
> thanks
>
> Fred
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] SciPy 2012 Abstract and Tutorial Deadlines Extended

2012-04-30 Thread Warren Weckesser
SciPy 2012 Conference Deadlines Extended

Didn't quite finish your abstract or tutorial yet?  Good news: the SciPy
2012 organizers have extended the deadline until Friday, May 4.  Proposals
for tutorials and abstracts for talks and posters are now due by midnight
(Austin time, CDT), May 4.

For the many of you who have already submitted an abstract or tutorial:
thanks!   If you need to make corrections to an abstract or tutorial that
you have already submitted, you may resubmit it by the same deadline.

The SciPy 2012 Organizers
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] SciPy 2012 - The Eleventh Annual Conference on Scientific Computing with Python

2012-04-27 Thread Warren Weckesser
Dear all,

(Sorry if you receive this announcement multiple times.)

Registration for SciPy 2012, the eleventh annual Conference on Scientific
Computing with Python, is open! Go to
https://conference.scipy.org/scipy2012/register/index.php

We would like to remind you that the submissions for talks, posters and
tutorials are open *until April 30th*, which is just around the corner. For
more information see:

http://conference.scipy.org/scipy2012/tutorials.php
http://conference.scipy.org/scipy2012/talks/index.php

For talks or posters, all we need is an abstract.  Tutorials require more
significant preparation.  If you are preparing a tutorial, please send a
brief note to Jonathan Rocher (jroc...@enthought.com) to indicate your
intent.

We look forward to seeing many of you this summer.

Kind regards,

The SciPy 2012 organizers
scipy2...@scipy.org



On Wed, Apr 4, 2012 at 4:30 PM, Warren Weckesser <
warren.weckes...@enthought.com> wrote:

> SciPy 2012, the eleventh annual Conference on Scientific Computing with
> Python, will be held July 16–21, 2012, in Austin, Texas.
>
> At this conference, novel scientific applications and libraries related to
> data acquisition, analysis, dissemination and visualization using Python
> are presented. Attended by leading figures from both academia and industry,
> it is an excellent opportunity to experience the cutting edge of scientific
> software development.
>
> The conference is preceded by two days of tutorials, during which
> community experts provide training on several scientific Python packages.
>  Following the main conference will be two days of coding sprints.
>
> We invite you to give a talk or present a poster at SciPy 2012.
>
> The list of topics that are appropriate for the conference includes (but
> is not limited to):
>
>- new Python libraries for science and engineering;
>- applications of Python in solving scientific or computational
>problems;
>- high performance, parallel and GPU computing with Python;
>- use of Python in science education.
>
>
>
> Specialized Tracks
>
> Two specialized tracks run in parallel to the main conference:
>
>- High Performance Computing with Python
>Whether your algorithm is distributed, threaded, memory intensive or
>latency bound, Python is making headway into the problem.  We are looking
>for performance driven designs and applications in Python.  Candidates
>include the use of Python within a parallel application, new architectures,
>and ways of making traditional applications execute more efficiently.
>
>
>- Visualization
>They say a picture is worth a thousand words--we’re interested in
>both!  Python provides numerous visualization tools that allow scientists
>to show off their work, and we want to know about any new tools and
>techniques out there.  Come show off your latest graphics, whether it’s an
>old library with a slick new feature, a new library out to challenge the
>status quo, or simply a beautiful result.
>
>
>
> Domain-specific Mini-symposia
>
> Mini-symposia on the following topics are also being organized:
>
>- Computational bioinformatics
>- Meteorology and climatology
>- Astronomy and astrophysics
>- Geophysics
>
>
>
> Talks, papers and posters
>
> We invite you to take part by submitting a talk or poster abstract.
>  Instructions are on the conference website:
>
> http://conference.scipy.org/scipy2012/talks.php
>
> Selected talks are included as papers in the peer-reviewed conference
> proceedings, to be published online.
>
>
> Tutorials
>
> Tutorials will be given July 16–17.  We invite instructors to submit
> proposals for half-day tutorials on topics relevant to scientific computing
> with Python.  See
>
>   http://conference.scipy.org/scipy2012/tutorials.php
>
> for information about submitting a tutorial proposal.  To encourage
> tutorials of the highest quality, the instructor (or team of instructors)
> is given a $1,000 stipend for each half day tutorial.
>
>
> Student/Community Scholarships
>
> We anticipate providing funding for students and for active members of the
> SciPy community who otherwise might not be able to attend the conference.
>  See
>
>   http://conference.scipy.org/scipy2012/student.php
