Re: [Numpy-discussion] Coverting ranks to a Gaussian

2008-06-10 Thread Anne Archibald
2008/6/9 Keith Goodman <[EMAIL PROTECTED]>:
> Does anyone have a function that converts ranks into a Gaussian?
>
> I have an array x:
>
> >>> import numpy as np
> >>> x = np.random.rand(5)
>
> I rank it:
>
> >>> x = x.argsort().argsort()
> >>> x_ranked = x.argsort().argsort()
> >>> x_ranked
>   array([3, 1, 4, 2, 0])
>
> I would like to convert the ranks to a Gaussian without using scipy.
> So instead of the equal distance between ranks in array x, I would
> like the distance between them to follow a Gaussian distribution.
>
> How far out in the tails of the Gaussian should 0 and N-1 (N=5 in the
> example above) be? Ideally, or arbitrarily, the areas under the
> Gaussian to the left of 0 (and the right of N-1) should be 1/N or
> 1/2N. Something like that. Or a fixed value is good too.

I'm actually not clear on what you need.

If what you need is for rank i of N to be the 100*i/N th percentile in
a Gaussian distribution, then you should indeed use scipy's functions
to accomplish that; I'd use scipy.stats.norm.ppf().

Of course, if your points were drawn from a Gaussian distribution,
they wouldn't be exactly 1/N apart, there would be some distribution.
Quite what the distribution of (say) the maximum or the median of N
points drawn from a Gaussian is, I can't say, though people have
looked at it. But if you want "typical" values, just generate N points
from a Gaussian and sort them:

V = np.random.randn(N)
V = np.sort(V)

return V[ranks]

Of course they will be different every time, but the distribution will be right.
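
A minimal sketch of the approach above, wrapped in a function so the `return` has somewhere to live. The name `gaussian_from_ranks` and the optional `rng` argument are assumptions made for illustration, not part of the original message; `np.random.default_rng` stands in for `np.random.randn` only so that the draw can be seeded.

```python
import numpy as np

def gaussian_from_ranks(ranks, rng=None):
    """Map integer ranks 0..N-1 onto N sorted standard-normal draws."""
    # Hypothetical helper: use a fresh generator unless one is supplied.
    rng = np.random.default_rng() if rng is None else rng
    ranks = np.asarray(ranks)
    # Draw N points from a standard Gaussian and sort them ascending.
    draws = np.sort(rng.standard_normal(ranks.size))
    # Rank 0 receives the smallest draw, rank N-1 the largest.
    return draws[ranks]

ranks = np.array([3, 1, 4, 2, 0])
values = gaussian_from_ranks(ranks, rng=np.random.default_rng(0))
```

The values differ from run to run unless a seeded generator is passed, but their ordering always matches the ranks.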

Anne
P.S. why the "no scipy" restriction? it's a bit unreasonable. -A
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Switching to nose test framework (was: NumpyTest problem)

2008-06-10 Thread Stéfan van der Walt
2008/6/10 Alan McIntyre <[EMAIL PROTECTED]>:
> Is the stuff Robert pointed out on a wiki page somewhere? It would be
> nice to have a "Welcome noob NumPy developer, here's how to do
> NumPy-specific development things," page. There may be such a page,
> but I just haven't stumbled across it yet.

We could add this info to the `numpy` docstring; that way new users
will have it available without having to search the web.

http://sd-2116.dedibox.fr/pydocweb/doc/numpy/

Regards
Stéfan


Re: [Numpy-discussion] Coverting ranks to a Gaussian

2008-06-10 Thread Keith Goodman
On Tue, Jun 10, 2008 at 12:56 AM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> 2008/6/9 Keith Goodman <[EMAIL PROTECTED]>:
>> Does anyone have a function that converts ranks into a Gaussian?
>>
>> I have an array x:
>>
>> >>> import numpy as np
>> >>> x = np.random.rand(5)
>>
>> I rank it:
>>
>> >>> x = x.argsort().argsort()
>> >>> x_ranked = x.argsort().argsort()
>> >>> x_ranked
>>   array([3, 1, 4, 2, 0])
>>
>> I would like to convert the ranks to a Gaussian without using scipy.
>> So instead of the equal distance between ranks in array x, I would
>> like the distance between them to follow a Gaussian distribution.
>>
>> How far out in the tails of the Gaussian should 0 and N-1 (N=5 in the
>> example above) be? Ideally, or arbitrarily, the areas under the
>> Gaussian to the left of 0 (and the right of N-1) should be 1/N or
>> 1/2N. Something like that. Or a fixed value is good too.
>
> I'm actually not clear on what you need.
>
> If what you need is for rank i of N to be the 100*i/N th percentile in
> a Gaussian distribution, then you should indeed use scipy's functions
> to accomplish that; I'd use scipy.stats.norm.ppf().
>
> Of course, if your points were drawn from a Gaussian distribution,
> they wouldn't be exactly 1/N apart, there would be some distribution.
> Quite what the distribution of (say) the maximum or the median of N
> points drawn from a Gaussian is, I can't say, though people have
> looked at it. But if you want "typical" values, just generate N points
> from a Gaussian and sort them:
>
> V = np.random.randn(N)
> V = np.sort(V)
>
> return V[ranks]
>
> Of course they will be different every time, but the distribution will be 
> right.

I guess I botched the description of my problem.

I have data that contains outliers and other noise. I am trying
various transformations of the data to preprocess it before plugging
it into my prediction algorithm. One such transformation is to rank
the data and then convert that rank to a Gaussian. The particular
details of the transformation don't matter. I just want something
smooth and normal-like.
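
One deterministic way to get such a transform without scipy, sketched under assumptions: it relies on `statistics.NormalDist` from the Python standard library (available from Python 3.8, so not an option when this thread was written), and it places rank i at probability (i + 0.5)/N, which leaves an area of 1/(2N) in each tail, one of the choices floated earlier in the thread. The function name `rank_gaussianize` is hypothetical.

```python
from statistics import NormalDist

import numpy as np

def rank_gaussianize(values):
    """Replace each value by the standard-normal quantile of its rank."""
    values = np.asarray(values, dtype=float)
    n = values.size
    # Double argsort gives the rank (0..n-1) of every element.
    ranks = values.argsort().argsort()
    # Offset by 0.5 so no rank maps to probability 0 or 1 (infinite quantile).
    probs = (ranks + 0.5) / n
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in probs])

z = rank_gaussianize([0.3, 0.1, 0.9, 0.2, 0.05])
```

The Python-level loop is fine for small arrays; for large ones a vectorized inverse CDF (for example the scipy function mentioned earlier) would be faster.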

> Anne
> P.S. why the "no scipy" restriction? it's a bit unreasonable. -A

I'd rather not pull in a scipy dependency for one function if there is
a numpy alternative. I think it is funny that you picked up on my
brief mention of scipy and called it unreasonable.


Re: [Numpy-discussion] Inplace shift

2008-06-10 Thread Keith Goodman
On Sat, Jun 7, 2008 at 6:48 PM, Anne Archibald
<[EMAIL PROTECTED]> wrote:
> 2008/6/7 Keith Goodman <[EMAIL PROTECTED]>:
>> On Fri, Jun 6, 2008 at 10:46 PM, Anne Archibald
>> <[EMAIL PROTECTED]> wrote:
>>> 2008/6/6 Keith Goodman <[EMAIL PROTECTED]>:
>>>> I'd like to shift the columns of a 2d array one column to the right.
>>>> Is there a way to do that without making a copy?
>>>>
>>>> This doesn't work:
>>>>
>>>> >> import numpy as np
>>>> >> x = np.random.rand(2,3)
>>>> >> x[:,1:] = x[:,:-1]
>>>> >> x
>>>>
>>>> array([[ 0.44789223,  0.44789223,  0.44789223],
>>>>        [ 0.80600897,  0.80600897,  0.80600897]])
>>>
>>> As a workaround you can use backwards slices:
>>>
>>> In [40]: x = np.random.rand(2,3)
>>>
>>> In [41]: x[:,:0:-1] = x[:,-2::-1]
>>>
>>> In [42]: x
>>> Out[42]:
>>> array([[ 0.20183084,  0.20183084,  0.08156887],
>>>   [ 0.30611585,  0.30611585,  0.79001577]])
>>
>> Neat. It makes sense to go backwards. Thank you.
>>
>>> Less painful for numpy developers but more painful for users is to
>>> warn them about the status quo: operations on overlapping slices can
>>> happen in arbitrary order.
>>
>> Now I'm confused. Could some corner case of memory layout cause numpy
>> to work from right to left, breaking the workaround? Or can I depend
>> on the workaround working with numpy 1.0.4?
>
> I'm afraid so. And it's not such a corner case as that: if the array
> is laid out in "C contiguous" order, you have to go backwards, while
> if the array is laid out in "FORTRAN contiguous" order you have to go
> forwards.
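
To see which layout a given array has, the `flags` and `strides` attributes can be inspected. A small sketch follows; the variable names are illustrative only.

```python
import numpy as np

c = np.random.rand(2, 3)   # freshly allocated arrays are C (row-major) by default
t = c.T                    # the transpose is a view onto the same memory

# The flags report which contiguous layout, if any, the array has.
c_is_c = c.flags.c_contiguous
t_is_f = t.flags.f_contiguous

# Strides (in bytes) show which axis is adjacent in memory.
c_strides = c.strides      # last axis varies fastest for C order
t_strides = t.strides      # first axis varies fastest for Fortran order
```

Because `t` is only a view, transposing changes the reported layout without moving any data.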

I think I'll end up using a code snippet like shift2:

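def shift(x):
    # Copy first so the source and destination never overlap.
    x2 = x.copy()
    x[:,1:] = x2[:,:-1]
    return x

def shift2(x):
    # Pick the slice direction that is safe for the memory layout.
    if x.flags.c_contiguous:
        x[:,:0:-1] = x[:,-2::-1]
    elif x.flags.f_contiguous:
        x[:,1:] = x[:,:-1]
    else:
        raise ValueError('x must be c_contiguous or f_contiguous')
    return x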

>> x = np.random.rand(2,3)
>> timeit shift(x)
10 loops, best of 3: 5.02 µs per loop
>> timeit shift2(x)
10 loops, best of 3: 4.75 µs per loop

>> x = np.random.rand(500,500)
>> timeit shift(x)
100 loops, best of 3: 2.51 ms per loop
>> timeit shift2(x)
1000 loops, best of 3: 1.62 ms per loop
>>
>> x = x.T
>>
>> timeit shift(x)
100 loops, best of 3: 4.17 ms per loop
>> timeit shift2(x)
1000 loops, best of 3: 348 µs per loop

f contiguous (x.T) is faster. How do I change an array from c
contiguous to f contiguous without changing the elements?
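
One possible answer, offered as a sketch rather than as the reply given on the list: `np.asfortranarray` returns an array with the same shape and element values but column-major memory. It has to copy when the input is not already Fortran-ordered, so the conversion itself is not in-place.

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)   # C-contiguous by default
xf = np.asfortranarray(x)          # same elements, Fortran (column-major) layout

same_values = np.array_equal(x, xf)
layout_changed = xf.flags.f_contiguous and not xf.flags.c_contiguous
```

Unlike `x.T`, which also yields a Fortran-contiguous array, this keeps the rows and columns where they were.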


Re: [Numpy-discussion] Coverting ranks to a Gaussian

2008-06-10 Thread Dag Sverre Seljebotn
Keith Goodman wrote:
> I'd rather not pull in a scipy dependency for one function if there is
> a numpy alternative. I think it is funny that you picked up on my
> brief mention of scipy and called it unreasonable.
>   
(I didn't follow this exact discussion, arguing from general principles 
here about SciPy dependencies.) Try to look at it this way:

NumPy may solve almost all your needs, and you only need, say, 0.1% of 
SciPy.

Assume, then, that the same statement is true about n other people. The 
problem then is that the 0.1% that each person needs from SciPy does not 
typically overlap. So as n grows larger, and assuming that everyone uses 
the same logic that you use, the amount of SciPy stuff that must be 
reimplemented on the NumPy discussion mailing list could become quite large.

(Besides, SciPy is open source, so you can always copy & paste the 
function from it if you only need one trivial bit of it. Not that I 
recommend doing that, but it's still better than reimplementing it 
yourself. Unless you're doing it for the learning experience of course.)

Dag Sverre


Re: [Numpy-discussion] problems building in cygwin under vmware

2008-06-10 Thread Chris Kees
The solution to this problem (roll back binutils to the previous  
cygwin version or fix numpy)  is here:

http://www.scipy.org/scipy/numpy/ticket/811

On Jun 9, 2008, at 2:41 PM, Chris Kees wrote:

> Hi,
>
> I'm getting an assembler error "Error: suffix or operands invalid for
> `fnstsw'" while trying to build numpy on cygwin (running under windows
> XP running on vmware on a mac pro). I've tried the last two releases
> of numpy and the svn version.  Has anybody ever seen this before?
>
> -Chris
>
> tail end of 'python setup.py build' output:
>
> compile options: '-g -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core/src
> -Inumpy/core/include -Ibuild/src.cygwin-1.5.25-i686-2.5/numpy/core
> -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.5
> -I/usr/include/python2.5 -c'
> gcc: build/src.cygwin-1.5.25-i686-2.5/numpy/core/src/umathmodule.c
> In file included from numpy/core/src/umathmodule.c.src:2183:
> numpy/core/src/ufuncobject.c: In function `_extract_pyvals':
> numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg
> (arg 4)
> numpy/core/src/ufuncobject.c:1164: warning: int format, long int arg
> (arg 5)
> /cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/cclP4Hfs.s: Assembler
> messages:
> /cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp/cclP4Hfs.s:72160: Error:
> suffix or operands invalid for `fnstsw'


[Numpy-discussion] Plans for Scipy Tutorials

2008-06-10 Thread Fernando Perez
Hi all,

I've now put up the near-final tutorial plans for SciPy 2008 here:

http://conference.scipy.org/tutorials

If your name is listed there and you disagree/can't make it, please
let me and Travis Oliphant know as soon as possible.

As the various presenters fine-tune their plan, we'll update the
details on each tutorial and provide links to pre-requisites,
installation pages, etc.  But the main topics are probably not going
to change now, barring any unforeseen event.

Regards,

f


Re: [Numpy-discussion] Cookbook/Documentation

2008-06-10 Thread Christopher Burns
Where is the "CookBookCategory"?  I'm afraid I don't understand that
reference below.  Are there plans to auto-generate the content in the
Cookbook Recipes (http://www.scipy.org/Cookbook) or is it still
reasonable for me to edit those pages?

Thanks,
Chris

On Mon, May 19, 2008 at 10:02 AM, Stéfan van der Walt <[EMAIL PROTECTED]> wrote:
> Hi Pierre
>
> 2008/5/19 Pierre GM <[EMAIL PROTECTED]>:
>> * I've just noticed that the page describing RecordArrays
>> (http://www.scipy.org/RecordArrays) is not listed under the Cookbook: should
>> this be changed ? Shouldn't there be at least a link in the documentation
>> page ?
>
> How about we add those pages to a CookBookCategory and auto-generate
> the Cookbook (like I've done with ProposedEnhancements)?
>
>> * I was eventually considering writing down some basic docs for MaskedArrays:
>> should I create a page under the Cookbook ? Elsewhere ?
>
> That's a good place for now.  Use ReST (on the wiki, use {{{#!rst
>  }}}), then we can incorporate your work into the
> user guide later.
>
> Regards
> Stéfan



-- 
Christopher Burns
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/


Re: [Numpy-discussion] Cookbook/Documentation

2008-06-10 Thread Stéfan van der Walt
Hi Chris

2008/6/11 Christopher Burns <[EMAIL PROTECTED]>:
> Where is the "CookBookCategory"?  I'm afraid I don't understand that
> reference below.  Are there plans to auto-generate the content in the
> Cookbook Recipes (http://www.scipy.org/Cookbook) or is it still
> reasonable for me to edit those pages?

The list is already generated at the bottom of Cookbook.  To make a
page show up there, add the following lines to the bottom (excluding
the triple-quotes):

"""

CategoryCookbook
"""

I suggest adding your page to the descriptive list at the top of Cookbook too.

Regards
Stéfan


Re: [Numpy-discussion] Cookbook/Documentation

2008-06-10 Thread Christopher Burns
Excellent, thanks Stefan!

On Tue, Jun 10, 2008 at 3:31 PM, Stéfan van der Walt <[EMAIL PROTECTED]> wrote:
> Hi Chris
>
> 2008/6/11 Christopher Burns <[EMAIL PROTECTED]>:
>> Where is the "CookBookCategory"?  I'm afraid I don't understand that
>> reference below.  Are there plans to auto-generate the content in the
>> Cookbook Recipes (http://www.scipy.org/Cookbook) or is it still
>> reasonable for me to edit those pages?
>
> The list is already generated at the bottom of Cookbook.  To make a
> page show up there, add the following lines to the bottom (excluding
> the triple-quotes):
>
> """
> 
> CategoryCookbook
> """
>
> I suggest adding your page to the descriptive list at the top of Cookbook too.
>
> Regards
> Stéfan


[Numpy-discussion] Large symmetrical matrix

2008-06-10 Thread Simon Palmer
Hi I have a problem which involves the creation of a large square matrix
which is zero across its diagonal and symmetrical about the diagonal i.e.
m[i,j] = m[j,i] and m[i,i] = 0.  So, in fact, it is a large triangular
matrix.  I was wondering whether there is any way of easily handling a
matrix of this shape without either incurring a memory penalty or a whole
whack of proprietary code?

To get through this I have implemented a 1D array which has n(n-1)/2
elements inside a wrapper class which manipulates the arguments of array
accessors with some arithmetic to return the appropriate value.  To be honest
I'd love to throw this away, but I haven't yet come across a feasible
alternative.

Any ideas?
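
For reference, the kind of wrapper described above can be sketched in a few lines of NumPy. Everything here is an assumption made for illustration: the class name `SymmetricZeroDiag`, the row-by-row packing of the strict lower triangle, and the `to_dense` helper are not taken from the original code. A strict lower triangle of an n x n matrix has n(n-1)/2 entries, which is what the sketch allocates.

```python
import numpy as np

class SymmetricZeroDiag:
    """Hypothetical packed storage for a symmetric matrix with zero diagonal."""

    def __init__(self, n, dtype=float):
        self.n = n
        # Only the strict lower triangle is stored, row by row.
        self.data = np.zeros(n * (n - 1) // 2, dtype=dtype)

    def _index(self, i, j):
        # Map (i, j) with i != j to a position in the packed array.
        if i < j:
            i, j = j, i
        return i * (i - 1) // 2 + j

    def __getitem__(self, key):
        i, j = key
        if i == j:
            return self.data.dtype.type(0)
        return self.data[self._index(i, j)]

    def __setitem__(self, key, value):
        i, j = key
        if i == j:
            raise IndexError("diagonal entries are fixed at zero")
        self.data[self._index(i, j)] = value

    def to_dense(self):
        # Expand to a full square array; np.tril_indices uses the same row order.
        full = np.zeros((self.n, self.n), dtype=self.data.dtype)
        full[np.tril_indices(self.n, k=-1)] = self.data
        return full + full.T

m = SymmetricZeroDiag(4)
m[1, 3] = 2.5
dense = m.to_dense()
```

Whether a wrapper like this is worthwhile depends on which operations are needed: element access is cheap, but most linear-algebra routines would still require the dense form.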


Re: [Numpy-discussion] Large symmetrical matrix

2008-06-10 Thread Robert Kern
On Tue, Jun 10, 2008 at 18:53, Simon Palmer <[EMAIL PROTECTED]> wrote:
> Hi I have a problem which involves the creation of a large square matrix
> which is zero across its diagonal and symmetrical about the diagonal i.e.
> m[i,j] = m[j,i] and m[i,i] = 0.  So, in fact, it is a large triangular
> matrix.  I was wondering whether there is any way of easily handling a
> matrix of this shape without either incurring a memory penalty or a whole
> whack of proprietary code?
>
> To get through this I have implemented a 1D array which has n(n-1)/2
> elements inside a wrapper class which manipulates the arguments of array
> accessors with some arithmetic to return the appropriate value.  To be honest
> I'd love to throw this away, but I haven't yet come across a feasible
> alternative.
>
> Any ideas?

What operations do you want to perform on this array?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco


Re: [Numpy-discussion] NumpyTest problem

2008-06-10 Thread David Huard
Charles,

This bug appeared after your change in r5217:

Index: numpytest.py
===
--- numpytest.py	(revision 5216)
+++ numpytest.py	(revision 5217)
@@ -527,7 +527,7 @@
 all_tests = unittest.TestSuite(suite_list)
 return all_tests

-def test(self, level=1, verbosity=1, all=False, sys_argv=[],
+def test(self, level=1, verbosity=1, all=True, sys_argv=[],
  testcase_pattern='.*'):
 """Run Numpy module test suite with level and verbosity.

Running
NumpyTest().test(all=False) works, but
NumpyTest().test(all=True) doesn't; that is, it finds 0 tests.

David


2008/6/2 Charles R Harris <[EMAIL PROTECTED]>:

>
>
> On Mon, Jun 2, 2008 at 9:20 AM, David Huard <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> There are 2 problems with NumpyTest
>>
>> 1. It fails if the command is given the file name only (without a
>> directory structure)
>>
>> E.g.:
>>
>> [EMAIL PROTECTED]:~/repos/numpy/numpy/tests$ python test_ctypeslib.py
>> Traceback (most recent call last):
>>   File "test_ctypeslib.py", line 87, in <module>
>> NumpyTest().run()
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 655, in run
>> testcase_pattern=options.testcase_pattern)
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 575, in test
>> level, verbosity)
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 453, in _test_suite_from_all_tests
>> importall(this_package)
>>   File "/usr/lib64/python2.5/site-packages/numpy/testing/numpytest.py",
>> line 681, in importall
>> for subpackage_name in os.listdir(package_dir):
>> OSError: [Errno 2] No such file or directory: ''
>> [EMAIL PROTECTED]:~/repos/numpy/numpy/tests$
>>
>>
>>
>> 2. It doesn't find tests it used to find:
>>
>> [EMAIL PROTECTED]:~/repos/numpy/numpy$ python tests/test_ctypeslib.py
>>
>
>
> There haven't been many changes to the tests.  Could you fool with
> numpy.test(level=10,all=0) and such to see what happens? All=1 is now the
> default.
>
> I've also seen test run some tests twice. I don't know what was up with
> that.
>
> Chuck
>
>
>


Re: [Numpy-discussion] NumpyTest problem

2008-06-10 Thread Charles R Harris
On Tue, Jun 10, 2008 at 8:49 PM, David Huard <[EMAIL PROTECTED]> wrote:

> Charles,
>
> This bug appeared after your change in r5217:
>
> Index: numpytest.py
> ===
> --- numpytest.py	(revision 5216)
> +++ numpytest.py	(revision 5217)
> @@ -527,7 +527,7 @@
>  all_tests = unittest.TestSuite(suite_list)
>  return all_tests
>
> -def test(self, level=1, verbosity=1, all=False, sys_argv=[],
> +def test(self, level=1, verbosity=1, all=True, sys_argv=[],
>   testcase_pattern='.*'):
>  """Run Numpy module test suite with level and verbosity.
>
> Running
> NumpyTest().test(all=False) works, but
> NumpyTest().test(all=True) doesn't; that is, it finds 0 tests.
>
> David
>

Yep, there seems to be a bug in test somewhere. Hmm, all is supposed to be
equivalent to level > 10 (but isn't), so I wonder if there is a conflict
with the level=1 default? But since we are moving to nose...

Chuck