[Numpy-discussion] PyArray_PutTo Question

2013-07-05 Thread Mark Janikas
Hi All,

I am a bit new to the NumPy C-API and I am having a hard time placing 
results into output arrays... I am using PyArray_TakeFrom to grab an input 
dimension of data, then do a calculation, then I want to pack it back into the 
output... yet the PutTo function does not have an axis argument like 
TakeFrom does... I am grabbing by column in a two-dimensional array and I would 
like to pack it back that way.  I know that I can build the result in reverse, 
pack the columns into rows, and then reshape the output... but I am wondering 
why PutTo does not behave exactly like TakeFrom does?... The Python 
implementation numpy.put also does not have the axis... so I guess I can see 
the one-to-one reason for the omission.  However, is building in reverse and 
reshaping the normal way to pack by column?
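At the Python level, the pattern being described might look like the following
sketch; it is illustrative only, not the C-API call sequence (the toy data, the
doubling operation, and the variable names are invented for the example):

import numpy as np

data = np.arange(12.0).reshape(3, 4)
out = np.empty_like(data)

# Column-wise take/compute/pack: fancy-index assignment plays the role
# an axis-aware PutTo would.
for j in range(data.shape[1]):
    col = np.take(data, j, axis=1)   # analogous to PyArray_TakeFrom along axis 1
    out[:, j] = col * 2.0            # pack the result back by column

# The "build in reverse and reshape" alternative from the question:
# stack the computed columns as rows, then transpose once at the end.
rows = np.array([np.take(data, j, axis=1) * 2.0 for j in range(data.shape[1])])
assert np.allclose(out, rows.T)

Both routes are common; the transpose route does one contiguous copy at the
end instead of strided writes per column.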

Thanks much!

MJ
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

2011-08-31 Thread Mark Janikas
Right indeed... I have spent a lot of time looking at this and it seems a waste 
of time, as the results are garbage anyway when the columns are collinear.  I 
am just going to set a threshold, check the condition number, continue if 
satisfied, return an error/warning if not... now, what is too large? I'll poke 
around.  TY!
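A sketch of what such a guard might look like (the 1/eps cutoff is just one
common rule of thumb, not a value from this thread):

import numpy as np

def safe_inv(x, cond_threshold=1.0 / np.finfo(float).eps):
    # Condition numbers approaching 1/eps mean the matrix is numerically
    # singular in double precision; the cutoff is an illustrative choice.
    c = np.linalg.cond(x)
    if c > cond_threshold:
        raise np.linalg.LinAlgError("ill-conditioned matrix (cond=%g)" % c)
    return np.linalg.inv(x)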

MJ 

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Pauli Virtanen
Sent: Wednesday, August 31, 2011 2:00 AM
To: numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

On Tue, 30 Aug 2011 15:48:18 -0700, Mark Janikas wrote:
 Last week I posted a question involving the identification of linearly
 dependent columns of a matrix... but now I am finding an interesting
 result based on the linalg.inv() function... sometimes I am able to
 invert a matrix that has linearly dependent columns and other times I get
 the LinAlgError()... this suggests that there is some kind of random
 component to the INV method.  Is this normal?

I suspect that this is a case of floating-point rounding errors.
Floating-point arithmetic is inexact, so even if a certain matrix
is singular in exact arithmetic, for a computer it may still be
invertible (by a given algorithm). This type of thing is not
unusual in floating-point computations.

The matrix condition number (`np.linalg.cond`) is a better measure
of whether a matrix is invertible or not.

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

2011-08-31 Thread Mark Janikas
When I say garbage, I mean in the context of my hypothesis testing when in the 
presence of perfect multicollinearity.  I advise the user of the combination 
that leads to the problem and move on...

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Bruce Southey
Sent: Wednesday, August 31, 2011 11:11 AM
To: numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

On 08/31/2011 12:56 PM, Mark Janikas wrote:
 Right indeed... I have spent a lot of time looking at this and it seems a 
 waste of time, as the results are garbage anyway when the columns are 
 collinear.  I am just going to set a threshold, check the condition number, 
 continue if satisfied, return an error/warning if not... now, what is too 
 large? I'll poke around.  TY!

 MJ
The results are not 'garbage' if you have collinear columns, as these 
have a very well-known and understandable meaning. But if you don't expect 
this then you really need to examine how you are modeling or measuring 
your data, because that is where the problem lies. For example, if you 
are measuring two variables then it means that those measurements are 
not independent as you are assuming.

Bruce

 -Original Message-
 From: numpy-discussion-boun...@scipy.org 
 [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Pauli Virtanen
 Sent: Wednesday, August 31, 2011 2:00 AM
 To: numpy-discussion@scipy.org
 Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

 On Tue, 30 Aug 2011 15:48:18 -0700, Mark Janikas wrote:
 Last week I posted a question involving the identification of linearly
 dependent columns of a matrix... but now I am finding an interesting
 result based on the linalg.inv() function... sometimes I am able to
 invert a matrix that has linearly dependent columns and other times I get
 the LinAlgError()... this suggests that there is some kind of random
 component to the INV method.  Is this normal?
 I suspect that this is a case of floating-point rounding errors.
 Floating-point arithmetic is inexact, so even if a certain matrix
 is singular in exact arithmetic, for a computer it may still be
 invertible (by a given algorithm). This type of thing is not
 unusual in floating-point computations.

 The matrix condition number (`np.linalg.cond`) is a better measure
 of whether a matrix is invertible or not.


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Question on LinAlg Inverse Algorithm

2011-08-30 Thread Mark Janikas
Hello All,

Last week I posted a question involving the identification of linearly dependent 
columns of a matrix... but now I am finding an interesting result based on the 
linalg.inv() function... sometimes I am able to invert a matrix that has linearly 
dependent columns and other times I get the LinAlgError()... this suggests that 
there is some kind of random component to the INV method.  Is this normal?  
Thanks much ahead of time,

MJ

Mark Janikas
Product Developer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
mjani...@esri.com

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

2011-08-30 Thread Mark Janikas
Working on it... Give me a few minutes to get you the data.  TY!

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Christopher 
Jordan-Squire
Sent: Tuesday, August 30, 2011 3:57 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

Can you give an example matrix? I'm not a numerical linear algebra
expert, but I suspect that if your matrix is singular (or nearly so,
in floating point) then any inverse given will look pretty wonky. Huge
determinant, eigenvalues, operator norm, etc..

-Chris JS

On Tue, Aug 30, 2011 at 5:48 PM, Mark Janikas mjani...@esri.com wrote:
 Hello All,



 Last week I posted a question involving the identification of linearly
 dependent columns of a matrix... but now I am finding an interesting result
 based on the linalg.inv() function... sometimes I am able to invert a matrix
 that has linearly dependent columns and other times I get the LinAlgError()...
 this suggests that there is some kind of random component to the INV
 method.  Is this normal?  Thanks much ahead of time,



 MJ



 Mark Janikas

 Product Developer

 ESRI, Geoprocessing

 380 New York St.

 Redlands, CA 92373

 909-793-2853 (2563)

 mjani...@esri.com



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

2011-08-30 Thread Mark Janikas
When I export to ASCII I am losing precision and it is getting consistent 
results... I will try a flat dump.  More to come.  TY

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Mark Janikas
Sent: Tuesday, August 30, 2011 4:02 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

Working on it... Give me a few minutes to get you the data.  TY!

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Christopher 
Jordan-Squire
Sent: Tuesday, August 30, 2011 3:57 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

Can you give an example matrix? I'm not a numerical linear algebra
expert, but I suspect that if your matrix is singular (or nearly so,
in floating point) then any inverse given will look pretty wonky. Huge
determinant, eigenvalues, operator norm, etc..

-Chris JS

On Tue, Aug 30, 2011 at 5:48 PM, Mark Janikas mjani...@esri.com wrote:
 Hello All,



 Last week I posted a question involving the identification of linearly
 dependent columns of a matrix... but now I am finding an interesting result
 based on the linalg.inv() function... sometimes I am able to invert a matrix
 that has linearly dependent columns and other times I get the LinAlgError()...
 this suggests that there is some kind of random component to the INV
 method.  Is this normal?  Thanks much ahead of time,



 MJ



 Mark Janikas

 Product Developer

 ESRI, Geoprocessing

 380 New York St.

 Redlands, CA 92373

 909-793-2853 (2563)

 mjani...@esri.com



 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

2011-08-30 Thread Mark Janikas
OK... so I have been using checksums to compare, and it looks like I am getting 
a different value when it fails as opposed to when it passes... i.e. the input 
is NOT the same.  When I save them to npy files and run LA.inv() I get 
consistent results.  Now I have to track down in my code why the inputs are 
different... Sucks, because I keep having to dive deeper (more checksums... 
yeh!).  But it is all linear algebra from the same input, so kinda weird that 
there is a divergence.  Thanks for all of your help! And I'll post again when I 
find the culprit (probably me :-)).
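One minimal way to checksum arrays bit-for-bit (the helper name is hypothetical;
tobytes() needs NumPy >= 1.9, older versions spell it tostring()):

import hashlib
import numpy as np

def array_checksum(a):
    # Hash dtype, shape, and the raw buffer so two arrays compare equal
    # only when their contents are bit-for-bit identical.
    a = np.ascontiguousarray(a)
    h = hashlib.md5()
    h.update(str(a.dtype).encode())
    h.update(str(a.shape).encode())
    h.update(a.tobytes())
    return h.hexdigest()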

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Robert Kern
Sent: Tuesday, August 30, 2011 4:42 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm

On Tue, Aug 30, 2011 at 17:48, Mark Janikas mjani...@esri.com wrote:
 Hello All,

 Last week I posted a question involving the identification of linearly
 dependent columns of a matrix… but now I am finding an interesting result
 based on the linalg.inv() function… sometimes I am able to invert a matrix
 that has linearly dependent columns and other times I get the LinAlgError()…
 this suggests that there is some kind of random component to the INV
 method.  Is this normal?  Thanks much ahead of time,

We will also need to know the platform that you are on as well as the
LAPACK library that you linked numpy against. It is the behavior of
that LAPACK library that is controlling here. Standard LAPACK does
sometimes use pseudorandom numbers in certain situations, but AFAICT
it deterministically seeds the PRNG on every call, and I don't think
it does this for any subroutine involved with inversion. But if you
use an optimized LAPACK from some vendor, I don't know what they may
be doing. Some optimized LAPACK/BLAS libraries may be threaded and may
dynamically determine how to break up the problem based on load (I
don't know of any that specifically do this, but it's a possibility).

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Identifying Colinear Columns of a Matrix

2011-08-26 Thread Mark Janikas
Hello All,

I am trying to identify columns of a matrix that are perfectly collinear.  It 
is not that difficult to identify when two columns are identical or have zero 
variance, but I do not know how to ID when the culprit is of a higher order, 
i.e. columns 1 + 2 + 3 = column 4.  NUM.corrcoef(matrix.T) will return NaNs 
when the matrix is singular, and LA.cond(matrix.T) will provide a very large 
condition number... But they do not tell me which columns are causing the 
problem.   For example:

zt = numpy.array([[ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
                  [ 0.25,  0.1 ,  0.2 ,  0.25,  0.5 ],
                  [ 0.75,  0.9 ,  0.8 ,  0.75,  0.5 ],
                  [ 3.  ,  8.  ,  0.  ,  5.  ,  0.  ]])

How can I identify that columns 0,1,2 are the issue because: column 1 + column 
2 = column 0?

Any input would be greatly appreciated.  Thanks much,

MJ

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

2011-08-26 Thread Mark Janikas
I actually use the VIF when the design matrix can be inverted... I do it the 
quick and dirty way as opposed to the step regression:

1. Calc the correlation coefficient of the matrix (w/o the intercept)
2. Return the diagonal of the inverse of the correlation matrix in step 1.

Again, the problem lies in the multiple-column relationship... I wouldn't be 
able to run sub-regressions at all when the columns are perfectly collinear.
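A minimal sketch of that two-step recipe (assuming x holds observations in
rows and variables in columns, without the intercept; the function name is
invented):

import numpy as np

def quick_vif(x):
    # VIFs as the diagonal of the inverse correlation matrix; this is
    # exactly what breaks (LinAlgError) under perfect collinearity.
    r = np.corrcoef(x, rowvar=False)
    return np.diag(np.linalg.inv(r))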

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Skipper Seabold
Sent: Friday, August 26, 2011 10:28 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas mjani...@esri.com wrote:
 Hello All,



 I am trying to identify columns of a matrix that are perfectly collinear.
 It is not that difficult to identify when two columns are identical or have
 zero variance, but I do not know how to ID when the culprit is of a higher
 order, i.e. columns 1 + 2 + 3 = column 4.  NUM.corrcoef(matrix.T) will
 return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide
 a very large condition number... But they do not tell me which columns are
 causing the problem.   For example:



 zt = numpy.array([[ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
                   [ 0.25,  0.1 ,  0.2 ,  0.25,  0.5 ],
                   [ 0.75,  0.9 ,  0.8 ,  0.75,  0.5 ],
                   [ 3.  ,  8.  ,  0.  ,  5.  ,  0.  ]])



 How can I identify that columns 0,1,2 are the issue because: column 1 +
 column 2 = column 0?



 Any input would be greatly appreciated.  Thanks much,


The way that I know to do this in a regression context for (near
perfect) multicollinearity is VIF. It's long been on my todo list for
statsmodels.

http://en.wikipedia.org/wiki/Variance_inflation_factor

Maybe there are other ways with decompositions. I'd be happy to hear about them.

Please post back if you write any code to do this.

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

2011-08-26 Thread Mark Janikas
I wonder if my last statement is essentially the only answer... which I wanted 
to avoid... 

Should I just use combinations of the columns and try and construct the 
corrcoef() (then ID whether NaNs are present), or use the condition number to 
ID the singularity?  I just wanted to avoid the whole k! algorithm.

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Mark Janikas
Sent: Friday, August 26, 2011 10:35 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

I actually use the VIF when the design matrix can be inverted... I do it the 
quick and dirty way as opposed to the step regression:

1. Calc the correlation coefficient of the matrix (w/o the intercept)
2. Return the diagonal of the inverse of the correlation matrix in step 1.

Again, the problem lies in the multiple-column relationship... I wouldn't be 
able to run sub-regressions at all when the columns are perfectly collinear.

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Skipper Seabold
Sent: Friday, August 26, 2011 10:28 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas mjani...@esri.com wrote:
 Hello All,



 I am trying to identify columns of a matrix that are perfectly collinear.
 It is not that difficult to identify when two columns are identical are have
 zero variance, but I do not know how to ID when the culprit is of a higher
 order. i.e. columns 1 + 2 + 3 = column 4.  NUM.corrcoef(matrix.T) will
 return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide
 a very large condition number.. But they do not tell me which columns are
 causing the problem.   For example:



 zt = numpy.array([[ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
                   [ 0.25,  0.1 ,  0.2 ,  0.25,  0.5 ],
                   [ 0.75,  0.9 ,  0.8 ,  0.75,  0.5 ],
                   [ 3.  ,  8.  ,  0.  ,  5.  ,  0.  ]])



 How can I identify that columns 0,1,2 are the issue because: column 1 +
 column 2 = column 0?



 Any input would be greatly appreciated.  Thanks much,


The way that I know to do this in a regression context for (near
perfect) multicollinearity is VIF. It's long been on my todo list for
statsmodels.

http://en.wikipedia.org/wiki/Variance_inflation_factor

Maybe there are other ways with decompositions. I'd be happy to hear about them.

Please post back if you write any code to do this.

Skipper
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

2011-08-26 Thread Mark Janikas
Charles!  That looks like it could be a winner!  It looks like you always 
choose the last column of the U matrix and ID the columns that have the same 
values?  It works when I add extra columns as well!  BTW, sorry for my lack of 
knowledge... but what was the point of the dot multiply at the end?  That they 
add up to essentially zero, indicating singularity?  Thanks so much!

MJ

From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Charles R Harris
Sent: Friday, August 26, 2011 11:04 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix


On Fri, Aug 26, 2011 at 11:41 AM, Mark Janikas mjani...@esri.com wrote:
I wonder if my last statement is essentially the only answer... which I wanted 
to avoid...

Should I just use combinations of the columns and try and construct the 
corrcoef() (then ID whether NaNs are present), or use the condition number to 
ID the singularity?  I just wanted to avoid the whole k! algorithm.

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Mark Janikas
Sent: Friday, August 26, 2011 10:35 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

I actually use the VIF when the design matrix can be inverted... I do it the 
quick and dirty way as opposed to the step regression:

1. Calc the correlation coefficient of the matrix (w/o the intercept)
2. Return the diagonal of the inverse of the correlation matrix in step 1.

Again, the problem lies in the multiple-column relationship... I wouldn't be 
able to run sub-regressions at all when the columns are perfectly collinear.

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Skipper Seabold
Sent: Friday, August 26, 2011 10:28 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix

On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas mjani...@esri.com wrote:
 Hello All,



 I am trying to identify columns of a matrix that are perfectly collinear.
 It is not that difficult to identify when two columns are identical or have
 zero variance, but I do not know how to ID when the culprit is of a higher
 order, i.e. columns 1 + 2 + 3 = column 4.  NUM.corrcoef(matrix.T) will
 return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide
 a very large condition number... But they do not tell me which columns are
 causing the problem.   For example:



 zt = numpy.array([[ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
                   [ 0.25,  0.1 ,  0.2 ,  0.25,  0.5 ],
                   [ 0.75,  0.9 ,  0.8 ,  0.75,  0.5 ],
                   [ 3.  ,  8.  ,  0.  ,  5.  ,  0.  ]])



 How can I identify that columns 0,1,2 are the issue because: column 1 +
 column 2 = column 0?



 Any input would be greatly appreciated.  Thanks much,


The way that I know to do this in a regression context for (near
perfect) multicollinearity is VIF. It's long been on my todo list for
statsmodels.

http://en.wikipedia.org/wiki/Variance_inflation_factor

Maybe there are other ways with decompositions. I'd be happy to hear about them.

Please post back if you write any code to do this.

Why not svd?

In [13]: u,d,v = svd(zt)

In [14]: d
Out[14]:
array([  1.01307066e+01,   1.87795095e+00,   3.03454566e-01,
         3.29253945e-16])

In [15]: u[:,3]
Out[15]: array([ 0.57735027, -0.57735027, -0.57735027,  0.        ])

In [16]: dot(u[:,3], zt)
Out[16]:
array([ -7.77156117e-16,  -6.66133815e-16,  -7.21644966e-16,
        -7.77156117e-16,  -8.88178420e-16])

Chuck
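To spell out the reading of Chuck's output: u[:,3] belongs to a (numerically)
zero singular value, so it is a left null vector of zt, and dot(u[:,3], zt)
being ~0 everywhere confirms that the rows of zt it weights (rows 0, 1, 2,
i.e. columns 0, 1, 2 of the original matrix) combine to zero. A self-contained
sketch along those lines (the rank tolerance and the 1e-8 entry cutoff are
illustrative choices, not values from the thread):

import numpy as np

zt = np.array([[ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ],
               [ 0.25,  0.1 ,  0.2 ,  0.25,  0.5 ],
               [ 0.75,  0.9 ,  0.8 ,  0.75,  0.5 ],
               [ 3.  ,  8.  ,  0.  ,  5.  ,  0.  ]])

u, d, v = np.linalg.svd(zt)
tol = d.max() * max(zt.shape) * np.finfo(float).eps  # standard rank cutoff
for k in np.where(d < tol)[0]:
    involved = np.where(np.abs(u[:, k]) > 1e-8)[0]   # rows in the dependency
    print "dependent set:", involved                 # -> [0 1 2]
    print "check:", np.dot(u[:, k], zt)              # ~0 across the board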

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Most efficient trim of arrays

2010-12-14 Thread Mark Janikas
Hello All,

I was wondering what the best way is to trim an array based on some values I do 
not want... I could use NUM.where or NUM.take... but let me give you an 
example:

import numpy as NUM
n = 100  # length of my dataset
data = NUM.empty((n,), float)
badRecords = []
for ind, record in enumerate(records):  # records: the incoming raw values
    if record == someValueIDontWant:
        badRecords.append(ind)
    else:
        data[ind] = record


Now, I want to trim my array using badRecords.  I guess I want to avoid 
copying.  Any thoughts on the best way to do it?  I do not want to use lists 
and then subsequently array the result, as it is nice to pre-allocate the space.
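If copying turns out to be unavoidable (removing arbitrary interior elements
cannot be done in place on a contiguous array), two standard one-liners, with
hypothetical inputs standing in for the loop above:

import numpy as np

data = np.arange(10.0)
badRecords = [2, 5]

trimmed = np.delete(data, badRecords)   # new trimmed array (a copy)

keep = np.ones(len(data), dtype=bool)   # boolean-mask variant, also a copy
keep[badRecords] = False
assert np.array_equal(data[keep], trimmed)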

Thanks much,

MJ


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Database with Nulls to Numpy Structure

2009-10-02 Thread Mark Janikas
Hello All,

I was hoping you could help me out with a simple little problem I am having:

I am reading data from a database that contains NULL values.  There is more 
than one field being read in, each of equal length, but if any of them are NULL 
in a row, then I do NOT want to include it in my numpy structure (i.e. no records 
for that row across fields).  As the values from each field are of the same 
type, I can pre-allocate the space for the entire dataset (if all were not 
NULL), but there may be fewer observations after accounting for the NULLs.  So, 
do I use lists and append, then create the arrays... or do I fill up the 
pre-allocated empty arrays and slice off the ends?  Thoughts?  Thanks much...
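A sketch of the fill-then-truncate variant (the row source and field count are
invented for the example):

import numpy as np

rows = [(1.0, 2.0), (None, 3.0), (4.0, 5.0)]   # hypothetical DB cursor output

data = np.empty((len(rows), 2), dtype=float)
good = 0
for row in rows:
    if any(v is None for v in row):
        continue                # drop the whole row on any NULL
    data[good] = row
    good += 1
data = data[:good]              # slice off the unused tail (a view)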

MJ

Mark Janikas
Product Engineer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
mjani...@esri.com
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Database with Nulls to Numpy Structure

2009-10-02 Thread Mark Janikas
Thanks for the input!  I wonder if I can resize my own record array?  I.e. one 
call to truncate... I'll give it a go.  But resize works great as it doesn't 
make a copy:

In [12]: a = NUM.arange(10)

In [13]: id(a)
Out[13]: 190182896

In [14]: a.resize(5,)

In [15]: a
Out[15]: array([0, 1, 2, 3, 4])

In [16]: id(a)
Out[16]: 190182896

Whereas slicing rebinds the name to a new array object (a view onto the same data, not a copy of it):

In [18]: a = a[0:2]

In [19]: id(a)
Out[19]: 189981184


Pretty nice.  Pre-allocate the full space and count the number of good records... 
then resize.  It doesn't seem that much faster than using lists and then creating 
arrays, but memory usage should be better.  Thanks again, and anything further 
would be appreciated.

MJ


-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Christopher Barker
Sent: Friday, October 02, 2009 12:34 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Database with Nulls to Numpy Structure

Mark Janikas wrote:
 So, do I use lists and 
 append then create the arrays... Or do I fill up the pre-allocated empty 
 arrays and slice off the ends?  Thoughts?  Thanks much...

Either will work. I think the decision would be based on how many Null 
records you expect -- if it's a small fraction then go ahead and 
pre-allocate the array, if it's a large fraction, then you might want to 
go with a list.

Note: you may be able to use arr.resize() to chop it off at the end.

The list method has the downside of using more memory, and being a bit 
slower, which may be mitigated if there are lots of null records.

See an upcoming email of mine for another option...

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Timing array construction

2009-04-30 Thread Mark Janikas
Thanks Eric!

I have a lot of array constructions in my code that use NUM.array([list of 
values])... I am going to replace them with the empty allocation and insertion.  
It is indeed twice as fast as c_ (when it matters, i.e. when N is relatively 
large):

N, c_, empty
100, 0.0007, 0.0230
200, 0.0007, 0.0002
400, 0.0007, 0.0002
800, 0.0020, 0.0002
1600, 0.0009, 0.0003
3200, 0.0010, 0.0003
6400, 0.0013, 0.0005
12800, 0.0058, 0.0032

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Eric Firing
Sent: Wednesday, April 29, 2009 11:49 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Timing array construction

Mark Janikas wrote:
 Hello All,

 I was exploring some different ways to concatenate arrays, and using
 c_ is the fastest by far.  Is there a difference I am missing that can
 account for the huge disparity?  Obviously the zip function makes the
 asarray and array calls slower, but the same arguments (xCoords,
 yCoords) are being passed to the methods... so if there is no difference
 in the outputs (there doesn't appear to be), then what reason would I
 have to use array or asarray in this context?  Thanks so much ahead
 of time..

If you really want speed, use something like this:

import numpy as np
def useEmpty(xCoords, yCoords):
 out = np.empty((len(xCoords), 2), dtype=xCoords.dtype)
 out[:,0] = xCoords
 out[:,1] = yCoords
 return out

It is quite a bit faster than using c_; more than a factor of two on my 
machine for all your test cases.

All your methods using zip and array are doing a lot of unpacking, 
repacking, checking, iterating... Even the c_ method is slower than it 
needs to be for this case because it is more general and flexible.

Eric
 
  
 
 MJ

 ## Snippet ###
 import numpy as NUM

 def useAsArray(xCoords, yCoords):
     return NUM.asarray(zip(xCoords, yCoords))

 def useArray(xCoords, yCoords):
     return NUM.array(zip(xCoords, yCoords))

 def useC(xCoords, yCoords):
     return NUM.c_[xCoords, yCoords]

 if __name__ == "__main__":
     from timeit import Timer
     import numpy.random as RAND
     import collections as COLL

     resAsArray = COLL.defaultdict(float)
     resArray = COLL.defaultdict(float)
     resMat = COLL.defaultdict(float)
     numTests = 0.0
     sameTests = 0.0
     N = [100, 200, 400, 800, 1600, 3200, 6400, 12800]
     for i in N:
         print "Time Join List into Array for N = " + str(i)
         xCoords = RAND.normal(10, 1, i)
         yCoords = RAND.normal(10, 1, i)

         statement = 'from __main__ import xCoords, yCoords, useAsArray'
         t1 = Timer('useAsArray(xCoords, yCoords)', statement)
         resAsArray[i] = t1.timeit(10)

         statement = 'from __main__ import xCoords, yCoords, useArray'
         t2 = Timer('useArray(xCoords, yCoords)', statement)
         resArray[i] = t2.timeit(10)

         statement = 'from __main__ import xCoords, yCoords, useC'
         t3 = Timer('useC(xCoords, yCoords)', statement)
         resMat[i] = t3.timeit(10)

     for n in N:
         print "%i, %0.4f, %0.4f, %0.4f" % (n, resAsArray[n], resArray[n], resMat[n])
 ###

 RESULT

 N, useAsArray, useArray, useC
 100, 0.0066, 0.0065, 0.0007
 200, 0.0137, 0.0140, 0.0008
 400, 0.0277, 0.0288, 0.0007
 800, 0.0579, 0.0577, 0.0008
 1600, 0.1175, 0.1289, 0.0009
 3200, 0.2291, 0.2309, 0.0012
 6400, 0.4561, 0.4564, 0.0013
 12800, 0.9218, 0.9122, 0.0019

 Mark Janikas
 Product Engineer
 ESRI, Geoprocessing
 380 New York St.
 Redlands, CA 92373
 909-793-2853 (2563)
 mjani...@esri.com
 
 
 
 
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Timing array construction

2009-04-30 Thread Mark Janikas
Thanks Chris and Bruce for the further input.  I kind of like the c_ method 
because it is still relatively speedy and easy to implement.  But the empty 
method seems to be closest to what is actually done no matter which direction 
you go in... i.e. preallocate space and insert.  I am in the process of ripping 
all of my zip calls out.  The profile of my first set of techniques is already 
significantly better.  This whole exercise has been very enlightening, as I 
spend so much time working on speeding up my algorithms, and simple things like 
this should be tackled first.  Thanks again!

MJ 

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Christopher Barker
Sent: Thursday, April 30, 2009 12:16 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Timing array construction

Mark Janikas wrote:
 I have a lot of array constructions in my code that use
 NUM.array([list of values])... I am going to replace it with the
 empty allocation and insertion.

It may not be worth it, depending on where list_of_values comes from/is. 
A rule of thumb may be: it's going to be slow going from a numpy array 
to a regular old python list or tuple, back to a numpy array. If your 
data is a python list already, then np.array(list) is a fine choice.


 def useAsArray(xCoords, yCoords):

 return NUM.asarray(zip(xCoords, yCoords))

Here are some of the issues with this one:

zip unpacks two generic python sequences and then puts the items into 
tuples, then puts them in a list. Essentially this:

new_list = []
for i in range(len(xCoords)):
 new_list.append((xCoords[i], yCoords[i]))


In each iteration of that loop, it's indexing into the numpy arrays, 
making a python object out of them, putting them into a tuple, and 
appending that tuple to the list, which may have to re-allocate memory a 
few times.

Then the np.array() call loops through that list, unpacks each tuple, 
examines the python object, decides what it is, and turn it into a raw 
c-type to put into the array.

whereas:

def useEmpty(xCoords, yCoords):
  out = np.empty((len(xCoords), 2), dtype=xCoords.dtype)
  out[:,0] = xCoords
  out[:,1] = yCoords
  return out

allocates an array the right size.
directly copies the data from xCoords and yCoords to it.

that's it.

You can see why it's so much faster!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Timing array construction

2009-04-29 Thread Mark Janikas
Hello All,

I was exploring some different ways to concatenate arrays, and using c_ is 
the fastest by far.  Is there a difference I am missing that can account for 
the huge disparity?  Obviously the zip function makes the asarray and 
array calls slower, but the same arguments (xCoords, yCoords) are being 
passed to the methods... so if there is no difference in the outputs (there 
doesn't appear to be), then what reason would I have to use array or 
asarray in this context?  Thanks so much ahead of time..

MJ

## Snippet ###
import numpy as NUM

def useAsArray(xCoords, yCoords):
return NUM.asarray(zip(xCoords, yCoords))

def useArray(xCoords, yCoords):
return NUM.array(zip(xCoords, yCoords))

def useC(xCoords, yCoords):
return NUM.c_[xCoords, yCoords]


if __name__ == "__main__":
from timeit import Timer
import numpy.random as RAND
import collections as COLL

resAsArray = COLL.defaultdict(float)
resArray = COLL.defaultdict(float)
resMat = COLL.defaultdict(float)
numTests = 0.0
sameTests = 0.0
N = [100, 200, 400, 800, 1600, 3200, 6400, 12800]
for i in N:
        print "Time Join List into Array for N = " + str(i)
xCoords = RAND.normal(10, 1, i)
yCoords = RAND.normal(10, 1, i)

statement = 'from __main__ import xCoords, yCoords, useAsArray'
t1 = Timer('useAsArray(xCoords, yCoords)', statement)
resAsArray[i] = t1.timeit(10)

statement = 'from __main__ import xCoords, yCoords, useArray'
t2 = Timer('useArray(xCoords, yCoords)', statement)
resArray[i] = t2.timeit(10)

statement = 'from __main__ import xCoords, yCoords, useC'
t3 = Timer('useC(xCoords, yCoords)', statement)
resMat[i] = t3.timeit(10)

for n in N:
        print "%i, %0.4f, %0.4f, %0.4f" % (n, resAsArray[n], resArray[n],
                                           resMat[n])
###

RESULT

N, useAsArray, useArray, useC
100, 0.0066, 0.0065, 0.0007
200, 0.0137, 0.0140, 0.0008
400, 0.0277, 0.0288, 0.0007
800, 0.0579, 0.0577, 0.0008
1600, 0.1175, 0.1289, 0.0009
3200, 0.2291, 0.2309, 0.0012
6400, 0.4561, 0.4564, 0.0013
12800, 0.9218, 0.9122, 0.0019


Mark Janikas
Product Engineer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
mjani...@esri.com
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Permutations in Simulations`

2009-02-10 Thread Mark Janikas
Hello All,

I want to create an array that contains a column of permutations for each 
simulation:

import numpy as NUM
import numpy.random as RAND
x = NUM.arange(4.)
res = NUM.zeros((4,100))

for sim in range(100):
res[:,sim] = RAND.permutation(x)


Is there a way to do this without a loop?  Thanks so much ahead of time...

MJ

Mark Janikas
Product Engineer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
mjani...@esri.com
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Permutations in Simulations`

2009-02-10 Thread Mark Janikas
Thanks to all for your replies.  I want this to work on any vector so I was 
thinking this...?

import numpy as np
import timeit
x = np.array([4.,5.,10.,3.,5.,6.,7.,2.,9.,1.])
nx = 10
ny = 100

def weirdshuffle4(x, ny):
nx = len(x)
indices = np.random.random_sample((nx,ny)).argsort(0).argsort(0)
return x[indices]

t = timeit.Timer("weirdshuffle4(x,ny)", "from __main__ import *")
print t.timeit(100)

0.0148663153873


-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Keith Goodman
Sent: Tuesday, February 10, 2009 12:59 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Permutations in Simulations`

On Tue, Feb 10, 2009 at 12:41 PM, Keith Goodman kwgood...@gmail.com wrote:
 On Tue, Feb 10, 2009 at 12:28 PM, Keith Goodman kwgood...@gmail.com wrote:
 On Tue, Feb 10, 2009 at 12:18 PM, Keith Goodman kwgood...@gmail.com wrote:
 On Tue, Feb 10, 2009 at 11:29 AM, Mark Janikas mjani...@esri.com wrote:
 I want to create an array that contains a column of permutations for each
 simulation:

 import numpy as NUM

 import numpy.random as RAND

 x = NUM.arange(4.)

 res = NUM.zeros((4,100))


 for sim in range(100):

 res[:,sim] = RAND.permutation(x)


 Is there a way to do this without a loop?  Thanks so much ahead of time.

 Does this work? Might not be faster but it does avoid the loop.

 import numpy as np

 def weirdshuffle(nx, ny):
x = np.ones((nx,ny)).cumsum(0, dtype=np.int) - 1
yidx = np.ones((nx,ny)).cumsum(1, dtype=np.int) - 1
xidx = np.random.rand(nx,ny).argsort(0).argsort(0)
return x[xidx, yidx]

 Hey, it is faster for nx=4, ny=100

 def baseshuffle(nx, ny):
x = np.arange(nx)
res = np.zeros((nx,ny))
for sim in range(ny):
res[:,sim] = np.random.permutation(x)
return res

 timeit baseshuffle(4,100)
 1000 loops, best of 3: 1.11 ms per loop
 timeit weirdshuffle(4,100)
 10000 loops, best of 3: 127 µs per loop

 OK, who can cut that time in half? My first try looks clunky.

 This is a little faster:

 def weirdshuffle2(nx, ny):
one = np.ones((nx,ny), dtype=np.int)
x = one.cumsum(0)
x -= 1
yidx = one.cumsum(1)
yidx -= 1
xidx = np.random.random_sample((nx,ny)).argsort(0).argsort(0)
return x[xidx, yidx]

 timeit weirdshuffle(4,100)
 10000 loops, best of 3: 129 µs per loop
 timeit weirdshuffle2(4,100)
 10000 loops, best of 3: 106 µs per loop

Sorry for all the mail.

def weirdshuffle3(nx, ny):
return np.random.random_sample((nx,ny)).argsort(0).argsort(0)

 timeit weirdshuffle(4,100)
10000 loops, best of 3: 128 µs per loop
 timeit weirdshuffle3(4,100)
10000 loops, best of 3: 37.5 µs per loop
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Permutations in Simulations`

2009-02-10 Thread Mark Janikas
You are correct!  Thanks to all!

MJ

-Original Message-
From: numpy-discussion-boun...@scipy.org 
[mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Keith Goodman
Sent: Tuesday, February 10, 2009 6:07 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Permutations in Simulations`

Yeah, good point. The second argsort isn't needed. That should speed things up.

The double argsort ranks the values in the array. But we don't need that here.
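With that second argsort dropped, the vectorized column-permutation helper
collapses to the following (a sketch; the function name is invented):

import numpy as np

def colperms(x, ny):
    # argsort of independent uniform keys yields one unbiased random
    # permutation of range(len(x)) per column; ranking those permutations
    # again (the second argsort) was unnecessary for shuffling.
    idx = np.random.random_sample((len(x), ny)).argsort(axis=0)
    return x[idx]

res = colperms(np.arange(4.0), 100)   # shape (4, 100), one permutation per column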

On Tue, Feb 10, 2009 at 5:31 PM,  josef.p...@gmail.com wrote:
 very nice. What's the purpose of the second  `.argsort(0)` ? Doesn't
 it also work without it, or am I missing something in how this works?

 Josef

 On 2/10/09, Mark Janikas mjani...@esri.com wrote:
 Thanks to all for your replies.  I want this to work on any vector so I was
 thinking this...?

 import numpy as np
 import timeit
 x = np.array([4.,5.,10.,3.,5.,6.,7.,2.,9.,1.])
 nx = 10
 ny = 100

 def weirdshuffle4(x, ny):
 nx = len(x)
 indices = np.random.random_sample((nx,ny)).argsort(0).argsort(0)
 return x[indices]

 t = timeit.Timer("weirdshuffle4(x,ny)", "from __main__ import *")
 print t.timeit(100)

 0.0148663153873


 -Original Message-
 From: numpy-discussion-boun...@scipy.org
 [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Keith Goodman
 Sent: Tuesday, February 10, 2009 12:59 PM
 To: Discussion of Numerical Python
 Subject: Re: [Numpy-discussion] Permutations in Simulations`

 On Tue, Feb 10, 2009 at 12:41 PM, Keith Goodman kwgood...@gmail.com wrote:
 On Tue, Feb 10, 2009 at 12:28 PM, Keith Goodman kwgood...@gmail.com
 wrote:
 On Tue, Feb 10, 2009 at 12:18 PM, Keith Goodman kwgood...@gmail.com
 wrote:
 On Tue, Feb 10, 2009 at 11:29 AM, Mark Janikas mjani...@esri.com
 wrote:
 I want to create an array that contains a column of permutations for
 each
 simulation:

 import numpy as NUM

 import numpy.random as RAND

 x = NUM.arange(4.)

 res = NUM.zeros((4,100))


 for sim in range(100):

 res[:,sim] = RAND.permutation(x)


 Is there a way to do this without a loop?  Thanks so much ahead of
 time.

 Does this work? Might not be faster but it does avoid the loop.

 import numpy as np

 def weirdshuffle(nx, ny):
x = np.ones((nx,ny)).cumsum(0, dtype=np.int) - 1
yidx = np.ones((nx,ny)).cumsum(1, dtype=np.int) - 1
xidx = np.random.rand(nx,ny).argsort(0).argsort(0)
return x[xidx, yidx]

 Hey, it is faster for nx=4, ny=100

 def baseshuffle(nx, ny):
x = np.arange(nx)
res = np.zeros((nx,ny))
for sim in range(ny):
res[:,sim] = np.random.permutation(x)
return res

 timeit baseshuffle(4,100)
 1000 loops, best of 3: 1.11 ms per loop
 timeit weirdshuffle(4,100)
 10000 loops, best of 3: 127 µs per loop

 OK, who can cut that time in half? My first try looks clunky.

 This is a little faster:

 def weirdshuffle2(nx, ny):
one = np.ones((nx,ny), dtype=np.int)
x = one.cumsum(0)
x -= 1
yidx = one.cumsum(1)
yidx -= 1
xidx = np.random.random_sample((nx,ny)).argsort(0).argsort(0)
return x[xidx, yidx]

 timeit weirdshuffle(4,100)
 10000 loops, best of 3: 129 µs per loop
 timeit weirdshuffle2(4,100)
 10000 loops, best of 3: 106 µs per loop

 Sorry for all the mail.

 def weirdshuffle3(nx, ny):
 return np.random.random_sample((nx,ny)).argsort(0).argsort(0)

 timeit weirdshuffle(4,100)
 10000 loops, best of 3: 128 µs per loop
 timeit weirdshuffle3(4,100)
 10000 loops, best of 3: 37.5 µs per loop
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] appending extra items to arrays

2007-10-11 Thread Mark Janikas
If you do not know the size of your array before you finalize it, then
you should use lists whenever you can.  I just cooked up a short
example:

##
import timeit
import numpy as N

values = range(10000)

def appendArray(values):
result = N.array([], dtype=int)
for value in values:
result = N.append(result, value)
return result

def appendList(values):
result = []
for value in values:
result.append(value)
return N.array(result)

test = timeit.Timer('appendArray(values)',
'from __main__ import appendArray, values')
t1 = test.timeit(number=10)

test2 = timeit.Timer('appendList(values)',
'from __main__ import appendList, values')
t2 = test2.timeit(number=10)

print "Total Time with array: " + str(t1)
print "Total Time with list: " + str(t2)

# Result #
Total Time with array: 2.12951189331
Total Time with list: 0.0469707035741
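A third option, not benchmarked in this thread, sits between the two:
np.fromiter builds the array straight from an iterator, skipping the
intermediate Python list entirely (sketch under the same setup):

import numpy as N

values = range(10000)

# count= pre-allocates exactly when the length is known; without it,
# fromiter grows its buffer internally as it consumes the iterator.
result = N.fromiter(values, dtype=int, count=len(values))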



Hope this helps,

MJ

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Adam Mercer
Sent: Thursday, October 11, 2007 7:42 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] appending extra items to arrays

On 11/10/2007, Robert Kern [EMAIL PROTECTED] wrote:

 Appending to a list then converting the list to an array is the most
 straightforward way to do it. If the performance of this isn't a
problem, I
 recommend leaving it alone.

Thanks, I'll leave it as is - I was just wondering if there was a
better way to do it.

Cheers

Adam
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy and freeze.py

2007-05-22 Thread Mark Janikas
I can't be sure if your issue is related to mine, so I was wondering
where/when you got your numpy build?

My issue:
http://projects.scipy.org/pipermail/numpy-discussion/2007-April/027000.html

Travis has been kind enough to work with me on it.  His changes are in
the svn.  So, I don't think this is an issue that has arisen due to the
changes unless you have checked numpy out recently and compiled it
yourself.

MJ

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Hanno Klemm
Sent: Tuesday, May 22, 2007 9:04 AM
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] numpy and freeze.py


Hi,

I want to use freeze.py on code that heavily relies on numpy. If I
just try 

python2.5 /scratch/src/Python-2.5/Tools/freeze/freeze.py pylay.py

the make works but then I get the error:

Traceback (most recent call last):
  File "pylay.py", line 1, in <module>
    import kuvBeta4 as kuv
  File "kuvBeta4.py", line 6, in <module>
    import mfunBeta4 as mfun
  File "mfunBeta4.py", line 2, in <module>
    import numpy
  File "/glb/eu/siep_bv/proj/yot04/apps/python2.5/lib/python2.5/site-packages/numpy/__init__.py", line 39, in <module>
    import core
  File "/glb/eu/siep_bv/proj/yot04/apps/python2.5/lib/python2.5/site-packages/numpy/core/__init__.py", line 5, in <module>
    import multiarray
ImportError: No module named multiarray


Am I doing something wrong? Or does freeze.py not work with numpy?

Hanno


-- 
Hanno Klemm
[EMAIL PROTECTED]


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Silent install of .exe

2007-04-04 Thread Mark Janikas
Is there a way to silently install the numpy.exe from a Microsoft DOS
prompt?

 

Something like: numpy-1.0.2.win32-py2.4.exe -silent

 

Thanks ahead of time...

 

MJ

 

Mark Janikas

Product Engineer

ESRI, Geoprocessing

380 New York St.

Redlands, CA 92373

909-793-2853 (2563)

[EMAIL PROTECTED]

 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dynamic module not initialized properly

2007-04-02 Thread Mark Janikas
Thanks for the info Greg.  Yup.  I am sorry that I had to post a thread
without code to back it up... unfortunately, there just isn't a way for
me to roll it into an example without the entire package being
installed.  This is all very good info you have provided.  I'll let you
know how things work out.  Thanks again,

 

MJ

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Steele, Greg
Sent: Monday, April 02, 2007 9:07 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Dynamic module not initialized properly

 

Mark,

 

It is hard to comment since you have not provided much information. Your
link to a previous thread brought up a post that I had sent. The issue
that I encountered had to do with the multiarraymodule.c extension
module. When numpy is imported, it imports this module and the static
variable _multiarray_module_loaded gets set. When Python is finalized
it does not unload the multiarraymodule.c DLL. When Python is
initialized again and numpy is imported again, the static variable is
already set and multiarraymodule does not import correctly. Hence the
error.

 

The way I dealt with this is a 'hack', but it worked for us. This was on
a windows platform. After I finalize Python, I forcibly unload the
multiarraymodule DLL using the FreeLibrary call. The C code looks like

 

if (multiarray_loaded) {
  HINSTANCE hDLL = NULL;
  /* Re-acquire a handle to the already-loaded DLL... */
  hDLL = LoadLibraryEx(buf, NULL, LOAD_WITH_ALTERED_SEARCH_PATH);
  /* ...then release twice: once for the handle just acquired,
     once for the original load. */
  FreeLibrary(hDLL);
  FreeLibrary(hDLL);
}

 

The two calls of FreeLibrary are needed since each call to LoadLibraryEx
increments the DLL reference count. The call to LoadLibraryEx here gets
a handle to the DLL. 

 

What needs to be done long term is the removal of the static variable in
multiarraymodule. I don't understand the code well enough to know why it
is needed, but that appears to be the crux of the issue. Another
solution would be for Python to call FreeLibrary on all the DLLs during
Py_Finalize.

 

Greg



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mark Janikas
Sent: Friday, March 30, 2007 4:55 PM
To: Discussion of Numerical Python
Subject: [Numpy-discussion] Dynamic module not initialized properly

 

Hello all,

 

I am having an issue importing numpy on subsequent (i.e. not on first
load) attempts in our software.  The majority of the code is written in
C, C++ and I am a python developer and do not have direct access to a
lot of it.  This is a bit of a difficult question to ask all of you
because I can't provide you a direct example.  All I can do is point to a
numpy thread that discusses the issue:

http://groups.google.com/group/Numpy-discussion/browse_thread/thread/32177a82deab05ae/d8eecaf494ba5ad5?lnk=st&q=dynamic+module+not+initialized+properly+numpy&rnum=1&hl=en#d8eecaf494ba5ad5

 

ERROR:

exceptions.SystemError: dynamic module not initialized properly

 

What is really odd about my specific issue is that if I don't change
anything in the source code... then the error doesn't pop up.
Furthermore, the error doesn't show on some attempts even after I make a
change... Not sure whether there is anything I can do from the
scripting side (some alternative form of reload?)... or if I have to
forward it along to the C developers.  You have my appreciation ahead of
time.

 

Mark Janikas

Product Engineer

ESRI, Geoprocessing

380 New York St.

Redlands, CA 92373

909-793-2853 (2563)

[EMAIL PROTECTED]

 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Source install

2007-02-28 Thread Mark Janikas
Hello all,

 

I have used numpy on both Mac and Windows.  The latter is easily
installed with the exe file.  The former required the gcc program from
XCode... but once installed, python setup.py install worked.  I can't
seem to get numpy to work on my linux machine.  Can someone point me to
a platform-independent doc on how to install from the source tar file?
Thanks ahead of time,

 

MJ

 

Mark Janikas

Product Engineer

ESRI, Geoprocessing

380 New York St.

Redlands, CA 92373

909-793-2853 (2563)

[EMAIL PROTECTED]

 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Source install

2007-02-28 Thread Mark Janikas
Thanks Robert,

Sorry for the incomplete request for help.  The install of numpy seems
to go fine, but when I import numpy it reports that it is running from
the source directory.  I assume this has to do with the BLAS/ATLAS stuff
I have been reading about.  What I am actually trying to do is get NumPy
wrapped in the install of our software program.  We currently wrap
Python 2.4 as our scripting language and I need a way to get numpy into our
compiler.  The gui portions of our software run on Windows but the
engine works on Unix flavors.  I am afraid I am not too knowledgeable
about what goes on under the hood of the NumPy install.  I assume I need
an appropriate C compiler (where gcc fit in for Mac OSX), but I was
wondering if there was an appropriate doc I should closely examine that
would point me in the right direction.  I hope this clears my question
up a bit.  Again, thanks in advance,

MJ

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Robert Kern
Sent: Wednesday, February 28, 2007 11:26 AM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Source install

Mark Janikas wrote:
 Hello all,
 
 I have used numpy on both Mac and Windows.  The latter is easily
 installed with the exe file.  The former required the gcc program from
 XCode... but once installed, python setup.py install worked.  I can't
 seem to get numpy to work on my linux machine.  Can someone point me to
 a platform-independent doc on how to install from the source tar file?
 Thanks ahead of time,

We need more information from you. There is no way one can make a
platform-independent doc that covers all of the cases. We need to know
what you tried and exactly how it failed (i.e., we need you to copy the
exact error messages and paste them into an email).

If I had to guess, though, since you succeeded doing an install from
source on OS X, the problem on Linux is likely that you do not have the
appropriate Python development package for your system. On RPM-based
systems like Fedora Core, it is usually named something like python-devel.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Greek Letters

2007-02-20 Thread Mark Janikas
Hello all,

 

I was wondering how I could print the chi-squared symbol in python.  I
have been looking at the Unicode docs, but I figured I would ask for
assistance here while I delve into it.  Thanks for any help in advance.

 

Mark Janikas

Product Engineer

ESRI, Geoprocessing

380 New York St.

Redlands, CA 92373

909-793-2853 (2563)

[EMAIL PROTECTED]

 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Greek Letters

2007-02-20 Thread Mark Janikas
Thanks for all the info.  That website with all the codes is great.  

MJ

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Zachary Pincus
Sent: Tuesday, February 20, 2007 4:18 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Greek Letters

I have found that the python 'unicode name' escape sequence, combined  
with the canonical list of unicode names ( http://unicode.org/Public/UNIDATA/NamesList.txt ), 
is a good way of getting the symbols you  
want and still keeping the python code legible.

From the above list, we see that the symbol name we want is GREEK  
SMALL LETTER CHI, so:
chi = u'\N{GREEK SMALL LETTER CHI}'
will do the trick. For chi^2, use:
chi2 = u'\N{GREEK SMALL LETTER CHI}\N{SUPERSCRIPT TWO}'

Note that to print these characters, we usually need to encode them  
somehow. My terminal supports UTF-8, so the following works for me:
import codecs
print codecs.encode(chi2, 'utf8')

giving (if your mail reader supports utf8 and mine encodes it  
properly...):
χ²

Zach Pincus

Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine


On Feb 20, 2007, at 3:56 PM, Mark Janikas wrote:

 Hello all,



 I was wondering how I could print the chi-squared symbol in  
 python.  I have been looking at the Unicode docs, but I figured I  
 would ask for assistance here while I delve into it.  Thanks for  
 any help in advance.



 Mark Janikas
 Product Engineer
 ESRI, Geoprocessing
 380 New York St.
 Redlands, CA 92373
 909-793-2853 (2563)
 [EMAIL PROTECTED]



___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Greek Letters

2007-02-20 Thread Mark Janikas
Oh.  I am using Cygwin, and the website I just went to:

http://www.cygwin.com/faq/faq_3.html

stated that: "The short answer is that Cygwin is not Unicode-aware."

Not sure if this is going to apply to Python in general, but I suspect it will.
Ugh, I dislike Windows a lot, but it pays the bills.  The interesting thing to
note is that the printout to the GUI interface is UTF-8, so it works.  It just
won't work on my terminal, where I do all of my testing.  I might just have to
put a try statement in and print a plain chi-square in the except.
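
A minimal sketch of that fallback (my own phrasing, not tested on Cygwin):

import sys

chi2 = u'\N{GREEK SMALL LETTER CHI}\N{SUPERSCRIPT TWO}'
try:
    print chi2.encode(sys.stdout.encoding)
except (UnicodeEncodeError, TypeError, LookupError):
    # non-Unicode terminal (or sys.stdout.encoding is None): plain ASCII
    print 'chi-square'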

MJ

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mark Janikas
Sent: Tuesday, February 20, 2007 5:16 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Greek Letters

Thanks Robert, but alas, I get:

>>> import sys
>>> sys.stdout.encoding
'cp437'
>>> print u'\u03a7\u00b2'.encode(sys.stdout.encoding)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python24\lib\encodings\cp437.py", line 18, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u03a7' in position
 0: character maps to <undefined>
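
One workaround, as a hedged aside: encode() takes an optional error-handler
argument, so the unencodable chi can be degraded instead of raising:

>>> print u'\u03a7\u00b2'.encode(sys.stdout.encoding, 'replace')

Here any character cp437 cannot represent prints as '?'.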



I'll keep at it... please let me know if you have any solutions.

Thanks again,

MJ

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Robert Kern
Sent: Tuesday, February 20, 2007 4:20 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] Greek Letters

Mark Janikas wrote:
 Hello all,
 
 I was wondering how I could print the chi-squared symbol in python.  I
 have been looking at the Unicode docs, but I figured I would ask for
 assistance here while I delve into it.  Thanks for any help in advance.

Print it where? To the terminal (which one?)? In HTML? With some GUI?

Assuming that you have a Unicode-capable terminal, you can find out the encoding
it uses by looking at sys.stdout.encoding. Encode your Unicode string with that
encoding, and print it. E.g., I use iTerm on OS X and set it to use UTF-8 as the
encoding:

In [5]: import sys

In [6]: sys.stdout.encoding
Out[6]: 'UTF-8'

In [7]: print u'\u03a7\u00b2'.encode(sys.stdout.encoding)
Χ²

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fromstring, tostring slow?

2007-02-13 Thread Mark Janikas
Yup.  It was faster to:

 - use lists for the append, then transform into an array, then
   transform into a binary string

rather than

 - create empty arrays and use their append method, then transform into
   a binary string.

The last question on the output would then be to test the speed of using
generic Python arrays (the array module), which have append methods as
well.  Then there would still only be the binary string conversion, as
opposed to list -> numpy array -> binary string.
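
A minimal sketch of the faster pattern, with made-up data standing in for
the real routine:

import numpy

values = []
for i in xrange(100000):        # stand-in for the real per-record work
    values.append(i * 0.5)      # appending to a Python list is cheap

arr = numpy.array(values)       # one list -> array conversion at the end
binary = arr.tostring()         # one array -> binary-string conversion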

 

Thanks to all for your input

 

MJ

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Charles R
Harris
Sent: Tuesday, February 13, 2007 12:44 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] fromstring, tostring slow?

 

 


I am going to guess that a list would be faster for appending. Concat
and, I suspect, append make new arrays for each use, rather like string
concatenation in Python. A list, on the other hand, is no doubt
optimized for adding new values. Another option might be using PyTables
with extensible arrays. In any case, a bit of timing should show the way
if the performance is that crucial to your application. 

Chuck

 

 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fromstring, tostring slow?

2007-02-13 Thread Mark Janikas
I don't think I can do that because I have heterogeneous rows of
data... i.e., the columns in each row are different in length.
Furthermore, when reading it back in, I want to read only bytes of the
info at a time so I can save memory.  In this case, I only want to have
one record in mem at once.

Another issue has arisen from taking this routine cross-platform...
namely, if I write the file on Windows I can't read it on Solaris.  I
assume big/little endianness is at play here.  I know using the struct
module that I can pack using either one.  Perhaps I will have to go back
to the drawing board.  I actually love these methods now because I get
back out directly what I put in.  Great kudos to the developers.

MJ


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Christopher
Barker
Sent: Tuesday, February 13, 2007 1:39 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] fromstring, tostring slow?

Mark Janikas wrote:
 I am finding that directly packing numpy arrays into binary using the 
 tostring and fromstring methods

For starters, use fromfile and tofile, to save the overhead of creating 
an entire extra string.

fromfile is a function (as it is an alternate constructor for arrays):

numpy.fromfile()

ndarray.tofile() is an array method.

Enclosed is your test, including a test for tofile().  I needed to make
the arrays much larger, and use time.time() rather than time.clock(), to
get enough time resolution to see anything, though if you really want to
be accurate, you need to use the timeit module.
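
A hedged sketch of the timeit approach (separate from the enclosed script):

import timeit

t = timeit.Timer("a.tofile(f)",
                 "import numpy; a = numpy.arange(1000000.); f = open('test.bin', 'wb')")
print t.timeit(number=10)       # total seconds for 10 writes of 1e6 doubles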

My results:
Using lists 0.457561016083
Using tostring 0.00922703742981
Using tofile 0.00431108474731

Another note: where is the data coming from -- there may be ways to 
optimize this whole process if we saw that.

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fromstring, tostring slow?

2007-02-13 Thread Mark Janikas
Yes, but does the code have the same license as NumPy?  As I work for a 
software company, where I help with the scripting interface, I must make sure 
everything I use is cited and has the appropriate license.  

MJ  

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stefan van der 
Walt
Sent: Tuesday, February 13, 2007 3:52 PM
To: numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] fromstring, tostring slow?

On Tue, Feb 13, 2007 at 03:44:37PM -0800, Mark Janikas wrote:
 I don't think I can do that because I have heterogeneous rows of
 data... i.e., the columns in each row are different in length.
 Furthermore, when reading it back in, I want to read only bytes of the
 info at a time so I can save memory.  In this case, I only want to have
 one record in mem at once.
 
 Another issue has arisen from taking this routine cross-platform...
 namely, if I write the file on Windows I can't read it on Solaris.  I
 assume big/little endianness is at play here.

Indeed.  You may want to take a look at npfile, the new IO module in
scipy written by Matthew Brett (you don't have to install the whole
scipy to use it, just grab the file).

Cheers
Stéfan


___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fromstring, tostring slow?

2007-02-13 Thread Mark Janikas
This is all very good info.  Especially the byteswap; I'll be testing
it momentarily.  As far as a detailed explanation of the problem goes...

In essence, I am applying sparse matrix multiplication.  The matrix I am
dealing with in the manner described is n x n.  Generally, this matrix
is 1-20% sparse.  I use it in spatial data analysis, where the matrix W
represents the spatial association between n observations.  The
operations I perform on it are generally related to the spatial lag of a
variable... or Wy, where y is an n x k matrix (usually k=1).  As k is
generally small, the y vector and the result vector are represented by
numpy arrays.  I can have n x k x 2 pieces of info in mem (usually).
What I can't have is n**2.  So, I store each row of W in a file as a
record consisting of 3 parts:

1) row, nn (# of neighbors)
2) nhs (nx1) vector of integers representing the columns in row[i] != 0
3) weights (nx1) vector of floats corresponding to the index in the
previous row

The first two parts of the record are known as a GAL, or geographic
algorithm library.  Since a lot of my W matrices have distance metrics
associated with them, I added the third.  I think this might be termed
by someone else as an enhanced GAL.  At any rate, this allows me to
perform this operation on large datasets w/o running out of mem.
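
A hedged sketch of one way to lay out such a record with tofile() and
fromfile(); the function names and the explicit little-endian dtypes are
my own assumptions, not MJ's actual code, and forcing '<' byte order also
sidesteps the Windows/Solaris issue raised earlier:

import numpy as N

def write_record(f, row, nhs, weights):
    # header: row index and neighbor count, as little-endian int32
    N.array([row, len(nhs)], dtype='<i4').tofile(f)
    N.asarray(nhs, dtype='<i4').tofile(f)        # columns where row i is non-zero
    N.asarray(weights, dtype='<f8').tofile(f)    # weights matching those columns

def read_record(f):
    row, nn = N.fromfile(f, dtype='<i4', count=2)
    nhs = N.fromfile(f, dtype='<i4', count=nn)
    weights = N.fromfile(f, dtype='<f8', count=nn)
    return row, nhs, weights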


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Christopher
Barker
Sent: Tuesday, February 13, 2007 4:07 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] fromstring, tostring slow?

Mark Janikas wrote:
 I don't think I can do that because I have heterogeneous rows of
 data... i.e., the columns in each row are different in length.

like I said, show us your whole problem...

But you don't have to write/read all the data at once with from/tofile()
anyway.  Each of your rows has to be in a separate array, as
numpy arrays don't support ragged arrays, but each row can be written
with tofile().

 Furthermore, when reading it back in, I want to read only bytes of the
 info at a time so I can save memory.  In this case, I only want to
have
 one record in mem at once.

you can make multiple calls to fromfile(), though you'll have to know how
long each record is.

 Another issue has arisen from taking this routine cross-platform
 namely, if I write the file on Windows I cant read it on Solaris.  I
 assume the big-little endian is at hand here.

yup.

 I know using the struct
 module that I can pack using either one.

So can numpy: see the byteswap method, and you can specify a
particular endianness with a datatype when you read with fromfile():

a = N.fromfile(DataFile, dtype=N.dtype('<d'), count=20)

reads 20 little-endian doubles from DataFile, regardless of the native
endianness of the machine you're on.
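
And on the write side (a hedged one-liner in the same spirit): force a
byte order before tofile() so the file reads back the same everywhere:

a.astype(N.dtype('<d')).tofile(DataFile)   # always write little-endian doubles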

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Newbie Question, Probability

2006-12-20 Thread Mark Janikas
Hello all,

 

Is there a way to get probability values for the various families of
distributions in numpy?  I.e., à la R:

 

> pnorm(1.96, mean = 0, sd = 1)
[1] 0.9750021
# for the normal

> pt(1.65, df=100)
[1] 0.9489597
# for student t

 

Any suggestions would be greatly appreciated.  
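
A hedged note for the archive: numpy itself provides random sampling but
not these cumulative distribution functions; assuming scipy is installed,
scipy.stats covers them:

from scipy import stats

print stats.norm.cdf(1.96, loc=0, scale=1)   # ~0.9750, like R's pnorm above
print stats.t.cdf(1.65, 100)                 # ~0.9490, like R's pt above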

 

 

Mark Janikas
Product Engineer
ESRI, Geoprocessing
380 New York St.
Redlands, CA 92373
909-793-2853 (2563)
[EMAIL PROTECTED]

 

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion