[Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy

2010-03-29 Thread David Cournapeau
Hi,

I have worked on porting scipy to py3k, and it is mostly working. One
thing which would be useful is to install something similar to
npy_3kcompat.h in numpy, so that every scipy extension could share the
compat header. Is the current python 3 compatibility header usable in
the wild, or will it still significantly change (thus requiring a
different, more stable one)?

cheers,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4?

2010-03-29 Thread Wayne Watson
Yes, that is very likely the solution. It's clear that the module is in 
the list. I say "likely" since I've never done it before, and there always 
seems to be something that gets overlooked in what looks like something 
so simple. :-)

However, my colleague is on XP. Ah, same idea there.

I find it odd that no one seems to address the removal of modules from 
site-packages on Windows.
Thanks.

On 3/28/2010 9:13 AM, phob...@geosyntec.com wrote:
 If you're on Windows, you can probably get rid of it through the Add/Remove 
 Programs portion of the Control Panel.

 --
 Paul Hobson
 Senior Staff Engineer
 Geosyntec Consultants
 Portland, OR

 On Mar 26, 2010, at 8:09 PM, Wayne Watson sierra_mtnv...@sbcglobal.net 
 wrote:

 Thanks. How do I switch? Do I just pull down 1.3, or better 1.2 (which I use), 
 and install it? How do I (actually my colleague) somehow remove 1.4? Is it as 
 easy as going to IDLE's path browser and removing numpy under site-packages? 
 (I'm not sure that's even possible. I don't see a right-click menu.)

 On 3/26/2010 7:22 PM, phob...@geosyntec.com wrote:
 Wayne,

 The current release of Scipy doesn't work well with Numpy 1.4.

 On my systems (Mac OS 10.6, WinXP, and Ubuntu), I’m running Numpy 1.4 with 
 the current Scipy on Python 2.6.4. I get the same error you describe below on 
 the first attempt. For some reason unknown to me, it works on the second try.

 Switching to Numpy 1.3 is the best solution to the error.

 -paul

 From: numpy-discussion-boun...@scipy.org 
 [mailto:numpy-discussion-boun...@scipy.org] On Behalf Of Wayne Watson
 Sent: Friday, March 26, 2010 5:44 PM
 To: numpy-discussion@scipy.org
 Subject: [Numpy-discussion] Why this Difference in Importing NumPy 1.2 vs 1.4?

 I wrote a program in Python 2.5 under Win7 and it runs fine using Numpy 1.2, 
 but not on a colleague's machine who has a slightly newer 2.5. We both use 
 IDLE to execute the program. During import he gets this:


  
 Traceback (most recent call last):
   File "C:\Documents and Settings\HP_Administrator.DavesDesktop\My 
 Documents\Astro\Meteors\NC-FireballReport.py", line 38, in <module>
     from scipy import stats as stats # scoreatpercentile
   File "C:\Python25\lib\site-packages\scipy\stats\__init__.py", line 7, 
 in <module>
     from stats import *
   File "C:\Python25\lib\site-packages\scipy\stats\stats.py", line 191, 
 in <module>
     import scipy.special as special
   File "C:\Python25\lib\site-packages\scipy\special\__init__.py", line 22, 
 in <module>
     from numpy.testing import NumpyTest
 ImportError: cannot import name NumpyTest

  
 Comments?


 --
 Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
 (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time
 Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet
 "Poisoned Shipments. Serious illegal waste dumping may be
 occurring in the Mediterranean. Radioactive material,
 mercury, biohazards." -- Sci Am Mag, Feb. 2010, p14f.
 Web Page: http://www.speckledwithstars.net/


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



 --
 Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
 (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time
 Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet
 "Poisoned Shipments. Serious illegal waste dumping may be
 occurring in the Mediterranean. Radioactive material,
 mercury, biohazards." -- Sci Am Mag, Feb. 2010, p14f.
 Web Page: http://www.speckledwithstars.net/

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


-- 
Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
(121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time
Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet
"Poisoned Shipments. Serious illegal waste dumping may be
occurring in the Mediterranean. Radioactive material,
mercury, biohazards." -- Sci Am Mag, Feb. 2010, p14f.

Re: [Numpy-discussion] Applying formula to all in an array which has value from previous

2010-03-29 Thread Nadav Horesh
The general guideline:

Suppose the function definition is:

def func(x, y):
    # x and y are scalars
    bla bla bla ...
    return z  # a scalar

So,

import numpy as np

vecfun = np.vectorize(func)

vecfun.ufunc.accumulate(np.array((0, 1, 2, 3, 4, 5, 6, 7, 8, 9)))
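
A minimal runnable variant of this idea (func below is just a made-up
placeholder for the real recurrence, not the poster's actual calculation):

import numpy as np

# np.frompyfunc wraps a Python function as a true ufunc, and its
# accumulate() feeds each result into the next call:
# out[0] == a[0]; out[i] == func(out[i-1], a[i])
def func(x, y):
    return (x + y) / 2.0 + 1.0   # placeholder

ufunc = np.frompyfunc(func, 2, 1)
a = np.arange(10)
out = ufunc.accumulate(a, dtype=object).astype(float)
print(out)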


   Nadav.


-Original Message-
From: numpy-discussion-boun...@scipy.org on behalf of Vishal Rana
Sent: Sun 28-Mar-10 21:19
To: Discussion of Numerical Python
Subject: [Numpy-discussion] Applying formula to all in an array which has value 
from previous
 
Hi,

For a numpy array:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

I do some calculation with 0, 1... and get a value = 2.5; now use this value
to repeat the same calculation with the next element, for example...
2.5, 2 and get a value = 3.1
3.1, 3 and get a value = 4.2
4.2, 4 and get a value = 5.1
...
... and get a value = 8.5
8.5, 9 and get a value = 9.8

So I should be getting a new array like array([0, 2.5, 3.1, 4.2, 5.1, ...,
8.5, 9.8])

Is it where numpy or scipy can help?

Thanks
Vishal

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.trapz() doesn't respect subclass

2010-03-29 Thread Bruce Southey
On 03/27/2010 01:31 PM, Ryan May wrote:
 On Sat, Mar 27, 2010 at 11:12 AM, josef.p...@gmail.com wrote:

 On Sat, Mar 27, 2010 at 1:00 PM, Ryan May rma...@gmail.com wrote:

 On Mon, Mar 22, 2010 at 8:14 AM, Ryan May rma...@gmail.com wrote:

 On Sun, Mar 21, 2010 at 11:57 PM, josef.p...@gmail.com wrote:

 On Mon, Mar 22, 2010 at 12:49 AM, Ryan May rma...@gmail.com wrote:

 Hi,

 I found that trapz() doesn't work with subclasses:

 http://projects.scipy.org/numpy/ticket/1438

 A simple patch (attached) to change asarray() to asanyarray() fixes
 the problem fine.
  
 Are you sure this function works with matrices and other subclasses?

 Looking only very briefly at it: the multiplication might be a problem.

 Correct, it probably *is* a problem in some cases with matrices.  In
 this case, I was using quantities (Darren Dale's unit-aware array
 package), and the result was that units were stripped off.

 The patch can't make trapz() work with all subclasses. However, right
 now, you have *no* hope of getting a subclass out of trapz().  With
 this change, subclasses that don't redefine operators can work fine.
 If you're passing a Matrix to trapz() and expecting it to work, IMHO
 you're doing it wrong.  You can still pass one in by using asarray()
 yourself.  Without this patch, I'm left with copying and maintaining a
 copy of the code elsewhere, just so I can loosen the function's input
 processing. That seems wrong, since there's really no need in my case
 to drop down to an ndarray. The input I'm giving it supports all the
 operations it needs, so it should just work with my original input.
  
 With asarray it gives correct results for matrices and all array_like
 and subclasses, it just doesn't preserve the type.
 Your patch would break matrices and possibly other types, masked_arrays?, ...
  
 It would break matrices, yes.  I would argue that masked arrays are
 already broken with trapz:

 In [1]: x = np.arange(10)

 In [2]: y = x * x

 In [3]: np.trapz(y, x)
 Out[3]: 244.5

 In [4]: ym = np.ma.array(y, mask=(x > 4) & (x < 7))

 In [5]: np.trapz(ym, x)
 Out[5]: 244.5

 In [6]: y[5:7] = 0

 In [7]: ym = np.ma.array(y, mask=(x > 4) & (x < 7))

 In [8]: np.trapz(ym, x)
 Out[8]: 183.5

 Because of the call to asarray(), the mask is completely discarded and
 you end up with identical results to an unmasked array,
 which is not what I'd expect.  Worse, the actual numeric value of the
 positions that were masked affect the final answer. My patch allows
 this to work as expected too.

Actually you should assume that, unless it is explicitly addressed 
(either by code or via a test), any subclass of ndarray (matrix, masked, 
structured, record and even sparse) may not provide a 'valid' answer.  
There are probably many numpy functions that only really work with the 
standard ndarray. Most of the time people do not hit these issues with the 
subclasses, or they have workarounds, so there has been little requirement to 
address this, especially given the added overhead needed for checking.

Also, any patch that does not explicitly define the assumed behavior 
with points that are masked has to be rejected. It is not even clear 
what the expected behavior for masked arrays should be:
Is it even valid for trapz to be integrating across the full range if 
there are missing points? That implies some assumption about the missing 
points.
If it is valid, then should you just ignore the masked values or try to 
predict the missing values first? Perhaps you may want to have the 
option to do both.


 One solution would be using arraywrap as in numpy.linalg.
  
 By arraywrap, I'm assuming you mean:

 def _makearray(a):
     new = asarray(a)
     wrap = getattr(a, "__array_prepare__", new.__array_wrap__)
     return new, wrap

 I'm not sure if that's identical to just letting the subclass handle
 what's needed.  To my eyes, that doesn't look as though it'd be
 equivalent, both for handling masked arrays and Quantities. For
 quantities at least, the result of trapz will have different units
 than either of the inputs.


 for related discussion:
 http://mail.scipy.org/pipermail/scipy-dev/2009-June/012061.html
  
 Actually, that discussion kind of makes my point.  Matrices are a pain
 to make work in a general sense because they *break* ndarray
 conventions--to me it doesn't make sense to help along classes that
 break convention at the expense of making well-behaved classes a pain
 to use.  You should need an *explicit* cast of a matrix to an ndarray
 instead of the function quietly doing it for you. (Explicit is better
 than implicit) It just seems absurd that if I make my own ndarray
 subclass that *just* adds some behavior to the array, but doesn't
 break *any* operations, I need to do one of the following:

 1) Have my own copy of trapz that works with my class
 2) Wrap every call to numpy's own trapz() to put the metadata back.

 Does it not seem backwards that 

Re: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy

2010-03-29 Thread Charles R Harris
On Mon, Mar 29, 2010 at 4:13 AM, David Cournapeau courn...@gmail.comwrote:

 Hi,

 I have worked on porting scipy to py3k, and it is mostly working. One
 thing which would be useful is to install something similar to
 npy_3kcompat.h in numpy, so that every scipy extension could share the
 compat header. Is the current python 3 compatibility header usable in
 the wild, or will it still significantly change (thus requiring a
 different, more stable one)?


Pauli will have to weigh in here, but I think it is pretty stable and has
been for a while. The only thing I'm thinking of changing is the PyCapsule
compatibility functions: instead of having them downgrade PyCapsule to
PyCObject-like behaviour, go the other way. Going the other way requires
changes to the surrounding code to handle errors, which is why Numpy doesn't
use those functions at the moment.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy

2010-03-29 Thread Pauli Virtanen
Mon, 2010-03-29 at 19:13 +0900, David Cournapeau wrote:
 I have worked on porting scipy to py3k, and it is mostly working. One
 thing which would be useful is to install something similar to
 npy_3kcompat.h in numpy, so that every scipy extension could share the
 compat header. Is the current python 3 compatibility header usable in
 the wild, or will it still significantly change (thus requiring a
 different, more stable one)?

I believe it's reasonably stable, as it contains mostly simple stuff.
Something perhaps can be added later, but I don't think anything will
need to be removed.

At least, I don't see what I would like to change there. The only things
I perhaps wouldn't like to have in the long run are the PyString and
possibly PyInt redefinition macros. But if the header is going to be
used elsewhere, we can as well freeze it now and promise not to remove
or change anything existing.

Pauli


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.trapz() doesn't respect subclass

2010-03-29 Thread Ryan May
On Mon, Mar 29, 2010 at 8:00 AM, Bruce Southey bsout...@gmail.com wrote:
 On 03/27/2010 01:31 PM, Ryan May wrote:
 Because of the call to asarray(), the mask is completely discarded and
 you end up with identical results to an unmasked array,
 which is not what I'd expect.  Worse, the actual numeric value of the
 positions that were masked affect the final answer. My patch allows
 this to work as expected too.

 Actually you should assume that unless it is explicitly addressed
 (either by code or via a test), any subclass of ndarray (matrix, masked,
 structured, record and even sparse) may not provide a 'valid' answer.
 There are probably many numpy functions that only really work with the
 standard ndarray. Most of the time people do not meet these with the
 subclasses or have workarounds so there has been little requirement to
 address this especially due to the added overhead needed for checking.

It's not that I'm surprised that masked arrays don't work. It's more
that the calls to np.asarray within trapz() have been held up as being
necessary for things like matrices and (at the time) masked arrays to
work properly; as if calling asarray() is supposed to make all
subclasses work, though at a base level by dropping to an ndarray. To
me, the current behavior with masked arrays is worse than if passing
in a matrix raised an exception.  One is a silently wrong answer, the
other is a big error that the programmer can see, test, and fix.

 Also, any patch that does not explicitly define the assumed behavior
 with points that are masked has to be rejected. It is not even clear
 what the expected behavior for masked arrays should be:
 Is it even valid for trapz to be integrating across the full range if
 there are missing points? That implies some assumption about the missing
 points.
 If it is valid, then should you just ignore the masked values or try to
 predict the missing values first? Perhaps you may want to have the
 option to do both.

You're right, it doesn't actually work with MaskedArrays as it stands
right now, because it calls add.reduce() directly instead of using the
array.sum() method. Once fixed, by allowing MaskedArray to handle the
operation, you end up not integrating over the masked region: the
contributions from masked points are ignored.  I guess it's as if you
assumed the function was 0 over the masked region.  If you wanted to
ignore the masked points, but still integrate over the region (making a
really big trapezoid over that region), you could just pass in the
.compressed() versions of the arrays, as in the snippet below.
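
For instance, a minimal sketch (reusing the x and ym arrays from the
earlier IPython session; np is numpy):

xm = np.ma.array(x, mask=ym.mask)            # drop the same points from x
np.trapz(ym.compressed(), xm.compressed())   # one big trapezoid across the gap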

 than implicit) It just seems absurd that if I make my own ndarray
 subclass that *just* adds some behavior to the array, but doesn't
 break *any* operations, I need to do one of the following:

 1) Have my own copy of trapz that works with my class
 2) Wrap every call to numpy's own trapz() to put the metadata back.

 Does it not seem backwards that the class that breaks conventions
 "just works", while those that don't break conventions (and would work
 perfectly with the function as written) need help to be treated
 properly?

 You need your own version of trapz, or whatever function, because it has
 the behavior that you expect. But a patch should not break numpy, so you
 need at least to have a section that looks for masked array subtypes
 and performs the desired behavior(s).

I'm not trying to be difficult, but it seems like there are conflicting
ideas here: we shouldn't break numpy, which in this case means making
matrices no longer work with trapz().  On the other hand, subclasses
can do a lot of things, so there's no real expectation that they
should ever work with numpy functions in general.  Am I missing
something here? I'm just trying to understand what I perceive to be
some inconsistencies in numpy's behavior and, more importantly, its
convention with regard to subclasses.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.trapz() doesn't respect subclass

2010-03-29 Thread Ryan May
Hi,

I decided that having actual code that does what I want and keeps
backwards compatibility (and adds tests) might be better than arguing
semantics.  I've updated my patch to:

* Uses the array.sum() method instead of add.reduce to make subclasses
fully work (this was still breaking masked arrays).
* Catches an exception on doing the actual multiply and sum of the
arrays and tries again after casting to ndarrays.  This allows any
subclasses that relied on being cast to still work.
* Adds tests that ensure matrices work (the test passes before and after
the changes to trapz()) and adds a test for masked arrays that checks that
masked points are treated as expected. In this case, "expected" is
defined to be the same as if you implemented the trapezoidal method by
hand using MaskedArray's basic arithmetic operations.

Attached here and at: http://projects.scipy.org/numpy/ticket/1438
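
For reference, a rough 1-D sketch (not the actual patch) of the approach
described above:

import numpy as np

def trapz_sketch(y, x):
    # keep the subclass via asanyarray() and the .sum() method
    y = np.asanyarray(y)
    x = np.asanyarray(x)
    d = np.diff(x)
    try:
        return (d * (y[1:] + y[:-1]) / 2.0).sum()
    except TypeError:
        # fall back to plain ndarrays for subclasses whose arithmetic
        # can't handle the expression above
        y, d = np.asarray(y), np.asarray(d)
        return (d * (y[1:] + y[:-1]) / 2.0).sum()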

I think this addresses the concerns that were raised about the changes
for subclasses in this case. Let me know if I've missed something (or
if there's no way in hell any such patch will ever be committed).

Thanks,

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.trapz() doesn't respect subclass

2010-03-29 Thread Bruce Southey
On 03/29/2010 10:17 AM, Ryan May wrote:
 On Mon, Mar 29, 2010 at 8:00 AM, Bruce Southey bsout...@gmail.com wrote:

 On 03/27/2010 01:31 PM, Ryan May wrote:
  
 Because of the call to asarray(), the mask is completely discarded and
 you end up with identical results to an unmasked array,
 which is not what I'd expect.  Worse, the actual numeric value of the
 positions that were masked affect the final answer. My patch allows
 this to work as expected too.


 Actually you should assume that unless it is explicitly addressed
 (either by code or via a test), any subclass of ndarray (matrix, masked,
 structured, record and even sparse) may not provide a 'valid' answer.
 There are probably many numpy functions that only really work with the
  standard ndarray. Most of the time people do not hit these issues with the
  subclasses, or they have workarounds, so there has been little requirement to
  address this, especially given the added overhead needed for checking.
  
 It's not that I'm surprised that masked arrays don't work. It's more
 that the calls to np.asarray within trapz() have been held up as being
 necessary for things like matrices and (at the time) masked arrays to
 work properly; as if calling asarray() is supposed to make all
 subclasses work, though at a base level by dropping to an ndarray. To
 me, the current behavior with masked arrays is worse than if passing
 in a matrix raised an exception.  One is a silently wrong answer, the
 other is a big error that the programmer can see, test, and fix.


  Also, any patch that does not explicitly define the assumed behavior
  with points that are masked has to be rejected. It is not even clear
  what the expected behavior for masked arrays should be:
  Is it even valid for trapz to be integrating across the full range if
  there are missing points? That implies some assumption about the missing
  points.
  If it is valid, then should you just ignore the masked values or try to
  predict the missing values first? Perhaps you may want to have the
  option to do both.
  
  You're right, it doesn't actually work with MaskedArrays as it stands
  right now, because it calls add.reduce() directly instead of using the
  array.sum() method. Once fixed, by allowing MaskedArray to handle the
  operation, you end up not integrating over the masked region: the
  contributions from masked points are ignored.  I guess it's as if you
  assumed the function was 0 over the masked region.  If you wanted to
  ignore the masked points, but still integrate over the region (making a
  really big trapezoid over that region), you could just pass in the
  .compressed() versions of the arrays.


 than implicit) It just seems absurd that if I make my own ndarray
 subclass that *just* adds some behavior to the array, but doesn't
 break *any* operations, I need to do one of the following:

 1) Have my own copy of trapz that works with my class
 2) Wrap every call to numpy's own trapz() to put the metadata back.

  Does it not seem backwards that the class that breaks conventions
  "just works", while those that don't break conventions (and would work
  perfectly with the function as written) need help to be treated
  properly?


  You need your own version of trapz, or whatever function, because it has
  the behavior that you expect. But a patch should not break numpy, so you
  need at least to have a section that looks for masked array subtypes
  and performs the desired behavior(s).
  
  I'm not trying to be difficult, but it seems like there are conflicting
  ideas here: we shouldn't break numpy, which in this case means making
  matrices no longer work with trapz().  On the other hand, subclasses
  can do a lot of things, so there's no real expectation that they
  should ever work with numpy functions in general.  Am I missing
  something here? I'm just trying to understand what I perceive to be
  some inconsistencies in numpy's behavior and, more importantly, its
  convention with regard to subclasses.

 Ryan


You should not confuse class methods with normal Python functions. 
Methods that are inherited from the ndarray superclass should be the 
same in the subclass unless they have been overridden. 
However, many functions like trapz are not part of the ndarray superclass: 
they have been written to handle the standard array (i.e. the unmodified 
ndarray superclass), but they may or may not work for all ndarray 
subclasses.  Other functions have been written to handle a specific 
ndarray subclass such as masked array or Matrix, and may (but are not 
guaranteed to) work for the standard array. Thus, I think your 
'inconsistencies' relate to the simple fact that not all numpy functions 
are aware of ndarray subclasses.

What is missing are the bug reports and solutions for these functions 
when the expected behavior differs between the ndarray 
superclass and ndarray subclasses.  In the case of trapz, the bug report 
needs to be 

[Numpy-discussion] Fourier transform

2010-03-29 Thread Pascal
Hi,

Does anyone have an idea how the fft functions are implemented? Is it pure
Python? Based on BLAS/LAPACK? Or is it using FFTW?

I successfully used numpy.fft in 3D. I would like to know if I can
calculate a specific plane using numpy.fft.

I have in 3D:
r(x, y, z) = \sum_{h=0}^{N-1} \sum_{k=0}^{M-1} \sum_{l=0}^{O-1} f_{hkl}
 \exp(-2\pi i (hx/N + ky/M + lz/O))

So for the plane, z is no longer independent.
I need to solve the system:
ax + by + cz + d = 0
r(x, y, z) = \sum_{h=0}^{N-1} \sum_{k=0}^{M-1} \sum_{l=0}^{O-1} f_{hkl}
 \exp(-2\pi i (hx/N + ky/M + lz/O))

Do you think it's possible to use numpy.fft for this?

Regards,
Pascal
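
A brute-force check of the formula above (made-up coefficients, direct
summation rather than numpy.fft), evaluating r at arbitrary (x, y, z)
points, e.g. points constrained to a plane:

import numpy as np

N = M = O = 8
f = np.random.randn(N, M, O) + 1j * np.random.randn(N, M, O)
h, k, l = np.meshgrid(np.arange(N), np.arange(M), np.arange(O), indexing='ij')

def r(x, y, z):
    # direct evaluation of the triple sum at one (x, y, z) point
    phase = np.exp(-2j * np.pi * (h * x / N + k * y / M + l * z / O))
    return (f * phase).sum()

# a point on the plane ax + by + cz + d = 0 (made-up plane coefficients)
a, b, c, d = 1.0, 2.0, -1.0, 0.5
x, y = 0.3, 1.2
z = -(a * x + b * y + d) / c
print(r(x, y, z))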
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fourier transform

2010-03-29 Thread Robert Kern
On Mon, Mar 29, 2010 at 16:00, Pascal pascal...@parois.net wrote:
 Hi,

 Does anyone have an idea how the fft functions are implemented? Is it pure
 Python? Based on BLAS/LAPACK? Or is it using FFTW?

Using FFTPACK converted from FORTRAN to C.

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Andrea Gavana
Hi All,

On 29 March 2010 00:59, Andrea Gavana wrote:
 On 29 March 2010 00:34, Robert Kern wrote:
 Scaling each axis by its standard deviation is a typical first start.
 Shifting and scaling the values such that they each go from 0 to 1 is
 another useful thing to try.

 Ah, magnifico! Thank you Robert and Friedrich, it seems to be working
 now... I get reasonable values for various combinations of parameters
 by scaling the input data using the standard deviation of each of
 them. It seems also that the other interpolation schemes are much less
 erratic now, and in fact (using input values equal to the original
 data) I get this range of errors for the various schemes:

 inverse multiquadric   -15.6098482614        15.7194674906
 linear                 -1.76157336073e-010   1.24949181055e-010
 cubic                  -0.000709860285963    0.018385394661
 gaussian               -293.930336611        282.058111404
 quintic                -0.176381494531       5.37780806549
 multiquadric           -30.9515933446        58.3786105046
 thin-plate             -7.06755391536e-006   8.71407169821e-005

 In percentage. Some of them are still off the mark, but you should
 have seen them before ;-) .

 I'll do some more analysis tomorrow, and if it works I am going to try
 the bigger profile-over-time interpolation. Thank you so much guys for
 your suggestions.
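
For the record, a bare-bones sketch of that scaling recipe (array names and
data are made up; the point is only the divide-by-standard-deviation step
before building the interpolator and before each query):

import numpy as np
from scipy.interpolate import Rbf

params = np.random.rand(200, 4)     # 200 simulations, 4 input parameters
response = np.random.rand(200)      # e.g. oil production at one time step

scales = params.std(axis=0)
scaled = (params / scales).T        # Rbf wants one array per coordinate

rbf = Rbf(*scaled, response, function='linear')

new_point = np.array([0.2, 0.5, 0.1, 0.9]) / scales
print(rbf(*new_point))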

If anyone is interested in a follow up, I have tried a time-based
interpolation of my oil profile (and gas and gas injection profiles)
using those 40 interpolators (and even more, up to 400, one every
month of fluid flow simulation time step).

I wasn't expecting too much out of it, but when the interpolated
profiles came out (for different combinations of input parameters) I
felt like being on the wrong side of the Lala River in the valley of
Areyoukidding. The results are striking. I get an impressive agreement
between this interpolated proxy model and the real simulations,
whether I use existing combinations of parameters or new ones (i.e., I
create the interpolation and then run the real fluid flow simulation,
comparing the outcomes).

As an aside, I got my reservoir engineer colleagues playfully
complaining that it's time for them to pack their stuff and go home, as
this interpolator is doing all the work for us; obviously, this misses
the point that it took 4 years to build such a comprehensive bunch of
simulations which now allows us to somewhat predict a possible
production profile in advance.

I wrapped everything up in a wxPython GUI with some Matplotlib graphs,
and everyone seems happy. The only small complaint I have is that I
wasn't able to come up with a vector implementation of RBFs, so it can
be pretty slow to build and interpolate 400 RBFs for each property (3
of them).

Thanks to everyone for your valuable suggestions!

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/

== Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Andrea Gavana
Hi Kevin,

On 29 March 2010 01:38, Kevin Dunn wrote:
 Message: 5
 Date: Sun, 28 Mar 2010 00:24:01 +
 From: Andrea Gavana andrea.gav...@gmail.com
 Subject: [Numpy-discussion] Interpolation question
 To: Discussion of Numerical Python numpy-discussion@scipy.org
 Message-ID:
        d5ff27201003271724o6c82ec75v225d819c84140...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 Hi All,

    I have an interpolation problem and I am having some difficulties
 in tackling it. I hope I can explain myself clearly enough.

 Basically, I have a whole bunch of 3D fluid flow simulations (close to
 1000), and they are a result of different combinations of parameters.
 I was planning to use the Radial Basis Functions in scipy, but for the
 moment let's assume, to simplify things, that I am dealing only with
 one parameter (x). In 1000 simulations, this parameter x has 1000
 values, obviously. The problem is, the outcome of every single
 simulation is a vector of oil production over time (let's say 40
 values per simulation, one per year), and I would like to be able to
 interpolate my x parameter (1000 values) against all the simulations
 (1000x40) and get an approximating function that, given another x
 parameter (of size 1x1) will give me back an interpolated production
 profile (of size 1x40).

 [I posted the following earlier but forgot to change the subject - it
 appears as a new thread called NumPy-Discussion Digest, Vol 42, Issue
 85 - please ignore that thread]

 Andrea, may I suggest a different approach to RBF's.

 Realize that the 40 values in each row of y are not independent of
 each other (they will be correlated).  First build a
 principal component analysis (PCA) model on this 1000 x 40 matrix and
 reduce it down to a 1000 x A matrix, called your scores matrix, where
 A is the number of independent components. A is selected so that it
 adequately summarizes Y without over-fitting, and you will find A <<
 40, maybe A = 2 or 3. There are tools, such as cross-validation, that
 will help select a reasonable value of A.

 Then you can relate your single column of X to these independent
 columns in A using a tool such as least squares: one least squares
 model per column in the scores matrix.  This works because each column
 in the score vector is independent (contains totally orthogonal
 information) to the others.  But I would be surprised if this works
 well enough, unless A = 1.

 But it sounds like you don't just have a single column in your
 X-variables (you hinted that the single column was just for
 simplification).  In that case, I would build a projection to latent
 structures model (PLS) model that builds a single latent-variable
 model that simultaneously models the X-matrix, the Y-matrix as well as
 providing the maximal covariance between these two matrices.

 If you need some references and an outline of code, then I can readily
 provide these.

 This is a standard problem with data from spectroscopic instruments
 and with batch processes.  They produce hundreds, sometimes 1000's of
 samples per row. PCA and PLS are very effective at summarizing these
 down to a much smaller number of independent columns, very often just
 a handful, and relating them (i.e. building a predictive model) to
 other data matrices.

 I also just saw the suggestions of others to center the data by
 subtracting the mean from each column in Y and scaling (by dividing
 through by the standard deviation).  This is a standard data
 preprocessing step, called autoscaling and makes sense for any data
 analysis, as you already discovered.
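
A compact sketch of that workflow (made-up names and shapes, plain numpy SVD
rather than a dedicated PCA/PLS package; X is the (n, p) input matrix, Y the
(n, 40) profile matrix):

import numpy as np

def fit(X, Y, A=3):
    Ymean, Ystd = Y.mean(axis=0), Y.std(axis=0)
    Yc = (Y - Ymean) / Ystd                          # autoscale Y
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    scores = U[:, :A] * s[:A]                        # (n, A) PCA scores
    loadings = Vt[:A]                                # (A, 40) loadings
    Xd = np.column_stack([np.ones(len(X)), X])       # add an intercept
    B, *_ = np.linalg.lstsq(Xd, scores, rcond=None)  # one LS fit per score column
    return Ymean, Ystd, loadings, B

def predict(x_new, Ymean, Ystd, loadings, B):
    t = np.concatenate([[1.0], np.atleast_1d(x_new)]) @ B  # predicted scores
    return (t @ loadings) * Ystd + Ymean             # back to a 40-value profile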

I have had some success using time-based RBF interpolations, but I
am always open to other possible implementations (as the one I am
using can easily fail for strange combinations of input parameters).
Unfortunately, my understanding of your explanation is very very
limited: I am not an expert at all, so it's a bit hard for me to
translate the mathematical technical stuff into something I can
understand. If you have an example code (even a very trivial one) for
me to study so that I can understand what the code is actually doing,
I would be more than grateful for your help :-)

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/

== Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Andrea Gavana
Hi Brennan  All,

On 29 March 2010 00:46, Brennan Williams wrote:
 Andrea Gavana wrote:
 As for your question, the parameter are not spread completely
 randomly, as this is a collection of simulations done over the years,
 trying manually different scenarios, without having in mind a proper
 experimental design or any other technique. Nor the parameter  values
 vary only on one axis in each simulation (few of them are like that).


 I assume that there is a default norm that calculates the distance
 between points irrespective of the order of the input coordinates?

 So if that isn't working, leading to the spurious results, the next step
 is to normalise all the inputs so they are in the same range, e.g
 max-min=1.0

Scaling the input data using their standard deviation worked very well
for my case.

 On a related note, what approach would be best if one of the input
 parameters wasn't continuous? e.g. I have three quite different
 geological distributions called say A,B and C.
 SO some of my simulations use distribution A, some use B and some use C.
 I could assign them the numbers 1,2,3 but a value of 1.5 is meaningless.

Not sure about this: I do have integer numbers too (the number of
wells cannot be a fractional one, obviously), but I don't care about
it as it is an input parameter (i.e., the user chooses how many
o2/o3/injector wells he/she wants, and I get an interpolated
production profile). Are you saying that the geological realization
is one of your output variables?

 Andrea, if you have 1TB of data for 1,000 simulation runs, then, if I
 assume you only mean the smspec/unsmry files, that means each of your
 summary files is 1GB in size?

It depends on the simulation, and also on how many years the forecast
is run. Standard runs go up to 2038, but we have a bunch of them
running up to 2120 (!). As we have really many wells in this
field, the ECLIPSE summary file dimensions skyrocket pretty quickly.

 Are those o2w,o3w and inw figures the number of new wells only or
 existing+new? It's fun dealing with this amount of data isn't it?

They're only new wells, with ranges of 0 <= o2w <= 150, 0 <= o3 <= 84,
and 0 <= inw <= 37, and believe it or not, our set of simulations
contains a lot of the possible combinations for these variables (and
the other 4 variables too)...

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/

== Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Andrea Gavana
Hi Chris and All,

On 29 March 2010 22:35, Christopher Barker wrote:
 Andrea Gavana wrote:
 Scaling each axis by its standard deviation is a typical first start.
 Shifting and scaling the values such that they each go from 0 to 1 is
 another useful thing to try.
 Ah, magnifico! Thank you Robert and Friedrich, it seems to be working
 now...

 One other thought -- core to much engineering is dimensional analysis --
 you know how we like those non-dimensional numbers!

 I think this situation is less critical, as you are interpolating, not
 optimizing or something, but many interpolation methods are built on the
 idea of some data points being closer than others to your point of interest.

 Who is to say if a point that is 2 hours away is closer or farther than
 one 2 meters away? This is essentially what you are doing.

 Scaling everything to the same range is a start, but then you've still
 given them an implicit weighting.

 An alternative is to figure out a way to non-dimensionalize your
 parameters -- that *may* give you a more physically based scaling.

 And you might invent the Gavana Number in the process ;-)

Might be :-D . At the moment I am pretty content with what I have got:
it seems to be working fairly well, although I didn't examine all the
possible cases and it is very likely that my little tool will break
disastrously for some combinations of parameters. However, I am not
sure I am allowed to post an image comparing the real simulation
with the prediction of the interpolated proxy model, but if you could
see it, you would surely agree that it is a very reasonable approach.
It seems too good to be true :-D .

Again, this is mainly due to the fact that we have a very extensive
set of simulations which cover a wide range of combinations of
parameters, so the interpolation itself is only doing its job
correctly. I don't think the technique can be applied blindly to
whatever oil/gas/condensate/whatever reservoir, as non-linearities in
fluid flow simulations appear where you least expect them, but it
seems to be working ok (up to now) for our field (which is, by the
way, one of the most complex and least understood condensate reservoirs
out there).

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/

== Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Christopher Barker
Andrea Gavana wrote:
 Scaling each axis by its standard deviation is a typical first start.
 Shifting and scaling the values such that they each go from 0 to 1 is
 another useful thing to try.
 Ah, magnifico! Thank you Robert and Friedrich, it seems to be working
 now...

One other thought -- core to much engineering is dimensional analysis -- 
you know how we like those non-dimensional numbers!

I think this situation is less critical, as you are interpolating, not 
optimizing or something, but many interpolation methods are built on the 
idea of some data points being closer than others to your point of interest.

Who is to say if a point that is 2 hours away is closer or farther than 
one 2 meters away? This is essentially what you are doing.

Scaling everything to the same range is a start, but then you've still 
given them an implicit weighting.

An alternative is to figure out a way to non-dimensionalize your 
parameters -- that *may* give you a more physically based scaling.

And you might invent the Gavana Number in the process ;-)

-Chris







-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals

2010-03-29 Thread David Warde-Farley
Hi,

In my setup.py, I have
from numpy.distutils.misc_util import Configuration

fflags = '-fdefault-real-8 -ffixed-form'
config = Configuration(
    'foo',
    parent_package=None,
    top_path=None,
    f2py_options="--f77flags='%s' --f90flags='%s'" % (fflags, fflags)
)

However I am still getting stuff returned in 'real' variables as  
dtype=float32. Am I doing something wrong?

Thanks,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fourier transform

2010-03-29 Thread Charles R Harris
On Mon, Mar 29, 2010 at 3:00 PM, Pascal pascal...@parois.net wrote:

 Hi,

 Does anyone have an idea how the fft functions are implemented? Is it pure
 Python? Based on BLAS/LAPACK? Or is it using FFTW?

 I successfully used numpy.fft in 3D. I would like to know if I can
 calculate a specific plane using numpy.fft.

 I have in 3D:
 r(x, y, z) = \sum_{h=0}^{N-1} \sum_{k=0}^{M-1} \sum_{l=0}^{O-1} f_{hkl}
  \exp(-2\pi i (hx/N + ky/M + lz/O))

 So for the plane, z is no longer independent.
 I need to solve the system:
 ax + by + cz + d = 0
 r(x, y, z) = \sum_{h=0}^{N-1} \sum_{k=0}^{M-1} \sum_{l=0}^{O-1} f_{hkl}
  \exp(-2\pi i (hx/N + ky/M + lz/O))

 Do you think it's possible to use numpy.fft for this?


I'm not clear on what you want to do here, but note that the term in the
exponent is of the form <k, x>, i.e., the inner product of the vectors k
and x. So if you rotate x by O so that the plane is defined by z = 0, then
<k, Ox> = <O.T k, x>. That is, you can apply the transpose of the rotation to
the result of the fft.
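
A quick numerical check of that identity, with a made-up rotation O about
the z axis:

import numpy as np

theta = 0.3
O = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
k = np.array([1.0, 2.0, 3.0])
x = np.array([0.5, -1.0, 2.0])
# <k, O x> equals <O.T k, x>
print(np.allclose(np.dot(k, O.dot(x)), np.dot(O.T.dot(k), x)))  # True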

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Brennan Williams
Andrea Gavana wrote:
 Hi Chris and All,

 On 29 March 2010 22:35, Christopher Barker wrote:
   
 Andrea Gavana wrote:
 
 Scaling each axis by its standard deviation is a typical first start.
 Shifting and scaling the values such that they each go from 0 to 1 is
 another useful thing to try.
   
 Ah, magnifico! Thank you Robert and Friedrich, it seems to be working
 now...
 
 One other thought -- core to much engineering is dimensional analysis --
  you know how we like those non-dimensional numbers!

 I think this situation is less critical, as you are interpolating, not
 optimizing or something, but many interpolation methods are built on the
 idea of some data points being closer than others to your point of interest.

  Who is to say if a point that is 2 hours away is closer or farther than
 one 2 meters away? This is essentially what you are doing.

 Scaling everything to the same range is a start, but then you've still
 given them an implicit weighting.

  An alternative is to figure out a way to non-dimensionalize your
 parameters -- that *may* give you a more physically based scaling.

 And you might invent the Gavana Number in the process ;-)
 

 Might be :-D . At the moment I am pretty content with what I have got,
 it seems to be working fairly well although I didn't examine all the
 possible cases and it is very likely that my little tool will break
 disastrously for some combinations of parameters. However, I am not
 sure I am allowed to post an image comparing the real simulation
 with the prediction of the interpolated proxy model, but if you could
 see it, you would surely agree that it is a very reasonable approach.
  It seems too good to be true :-D .

 Again, this is mainly due to the fact that we have a very extensive
 set of simulations which cover a wide range of combinations of
 parameters, so the interpolation itself is only doing its job
 correctly. I don't think the technique can be applied blindly to
 whatever oil/gas/condensate /whatever reservoir, as non-linearities in
 fluid flow simulations appear where you least expect them, but it
 seems to be working ok (up to now) for our field (which is, by the
  way, one of the most complex and least understood condensate reservoirs
 out there).

   
And of course that proxy simulator only deals with the input variables 
that you decided on 1,000+ simulations ago.
All you need is for someone to suggest something else, like "how about 
gas injection?", and you're back to having to do
more real simulation runs (which is where a good experimental design 
comes in).

It would be interesting to know how well your proxy simulator compares 
to the real simulator for a combination of input variable values that
is a good distance outside your original parameter space.

Brennan

 Andrea.

 Imagination Is The Only Weapon In The War Against Reality.
 http://xoomer.alice.it/infinity77/

 == Never *EVER* use RemovalGroup for your house removal. You'll
 regret it forever.
 http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

   


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Andrea Gavana
On 29 March 2010 23:13, Brennan Williams wrote:
 Andrea Gavana wrote:
 Hi Chris and All,

 On 29 March 2010 22:35, Christopher Barker wrote:

 Andrea Gavana wrote:

 Scaling each axis by its standard deviation is a typical first start.
 Shifting and scaling the values such that they each go from 0 to 1 is
 another useful thing to try.

 Ah, magnifico! Thank you Robert and Friedrich, it seems to be working
 now...

 One other thought -- core to much engineering is dimensional analysis --
 you know how we like those non-dimensional numbers!

 I think this situation is less critical, as you are interpolating, not
 optimizing or something, but many interpolation methods are built on the
 idea of some data points being closer than others to your point of interest.

 Who is to say if a point that is 2 hours away is closer or farther than
 one 2 meters away? This is essentially what you are doing.

 Scaling everything to the same range is a start, but then you've still
 given them an implicit weighting.

 An alternative is to figure out a way to non-dimensionalize your
 parameters -- that *may* give you a more physically based scaling.

 And you might invent the Gavana Number in the process ;-)


 Might be :-D . At the moment I am pretty content with what I have got,
 it seems to be working fairly well although I didn't examine all the
 possible cases and it is very likely that my little tool will break
 disastrously for some combinations of parameters. However, I am not
 sure I am allowed to post an image comparing the real simulation
 with the prediction of the interpolated proxy model, but if you could
 see it, you would surely agree that it is a very reasonable approach.
 It seems too good to be true :-D .

 Again, this is mainly due to the fact that we have a very extensive
 set of simulations which cover a wide range of combinations of
 parameters, so the interpolation itself is only doing its job
 correctly. I don't think the technique can be applied blindly to
 whatever oil/gas/condensate /whatever reservoir, as non-linearities in
 fluid flow simulations appear where you least expect them, but it
 seems to be working ok (up to now) for our field (which is, by the
 way, one of the most complex and least understood condensate reservoirs
 out there).


 And of course that proxy simulator only deals with the input variables
 that you decided on 1,000+ simulations ago.

(1000 > x > 0 = now). Correct: the next phase of the development for
the field has some strict rules we are not allowed to break. The
parameters we chose for optimizing this development phase (4 years ago
up to now) are the same as the ones I am using for the interpolation.
There is no more room for other options.

 All you need is for someone to suggest something else like how about
 gas injection?

This is already accounted for.

 and you're back to having to do
 more real simulation runs (which is where a good experimental design
 comes in).

Possibly, but as I said we can't even think of doing something like ED
now. It's too computationally expensive for such a field. This is why
I had the idea of using the existing set of simulations, which are a
good ED by themselves (even if we didn't plan for it in advance).

 It would be interesting to know how well your proxy simulator compares
 to the real simulator for a combination of input variable values that
 is a good distance outside your original parameter space.

This is extremely unlikely to happen ever. As I said, we explored a
wide range of number/type of possible producers and injectors, plus a
fairly extended range of production/injection profiles. As we are
dealing with reality here, there are physical limits on what
facilities you can install on your fields to try and ramp up
production as much as you can (i.e., upper bounds for number of
producers/injectors/plateaus), and political limits on how low you can
go and still be economical (i.e., lower bounds for number of
producers/injectors/plateaus). It seems to me we're covering
everything except the most extreme cases.

Anyway, I am not here to convince anyone on the validity of the
approach: I am a practical person, this thing works reasonably well
and I am more than happy. Other than that, we are already going way
off topic for the Numpy list. Sorry about that.

If you (or anyone else) wishes to continue the discussion on Reservoir
Simulation, please feel free to contact me directly. For all other
interpolation suggestions, I am always open to new ideas. Thank you
again to the list for your help.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/

== Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Friedrich Romstedt
2010/3/29 Andrea Gavana andrea.gav...@gmail.com:
 If anyone is interested in a follow up, I have tried a time-based
 interpolation of my oil profile (and gas and gas injection profiles)
 using those 40 interpolators (and even more, up to 400, one every
 month of fluid flow simulation time step).

 I wasn't expecting too much out of it, but when the interpolated
 profiles came out (for different combinations of input parameters) I
 felt like being on the wrong side of the Lala River in the valley of
 Areyoukidding. The results are striking. I get an impressive agreement
 between this interpolated proxy model and the real simulations,
 whether I use existing combinations of parameters or new ones (i.e., I
 create the interpolation and then run the real fluid flow simulation,
 comparing the outcomes).

I'm reasoning about the implications of this observation for our
understanding of your interpolation.  As Christopher pointed out, it's
very important to know how many gas injection wells are to be
weighted the same as one year.

When you have nice results using 40 Rbfs for each time instant, this
procedure means that the values for one time instant will not be
influenced by adjacent-year data.  I.e., you would probably get the
same result using a norm that blows up the time coordinate extraordinarily.
 To make it clear in code, when the time is your first coordinate, and
you have three other coordinates, the *norm* would be:

def norm(x1, x2):
    return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum())

In this case, the epsilon should be fixed, to avoid the influence of
the changing distances on the epsilon determination inside of Rbf,
which would spoil the whole thing.
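
One equivalent way to get such a norm without touching Rbf internals is to
pre-scale the time coordinate itself before building the interpolator (a
sketch with made-up names and data; since the default norm is euclidean,
weighting a coordinate is the same as using the weighted norm above), while
fixing epsilon explicitly as suggested:

import numpy as np
from scipy.interpolate import Rbf

t = np.repeat(np.arange(5.0), 20)     # time coordinate (hypothetical)
p1, p2, p3 = np.random.rand(3, 100)   # three other parameters (hypothetical)
values = np.random.rand(100)          # simulated response (hypothetical)

# blow up the time axis by 1e3 and pin epsilon so the rescaled distances
# do not change its automatic determination
rbf = Rbf(t * 1e3, p1, p2, p3, values, epsilon=1.0)
print(rbf(2.0 * 1e3, 0.5, 0.5, 0.5))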

I have an idea how to tune your model:  Take, say, half or two thirds
of your simulation data as the interpolation database, and try to
reproduce the remaining part.  I have some ideas how to tune using
this in practice.

 As an aside, I got my colleagues reservoir engineers playfully
 complaining that it's time for them to pack their stuff and go home as
 this interpolator is doing all the job for us; obviously, this misses
 the point that it took 4 years to build such a comprehensive bunch of
 simulations which now allows us to somewhat predict a possible
 production profile in advance.

:-) :-)

 I wrapped everything up in a wxPython GUI with some Matplotlib graphs,
 and everyone seems happy.
Not only your colleagues!
 The only small complain I have is that I
 wasn't able to come up with a vector implementation of RBFs, so it can
 be pretty slow to build and interpolate 400 RBFs for each property (3
 of them).

Haven't you spoken about 40 Rbfs for the time alone??

Something completely different: Are you going to do more simulations?

Friedrich
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Applying formula to all in an array which has value from previous

2010-03-29 Thread Vishal Rana
Thanks Nadav!


On Mon, Mar 29, 2010 at 4:07 AM, Nadav Horesh nad...@visionsense.com wrote:

 The general guideline:

 Suppose the function definition is:

 def func(x, y):
     # x and y are scalars
     bla bla bla ...
     return z  # a scalar

 So,

 import numpy as np

 vecfun = np.vectorize(func)

 vecfun.ufunc.accumulate(np.array((0, 1, 2, 3, 4, 5, 6, 7, 8, 9)))


   Nadav.


 -Original Message-
 From: numpy-discussion-boun...@scipy.org on behalf of Vishal Rana
 Sent: Sun 28-Mar-10 21:19
 To: Discussion of Numerical Python
 Subject: [Numpy-discussion] Applying formula to all in an array which
 has value from previous

 Hi,

 For a numpy array:

 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

 I do some calculation with 0, 1... and get a value = 2.5; now use this
 value
 to repeat the same calculation with the next element, for example...
 2.5, 2 and get a value = 3.1
 3.1, 3 and get a value = 4.2
 4.2, 4 and get a value = 5.1
 ...
 ... and get a value = 8.5
 8.5, 9 and get a value = 9.8

 So I should be getting a new array like array([0, 2.5, 3.1, 4.2, 5.1, ...,
 8.5, 9.8])

 Is it where numpy or scipy can help?

 Thanks
 Vishal


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Andrea Gavana
HI Friedrich  All,

On 29 March 2010 23:44, Friedrich Romstedt wrote:
 2010/3/29 Andrea Gavana andrea.gav...@gmail.com:
 If anyone is interested in a follow up, I have tried a time-based
 interpolation of my oil profile (and gas and gas injection profiles)
 using those 40 interpolators (and even more, up to 400, one every
 month of fluid flow simulation time step).

 I wasn't expecting too much out of it, but when the interpolated
 profiles came out (for different combinations of input parameters) I
 felt like being on the wrong side of the Lala River in the valley of
 Areyoukidding. The results are striking. I get an impressive agreement
 between this interpolated proxy model and the real simulations,
 whether I use existing combinations of parameters or new ones (i.e., I
 create the interpolation and then run the real fluid flow simulation,
 comparing the outcomes).

 I'm reasoning about the implications of this observation to our
 understanding of your interpolation.  As Christopher pointed out, it's
 very important to know how many gas injections wells are to be
 weighted the same as one year.

 When you have nice results using 40 Rbfs for each time instant, this
 procedure means that the values for one time instant will not be
 influenced by adjacent-year data.  I.e., you would probably get the
 same result using a norm extraordinary blowing up the time coordinate.
  To make it clear in code, when the time is your first coordinate, and
 you have three other coordinates, the *norm* would be:

 def norm(x1, x2):
     return numpy.sqrt((((x1 - x2) * [1e3, 1, 1, 1]) ** 2).sum())

 In this case, the epsilon should be fixed, to avoid the influence of
 the changing distances on the epsilon determination inside of Rbf,
 which would spoil the whole thing.
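
A concrete sketch of that norm-plus-fixed-epsilon idea, assuming the time
coordinate comes first, three further parameters, a 1e3 weight on time and
hypothetical array names; scipy.interpolate.Rbf accepts a callable norm and
a fixed epsilon, but check the version you have:

import numpy as np
from scipy.interpolate import Rbf

# Blow up the time axis so samples from different time instants barely
# interact; the 1e3 factor is illustrative, not a value from the thread.
weights = np.array([1e3, 1.0, 1.0, 1.0])

def weighted_norm(x1, x2):
    # Rbf calls the norm with broadcastable arrays of shape (n_dims, N, 1)
    # and (n_dims, 1, M); sum the squared weighted differences over axis 0.
    w = weights[:, np.newaxis, np.newaxis]
    return np.sqrt((((x1 - x2) * w) ** 2).sum(axis=0))

# t, p1, p2, p3 and prod would be the simulation coordinates and response:
# rbf = Rbf(t, p1, p2, p3, prod, norm=weighted_norm, epsilon=1.0)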

 I have an idea how to tune your model:  Take, say, half or two
 thirds of your simulation data as the interpolation database, and try to
 reproduce the remaining part.  I have some ideas about how to use
 this for tuning in practice.

This is a very good idea indeed: I am actually running out of test
cases (it takes a while to run a simulation, and I need to do it every
time I try a new combination of parameters to check if the
interpolation is good enough or rubbish). I'll give it a go tomorrow
at work and I'll report back (even if I get very bad results :-D ).
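
A rough sketch of that hold-out check, with made-up arrays (the names
params/prod, the 40 samples and the 'thin_plate' kernel are assumptions,
not values from the thread):

import numpy as np
from scipy.interpolate import Rbf

# Hypothetical data: one row of 'params' per simulation (4 input
# parameters), 'prod' is the simulated production at one time step.
rng = np.random.RandomState(0)
params = rng.uniform(size=(40, 4))
prod = params.sum(axis=1) + 0.1 * rng.normal(size=40)

# Hold out every fourth simulation, fit the RBF on the rest.
test = np.arange(len(prod)) % 4 == 0
train = ~test

x1, x2, x3, x4 = params[train].T
rbf = Rbf(x1, x2, x3, x4, prod[train], function='thin_plate')
predicted = rbf(*params[test].T)

# Root-mean-square error on the held-out simulations.
rmse = np.sqrt(np.mean((predicted - prod[test]) ** 2))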

 As an aside, I got my reservoir-engineer colleagues playfully
 complaining that it's time for them to pack their stuff and go home, as
 this interpolator is doing all the work for us; obviously, this misses
 the point that it took 4 years to build such a comprehensive set of
 simulations, which now allows us to somewhat predict a possible
 production profile in advance.

 :-) :-)

 I wrapped everything up in a wxPython GUI with some Matplotlib graphs,
 and everyone seems happy.
 Not only your colleagues!
 The only small complaint I have is that I
 wasn't able to come up with a vector implementation of RBFs, so it can
 be pretty slow to build and interpolate 400 RBFs for each property (3
 of them).

 Didn't you speak about 40 Rbfs for the time alone?

Yes, sorry about the confusion: depending on which time step I
choose for comparing the interpolation with the real simulation, I can
have 40 RBFs (one per year of simulation) or more than 400 (one per
month of simulation; not all the monthly data are available for all
the simulations I have).

 Something completely different: Are you going to do more simulations?

110% surely, undeniably yes. The little interpolation tool I have is
just a proof-of-concept and a little helper for us to get an initial
grasp of what the production profiles might look like before actually
running the real simulation. Something like a toy to play with (if you
can call working on a reservoir simulation playing...). There
is no possible substitute for the reservoir simulator itself.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/

== Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html ==
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Interpolation question

2010-03-29 Thread Robert Kern
On Mon, Mar 29, 2010 at 17:57, Andrea Gavana andrea.gav...@gmail.com wrote:
 HI Friedrich  All,

 On 29 March 2010 23:44, Friedrich Romstedt wrote:

 Something completely different: Are you going to do more simulations?

 110% surely, undeniably yes. The little interpolation tool I have is
 just a proof-of-concept and a little helper for us to get an initial
 grasp of what the production profiles might look like before actually
 running the real simulation. Something like a toy to play with (if you
 can call working on a reservoir simulation playing...). There
 is no possible substitute for the reservoir simulator itself.

One thing you might want to do is to investigate using Gaussian
processes instead of RBFs. They are closely related (and I even think
the 'gaussian' RBF corresponds to what you get from a
particularly-constructed GP), but you can include uncertainty
estimates of your data and get an estimate of the uncertainty of the
interpolant. GPs are also very closely related to kriging, which you
may also be familiar with. If you are going to be running more
simulations, GPs can tell you what new inputs you should simulate to
reduce your uncertainty the most.
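
To make the mean-plus-uncertainty idea concrete, here is a bare-bones,
one-dimensional GP regression sketch in plain numpy (the kernel, data and
length scale are all made up; a real problem would use a proper GP library
such as the PyMC code below):

import numpy as np

def kernel(a, b, length=1.0):
    # squared-exponential ('gaussian') covariance between two sets of inputs
    d = a[:, np.newaxis] - b[np.newaxis, :]
    return np.exp(-0.5 * (d / length) ** 2)

x_train = np.array([0.0, 1.0, 3.0, 4.0])   # inputs already simulated
y_train = np.sin(x_train)                   # simulated responses
x_new = np.linspace(0.0, 5.0, 51)           # candidate inputs

K = kernel(x_train, x_train) + 1e-8 * np.eye(len(x_train))
K_s = kernel(x_train, x_new)

alpha = np.linalg.solve(K, y_train)
mean = np.dot(K_s.T, alpha)                 # GP posterior mean
cov = kernel(x_new, x_new) - np.dot(K_s.T, np.linalg.solve(K, K_s))
stddev = np.sqrt(np.maximum(np.diag(cov), 0.0))   # pointwise uncertainty

# The candidate with the largest predictive uncertainty is a natural
# choice for the next simulation to run.
next_input = x_new[np.argmax(stddev)]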

PyMC has some GP code:

  http://pymc.googlecode.com/files/GPUserGuide.pdf

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy

2010-03-29 Thread David Cournapeau
Pauli Virtanen wrote:
 Mon, 2010-03-29 at 19:13 +0900, David Cournapeau wrote:
 I have worked on porting scipy to py3k, and it is mostly working. One
 thing which would be useful is to install something similar to
 npy_3kcompat.h in numpy, so that every scipy extension could share the
 compat header. Is the current python 3 compatibility header usable in
 the wild, or will it still significantly change (this requiring a
 different, more stable one) ?
 
 I believe it's reasonably stable, as it contains mostly simple stuff.
 Something perhaps can be added later, but I don't think anything will
 need to be removed.

Ok. If the C capsule stuff is still in flux, it may just be removed from 
the public header - I don't think we will need it anywhere in scipy.

 At least, I don't see anything I would like to change there. The only things
 I would perhaps not like to keep in the long run are the PyString and
 possibly PyInt redefinition macros.

I would also prefer a new name instead of a macro redefinition, but I can
do it. The header would not be part of the public API proper anyway (I
will put it somewhere other than numpy/include); it should never be
pulled in implicitly when including one of the .h files in numpy/include.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dealing with roundoff error

2010-03-29 Thread Mike Sarahan
Thank you all for your suggestions.  I ended up multiplying by 10 and
rounding, while casting the array to an int.  Certainly not the most
universal solution, but it worked for my data.

code, for anyone searching for examples:
np.array(np.round((hlspec[:,0]-offset)*10),dtype=np.int)
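
For readers without the surrounding data, a self-contained illustration
with made-up numbers (the 0.1 spacing and the offset are assumptions, not
Mike's actual data):

import numpy as np

x = np.array([0.1, 0.2, 0.30000000000000004, 0.4])   # 0.1-spaced, with roundoff
offset = 0.1
idx = np.array(np.round((x - offset) * 10), dtype=int)
# idx -> array([0, 1, 2, 3]); exact integers are safe to align and compare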

-Mike

On Sun, Mar 28, 2010 at 12:44 PM, Friedrich Romstedt
friedrichromst...@gmail.com wrote:
 2010/3/28 Mike Sarahan msara...@gmail.com:
 I have run into some roundoff problems trying to line up some
 experimental spectra.  The x coordinates are given in intervals of 0.1
 units.  I read the data in from a text file using np.loadtxt().

 I don't know your problem well enough, so the suggestion to use
 numpy.interp() may be no more than a shot in the dark.

 http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html#numpy.interp

 Friedrich
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion