Re: [Numpy-discussion] something wrong with docs?

2009-09-22 Thread David Goldsmith
On Tue, Sep 22, 2009 at 9:29 PM, Fernando Perez wrote:

> On Tue, Sep 22, 2009 at 7:31 PM, David Goldsmith wrote:
> > is there a "standard" for these ala the docstring standard, or some other
> > extant way to promulgate and "strengthen" your "suggestion" (after proper
> > community vetting, of course);
>
> I'm not sure what you mean here, sorry.  I simply don't understand
> what you are looking to "strengthen" or what standard there could be:
> this is regular code that goes into reST blocks.  Sorry if I missed
> your point...
>

"It would be nice if we could move gradually
towards docs whose examples (at least those marked as such) were
always run via sphinx."

That's a "suggestion," but given your point, it seems like you'd advocate it
being more than that, no?

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The problem with arrays

2009-09-22 Thread yogesh karpate
Dear Fabrice
 Finally your suggestions worked :). Thanks a lot...
Soon the code I'm working on will be available as part of the Free
Software Foundation.
Regards
Yogesh

On Tue, Sep 22, 2009 at 11:23 PM, Fabrice Silva wrote:

> Le mardi 22 septembre 2009 à 23:00 +0530, yogesh karpate a écrit :
>
> > This is the main thing. When I try to store it in an array like
> > R_time=array([R_t[0][i]]), it just stores the final value in that
> > array when the loop ends. I can't get out of this for loop. I really
> > have this small problem. I really need help on this, guys.
> >
> > for i in range(a1):
> >     data_temp = bpf[left[0][i]:right[0][i]]  # left is an array and right is also an array
> >     maxloc = data_temp.argmax()    # taking indices of max. value of data segment
> >     maxval = data_temp[maxloc]
> >     minloc = data_temp.argmin()
> >     minval = data_temp[minloc]
> >     maxloc = maxloc - 1 + left    # add offset of present location
> >     minloc = minloc - 1 + left    # add offset of present location
> >     R_index = maxloc
> >     R_t = t[maxloc]
> >     R_amp = array([maxval])
> >     S_amp = minval    # %%% Assuming the S-wave is the lowest
> >                       # %%% amp in the given window
> >     # S_t = t[minloc]
> >     R_time = array([R_t[0][i]])
> >     plt.plot(R_time, R_amp, 'go')
> >     plt.show()
>
> Two options :
> - you define an empty list before the loop
>>>> R_time = []
>  and you append the computed value while looping
>>>> for i:
>>>> ...
>>>> R_time.append(t[maxloc])
>
> - or you define a preallocated array before the loop
>>>> R_time = np.empty(a1)
>  and fill it with the computed values
>>>> for i:
>>>> ...
>>>> R_time[i] = t[maxloc]
>
>
> Same thing with R_amp. After looping, whatever the solution you choose,
> you can plot  the whole set of (time, value) tuples
>>>> plt.plot(R_time, R_amp)
>
> --
> Fabrice Silva 
> LMA UPR CNRS 7051
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] something wrong with docs?

2009-09-22 Thread Fernando Perez
On Tue, Sep 22, 2009 at 7:31 PM, David Goldsmith
 wrote:
> Later in this thread, Fernando, you make a good case - scalability - for
> this, which, as someone who's been using only >>>, raises a number of
> questions in my mind: 0) this isn't applicable to docstrings, only to
> numpy-docs (i.e., the .rst files),

Yes, and to me this naturally divides things: examples in docstrings
should be compact enough that they fit comfortably in a >>> style.  If
they need an entire page of code, they probably should be in the main
docs but not in the docstring.  So I like very much the sphinx doctest
code blocks for longer examples interspersed with text, and the >>>
style for short ones that are a good fit for a docstring.

> correct; 1) assuming the answer is "yes,"
> is there a "standard" for these ala the docstring standard, or some other
> extant way to promulgate and "strengthen" your "suggestion" (after proper
> community vetting, of course);

I'm not sure what you mean here, sorry.  I simply don't understand
what you are looking to "strengthen" or what standard there could be:
this is regular code that goes into reST blocks.  Sorry if I missed
your point...

> 2) for those of us new to this approach, is
> there a "standard example" somewhere we can easily reference?

Yes, the sphinx docs have a page about the directive, including a brief example:

http://sphinx.pocoo.org/ext/doctest.html
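
For quick reference, a minimal sketch of what those directives look like in
a .rst file (the testoutput block is checked against what the testcode block
prints when the extension is enabled):

.. testcode::

   import numpy as np
   print(np.arange(3))

.. testoutput::

   [0 1 2]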

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help]

2009-09-22 Thread David Goldsmith
On Tue, Sep 22, 2009 at 4:02 PM, Ralf Gommers
wrote:

>
> On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom wrote:
>
> Trac has these bugs.  Any others?
>>
>> http://projects.scipy.org/numpy/ticket/1199
>> http://projects.scipy.org/numpy/ticket/1200
>> http://projects.scipy.org/numpy/ticket/856
>> http://projects.scipy.org/numpy/ticket/855
>> http://projects.scipy.org/numpy/ticket/1231
>>
>
> This one:
> http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray
>
> Cheers,
> Ralf
>

That last one never got "promoted" to a ticket?

DG
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-22 Thread David Goldsmith
On Tue, Sep 22, 2009 at 3:14 PM, Citi, Luca  wrote:

> My vote (if I am entitled to) goes to "change the code".
>
> Whether or not the addressee of .base is an array, it should be "the object
> that has to be kept alive such that the data does not get deallocated"
> rather than "one object which will keep alive another object, which will
> keep alive another object, ..., which will keep alive the object with the
> data". On creation of a new view B of object A, if A has OWNDATA true then
> B.base = A, else B.base = A.base.
>
> When working on
> http://projects.scipy.org/numpy/ticket/1085
> I had to walk the chain of bases to establish whether any of the inputs and
> the outputs were views of the same data.
> If "base" were the ultimate base, one would only need to check whether any
> of the inputs have the same base as any of the outputs.
>
> I tried to modify the code to change the behaviour.
> I have opened a ticket for this
> http://projects.scipy.org/numpy/ticket/1232
> and attached a patch but I am not 100% sure.
> I changed PyArray_View in convert.c and a few places in mapping.c and
> sequence.c.
>
> But if there is any reason why the current behaviour should be kept, just
> ignore the ticket.
>

You don't mean that literally, right?  A ticket can't just be ignored: it
can be changed to "will not fix," with, hopefully, a good explanation as to
why, but it has to be resolved and closed in some fashion, not just ignored,
or someone somewhere down the line will try to address it substantively. :-)

In any event, I think we need a few more "heavyweights" to weigh in on this
before code is changed: Robert? Charles? Travis?  Anyone?  Anyone wanna
"block"?

DG


>
> Luca
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] something wrong with docs?

2009-09-22 Thread David Goldsmith
On Mon, Sep 21, 2009 at 6:49 PM, Fernando Perez wrote:

> On Mon, Sep 21, 2009 at 11:32 AM, Pauli Virtanen  wrote:
> > The `sphinx.ext.doctest` extension is not enabled, so the testcode::
> > etc. directives are not available. I'm not sure if it should be enabled
> > -- it would be cleaner to just replace the testcode:: stuff with the
> > ordinary example markup.
> >
>
> Why not enable it?  It would be nice if we could move gradually
> towards docs whose examples (at least those marked as such) were
> always run via sphinx.  The more we do this, the higher the chances of
>

Later in this thread, Fernando, you make a good case - scalability - for
this, which, as someone who's been using only >>>, raises a number of
questions in my mind: 0) this isn't applicable to docstrings, only to
numpy-docs (i.e., the .rst files), correct; 1) assuming the answer is "yes,"
is there a "standard" for these ala the docstring standard, or some other
extant way to promulgate and "strengthen" your "suggestion" (after proper
community vetting, of course); 2) for those of us new to this approach, is
there a "standard example" somewhere we can easily reference?  Thanks!

DG

> non-zero overlap between documentation and reality :)
>
> Cheers,
>
> f
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fancy indexing for

2009-09-22 Thread Daran Rife
Hi Robert,

This solution works beautifully! Thanks for sending it
along. I need to learn and understand more about fancy
indexing for multi-dimensional arrays, especially your
clever trick of np.newaxis for broadcasting.

Daran

--


> Hello list,
>
> This didn't seem to get through last time round, and my
> first version was poorly written.
>
> I have a rather pedestrian question about fancy indexing
> for multi-dimensional arrays.
>
> Suppose I have two 3-D arrays, one named "A" and the other "B",
> where both arrays have identical dimensions of time, longitude,
> and latitude. I wish to use data from A to conditionally select
> values from array B. Specifically, I first find the time where
> the values at each point in A are at their maximum. This is
> accomplished with:
>
>  >>> tmax_idx = np.argsort(A, axis=0)
>
> I now wish to use this tmax_idx array to conditionally select
> the values from B. In essence, I want to pick values from B for
> times where the values at A are at their max. Can this be done
> with fancy indexing? Or is there a smarter way to do this? I've
> certainly done this sort of selection before, but the index
> selection array is 1D. I've carefully studied the excellent
> indexing documentation and examples on-line, but can't sort out
> whether what I want to do is even possible, without doing the
> brute force looping method, similar to:
>
> max_B = np.zeros((nlon, nlat), dtype=np.float32)
>
> for i in xrange(nlon):
>    for j in xrange(nlat):
>        max_B[i,j] = B[tmax_idx[i,j],i,j]

All of the index arrays need to be broadcastable to the same shape.
Thus, you want the "i" index array to be a column vector.

max_B = B[tmax_idx, np.arange(nlon)[:,np.newaxis], np.arange(nlat)]

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help]

2009-09-22 Thread Ralf Gommers
On Tue, Sep 22, 2009 at 1:58 PM, Michael Droettboom  wrote:

> Sorry to resurrect a long-dead thread, but I've been continuing Chris
> Hanley's investigation of chararray at Space Telescope Science Institute
> (and the broader astronomical community) for a while and have some
> findings to report back.
>

Thank you for the thorough investigation, it seems clear to me now that
chararray does have a purpose and is in good hands.

>
>
> Now to address the concerns iterated in this thread.  Unfortunately, I
> don't know where this thread began before it landed on the Numpy list,
> so I may be missing details which would help me address them.
>

The discussion began on the scipy list, but I think you addressed most
concerns in enough detail.

>
> > 0) "it gets very little use" (an assumption you presumably dispute);
> >
> Certainly not true from where I stand.
> > 1) "is pretty much undocumented" (less true than a week ago, but still
> true for several of the attributes, with another handful or so falling into
> the category of "poorly documented");
> >
> I don't quite understand this one -- 99% of the methods are wrappers
> around standard Python string methods.  I don't think we should
> redocument those.  I agree it needs a better top level docstring about
> its purpose (see functionalities (1) and (2) above) and its status (for
> backward compatibility).
>

Well, then the docstrings should say that. It can be a 4-line standard
template that refers to the stdlib docs, but it would be nice to get
something other than this in ipython:

>>> charar = np.chararray(2)
>>> charar.ljust?

Docstring:




> > 2) "probably more buggy than most other parts of NumPy" ("probably" being
> a euphemism, IMO);
> >
> Trac has these bugs.  Any others?
>
> http://projects.scipy.org/numpy/ticket/1199
> http://projects.scipy.org/numpy/ticket/1200
> http://projects.scipy.org/numpy/ticket/856
> http://projects.scipy.org/numpy/ticket/855
> http://projects.scipy.org/numpy/ticket/1231
>

This one:
http://article.gmane.org/gmane.comp.python.numeric.general/23638/match=chararray

Cheers,
Ralf
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-22 Thread Citi, Luca
My vote (if I am entitled to) goes to "change the code".
Whether or not the addressee of .base is an array, it should be "the object
that has to be kept alive such that the data does not get deallocated" rather
than "one object which will keep alive another object, which will keep alive
another object, ..., which will keep alive the object with the data".
On creation of a new view B of object A, if A has OWNDATA true then B.base = A,
else B.base = A.base.

When working on
http://projects.scipy.org/numpy/ticket/1085
I had to walk the chain of bases to establish whether any of the inputs and the 
outputs were views of the same data.
If "base" were the ultimate base, one would only need to check whether any of 
the inputs have the same base as any of the outputs.
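
(For illustration, the chain walk looks something like this; ultimate_base is
a hypothetical helper, not code from the ticket's patch:)

import numpy as np

def ultimate_base(a):
    # follow the .base chain until we reach the object that owns its data
    while isinstance(a, np.ndarray) and a.base is not None:
        a = a.base
    return a

x = np.arange(10)
y = x[2:8]    # view of x
z = y[::2]    # view of a view
# with chained bases, finding shared data means walking the chain;
# if .base always pointed at the ultimate owner, one comparison would do
print(ultimate_base(z) is x)   # True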

I tried to modify the code to change the behaviour.
I have opened a ticket for this http://projects.scipy.org/numpy/ticket/1232
and attached a patch but I am not 100% sure.
I changed PyArray_View in convert.c and a few places in mapping.c and 
sequence.c.

But if there is any reason why the current behaviour should be kept, just 
ignore the ticket.

Luca
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help]

2009-09-22 Thread David Goldsmith
Michael:

First, thank you very much for your detailed and thorough analysis and recap
of the situation - it sounds to me like chararray is now in good hands! :-)

On Tue, Sep 22, 2009 at 10:58 AM, Michael Droettboom wrote:

> Sorry to resurrect a long-dead thread, but I've been continuing Chris
>

IMO, no apology necessary!


> Hanley's investigation of chararray at Space Telescope Science Institute
> (and the broader astronomical community) for a while and have some
> findings to report back.
>
> What I've taken from this thread is that chararray is in need of a
> maintainer.  I am able to devote some time to the cause, but first would
>

Yes, thank you!


> like to clarify what it will take to make its continued inclusion more
> comfortable.
>
> Let me start with the use case.  chararrays are extensively returned
> from pyfits (a tool to handle the standard astronomy data format).
> pyfits is the basis of many applications, and it would be impossible to
> audit all of that code.  Most authors of those tools do not track
> numpy-discussion closely, which is why we don't hear from them on this
> list, but there is a great deal of pyfits-using code.
>
> Doing some spot-checking on this code, a common thing I see is SQL-like
> queries on recarrays of objects.  For instance, it is very common to
> have a table of objects, with a "Target" column which is a string, and
> do something like (where c is a chararray of the 'Target' column):
>
>   subset = array[np.where(c.startswith('NGC'))]
>
> Strictly speaking, this is a use case for "vectorized string
> operations", not necessarily for the chararray class as it presently
> stands.  One could almost as easily do:
>
>   subset = array[np.where([x.startswith('NGC') for x in c])]
>
> ...and the latter is even slightly faster, since chararray currently
> loops in Python anyway.
>
> Even better, though, I have some experimental code to perform the loop
> in C, and I get 5x speed up on a table with ~120,000 rows.  If that were
> to be included in numpy, that's a strong argument against recommending
> list comprehensions in user code.  The use case suggests the continued
> existence of vectorized string operations in numpy -- whether that
> continues to be chararray, or some newer/better interface + chararray
> for backward compatibility, is an open question.  Personally I think a
> less object-oriented approach and just having a namespace full of
> vectorized string functions might be cleaner than the current situation
> of needing to create a view class around an ndarray.  I'm suggesting
> something like the following, using the same example, where {STR} is
> some namespace we would fill with vectorized string operations:
>
>   subset = array[np.where(np.{STR}.startswith(c, 'NGC'))]
>
> Now on to chararray as it now stands.  I view chararray as really two
> separable pieces of functionality:
>
>   1) Convenience to perform vectorized string operations using
> '.method' syntax, or in some cases infix operators (+, *)
>   2) Implicit "rstrip"ping of values
>
> (Note that raw ndarrays truncate values at the first NULL character,
> like C strings, but chararrays will strip any and all whitespace
> characters from the end).
>
> Changing (2) just seems to be asking to be the source of subtle bugs.
> Unfortunately, there's an inconsistency between 1) and 2) in the present
> implementation.  For example:
>
> In [9]: a = np.char.array(['a  '])
>
> In [10]: a
> Out[10]: chararray(['a'], dtype='|S3')
>
> In [11]: a[0] == 'a'
> Out[11]: True
>
> In [12]: a.endswith('a')
> Out[12]: array([False], dtype=bool)
>
> This is *the* design wart of chararray, IMHO, and one that's difficult
> to fix without breaking compatibility.  It might be a worthwhile
> experiment to remove (2) and see how much we really break, but it would
> be impossible to know for sure.
>
> Now to address the concerns iterated in this thread.  Unfortunately, I
> don't know where this thread began before it landed on the Numpy list,
> so I may be missing details which would help me address them.
>
> > 0) "it gets very little use" (an assumption you presumably dispute);
> >
> Certainly not true from where I stand.
>

I'm convinced.


> > 1) "is pretty much undocumented" (less true than a week ago, but still
> true for several of the attributes, with another handful or so falling into
> the category of "poorly documented");
> >
> I don't quite understand this one -- 99% of the methods are wrappers

around standard Python string methods.  I don't think we should
> redocument those.  I agree it needs a better top level docstring about
>

OK, that's what I needed to hear (that I don't believe anyone stated
explicitly before - I'm sure I'll be corrected if I'm wrong): in that case,
finishing these off is as simple as stating that in the functions'
docstrings (albeit in a way compliant w/ the numpy docstring standard, of
course; see below).




> > 6) it is, on its face, "counter to the spirit" of NumPy.
> >
> I don'

Re: [Numpy-discussion] is ndarray.base the closest base or the ultimate base?

2009-09-22 Thread David Goldsmith
So, what's the "bottom-line" of this thread: does the doc need to be
changed, or the code?

DG

2009/9/21 Hans Meine 

> Hi!
>
> On Monday 21 September 2009 12:31:27 Citi, Luca wrote:
> > I think you do not need to do the  chain up walk on view creation.
> > If the assumption is that base is the ultimate base, on view creation
> > you can do something like (pseudo-code):
> > view.base = parent if parent.owndata else parent.base
>
> Hmm.  My impression was that .base was for refcounting purposes *only*.
>  Thus,
> it is not even guaranteed that the attribute value is an array(-like)
> object.
>
> For example, I might want to allow direct access to some internal buffers
> of
> an object of mine in an extension module; then, I'd use .base to bind the
> lifetime of my object to the array (the lifetime of which I cannot control
> anymore).
>
> Ciao,
>   Hans
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] merging docs from wiki

2009-09-22 Thread David Goldsmith
On Sun, Sep 20, 2009 at 12:49 PM, Ralf Gommers
wrote:

> Hi,
>
> I'm done reviewing all the improved docstrings for NumPy, they can be
> merged now from the doc editor Patch page. Maybe I'll get around to doing
> the SciPy ones as well this week, but I can't promise that.
>

Thank you very much, Ralf!

> There are a few docstrings on the Patch page I did not mark "Ok to apply":
>
> 1. the generic docstrings. Some are marked Ready for review, but they refer
> mostly to "self" and to "generic" which I don't think is very helpful. It
> would be great if someone could do just one of those docstrings and make it
> somewhat informative.


I couldn't agree more; unfortunately, I've been trying, highly
unsuccessfully, to get someone to do this for a while - I promoted them in
the hopes that a reviewer saying they needed an expert's eye would be more
authoritative - I guess we're about to find out if that's the case. ;-)

DG


> There are about 50 that can then be done in the same way.
>

> 2. get_numpy_include: the docstring is deleted because the function is
> deprecated. I don't think that is helpful but I'm not sure. Should this be
> reverted or applied?
>
> Cheers,
> Ralf
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy macosx10.5 binaries: compatible with 10.4?

2009-09-22 Thread Christopher Barker
Russell E. Owen wrote:
> All the official numpy 1.3.0 Mac binaries are labelled "macosx10.5". 
> Does anyone know if these are backwards compatible with MacOS X 10.4

I'm pretty sure they are.

> 10.3.9?

not so sure, but worth a try.

I've posted bug reports about the naming scheme, but haven't stepped up 
to fix it, so what can you do?

numpy's pretty easy to build on OS-X, too. At least it was the last time 
I tried it!

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] polynomial ring dtype

2009-09-22 Thread Sebastian Walter
Sorry if this is a duplicate; it seems that my last mail got lost...
Is there something to take care of when sending a mail to the numpy
mailing list?

On Tue, Sep 22, 2009 at 9:42 AM, Sebastian Walter
 wrote:
> This is somewhat similar to the question about fixed-point arithmetic
> earlier on this mailing list.
>
> I need to do computations on arrays whose elements are truncated polynomials.
> At the momement, I have implemented the univariate truncated
> polynomials as objects of a class UTPS.
>
> The class basically looks like this:
>
> class UTPS:
>     def __init__(self, taylor_coeffs):
>         """ polynomial x(t) = tc[0] + tc[1] t + tc[2] t^2 + tc[3] t^3 + ... """
>         self.tc = numpy.asarray(taylor_coeffs)
>
>     def __add__(self, rhs):
>         return UTPS(self.tc + rhs.tc)
>
>     def sin(self):
>         # numpy.sin(self) apparently automatically calls self.sin(),
>         # which is very cool
>
>     etc
>
> One can create arrays of UTPS instances  like this:
> x = numpy.array( [[UTPS([1,2]), UTPS([3,4])], [UTPS([0,1]), UTPS([4,3])]])
>
> and perform funcs and ufuncs on it
>
> y = numpy.sum(x)
> y = numy.sin(x)
> y = numpy.dot(numpy.eye(2), x)
>
> This works out of the box, which is very nice.
>
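For concreteness, a minimal self-contained first-order sketch of the idea (a
toy variant, not the actual algopy class; mixed float/UTPS arithmetic and
higher-order terms are omitted), showing how numpy's object-array ufuncs
dispatch to the elements' methods:

import numpy

class UTPS:
    """First-order truncated polynomial x(t) = tc[0] + tc[1]*t."""
    def __init__(self, taylor_coeffs):
        self.tc = numpy.asarray(taylor_coeffs, dtype=float)

    def __add__(self, rhs):
        return UTPS(self.tc + rhs.tc)

    def __mul__(self, rhs):
        # (a0 + a1 t)(b0 + b1 t) = a0*b0 + (a0*b1 + a1*b0) t + O(t^2)
        a, b = self.tc, rhs.tc
        return UTPS([a[0]*b[0], a[0]*b[1] + a[1]*b[0]])

    def sin(self):
        # chain rule, truncated at first order: sin(x0) + cos(x0)*x1*t
        return UTPS([numpy.sin(self.tc[0]), numpy.cos(self.tc[0])*self.tc[1]])

    def __repr__(self):
        return "UTPS(%s)" % self.tc

x = numpy.array([UTPS([1, 2]), UTPS([3, 4])])
print(numpy.sin(x))   # dispatches to each element's .sin() method
print(numpy.sum(x))   # reduces with __add__
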
> my question:
> Is it possible to speed up the computation by defining a special dtype
> for truncated polynomials? Especially when the arrays get large,
> computing on arrays of objects is quite slow. I had a look at the
> numpy svn trunk but couldn't find any clues.
>
> If you are interested, you can have a look at the full pre alpha
> version code (BSD licence) at http://github.com/b45ch1/algopy .
>
> regards,
> Sebastian
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy and cython in pure python mode

2009-09-22 Thread Robert Kern
On Tue, Sep 22, 2009 at 01:33, Sebastian Haase  wrote:
> Hi,
> I'm not subscribed to the cython list - hoping enough people would
> care to justify my post here:
>
> I know that cython's numpy is still getting better and better over
> time, but is it already today possible to have numpy support when
> using Cython in "pure python" mode?
> I like the idea of being able to develop and debug code "the python
> way" -- and then just switching on the cython-overdrive mode.
> (Otherwise I have very good experience using C/C++ with appropriate
> typemaps, and I don't mind the C syntax)
>
> I only recently learned about the "pure python" mode on the sympy list
> (and at the EuroScipy2009 workshop).
> My understanding is that Cython's pure Python mode could be "played"
> in two ways:
> a) either not having a .pyx-file at all and putting everything into a
> py-file (using the "import cython" stuff)
> or b) putting only cython specific declaration in to a pyx file having
> the same basename as the py-file next to it.

I'm pretty sure that you need Cython syntax that is not supported by
the pure-Python mode in order to use numpy arrays effectively.

> One more: there is no way on reload cython-modules (yet),  right ?

Correct. There is no way to reload any extension module.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fancy indexing for

2009-09-22 Thread Robert Kern
On Tue, Sep 22, 2009 at 12:16, Daran Rife  wrote:
> Hello list,
>
> This didn't seem to get through last time round, and my
> first version was poorly written.
>
> I have a rather pedestrian question about fancy indexing
> for multi-dimensional arrays.
>
> Suppose I have two 3-D arrays, one named "A" and the other "B",
> where both arrays have identical dimensions of time, longitude,
> and latitude. I wish to use data from A to conditionally select
> values from array B. Specifically, I first find the time where
> the values at each point in A are at their maximum. This is
> accomplished with:
>
>  >>> tmax_idx = np.argsort(A, axis=0)
>
> I now wish to use this tmax_idx array to conditionally select
> the values from B. In essence, I want to pick values from B for
> times where the values at A are at their max. Can this be done
> with fancy indexing? Or is there a smarter way to do this? I've
> certainly done this sort of selection before, but the index
> selection array is 1D. I've carefully studied the excellent
> indexing documentation and examples on-line, but can't sort out
> whether what I want to do is even possible, without doing the
> brute force looping method, similar to:
>
> max_B = np.zeros((nlon, nlat), dtype=np.float32)
>
> for i in xrange(nlon):
>    for j in xrange(nlat):
>        max_B[i,j] = B[tmax_idx[i,j],i,j]

All of the index arrays need to be broadcastable to the same shape.
Thus, you want the "i" index array to be a column vector.

max_B = B[tmax_idx, np.arange(nlon)[:,np.newaxis], np.arange(nlat)]
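
For concreteness, a self-contained sketch of that expression. Note the
original post's argsort(A, axis=0) returns a 3-D array of indices; a single
time index per point, e.g. from argmax, is what the expression assumes:

import numpy as np

ntime, nlon, nlat = 4, 3, 2
A = np.random.rand(ntime, nlon, nlat)
B = np.random.rand(ntime, nlon, nlat)

# 2-D array of time indices where A peaks at each (lon, lat) point
tmax_idx = np.argmax(A, axis=0)

# tmax_idx is (nlon, nlat); the lon index is a (nlon, 1) column vector
# and the lat index a (nlat,) row, so all three broadcast to (nlon, nlat)
max_B = B[tmax_idx, np.arange(nlon)[:, np.newaxis], np.arange(nlat)]

# agrees with the explicit double loop
for i in range(nlon):
    for j in range(nlat):
        assert max_B[i, j] == B[tmax_idx[i, j], i, j]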

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy macosx10.5 binaries: compatible with 10.4?

2009-09-22 Thread Russell E. Owen
All the official numpy 1.3.0 Mac binaries are labelled "macosx10.5". 
Does anyone know if these are backwards compatible with MacOS X 10.4 or 
10.3.9?

-- Russell

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] something wrong with docs?

2009-09-22 Thread Fernando Perez
On Tue, Sep 22, 2009 at 12:02 AM, Pauli Virtanen  wrote:
> I think sphinx.ext.doctest is able to also test the ordinary >>> marked-
> up examples, so there'd be no large need for new directives.
>

Well, >>> examples intermix input and output, and are thus very
annoying to paste back into new code or interactive sessions (ipython
has %doctest_mode that helps some, but you still have to avoid pasting
output). Furthermore, >>> examples get very unwieldy beyond a few
lines, and things like class definitions are hard to do and read in
that mode.

The nice thing about the sphinx full doctest support is that it scales
very well to more complex code blocks.

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-dev] Deprecate chararray [was Plea for help]

2009-09-22 Thread Michael Droettboom
Sorry to resurrect a long-dead thread, but I've been continuing Chris 
Hanley's investigation of chararray at Space Telescope Science Institute 
(and the broader astronomical community) for a while and have some 
findings to report back.

What I've taken from this thread is that chararray is in need of a 
maintainer.  I am able to devote some time to the cause, but first would
like to clarify what it will take to make its continued inclusion more
comfortable.

Let me start with the use case.  chararrays are extensively returned 
from pyfits (a tool to handle the standard astronomy data format).  
pyfits is the basis of many applications, and it would be impossible to 
audit all of that code.  Most authors of those tools do not track 
numpy-discussion closely, which is why we don't hear from them on this 
list, but there is a great deal of pyfits-using code. 

Doing some spot-checking on this code, a common thing I see is SQL-like 
queries on recarrays of objects.  For instance, it is very common to
have a table of objects, with a "Target" column which is a string, and 
do something like (where c is a chararray of the 'Target' column):

   subset = array[np.where(c.startswith('NGC'))]

Strictly speaking, this is a use case for "vectorized string 
operations", not necessarily for the chararray class as it presently 
stands.  One could almost as easily do:

   subset = array[np.where([x.startswith('NGC') for x in c])]

...and the latter is even slightly faster, since chararray currently 
loops in Python anyway.

Even better, though, I have some experimental code to perform the loop 
in C, and I get 5x speed up on a table with ~120,000 rows.  If that were 
to be included in numpy, that's a strong argument against recommending 
list comprehensions in user code.  The use case suggests the continued 
existence of vectorized string operations in numpy -- whether that 
continues to be chararray, or some newer/better interface + chararray 
for backward compatibility, is an open question.  Personally I think a 
less object-oriented approach and just having a namespace full of 
vectorized string functions might be cleaner than the current situation 
of needing to create a view class around an ndarray.  I'm suggesting 
something like the following, using the same example, where {STR} is 
some namespace we would fill with vectorized string operations:

   subset = array[np.where(np.{STR}.startswith(c, 'NGC'))]
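
(A quick illustration of how such a namespace reads in practice; current
NumPy's np.char module in fact provides module-level vectorized string
functions of this kind, accepting plain ndarrays of strings:)

import numpy as np

targets = np.array(['NGC 224', 'M 31', 'NGC 598', 'IC 342'])
mask = np.char.startswith(targets, 'NGC')
print(mask)             # [ True False  True False]
print(targets[mask])    # ['NGC 224' 'NGC 598']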

Now on to chararray as it now stands.  I view chararray as really two 
separable pieces of functionality:

   1) Convenience to perform vectorized string operations using 
'.method' syntax, or in some cases infix operators (+, *)
   2) Implicit "rstrip"ping of values

(Note that raw ndarrays truncate values at the first NULL character,
like C strings, but chararrays will strip any and all whitespace 
characters from the end).

Changing (2) just seems to be asking to be the source of subtle bugs.  
Unfortunately, there's an inconsistency between 1) and 2) in the present 
implementation.  For example:

In [9]: a = np.char.array(['a  '])

In [10]: a
Out[10]: chararray(['a'], dtype='|S3')

In [11]: a[0] == 'a'
Out[11]: True

In [12]: a.endswith('a')
Out[12]: array([False], dtype=bool)

This is *the* design wart of chararray, IMHO, and one that's difficult 
to fix without breaking compatibility.  It might be a worthwhile 
experiment to remove (2) and see how much we really break, but it would 
be impossible to know for sure.

Now to address the concerns iterated in this thread.  Unfortunately, I 
don't know where this thread began before it landed on the Numpy list, 
so I may be missing details which would help me address them.

> 0) "it gets very little use" (an assumption you presumably dispute);
>   
Certainly not true from where I stand.
> 1) "is pretty much undocumented" (less true than a week ago, but still true 
> for several of the attributes, with another handful or so falling into the 
> category of "poorly documented");
>   
I don't quite understand this one -- 99% of the methods are wrappers 
around standard Python string methods.  I don't think we should 
redocument those.  I agree it needs a better top level docstring about 
its purpose (see functionalities (1) and (2) above) and its status (for 
backward compatibility).
> 2) "probably more buggy than most other parts of NumPy" ("probably" being a 
> euphemism, IMO);
>   
Trac has these bugs.  Any others?

http://projects.scipy.org/numpy/ticket/1199
http://projects.scipy.org/numpy/ticket/1200
http://projects.scipy.org/numpy/ticket/856
http://projects.scipy.org/numpy/ticket/855
http://projects.scipy.org/numpy/ticket/1231
> 3) "there is not a really good use-case for it" (a conjecture, but one that 
> has yet to be challenged by counter-example); 
>   
See above.
> 4) it's not the first time its presence in NumPy has been questioned ("as 
> Stefan pointed out when asking this same question last year")
>   
Hopefully we're addressing that now.
> 5) N

Re: [Numpy-discussion] The problem with arrays

2009-09-22 Thread Fabrice Silva
Le mardi 22 septembre 2009 à 23:00 +0530, yogesh karpate a écrit :

> This is the main thing. When I try to store it in an array like
> R_time=array([R_t[0][i]]), it just stores the final value in that
> array when the loop ends. I can't get out of this for loop. I really
> have this small problem. I really need help on this, guys.
> 
> for i in range(a1):
>     data_temp = bpf[left[0][i]:right[0][i]]  # left is an array and right is also an array
>     maxloc = data_temp.argmax()    # taking indices of max. value of data segment
>     maxval = data_temp[maxloc]
>     minloc = data_temp.argmin()
>     minval = data_temp[minloc]
>     maxloc = maxloc - 1 + left    # add offset of present location
>     minloc = minloc - 1 + left    # add offset of present location
>     R_index = maxloc
>     R_t = t[maxloc]
>     R_amp = array([maxval])
>     S_amp = minval    # %%% Assuming the S-wave is the lowest
>                       # %%% amp in the given window
>     # S_t = t[minloc]
>     R_time = array([R_t[0][i]])
>     plt.plot(R_time, R_amp, 'go')
>     plt.show()

Two options :
- you define an empty list before the loop
>>> R_time = []
  and you append the computed value while looping
>>> for i:
>>> ...
>>> R_time.append(t[maxloc])

- or you define a preallocated array before the loop
>>> R_time = np.empty(a1)
  and fill it with the computed values
>>> for i:
>>> ...
>>> R_time[i] = t[maxloc]


Same thing with R_amp. After looping, whatever the solution you choose,
you can plot  the whole set of (time, value) tuples
>>> plt.plot(R_time, R_amp)
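
A self-contained version of the preallocated variant, with toy stand-ins for
t, bpf, left and right (made-up values, just for illustration):

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0.0, 1.0, 200)            # toy time axis
bpf = np.sin(2 * np.pi * 5 * t)           # toy band-pass-filtered signal
left = np.array([10, 60, 110])            # toy window bounds
right = np.array([50, 100, 150])
a1 = len(left)

R_time = np.empty(a1)                     # one slot per window
R_amp = np.empty(a1)
for i in range(a1):
    data_temp = bpf[left[i]:right[i]]
    maxloc = data_temp.argmax()           # index of the window maximum
    R_time[i] = t[left[i] + maxloc]       # add offset of the window start
    R_amp[i] = data_temp[maxloc]

plt.plot(R_time, R_amp, 'go')             # one call for all (time, value) pairs
plt.show()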

-- 
Fabrice Silva 
LMA UPR CNRS 7051

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The problem with arrays

2009-09-22 Thread yogesh karpate
On Tue, Sep 22, 2009 at 7:01 PM, Fabrice Silva wrote:

> Le mardi 22 septembre 2009 à 17:42 +0530, yogesh karpate a écrit :
> > I just tried your idea but the result is the same. It didn't help.
> >
> > 2009/9/22 Nadav Horesh 
> > A quick answer with going into the details of your code:
> >
> > try
> >  plt.plot(R_time,R_amp,'go',hold=1)
> > (one line before the last)
> >
> >  Nadav
>
> You may separate the computation part and the plotting one by storing
> your results in a R_time and a R_amp array (length a1 arrays).
>
> Concerning the plotting issue: are you sure the points you want to be
> displayed aren't yet? Print the values within the loop:
> >>> print (R_amp, R_time)
> to check your values.
> You may also inspect your graphs to see how many lines they have:
> >>> plt.gca().get_children()
> or
> >>> plt.gca().get_lines()
> might help.
>
   This is the main thing. When I try to store it in an array like
R_time=array([R_t[0][i]]), it just stores the final value in that array when
the loop ends. I can't get out of this for loop. I really have this small
problem. I really need help on this, guys.

> for i in range(a1):
>     data_temp = bpf[left[0][i]:right[0][i]]  # left is an array and right is also an array
>     maxloc = data_temp.argmax()    # taking indices of max. value of data segment
>     maxval = data_temp[maxloc]
>     minloc = data_temp.argmin()
>     minval = data_temp[minloc]
>     maxloc = maxloc - 1 + left    # add offset of present location
>     minloc = minloc - 1 + left    # add offset of present location
>     R_index = maxloc
>     R_t = t[maxloc]
>     R_amp = array([maxval])
>     S_amp = minval    # %%% Assuming the S-wave is the lowest
>                       # %%% amp in the given window
>     # S_t = t[minloc]
>     R_time = array([R_t[0][i]])
>     plt.plot(R_time, R_amp, 'go')
>     plt.show()
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Fancy indexing for

2009-09-22 Thread Daran Rife
Hello list,

This didn't seem to get through last time round, and my
first version was poorly written.

I have a rather pedestrian question about fancy indexing
for multi-dimensional arrays.

Suppose I have two 3-D arrays, one named "A" and the other "B",
where both arrays have identical dimensions of time, longitude,
and latitude. I wish to use data from A to conditionally select
values from array B. Specifically, I first find the time where
the values at each point in A are at their maximum. This is
accomplished with:

 >>> tmax_idx = np.argsort(A, axis=0)

I now wish to use this tmax_idx array to conditionally select
the values from B. In essence, I want to pick values from B for
times where the values at A are at their max. Can this be done
with fancy indexing? Or is there a smarter way to do this? I've
certainly done this sort of selection before, but the index
selection array is 1D. I've carefully studied the excellent
indexing documentation and examples on-line, but can't sort out
whether what I want to do is even possible, without doing the
brute force looping method, similar to:

max_B = np.zeros((nlon, nlat), dtype=np.float32)

for i in xrange(nlon):
for j in xrange(nlat):
max_B[i,j] = B[tmax_idx[i,j],i,j]

As you know, this is reasonably fast for modest-sized arrays,
but is far more expensive for large arrays.


Thanks in advance for your help.


Sincerely,


Daran Rife


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread Xavier Gnata
René Dudfield wrote:
> On Mon, Sep 21, 2009 at 8:12 PM, David Warde-Farley  
> wrote:
>   
>> On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote:
>>
>> 
>>> Should I read that to learn you cython and numpy interact?
>>> Or is there another best documentation (with examples...)?
>>>   
>> You should have a look at the Bresenham algorithm thread you posted. I
>> went to the trouble of converting some Python code for Bresenham's
>> algorithm to Cython, and a pointer to the Cython+NumPy tutorial:
>>
>> http://wiki.cython.org/tutorials/numpy
>>
>> David
>> 
>
> I don't know about the best way... but here are two approaches I like...
>
> Another way is to make your C function then load it with ctypes(or
> wrap it with something else) and pass it pointers with
> array.ctype.data.  You can find the shape of the array in python, and
> pass it to your C function.  The benefit is it's just C code, and you
> can avoid the GIL too if you want.  Then if you keep your C code
> separate from python stuff other people can use your C code in other
> languages more easily.
>
> cinpy is another one(similar to weave) which can be good for fast
> prototyping... http://www.cs.tut.fi/~ask/cinpy/ or for changing the
> code to fit your data.
>
>   
Well I would have to find/read docs to be able to try that solution.
cython looks easy :)

cheers
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Parallelizable Performance Python Example

2009-09-22 Thread James Snyder

Hi -

I've recently been trying to adjust the performance python example
(http://www.scipy.org/PerformancePython) so that it could be compared under
a parallelized version.  I've adjusted the Gauss-Seidel 4 point method to a
red-black checkerboarded version
(http://www.cs.colorado.edu/~mcbryan/3656.04/mail/87.htm) that is much more
easily parallelized on a shared memory system.

I've got some examples of this working, but I seem to be having trouble
making it anywhere near efficient for the NumPy example (it's around an
order of magnitude slower than the non-red-black version).

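For readers new to the scheme, a tiny demonstration of the red-black split:
interior points with even row+column sum form one sweep and odd the other,
so each colour reads only neighbours of the opposite colour:

import numpy as np

X, Y = np.meshgrid(range(4), range(4))
checker = (X + Y) % 2
print(checker)
# [[0 1 0 1]
#  [1 0 1 0]
#  [0 1 0 1]
#  [1 0 1 0]]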

Here's essentially what I'm doing with the NumPy solver:

    def numericTimeStep(self, dt=0.0):
        """
        Takes a time step using a NumPy expression.
        This has been adjusted to use checkerboard style indexing.
        """
        g = self.grid
        dx2, dy2 = np.float32(g.dx**2), np.float32(g.dy**2)
        dnr_inv = np.float32(0.5/(dx2 + dy2))
        u = g.u
        g.old_u = u.copy()  # needed to compute the error.

        if self.count == 0:
            # Precompute matrix indexes
            X, Y = np.meshgrid(range(1, u.shape[0]-1), range(1, u.shape[1]-1))
            checker = (X + Y) % 2
            self.idx1 = checker == 1
            self.idx2 = checker == 0

        # The actual iteration
        g.u[1:-1, 1:-1][self.idx1] = ((g.u[0:-2, 1:-1][self.idx1] + g.u[2:, 1:-1][self.idx1])*dy2 +
                                      (g.u[1:-1, 0:-2][self.idx1] + g.u[1:-1, 2:][self.idx1])*dx2)*dnr_inv
        g.u[1:-1, 1:-1][self.idx2] = ((g.u[0:-2, 1:-1][self.idx2] + g.u[2:, 1:-1][self.idx2])*dy2 +
                                      (g.u[1:-1, 0:-2][self.idx2] + g.u[1:-1, 2:][self.idx2])*dx2)*dnr_inv

        return g.computeError()

Any ideas?  I presume that the double-indexing is maybe what's killing this,
and I could precompute some boolean indexing arrays, but the original version
of this solver (plain Gauss-Seidel, 4 point averaging) is rather simple and
clean :-):

    def numericTimeStep(self, dt=0.0):
        """Takes a time step using a NumPy expression."""
        g = self.grid
        dx2, dy2 = g.dx**2, g.dy**2
        dnr_inv = 0.5/(dx2 + dy2)
        u = g.u
        g.old_u = u.copy()  # needed to compute the error.

        # The actual iteration
        u[1:-1, 1:-1] = ((u[0:-2, 1:-1] + u[2:, 1:-1])*dy2 +
                         (u[1:-1, 0:-2] + u[1:-1, 2:])*dx2)*dnr_inv

        return g.computeError()


Here's a pure python version of the red-black solver (which is, of course,
incredibly slow, but not that much slower than the non-red-black version):

    def slowTimeStep(self, dt=0.0):
        """Takes a time step using straight forward Python loops."""
        g = self.grid
        nx, ny = g.u.shape
        dx2, dy2 = np.float32(g.dx**2), np.float32(g.dy**2)
        dnr_inv = np.float32(0.5/(dx2 + dy2))
        u = g.u

        err = 0.0
        for offset in range(0, 2):
            for i in range(1, nx-1):
                for j in range(1 + (i + offset) % 2, ny-1, 2):
                    tmp = u[j, i]
                    u[j, i] = ((u[j-1, i] + u[j+1, i])*dx2 +
                               (u[j, i-1] + u[j, i+1])*dy2)*dnr_inv
                    diff = u[j, i] - tmp
                    err += diff*diff

        return np.sqrt(err)


--
James Snyder
Biomedical Engineering
Northwestern University
jbsny...@fanplastic.org
http://fanplastic.org/key.txt
ph: 847.448.0386


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread René Dudfield
On Tue, Sep 22, 2009 at 3:45 PM, Sturla Molden  wrote:
> Xavier Gnata skrev:
>> I have a large 2D numpy array as input and a 1D array as output.
>> In between, I would like to use C code.
>> C is requirement because it has to be fast and because the algorithm
>> cannot be written in a numpy oriented way :( (no way...really).
>>
> There are certain algorithms that cannot be vectorized, particularly
> those that are recursive/iterative.

Hi,

one thing you can do is guess (predict) what the previous answer is and
continue on a path from that guess.  Say you have 1000 processing
units, you could get the other 999 working on guesses for the answers
and go from there.  If one of your guess paths is right, you might be
able to skip a bunch of steps.  That's what cpus do with their
'speculative execution and branch prediction'.  even more OT, you can
take advantage of this sometimes by getting the cpu to work out
multiple things for you at once by putting in well placed if/elses,
but that's fairly cpu specific.  It's also used with servers... you
ask say 5 servers to give you a result, and wait for the first one to
give you the answer.

That cython pure module looks cool, thanks for pointing it out.

I wonder if anyone has tried using that with traces?  So common paths
in your code record the types, and then can be used by the cython pure
module to try and generate the types for you?  It could generate a
file containing a {callable : @cython.locals} mapping to be used on
compilation.  Similar to how psyco works, but with a separate run
program/compilation step.


cheers,
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread Sturla Molden
Xavier Gnata skrev:
> I have a large 2D numpy array as input and a 1D array as output.
> In between, I would like to use C code.
> C is requirement because it has to be fast and because the algorithm 
> cannot be written in a numpy oriented way :( (no way...really).
>   
There are certain algorithms that cannot be vectorized, particularly 
those that are recursive/iterative. One example is MCMC methods such as 
the Gibbs sampler. You can get around it by running multiple Markov 
chains in parallel, and vectorizing this parallelism with NumPy. But you 
cannot vectorize one long chain. Vectorizing with NumPy only applies to 
data parallel problems.

But then there is a nice tool you should know about: Cython in pure 
Python mode. You just add some annotations to the Python code, and the 
.py file can be compiled to efficient C.

http://wiki.cython.org/pure
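
A minimal sketch of what the pure mode looks like (assuming Cython is
installed; it ships a 'cython' shadow module, so the same file also runs
unchanged under plain CPython):

import cython

@cython.locals(n=cython.int, i=cython.int, total=cython.double)
def harmonic(n):
    # typed loop: compiles to a C loop when cythonized,
    # runs as ordinary Python otherwise
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total

print(harmonic(10))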

This is quite similar in spirit to the optional static typing that makes 
certain implementations of Common Lisp (CMUCL, SBCL, Franz) so insanely 
fast.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy question: Best hardware for Numpy?

2009-09-22 Thread Bruce Southey
On 09/22/2009 02:52 AM, Romain Brette wrote:
> David Warde-Farley a écrit :
>
>> On 21-Sep-09, at 10:53 AM, David Cournapeau wrote:
>>
>>  
>>> Concerning the hardware, I have just bought a core i7 (the cheapest
>>> model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the
>>> thing flies for floating point computation. My last computer was a
>>> pentium 4 so I don't have a lot of reference, but you can compute ~
>>> 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it
>>> is extremely fast - using the threaded version, the asymptotic peak
>>> performances are quite impressive. It takes for example 14s to inverse
>>> a 5000x5000 matrix of double.
>>>
>> I thought you had a Macbook too?
>>
>> The Core i5 750 seems like a good buy right now as well. A bit
>> cheaper, 4 cores and 8Mb of shared cache though at a slightly lower
>> clock speed.
>>
>> David
>>  
> How about the Core i7 975 (Extreme)?
> http://www.intel.com/performance/desktop/extreme.htm
>
> I am wondering if it is worth the extra money.
>
> Best,
> Romain
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
Hi,
Check out the charts and stuff at places like 
http://www.tomshardware.com or http://www.anandtech.com/ for example:
http://www.tomshardware.com/charts/2009-desktop-cpu-charts/benchmarks,60.html
http://www.cpubenchmark.net/index.php

As far as I know, if you want dual processors (in addition to the cores
and hyperthreads) then you are probably stuck with Xeons. Also,
currently the new Xeons tend to have a slightly higher clock speed than
the i7 series (Xeon W5580 is 3.2GHz) so, without overclocking, they tend
to be faster. The story tends to change with overclocking.

If you overclock then the i7 920 appears to be widely recommended,
especially given the current US$900 difference. Really the i7 975 makes
overclocking very easy, but there are many guides on overclocking the i7
920 to 3-4GHz with air cooling. Overclocking Xeons may be impossible or
hard to do.

Bruce

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread Sturla Molden
René Dudfield skrev:
> Another way is to make your C function then load it with ctypes
Also, one should be aware that ctypes is a stable part of the Python
standard library.

Cython is still unstable and in rapid development.

Pyrex is more stable than Cython, but interfacing with ndarrays is harder.

If you have a requirement on not using experimental code, then Cython is 
not an option.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread Sturla Molden
René Dudfield skrev:
> Another way is to make your C function then load it with ctypes(or
> wrap it with something else) and pass it pointers with
> array.ctype.data.  

numpy.ctypeslib.ndpointer is preferred when using ndarrays with ctypes.
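
A sketch of that pattern; the shared library and its sum2d function here are
hypothetical stand-ins:

import ctypes
import numpy as np
import numpy.ctypeslib as npct

# argtype declaration: a C-contiguous 2-D double array
arr_2d_double = npct.ndpointer(dtype=np.double, ndim=2, flags='C_CONTIGUOUS')

# hypothetical library exporting: double sum2d(double *data, int nrows, int ncols)
lib = npct.load_library('libmysum', '.')
lib.sum2d.restype = ctypes.c_double
lib.sum2d.argtypes = [arr_2d_double, ctypes.c_int, ctypes.c_int]

a = np.arange(6.0).reshape(2, 3)
print(lib.sum2d(a, a.shape[0], a.shape[1]))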

> You can find the shape of the array in python, and
> pass it to your C function.  The benefit is it's just C code, and you
> can avoid the GIL too if you want.  Then if you keep your C code
> separate from python stuff other people can use your C code in other
> languages more easily.
You can do this with Cython as well, just use Cython for the glue code. 
The important difference is this: Cython is a language for writing C 
extensions, ctypes is a module for calling DLLs.

One important advantage of Cython is deterministic clean-up code. If you 
put a __dealloc__ method in a "cdef class", it will be called on garbage 
collection.

Another nice way of interfacing C with numpy is f2py. It also works with 
C, not just Fortran.

Yet another way (Windows specific) is to use win32com.client and pass 
array.ctype.data. That is nice if you have an ActiveX control; for 
Windows you often get commercial libraries like that. Also if you have 
.NET or Java objects, you can easily expose them to COM.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The problem with arrays

2009-09-22 Thread Fabrice Silva
Le mardi 22 septembre 2009 à 17:42 +0530, yogesh karpate a écrit :
> I just tried your idea but the result is the same. It didn't help.
> 
> 2009/9/22 Nadav Horesh 
> A quick answer without going into the details of your code:
> 
> try
>  plt.plot(R_time,R_amp,'go',hold=1)
> (one line before the last)
> 
>  Nadav

You may separate the computation part and the plotting one by storing
your results in a R_time and a R_amp array (length a1 arrays).

Concerning the plotting issue: are you sure the points you want to be
displayed aren't yet? Print the values within the loop:
>>> print (R_amp, R_time)
to check your values.
You may also inspect your graphs to see how many lines they have:
>>> plt.gca().get_children()
or 
>>> plt.gca().get_lines()
might help.


-- 
Fabrice Silva 
LMA UPR CNRS 7051

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] I want to help with a numpy python 3.1.x port

2009-09-22 Thread René Dudfield
On Sun, Sep 20, 2009 at 7:15 PM, Robert Kern  wrote:
> On Sun, Sep 20, 2009 at 13:13, René Dudfield  wrote:
>> Hi again,
>>
>> I noticed numpy includes a copy of distutils.  I guess because it's
>> been modified in some way?
>
> numpy.distutils is a set of extensions to distutils; it is not a copy
> of distutils.
>

cool thanks.

btw, my work is in the 'work' branch here:
http://github.com/illume/numpy3k/tree/work

I probably could have called it something other than 'work'... but I
just copy/pasted from the guide... so there you go.

Only done a bit more on it so far... will try next weekend to get at
least the setup.py build stuff running.  I think that will be a good
first step for me to post a diff, and then try and get it merged into
svn.


cu!
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deserialization uncouples shared arrays

2009-09-22 Thread Hans Meine
On Tuesday 22 September 2009 13:14:55 Hrvoje Niksic wrote:
> Hans Meine wrote:
> > On Tuesday 22 September 2009 11:01:37 Hrvoje Niksic wrote:
> >> Is it intended for deserialization to uncouple arrays that share a
> >> common base?
> >
> > I think it's not really intended, but it's a limitation by design.
>
> I wonder why a "base" attribute is even restored, then?

Oh, is it?  Don't know if that makes sense, then.

> If there is no
> care to restore the shared views, then views could simply be serialized
> as arrays?

After your posting, I thought they were...

> > AFAIK, it's related to Luca Citi's recent "ultimate base" thread - you
> > simply cannot ensure serialization of arbitrary ndarrays in this way,
> > because they may point to *any* memory, not necessarily an
> > ndarray-allocated one.
>
> That's true in general.  I wonder if it's possible to restore shared
> arrays if (by virtue of "base" attribute) we know that the ndarray
> shares the memory of another ndarray.

For this special case, it looks possible (although slightly dangerous) to me.
It's also a common special case, so maybe you should open a ticket?
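
A quick way to see the uncoupling under discussion:

import pickle
import numpy as np

x = np.arange(5)
y = x[1:4]                    # y is a view sharing x's memory
x2, y2 = pickle.loads(pickle.dumps((x, y)))

print(np.may_share_memory(x, y))     # True
print(np.may_share_memory(x2, y2))   # False: the round trip uncoupled them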

Ciao,
  Hans
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The problem with arrays

2009-09-22 Thread yogesh karpate
I just tried your idea but the result is the same. It didn't help.

2009/9/22 Nadav Horesh 

> A quick answer without going into the details of your code:
>
> try
>  plt.plot(R_time,R_amp,'go',hold=1)
> (one line before the last)
>
>  Nadav
>
> -----Original Message-----
> From: numpy-discussion-boun...@scipy.org on behalf of yogesh karpate
> Sent: Tuesday, 22-September-09 14:11
> To: numpy-discussion@scipy.org
> Subject: [Numpy-discussion] The problem with arrays
>
> Please kindly go through the following code snippet:
> for i in range(a1):
>     data_temp = bpf[left[0][i]:right[0][i]]  # left is an array and right is also an array
>     maxloc = data_temp.argmax()   # taking indices of max. value of data segment
>     maxval = data_temp[maxloc]
>     minloc = data_temp.argmin()
>     minval = data_temp[minloc]
>     maxloc = maxloc - 1 + left    # add offset of present location
>     minloc = minloc - 1 + left    # add offset of present location
>     R_index = maxloc
>     R_t = t[maxloc]
>     R_amp = array([maxval])
>     S_amp = minval                #%%% Assuming the S-wave is the lowest
>                                   #%%% amp in the given window
>     #S_t = t[minloc]
>     R_time = array([R_t[0][i]])
>     plt.plot(R_time, R_amp, 'go')
>     plt.show()
> The thing is that I want to plot R_time and R_amp in a single shot. The
> above code plots R_time and R_amp on each iteration, overwriting the
> previous value as the loop continues, i.e. it displays many graphs, each
> showing a single point (R_time, R_amp), for as long as the loop runs.
> What I want is for all the points from R_time and R_amp to be plotted in
> one go. I tried to store the values in an array, but it holds only one
> value at the end of the loop.
> How should I break out of this loop? Can anybody help me out?
> Thanks in advance
> Regards
> Yogesh
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The problem with arrays

2009-09-22 Thread Nadav Horesh
A quick answer without going into the details of your code:

try
  plt.plot(R_time,R_amp,'go',hold=1)
(one line before the last)

  Nadav

-Original Message-
From: numpy-discussion-boun...@scipy.org on behalf of yogesh karpate
Sent: Tue 22-September-09 14:11
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] The problem with arrays
 
Please kindly go through the following code snippet:
for i in range(a1):
    data_temp = bpf[left[0][i]:right[0][i]]  # left is an array and right is also an array
    maxloc = data_temp.argmax()   # taking indices of max. value of data segment
    maxval = data_temp[maxloc]
    minloc = data_temp.argmin()
    minval = data_temp[minloc]
    maxloc = maxloc - 1 + left    # add offset of present location
    minloc = minloc - 1 + left    # add offset of present location
    R_index = maxloc
    R_t = t[maxloc]
    R_amp = array([maxval])
    S_amp = minval                #%%% Assuming the S-wave is the lowest
                                  #%%% amp in the given window
    #S_t = t[minloc]
    R_time = array([R_t[0][i]])
    plt.plot(R_time, R_amp, 'go')
    plt.show()
The thing is that I want to plot R_time and R_amp in a single shot. The
above code plots R_time and R_amp on each iteration, overwriting the
previous value as the loop continues, i.e. it displays many graphs, each
showing a single point (R_time, R_amp), for as long as the loop runs.
What I want is for all the points from R_time and R_amp to be plotted in
one go. I tried to store the values in an array, but it holds only one
value at the end of the loop.
How should I break out of this loop? Can anybody help me out?
Thanks in advance
Regards
Yogesh

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deserialization uncouples shared arrays

2009-09-22 Thread Hrvoje Niksic
Hans Meine wrote:
> On Tuesday 22 September 2009 11:01:37 Hrvoje Niksic wrote:
>> Is it intended for deserialization to uncouple arrays that share a
>> common base?
> 
> I think it's not really intended, but it's a limitation by design.

I wonder why a "base" attribute is even restored, then?  If no care is
taken to restore the shared views, couldn't views simply be serialized
as plain arrays?

> AFAIK, it's related to Luca Citi's recent "ultimate base" thread - you simply 
> cannot ensure serialization of arbitrary ndarrays in this way, because they 
> may point to *any* memory, not necessarily an ndarray-allocated one.

That's true in general.  I wonder if it's possible to restore shared 
arrays if (by virtue of "base" attribute) we know that the ndarray 
shares the memory of another ndarray.
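
For the case where the base is itself an ndarray, a sketch of one
possible workaround (hand-rolled helper logic, not an existing numpy
facility): pickle the base once together with the view's offset and
strides, and rebuild the view after loading:

import numpy, cPickle as p

b = numpy.array([1, 2, 3, 4])
c = b[1:]                        # a view sharing b's memory

# record enough metadata to re-derive c from b after unpickling
offset = (c.__array_interface__['data'][0]
          - b.__array_interface__['data'][0])
state = p.dumps((b, c.dtype.str, c.shape, offset, c.strides), -1)

b2, dt, shape, off, strides = p.loads(state)
c2 = numpy.ndarray(shape, dtype=dt, buffer=b2,
                   offset=off, strides=strides)

c2[0] = 99                       # the restored arrays share memory again
assert b2[1] == 99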
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] The problem with arrays

2009-09-22 Thread yogesh karpate
Please kindly go through the following code snippet:
for i in range(a1):
    data_temp = bpf[left[0][i]:right[0][i]]  # left is an array and right is also an array
    maxloc = data_temp.argmax()   # taking indices of max. value of data segment
    maxval = data_temp[maxloc]
    minloc = data_temp.argmin()
    minval = data_temp[minloc]
    maxloc = maxloc - 1 + left    # add offset of present location
    minloc = minloc - 1 + left    # add offset of present location
    R_index = maxloc
    R_t = t[maxloc]
    R_amp = array([maxval])
    S_amp = minval                #%%% Assuming the S-wave is the lowest
                                  #%%% amp in the given window
    #S_t = t[minloc]
    R_time = array([R_t[0][i]])
    plt.plot(R_time, R_amp, 'go')
    plt.show()
The thing is that I want to plot R_time and R_amp in a single shot. The
above code plots R_time and R_amp on each iteration, overwriting the
previous value as the loop continues, i.e. it displays many graphs, each
showing a single point (R_time, R_amp), for as long as the loop runs.
What I want is for all the points from R_time and R_amp to be plotted in
one go. I tried to store the values in an array, but it holds only one
value at the end of the loop.
How should I break out of this loop? Can anybody help me out?
Thanks in advance
Regards
Yogesh
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deserialization uncouples shared arrays

2009-09-22 Thread Hans Meine
On Tuesday 22 September 2009 11:01:37 Hrvoje Niksic wrote:
> Is it intended for deserialization to uncouple arrays that share a
> common base?

I think it's not really intended, but it's a limitation by design.
AFAIK, it's related to Luca Citi's recent "ultimate base" thread - you simply 
cannot ensure serialization of arbitrary ndarrays in this way, because they 
may point to *any* memory, not necessarily an ndarray-allocated one.
(Think of internal camera framebuffers, memory allocated by 3rd party 
libraries, memmapped areas, etc.)

HTH,
  Hans
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread René Dudfield
On Mon, Sep 21, 2009 at 8:12 PM, David Warde-Farley  wrote:
> On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote:
>
>> Should I read that to learn you cython and numpy interact?
>> Or is there another best documentation (with examples...)?
>
> You should have a look at the Bresenham algorithm thread you posted. I
> went to the trouble of converting some Python code for Bresenham's
> algorithm to Cython, and a pointer to the Cython+NumPy tutorial:
>
> http://wiki.cython.org/tutorials/numpy
>
> David

I don't know about the best way... but here are two approaches I like...

Another way is to write your C function, then load it with ctypes (or
wrap it with something else) and pass it pointers via array.ctypes.data.
You can find the shape of the array in Python and pass it to your C
function.  The benefit is that it's just C code, and you can avoid the
GIL too if you want.  And if you keep your C code separate from the
Python stuff, other people can use your C code from other languages
more easily.
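
A minimal sketch of that approach; the library name (libmyfuncs.so) and
the C function (scale_inplace) are hypothetical, assuming a C signature
of void scale_inplace(double *data, size_t n, double factor):

import ctypes
import numpy as np

# hypothetical compiled C library exposing:
#   void scale_inplace(double *data, size_t n, double factor);
lib = ctypes.CDLL("./libmyfuncs.so")
lib.scale_inplace.argtypes = [ctypes.POINTER(ctypes.c_double),
                              ctypes.c_size_t, ctypes.c_double]
lib.scale_inplace.restype = None

a = np.arange(10, dtype=np.float64)
# pass the raw buffer pointer, plus the length found on the Python side
lib.scale_inplace(a.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
                  a.size, 2.0)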

cinpy is another one (similar to weave) which can be good for fast
prototyping, or for changing the code to fit your data:
http://www.cs.tut.fi/~ask/cinpy/


cheers,
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Best way to insert C code in numpy code

2009-09-22 Thread Chris Colbert
I give my vote to cython as well. I have a program which uses cython
for a portion simply because it was easier using a simple C for-loop
to do what I wanted rather than beating numpy into submission. It was
an order of magnitude faster as well.
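
For what it's worth, a sketch of the kind of loop this refers to,
written in Cython's pure-Python mode so the same file also runs as
plain Python (the function and data here are made up for illustration):

import cython
import numpy

@cython.locals(i=cython.Py_ssize_t, total=cython.double)
def running_sum(a):
    # an ordinary Python loop; compiled with Cython, the typed locals
    # become C variables (typing the array itself, e.g. as a buffer or
    # memoryview, would speed up the indexing further)
    total = 0.0
    for i in range(len(a)):
        total += a[i]
    return total

print running_sum(numpy.arange(10.0))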

Cheers,

Chris

On Mon, Sep 21, 2009 at 9:12 PM, David Warde-Farley  wrote:
> On 21-Sep-09, at 2:55 PM, Xavier Gnata wrote:
>
>> Should I read that to learn you cython and numpy interact?
>> Or is there another best documentation (with examples...)?
>
> You should have a look at the Bresenham algorithm thread you posted. I
> went to the trouble of converting some Python code for Bresenham's
> algorithm to Cython, and a pointer to the Cython+NumPy tutorial:
>
> http://wiki.cython.org/tutorials/numpy
>
> David
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Deserialization uncouples shared arrays

2009-09-22 Thread Hrvoje Niksic
Is it intended for deserialization to uncouple arrays that share a 
common base?  For example:

 >>> import numpy, cPickle as p
 >>> a = numpy.array([1, 2, 3])   # base array
 >>> b = a[:] # view one
 >>> b
array([1, 2, 3])
 >>> c = a[::-1]  # view two
 >>> c
array([3, 2, 1])
 >>> b.base is c.base
True

Arrays in b and c now share a common base, so changing the contents of 
one affects the other:

 >>> b[0] = 10
 >>> b
array([10,  2,  3])
 >>> c
array([ 3,  2, 10])

After serialization, the two arrays are effectively uncoupled, creating 
a different situation than before serialization:

 >>> d, e = p.loads(p.dumps((b, c), -1))
 >>> d
array([10,  2,  3])
 >>> e
array([ 3,  2, 10])
 >>> d.base is e.base
False

 >>> d[0] = 11
 >>> d
array([11,  2,  3])
 >>> e
array([ 3,  2, 10])

Is this behavior intentional, or is it an artifact of the 
implementation?  Can it be relied upon not to change in a future release?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy question: Best hardware for Numpy?

2009-09-22 Thread Romain Brette
David Warde-Farley a écrit :
> On 21-Sep-09, at 10:53 AM, David Cournapeau wrote:
> 
>> Concerning the hardware, I have just bought a core i7 (the cheapest
>> model is ~ 200$ now, with 4 cores and 8 Mb of shared cache), and the
>> thing flies for floating point computation. My last computer was a
>> pentium 4 so I don't have a lot of reference, but you can compute ~
>> 300e6 exp (assuming a contiguous array), and ATLAS 3.8.3 built on it
>> is extremely fast - using the threaded version, the asymptotic peak
>> performances are quite impressive. It takes for example 14s to invert
>> a 5000x5000 matrix of doubles.
> 
> I thought you had a Macbook too?
> 
> The Core i5 750 seems like a good buy right now as well. A bit  
> cheaper, 4 cores and 8Mb of shared cache though at a slightly lower  
> clock speed.
> 
> David

How about the Core i7 975 (Extreme)? 
http://www.intel.com/performance/desktop/extreme.htm

I am wondering if it is worth the extra money.

Best,
Romain

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] polynomial ring dtype

2009-09-22 Thread Sebastian Walter
This is somewhat similar to the question about fixed-point arithmetic
earlier on this mailing list.

I need to do computations on arrays whose elements are truncated polynomials.
At the moment, I have implemented the univariate truncated
polynomials as objects of a class UTPS.

The class basically looks like this:

class UTPS:
    def __init__(self, taylor_coeffs):
        """ polynomial x(t) = tc[0] + tc[1] t + tc[2] t^2 + tc[3] t^3 + ... """
        self.tc = numpy.asarray(taylor_coeffs)

    def __add__(self, rhs):
        return UTPS(self.tc + rhs.tc)

    def sin(self):
        # numpy.sin(self) apparently automatically calls self.sin(),
        # which is very cool
        ...

etc.

One can create arrays of UTPS instances like this:
x = numpy.array([[UTPS([1,2]), UTPS([3,4])], [UTPS([0,1]), UTPS([4,3])]])

and perform funcs and ufuncs on it

y = numpy.sum(x)
y = numpy.sin(x)
y = numpy.dot(numpy.eye(2), x)

This works out of the box, which is very nice.

my question:
Is it possible to speed up the computation by defining a special dtype
for truncated polynomials? Especially when the arrays get large,
computing on arrays of objects is quite slow. I had a look at the
numpy svn trunk but couldn't find any clues.
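
Not an answer about the dtype machinery, but a sketch of a common
workaround, assuming every polynomial shares one truncation degree:
keep the Taylor coefficients in a trailing axis of a single float
array, so the heavy operations stay vectorized (utps_mul below is an
illustrative helper, not numpy API):

import numpy

# a 2x2 array of degree-1 truncated polynomials: shape (2, 2, D), D = 2
x = numpy.array([[[1., 2.], [3., 4.]],
                 [[0., 1.], [4., 3.]]])

y = x + x          # addition is elementwise on the coefficient axis

def utps_mul(a, b):
    # (a0 + a1 t)(b0 + b1 t) = a0*b0 + (a0*b1 + a1*b0) t + O(t^2)
    c = numpy.empty_like(a)
    c[..., 0] = a[..., 0] * b[..., 0]
    c[..., 1] = a[..., 0] * b[..., 1] + a[..., 1] * b[..., 0]
    return c

z = utps_mul(x, x)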

If you are interested, you can have a look at the full pre-alpha
version code (BSD licence) at http://github.com/b45ch1/algopy .

regards,
Sebastian
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy and cython in pure python mode

2009-09-22 Thread Sturla Molden
Sebastian Haase wrote:
> I know that cython's numpy is still getting better and better over
> time, but is it already today possible to have numpy support when
> using Cython in "pure python" mode?
>   
I'm not sure. There is this odd memoryview syntax:

import cython
import numpy

# (assumption: my2darray is some 2-D C-int array, e.g.)
my2darray = numpy.zeros((5, 6), dtype=numpy.intc)
view = cython.int[:,:](my2darray)

print view[3,4] # fast when compiled, according to Sverre



S.M.



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] something wrong with docs?

2009-09-22 Thread Pauli Virtanen
Mon, 21 Sep 2009 18:49:47 -0700, Fernando Perez wrote:

> On Mon, Sep 21, 2009 at 11:32 AM, Pauli Virtanen  wrote:
>> The `sphinx.ext.doctest` extension is not enabled, so the testcode::
>> etc. directives are not available. I'm not sure if it should be enabled
>> -- it would be cleaner to just replace the testcode:: stuff with the
>> ordinary example markup.
>>
>>
> Why not enable it?  It would be nice if we could move gradually towards
> docs whose examples (at least those marked as such) were always run via
> sphinx.  The more we do this, the higher the chances of non-zero overlap
> between documentation and reality :)

I think sphinx.ext.doctest is also able to test the ordinary >>>-marked
examples, so there'd be no great need for new directives.

But oh well, I suppose enabling it can't hurt much.
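
For reference, a minimal sketch of what enabling it involves; the
extension name is Sphinx's standard one, and the rest of the extensions
list stands for whatever the docs already enable:

# conf.py
extensions = [
    # ... extensions already in use ...
    'sphinx.ext.doctest',
]

# the examples can then be checked with the doctest builder:
#   sphinx-build -b doctest source/ build/doctest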

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion