Re: [Numpy-discussion] 1.8 release

2013-01-14 Thread Nathaniel Smith
On Mon, Jan 14, 2013 at 4:45 PM, Frédéric Bastien  wrote:
> I don't volunteer to be the next release manager, but +1 for shorter
> releases. I've heard only good things about them. Also, I'm not sure it
> would demand more from the release manager. Does someone have an idea? The
> most work I do as release manager for Theano is the
> preparation/tests/release notes, and that depends on the amount of new
> stuff. And it seems exponential in the number of changes in the
> release, not linear (no data, just an impression...). Making smaller
> releases makes this easier.
>
> But yes, this means more announcements. But that isn't what takes the most
> time. Also, writing the release notes more frequently means the merged
> PRs are fresher in memory when you review them, which makes it easier
> to do.

Right, this is my experience too -- that it's actually easier to put
out more releases, because each one is manageable and you get a
routine going. ("Oops, it's March, better find an hour this week to
check the release notes and run the 'release beta1' script.") It
becomes almost boring, which is awesome. Putting out 5 small releases
is much, MUCH easier than putting out one giant 5x bigger release.

On Mon, Jan 14, 2013 at 9:26 PM, Ralf Gommers  wrote:
> +1 for faster and time-based releases.
>
> 3 months does sound a little too short to me (5 or 6 would be better), since
> a release cycle typically doesn't fit in one month.

The release cycle for 6-12+ months of changes doesn't typically fit in
one month, but we've never tried for a smaller release, so who knows.
I suppose that theoretically, as scientists, what we ought to do is to
attempt 1-2 releases at as aggressive a pace as we can imagine to see
how it goes, and then we'll have the data to interpolate the correct
speed instead of extrapolating... ;-)

On Mon, Jan 14, 2013 at 12:14 AM, Charles R Harris
 wrote:
> I think three months is a bit short. Much will depend on the release manager
> and I'm not sure what Ondrej's plans are. I'd happily nominate you for that
> role ;)

Careful, or I'll nominate you back! ;-) Seriously, though, Ondrej is
doing a great job, I doubt I'd do as well...

Ondrej: I know you're still doing heroic work getting 1.7 pulled
together, but if you have a moment -- are you planning to stick around
as release manager after 1.7? And if so, what are your thoughts on
attempting such a short cycle?

-n
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 1.8 release

2013-01-14 Thread Ralf Gommers
On Mon, Jan 14, 2013 at 1:19 AM, David Cournapeau wrote:

> On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith  wrote:
> > On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris
> >  wrote:
> >> Now that 1.7 is nearing release, it's time to look forward to the 1.8
> >> release. I'd like us to get back to the twice yearly schedule that we
> tried
> >> to maintain through the 1.3 - 1.6 releases, so I propose a June release
> as a
> >> goal. Call it the Spring Cleaning release. As to content, I'd like to
> see
> >> the following.
> >>
> >> Removal of Python 2.4-2.5 support.
> >> Removal of SCons support.
> >> The index work consolidated.
> >> Initial stab at removing the need for 2to3. See Pauli's PR for scipy.
> >> Miscellaneous enhancements and fixes.
> >
> > I'd actually like to propose a faster release cycle than this, even.
> > Perhaps 3 months between releases; 2 months from release n to the
> > first beta of n+1?
> >
> > The consequences would be:
> > * Changes get out to users faster.
> > * Each release is smaller, so it's easier for downstream projects to
> > adjust to each release -- instead of having this giant pile of changes
> > to work through all at once every 6-12 months
> > * End-users are less scared of updating, because the changes aren't so
> > overwhelming, so they end up actually testing (and getting to take
> > advantage of) the new stuff more.
> > * We get feedback more quickly, so we can fix up whatever we break
> > while we still know what we did.
> > * And for larger changes, if we release them incrementally, we can get
> > feedback before we've gone miles down the wrong path.
> > * Releases come out on time more often -- sort of paradoxical, but
> > with small, frequent releases, beta cycles go smoother, and it's
> > easier to say "don't worry, I'll get it ready for next time", or
> > "right, that patch was less done than we thought, let's take it out
> > for now" (also this is much easier if we don't have another years
> > worth of changes committed on top of the patch!).
> > * If your schedule does slip, then you still end up with a <6 month
> > release cycle.
> >
> > 1.6.x was branched from master in March 2011 and released in May 2011.
> > 1.7.x was branched from master in July 2012 and still isn't out. But
> > at least we've finally found and fixed the second to last bug!
> >
> > Wouldn't it be nice to have a 2-4 week beta cycle that only found
> > trivial and expected problems? We *already* have 6 months worth of
> > feature work in master that won't be in the *next* release.
> >
> > Note 1: if we do do this, then we'll also want to rethink the
> > deprecation cycle a bit -- right now we've sort of vaguely been saying
> > "well, we'll deprecate it in release n and take it out in n+1.
> > Whenever that is". 3 months definitely isn't long enough for a
> > deprecation period, so if we do do this then we'll want to deprecate
> > things for multiple releases before actually removing them. Details to
> > be determined.
> >
> > Note 2: in this kind of release schedule, you definitely don't want to
> > say "here are the features that will be in the next release!", because
> > then you end up slipping and sliding all over the place. Instead you
> > say "here are some things that I want to work on next, and we'll see
> > which release they end up in". Since we're already following the rule
> > that nothing goes into master until it's done and tested and ready for
> > release anyway, this doesn't really change much.
> >
> > Thoughts?
>
> Hey, my time to have a time-machine:
> http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html
>
> I still think it is a good idea :)
>

+1 for faster and time-based releases.

3 months does sound a little too short to me (5 or 6 would be better),
since a release cycle typically doesn't fit in one month.

Ralf


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Alan G Isaac
Thanks Pierre for noting that np.tile already
provides a chunk of this functionality:

>>> a = np.tile(5, (1, 2, 3))
>>> a
array([[[5, 5, 5],
        [5, 5, 5]]])
>>> np.tile(1, a.shape)
array([[[1, 1, 1],
        [1, 1, 1]]])

I had not realized a scalar first argument was possible.

Alan Isaac


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Benjamin Root
On Mon, Jan 14, 2013 at 1:56 PM, David Warde-Farley <
d.warde.far...@gmail.com> wrote:

> On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig
>  wrote:
> > In [8]: tile(nan, (3,3)) # (it's a verb ! )
>
> tile, in my opinion, is useful in some cases (for people who think in
> terms of repmat()) but not very NumPy-ish. What I'd like is a function
> that takes
>
> - an initial array_like "a"
> - a shape "s"
> - optionally, a dtype (otherwise inherit from a)
>
> and broadcasts "a" to the shape "s". In the case of scalars this is
> just a fill. In the case of, say, a (5,) vector and a (10, 5) shape,
> this broadcasts across rows, etc.
>
> I don't think it's worth special-casing scalar fills (except perhaps
> as an implementation detail) when you have rich broadcasting semantics
> that are already a fundamental part of NumPy, allowing for a much
> handier primitive.
>

I have similar problems with "tile".  I learned it for a particular use in
numpy, and it would be hard for me to see it for another (contextually)
different use.

I do like the way you are thinking in terms of the broadcasting semantics,
but I wonder if that is a bit awkward.  What I mean is, if one were to use
broadcasting semantics for creating an array, wouldn't one have just simply
used broadcasting anyway?  The point of broadcasting is to _avoid_ the
creation of unneeded arrays.  But maybe I can be convinced with some
examples.

Ben Root


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread David Warde-Farley
On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig
 wrote:
> In [8]: tile(nan, (3,3)) # (it's a verb ! )

tile, in my opinion, is useful in some cases (for people who think in
terms of repmat()) but not very NumPy-ish. What I'd like is a function
that takes

- an initial array_like "a"
- a shape "s"
- optionally, a dtype (otherwise inherit from a)

and broadcasts "a" to the shape "s". In the case of scalars this is
just a fill. In the case of, say, a (5,) vector and a (10, 5) shape,
this broadcasts across rows, etc.

I don't think it's worth special-casing scalar fills (except perhaps
as an implementation detail) when you have rich broadcasting semantics
that are already a fundamental part of NumPy, allowing for a much
handier primitive.

David
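[Editor's note] A minimal sketch of the function David describes (the name `filled_from` and its signature are assumptions here, not an existing NumPy API), leaning on the fact that plain element assignment already applies NumPy's broadcasting rules:

```python
import numpy as np

def filled_from(a, shape, dtype=None):
    """Hypothetical helper: broadcast array_like `a` to `shape` in a new array.

    A scalar `a` degenerates to a plain fill; a (5,) vector broadcasts
    across the rows of a (10, 5) result, exactly as David describes.
    """
    a = np.asarray(a)
    out = np.empty(shape, dtype=a.dtype if dtype is None else dtype)
    out[...] = a  # assignment broadcasts `a` across `out`
    return out
```

With this, `filled_from(np.nan, (3, 3))` covers the scalar-fill case, and `filled_from(np.arange(5), (10, 5))` tiles the vector down the rows.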


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread David Warde-Farley
On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root  wrote:
>
>
> On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig 
> wrote:
>>
>> Hi,
>>
>> Le 14/01/2013 00:39, Nathaniel Smith a écrit :
>> > (The nice thing about np.filled() is that it makes np.zeros() and
>> > np.ones() feel like clutter, rather than the reverse... not that I'm
>> > suggesting ever getting rid of them, but it makes the API conceptually
>> > feel smaller, not larger.)
>> Coming from the Matlab syntax, I feel that np.zeros and np.ones are in
>> numpy for Matlab (and maybe others ?) compatibilty and are useful for
>> that. Now that I've been "enlightened" by Python, I think that those
>> functions (especially np.ones) are indeed clutter. Therefore I favor the
>> introduction of these two new functions.
>>
>> However, I think Eric's remark about masked array API compatibility is
>> important. I don't know what other names are possible ? np.const ?
>>
>> Or maybe np.tile is also useful for that same purpose ? In that case
>> adding a dtype argument to np.tile would be useful.
>>
>> best,
>> Pierre
>>
>
> I am also +1 on the idea of having a filled() and filled_like() function (I
> learned a long time ago to just do a = np.empty() and a.fill() rather than
> the multiplication trick I learned from Matlab).  However, the collision
> with the masked array API is a non-starter for me.  np.const() and
> np.const_like() probably make the most sense, but I would prefer a verb over
> a noun.

Definitely -1 on const. Falsely implies immutability, to my mind.

David


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Pierre Haessig
Le 14/01/2013 18:33, Benjamin Root a écrit :
>
>
> How about "initialized()"?
>
>
> A verb! +1 from me!
Shouldn't it be "initialize()" then? I'm not so fond of it though,
because "initialize" is pretty broad in the field of programming.

What about "refurbishing" the already existing "tile()" function? As of
now it almost does the job:

In [8]: tile(nan, (3,3)) # (it's a verb!)
Out[8]:
array([[ nan,  nan,  nan],
       [ nan,  nan,  nan],
       [ nan,  nan,  nan]])

though with two restrictions:
 * tile doesn't have a dtype keyword. Could this be added?
 * tile's performance on my computer seems to be twice as bad as
"ones() * val"

Best,
Pierre
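[Editor's note] The first restriction has a partial workaround today: tile() preserves the dtype of whatever it is given, so a typed 0-d array carries the dtype through. A small sketch of the workaround (not a substitute for a real dtype keyword):

```python
import numpy as np

# tile() has no dtype keyword, but it preserves the dtype of its input,
# so wrapping the fill value in a typed 0-d array smuggles the dtype in:
a = np.tile(np.array(np.nan, dtype=np.float32), (3, 3))
print(a.dtype)   # float32
print(a.shape)   # (3, 3)
```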




Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Charles R Harris
On Sun, Jan 13, 2013 at 4:24 PM, Robert Kern  wrote:

> On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith  wrote:
> > Hi all,
> >
> > PR 2875 adds two new functions, that generalize zeros(), ones(),
> > zeros_like(), ones_like(), by simply taking an arbitrary fill value:
> >   https://github.com/numpy/numpy/pull/2875
> > So
> >   np.ones((10, 10))
> > is the same as
> >   np.filled((10, 10), 1)
> >
> > The implementations are trivial, but the API seems useful because it
> > provides an idiomatic way of efficiently creating an array full of
> > inf, or nan, or None, whatever funny value you need. All the
> > alternatives are either inefficient (np.ones(...) * np.inf) or
> > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But
> > there's a question of taste here; one could argue instead that these
> > just add more clutter to the numpy namespace. So, before we merge,
> > anyone want to chime in?
>
> One alternative that does not expand the API with two-liners is to let
> the ndarray.fill() method return self:
>
>   a = np.empty(...).fill(20.0)
>
>
My thought also. Shades of the Python `.sort` method...

Chuck
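[Editor's note] As it stands, ndarray.fill() returns None, so the chained spelling silently loses the array unless the method is changed as Robert suggests. A quick check of the current behavior next to the two-line idiom that does work today:

```python
import numpy as np

chained = np.empty((2, 2)).fill(20.0)  # fill() currently mutates in place...
print(chained)                         # None -- chaining loses the array

a = np.empty((2, 2))                   # the idiom that works today
a.fill(20.0)                           # every element is now 20.0
```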


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Benjamin Root
On Mon, Jan 14, 2013 at 12:27 PM, Eric Firing  wrote:

> On 2013/01/14 6:15 AM, Olivier Delalleau wrote:
> > - I agree the name collision with np.ma.filled is a problem. I have no
> > better suggestion though at this point.
>
> How about "initialized()"?
>

A verb! +1 from me!

For those wondering, I have a personal rule that because functions *do*
something, they really should have verbs for their names.  I have had to
learn to read functions like "ones" and "empty" as "give me ones" or
"give me an empty array".

Ben Root


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Eric Firing
On 2013/01/14 6:15 AM, Olivier Delalleau wrote:
> - I agree the name collision with np.ma.filled is a problem. I have no
> better suggestion though at this point.

How about "initialized()"?


Re: [Numpy-discussion] New numpy functions: vals and vals_like or filled, filled_like?

2013-01-14 Thread Alan G Isaac
Just changing the subject line so a good suggestion
does not get lost ...

Alan


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread josef . pktd
On Mon, Jan 14, 2013 at 11:22 AM,   wrote:
> On Mon, Jan 14, 2013 at 11:15 AM, Olivier Delalleau  wrote:
>> 2013/1/14 Matthew Brett :
>>> Hi,
>>>
>>> On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld
>>>  wrote:
>>>> Robert Kern  gmail.com> writes:
>>>>
>
> >>> >
> >>> > One alternative that does not expand the API with two-liners is to 
> >>> > let
> >>> > the ndarray.fill() method return self:
> >>> >
> >>> >   a = np.empty(...).fill(20.0)
> >>>
> >>> This violates the convention that in-place operations never return
> >>> self, to avoid confusion with out-of-place operations. E.g.
> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus
> >>> np.sort(), and in the broader Python world, list.sort() versus
> >>> sorted(), list.reverse() versus reversed(). (This was an explicit
> >>> reason given for list.sort to not return self, even.)
> >>>
> >>> Maybe enabling this idiom is a good enough reason to break the
> >>> convention ("Special cases aren't special enough to break the rules. /
> >>> Although practicality beats purity"), but it at least makes me -0 on
> >>> this...
> >>>
> >>
> >> I tend to agree with the notion that inplace operations shouldn't 
> >> return
> >> self, but I don't know if it's just because I've been conditioned this 
> >> way.
> >> Not returning self breaks the fluid interface pattern [1], as noted in 
> >> a
> >> similar discussion on pandas [2], FWIW, though there's likely some way 
> >> to
> >> have both worlds.
> >
> > Ah-hah, here's the email where Guido officially proclaims that there
> > shall be no "fluent interface" nonsense applied to in-place operators
> > in Python, because it hurts readability (at least for Dutch people
> > ):
> >   http://mail.python.org/pipermail/python-dev/2003-October/038855.html
>
> That's a statement about the policy for the stdlib, and just one
> person's opinion. You, and numpy, are permitted to have a different
> opinion.
>
> In any case, I'm not strongly advocating for it. It's violation of
> principle ("no fluent interfaces") is roughly in the same ballpark as
> np.filled() ("not every two-liner needs its own function"), so I
> thought I would toss it out there for consideration.
>
> --
> Robert Kern
>

 FWIW I'm +1 on the idea. Perhaps because I just don't see many practical
 downsides to breaking the convention but I regularly see a big issue with 
 there
 being no way to instantiate an array with a particular value.

 The one obvious way to do it is use ones and multiply by the value you 
 want. I
 work with a lot of inexperienced programmers and I see this idiom all the 
 time.
 It takes a fair amount of numpy knowledge to know that you should do it in 
 two
 lines by using empty and setting a slice.

 In [1]: %timeit NaN*ones(1)
 1000 loops, best of 3: 1.74 ms per loop

 In [2]: %%timeit
...: x = empty(1, dtype=float)
...: x[:] = NaN
...:
 1 loops, best of 3: 28 us per loop

 In [3]: 1.74e-3/28e-6
 Out[3]: 62.142857142857146


 Even when not in the mythical "tight loop" setting an array to one and then
 multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude 
 slower
 than what we know they *should* be doing.

 I'm agnostic as to whether fill should be modified or new functions 
 provided but
 I think numpy is currently missing this functionality and that providing it
 would save a lot of new users from shooting themselves in the foot 
 performance-
 wise.
>>>
>>> Is this a fair summary?
>>>
>>> => fill(shape, val), fill_like(arr, val) - new functions, as proposed
>>> For: readable, seems to fit a pattern often used, presence in
>>> namespace may clue people into using the 'fill' rather than * val or +
>>> val
>>> Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe
>>> cluttering already full namespace.
>>>
>>> => empty(shape).fill(val) - by allowing return value from arr.fill(val)
>>> For: readable
>>> Con: breaks guideline not to return anything from in-place operations,
>>> no presence in namespace means users may not find this pattern.
>>>
>>> => no new API
>>> For : easy maintenance
>>> Con : harder for users to discover fill pattern, filling a new array
>>> requires two lines instead of one.
>>>
>>> So maybe the decision rests on:
>>>
>>> How important is it that users see these function names in the
>>> namespace in order to discover the pattern "a = ones(shape) ;
>>> a.fill(val)"?
>>>
>>> How important is it to obey guidelines for no-return-from-in-place?
>>>
>>> How important is it to avoid expanding the namespace?
>>>
>>> How common is this pattern?
>>>
>>> On the last, I'd say that the only common use I have for this pattern
>>> is to fill an array with NaN.

Re: [Numpy-discussion] 1.8 release

2013-01-14 Thread Frédéric Bastien
Hi,

I don't volunteer to be the next release manager, but +1 for shorter
releases. I've heard only good things about them. Also, I'm not sure it
would demand more from the release manager. Does someone have an idea? The
most work I do as release manager for Theano is the
preparation/tests/release notes, and that depends on the amount of new
stuff. And it seems exponential in the number of changes in the
release, not linear (no data, just an impression...). Making smaller
releases makes this easier.

But yes, this means more announcements. But that isn't what takes the most
time. Also, writing the release notes more frequently means the merged
PRs are fresher in memory when you review them, which makes it easier
to do.

But what prevents us from making shorter releases? Other priorities
that can't wait, like work on papers to submit, or collaboration
with partners.

Just my 2 cents.

Fred

On Mon, Jan 14, 2013 at 7:18 AM, Matthew Brett  wrote:
> Hi,
>
> On Mon, Jan 14, 2013 at 12:19 AM, David Cournapeau  wrote:
>> On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith  wrote:
>>> On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris
>>>  wrote:
 Now that 1.7 is nearing release, it's time to look forward to the 1.8
 release. I'd like us to get back to the twice yearly schedule that we tried
 to maintain through the 1.3 - 1.6 releases, so I propose a June release as 
 a
 goal. Call it the Spring Cleaning release. As to content, I'd like to see
 the following.

 Removal of Python 2.4-2.5 support.
 Removal of SCons support.
 The index work consolidated.
 Initial stab at removing the need for 2to3. See Pauli's PR for scipy.
 Miscellaneous enhancements and fixes.
>>>
>>> I'd actually like to propose a faster release cycle than this, even.
>>> Perhaps 3 months between releases; 2 months from release n to the
>>> first beta of n+1?
>>>
>>> The consequences would be:
>>> * Changes get out to users faster.
>>> * Each release is smaller, so it's easier for downstream projects to
>>> adjust to each release -- instead of having this giant pile of changes
>>> to work through all at once every 6-12 months
>>> * End-users are less scared of updating, because the changes aren't so
>>> overwhelming, so they end up actually testing (and getting to take
>>> advantage of) the new stuff more.
>>> * We get feedback more quickly, so we can fix up whatever we break
>>> while we still know what we did.
>>> * And for larger changes, if we release them incrementally, we can get
>>> feedback before we've gone miles down the wrong path.
>>> * Releases come out on time more often -- sort of paradoxical, but
>>> with small, frequent releases, beta cycles go smoother, and it's
>>> easier to say "don't worry, I'll get it ready for next time", or
>>> "right, that patch was less done than we thought, let's take it out
>>> for now" (also this is much easier if we don't have another years
>>> worth of changes committed on top of the patch!).
>>> * If your schedule does slip, then you still end up with a <6 month
>>> release cycle.
>>>
>>> 1.6.x was branched from master in March 2011 and released in May 2011.
>>> 1.7.x was branched from master in July 2012 and still isn't out. But
>>> at least we've finally found and fixed the second to last bug!
>>>
>>> Wouldn't it be nice to have a 2-4 week beta cycle that only found
>>> trivial and expected problems? We *already* have 6 months worth of
>>> feature work in master that won't be in the *next* release.
>>>
>>> Note 1: if we do do this, then we'll also want to rethink the
>>> deprecation cycle a bit -- right now we've sort of vaguely been saying
>>> "well, we'll deprecate it in release n and take it out in n+1.
>>> Whenever that is". 3 months definitely isn't long enough for a
>>> deprecation period, so if we do do this then we'll want to deprecate
>>> things for multiple releases before actually removing them. Details to
>>> be determined.
>>>
>>> Note 2: in this kind of release schedule, you definitely don't want to
>>> say "here are the features that will be in the next release!", because
>>> then you end up slipping and sliding all over the place. Instead you
>>> say "here are some things that I want to work on next, and we'll see
>>> which release they end up in". Since we're already following the rule
>>> that nothing goes into master until it's done and tested and ready for
>>> release anyway, this doesn't really change much.
>>>
>>> Thoughts?
>>
>> Hey, my time to have a time-machine:
>> http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html
>>
>> I still think it is a good idea :)
>
> I guess it is the release manager who has by far the largest say in
> this.  Who will that be for the next year or so?
>
> Best,
>
> Matthew

Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread josef . pktd
On Mon, Jan 14, 2013 at 11:15 AM, Olivier Delalleau  wrote:
> 2013/1/14 Matthew Brett :
>> Hi,
>>
>> On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld
>>  wrote:
>>> Robert Kern  gmail.com> writes:
>>>

 >>> >
 >>> > One alternative that does not expand the API with two-liners is to 
 >>> > let
 >>> > the ndarray.fill() method return self:
 >>> >
 >>> >   a = np.empty(...).fill(20.0)
 >>>
 >>> This violates the convention that in-place operations never return
 >>> self, to avoid confusion with out-of-place operations. E.g.
 >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus
 >>> np.sort(), and in the broader Python world, list.sort() versus
 >>> sorted(), list.reverse() versus reversed(). (This was an explicit
 >>> reason given for list.sort to not return self, even.)
 >>>
 >>> Maybe enabling this idiom is a good enough reason to break the
 >>> convention ("Special cases aren't special enough to break the rules. /
 >>> Although practicality beats purity"), but it at least makes me -0 on
 >>> this...
 >>>
 >>
 >> I tend to agree with the notion that inplace operations shouldn't return
 >> self, but I don't know if it's just because I've been conditioned this 
 >> way.
 >> Not returning self breaks the fluid interface pattern [1], as noted in a
 >> similar discussion on pandas [2], FWIW, though there's likely some way 
 >> to
 >> have both worlds.
 >
 > Ah-hah, here's the email where Guido officially proclaims that there
 > shall be no "fluent interface" nonsense applied to in-place operators
 > in Python, because it hurts readability (at least for Dutch people
 > ):
 >   http://mail.python.org/pipermail/python-dev/2003-October/038855.html

 That's a statement about the policy for the stdlib, and just one
 person's opinion. You, and numpy, are permitted to have a different
 opinion.

 In any case, I'm not strongly advocating for it. It's violation of
 principle ("no fluent interfaces") is roughly in the same ballpark as
 np.filled() ("not every two-liner needs its own function"), so I
 thought I would toss it out there for consideration.

 --
 Robert Kern

>>>
>>> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical
>>> downsides to breaking the convention but I regularly see a big issue with 
>>> there
>>> being no way to instantiate an array with a particular value.
>>>
>>> The one obvious way to do it is use ones and multiply by the value you 
>>> want. I
>>> work with a lot of inexperienced programmers and I see this idiom all the 
>>> time.
>>> It takes a fair amount of numpy knowledge to know that you should do it in 
>>> two
>>> lines by using empty and setting a slice.
>>>
>>> In [1]: %timeit NaN*ones(1)
>>> 1000 loops, best of 3: 1.74 ms per loop
>>>
>>> In [2]: %%timeit
>>>...: x = empty(1, dtype=float)
>>>...: x[:] = NaN
>>>...:
>>> 1 loops, best of 3: 28 us per loop
>>>
>>> In [3]: 1.74e-3/28e-6
>>> Out[3]: 62.142857142857146
>>>
>>>
>>> Even when not in the mythical "tight loop" setting an array to one and then
>>> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude 
>>> slower
>>> than what we know they *should* be doing.
>>>
>>> I'm agnostic as to whether fill should be modified or new functions 
>>> provided but
>>> I think numpy is currently missing this functionality and that providing it
>>> would save a lot of new users from shooting themselves in the foot 
>>> performance-
>>> wise.
>>
>> Is this a fair summary?
>>
>> => fill(shape, val), fill_like(arr, val) - new functions, as proposed
>> For: readable, seems to fit a pattern often used, presence in
>> namespace may clue people into using the 'fill' rather than * val or +
>> val
>> Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe
>> cluttering already full namespace.
>>
>> => empty(shape).fill(val) - by allowing return value from arr.fill(val)
>> For: readable
>> Con: breaks guideline not to return anything from in-place operations,
>> no presence in namespace means users may not find this pattern.
>>
>> => no new API
>> For : easy maintenance
>> Con : harder for users to discover fill pattern, filling a new array
>> requires two lines instead of one.
>>
>> So maybe the decision rests on:
>>
>> How important is it that users see these function names in the
>> namespace in order to discover the pattern "a = ones(shape) ;
>> a.fill(val)"?
>>
>> How important is it to obey guidelines for no-return-from-in-place?
>>
>> How important is it to avoid expanding the namespace?
>>
>> How common is this pattern?
>>
>> On the last, I'd say that the only common use I have for this pattern
>> is to fill an array with NaN.
>
> My 2 cts from a user perspective:
>
> - +1 to have such a function. I usually use numpy.ones * scalar
> because honestly, spending two lines of co

Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Olivier Delalleau
2013/1/14 Matthew Brett :
> Hi,
>
> On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld
>  wrote:
>> Robert Kern  gmail.com> writes:
>>
>>>
>>> >>> >
>>> >>> > One alternative that does not expand the API with two-liners is to let
>>> >>> > the ndarray.fill() method return self:
>>> >>> >
>>> >>> >   a = np.empty(...).fill(20.0)
>>> >>>
>>> >>> This violates the convention that in-place operations never return
>>> >>> self, to avoid confusion with out-of-place operations. E.g.
>>> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus
>>> >>> np.sort(), and in the broader Python world, list.sort() versus
>>> >>> sorted(), list.reverse() versus reversed(). (This was an explicit
>>> >>> reason given for list.sort to not return self, even.)
>>> >>>
>>> >>> Maybe enabling this idiom is a good enough reason to break the
>>> >>> convention ("Special cases aren't special enough to break the rules. /
>>> >>> Although practicality beats purity"), but it at least makes me -0 on
>>> >>> this...
>>> >>>
>>> >>
>>> >> I tend to agree with the notion that inplace operations shouldn't return
>>> >> self, but I don't know if it's just because I've been conditioned this 
>>> >> way.
>>> >> Not returning self breaks the fluid interface pattern [1], as noted in a
>>> >> similar discussion on pandas [2], FWIW, though there's likely some way to
>>> >> have both worlds.
>>> >
>>> > Ah-hah, here's the email where Guido officially proclaims that there
>>> > shall be no "fluent interface" nonsense applied to in-place operators
>>> > in Python, because it hurts readability (at least for Dutch people
>>> > ):
>>> >   http://mail.python.org/pipermail/python-dev/2003-October/038855.html
>>>
>>> That's a statement about the policy for the stdlib, and just one
>>> person's opinion. You, and numpy, are permitted to have a different
>>> opinion.
>>>
>>> In any case, I'm not strongly advocating for it. It's violation of
>>> principle ("no fluent interfaces") is roughly in the same ballpark as
>>> np.filled() ("not every two-liner needs its own function"), so I
>>> thought I would toss it out there for consideration.
>>>
>>> --
>>> Robert Kern
>>>
>>
>> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical
>> downsides to breaking the convention but I regularly see a big issue with
>> there being no way to instantiate an array with a particular value.
>>
>> The one obvious way to do it is use ones and multiply by the value you want.
>> I work with a lot of inexperienced programmers and I see this idiom all the
>> time. It takes a fair amount of numpy knowledge to know that you should do
>> it in two lines by using empty and setting a slice.
>>
>> In [1]: %timeit NaN*ones(1)
>> 1000 loops, best of 3: 1.74 ms per loop
>>
>> In [2]: %%timeit
>>...: x = empty(1, dtype=float)
>>...: x[:] = NaN
>>...:
>> 1 loops, best of 3: 28 us per loop
>>
>> In [3]: 1.74e-3/28e-6
>> Out[3]: 62.142857142857146
>>
>>
>> Even when not in the mythical "tight loop" setting an array to one and then
>> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude
>> slower than what we know they *should* be doing.
>>
>> I'm agnostic as to whether fill should be modified or new functions
>> provided but I think numpy is currently missing this functionality and that
>> providing it would save a lot of new users from shooting themselves in the
>> foot performance-wise.
>
> Is this a fair summary?
>
> => fill(shape, val), fill_like(arr, val) - new functions, as proposed
> For: readable, seems to fit a pattern often used, presence in
> namespace may clue people into using the 'fill' rather than * val or +
> val
> Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe
> cluttering already full namespace.
>
> => empty(shape).fill(val) - by allowing return value from arr.fill(val)
> For: readable
> Con: breaks guideline not to return anything from in-place operations,
> no presence in namespace means users may not find this pattern.
>
> => no new API
> For : easy maintenance
> Con : harder for users to discover fill pattern, filling a new array
> requires two lines instead of one.
>
> So maybe the decision rests on:
>
> How important is it that users see these function names in the
> namespace in order to discover the pattern "a = ones(shape) ;
> a.fill(val)"?
>
> How important is it to obey guidelines for no-return-from-in-place?
>
> How important is it to avoid expanding the namespace?
>
> How common is this pattern?
>
> On the last, I'd say that the only common use I have for this pattern
> is to fill an array with NaN.

My 2 cts from a user perspective:

- +1 to have such a function. I usually use numpy.ones * scalar
because honestly, spending two lines of code for such a basic
operation seems like a waste. Even if it's slower and potentially
dangerous due to casting rules.
- I think having a noun rather than a verb makes more sense since we
have numpy.ones and numpy.
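
The helpers under discussion amount to a thin wrapper over the empty-then-fill
idiom. Here is a sketch using the names proposed in this thread (essentially
what later shipped as np.full and np.full_like):

```python
import numpy as np

def filled(shape, fill_value, dtype=None, order='C'):
    # allocate uninitialized memory, then write fill_value everywhere;
    # avoids the wasted multiplication pass of ones(shape) * fill_value
    a = np.empty(shape, dtype=dtype, order=order)
    a.fill(fill_value)
    return a

def filled_like(arr, fill_value, dtype=None):
    # same shape (and dtype, unless overridden) as an existing array
    a = np.empty_like(arr, dtype=dtype)
    a.fill(fill_value)
    return a
```

With this, the common NaN case becomes a one-liner: filled((100, 100), np.nan).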

Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Matthew Brett
Hi,

On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld
 wrote:
> Robert Kern  gmail.com> writes:
>
>>
>> >>> >
>> >>> > One alternative that does not expand the API with two-liners is to let
>> >>> > the ndarray.fill() method return self:
>> >>> >
>> >>> >   a = np.empty(...).fill(20.0)
>> >>>
>> >>> This violates the convention that in-place operations never return
>> >>> self, to avoid confusion with out-of-place operations. E.g.
>> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus
>> >>> np.sort(), and in the broader Python world, list.sort() versus
>> >>> sorted(), list.reverse() versus reversed(). (This was an explicit
>> >>> reason given for list.sort to not return self, even.)
>> >>>
>> >>> Maybe enabling this idiom is a good enough reason to break the
>> >>> convention ("Special cases aren't special enough to break the rules. /
>> >>> Although practicality beats purity"), but it at least makes me -0 on
>> >>> this...
>> >>>
>> >>
>> >> I tend to agree with the notion that inplace operations shouldn't return
>> >> self, but I don't know if it's just because I've been conditioned this way.
>> >> Not returning self breaks the fluid interface pattern [1], as noted in a
>> >> similar discussion on pandas [2], FWIW, though there's likely some way to
>> >> have both worlds.
>> >
>> > Ah-hah, here's the email where Guido officially proclaims that there
>> > shall be no "fluent interface" nonsense applied to in-place operators
>> > in Python, because it hurts readability (at least for Dutch people
>> > ):
>> >   http://mail.python.org/pipermail/python-dev/2003-October/038855.html
>>
>> That's a statement about the policy for the stdlib, and just one
>> person's opinion. You, and numpy, are permitted to have a different
>> opinion.
>>
>> In any case, I'm not strongly advocating for it. Its violation of
>> principle ("no fluent interfaces") is roughly in the same ballpark as
>> np.filled() ("not every two-liner needs its own function"), so I
>> thought I would toss it out there for consideration.
>>
>> --
>> Robert Kern
>>
>
> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical
> downsides to breaking the convention but I regularly see a big issue with
> there being no way to instantiate an array with a particular value.
>
> The one obvious way to do it is use ones and multiply by the value you want. I
> work with a lot of inexperienced programmers and I see this idiom all the time.
> It takes a fair amount of numpy knowledge to know that you should do it in two
> lines by using empty and setting a slice.
>
> In [1]: %timeit NaN*ones(1)
> 1000 loops, best of 3: 1.74 ms per loop
>
> In [2]: %%timeit
>...: x = empty(1, dtype=float)
>...: x[:] = NaN
>...:
> 1 loops, best of 3: 28 us per loop
>
> In [3]: 1.74e-3/28e-6
> Out[3]: 62.142857142857146
>
>
> Even when not in the mythical "tight loop" setting an array to one and then
> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude slower
> than what we know they *should* be doing.
>
> I'm agnostic as to whether fill should be modified or new functions provided
> but I think numpy is currently missing this functionality and that providing
> it would save a lot of new users from shooting themselves in the foot
> performance-wise.

Is this a fair summary?

=> fill(shape, val), fill_like(arr, val) - new functions, as proposed
For: readable, seems to fit a pattern often used, presence in
namespace may clue people into using the 'fill' rather than * val or +
val
Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe
cluttering already full namespace.

=> empty(shape).fill(val) - by allowing return value from arr.fill(val)
For: readable
Con: breaks guideline not to return anything from in-place operations,
no presence in namespace means users may not find this pattern.
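
For concreteness, the change the second option requires: ndarray.fill()
currently follows the in-place convention and returns None, so the chained
spelling silently yields None today. A quick check (mine, not from the thread):

```python
import numpy as np

# ndarray.fill() returns None today, so chaining it loses the array:
a = np.empty(3).fill(20.0)
print(a)  # None

# the currently working spelling is the two-line version:
b = np.empty(3)
b.fill(20.0)
```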

=> no new API
For : easy maintenance
Con : harder for users to discover fill pattern, filling a new array
requires two lines instead of one.

So maybe the decision rests on:

How important is it that users see these function names in the
namespace in order to discover the pattern "a = ones(shape) ;
a.fill(val)"?

How important is it to obey guidelines for no-return-from-in-place?

How important is it to avoid expanding the namespace?

How common is this pattern?

On the last, I'd say that the only common use I have for this pattern
is to fill an array with NaN.

Cheers,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Robert Kern
On Mon, Jan 14, 2013 at 4:12 PM, Frédéric Bastien  wrote:
> Why not optimize NumPy to detect a mul of an ndarray by a scalar to
> call fill? That way, "np.empty * 2" will be as fast as "x=np.empty;
> x.fill(2)"?

In general, each element of an array will be different, so the result
of the multiplication will be different, so fill can not be used.
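
The point can be made concrete with a small illustration (mine, not from the
thread):

```python
import numpy as np

# fill() writes one scalar into every element, so "arr * scalar" is only
# equivalent to a fill when every element of arr is already identical:
a = np.arange(4)
print(a * 2)   # [0 2 4 6] -- four distinct results, no single fill value

b = np.ones(4)
print(b * 2)   # [2. 2. 2. 2.] -- the special case a fill could produce
```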

--
Robert Kern


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Robin
On Mon, Jan 14, 2013 at 2:57 PM, Benjamin Root  wrote:
> I am also +1 on the idea of having a filled() and filled_like() function (I
> learned a long time ago to just do a = np.empty() and a.fill() rather than
> the multiplication trick I learned from Matlab).  However, the collision
> with the masked array API is a non-starter for me.  np.const() and
> np.const_like() probably make the most sense, but I would prefer a verb over
> a noun.

To get an array of 1's, you call np.ones(shape), to get an array of
0's you call np.zeros(shape) so to get an array of val's why not call
np.vals(shape, val)?

Cheers

Robin


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Frédéric Bastien
Why not optimize NumPy to detect a mul of an ndarray by a scalar to
call fill? That way, "np.empty * 2" will be as fast as "x=np.empty;
x.fill(2)"?

Fred

On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root  wrote:
>
>
> On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig 
> wrote:
>>
>> Hi,
>>
>> On 14/01/2013 00:39, Nathaniel Smith wrote:
>> > (The nice thing about np.filled() is that it makes np.zeros() and
>> > np.ones() feel like clutter, rather than the reverse... not that I'm
>> > suggesting ever getting rid of them, but it makes the API conceptually
>> > feel smaller, not larger.)
>> Coming from the Matlab syntax, I feel that np.zeros and np.ones are in
>> numpy for Matlab (and maybe others?) compatibility and are useful for
>> that. Now that I've been "enlightened" by Python, I think that those
>> functions (especially np.ones) are indeed clutter. Therefore I favor the
>> introduction of these two new functions.
>>
>> However, I think Eric's remark about masked array API compatibility is
>> important. I don't know what other names are possible ? np.const ?
>>
>> Or maybe np.tile is also useful for that same purpose ? In that case
>> adding a dtype argument to np.tile would be useful.
>>
>> best,
>> Pierre
>>
>
> I am also +1 on the idea of having a filled() and filled_like() function (I
> learned a long time ago to just do a = np.empty() and a.fill() rather than
> the multiplication trick I learned from Matlab).  However, the collision
> with the masked array API is a non-starter for me.  np.const() and
> np.const_like() probably make the most sense, but I would prefer a verb over
> a noun.
>
> Ben Root
>
>


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Benjamin Root
On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig wrote:

> Hi,
>
> On 14/01/2013 00:39, Nathaniel Smith wrote:
> > (The nice thing about np.filled() is that it makes np.zeros() and
> > np.ones() feel like clutter, rather than the reverse... not that I'm
> > suggesting ever getting rid of them, but it makes the API conceptually
> > feel smaller, not larger.)
> Coming from the Matlab syntax, I feel that np.zeros and np.ones are in
> numpy for Matlab (and maybe others?) compatibility and are useful for
> that. Now that I've been "enlightened" by Python, I think that those
> functions (especially np.ones) are indeed clutter. Therefore I favor the
> introduction of these two new functions.
>
> However, I think Eric's remark about masked array API compatibility is
> important. I don't know what other names are possible ? np.const ?
>
> Or maybe np.tile is also useful for that same purpose ? In that case
> adding a dtype argument to np.tile would be useful.
>
> best,
> Pierre
>
>
I am also +1 on the idea of having a filled() and filled_like() function (I
learned a long time ago to just do a = np.empty() and a.fill() rather than
the multiplication trick I learned from Matlab).  However, the collision
with the masked array API is a non-starter for me.  np.const() and
np.const_like() probably make the most sense, but I would prefer a verb
over a noun.

Ben Root


Re: [Numpy-discussion] phase unwrapping (1d)

2013-01-14 Thread Neal Becker
This code should explain all:

import numpy as np
arg = np.angle

def nint (x):
    return int (x + 0.5) if x >= 0 else int (x - 0.5)

def unwrap (inp, y=np.pi, init=0, cnt=0):
    o = np.empty_like (inp)
    prev_o = init
    for i in range (len (inp)):
        o[i] = cnt * 2 * y + inp[i]
        delta = o[i] - prev_o

        if delta / y > 1 or delta / y < -1:
            n = nint (delta / (2*y))
            o[i] -= 2*y*n
            cnt -= n

        prev_o = o[i]

    return o


from pylab import plot  # the demo below assumes matplotlib's pylab plot()

u = np.linspace (0, 400, 100) * np.pi/100
v = np.cos (u) + 1j * np.sin (u)
plot (arg(v))
plot (arg(v) + arg (v))
plot (unwrap (arg (v)))
plot (unwrap (arg (v) + arg (v)))
---
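
Since each correction in the loop above depends only on the raw step
inp[i] - inp[i-1], the per-sample loop can be collapsed into a
diff/round/cumsum pipeline. A sketch (the names unwrap_loop and unwrap_vec
are mine; it assumes the first sample lies within y of init, and ties at
exact half-multiples of 2*y may round differently than nint):

```python
import numpy as np

def nint(x):
    # round half away from zero, as in Neal's nint above
    return int(x + 0.5) if x >= 0 else int(x - 0.5)

def unwrap_loop(inp, y=np.pi, init=0, cnt=0):
    # reference implementation: Neal's loop, unchanged in behaviour
    inp = np.asarray(inp, dtype=float)
    o = np.empty_like(inp)
    prev_o = init
    for i in range(len(inp)):
        o[i] = cnt * 2 * y + inp[i]
        delta = o[i] - prev_o
        if delta / y > 1 or delta / y < -1:
            n = nint(delta / (2 * y))
            o[i] -= 2 * y * n
            cnt -= n
        prev_o = o[i]
    return o

def unwrap_vec(inp, y=np.pi):
    # vectorized sketch: n[i] counts the 2*y jumps in each raw step, and
    # the running correction is the cumulative sum of those counts
    inp = np.asarray(inp, dtype=float)
    d = np.diff(inp)
    n = np.where(np.abs(d) > y, np.round(d / (2 * y)), 0)
    out = inp.copy()
    out[1:] -= 2 * y * np.cumsum(n)
    return out
```

On a doubled-phase signal like arg(v) + arg(v), where jumps exceed 2*pi,
both versions agree.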

Pierre Haessig wrote:

> Hi Neal,
> 
> On 11/01/2013 16:40, Neal Becker wrote:
>> I wanted to be able to handle the case of
>>
>> unwrap (arg (x1) + arg (x2))
>>
>> Here, phase can change by more than 2pi.
> It's not clear to me what you mean by "change more than 2pi" ? Do you
> mean that consecutive points of the input can increase by more than
> 2pi ? If that's the case, I feel like there is no a priori information
> in the data to detect such a "giant leap".
> 
> Also, I copy-paste here for reference the numpy.wrap code from [1] :
> 
> def unwrap(p, discont=pi, axis=-1):
>     p = asarray(p)
>     nd = len(p.shape)
>     dd = diff(p, axis=axis)
>     slice1 = [slice(None, None)]*nd # full slices
>     slice1[axis] = slice(1, None)
>     ddmod = mod(dd+pi, 2*pi)-pi
>     _nx.copyto(ddmod, pi, where=(ddmod==-pi) & (dd > 0))
>     ph_correct = ddmod - dd
>     _nx.copyto(ph_correct, 0, where=abs(dd) < discont)
>     up = array(p, copy=True, dtype='d')
>     up[slice1] = p[slice1] + ph_correct.cumsum(axis)
>     return up
> 
> I don't know why it's too slow though. It looks well vectorized.
> 
> Coming back to your C algorithm, I'm not a C guru, so I don't have a
> clear picture of what it's doing. Do you have a Python prototype?
> 
> Best,
> Pierre
> 
> [1]
> https://github.com/numpy/numpy/blob/master/numpy/lib/function_base.py#L1117




Re: [Numpy-discussion] Insights / lessons learned from NumPy design

2013-01-14 Thread Mike Anderson
Just wanted to say a big thanks to everyone in the NumPy community who has
commented on this topic - it's given us a lot to think about and a lot of
good ideas to work into the design!

Best regards,

   Mike.

On 4 January 2013 14:29, Mike Anderson  wrote:

> Hello all,
>
> In the Clojure community there has been some discussion about creating a
> common matrix maths library / API. Currently there are a few different
> fledgeling matrix libraries in Clojure, so it seemed like a worthwhile
> effort to unify them and have a common base on which to build on.
>
> NumPy has been something of an inspiration for this, so I though I'd ask
> here to see what lessons have been learned.
>
> We're thinking of a matrix library with roughly the following design
> (subject to change!)
> - Support for multi-dimensional matrices (but with fast paths for 1D
> vectors and 2D matrices as the common cases)
> - Immutability by default, i.e. matrix operations are pure functions that
> create new matrices. There could be a "backdoor" option to mutate matrices,
> but that would be unidiomatic in Clojure
> - Support for 64-bit double precision floats only (this is the standard
> float type in Clojure)
> - Ability to support multiple different back-end matrix implementations
> (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.)
> - A full range of matrix operations. Operations would be delegated to back
> end implementations where they are supported, otherwise generic
> implementations could be used.
>
> Any thoughts on this topic based on the NumPy experience? In particular
> would be very interesting to know:
> - Features in NumPy which proved to be redundant / not worth the effort
> - Features that you wish had been designed in at the start
> - Design decisions that turned out to be a particularly big mistake /
> success
>
> Would love to hear your insights, any ideas+advice greatly appreciated!
>
>Mike.
>


Re: [Numpy-discussion] phase unwrapping (1d)

2013-01-14 Thread Pierre Haessig
Hi Neal,

On 11/01/2013 16:40, Neal Becker wrote:
> I wanted to be able to handle the case of
>
> unwrap (arg (x1) + arg (x2))
>
> Here, phase can change by more than 2pi.
It's not clear to me what you mean by "change more than 2pi" ? Do you
mean that consecutive points of the input can increase by more than
2pi ? If that's the case, I feel like there is no a priori information
in the data to detect such a "giant leap".

Also, I copy-paste here for reference the numpy.wrap code from [1] :

def unwrap(p, discont=pi, axis=-1):
    p = asarray(p)
    nd = len(p.shape)
    dd = diff(p, axis=axis)
    slice1 = [slice(None, None)]*nd # full slices
    slice1[axis] = slice(1, None)
    ddmod = mod(dd+pi, 2*pi)-pi
    _nx.copyto(ddmod, pi, where=(ddmod==-pi) & (dd > 0))
    ph_correct = ddmod - dd
    _nx.copyto(ph_correct, 0, where=abs(dd) < discont)
    up = array(p, copy=True, dtype='d')
    up[slice1] = p[slice1] + ph_correct.cumsum(axis)
    return up

I don't know why it's too slow though. It looks well vectorized.

Coming back to your C algorithm, I'm not a C guru, so I don't have a
clear picture of what it's doing. Do you have a Python prototype?

Best,
Pierre

[1]
https://github.com/numpy/numpy/blob/master/numpy/lib/function_base.py#L1117





Re: [Numpy-discussion] phase unwrapping (1d)

2013-01-14 Thread Neal Becker
Nadav Horesh wrote:

> There is an unwrap function in numpy. Doesn't it work for you?
> 

Like I had said, np.unwrap was too slow.  Profiling showed it eating up an 
absurd proportion of time.  My c++ code was much better (although still 
surprisingly slow).



Re: [Numpy-discussion] numpydoc for python 3?

2013-01-14 Thread Matthew Brett
Hi,

On Mon, Jan 14, 2013 at 10:35 AM, Jaakko Luttinen
 wrote:
> On 01/14/2013 12:53 AM, Matthew Brett wrote:
>> On Sun, Jan 13, 2013 at 10:46 PM, Jaakko Luttinen
>>  wrote:
>>> I'm a bit stuck trying to make numpydoc Python 3 compatible. I made
>>> setup.py try to use distutils.command.build_py.build_py_2to3 in order to
>>> transform installed code automatically to Python 3. However, the tests
>>> (in tests folder) are not part of the package but rather package_data,
>>> so they won't get transformed. How can I automatically transform the
>>> tests too? Probably there is some easy and "right" solution to this, but
>>> I haven't been able to figure out a nice and simple solution.. Any
>>> ideas? Thanks.
>>
>> Can you add tests as a package 'numpydoc.tests' and add an __init__.py
>> file to the 'tests' directory?
>
> I thought there is some reason why the 'tests' directory is not added as
> a package 'numpydoc.tests', so I didn't want to take that route.

I think the only reason is so that people can't import
'numpydoc.tests' in case they get confused.   We (nipy.org/nipy etc)
used to use packagedata for tests, but then we lost interest in
preventing people doing the import, and started to enjoy being able to
port things across as packages, do relative imports, run 2to3 and so
on.  So, I'd just go for it.

>> You might be able to get away without 2to3, using the kind of stuff
>> that Pauli has used for scipy recently:
>>
>> https://github.com/scipy/scipy/pull/397
>
> Ok, thanks, maybe I'll try to make the tests valid in all Python
> versions. It seems there's only one line which I'm not able to transform.
>
> In doc/sphinxext/tests/test_docscrape.py, on line 559:
> assert doc['Summary'][0] == u'öäöäöäöäö'.encode('utf-8')
>
> This is invalid in Python 3.0-3.2. How could I write this in such a way
> that it is valid in all Python versions? I'm a bit lost with these
> unicode encodings in Python (and in general).. And I didn't want to add
> dependency on 'six' package.

Pierre's suggestion is good; you can also do something like this:

# -*- coding: utf8 -*-
import sys

if sys.version_info[0] >= 3:
a = 'öäöäöäöäö'
else:
a = unicode('öäöäöäöäö', 'utf8')

The 'coding' line has to be the first or second line in the file.

Best,

Matthew


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Pierre Haessig
Hi,

On 14/01/2013 00:39, Nathaniel Smith wrote:
> (The nice thing about np.filled() is that it makes np.zeros() and
> np.ones() feel like clutter, rather than the reverse... not that I'm
> suggesting ever getting rid of them, but it makes the API conceptually
> feel smaller, not larger.)
Coming from the Matlab syntax, I feel that np.zeros and np.ones are in
numpy for Matlab (and maybe others?) compatibility and are useful for
that. Now that I've been "enlightened" by Python, I think that those
functions (especially np.ones) are indeed clutter. Therefore I favor the
introduction of these two new functions.

However, I think Eric's remark about masked array API compatibility is
important. I don't know what other names are possible ? np.const ?

Or maybe np.tile is also useful for that same purpose ? In that case
adding a dtype argument to np.tile would be useful.
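
Worth noting: np.tile does accept a scalar already, so it can build a
constant array today. A quick check (mine, not from the thread); the dtype is
inherited from the scalar, which is why the suggested dtype argument would
still be needed:

```python
import numpy as np

# tiling a scalar yields a constant array of the requested shape:
a = np.tile(np.nan, (2, 3))
print(a.shape)   # (2, 3)
print(a.dtype)   # float64, taken from the scalar; no dtype= to override it
```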

best,
Pierre





Re: [Numpy-discussion] 1.8 release

2013-01-14 Thread Matthew Brett
Hi,

On Mon, Jan 14, 2013 at 12:19 AM, David Cournapeau  wrote:
> On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith  wrote:
>> On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris
>>  wrote:
>>> Now that 1.7 is nearing release, it's time to look forward to the 1.8
>>> release. I'd like us to get back to the twice yearly schedule that we tried
>>> to maintain through the 1.3 - 1.6 releases, so I propose a June release as a
>>> goal. Call it the Spring Cleaning release. As to content, I'd like to see
>>> the following.
>>>
>>> Removal of Python 2.4-2.5 support.
>>> Removal of SCons support.
>>> The index work consolidated.
>>> Initial stab at removing the need for 2to3. See Pauli's PR for scipy.
>>> Miscellaneous enhancements and fixes.
>>
>> I'd actually like to propose a faster release cycle than this, even.
>> Perhaps 3 months between releases; 2 months from release n to the
>> first beta of n+1?
>>
>> The consequences would be:
>> * Changes get out to users faster.
>> * Each release is smaller, so it's easier for downstream projects to
>> adjust to each release -- instead of having this giant pile of changes
>> to work through all at once every 6-12 months
>> * End-users are less scared of updating, because the changes aren't so
>> overwhelming, so they end up actually testing (and getting to take
>> advantage of) the new stuff more.
>> * We get feedback more quickly, so we can fix up whatever we break
>> while we still know what we did.
>> * And for larger changes, if we release them incrementally, we can get
>> feedback before we've gone miles down the wrong path.
>> * Releases come out on time more often -- sort of paradoxical, but
>> with small, frequent releases, beta cycles go smoother, and it's
>> easier to say "don't worry, I'll get it ready for next time", or
>> "right, that patch was less done than we thought, let's take it out
>> for now" (also this is much easier if we don't have another years
>> worth of changes committed on top of the patch!).
>> * If your schedule does slip, then you still end up with a <6 month
>> release cycle.
>>
>> 1.6.x was branched from master in March 2011 and released in May 2011.
>> 1.7.x was branched from master in July 2012 and still isn't out. But
>> at least we've finally found and fixed the second to last bug!
>>
>> Wouldn't it be nice to have a 2-4 week beta cycle that only found
>> trivial and expected problems? We *already* have 6 months worth of
>> feature work in master that won't be in the *next* release.
>>
>> Note 1: if we do do this, then we'll also want to rethink the
>> deprecation cycle a bit -- right now we've sort of vaguely been saying
>> "well, we'll deprecate it in release n and take it out in n+1.
>> Whenever that is". 3 months definitely isn't long enough for a
>> deprecation period, so if we do do this then we'll want to deprecate
>> things for multiple releases before actually removing them. Details to
>> be determined.
>>
>> Note 2: in this kind of release schedule, you definitely don't want to
>> say "here are the features that will be in the next release!", because
>> then you end up slipping and sliding all over the place. Instead you
>> say "here are some things that I want to work on next, and we'll see
>> which release they end up in". Since we're already following the rule
>> that nothing goes into master until it's done and tested and ready for
>> release anyway, this doesn't really change much.
>>
>> Thoughts?
>
> Hey, my time to have a time-machine:
> http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html
>
> I still think it is a good idea :)

I guess it is the release manager who has by far the largest say in
this.  Who will that be for the next year or so?

Best,

Matthew


Re: [Numpy-discussion] 1.8 release

2013-01-14 Thread Phil Elson
I tried to suggest this for our matplotlib development cycle, but it didn't
get the roaring response I was hoping for (even though I was being
conservative by suggesting a 8-9 month release time):
http://matplotlib.1069221.n5.nabble.com/strategy-for-1-2-x-master-PEP8-changes-tp39453p39465.html

In essence, I think there is a lot of benefit in getting releases out
quicker. The biggest downside, IMHO, is that those who package the binary
releases have to work more frequently on what is not a particularly
glamorous task.

For those who are worried about the quality of releases being diminished by
releasing more frequently, an LTS approach could also work.

Good luck on getting these frequent releases going, IMHO there is a lot to
be said for having users on the latest and greatest, rather than have users
on old versions & still finding bugs which were introduced 24 months ago and
fixed 12 months ago on master...

Cheers,

Phil








On 14 January 2013 00:19, David Cournapeau  wrote:

> On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith  wrote:
> > On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris
> >  wrote:
> >> Now that 1.7 is nearing release, it's time to look forward to the 1.8
> >> release. I'd like us to get back to the twice yearly schedule that we
> tried
> >> to maintain through the 1.3 - 1.6 releases, so I propose a June release
> as a
> >> goal. Call it the Spring Cleaning release. As to content, I'd like to
> see
> >> the following.
> >>
> >> Removal of Python 2.4-2.5 support.
> >> Removal of SCons support.
> >> The index work consolidated.
> >> Initial stab at removing the need for 2to3. See Pauli's PR for scipy.
> >> Miscellaneous enhancements and fixes.
> >
> > I'd actually like to propose a faster release cycle than this, even.
> > Perhaps 3 months between releases; 2 months from release n to the
> > first beta of n+1?
> >
> > The consequences would be:
> > * Changes get out to users faster.
> > * Each release is smaller, so it's easier for downstream projects to
> > adjust to each release -- instead of having this giant pile of changes
> > to work through all at once every 6-12 months
> > * End-users are less scared of updating, because the changes aren't so
> > overwhelming, so they end up actually testing (and getting to take
> > advantage of) the new stuff more.
> > * We get feedback more quickly, so we can fix up whatever we break
> > while we still know what we did.
> > * And for larger changes, if we release them incrementally, we can get
> > feedback before we've gone miles down the wrong path.
> > * Releases come out on time more often -- sort of paradoxical, but
> > with small, frequent releases, beta cycles go smoother, and it's
> > easier to say "don't worry, I'll get it ready for next time", or
> > "right, that patch was less done than we thought, let's take it out
> > for now" (also this is much easier if we don't have another years
> > worth of changes committed on top of the patch!).
> > * If your schedule does slip, then you still end up with a <6 month
> > release cycle.
> >
> > 1.6.x was branched from master in March 2011 and released in May 2011.
> > 1.7.x was branched from master in July 2012 and still isn't out. But
> > at least we've finally found and fixed the second to last bug!
> >
> > Wouldn't it be nice to have a 2-4 week beta cycle that only found
> > trivial and expected problems? We *already* have 6 months worth of
> > feature work in master that won't be in the *next* release.
> >
> > Note 1: if we do do this, then we'll also want to rethink the
> > deprecation cycle a bit -- right now we've sort of vaguely been saying
> > "well, we'll deprecate it in release n and take it out in n+1.
> > Whenever that is". 3 months definitely isn't long enough for a
> > deprecation period, so if we do do this then we'll want to deprecate
> > things for multiple releases before actually removing them. Details to
> > be determined.
> >
> > Note 2: in this kind of release schedule, you definitely don't want to
> > say "here are the features that will be in the next release!", because
> > then you end up slipping and sliding all over the place. Instead you
> > say "here are some things that I want to work on next, and we'll see
> > which release they end up in". Since we're already following the rule
> > that nothing goes into master until it's done and tested and ready for
> > release anyway, this doesn't really change much.
> >
> > Thoughts?
>
> Hey, my time to have a time-machine:
> http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html
>
> I still think it is a good idea :)
>
> cheers,
> David
>


Re: [Numpy-discussion] numpydoc for python 3?

2013-01-14 Thread Pierre Haessig
Hi,

On 14/01/2013 11:35, Jaakko Luttinen wrote:
> Ok, thanks, maybe I'll try to make the tests valid in all Python
> versions. It seems there's only one line which I'm not able to transform.
>
> In doc/sphinxext/tests/test_docscrape.py, on line 559:
> assert doc['Summary'][0] == u'öäöäöäöäö'.encode('utf-8')
>
> This is invalid in Python 3.0-3.2. How could I write this in such a way
> that it is valid in all Python versions? I'm a bit lost with these
> unicode encodings in Python (and in general).. And I didn't want to add
> dependency on 'six' package.
Just as a side note about Python and encodings, I found great help in
watching (by chance) the PyCon 2012 presentation "Pragmatic Unicode or
How do I stop the Pain ?" by Ned Batchelder :
http://nedbatchelder.com/text/unipain.html

Now, if I understand the problem correctly, the u'xxx' syntax was
reintroduced in Python 3.3 specifically to enhance the 2to3
compatibility
(http://docs.python.org/3/whatsnew/3.3.html#pep-414-explicit-unicode-literals).
Maybe the question is then whether it's worth supporting Python 3.0-3.2
or not ?


Also, one possible rewrite of the test would be to replace the unicode
string with the corresponding UTF-8-encoded bytes:

assert doc['Summary'][0] == \
    b'\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa5\xc3\xa5\xc3\xa5\xc3\xa5'
# output of 'öäöäöäöäö'.encode('utf-8')

(One restriction: I think the b'' prefix was only introduced in Python 2.6.)

I'm not sure about the readability, though...
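To make the suggestion concrete, here is a minimal sketch of that portable comparison. It is an illustration, not code from the thread: it uses a shorter nine-character string 'öäöäöäöäö' for brevity, and assumes Python 2.6+ or any 3.x, where a b'' literal is accepted unchanged.

```python
# -*- coding: utf-8 -*-
# Sketch (assumption: Python 2.6+ / 3.x): spell the expected value as a
# bytes literal instead of u'...'.encode('utf-8'). The escapes below are
# the UTF-8 encoding of 'öäöäöäöäö'.
expected = (b'\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6'
            b'\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6')

# On Python 3, decoding recovers the original text, which is what the
# docscrape test would compare against doc['Summary'][0].
assert expected.decode('utf-8') == u'\u00f6\u00e4\u00f6\u00e4\u00f6\u00e4\u00f6\u00e4\u00f6'
```

The bytes-literal form avoids both the u'' prefix (invalid on Python 3.0-3.2) and any dependency on 'six', at some cost in readability.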

Best,
Pierre







Re: [Numpy-discussion] numpydoc for python 3?

2013-01-14 Thread Jaakko Luttinen
On 01/14/2013 12:53 AM, Matthew Brett wrote:
> On Sun, Jan 13, 2013 at 10:46 PM, Jaakko Luttinen
>  wrote:
>> I'm a bit stuck trying to make numpydoc Python 3 compatible. I made
>> setup.py try to use distutils.command.build_py.build_py_2to3 in order to
>> transform installed code automatically to Python 3. However, the tests
>> (in tests folder) are not part of the package but rather package_data,
>> so they won't get transformed. How can I automatically transform the
>> tests too? Probably there is some easy and "right" solution to this, but
>> I haven't been able to figure out a nice and simple solution. Any
>> ideas? Thanks.
> 
> Can you add tests as a package 'numpydoc.tests' and add an __init__.py
> file to the 'tests' directory?

I thought there was some reason why the 'tests' directory is not added as
a package 'numpydoc.tests', so I didn't want to take that route.

> You might be able to get away without 2to3, using the kind of stuff
> that Pauli has used for scipy recently:
> 
> https://github.com/scipy/scipy/pull/397

Ok, thanks, maybe I'll try to make the tests valid in all Python
versions. It seems there's only one line which I'm not able to transform.

In doc/sphinxext/tests/test_docscrape.py, on line 559:
assert doc['Summary'][0] == u'öäöäöäöäö'.encode('utf-8')

This is invalid in Python 3.0-3.2. How could I write this in such a way
that it is valid in all Python versions? I'm a bit lost with these
unicode encodings in Python (and in general). And I didn't want to add
a dependency on the 'six' package.

Regards,
Jaakko



Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Dave Hirschfeld
Robert Kern  gmail.com> writes:

> 
> >>> >
> >>> > One alternative that does not expand the API with two-liners is to let
> >>> > the ndarray.fill() method return self:
> >>> >
> >>> >   a = np.empty(...).fill(20.0)
> >>>
> >>> This violates the convention that in-place operations never return
> >>> self, to avoid confusion with out-of-place operations. E.g.
> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus
> >>> np.sort(), and in the broader Python world, list.sort() versus
> >>> sorted(), list.reverse() versus reversed(). (This was an explicit
> >>> reason given for list.sort to not return self, even.)
> >>>
> >>> Maybe enabling this idiom is a good enough reason to break the
> >>> convention ("Special cases aren't special enough to break the rules. /
> >>> Although practicality beats purity"), but it at least makes me -0 on
> >>> this...
> >>>
> >>
> >> I tend to agree with the notion that inplace operations shouldn't return
> >> self, but I don't know if it's just because I've been conditioned this way.
> >> Not returning self breaks the fluid interface pattern [1], as noted in a
> >> similar discussion on pandas [2], FWIW, though there's likely some way to
> >> have both worlds.
> >
> > Ah-hah, here's the email where Guido officially proclaims that there
> > shall be no "fluent interface" nonsense applied to in-place operators
> > in Python, because it hurts readability (at least for Dutch people):
> >   http://mail.python.org/pipermail/python-dev/2003-October/038855.html
> 
> That's a statement about the policy for the stdlib, and just one
> person's opinion. You, and numpy, are permitted to have a different
> opinion.
> 
> In any case, I'm not strongly advocating for it. Its violation of
> principle ("no fluent interfaces") is roughly in the same ballpark as
> np.filled() ("not every two-liner needs its own function"), so I
> thought I would toss it out there for consideration.
> 
> --
> Robert Kern
> 
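For context on the convention under debate, here is a quick sketch of the trap it avoids, using numpy's actual current behavior: ndarray.fill() works in place and returns None, so chaining it silently discards the array.

```python
import numpy as np

# The chained spelling proposed above does NOT work today: fill() is an
# in-place method that returns None, so `a` ends up bound to None.
a = np.empty(3).fill(20.0)
assert a is None

# The non-chained, two-line spelling does what the chained one intended.
b = np.empty(3)
b.fill(20.0)
assert (b == 20.0).all()
```

Making fill() return self would legalize the first spelling, which is exactly the break with convention being weighed here.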

FWIW I'm +1 on the idea. Perhaps because I just don't see many practical
downsides to breaking the convention, but I regularly see a big issue with
there being no way to instantiate an array filled with a particular value.

The one obvious way to do it is to use ones and multiply by the value you
want. I work with a lot of inexperienced programmers and I see this idiom
all the time. It takes a fair amount of numpy knowledge to know that you
should instead do it in two lines, using empty and then setting a slice.

In [1]: %timeit NaN*ones(1)
1000 loops, best of 3: 1.74 ms per loop

In [2]: %%timeit
   ...: x = empty(1, dtype=float)
   ...: x[:] = NaN
   ...: 
1 loops, best of 3: 28 us per loop

In [3]: 1.74e-3/28e-6
Out[3]: 62.142857142857146


Even when not in the mythical "tight loop", setting an array to one and
then multiplying uses up a lot of cycles - it's nearly two orders of
magnitude slower than what we know they *should* be doing.

I'm agnostic as to whether fill should be modified or new functions
provided, but I think numpy is currently missing this functionality, and
providing it would save a lot of new users from shooting themselves in the
foot performance-wise.
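A sketch of the kind of helper being discussed, wrapping the empty-plus-assign idiom in one call. The name filled and its signature are assumptions for illustration here, not an existing numpy function:

```python
import numpy as np

# Hypothetical helper in the spirit of the proposed np.filled(): allocate
# uninitialized storage, then make a single fill pass over it.
def filled(shape, fill_value, dtype=None):
    a = np.empty(shape, dtype=dtype)
    a[:] = fill_value
    return a

# One call replaces the slow `fill_value * np.ones(shape)` idiom.
x = filled((2, 3), np.nan)
assert x.shape == (2, 3)
assert np.isnan(x).all()
```

New users would then get the fast two-step pattern by default, without needing to know that ones-then-multiply makes an extra pass over the data.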

-Dave




