Re: [Numpy-discussion] inplace unary operations?

2014-08-31 Thread josef.pktd
On Sat, Aug 30, 2014 at 1:45 PM, Nathaniel Smith n...@pobox.com wrote:

 On Sat, Aug 30, 2014 at 6:43 PM,  josef.p...@gmail.com wrote:
  Is there a way to negate a boolean, or to change the sign of a float,
  in place?

 np.logical_not(arr, out=arr)
 np.negative(arr, out=arr)


Thanks Nathaniel.

np.negative might save a bit of memory and time when we have to negate the
log-likelihood all the time.
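
For concreteness, a minimal sketch of both in-place idioms (the array names
here are illustrative):

    import numpy as np

    # Flip the sign of a float array in place, e.g. a log-likelihood that
    # an optimizer expects negated; no temporary array is allocated.
    loglike = np.random.rand(5)
    np.negative(loglike, out=loglike)

    # Negate a boolean mask in place.
    mask = np.array([True, False, True])
    np.logical_not(mask, out=mask)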

Josef




 -n

 --
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org



Re: [Numpy-discussion] inplace unary operations?

2014-08-31 Thread Pierre Barbier de Reuille
Just to point out another solution to change the sign:

 arr *= -1

Both solutions take the same time on my computer. However, the boolean
equivalent:

 arr ^= True

is a lot slower than using np.negative.
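
A rough micro-benchmark sketch of that comparison (timings will vary by
machine and array size; setup strings keep it self-contained):

    import timeit

    setup_f = "import numpy as np; arr = np.random.rand(1000000)"
    print(timeit.timeit("np.negative(arr, out=arr)", setup=setup_f, number=100))
    print(timeit.timeit("arr *= -1", setup=setup_f, number=100))

    setup_b = "import numpy as np; mask = np.zeros(1000000, dtype=bool)"
    print(timeit.timeit("np.logical_not(mask, out=mask)", setup=setup_b, number=100))
    print(timeit.timeit("mask ^= True", setup=setup_b, number=100))  # XOR with True flips bools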

My two cents ...



-- 
Dr. Barbier de Reuille, Pierre
Institute of Plant Sciences
Altenbergrain 21, CH-3013 Bern, Switzerland
http://www.botany.unibe.ch/associated/systemsx/index.php




Re: [Numpy-discussion] Does a `mergesorted` function make sense?

2014-08-31 Thread Eelco Hoogendoorn
I've organized all the code I had relating to this subject in a github
repository: https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP. That
should facilitate shooting around ideas. I've also added more documentation
and structure to make it easier to see what is going on.

Hopefully we can converge on a common vision, and then improve the
documentation and testing to make it worthy of inclusion in numpy master.

Note that there is also a complete rewrite of the classic numpy.arraysetops,
generalized to more complex inputs, such as finding unique graph edges, and
so on.
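
As a sketch of the kind of generalization meant here, the standard trick for
row-wise unique (not necessarily the repository's implementation) views each
edge as one opaque element:

    import numpy as np

    # Edges as an (n, 2) integer array, duplicates included.
    edges = np.array([[0, 1], [1, 2], [0, 1], [2, 3]])

    # View each row as a single void element so np.unique can
    # deduplicate rows rather than scalars.
    row = np.dtype((np.void, edges.dtype.itemsize * edges.shape[1]))
    flat = np.ascontiguousarray(edges).view(row)
    unique_edges = np.unique(flat).view(edges.dtype).reshape(-1, edges.shape[1])
    print(unique_edges)  # [[0 1] [1 2] [2 3]]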

You mentioned getting the numpy core developers involved; aren't they
subscribed to this mailing list? I wouldn't be surprised if they aren't;
you'd hope there is a channel of development discussion with a higher
signal-to-noise ratio.


On Thu, Aug 28, 2014 at 1:49 AM, Eelco Hoogendoorn 
hoogendoorn.ee...@gmail.com wrote:

 I just checked the docs on ufuncs, and it appears that's a solved problem
 now, since ufunc.reduceat now comes with an axis argument. Or maybe it
 already did when I wrote that and I simply wasn't paying attention. Either
 way, the code is fully vectorized now, over both grouped and non-grouped
 axes. It's a lot of code, but all that happens for a grouping, apart from
 some O(1) and O(n) bookkeeping, is an argsort of the keys followed by the
 reduction itself, all fully vectorized.
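
 A minimal sketch of that sort-then-reduceat pattern (illustrative, not the
 repository's code; empty-group handling is elided):

     import numpy as np

     keys = np.random.randint(0, 4, 1000)
     values = np.random.rand(1000)

     order = np.argsort(keys)               # bring equal keys together
     skeys, svals = keys[order], values[order]
     # Index where each run of equal keys starts.
     starts = np.concatenate(([0], np.flatnonzero(np.diff(skeys)) + 1))
     sums = np.add.reduceat(svals, starts)  # one reduction per group
     print(skeys[starts], sums)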

 Note that I sort the values first and then use ufunc.reduceat on the
 groups. It would seem that ufunc.at should be more efficient, since it
 avoids this indirection, but testing very much revealed the opposite, for
 reasons unclear to me. Perhaps that has changed now as well.
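
 For comparison, a sketch of the ufunc.at route being referred to, which
 scatter-adds directly without the sort:

     import numpy as np

     keys = np.random.randint(0, 4, 1000)
     values = np.random.rand(1000)

     unique, inverse = np.unique(keys, return_inverse=True)
     sums = np.zeros(len(unique))
     np.add.at(sums, inverse, values)  # unbuffered in-place scatter-add
     print(unique, sums)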


 On Wed, Aug 27, 2014 at 11:32 PM, Jaime Fernández del Río 
 jaime.f...@gmail.com wrote:

 Yes, I was aware of that. But the point would be to provide true
 vectorization for those operations.

 The way I see it, numpy may not need to have a GroupBy implementation,
 but it should at least enable implementing one that is fast and efficient
 over any axis.


 On Wed, Aug 27, 2014 at 12:38 PM, Eelco Hoogendoorn 
 hoogendoorn.ee...@gmail.com wrote:

 That is, if the grouped axis is small but the other axes are not, you could
 write the following, which avoids the python loop over the long axis that
 np.vectorize would otherwise perform:

 import numpy as np
 from grouping import group_by

 keys = np.random.randint(0, 4, 10)
 values = np.random.rand(10, 2000)
 # Iterate over the few groups; the mean over the long axis stays vectorized.
 for k, g in zip(*group_by(keys)(values)):
     print k, g.mean(0)




 On Wed, Aug 27, 2014 at 9:29 PM, Eelco Hoogendoorn 
 hoogendoorn.ee...@gmail.com wrote:

 E.g., this works as expected as well (100 keys that are 1d int arrays, and
 100 values that are 1d float arrays):

 from numpy.random import randint, rand
 from grouping import group_by
 group_by(randint(0, 4, (100, 2))).mean(rand(100, 2))


 On Wed, Aug 27, 2014 at 9:27 PM, Eelco Hoogendoorn 
 hoogendoorn.ee...@gmail.com wrote:

  If I understand you correctly, the current implementation supports these
  operations. All reductions over groups (except for median) are performed
  through the corresponding ufunc (see GroupBy.reduce). This works on
  multidimensional arrays as well, although the broadcasting over the
  non-grouped axes is accomplished using np.vectorize. Actual vectorization
  only happens over the axis being grouped, but this is usually a long axis.
  If it isn't, it is more efficient to split the array by its groups first
  and then map the iterable of groups over some reduction operation (as
  noted in the docstring of GroupBy.reduce).
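
  A sketch of that split-and-map fallback (illustrative, not the
  GroupBy.reduce internals):

      import numpy as np

      keys = np.random.randint(0, 4, 10)    # short grouped axis
      values = np.random.rand(10, 2000)     # long non-grouped axis

      order = np.argsort(keys)
      splits = np.flatnonzero(np.diff(keys[order])) + 1
      groups = np.split(values[order], splits)   # one sub-array per key
      means = [g.mean(axis=0) for g in groups]   # vectorized within each group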


 On Wed, Aug 27, 2014 at 8:29 PM, Jaime Fernández del Río 
 jaime.f...@gmail.com wrote:

 Hi Eelco,

  I took a deeper look into your code a couple of weeks back. I don't think
  I have fully grasped everything it allows, but I agree that some form of
  what you have there is highly desirable. Along the same lines, for some
  time I have been thinking that the right place for a `groupby` in numpy is
  as a method of ufuncs, so that `np.add.groupby(arr, groups)` would do a
  multidimensional version of `np.bincount(groups, weights=arr)`. You would
  then need a more powerful version of `np.unique` to produce the `groups`,
  but that is something Joe Kington's old PR was very close to achieving,
  and it should probably be resurrected as well. But yes, there seems to be
  material for a NEP here, and some guidance from one of the numpy devs
  would be helpful in getting this somewhere.
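
  `np.add.groupby` is hypothetical, but the 1-d building block it would
  generalize already exists; a minimal sketch:

      import numpy as np

      groups = np.array([0, 1, 0, 2, 1])
      arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

      # Per-group sums along one axis; the hypothetical np.add.groupby
      # would be the n-dimensional, any-ufunc generalization of this.
      print(np.bincount(groups, weights=arr))  # [ 4.  7.  4.]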

 Jaime


 On Wed, Aug 27, 2014 at 10:35 AM, Eelco Hoogendoorn 
 hoogendoorn.ee...@gmail.com wrote:

  It wouldn't hurt to have this function, but my intuition is that its use
  will be minimal. If you are already working with sorted arrays, you
  already have a flop cost of that order of magnitude, and the optimized
  merge saves you a factor of two at the very most. Using numpy means
  sacrificing factors of two and beyond relative to pure C left, right, and
  center anyway, so if this kind of thing matters to you, you probably
  won't be working in numpy in the first place.

 That said,