Re: [Numpy-discussion] Compare NumPy arrays with threshold

2017-05-18 Thread Nissim Derdiger
Hi again,
Thanks for the responses to my question!
Robert's answer worked very well for me, except for one small issue:

This line:
close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
returns each difference twice: once for j compared to i and once for i 
compared to j

for example:

for this input:
MatA = [[10,20,30],[40,50,60]]
MatB = [[10,30,30],[40,50,160]]

My old code will return:
0,1,20,30
1,3,60,160
Your code returns:
0,1,20,30
1,3,60,160
0,1,30,20
1,3,160,60


I could simply cut "close_mask" in half so that each difference appears only once, but that 
does not seem efficient.
Any ideas?



Also, what should I change to support 3D arrays as well?


Thanks again,
Nissim.




-Original Message-
From: NumPy-Discussion 
[mailto:numpy-discussion-bounces+nissimd=elspec-ltd@python.org] On Behalf 
Of numpy-discussion-requ...@python.org
Sent: Wednesday, May 17, 2017 8:17 PM
To: numpy-discussion@python.org
Subject: NumPy-Discussion Digest, Vol 128, Issue 18


Today's Topics:

   1. Compare NumPy arrays with threshold and return the
  differences (Nissim Derdiger)
   2. Re: Compare NumPy arrays with threshold and return the
  differences (Paul Hobson)
   3. Re: Compare NumPy arrays with threshold and return the
  differences (Robert Kern)


--

Message: 1
Date: Wed, 17 May 2017 16:50:40 +
From: Nissim Derdiger <niss...@elspec-ltd.com>
To: "numpy-discussion@python.org" <numpy-discussion@python.org>
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and
return the differences
Message-ID:

<9EFE3345170EF24DB67C61C1B05EEEDB4073F384@EX10.Elspec.local>
Content-Type: text/plain; charset="us-ascii"

Hi,

In my script, I need to compare big NumPy arrays (2D or 3D) and return a list 
of all cells whose difference is bigger than a defined threshold.
The comparison itself can easily be done with the "allclose" function, like this:
Threshold = 0.1
if np.allclose(Arr1, Arr2, Threshold, equal_nan=True):
    print('Same')

But this comparison does not tell me which cells differ.
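
A note on the call above, as a minimal sketch: in both `np.allclose` and `np.isclose` the third positional argument is the relative tolerance `rtol`, so a threshold passed positionally is scaled by the magnitude of the second array instead of acting as an absolute bound. The values below are made up for illustration and are not from the original post.

```python
import numpy as np

a = np.array([20.0])
b = np.array([30.0])
threshold = 0.5  # illustrative value, not from the original post

# Positional: threshold is taken as rtol, so the allowed gap is
# default_atol + 0.5 * |b| = 1e-8 + 15, and a difference of 10 counts as close.
as_rtol = np.isclose(a, b, threshold)

# Keyword: threshold is an absolute bound (plus the small default rtol term),
# so the same difference of 10 is not close.
as_atol = np.isclose(a, b, atol=threshold)

print(as_rtol, as_atol)
```

If the threshold is meant as an absolute difference, passing it as `atol=` is probably the intended spelling.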

The easiest (yet naive) way to find out which cells are not the same is to use 
simple for loops, like this:
def CheckWhichCellsAreNotEqualInArrays(Arr1, Arr2, Threshold):
    if not Arr1.shape == Arr2.shape:
        return ['Arrays size not the same']
    Dimensions = Arr1.shape
    Diff = []
    for i in range(Dimensions[0]):
        for j in range(Dimensions[1]):
            if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ','
                            + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
    return Diff

(and the same for 3D arrays, with one more for loop). This approach is very slow when the 
arrays are big and full of non-equal cells.

Is there a fast, straightforward way to get a list of the unequal cells when the arrays 
are not the same? Maybe some built-in function in NumPy itself?
Thanks!
Nissim


--

Message: 2
Date: Wed, 17 May 2017 10:13:46 -0700
From: Paul Hobson <pmhob...@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@python.org>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold
and return the differences
Message-ID:

mailto:CADT3MEABot==+z_il7qkzim0rdm+0hn4kp4w-vekeoqew2p...@mail.gmail.com>>
Content-Type: text/plain; charset="utf-8"

I would do something like:

diff_is_large = (array1 - array2) > threshold
index_at_large_diff = numpy.nonzero(diff_is_large)
array1[index_at_large_diff].tolist()
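
A hedged variant of the snippet above: as written, `(array1 - array2) > threshold` only flags cells where `array1` exceeds `array2`, so taking the absolute value is one way to catch differences in both directions. The sample arrays are the ones used later in this thread; the variable names follow the snippet, and `signed_only` is added only for comparison.

```python
import numpy as np

array1 = np.array([[10, 20, 30], [40, 50, 60]])
array2 = np.array([[10, 30, 30], [40, 50, 160]])
threshold = 1.0

# Signed difference: misses cells where array2 is the larger one.
signed_only = (array1 - array2) > threshold

# Absolute difference: flags a mismatch regardless of its sign.
diff_is_large = np.abs(array1 - array2) > threshold
index_at_large_diff = np.nonzero(diff_is_large)

rows, cols = index_at_large_diff
values1 = array1[index_at_large_diff].tolist()
values2 = array2[index_at_large_diff].tolist()

for r, c, v1, v2 in zip(rows, cols, values1, values2):
    print(r, c, v1, v2)
```

Unlike `np.isclose(..., equal_nan=True)`, this comparison does not treat NaN specially, so it is only a sketch for arrays without NaN values.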


On Wed, May 17, 2017 at 9:50 AM, Nissim Derdiger 
mailto:niss...@elspec-ltd.com>>
wrote:

> Hi,
> In my script, I need to compare big NumPy arrays (2D or 3D), and
> return a list of all cells with difference bigger than a defined threshold.
> The compare itself can be done easily done with "allclose" function,
> like
> that:
> Threshold = 0.1
> if (np.allclose(Arr1, Arr2, Threshold, equ

[Numpy-discussion] failed to add routine to the core module

2017-05-18 Thread marc

Dear Numpy developers,

I'm trying to add a routine to calculate the sum of a product of two 
arrays (a dot product) without increasing memory use (from what I saw, 
np.dot increases memory use, which should not be necessary). The idea is 
to avoid the temporary array in the calculation of the variance 
(numpy/numpy/core/_methods.py, line 112).


The routine that I want to implement looks like this in Python:

arr = np.random.rand(10)
mean = arr.mean()
var = 0.0
for ai in arr: var += (ai-mean)**2

I would like to implement it in the umath module. As a first step, I 
tried to reproduce the divmod function of umath, but I did not manage to 
do it. You can find my fork here (the branch with the changes is called 
looking_around). During compilation I get the following error:


gcc: numpy/core/src/multiarray/number.c
In file included from numpy/core/src/multiarray/number.c:17:0:
numpy/core/src/multiarray/number.c: In function ‘array_sum_multiply’:
numpy/core/src/private/binop_override.h:176:39: error: ‘PyNumberMethods {aka struct }’ has no member named ‘nb_sum_multiply’
    (void*)(Py_TYPE(m2)->tp_as_number->SLOT_NAME) != (void*)(test_func))
    ^
numpy/core/src/private/binop_override.h:180:13: note: in expansion of macro ‘BINOP_IS_FORWARD’
    if (BINOP_IS_FORWARD(m1, m2, slot_expr, test_func) && \
    ^
numpy/core/src/multiarray/number.c:363:5: note: in expansion of macro ‘BINOP_GIVE_UP_IF_NEEDED’
    BINOP_GIVE_UP_IF_NEEDED(m1, m2, nb_sum_multiply, array_sum_multiply);


Sorry if my question seems basic, but I'm new to NumPy development.
Any help?

Thank you in advance,

Marc Barbry

PS: I opened an issue as well on the GitHub repository
https://github.com/numpy/numpy/issues/9130
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] failed to add routine to the core module

2017-05-18 Thread Marten van Kerkwijk
Hi Marc,

ufuncs are quite tricky to compile. Part of your problem is that, I
think, you started a bit too high up: `divmod` is also a binary
operation, so that part you do not need at all. It may be an idea to
start instead with a PR that implemented a new ufunc, e.g.,
https://github.com/numpy/numpy/pull/8795, so that you can see what is
involved.

All the best,

Marten





Re: [Numpy-discussion] failed to add routine to the core module

2017-05-18 Thread marc

Hello Marten,

Thank you for your help. Indeed, the example that you proposed is 
much easier to imitate, and I can now continue.


Thanks,
Marc



Re: [Numpy-discussion] failed to add routine to the core module

2017-05-18 Thread Sebastian Berg
On Thu, 2017-05-18 at 15:04 +0200, marc wrote:
> Dear Numpy developers,
> I'm trying to add a routine to calculate the sum of a product of two
> arrays (a dot product). But that would not increase the memory (from
> what I saw np.dot is increasing the memory while it should not be
> necessary). The idea is to avoid the use of the temporary array in
> the calculation of the variance ( numpy/numpy/core/_methods.py line
> 112).

np.dot should only increase memory in some cases (such as non-contiguous
arrays) and be much faster in most cases (unless, e.g., you do not have a
BLAS-compatible type). You might also want to check out np.einsum, which
is pretty slick and can handle these kinds of operations as well. Note
that `np.dot` calls into BLAS, so it is in general much faster than
np.einsum.

- Sebastian
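
As a rough sketch of the suggestion above, the sum of squared deviations from marc's loop can be written with `np.dot` or `np.einsum`. Both still need the centered array `arr - mean` as one temporary. The chunked helper `sum_sq_dev_chunked` and its `chunk` parameter are hypothetical, added here only to show one way to bound the size of that temporary; they are not part of NumPy.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable
arr = rng.rand(1000)
mean = arr.mean()

# Reference: the explicit Python loop from the original message.
var_loop = 0.0
for ai in arr:
    var_loop += (ai - mean) ** 2

# One temporary the size of arr, then a single reduction.
centered = arr - mean
var_dot = np.dot(centered, centered)
var_einsum = np.einsum('i,i->', centered, centered)


def sum_sq_dev_chunked(x, mu, chunk=256):
    """Hypothetical helper: sum of (x - mu)**2 over a 1-D array,
    using temporaries of at most `chunk` elements."""
    total = 0.0
    for start in range(0, x.size, chunk):
        d = x[start:start + chunk] - mu
        total += np.dot(d, d)
    return total


var_chunked = sum_sq_dev_chunked(arr, mean)
print(var_loop, var_dot, var_einsum, var_chunked)
```

The chunked version trades a Python-level loop over blocks for a smaller temporary; whether that is worthwhile depends on the array size.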



[Numpy-discussion] NumPy 1.13.0rc2 released

2017-05-18 Thread Charles R Harris
Hi All,

I'm pleased to announce the NumPy 1.13.0rc2 release. This release supports
Python 2.7 and 3.4-3.6 and contains many new features. It is one of the
most ambitious releases in the last several years. Some of the highlights
and new functions are

*Highlights*

   - Operations like ``a + b + c`` will reuse temporaries on some
   platforms, resulting in less memory use and faster execution.
   - Inplace operations check if inputs overlap outputs and create
   temporaries to avoid problems.
   - New __array_ufunc__ attribute provides improved ability for classes to
   override default ufunc behavior.
   - New np.block function for creating blocked arrays.


*New functions*

   - New ``np.positive`` ufunc.
   - New ``np.divmod`` ufunc provides more efficient divmod.
   - New ``np.isnat`` ufunc tests for NaT special values.
   - New ``np.heaviside`` ufunc computes the Heaviside function.
   - New ``np.isin`` function, improves on ``in1d``.
   - New ``np.block`` function for creating blocked arrays.
   - New ``PyArray_MapIterArrayCopyIfOverlap`` added to NumPy C-API.
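
A few of the functions listed above in use, as a small sketch that assumes NumPy 1.13 or later; the input values are arbitrary.

```python
import numpy as np

# np.divmod: floor quotient and remainder in one call.
q, r = np.divmod(np.array([7, -7]), 2)

# np.isin: element-wise membership test that keeps the input's shape.
mask = np.isin([1, 2, 3], [2, 3])

# np.heaviside: the second argument is the value used where the input is 0.
step = np.heaviside([-1.0, 0.0, 1.0], 0.5)

# np.isnat: tests datetime64/timedelta64 values for NaT.
nat = np.isnat(np.datetime64('NaT'))

# np.block: assemble an array from nested lists of blocks.
a = np.ones((2, 2))
z = np.zeros((2, 2))
blocked = np.block([[a, z], [z, a]])

print(q, r, mask, step, nat, blocked.shape)
```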

Wheels for the pre-release are available on PyPI. Source tarballs,
zipfiles, release notes, and the changelog are available on GitHub.

A total of 102 people contributed to this release.  People with a "+" by
their names contributed a patch for the first time.

   - A. Jesse Jiryu Davis +
   - Alessandro Pietro Bardelli +
   - Alex Rothberg +
   - Alexander Shadchin
   - Allan Haldane
   - Andres Guzman-Ballen +
   - Antoine Pitrou
   - Antony Lee
   - B R S Recht +
   - Baurzhan Muftakhidinov +
   - Ben Rowland
   - Benda Xu +
   - Blake Griffith
   - Bradley Wogsland +
   - Brandon Carter +
   - CJ Carey
   - Charles Harris
   - Christoph Gohlke
   - Danny Hermes +
   - David Hagen +
   - David Nicholson +
   - Duke Vijitbenjaronk +
   - Egor Klenin +
   - Elliott Forney +
   - Elliott M Forney +
   - Endolith
   - Eric Wieser
   - Erik M. Bray
   - Eugene +
   - Evan Limanto +
   - Felix Berkenkamp +
   - François Bissey +
   - Frederic Bastien
   - Greg Young
   - Gregory R. Lee
   - Importance of Being Ernest +
   - Jaime Fernandez
   - Jakub Wilk +
   - James Cowgill +
   - James Sanders
   - Jean Utke +
   - Jesse Thoren +
   - Jim Crist +
   - Joerg Behrmann +
   - John Kirkham
   - Jonathan Helmus
   - Jonathan L Long
   - Jonathan Tammo Siebert +
   - Joseph Fox-Rabinovitz
   - Joshua Loyal +
   - Juan Nunez-Iglesias +
   - Julian Taylor
   - Kirill Balunov +
   - Likhith Chitneni +
   - Loïc Estève
   - Mads Ohm Larsen
   - Marein Könings +
   - Marten van Kerkwijk
   - Martin Thoma
   - Martino Sorbaro +
   - Marvin Schmidt +
   - Matthew Brett
   - Matthias Bussonnier +
   - Matthias C. M. Troffaes +
   - Matti Picus
   - Michael Seifert
   - Mikhail Pak +
   - Mortada Mehyar
   - Nathaniel J. Smith
   - Nick Papior
   - Oscar Villellas +
   - Pauli Virtanen
   - Pavel Potocek
   - Pete Peeradej Tanruangporn +
   - Philipp A +
   - Ralf Gommers
   - Robert Kern
   - Roland Kaufmann +
   - Ronan Lamy
   - Sami Salonen +
   - Sanchez Gonzalez Alvaro
   - Sebastian Berg
   - Shota Kawabuchi
   - Simon Gibbons
   - Stefan Otte
   - Stefan Peterson +
   - Stephan Hoyer
   - Søren Fuglede Jørgensen +
   - Takuya Akiba
   - Tom Boyd +
   - Ville Skyttä +
   - Warren Weckesser
   - Wendell Smith
   - Yu Feng
   - Zixu Zhao +
   - Zè Vinícius +
   - aha66 +
   - drabach +
   - drlvk +
   - jsh9 +
   - solarjoe +
   - zengi +

Cheers,

Chuck


Re: [Numpy-discussion] Compare NumPy arrays with threshold

2017-05-18 Thread Robert Kern
On Thu, May 18, 2017 at 5:07 AM, Nissim Derdiger 
wrote:
>
> Hi again,
> Thanks for the responses to my question!
> Roberts answer worked very well for me, except for 1 small issue:
>
> This line:
> close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)
> returns each difference twice – once j in compare to I and once for I in
> compare to j

No, it returns a boolean array the same size as MatA and MatB. It literally
can't contain "each difference twice". Maybe there is something else in
your code that is producing the doubling that you see, possibly in the
printing of the results.

I'm not seeing the behavior that you speak of. Please post your complete
code that produced the doubled output that you see.

import numpy as np

MatA = np.array([[10,20,30],[40,50,60]])
MatB = np.array([[10,30,30],[40,50,160]])
Threshold = 1.0

# Note the `atol=` here. I missed it before.
close_mask = np.isclose(MatA, MatB, atol=Threshold, equal_nan=True)
far_mask = ~close_mask
i_idx, j_idx = np.nonzero(far_mask)
for i, j in zip(i_idx, j_idx):
    print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, MatA[i, j], MatB[i, j],
                                                 Threshold))


I get the following output:

$ python isclose.py
0, 1, 20, 30, 1.0, Fail
1, 2, 60, 160, 1.0, Fail
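
For the 3D case asked about earlier in the thread, one possible generalization (a sketch, not part of the reply above) is to let `np.argwhere` return one index row per mismatching cell, which works for any number of dimensions. The helper name `find_far_cells` and the choice of `rtol=0.0` are assumptions made for illustration.

```python
import numpy as np


def find_far_cells(a, b, atol, rtol=0.0):
    """Hypothetical helper: list (index_tuple, a_value, b_value) for every
    cell where a and b differ by more than atol + rtol * |b|."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    far_mask = ~np.isclose(a, b, rtol=rtol, atol=atol, equal_nan=True)
    result = []
    # np.argwhere yields one row of indices per True cell, for any ndim.
    for idx in np.argwhere(far_mask):
        key = tuple(int(k) for k in idx)
        result.append((key, a[key].item(), b[key].item()))
    return result


MatA = np.array([[10, 20, 30], [40, 50, 60]])
MatB = np.array([[10, 30, 30], [40, 50, 160]])
far_2d = find_far_cells(MatA, MatB, atol=1.0)

# A made-up 3D example: a single cell differs.
A3 = np.zeros((2, 2, 2))
B3 = A3.copy()
B3[1, 0, 1] = 5.0
far_3d = find_far_cells(A3, B3, atol=1.0)

print(far_2d)
print(far_3d)
```

Each mismatching cell appears once in the result, since the mask has exactly one entry per cell.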

--
Robert Kern