Hi again,

Thanks for the responses to my question! Robert's answer worked very well for me, except for one small issue:
This line:

    close_mask = np.isclose(MatA, MatB, Threshold, equal_nan=True)

returns each difference twice: once for j compared to i and once for i compared to j. For example, for this input:

    MatA = [[10,20,30],[40,50,60]]
    MatB = [[10,30,30],[40,50,160]]

My old code returns:

    0,1,20,30
    1,3,60,160

Your code returns:

    0,1,20,30
    1,3,60,160
    0,1,30,20
    1,3,160,60

I can simply cut "close_mask" in half so that I have only one iteration, but that does not seem efficient. Any ideas?
Also, what should I change to support 3D arrays as well?

Thanks again,
Nissim.

-----Original Message-----
From: NumPy-Discussion [mailto:numpy-discussion-bounces+nissimd=elspec-ltd....@python.org] On Behalf Of numpy-discussion-requ...@python.org
Sent: Wednesday, May 17, 2017 8:17 PM
To: numpy-discussion@python.org
Subject: NumPy-Discussion Digest, Vol 128, Issue 18

Today's Topics:

   1. Compare NumPy arrays with threshold and return the differences (Nissim Derdiger)
   2. Re: Compare NumPy arrays with threshold and return the differences (Paul Hobson)
   3. Re: Compare NumPy arrays with threshold and return the differences (Robert Kern)

----------------------------------------------------------------------

Message: 1
Date: Wed, 17 May 2017 16:50:40 +0000
From: Nissim Derdiger <niss...@elspec-ltd.com>
To: "numpy-discussion@python.org" <numpy-discussion@python.org>
Subject: [Numpy-discussion] Compare NumPy arrays with threshold and return the differences
Message-ID: <9EFE3345170EF24DB67C61C1B05EEEDB4073F384@EX10.Elspec.local>
Content-Type: text/plain; charset="us-ascii"

Hi,
In my script, I need to compare big NumPy arrays (2D or 3D) and return a list of all cells whose difference is bigger than a defined threshold.
The comparison itself can easily be done with the "allclose" function, like this:

    Threshold = 0.1
    if (np.allclose(Arr1, Arr2, Threshold, equal_nan=True)):
        print('Same')

But this comparison does not return which cells are not the same.

The easiest (yet naive) way to know which cells are not the same is to use simple for loops, like this:

    def CheckWhichCellsAreNotEqualInArrays(Arr1,Arr2,Threshold):
        if not Arr1.shape == Arr2.shape:
            return ['Arrays size not the same']
        Dimensions = Arr1.shape
        Diff = []
        for i in range(Dimensions[0]):
            for j in range(Dimensions[1]):
                if not np.allclose(Arr1[i][j], Arr2[i][j], Threshold, equal_nan=True):
                    Diff.append(',' + str(i) + ',' + str(j) + ',' + str(Arr1[i,j]) + ',' + str(Arr2[i,j]) + ',' + str(Threshold) + ',Fail\n')
        return Diff

(and the same for 3D arrays, with one more for loop)
This way is very slow when the arrays are big and full of non-equal cells.

Is there a fast, straightforward way to get a list of the unequal cells when the arrays are not the same? Maybe some built-in function in NumPy itself?
Thanks!
Nissim
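A side note on the `Threshold` argument used above: in `np.allclose` and `np.isclose` the third positional parameter is `rtol`, a relative tolerance that scales with the magnitude of the second array, so passing `0.1` there means "within about 10%" rather than "within 0.1". The following is only a small sketch with made-up sample values (`a`, `b`, `rel_close`, `abs_close` are illustrative names, not from the original post) showing how the two readings differ:

```python
import numpy as np

# Illustrative values only, not taken from the thread.
a = np.array([100.0, 1.0])
b = np.array([105.0, 1.5])

# Third positional argument is rtol: |a - b| <= atol + rtol * |b|
rel_close = np.isclose(a, b, 0.1, equal_nan=True)

# For an absolute threshold, switch rtol off and set atol explicitly.
abs_close = np.isclose(a, b, rtol=0.0, atol=0.1, equal_nan=True)

print(rel_close.tolist(), abs_close.tolist())
```

Which of the two is wanted depends on whether the threshold is meant as a fraction of the value or as a fixed distance.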
------------------------------

Message: 2
Date: Wed, 17 May 2017 10:13:46 -0700
From: Paul Hobson <pmhob...@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@python.org>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold and return the differences
Message-ID: <CADT3MEABot==+z_il7qkzim0rdm+0hn4kp4w-vekeoqew2p...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I would do something like:

    diff_is_large = (array1 - array2) > threshold
    index_at_large_diff = numpy.nonzero(diff_is_large)
    array1[index_at_large_diff].tolist()
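One caveat with the subtraction approach in Message 2: as written, `(array1 - array2) > threshold` only flags cells where `array1` is the larger value. A possible variant, sketched here under the assumption that the threshold is an absolute distance (the value `5` and the names `values1`/`values2` are illustrative), takes the absolute difference first:

```python
import numpy as np

array1 = np.array([[10, 20, 30], [40, 50, 60]])
array2 = np.array([[10, 30, 30], [40, 50, 160]])
threshold = 5  # assumed absolute threshold, for illustration only

# np.abs makes the test symmetric, so negative differences are caught too.
diff_is_large = np.abs(array1 - array2) > threshold
index_at_large_diff = np.nonzero(diff_is_large)

values1 = array1[index_at_large_diff].tolist()
values2 = array2[index_at_large_diff].tolist()
print(values1, values2)
```

With this sample data every difference is negative, so the version without `np.abs` would report nothing.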
------------------------------

Message: 3
Date: Wed, 17 May 2017 10:16:09 -0700
From: Robert Kern <robert.k...@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@python.org>
Subject: Re: [Numpy-discussion] Compare NumPy arrays with threshold and return the differences
Message-ID: <CAF6FJisn3Oj18HOOP-DJGOi7rTwr-1U4npef+wCd=ennmkm...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Wed, May 17, 2017 at 9:50 AM, Nissim Derdiger <niss...@elspec-ltd.com> wrote:
> Is there a fast straight forward way in case they are not the same -
> to get a list of the uneven cells? maybe some built-in function in the
> NumPy itself?

Use `close_mask = np.isclose(Arr1, Arr2, Threshold, equal_nan=True)` to return a boolean mask the same shape as the arrays which is True where the elements are close and False where they are not. You can invert it to get a boolean mask which is True where they are "far" with respect to the threshold: `far_mask = ~close_mask`. Then you can use `i_idx, j_idx = np.nonzero(far_mask)` to get arrays of the `i` and `j` indices where the values are far. For example:

    for i, j in zip(i_idx, j_idx):
        print("{0}, {1}, {2}, {3}, {4}, Fail".format(i, j, Arr1[i, j], Arr2[i, j], Threshold))

--
Robert Kern

------------------------------

End of NumPy-Discussion Digest, Vol 128, Issue 18
*************************************************
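The steps in Message 3 can be sketched in a way that does not depend on the number of dimensions, which may help with the 3D case. This is only a sketch: `far_cells` is a hypothetical helper name, and it uses `np.argwhere` (which returns one row of indices per True cell) in place of unpacking `np.nonzero` into a fixed number of index arrays. In this form each differing cell is listed once, so if duplicates appear they probably come from how the index arrays are combined afterwards rather than from the mask itself.

```python
import numpy as np

def far_cells(arr1, arr2, threshold):
    """Return (index_tuple, value1, value2) for every cell that is not close.

    Hypothetical helper; threshold is passed as rtol, as in the thread.
    """
    arr1 = np.asarray(arr1)
    arr2 = np.asarray(arr2)
    if arr1.shape != arr2.shape:
        raise ValueError("arrays must have the same shape")
    far_mask = ~np.isclose(arr1, arr2, threshold, equal_nan=True)
    result = []
    # np.argwhere gives one index row per True cell, for any ndim.
    for idx in np.argwhere(far_mask):
        key = tuple(int(k) for k in idx)
        result.append((key, arr1[key].item(), arr2[key].item()))
    return result

MatA = [[10, 20, 30], [40, 50, 60]]
MatB = [[10, 30, 30], [40, 50, 160]]
for idx, v1, v2 in far_cells(MatA, MatB, 0.1):
    print(", ".join(map(str, idx)), v1, v2, sep=", ")
```

The same call works unchanged for 3D input, where each index tuple simply has three entries.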
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion