Re: [Numpy-discussion] best way of speeding up a filtering-like algorithm
Well, one tip to start with:

    numpy.where(some_comparison, True, False)

is the same as, but slower than,

    some_comparison

Eric

On Wed, 28 Mar 2018 at 18:36 Moroney, Catherine M (398E)
<catherine.m.moro...@jpl.nasa.gov> wrote:

> Hello,
>
> I have the following sample code (pretty simple algorithm that uses a
> rolling filter window) and am wondering what the best way is of speeding
> it up.  I tried rewriting it in Cython by pre-declaring the variables but
> that didn’t buy me a lot of time.  Then I rewrote it in Fortran (and
> compiled it with f2py) and now it’s lightning fast.  But I would still
> like to know if I could rewrite it in pure python/numpy/scipy or in
> Cython and get a similar speedup.
>
> Here is the raw Python code:
>
> def mixed_coastline_slow(nsidc, radius, count, mask=None):
>
>     nsidc_copy = numpy.copy(nsidc)
>
>     if (mask is None):
>         idx_coastline = numpy.where(nsidc_copy == NSIDC_COASTLINE_MIXED)
>     else:
>         idx_coastline = numpy.where(mask &
>                                     (nsidc_copy == NSIDC_COASTLINE_MIXED))
>
>     for (irow0, icol0) in zip(idx_coastline[0], idx_coastline[1]):
>
>         rows = ( max(irow0-radius, 0),
>                  min(irow0+radius+1, nsidc_copy.shape[0]) )
>         cols = ( max(icol0-radius, 0),
>                  min(icol0+radius+1, nsidc_copy.shape[1]) )
>         window = nsidc[rows[0]:rows[1], cols[0]:cols[1]]
>
>         npoints = numpy.where(window != NSIDC_COASTLINE_MIXED,
>                               True, False).sum()
>         nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) &
>                                 (window <= NSIDC_FRESHSNOW), \
>                                 True, False).sum()
>
>         if (100.0*nsnowice/npoints >= count):
>             nsidc_copy[irow0, icol0] = MISR_SEAICE_THRESHOLD
>
>     return nsidc_copy
>
> and here is my attempt at Cython-izing it:
>
> import numpy
> cimport numpy as cnumpy
> cimport cython
>
> cdef int NSIDC_SIZE = 721
> cdef int NSIDC_NO_SNOW = 0
> cdef int NSIDC_ALL_SNOW = 100
> cdef int NSIDC_FRESHSNOW = 103
> cdef int NSIDC_PERMSNOW = 101
> cdef int NSIDC_SEAICE_LOW = 1
> cdef int NSIDC_SEAICE_HIGH = 100
> cdef int NSIDC_COASTLINE_MIXED = 252
> cdef int NSIDC_SUSPECT_ICE = 253
>
> cdef int MISR_SEAICE_THRESHOLD = 6
>
> def mixed_coastline(cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc,
>                     int radius, int count):
>
>     cdef int irow, icol, irow1, irow2, icol1, icol2, npoints, nsnowice
>     cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] nsidc2 \
>         = numpy.empty(shape=(NSIDC_SIZE, NSIDC_SIZE), dtype=numpy.uint8)
>     cdef cnumpy.ndarray[cnumpy.uint8_t, ndim=2] window \
>         = numpy.empty(shape=(2*radius+1, 2*radius+1), dtype=numpy.uint8)
>
>     nsidc2 = numpy.copy(nsidc)
>
>     idx_coastline = numpy.where(nsidc2 == NSIDC_COASTLINE_MIXED)
>
>     for (irow, icol) in zip(idx_coastline[0], idx_coastline[1]):
>
>         irow1 = max(irow-radius, 0)
>         irow2 = min(irow+radius+1, NSIDC_SIZE)
>         icol1 = max(icol-radius, 0)
>         icol2 = min(icol+radius+1, NSIDC_SIZE)
>         window = nsidc[irow1:irow2, icol1:icol2]
>
>         npoints = numpy.where(window != NSIDC_COASTLINE_MIXED,
>                               True, False).sum()
>         nsnowice = numpy.where( (window >= NSIDC_SEAICE_LOW) &
>                                 (window <= NSIDC_FRESHSNOW), \
>                                 True, False).sum()
>
>         if (100.0*nsnowice/npoints >= count):
>             nsidc2[irow, icol] = MISR_SEAICE_THRESHOLD
>
>     return nsidc2
>
> Thanks in advance for any advice!
>
> Catherine
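Applied to the quoted code, that means the two window counts reduce to plain
boolean sums (same names as in the function above):

    npoints = (window != NSIDC_COASTLINE_MIXED).sum()
    nsnowice = ((window >= NSIDC_SEAICE_LOW) &
                (window <= NSIDC_FRESHSNOW)).sum()

A comparison already produces a boolean array, and summing a boolean array
counts its True entries, so the intermediate where() pass adds nothing but
overhead.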
Re: [Numpy-discussion] best way of speeding up a filtering-like algorithm
It looks like you are creating a coastline mask (or a coastline mask plus
some other mask), and computing the ratio of two quantities in a particular
window around each point.  If your coastline covers a sufficiently large
portion of the image, you may get quite a bit of mileage from using an
efficient convolution instead of summing the windows directly.  For example,
you could use scipy.signal.convolve2d, with the first input being one of the
masks (nsidc_copy != NSIDC_COASTLINE_MIXED) or
((nsidc_copy >= NSIDC_SEAICE_LOW) & (nsidc_copy <= NSIDC_FRESHSNOW)), and
the second input a (2*radius+1) x (2*radius+1) array of ones.  You may have
to center the block of ones in an array of zeros the same size as
nsidc_copy, but I am not sure about that.

Another option you may want to try is implementing your window movement more
efficiently.  If you step your window center along using an algorithm like
flood-fill, you can ensure that there will be very large overlap between
successive steps (even if there is a break in the coastline).  That means
that you can reuse most of the data you've extracted: you only need to
subtract off the non-overlapping portion of the previous window and add in
the non-overlapping portion of the updated window.  If radius is 16, giving
you a window of roughly 32x32, you go from summing ~1000 pixels per quantity
of interest to summing only ~120 if the window moves along a diagonal, and
only ~64 if it moves vertically or horizontally.  While an algorithm like
this will probably give you the greatest boost, it is a pain to implement.
Rough sketches of both ideas follow.
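Here is a rough, untested sketch of the convolution approach (constants
copied from Catherine's post; the percentage test is written multiplied out
so that a window with npoints == 0 doesn't divide by zero, which the
original loop would):

    import numpy as np
    from scipy.signal import convolve2d

    # Values from the original post.
    NSIDC_SEAICE_LOW = 1
    NSIDC_FRESHSNOW = 103
    NSIDC_COASTLINE_MIXED = 252
    MISR_SEAICE_THRESHOLD = 6

    def mixed_coastline_conv(nsidc, radius, count, mask=None):
        nsidc_copy = np.copy(nsidc)
        kernel = np.ones((2*radius + 1, 2*radius + 1))

        # Boolean masks for the two quantities counted in each window.
        valid = (nsidc != NSIDC_COASTLINE_MIXED).astype(float)
        snowice = ((nsidc >= NSIDC_SEAICE_LOW) &
                   (nsidc <= NSIDC_FRESHSNOW)).astype(float)

        # mode='same' with the default zero fill reproduces the clipped
        # edge windows of the loop: padded cells add 0 to either count,
        # exactly as clipping leaves them out.
        npoints = convolve2d(valid, kernel, mode='same')
        nsnowice = convolve2d(snowice, kernel, mode='same')

        coastline = (nsidc_copy == NSIDC_COASTLINE_MIXED)
        if mask is not None:
            coastline &= mask

        # 100.0*nsnowice/npoints >= count, multiplied out.
        exceeds = coastline & (100.0 * nsnowice >= count * npoints)
        nsidc_copy[exceeds] = MISR_SEAICE_THRESHOLD
        return nsidc_copy

And a sketch of the incremental bookkeeping for the second idea, for a
window stepping column by column along a single row (it scans every column
rather than walking the coastline flood-fill style, but the
subtract-the-leaving-slice / add-the-entering-slice step is the same; valid
and snowice are the masks from the sketch above):

    def row_window_counts(valid, snowice, irow, radius):
        # Counts over the clipped (2*radius+1)-square window centred at
        # (irow, j) for every column j.  Each step touches only the
        # column slice that leaves and the slice that enters: about
        # 2*(2*radius+1) pixels per move instead of (2*radius+1)**2.
        nrows, ncols = valid.shape
        r1, r2 = max(irow - radius, 0), min(irow + radius + 1, nrows)
        v = int(valid[r1:r2, :radius + 1].sum())
        s = int(snowice[r1:r2, :radius + 1].sum())
        counts = [(v, s)]
        for j in range(1, ncols):
            left, right = j - radius - 1, j + radius
            if left >= 0:        # slice that just left the window
                v -= int(valid[r1:r2, left].sum())
                s -= int(snowice[r1:r2, left].sum())
            if right < ncols:    # slice that just entered
                v += int(valid[r1:r2, right].sum())
                s += int(snowice[r1:r2, right].sum())
            counts.append((v, s))
        return counts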
If I had to guess, this looks like L2 processing for a multi-spectral
instrument.  If you don't mind me asking, what mission is this for?  I'm
working on space-looking detectors at the moment, but have spent many years
on the L0, L1b and L1 portions of the GOES-R ground system.

- Joe

On Wed, Mar 28, 2018 at 9:43 PM, Eric Wieser wrote:
> Well, one tip to start with:
>
>     numpy.where(some_comparison, True, False)
>
> is the same as, but slower than,
>
>     some_comparison
>
> Eric
>
> [snip - Catherine's original message, quoted in full above]
[Numpy-discussion] PR adding support for object arrays to np.isinf, np.isnan, np.isfinite
I have opened PR #10820 to add support for `dtype=object` to `np.isinf`,
`np.isnan`, and `np.isfinite`.  The PR is a fairly minor change, but I would
like to make sure that I understand at least the basics of ufuncs before I
start adding support for datetimes and timedeltas to `np.isfinite`, and
eventually to `np.histogram`.  I have left a few comments in areas I am not
sure about, and would greatly appreciate feedback, even if the PR is not
found suitable for merging.

With this PR, object arrays containing any numerical or simulated numerical
types (those implementing `__float__` or `__complex__` methods) are
processed as would be expected.

While working on the PR, I came up with two questions for the gurus:

1. Am I correct in understanding that `isinf`, `isnan` and `isfinite`
   currently cast integer inputs to float to process them?  Why are integer
   inputs not optimized to return arrays of all False, False, and True,
   respectively, for those functions?

2. Why are `isneginf` and `isposinf` not ufuncs?  Is there any reason not to
   make them ufuncs (besides the renaming of the `y` parameter to `out`,
   which technically breaks some backward compatibility)?

Regards,

- Joe
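For illustration, here is roughly what the change enables (a sketch of the
intended behaviour, not output from the PR itself; `Fraction` stands in for
any object implementing `__float__`):

    import numpy as np
    from fractions import Fraction

    a = np.array([1.0, float('nan'), float('inf'), Fraction(1, 3)],
                 dtype=object)

    np.isnan(a)     # expected: array([False,  True, False, False])
    np.isinf(a)     # expected: array([False, False,  True, False])
    np.isfinite(a)  # expected: array([ True, False, False,  True])

Currently, each of these calls raises a TypeError for object input.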