Sturla Molden wrote: > We recently has a discussion regarding an optimization of NumPy's median > to average O(n) complexity. After some searching, I found out there is a > selection algorithm competitive in speed with Hoare's quick select. It > has the advantage of being a lot simpler to implement. In plain Python: > > import numpy as np > > def wirthselect(array, k): > > """ Niklaus Wirth's selection algortithm """ > > a = np.ascontiguousarray(array) > if (a is array): a = a.copy() > > l = 0 > m = a.shape[0] - 1 > while l < m: > x = a[k] > i = l > j = m > while 1: > while a[i] < x: i += 1 > while x < a[j]: j -= 1 > if i <= j: > tmp = a[i] > a[i] = a[j] > a[j] = tmp > i += 1 > j -= 1 > if i > j: break > if j < k: l = i > if k < i: m = j > > return a > > > Now, the median can be obtained in average O(n) time as: > > > def median(x): > > """ median in average O(n) time """ > > n = x.shape[0] > k = n >> 1 > s = wirthselect(x, k) > if n & 1: > return s[k] > else: > return 0.5*(s[k]+s[:k].max()) > > > The beauty of this is that Wirth select is extremely easy to migrate to > Cython: > > > import numpy > ctypedef numpy.double_t T # or whatever > > def wirthselect(numpy.ndarray[T, ndim=1] array, int k): > > cdef int i, j, l, m > cdef T x, tmp > cdef T *a > > _array = np.ascontiguousarray(array) > if (_array is array): _array = _array.copy() > a = <T *> _array.data > > l = 0 > m = <int> a.shape[0] - 1
Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the right type to use in this case? -- Dag Sverre _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion