> Ah, I see. You definitely do not want to reassign the .data buffer in
> this case. An out= parameter does not reassign the memory location
> that the array object points to. It should use the allocated memory
> that was already there. It shouldn't "copy" anything at all;
> otherwise, "median(x, out=out)" is no better than "out[:] =
> median(x)". Personally, I don't think that a function should expose an
> out= parameter unless it can make good on that promise of memory
> efficiency.

I agree - but there are more efficient median algorithms out there
which can make use of the memory efficiently.  I wanted to establish
the call signature to allow that.  I don't feel strongly about it
though.
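
As a rough illustration of what a selection-based median behind that
call signature could look like, here is a minimal sketch. It is not
the attached implementation. The name median_select is hypothetical,
and it assumes np.partition (available in current NumPy releases) and
a floating-point out array. It still copies the input inside
np.partition; what it avoids is the full sort and a separately
allocated result array.

```python
import numpy as np

def median_select(a, axis=0, out=None):
    """Hypothetical selection-based median; a sketch, not the attached code."""
    a = np.asarray(a)
    if axis is None:
        a = a.ravel()
        axis = 0
    n = a.shape[axis]
    if n == 0:
        raise ValueError('cannot take the median of an empty axis')
    hi = n // 2
    lo = hi if n % 2 == 1 else hi - 1
    # np.partition places the elements at positions lo and hi where a full
    # sort would put them, without ordering the rest of the axis.
    part = np.partition(a, sorted({lo, hi}), axis=axis)
    low = np.take(part, lo, axis=axis)
    high = np.take(part, hi, axis=axis)
    if out is None:
        return (low + high) / 2.0
    # Route the arithmetic through the ufunc out= argument so the result
    # is written into the caller's existing buffer (assumed to be float).
    np.add(low, high, out=out, dtype=np.float64, casting='unsafe')
    np.multiply(out, 0.5, out=out)
    return out

x = np.array([[3., 1., 2.], [9., 7., 8.], [4., 6., 5.], [0., 10., 11.]])
out = np.empty(3)
res = median_select(x, axis=0, out=out)
```

Here res is the same object as out, so the caller's array is filled in
place rather than replaced.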

> Can you show us the current implementation that you have?

It is attached, comments welcome...

Matthew
import numpy as np

def median(a, axis=0, dtype=None, out=None):
    """Compute the median along the specified axis.

    Returns the median of the array elements.  The median is taken
    over the first dimension of the array by default, otherwise over
    the specified axis.

    Parameters
    ----------
    a : ndarray
        Input array.

    axis : {None, int}, optional
        Axis along which the medians are computed. The default is to
        compute the median along the first dimension.  axis=None
        returns the median of the flattened array.

    dtype : type, optional
        Type to use in returning the medians. For arrays of integer
        type the default is float32, for arrays of float types it is
        the same as the array type. Integer arrays may return float
        medians because, given the chosen axis has length N, and N is
        even, the median is given by the mean of the two central
        values (see Notes).

    out : ndarray, optional
        Alternative output array in which to place the result. It must
        have the same shape as the expected output but the type will be
        cast if necessary.

    Returns
    -------
    median : The return type varies, see above.
        A new array holding the result is returned unless out is
        specified, in which case a reference to out is returned.

    See Also
    --------
    mean

    Notes
    -----
    Given a vector V of length N, the median of V is the middle value of
    a sorted copy of V (Vs) - i.e. Vs[(N-1)/2], when N is odd. It is
    the mean of the two middle values of Vs, when N is even.
    """
    # np.sort returns a sorted copy; axis=None sorts the flattened array.
    sorted_a = np.sort(a, axis)
    if dtype is None:
        if np.issubdtype(a.dtype, np.integer):
            dtype = np.float32
        else:
            dtype = a.dtype
    if axis is None:
        axis = 0
    indexer = [slice(None)] * sorted_a.ndim
    index = sorted_a.shape[axis] // 2
    if sorted_a.shape[axis] % 2 == 1:
        # Odd length: take the single middle value.
        indexer[axis] = index
        ret = sorted_a[tuple(indexer)]
    else:
        # Even length: average the two middle values.
        indexer[axis] = slice(index-1, index+1)
        ret = np.sum(sorted_a[tuple(indexer)], axis=axis)/2.0
        if np.issubdtype(dtype, np.integer):
            ret = ret.round()
    if ret.dtype != dtype:
        ret = ret.astype(dtype)
    if out is not None:
        if not (out.shape == ret.shape and
                out.nbytes == ret.nbytes):
            raise ValueError('wrong shape for output')
        # This doesn't work - out.data = ret.data
        raise ValueError('out parameter not working yet')
    return ret
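
One possible way to complete the out= branch, sketched under the
assumption that copy semantics are acceptable as a stopgap: assign
into the memory out already owns instead of touching out.data. The
helper name store_result is hypothetical and is not part of the
attached code.

```python
import numpy as np

def store_result(ret, out=None):
    """Hypothetical helper: place ret in out without rebinding out's buffer."""
    if out is None:
        return ret
    if out.shape != np.shape(ret):
        raise ValueError('wrong shape for output')
    # Element-wise copy into out's existing memory; values are cast to
    # out.dtype if the types differ.
    out[...] = ret
    return out

ret = np.array([2.0, 6.0])                   # stands in for a computed median
out = np.empty(2, dtype=np.float32)
before = out.__array_interface__['data'][0]  # address of out's buffer
result = store_result(ret, out)
after = out.__array_interface__['data'][0]
```

The buffer address is unchanged after the call, so out still points at
its original memory. Note that this is the same copy as
"out[:] = median(x)": it makes the signature behave correctly but does
not by itself save any memory.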
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion