Re: [Numpy-discussion] member1d and unique elements

Greg Novak Tue, 05 Aug 2008 10:46:43 -0700

Argh.  I could swear that yesterday I typed test cases just like the
one you provide, and it behaved correctly.  Nevertheless, it clearly
fails in spite of my memory, so attached is a version which I believe
gives the correct behavior.


Greg

On Tue, Aug 5, 2008 at 9:00 AM, Robert Cimrman <[EMAIL PROTECTED]> wrote:
> I do not have much time to investigate it in detail right now, but it
> does not work for repeated entries in ar1:
>
> In [14]: nm.setmember1d( [1,2,3,2], [1, 3] )
> Out[14]: array([ True,  True,  True, False], dtype=bool)
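
For reference, the expected answer for that call is True only at the
positions holding 1 and 3. A minimal pure-Python sketch of the intended
semantics follows; the helper name expected_membership is hypothetical and
is used here only to pin down the expected results, including the
setmember1d([2,2], [2]) case discussed in the code comments below:

```python
def expected_membership(ar1, ar2):
    """Reference semantics: True where an element of ar1 occurs in ar2."""
    values = set(ar2)
    return [x in values for x in ar1]

# The case reported above: the second 2 must not be flagged.
print(expected_membership([1, 2, 3, 2], [1, 3]))
# [True, False, True, False]

# Duplicates in ar1 that do appear in ar2 must all be flagged.
print(expected_membership([2, 2], [2]))
# [True, True]
```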

import numpy as nm

def setmember1d( ar1, ar2, handle_dupes=True):
    """Return a boolean array with the shape of ar1, containing True where the
    elements of ar1 are in ar2 and False otherwise.

    If handle_dupes is true, allow for the possibility that ar1 or ar2
    contains duplicate values.  If you are sure that each array
    contains only unique elements, you can set handle_dupes to False
    for faster execution.
    
    Use unique1d() to generate arrays with only unique elements to use as inputs
    to this function.

    :Parameters:
      - `ar1` : array
      - `ar2` : array
      - `handle_dupes` : boolean
      
    :Returns:
      - `mask` : bool array
        The values ar1[mask] are in ar2.

    :See also:
      numpy.lib.arraysetops has a number of other functions for performing set
      operations on arrays.
    """
    # We need this to be a stable sort, so always use 'mergesort' here. The
    # values from the first array should always come before the values from the
    # second array.
    ar = nm.concatenate( (ar1, ar2 ) )
    order = ar.argsort(kind='mergesort')
    sar = ar[order]        
    equal_adj = (sar[1:] == sar[:-1])
    flag = nm.concatenate( (equal_adj, [False] ) )
    
    if handle_dupes:
        # If there is duplication, then being equal to your next
        # higher neighbor in the sorted array is not sufficient
        # to establish that your value exists in ar2 -- the neighbor
        # may have come from ar1.  A complication is that this is
        # transitive: setmember1d([2,2], [2]) must recognize _both_
        # 2's in ar1 as appearing in ar2, so neither is it sufficient
        # to test if you're equal to your neighbor and your neighbor
        # came from ar2.  Initially mask is 0 for values from ar1 and
        # 1 for values from ar2.  If an entry is equal to the next
        # higher neighbor and mask is 1 for the higher neighbor, then
        # mask is set to 1 for the lower neighbor also.  At the end,
        # mask is 1 if the value of the entry appears in ar2.
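        # Use a boolean mask so the propagation below works for any
        # input dtype: False for entries from ar1, True for entries
        # from ar2.
        mask = nm.concatenate( (nm.zeros( len( ar1 ), dtype=bool ),
                                nm.ones( len( ar2 ), dtype=bool )) )

        smask = mask[order]
        # Start from a mask that differs from smask so the loop runs
        # at least once, then iterate until nothing changes.
        prev_smask = ~smask
        while not (prev_smask == smask).all():
            prev_smask[:] = smask
            smask[nm.where(equal_adj & smask[1:])[0]] = True
        flag &= smask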
        
    indx = order.argsort(kind='mergesort')[:len( ar1 )]
    return flag[indx]
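
The loop above propagates the "came from ar2" mark one position per pass.
Because the sort is stable, any ar2 entries sit at the end of each run of
equal values, so the last entry of a run already tells whether the run
contains a value from ar2. The following is only a sketch of that
loop-free variant, not a replacement for the attached function; the name
setmember1d_runs and the early return for an empty ar1 are assumptions
made for this illustration:

```python
import numpy as nm

def setmember1d_runs(ar1, ar2):
    """Sketch: membership test that tolerates duplicates, without a loop."""
    ar1 = nm.asarray(ar1)
    ar2 = nm.asarray(ar2)
    if len(ar1) == 0:
        return nm.zeros(0, dtype=bool)

    ar = nm.concatenate((ar1, ar2))
    # False for entries from ar1, True for entries from ar2.
    from_ar2 = nm.concatenate((nm.zeros(len(ar1), dtype=bool),
                               nm.ones(len(ar2), dtype=bool)))

    # Stable sort: among equal values, ar1 entries precede ar2 entries.
    order = ar.argsort(kind='mergesort')
    sar = ar[order]
    smask = from_ar2[order]

    # True at the last position of every run of equal values.
    is_run_end = nm.concatenate((sar[1:] != sar[:-1], [True]))
    # Index of the run that each sorted position belongs to.
    run_id = nm.concatenate(([0], nm.cumsum(is_run_end[:-1])))
    # A run contains an ar2 value exactly when its last entry is from ar2.
    run_has_ar2 = smask[is_run_end]

    # Scatter the per-run answer back to the original positions.
    result = nm.empty(len(ar), dtype=bool)
    result[order] = run_has_ar2[run_id]
    return result[:len(ar1)]

print(setmember1d_runs([1, 2, 3, 2], [1, 3]))
print(setmember1d_runs([2, 2], [2]))
```

The first call should flag only the 1 and the 3, and the second should
flag both 2's.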

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
