Francesc Altet <[EMAIL PROTECTED]> writes:

> On Tuesday 21 August 2007, Mark.Miller wrote:
>> Is there a good loopless way to identify all of the unique rows in an
>> array? Something like numpy.unique() is ideal, but capable of
>> extracting unique subarrays along an axis.
>
> You can always view the rows as strings and then use unique().

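For reference, the view-as-strings suggestion quoted above might look roughly like the sketch below. It assumes a 2-D array with at least one column; the function name unique_rows_by_view is made up for illustration, and rows are compared byte for byte, so for floats 0.0 and -0.0 would count as different.

```python
import numpy as np

def unique_rows_by_view(a):
    """Return the distinct rows of a 2-D array, in order of first appearance.

    Hypothetical helper: each row is reinterpreted as one fixed-width byte
    string, so numpy.unique() can treat a whole row as a single item.
    """
    a = np.ascontiguousarray(a)  # a row must occupy one contiguous run of bytes
    if a.ndim != 2:
        raise ValueError("expected a 2-D array")
    nbytes = a.dtype.itemsize * a.shape[1]
    # One byte-string item per row; the view has shape (n, 1), so flatten it.
    keys = a.view(np.dtype(f"S{nbytes}")).ravel()
    # return_index gives the position of the first occurrence of each key.
    _, first = np.unique(keys, return_index=True)
    return a[np.sort(first)]

a = np.array([[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]])
print(unique_rows_by_view(a))
```

NumPy 1.13 and later also accept np.unique(a, axis=0), which covers the same use case directly.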
For large arrays it probably makes sense to hash the rows by taking a
dot product with a random vector. Then sort the hash values and identify
blocks of equal values (allowing for rounding errors). Rows whose hash
values differ by more than the rounding error are guaranteed to be
different; within a block of rows with the same hash value you still
have to compare the rows themselves, but this will probably be much less
work than comparing every row against every other, and (I hope) BLAS
makes the dot-product phase go fast.

-- 
Jouni K. Seppänen
http://www.iki.fi/jks

_______________________________________________
Numpy-discussion mailing list
[email protected]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
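A rough sketch of the random-projection hashing idea described above, in case it helps. Everything here is an assumption rather than tested production code: the function name unique_rows_by_hash, the rtol and seed parameters, and the way the rounding tolerance is chosen are all made up for illustration, and the input is assumed to hold finite numbers. Only the blocks of near-equal hash values are checked row by row, and that check is a plain Python loop.

```python
import numpy as np

def unique_rows_by_hash(a, rtol=1e-9, seed=0):
    """Return the distinct rows of a 2-D array, in order of first appearance.

    Hypothetical helper: rows are hashed with a random dot product, sorted by
    hash, and compared exactly only inside blocks of near-equal hash values.
    """
    a = np.asarray(a, dtype=float)
    if a.shape[0] == 0:
        return a
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal(a.shape[1])
    hashes = a @ weights  # one matrix-vector product for all rows

    order = np.argsort(hashes, kind="stable")
    sorted_hashes = hashes[order]

    # Assumed tolerance: a small fraction of the largest possible magnitude
    # of a dot product, so identical rows always fall into the same block.
    tol = rtol * (np.abs(a) @ np.abs(weights)).max()
    # A new block starts wherever consecutive sorted hashes differ by more
    # than the tolerance.
    breaks = np.flatnonzero(np.diff(sorted_hashes) > tol) + 1

    keep = []
    for block in np.split(order, breaks):
        reps = []  # indices of distinct rows found so far in this block
        for i in block:
            if not any(np.array_equal(a[i], a[j]) for j in reps):
                reps.append(i)
        keep.extend(reps)
    return a[np.sort(keep)]

a = np.array([[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]])
print(unique_rows_by_hash(a))
```

The result comes back as floats because of the conversion at the top; indexing the original array with the kept indices instead would preserve the input dtype.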
