This is trivial in pandas: a simple groupby.

In [5]: from pandas import DataFrame

In [6]: data = [['a', 27, 14.5], ['b', 12, 99.0], ['a', 17, 100.3], ['b', 12, -329.0]]

In [7]: df = DataFrame(data, columns=list('ABC'))

In [8]: df
Out[8]:
   A   B      C
0  a  27   14.5
1  b  12   99.0
2  a  17  100.3
3  b  12 -329.0

In [9]: df.groupby('A').first()
Out[9]:
    B     C
A
a  27  14.5
b  12  99.0

In [10]: df.groupby('A').last()
Out[10]:
    B      C
A
a  17  100.3
b  12 -329.0

On Mon, Jul 4, 2016 at 7:27 PM, Skip Montanaro <skip.montan...@gmail.com> wrote:

> > Any way that you can make your keys numeric? Then you can run np.diff on
> > that first column, and use the indices of nonzero entries (np.flatnonzero)
> > to know where values change. With a +1/-1 offset (that I am too lazy to
> > figure out right now ;) you can then index into the original rows to get
> > either the first or last occurrence of each run.
>
> I'll give it some thought, but one of the elements of the key is definitely
> a (short, < six characters) string. Hashing it probably wouldn't work, too
> great a chance for collisions.
>
> S
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
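For comparison, here is a rough sketch of the np.diff / np.flatnonzero idea from the quoted message, with the +1 offset worked out. The function name run_boundaries and the sample keys are made up for illustration, and it assumes a single 1-D key column. String keys are turned into integer codes with np.unique(..., return_inverse=True), which gives one code per distinct value, so no hashing is involved.

```python
import numpy as np

def run_boundaries(keys):
    """Return (first_idx, last_idx) for each run of equal consecutive keys.

    Hypothetical helper, not part of NumPy or pandas.
    """
    keys = np.asarray(keys)
    if keys.size == 0:
        empty = np.array([], dtype=np.intp)
        return empty, empty
    # One integer code per distinct key value (works for short strings too).
    _, codes = np.unique(keys, return_inverse=True)
    # Indices i where codes[i] != codes[i + 1]: the last row of each run
    # except the final one.
    change = np.flatnonzero(np.diff(codes))
    # A new run starts one row after each change, plus the run starting at 0.
    first = np.concatenate(([0], change + 1))
    # Each run ends at a change index, plus the run ending at the last row.
    last = np.concatenate((change, [len(codes) - 1]))
    return first, last

# Illustrative keys only; 'a' appears in two separate runs.
keys = np.array(['a', 'a', 'b', 'b', 'b', 'a'])
first, last = run_boundaries(keys)
print(keys[first], first)
print(keys[last], last)
```

Note that this finds runs of consecutive equal keys, so a key that reappears later starts a new run. groupby('A'), by contrast, collects all rows with the same key wherever they occur, so the two only agree when the data is sorted or already grouped by key.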