[Numpy-discussion] Sorting objects with ndarrays
Hi,

I need to sort lists of objects that contain ndarrays. The reason I want them sorted is that these lists are populated in an arbitrary order, but their order really doesn't matter, and I am trying to make it reproducible for debugging and hashing.

The problem is that ndarrays cannot be compared. So I have tried to override the 'cmp' in the 'sorted' function; however, I am comparing fairly complex objects, and I am having a hard time predicting which member of the object will contain the array. So I am building a more and more complex 'cmp' replacement.

Does anybody have a good idea of what a better strategy would be?

Cheers,

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Sorting objects with ndarrays
On Sun, 2010-02-28 at 10:25 +0100, Gael Varoquaux wrote:
[clip]
> The problem is that ndarrays cannot be compared. So I have tried to
> override the 'cmp' in the 'sorted' function, however I am comparing
> fairly complex objects, and I am having a hard time predicting which
> member of the object will contain the array.

I don't understand what "predicting which member of the object" means. Do you mean that in the list, you have classes that contain ndarrays as their attributes, and the classes have __cmp__ implemented?

If not, can you tell why

    from numpy import ndarray

    def xcmp(a, b):
        a_nd = isinstance(a, ndarray)
        b_nd = isinstance(b, ndarray)
        if a_nd and b_nd:
            pass  # compare ndarrays in some way
        elif a_nd:
            return 1  # a is an ndarray, b is not: ndarrays sort last
        elif b_nd:
            return -1  # b is an ndarray, a is not: ndarrays sort last
        else:
            return cmp(a, b)  # ordinary compare

does not work?

Cheers,

Pauli
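For readers on Python 3, where the builtin `cmp` and the `cmp=` argument of `sorted` no longer exist, the comparison function above can be sketched as follows. This is only an illustration: the `cmp` stand-in and the `array_key` helper are assumptions that do not appear in the thread, and ordering arrays by shape, dtype and raw bytes is one arbitrary but deterministic choice (it is not meaningful for object-dtype arrays).

```python
from functools import cmp_to_key

import numpy as np


def cmp(a, b):
    """Python 3 stand-in for the Python 2 builtin cmp()."""
    return (a > b) - (a < b)


def array_key(arr):
    """Hypothetical helper: an arbitrary but deterministic key for an ndarray."""
    return (arr.shape, arr.dtype.str, arr.tobytes())


def xcmp(a, b):
    a_nd = isinstance(a, np.ndarray)
    b_nd = isinstance(b, np.ndarray)
    if a_nd and b_nd:
        # Both are arrays: compare their deterministic keys.
        return cmp(array_key(a), array_key(b))
    elif a_nd:
        return 1   # a is an ndarray, b is not: a sorts after b
    elif b_nd:
        return -1  # b is an ndarray, a is not: a sorts before b
    else:
        return cmp(a, b)  # ordinary compare


items = [3, np.array([2, 1]), 1, np.array([1, 2]), 2]
ordered = sorted(items, key=cmp_to_key(xcmp))
```

With this function the plain integers come first in their usual order and the arrays follow; it still only handles top-level ndarrays, not arrays nested inside other containers.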
Re: [Numpy-discussion] Sorting objects with ndarrays
On Sun, Feb 28, 2010 at 01:05:15PM +0200, Pauli Virtanen wrote:
> On Sun, 2010-02-28 at 10:25 +0100, Gael Varoquaux wrote:
> [clip]
> > The problem is that ndarrays cannot be compared. So I have tried to
> > override the 'cmp' in the 'sorted' function, however I am comparing
> > fairly complex objects, and I am having a hard time predicting which
> > member of the object will contain the array.

> I don't understand what "predicting which member of the object" means.
> Do you mean that in the list, you have classes that contain ndarrays as
> their attributes, and the classes have __cmp__ implemented?

Well, I might not have to compare ndarrays themselves, but fairly arbitrary structures (dictionaries, classes and lists), as I am dealing with semi-structured data coming from a stack of unorganised experimental data. Python has some logic for comparing these structures by comparing their members, but if the members are ndarrays, I am back to my original problem.

> If not, can you tell why
>
>     from numpy import ndarray
>
>     def xcmp(a, b):
>         a_nd = isinstance(a, ndarray)
>         b_nd = isinstance(b, ndarray)
>         if a_nd and b_nd:
>             pass  # compare ndarrays in some way
>         elif a_nd:
>             return 1  # a is an ndarray, b is not: ndarrays sort last
>         elif b_nd:
>             return -1  # b is an ndarray, a is not: ndarrays sort last
>         else:
>             return cmp(a, b)  # ordinary compare
>
> does not work?

Because I have things like lists of ndarrays, on which this fails. If I could say "recursively use xcmp instead of cmp for this sort", it would work, but the only way I can think of doing this is by temporarily monkey-patching __builtins__.cmp, which I'd like to avoid, as it is not thread safe.

Cheers,

Gaël
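One way to get the "recursive xcmp" behaviour without patching any builtin is to turn each object into a comparable key before sorting, recursing into containers explicitly. The sketch below is an illustration under assumptions, not something proposed in the thread: `sort_key` and `Record` are hypothetical names, scalars are ordered by their `repr` (so the order is reproducible but not numeric), and objects whose `repr` embeds a memory address would still vary between runs.

```python
import numpy as np


def sort_key(obj):
    """Map a nested structure to a tagged tuple that is always comparable."""
    if isinstance(obj, np.ndarray):
        return ("ndarray", obj.shape, obj.dtype.str, obj.tobytes())
    if isinstance(obj, dict):
        # Sort the (key, value) pairs so insertion order does not matter.
        pairs = sorted((sort_key(k), sort_key(v)) for k, v in obj.items())
        return ("dict", tuple(pairs))
    if isinstance(obj, (list, tuple)):
        return (type(obj).__name__, tuple(sort_key(x) for x in obj))
    if hasattr(obj, "__dict__"):
        # Plain class instances: compare by class name, then attributes.
        return ("object", type(obj).__name__, sort_key(vars(obj)))
    return ("scalar", type(obj).__name__, repr(obj))


class Record:
    """Hypothetical container standing in for a user-defined class."""

    def __init__(self, name, data):
        self.name = name
        self.data = data


items = [
    {"b": np.arange(3), "a": 1},
    [np.zeros(2), np.ones(2)],
    [np.ones(2), np.zeros(2)],
    Record("run1", np.arange(4)),
    np.arange(3),
]
ordered = sorted(items, key=sort_key)
```

Because every key starts with a type tag, two keys either differ at the tag or have the same layout underneath, so the comparison never reaches an ndarray directly. No global state is touched, which sidesteps the thread-safety concern.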
Re: [Numpy-discussion] Sorting objects with ndarrays
On Sun, Feb 28, 2010 at 03:01:18PM +0100, Friedrich Romstedt wrote:
> > Well, I might not have to compare ndarrays, but fairly arbitrary
> > structures (dictionaries, classes and lists) as I am dealing with
> > semi-structured data coming from a stack of unorganised experimental
> > data. Python has some logic for comparing these structures by comparing
> > their members, but if these are ndarrays, I am back to my original
> > problem.

> I also do not understand how to build an order on such a thing at all,
> maybe you can give a simple example?

Well, you can't really build an order in the mathematical sense of ordering. All I care about is that if you give me the same shuffled list of elements twice, it comes out identical. I am fighting the fact that dictionaries in Python have no order, and thus shuffle the data from run to run.

> Hmm, you could also replace numpy.greater and similar temporarily with a
> with statement like:
>
>     # Everything as usual, comparing ndarrays results in ndarrays here.
>     with monkeypatched_operators:
>         # Comparing ndarrays may result in scalars or what you need.
>         pass  # Perform the sorting
>     # Everything as usual ...
>
> Though that's maybe not threadsafe too.

Yes, it's not threadsafe either.

> Then you could use ndarray.flatten().tolist() to compare them using usual
> Python semantics?

That solves the local problem of comparing two arrays (though it will be quite slow), but not the general problem of sorting objects that contain arrays in a reproducible (even if arbitrary) order.

Anyhow, I solved the problem by implementing a subclass of dict and using it everywhere in my code. Right now it seems to be working for what I need.

Cheers,

Gaël
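The thread does not show what the dict subclass looked like. As a purely hypothetical illustration of the idea, a subclass can make iteration independent of insertion order by always yielding keys in sorted order; `SortedKeyDict` is an invented name, and the sketch assumes the keys are mutually comparable (for example, all strings).

```python
class SortedKeyDict(dict):
    """Hypothetical dict subclass whose iteration order is sorted by key."""

    def __iter__(self):
        # dict.keys() iterates the underlying storage directly,
        # so this does not recurse into the overridden __iter__.
        return iter(sorted(super().keys()))

    def keys(self):
        return list(self)

    def values(self):
        return [self[k] for k in self]

    def items(self):
        return [(k, self[k]) for k in self]


d = SortedKeyDict()
d["b"] = 1
d["a"] = 2
print(list(d))    # ['a', 'b']
print(d.items())  # [('a', 2), ('b', 1)]
```

Code that bypasses these methods, such as `repr(d)` or C-level consumers of the dict, may still see the insertion order, so this only helps where the data is traversed through the overridden methods.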