[Numpy-discussion] Sorting objects with ndarrays

2010-02-28 Thread Gael Varoquaux
Hi,

I need to sort lists of objects that contain ndarrays. The reason I want
them sorted is that these lists are populated in an arbitrary order, but
their order really doesn't matter, and I am trying to make it
reproducible for debugging and hashing.

The problem is that ndarrays cannot be compared. So I have tried to
override the 'cmp' in the 'sorted' function. However, I am comparing
fairly complex objects, and I am having a hard time predicting which
member of the object will contain the array, so I am building a more and
more complex 'cmp' replacement.

Does anybody have a good idea of what a better strategy would be?

Cheers,

Gaël
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Sorting objects with ndarrays

2010-02-28 Thread Pauli Virtanen
On Sun, 2010-02-28 at 10:25 +0100, Gael Varoquaux wrote:
[clip]
> The problem is that ndarrays cannot be compared. So I have tried to
> override the 'cmp' in the 'sorted' function, however I am comparing
> fairly complex objects, and I am having a hard time predicting which
> member of the object will contain the array.

I don't understand what "predicting which member of the object" means.
Do you mean that in the array, you have classes that contain ndarrays as
their attributes, and the classes have __cmp__ implemented?

If not, can you tell why

from numpy import ndarray

def xcmp(a, b):
    a_nd = isinstance(a, ndarray)
    b_nd = isinstance(b, ndarray)

    if a_nd and b_nd:
        pass  # compare ndarrays in some way
    elif a_nd:
        return -1  # only a is an ndarray: sort ndarrays first
    elif b_nd:
        return 1   # only b is an ndarray: sort ndarrays first
    else:
        return cmp(a, b)  # ordinary compare

does not work?
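
On Python 3, where both the builtin cmp() and the cmp= argument of
sorted() are gone, a comparator of this shape can still be used through
functools.cmp_to_key. The following is only a sketch under stated
assumptions: is_array, array_key and the local cmp helper are names made
up for illustration, the duck-typed check stands in for
isinstance(obj, numpy.ndarray), and ordering two arrays by shape, dtype
and raw bytes is an arbitrary choice that nobody in this thread proposed.

```python
from functools import cmp_to_key


def is_array(obj):
    # Assumption: anything exposing shape, dtype and tobytes() is treated
    # as an ndarray, so this sketch does not need to import NumPy.
    return all(hasattr(obj, name) for name in ("shape", "dtype", "tobytes"))


def array_key(arr):
    # Arbitrary but deterministic: shape, then dtype name, then raw bytes.
    return (tuple(arr.shape), str(arr.dtype), arr.tobytes())


def cmp(a, b):
    # Python 3 stand-in for the removed builtin cmp().
    return (a > b) - (a < b)


def xcmp(a, b):
    a_nd, b_nd = is_array(a), is_array(b)
    if a_nd and b_nd:
        return cmp(array_key(a), array_key(b))
    if a_nd:
        return -1  # only a is an array: arrays sort first
    if b_nd:
        return 1   # only b is an array: arrays sort first
    return cmp(a, b)  # ordinary compare


def sort_with_arrays(items):
    # cmp_to_key adapts the two-argument comparator to sorted(key=...).
    return sorted(items, key=cmp_to_key(xcmp))
```

This only looks at the top-level objects: a list that contains arrays is
still handed to the ordinary comparison.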

Cheers,
Pauli




Re: [Numpy-discussion] Sorting objects with ndarrays

2010-02-28 Thread Gael Varoquaux
On Sun, Feb 28, 2010 at 01:05:15PM +0200, Pauli Virtanen wrote:
> On Sun, 2010-02-28 at 10:25 +0100, Gael Varoquaux wrote:
> [clip]
>> The problem is that ndarrays cannot be compared. So I have tried to
>> override the 'cmp' in the 'sorted' function, however I am comparing
>> fairly complex objects, and I am having a hard time predicting which
>> member of the object will contain the array.

> I don't understand what "predicting which member of the object" means.
> Do you mean that in the array, you have classes that contain ndarrays as
> their attributes, and the classes have __cmp__ implemented?

Well, I don't just have to compare ndarrays, but fairly arbitrary
structures (dictionaries, classes and lists), as I am dealing with
semi-structured data coming from a stack of unorganised experimental
data. Python has some logic for comparing these structures by comparing
their members, but if these are ndarrays, I am back to my original
problem.

> If not, can you tell why

> from numpy import ndarray
>
> def xcmp(a, b):
>     a_nd = isinstance(a, ndarray)
>     b_nd = isinstance(b, ndarray)
>
>     if a_nd and b_nd:
>         pass  # compare ndarrays in some way
>     elif a_nd:
>         return -1  # only a is an ndarray: sort ndarrays first
>     elif b_nd:
>         return 1   # only b is an ndarray: sort ndarrays first
>     else:
>         return cmp(a, b)  # ordinary compare

> does not work?

Because I have things like lists of ndarrays, on which this fails. If I
could say "use xcmp recursively instead of cmp for this sort", it would
work, but the only way I can think of doing this is by temporarily
monkey-patching __builtins__.cmp, which I'd like to avoid, as it is not
thread-safe.
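
One way to get that recursion without touching the builtins is to stop
comparing the objects themselves and instead map each one, recursively,
to a plain tuple that Python already knows how to compare, then pass
that mapping as key= to sorted(). A minimal Python 3 sketch, not a
definitive implementation: stable_key and _is_array are hypothetical
names, the array check is duck-typed rather than
isinstance(obj, numpy.ndarray), and the resulting order is arbitrary (it
only aims to be reproducible).

```python
def _is_array(obj):
    # Assumption: shape + dtype + tobytes() identifies an ndarray-like object.
    return all(hasattr(obj, name) for name in ("shape", "dtype", "tobytes"))


def stable_key(obj):
    """Map a nested object to a (type name, payload) tuple.

    Payloads are only ever compared when the type names are equal, so
    mixed-type containers do not raise during sorting.
    """
    tag = type(obj).__name__
    if _is_array(obj):
        return (tag, (tuple(obj.shape), str(obj.dtype), obj.tobytes()))
    if isinstance(obj, dict):
        # Sorting the (key, value) pairs removes the dependence on dict order.
        pairs = sorted((stable_key(k), stable_key(v)) for k, v in obj.items())
        return (tag, tuple(pairs))
    if isinstance(obj, (list, tuple)):
        return (tag, tuple(stable_key(x) for x in obj))
    if isinstance(obj, (set, frozenset)):
        return (tag, tuple(sorted(stable_key(x) for x in obj)))
    if isinstance(obj, (bool, int, float, str, bytes, type(None))):
        return (tag, obj)
    if hasattr(obj, "__dict__"):
        # Plain class instances: recurse into their attributes.
        return (tag, stable_key(vars(obj)))
    # Last resort; repr() may embed a memory address and then vary per run.
    return (tag, repr(obj))


def stable_sorted(items):
    return sorted(items, key=stable_key)
```

Because stable_key is a pure function with no global state, nothing
needs to be patched and the sort can run from several threads at once.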

Cheers,

Gaël


Re: [Numpy-discussion] Sorting objects with ndarrays

2010-02-28 Thread Gael Varoquaux
On Sun, Feb 28, 2010 at 03:01:18PM +0100, Friedrich Romstedt wrote:
>> Well, I don't just have to compare ndarrays, but fairly arbitrary
>> structures (dictionaries, classes and lists), as I am dealing with
>> semi-structured data coming from a stack of unorganised experimental
>> data. Python has some logic for comparing these structures by comparing
>> their members, but if these are ndarrays, I am back to my original
>> problem.

> I also do not understand how to build an order on such a thing at all,
> maybe you can give a simple example?

Well, you can't really build an order in the mathematical sense of
ordering. All I care about is that if you give me the same shuffled list
of elements twice, it comes out identical. I am fighting the fact that
dictionaries in Python have no order, and thus shuffle the data from run
to run.
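
For the hashing side of this, one common way to neutralise dictionary
order for JSON-like data is to serialise with sorted keys before
hashing. This is a sketch only: stable_digest and _encode are made-up
names, the hook for array-like objects relies on their tolist() method
and dtype attribute, and it assumes all dictionary keys are strings.

```python
import hashlib
import json


def _encode(obj):
    """Fallback used by json.dumps for objects it cannot serialise itself."""
    if hasattr(obj, "tolist") and hasattr(obj, "dtype"):
        # Array-like (e.g. numpy.ndarray): keep the dtype so that arrays
        # with equal values but different types do not collide.
        return {"__array__": obj.tolist(), "dtype": str(obj.dtype)}
    raise TypeError("cannot serialise %s" % type(obj).__name__)


def stable_digest(obj):
    """Hex digest that does not depend on dictionary ordering."""
    # sort_keys=True writes every dict, at any depth, in sorted key order.
    text = json.dumps(obj, sort_keys=True, default=_encode)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Lists keep their order in this scheme, so a list that is itself
populated in arbitrary order would still need to be sorted first.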

> Hmm, you could also temporarily replace numpy.greater and similar with
> a with statement like:

> # Everything as usual, comparing ndarrays results in ndarrays here.

> with monkeypatched_operators:
>     # Comparing ndarrays may result in scalars or what you need.
>     pass  # Perform the sorting

> # Everything as usual ...

> Though that's maybe not thread-safe either.

Yes, it's not thread-safe either.

> Then you could use ndarray.flatten().tolist() to compare them using
> usual Python semantics?

That solves the local problem of comparing two arrays (though it will be
quite slow), but not the general problem of sorting objects that contain
arrays in a reproducible order (however arbitrary).

Anyhow, I solved the problem by implementing a subclass of dict and using
it everywhere in my code. Right now it seems to be working for what I
need.
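
A dict subclass of this kind could, for instance, always iterate in
sorted key order. The following is a minimal Python 3 sketch of that
idea, assuming the keys are mutually comparable; SortedKeysDict is a
hypothetical name and this is an illustration, not the class actually
used here.

```python
class SortedKeysDict(dict):
    """dict whose iteration order is the sorted order of its keys."""

    def __iter__(self):
        # The view returned by dict.keys() iterates the underlying table
        # directly, so this does not call back into __iter__.
        return iter(sorted(super().keys()))

    def keys(self):
        return list(self)

    def values(self):
        return [self[k] for k in self]

    def items(self):
        return [(k, self[k]) for k in self]

    def __repr__(self):
        body = ", ".join("%r: %r" % (k, v) for k, v in self.items())
        return "%s({%s})" % (type(self).__name__, body)
```

Two instances filled with the same entries in a different order then
iterate, and print, identically.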

Cheers,

Gaël