Anne Archibald wrote:
> I, on the other hand, was making specifically that suggestion: users
> should not use nans to indicate missing values. Users should use
> masked arrays to indicate missing values.

I agree it is the nicest solution in theory, but I think it is impractical (as mentioned by Eric Firing in his email).

> This part I pretty much agree with.

I can't really see which one is better (failing or returning NaN for sort/min/max and their arg* counterparts), or whether the choice should be left to the user. I am fine with both, and they both require the same amount of work.

> Or we can make them behave drastically differently.
> Masked arrays clearly need to be able to handle masked values flexibly
> and explicitly. So I think nans should be handled simply and
> conservatively: propagate them if possible, raise if not.

I agree about this behavior being the default. I just think that for a couple of functions, we could provide either separate functions or additional arguments to existing functions to ignore NaNs. I am thinking about min/max/sort and their arg* counterparts, because those are really basic, and because we already have nanmean/nanstd/nanmedian (e.g. having a nansort would help nanmean to be much faster).

> If users are concerned about performance, it's worth noting that on
> some machines nans force a fallback to software floating-point
> handling, with a corresponding very large performance hit.

I was more concerned with the cost of checking for NaN explicitly (in everything involving comparison) even when the array contains no NaN. But I don't see any obvious way to avoid that cost.

David
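
To make the options above concrete, here is a minimal sketch of the three behaviours under discussion: propagating NaN by default, ignoring NaN on explicit request, and masking. It assumes a current NumPy release, where np.min propagates NaN and np.nanmin/np.nanargmax skip it. The nansort helper is hypothetical (it is not a NumPy function) and is only one possible reading of the idea.

```python
import numpy as np

a = np.array([3.0, np.nan, 1.0, 2.0])

# Default, conservative behaviour: the NaN propagates through the reduction.
propagated_min = np.min(a)

# Explicit opt-in: the nan* variants ignore NaN entries.
ignored_min = np.nanmin(a)
ignored_argmax = np.nanargmax(a)


def nansort(values):
    """Hypothetical helper (not part of NumPy): sort a 1-D array,
    dropping NaN entries first."""
    values = np.asarray(values, dtype=float)
    return np.sort(values[~np.isnan(values)])


sorted_valid = nansort(a)

# Masked-array alternative: mask the invalid entries, then reduce.
masked_min = np.ma.masked_invalid(a).min()

print(propagated_min, ignored_min, ignored_argmax, sorted_valid, masked_min)
```

Note that every nan* call and the helper have to scan for NaN, which is the cost mentioned above for arrays that turn out to contain none.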