On Fri, Mar 6, 2009 at 1:50 PM, Alois Schlögl <[email protected]> wrote:
> Fair example. This example requires some explicit handling of NaNs.
> Let's look at the case that raises an error:
>
>   c = mean(x);
>   if any(isnan(c))
>     error();
>   end;
>
> With the skippingNaN-mean() you do
>
>   if any(isnan(x(:)))
>     error();
>   end
>   c = mean(x);
>
> In both cases you need to do something about the NaNs, e.g. some error
> handling. Except for the performance issue, there is no disadvantage in
> using the nanskipping-mean().

No, I just want to leave them there.

> And one could also imagine addressing the performance issue by a change
> in the interface (e.g. by raising a global flag)
>
>   [c,N] = mean(x);
>   if flag_nans_occured(),
>     error();
>   end
>
> Actually, flag_nans_occured() is now supported.

So you don't want to remember nanmean/mean but it's OK to remember
checking for flag_nans_occured()?

> ---
>
> You might consider it an advantage that you can do the error checking
> much later, e.g.
>
>   c = mean(x);
>   d = do_some_more(c);
>   if any(isnan(d))
>     error();
>   end;
>
> However, this makes reading the code and finding the error more
> difficult, because one cannot easily see which step is causing the NaN.

If I'm just writing a function that calculates, say, the centroid and
distance vectors to vertices, I just want to return NaNs, not gripe
inside the function. This is how most of Octave's functions work. That
lets the user choose the most suitable error handling. Maybe this
particular invalid result won't actually be used in the computation -
that's the whole point of NaNs, they're just more flexible than runtime
errors.

>>> I'm asking because in 15+ years of using Matlab and Octave, I've never
>>> found such a case. Maybe I can learn something new.
>>
>> See above.
>>
>>> Even in case NaN propagation is desired, I guess I'd prefer to have an
>>> explicit check for NaNs in order to emphasize that special case and
>>> make the code more readable. Again, I've never come across a case where
>>> I needed the mean to propagate NaNs.
>>
>> Same thing - you're just used to skipping NaNs in mean, others may not be.
>
> Yes, currently we have two different approaches. That's good for
> comparing both approaches.
>
> I understand also that there is resistance to changes - that's just the
> way it is, and it's good because it provides a rather stable system.
> However, this resistance should not stop one from adopting new/better
> approaches once the advantages of the new approach become clear.

Octave is not a statistical package. I consider Octave's "mean" to be a
general mathematical function, rather than a statistical one. That's why
it prefers the straightforward calculation, where NaNs are just NaNs:
they indicate an invalid numeric value. It's also documented that the
calculation works that way.

>>>> But the different NaN treatment is not actually that bad, I doubt
>>>> anyone would notice (the performance hit may be noticeable, but it is
>>>> also unlikely).
>>> I'm aware that the performance hit might be a disadvantage in using the
>>> NaN-toolbox (although the benchmark tests have not been widely applied).
>>> I guess it's the major obstacle to wider adoption.
>>
>> I can't judge that. Maybe most people are fine with it. In any case,
>> I'm certainly free to not use the package if I don't like what it
>> does. Besides, the functionality I was asking for (i.e. nanmean
>> without shadowed mean) is provided by another package, so I just have
>> no problem.
>>
>>> On the other hand, you gain in terms of programming effort:
>>> (i) software is more often doing the right thing,
>>
>> depends...
>>
>>> (ii) it's less likely to fail due to NaN-related issues.
>>
>> depends...
>>
>>> (iii) it's more likely that users unaware of the NaN issue get it right
>>> in the first place,
>>
>> and stay unaware... (if it's right, of course)
>>
>>> (iv) no need to think about whether nanmean or mean is the right function;
>>> (v) of course always using nanmean() would also do, but it's nicer to
>>> write only mean();
>>
>> I strongly prefer to have different syntax for functions doing different
>> things.
>>
>>> In my experience, these advantages outweigh the small performance
>>> penalty. These are also the reasons why it was developed.
>>> Except for compatibility tests, I've never found a need to turn off
>>> the NaN-toolbox.
>>
>> Good for you :)
>>
>> cheers
>
> Cheers,
> Alois

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
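
To make the two behaviours discussed in this thread concrete, here is a minimal Octave sketch. It assumes the stock Octave mean() (no package shadowing it) and a version of Octave with automatic broadcasting. The names mean_skipnan and centroid_dist are hypothetical helpers made up for illustration; they are not part of Octave or of the NaN toolbox.

```octave
1;  % mark this file as a script so the functions below can be defined inline

% Hypothetical helper: column-wise mean that ignores NaN entries.
% A column consisting only of NaNs yields 0/0 = NaN.
function m = mean_skipnan (x)
  valid = ! isnan (x);
  x(! valid) = 0;                     % NaNs contribute nothing to the sum
  m = sum (x, 1) ./ sum (valid, 1);   % divide by the count of valid entries
end

% Hypothetical example: centroid of the vertices (one vertex per row)
% and the distance of each vertex to it. NaNs are passed through to the
% caller rather than reported inside the function.
function [c, d] = centroid_dist (v)
  c = mean (v, 1);                    % propagating mean (assumed stock Octave)
  d = sqrt (sum ((v - c) .^ 2, 2));   % relies on broadcasting of v - c
end

v = [0 0; 2 0; NaN 3];                % third vertex has an invalid coordinate

[c, d] = centroid_dist (v);           % c(1) and every entry of d are NaN
c_skip = mean_skipnan (v);            % NaN is skipped in the first column
```

With the propagating mean, the caller sees the NaNs in c and d and can decide what to do with them, or ignore them if that result is never used. With the skipping variant the invalid vertex silently drops out of the first coordinate, which is the difference in behaviour being argued about above.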
