On Thu, Mar 5, 2009 at 12:29 PM, Jaroslav Hajek <[email protected]> wrote:
> On Thu, Mar 5, 2009 at 12:02 PM, Alois Schlögl <[email protected]> wrote:
>> Jaroslav Hajek wrote:
>>>> sumskipnan counts also the number of non-NaNs.
>>>> [s,c]=sumskipnan(...)
>>>>
>>>> computing both s and c in a single step is beneficial for estimating
>>>> mean, variance and other statistics.
>>>>
>>>
>>> well, you can do
>>>
>>> nans = isnan (x);
>>> x(nans) = 0;
>>> s = sum (x, dim);
>>> c = size (x, dim) - sum (nans);
>>>
>>> Not exactly as fast as doing it all in a single loop, but simplistic.
>>
>> I guess, you meant
>> c = size (x, dim) - sum (nans,dim);
>>
>> In terms of simplicity,
>> [s,c]=sumskipnan(x,dim);
>> will win.
>>
>
> Depends on what you count in. I wrote the first off the top of my head,
> whereas for the second I'd need to look up the syntax. But I don't
> have any fundamental objections against the existence of sumskipnan,
> of course.
>
>>>
>>>>> Besides, I think the fact that the NaN package shadows Octave's
>>>>> built-in functions is very dangerous and confusing, even though I
>>>>> understand the motivation. I think this package should not be
>>>>> installed by default.
>>>>
>>>> Where do you see a danger? Please explain.
>>>>
>>>
>>> It seems that sometimes users (especially windows users) get this
>>> package unknowingly loaded. Not that this is your fault, just that it
>>> probably shouldn't be on by default in distributions.
>>>
>>> The more painful issue is that it makes the package less attractive to
>>> use - for instance, if I want to use the nanmean function to get
>>> nan-free mean, but I *don't* want the built-in mean to be shadowed
>>> (because the replacement is slower).
>>
>> Therefore, it would be nice to have a pre-compiled sumskipnan that
>> limits the performance hit. And there is certainly room for further
>> improvement.
>
> I don't want to limit it. I just don't want it to be there. I would
> like to be able to use *both* nanmean and the default mean at the same
> time.
>
>>>
>>> OTOH, I admit sometimes it may be good to be able to just substitute
>>> the default stats by nan-free ones.
>>>
>>> I think it would be better to split the package in two, say, "nan" and
>>> "nan-shadow" that would separate the two uses, because right now I
>>> need to manually edit "path" after the package is loaded if I don't
>>> want the default funcs to be shadowed.
>>
>>
>> I do not know how this should work. We have already two competing
>> stats-packages, the default one and the NaN-toolbox. A third option
>> would just increase the confusion. Personally, I'd prefer merging the
>> advantages of both approaches in a single solution.
>>
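
For reference, the isnan-based approach quoted above (with the dim correction applied) can be sketched as a small function. This is only an illustration under assumptions: the name sum_count_nonnan is hypothetical and belongs to neither the NaN package nor the statistics package, and it is not how sumskipnan itself is implemented.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical helper (illustration only, not from any package):
% returns the sum of the non-NaN entries of x along dimension dim (s)
% and the number of non-NaN entries along that dimension (c).
function [s, c] = sum_count_nonnan (x, dim)
  nans = isnan (x);
  x(nans) = 0;                          % NaNs contribute nothing to the sum
  s = sum (x, dim);
  c = size (x, dim) - sum (nans, dim);  % count of non-NaN entries along dim
endfunction

x = [1 NaN 3; 4 5 NaN];
[s, c] = sum_count_nonnan (x, 2);
m = s ./ c;   % NaN-free mean along dim 2
```

A slice that contains only NaNs gives s = 0 and c = 0, so the mean computed as s ./ c comes out as NaN for that slice.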

I am now thinking about porting sumskipnan into the statistics package
and reimplementing nansum and the related functions on top of it.
What do you think?

regards

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

_______________________________________________
Octave-dev mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/octave-dev
