Oh, I didn't know there was a Debian SciPy list!

From: Alexandre Fayolle [mailto:[EMAIL PROTECTED]
On Wed, Aug 10, 2005 at 02:39:54PM -0400, Perry, Alexander (GE Infrastructure) wrote:
> > There are typographic bugs in stats.py that make two functions fail. In
> > addition, the calculation for nanstd() provides obviously wrong answers.
> > I have not done any formal checking for the correction attached below.

> I'm having a few problems understanding your patch for nanstd, which
> does not seem to give the right results either, but I'm not sure of the
> expected semantics of the nanstd function. Is it expected that
> nanstd([nan, 2., 4.]) == std([2., 4.])
> or that
> nanstd([nan, 2., 4.]) == std([0., 2., 4.])
> your code produces neither. I'd go for the first option (given the
> definition of nanmean)

I don't know; as far as I can tell, it isn't documented what upstream intends the function to return. Feel free to pick whichever you think is best, and we can see what upstream does with my bug report.

Bear in mind there is a third possible result, which is what I attempted to achieve. I don't know whether it is the correct one, of course. The third option treats the nan-removed dataset as a sample of the larger dataset, so the standard deviation of the larger dataset has to be increased beyond the standard deviation of the nan-removed dataset to account for the increase in uncertainty.

> I propose the following implementation of nanstd:

Fine by me.

> > In the same file, nnlf() method passes its own object twice to _nnlf()
> I don't see this in my code.

Line 746 of /usr/lib/python2.3/site-packages/scipy/stats/distributions.py

> > ! return self._nnlf(self, x, *args) + N*log(scale)

The "_nnlf" is being called as a method of "self", so it automatically receives that object as its first argument. Placing "self" explicitly as the first parameter means it gets passed a second time. This is sometimes useful, but not in this case; look at line 724.

> The diff you sent me are strange, some of
> the chunks seem to be reversed.

Yeah, I notice now that part of my diffs ended up reversed. Sorry.

> Do we agree that the bottom version is correct ? I.e.:
> return self._nnlf(x, *args) + N*log(scale)

Yes.

> This is the code I had in my source tree.

Really? How odd.
My lines were quoted from a computer running Testing; the file is in the binary package "python2.3-scipy" for i386, and the unmodified installed version is 0.3.2-6.

> This is the final patch that I plan to upload.

Some of that looks reversed; for example:

> - mu, mu2 = self.stats(*args,**{'moments':'mv'})
> - muhat = st.nanmean(data)
> - mu2hat = st.nanstd(data)
> + mu, mu2, g1, g2 = self.stats(*args,**{'moments':'mv'})
> + muhat = stats.nanmean(data)
> + mu2hat = stats.nanstd(data)

Maybe you already have some of my fixes in your local tree?
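
To make the first two nanstd candidates concrete, here is a minimal sketch using only the Python standard library rather than scipy. The helper names nanstd_drop and nanstd_zero are made up for illustration, and the use of the sample (n-1) standard deviation is my assumption, since the thread does not say which normalisation std() applies. The third option is left out because no formula for it has been agreed here.

```python
import math
import statistics


def nanstd_drop(values):
    """Option 1 (hypothetical helper): discard NaNs, then take the std."""
    valid = [v for v in values if not math.isnan(v)]
    # Assumption: sample standard deviation (divides by n - 1).
    return statistics.stdev(valid)


def nanstd_zero(values):
    """Option 2 (hypothetical helper): replace NaNs with 0.0, then take the std."""
    filled = [0.0 if math.isnan(v) else v for v in values]
    # Same normalisation assumption as above.
    return statistics.stdev(filled)


data = [float("nan"), 2.0, 4.0]
print("drop NaNs:   ", nanstd_drop(data))  # same as the std of [2., 4.]
print("NaNs as zero:", nanstd_zero(data))  # same as the std of [0., 2., 4.]
```

The two results differ for the same input, which is why it matters which behaviour upstream settles on.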