On 7/11/06, Luc Maisonobe <[EMAIL PROTECTED]> wrote:
J.Pietschmann wrote :

> Well, the majority of the num math text books on my shelf actually
> recommend computing the sum of the squared errors instead of the
> algebraic equivalent form given in the more analytically oriented
> text books (and used above). This is, of course, more complicated
> and still prone to adverse numerical effects unless the sequence
> is also sorted.


Can you provide references?

You are right, but this would also imply storing all values and either
recompute everything as points are added/removed or set up a "dirty"
flag to perform lazy evaluation only when needed. This has an impact on
both memory and CPU usage.

The current implementation does not retain each point; it simply
handles them on the fly by updating a few running sums. It can handle an
extremely large number of points with a very small memory footprint.
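
To make "updating a few running sums" concrete, here is a minimal sketch
of a one-pass update for a single variable. The class name
RunningSumOfSquares and its methods are hypothetical, chosen only for
illustration; this is not the library's source, just a standard updating
formula that keeps a count, a running mean, and a running sum of squared
deviations instead of the data itself.

```java
/**
 * Hypothetical illustration only, not the library's implementation.
 * Keeps three running values and never stores the individual points.
 */
class RunningSumOfSquares {
    private long n = 0;         // number of points seen so far
    private double mean = 0.0;  // running mean of the points
    private double sumSq = 0.0; // running sum of squared deviations from the mean

    /** Folds one new point into the running values in O(1) time and memory. */
    public void add(double x) {
        double dx = x - mean;              // deviation from the OLD mean
        sumSq += dx * dx * n / (n + 1.0);  // uses the OLD count n
        mean += dx / (n + 1.0);            // shift the mean toward x
        n++;
    }

    public long getN() { return n; }

    public double getMean() { return mean; }

    /** Sum of (x_i - mean)^2 over all points added so far. */
    public double getSumSq() { return sumSq; }
}
```

Memory use stays constant however many points are added. The trade-off
under discussion is that the individual deviations are never revisited,
so the points cannot be sorted or re-summed afterwards.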

Do you think we should provide two implementations, one memory/CPU
friendly and the other accuracy-friendly?


No, unless there are compelling arguments indicating that direct
computation is in fact more accurate in many cases (contradicting the
references in the javadoc). In that case we would, as you point out,
need to maintain two versions, since we can't abandon the scalability
and performance of the current (essentially stateless) impl. See the
references to the Chan / Golub article on accumulating sums of squares
in the addData javadoc and the applied regression text (Weisberg)
cited there. See also, e.g., Neter and Wasserman, Applied Linear
Statistical Models [isbn 0256117365].

Phil
