Yoav,

Thanks for the comments. See attempt at interspersed responses below.
> Yes, it would be good to maintain acceptable html in javadoc. Yet, I'd
> like to point out that javadoc isn't java code. While we would like to
> maintain lots of it to help our users understand it, the library works
> just fine without it.

[Yoav] But if you do have it, it'd be nice if it were in a human-friendly browsing format, given that it's intended for humans and that most of them use the HTML JavaDocs ;)

[Phil] I agree. We still have a good bit of work to do here. Patches welcome :-) One thing that we do have beyond the package, class, and method javadocs is the user guide, which is nearing completion.

>> 5) Is double suitable for these calculations? Should the strictfp flag be
>> used? (I have no idea as to the answer, but I have to ask)
>
> Neither do I. Can anyone enlighten us?

[Yoav] You probably want strictfp: http://www.jguru.com/faq/view.jsp?EID=17544

[Phil] I am not sure that we want this, but I am by no means a JVM expert. From what I understand, the decision comes down to strict consistency of results on different platforms (mostly involving NaN and other boundary conditions) vs. performance. In most practical applications, I would personally be more interested in performance. It would be a major PITA (given the way things have to be declared), but I suppose that in theory we could support both. I am open to discussion on this, but my vote at this point would be to release without strictfp support for 1.0.

[Yoav] Out of curiosity, why read each url/file twice?

[Phil] Because the implementation is primitive ;-) The load method of EmpiricalDistribution needs to 1) compute basic univariate statistics for the whole file and 2) divide the range of values in the file into a predetermined number of "bins" and compute univariate statistics for the values in each bin. The simplest way to do this is to pass over the data once to do 1), then use the min and max discovered in 1) to set up the bins and compute the bin stats in a second pass.
Since the files may be large, it is not a good idea to try to load the data into memory during the first pass. A single-pass algorithm would have to either dynamically adjust the bins (and bin stats) as new extreme values are discovered, or take the extrema as arguments. I would prefer not to require the extrema to be specified in advance, and the dynamic bin adjustment would be hard to do efficiently (at least it seems hard to me -- bright ideas / patches welcome :-)

Phil
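For readers following along, here is a minimal sketch of the two-pass idea described above. All names here (TwoPassBins, BinStats, findExtrema, computeBinStats) are hypothetical illustrations, not the actual EmpiricalDistribution implementation, and it operates on an in-memory array where the real load method would stream from a file or URL:

```java
/** Hypothetical sketch of the two-pass bin-stats approach; not commons-math code. */
public class TwoPassBins {

    /** Running univariate statistics for one bin. */
    static class BinStats {
        long n;
        double sum;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;

        void add(double v) {
            n++;
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }

        double mean() {
            return n == 0 ? Double.NaN : sum / n;
        }
    }

    /**
     * Pass 1: scan the data once to discover the global extrema
     * (in the real load method this pass would also accumulate the
     * whole-file univariate statistics).
     */
    static double[] findExtrema(double[] data) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : data) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] { min, max };
    }

    /**
     * Pass 2: use the extrema from pass 1 to set up equal-width bins,
     * then scan the data again, adding each value to its bin's stats.
     */
    static BinStats[] computeBinStats(double[] data, double min, double max, int binCount) {
        BinStats[] bins = new BinStats[binCount];
        for (int i = 0; i < binCount; i++) {
            bins[i] = new BinStats();
        }
        double width = (max - min) / binCount;
        for (double v : data) {
            int i = (width == 0) ? 0 : (int) ((v - min) / width);
            if (i >= binCount) {
                i = binCount - 1; // v == max falls into the last bin
            }
            bins[i].add(v);
        }
        return bins;
    }

    public static void main(String[] args) {
        double[] data = { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 };
        double[] extrema = findExtrema(data);                            // first pass
        BinStats[] bins = computeBinStats(data, extrema[0], extrema[1], 4); // second pass
        for (int i = 0; i < bins.length; i++) {
            System.out.println("bin " + i + ": n=" + bins[i].n + " mean=" + bins[i].mean());
        }
    }
}
```

This makes the trade-off concrete: pass 2 cannot start until pass 1 has produced min and max, which is exactly why the file gets read twice unless the extrema are supplied up front or the bins are adjusted dynamically.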