Re: [R] Why does "summary" show number of NAs as non-integer?
On 6/1/05, Earl F. Glynn <[EMAIL PROTECTED]> wrote:
> "Berton Gunter" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > summary() is an S3 generic that for your vector dispatches
> > summary.default(). The output of summary.default() has class "table" and so
> > calls print.table (print is another S3 generic). Look at the code of
> > print.table() to see how it formats the output.
>
> "Marc Schwartz" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > On Tue, 2005-05-31 at 17:14 -0500, Earl F. Glynn wrote:
> > > Why isn't the number of NA's just "2" instead of the "2.000" shown
> > > above?
> >
> > "The same number of decimal places is used throughout a vector
>
> I'm talking about how this should be designed. The current implementation
> may be to print a vector using generic logic, but why use generic logic to
> produce a wrong solution? Shouldn't correctness be more important than using
> a generic solution?
>
> There is special logic to suppress NA's when they don't exist (see below),
> so why isn't there special logic to print the count of NAs, which MUST be an
> integer, correctly when they do exist?
>
> An integer should NOT be displayed with meaningless decimal places. Why
> would this ever be desirable? The generic solution should be dropped in
> favor of a correct solution.
>
> # Why not use special logic to show the number of NA's correctly as an integer?
> > set.seed(19)
> > summary( c(NA, runif(10,1,100), NaN) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>   7.771  24.850  43.040  43.940  63.540  83.830   2.000
>
> # There is already special logic to suppress NA's
> > set.seed(19)
> > summary( runif(10,1,100) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   7.771  24.850  43.040  43.940  63.540  83.830
>
> "2.000" and "2" do not have equivalent meaning.

Try:

R> library(Hmisc)
R> describe( c(NA, runif(10,1,100), NaN) )
c(NA, runif(10, 1, 100), NaN)
       n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
      10       2      10   50.99   15.24   16.82   21.14   52.70   76.35   83.52   90.79

           13.65 17.17 18.12 30.18 46.21 59.19 65.36 80.01 81.90 98.06
Frequency      1     1     1     1     1     1     1     1     1     1
%             10    10    10    10    10    10    10    10    10    10

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Why does "summary" show number of NAs as non-integer?
"Berton Gunter" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > summary() is an S3 generic that for your vector dispatches > summary.default(). The output of summary default has class "table" and so > calls print.table (print is another S3 generic). Look at the code of > print.table() to see how it formats the output. "Marc Schwartz" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Tue, 2005-05-31 at 17:14 -0500, Earl F. Glynn wrote: > > Why isn't the number of NA's just "2" instead of the "2.000" shown above? > "The same number of decimal places is used throughout a vector I'm talking about how this should be designed. The current impementation may be to print a vector using generic logic, but why use generic logic to produce a wrong solution? Shouldn't correctness be more important than using a generic solution? There is special logic to suppress NA's when they don't exist (see below), so why isn't there special logic to print the count of NAs, which MUST be an integer, correctly when they do exist? An integer should NOT be displayed with meaningless decimal places. Why would this ever be desirable? The generic solution should be dropped in favor of a correct solution. # Why not use special logic to show the number of NA's correctly as an integer? > set.seed(19) > summary( c(NA, runif(10,1,100), NaN) ) Min. 1st Qu. MedianMean 3rd Qu.Max.NA's 7.771 24.850 43.040 43.940 63.540 83.830 2.000 # There is already special logic to suppress NA's > set.seed(19) > summary( runif(10,1,100) ) Min. 1st Qu. MedianMean 3rd Qu.Max. 7.771 24.850 43.040 43.940 63.540 83.830 "2.000" and "2" do not have equivalent meaning. efg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Why does "summary" show number of NAs as non-integer?
On Tue, 2005-05-31 at 17:14 -0500, Earl F. Glynn wrote:
> Example:
>
> > set.seed(19)
> > summary( c(NA, runif(10,1,100), NaN) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>   7.771  24.850  43.040  43.940  63.540  83.830   2.000
>
> Why isn't the number of NA's just "2" instead of the "2.000" shown above?
>
> efg

This is actually related to the thread on formatting numbers. From the Details section of ?print.default:

"The same number of decimal places is used throughout a vector. This means that digits specifies the minimum number of significant digits to be used, and that at least one entry will be printed with that minimum number."

'digits' in the above is the digits argument to print.default(). In this case, it defaults to options("digits"), which is 7.

In the above output from summary, you will note that all of the output has three digits after the decimal point. Thus:

> c(2)
[1] 2
> c(2, 3)
[1] 2 3
> c(2, 3.5)
[1] 2.0 3.5
> c(2, 3.57)
[1] 2.00 3.57
> c(2, 3.579)
[1] 2.000 3.579

Note how the output format of "2" varies depending upon how many decimal places I use in the second element.

Where you need greater control over how numeric output is formatted and aligned, use other functions such as formatC() and/or sprintf(). For example:

> sprintf("0 decimal places: %d   3 decimal places: %4.3f", 2, 3.57911)
[1] "0 decimal places: 2   3 decimal places: 3.579"

See ?sprintf and ?formatC for more information.

HTH,

Marc Schwartz
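
The common-decimals behaviour described above, and the per-element alternative, can be sketched as follows. The values are copied from the summary output quoted in this thread; the variable names are illustrative only.

```r
# Three of the values from the quoted summary output, including the NA count.
x <- c(Min = 7.771, Median = 43.04, "NA's" = 2)

# Formatting the vector as a whole gives every element the same number of
# decimal places, so the count comes out as "2.000".
together <- format(x)

# Formatting each element on its own lets the count drop its decimals.
separately <- vapply(x, format, character(1))

# formatC() gives explicit control: "d" for whole numbers, "f" for fixed decimals.
count_txt <- formatC(x[["NA's"]], format = "d")
fixed_txt <- formatC(3.57911, digits = 3, format = "f")

print(together)
print(separately)
print(c(count_txt, fixed_txt))
```

The trade-off is that elements formatted separately no longer line up on the decimal point, which is presumably why the default print method formats the vector as a unit.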
RE: [R] Why does "summary" show number of NAs as non-integer?
summary() is an S3 generic that, for your vector, dispatches to summary.default(). The output of summary.default() has class "table", so printing it calls print.table() (print is another S3 generic). Look at the code of print.table() to see how it formats the output.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Earl F. Glynn
> Sent: Tuesday, May 31, 2005 3:14 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Why does "summary" show number of NAs as non-integer?
>
> Example:
>
> > set.seed(19)
> > summary( c(NA, runif(10,1,100), NaN) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>   7.771  24.850  43.040  43.940  63.540  83.830   2.000
>
> Why isn't the number of NA's just "2" instead of the "2.000"
> shown above?
>
> efg
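
The dispatch chain described above can be inspected from the console. This is a minimal sketch; the exact class vector of the result is an assumption that depends on the R version (later versions may put an additional class in front of "table"), so the code only checks inheritance rather than an exact match.

```r
x <- c(NA, 1.5, 2.25, NaN)
s <- summary(x)

# A plain numeric vector has no dedicated summary method, so summary()
# falls back on summary.default().
print(class(x))
print(exists("summary.default"))

# The class of the result decides which print method is chosen.
print(class(s))
print(inherits(s, "table"))

# Retrieve the print method for class "table" to read how it formats values.
print_table_fun <- getS3method("print", "table")
print(is.function(print_table_fun))
```

Typing print_table_fun at the prompt shows the source of the method.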
[R] Why does "summary" show number of NAs as non-integer?
Example:

> set.seed(19)
> summary( c(NA, runif(10,1,100), NaN) )
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  7.771  24.850  43.040  43.940  63.540  83.830   2.000

Why isn't the number of NA's just "2" instead of the "2.000" shown above?

efg