Thanks for the ideas. It is great to have such skilled assistance with this issue. That said, I don't think we've solved this one yet.

Looking back at where my numbers came from, I found that I had read in integers from a file, divided by 1000, then (critically) subtracted those numbers from 2. It turns out that the important part seems to be the subtraction, not the data source.

It isn't necessary to read in the data to get the effect. Here is a simple example:

write.table(c(1,2)-c(0.995,1.995), file="data.txt", row.names=F, col.names=F)

$ cat data.txt
0.005
0.00499999999999989
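
To check whether the two results really differ as stored values, or only in how they are written out, something like the following could be run. This is only a rough sketch: the variable names d and equal_within_tol are made up for illustration, and I am assuming that 17 significant digits are enough to tell any two distinct doubles apart.

```r
# Differences computed exactly as in the write.table() example above
d <- c(1, 2) - c(0.995, 1.995)

# Print with 17 significant digits to expose the stored values
sprintf("%.17g", d)
sprintf("%.17g", 0.005)

# Exact comparison with the literal 0.005 ...
d == 0.005

# ... versus comparison within all.equal()'s default tolerance
equal_within_tol <- isTRUE(all.equal(d, c(0.005, 0.005)))
equal_within_tol
```

If the exact comparison and the tolerance comparison disagree, the difference is in the numbers themselves and not only in the formatting.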

Here is another simple example that uses seq() and does not require reading in data. As you can see, the output for both commands should be the same, but there is a big difference in how the numbers are represented in the output. What causes the inconsistency within and between these two output files?

write.table(1-seq(0.995,0.840,-.005), file="data1.txt", row.names=F, col.names=F)
write.table(2-seq(1.995,1.840,-.005), file="data2.txt", row.names=F, col.names=F)

$ head -33 data[12].txt
==> data1.txt <==
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
0.05
0.055
0.0600000000000001
0.0649999999999999
0.0700000000000001
0.075
0.08
0.085
0.09
0.095
0.1
0.105
0.11
0.115
0.12
0.125
0.13
0.135
0.14
0.145
0.15
0.155
0.16

==> data2.txt <==
0.00499999999999989
0.00999999999999979
0.0149999999999999
0.0199999999999998
0.0249999999999999
0.0299999999999998
0.0349999999999999
0.0399999999999998
0.0449999999999999
0.0499999999999998
0.0549999999999999
0.0599999999999998
0.0649999999999999
0.0699999999999998
0.075
0.0799999999999998
0.085
0.0899999999999999
0.095
0.0999999999999999
0.105
0.11
0.115
0.12
0.125
0.13
0.135
0.14
0.145
0.15
0.155
0.16

Importantly, if I do this...

write.table(seq(0.005,0.160,.005), file="data.txt", row.names=F, col.names=F)

...I produce all the same values, but no number in the output file has more than three digits to the right of the decimal point.
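
For comparison, here is a rough sketch that puts the three versions side by side and tries rounding before writing. It rests on two assumptions: that I am reading ?write.table correctly when it says numeric values are converted with the internal equivalent of digits = 15, and that three decimal places is the precision I actually want. The names a, b, s, out and written are made up for illustration.

```r
# The same 32 values produced three different ways
a <- 1 - seq(0.995, 0.840, -0.005)
b <- 2 - seq(1.995, 1.840, -0.005)
s <- seq(0.005, 0.160, 0.005)

# Largest absolute discrepancy of each subtraction result from seq()
max(abs(a - s))
max(abs(b - s))

# Round to three decimal places before writing (assumed precision)
out <- tempfile(fileext = ".txt")
write.table(round(b, 3), file = out, row.names = FALSE, col.names = FALSE)

# Read the file back as text to see what was actually written
written <- readLines(out)
head(written)
```

Rounding changes the stored values themselves, so it would only be appropriate when the extra digits are known to be noise.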

Thanks again for all of the helpful comments and ideas.

Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4AAAAJ


On Sat, 15 Mar 2014, Duncan Murdoch wrote:

On 14-03-14 11:03 PM, Mike Miller wrote:
On Fri, 14 Mar 2014, Duncan Murdoch wrote:

On 14-03-14 8:59 PM, Mike Miller wrote:
What I'm using:

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

That's not current, but it's not very old...

According to some docs, options(digits) controls numerical precision in
output of write.table().  I'm using the default value for digits:

getOption("digits")
[1] 7

I have a bunch of numbers in a data frame that are only a few digits to
the right of the decimal:

That's not enough to reproduce this.  Put together a self-contained
reproducible example if you're wondering why something behaves as it
does. With just a bunch of output, you'll just get uninformed guesses.


Thanks for the tip.  Here's what I've done:

data2 <- data[c(94,120),c(18,20,21)]

Thanks, I got the data2.Rdata file. Peter was right, you don't have what you think you have in that dataframe. See below.

save(data2, file="data2.Rdata")
q("no")

$ R
load("data2.Rdata")
data2
        V18   V20      V21
94  0.008 0.008 0.000064
120 0.023 0.023 0.000529

I'll create a dataframe that looks like yours:

data3 <- data.frame(V18=c(0.008, 0.023), V20=c(0.008, 0.023),
V21=c(0.000064, 0.000529))
data3
   V18   V20      V21
1 0.008 0.008 0.000064
2 0.023 0.023 0.000529


But it's not the same:

data2-data3
             V18           V20           V21
94   6.938894e-18  6.938894e-18  1.219727e-19
120 -9.020562e-17 -9.020562e-17 -4.119968e-18

I can't tell where these errors crept in; they are likely there in your "data" object, which you didn't give us. I'd guess as Peter did that your numbers are the results of computations that introduced rounding error.

Duncan Murdoch

write.table(data2, file="data2.txt", sep="\t", row.names=F, col.names=F)

$ cat data2.txt
0.00800000000000001     0.00800000000000001     6.40000000000001e-05
0.0229999999999999      0.0229999999999999      0.000528999999999996

The data2.Rdata file is attached to this message.

I guess that is enough to reproduce this exact finding.  I don't know how
it works in general.

I don't have a newer version of R available right now.  It did the same
thing on an older version (2.15.1).

Interestingly, on a different machine with an even older version (2.12.2)
I see something a little different:

0.008   0.008   6.40000000000001e-05
0.0229999999999999      0.0229999999999999      0.000528999999999996

Best,
Mike



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
