Hi, I think you're not quite understanding what hist is doing here. Reread the help (?hist), and take a look at this small example. The solution I'd use is the last item.
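
The short version, as I read ?hist: with right=FALSE the bins are closed on the left and open on the right, except that include.lowest (TRUE by default) also closes the last bin on the right. A value equal to the top break is therefore counted in the last bin. Here is a tiny sketch of that; the vector y and its breaks are made up purely for illustration:

```r
y <- c(1, 2, 2, 3, 3, 3)   # made-up data: one 1, two 2s, three 3s

# breaks 1,2,3 with right=FALSE: the bins are [1,2) and [2,3];
# include.lowest=TRUE (the default) closes the last bin, so the 3s join the 2s
closed_last <- hist(y, breaks=1:3, right=FALSE, plot=FALSE)$counts
closed_last   # 1 5

# one more break past the maximum gives the 3s their own bin: [1,2) [2,3) [3,4]
extended <- hist(y, breaks=1:4, right=FALSE, plot=FALSE)$counts
extended      # 1 2 3
```

Extending the breaks past the maximum is one way out, though it isn't the one I'd pick.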

> x <- rep(1:10, times=1:10)
> table(x)
x
 1  2  3  4  5  6  7  8  9 10
 1  2  3  4  5  6  7  8  9 10

> hist(x, plot=FALSE, right=TRUE)$counts
[1]  3  3  4  5  6  7  8  9 10
> hist(x, plot=FALSE, right=TRUE)$breaks
 [1]  1  2  3  4  5  6  7  8  9 10
> hist(x, plot=FALSE, right=TRUE)$mids
[1] 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5

> hist(x, plot=FALSE, right=FALSE)$counts
[1]  1  2  3  4  5  6  7  8 19
> hist(x, plot=FALSE, right=FALSE)$breaks
 [1]  1  2  3  4  5  6  7  8  9 10
> hist(x, plot=FALSE, right=FALSE)$mids
[1] 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5

> hist(x, plot=FALSE, breaks=seq(.5, 10.5, by=1))$counts
 [1]  1  2  3  4  5  6  7  8  9 10
> hist(x, plot=FALSE, breaks=seq(.5, 10.5, by=1))$breaks
 [1]  0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5 10.5
> hist(x, plot=FALSE, breaks=seq(.5, 10.5, by=1))$mids
 [1]  1  2  3  4  5  6  7  8  9 10

Sarah

On Sat, Dec 31, 2011 at 10:25 AM, Aren Cambre <a...@arencambre.com> wrote:
> I have two large datasets (156K and 2.06M records). Each row has the
> hour that an event happened, represented by an integer from 0 to 23.
>
> R's histogram is combining some data.
>
> Here's the command I ran to get the histogram:
>> histinfo <- hist(crashes$hour, right=FALSE)
>
> Here's histinfo:
>> histinfo
> $breaks
>  [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
>
> $counts
>  [1]  4755  4618  5959  3292  2378  2715  4592  6144  6860  5598  5601  6596  7152  7490  8166
> [16]  9758 11301 11745  9943  7494  6272  6220 11669
>
> $intensities
>  [1] 0.03041876 0.02954234 0.03812101 0.02105963 0.01521258 0.01736844 0.02937602 0.03930449
>  [9] 0.04388490 0.03581161 0.03583081 0.04219604 0.04575289 0.04791515 0.05223967 0.06242403
> [17] 0.07229494 0.07513530 0.06360752 0.04794074 0.04012334 0.03979068 0.07464911
>
> $density
>  [1] 0.03041876 0.02954234 0.03812101 0.02105963 0.01521258 0.01736844 0.02937602 0.03930449
>  [9] 0.04388490 0.03581161 0.03583081 0.04219604 0.04575289 0.04791515 0.05223967 0.06242403
> [17] 0.07229494 0.07513530 0.06360752 0.04794074 0.04012334 0.03979068 0.07464911
>
> $mids
>  [1]  0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5 10.5 11.5 12.5 13.5 14.5 15.5 16.5 17.5
> [19] 18.5 19.5 20.5 21.5 22.5
>
> $xname
> [1] "crashes$hour"
>
> $equidist
> [1] TRUE
>
> attr(,"class")
> [1] "histogram"
>
> Note how the last value in counts is 11669. It's relevant to the
> output of table(crashes$hour):
>     0     1     2     3     4     5     6     7     8     9    10    11    12    13    14
>  4755  4618  5959  3292  2378  2715  4592  6144  6860  5598  5601  6596  7152  7490  8166
>    15    16    17    18    19    20    21    22    23
>  9758 11301 11745  9943  7494  6272  6220  6000  5669
>
> Notice how the sum of 22 and 23 from table(crashes$hour) is 11669? Is
> that correct for the histogram to combine hours 22 and 23? Since I
> specified right = FALSE, I figured there's no way 23 would be combined
> with 22?
>
> Adding breaks=24 to the hist makes no difference; it's still stuck at
> 23 breaks. I also tried breaks=25 and 23 and several other values, in
> case I am misinterpreting breaks's meaning, but none of them make a
> difference.
>
> I imagine this is a n00b question, so my apologies if this is obvious.
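
On breaks=24 making no difference: as I read ?hist, a single number for breaks is only a suggestion, because the actual break points are chosen by pretty(). Giving the break points explicitly avoids that. Here is a rough sketch for hour data coded 0 to 23; I don't have your data, so hour below is simulated as a stand-in for crashes$hour:

```r
set.seed(42)
hour <- sample(0:23, 5000, replace=TRUE)   # made-up stand-in for crashes$hour

# half-integer breaks give one bin per hour, centred on the hour
h <- hist(hour, breaks=seq(-0.5, 23.5, by=1), plot=FALSE)
length(h$counts)   # 24 bins, one per hour
h$mids             # 0, 1, ..., 23

# the bin counts agree with a straight tabulation of the hours
all(h$counts == tabulate(hour + 1, nbins=24))
```

With breaks at the half hours no observation ever sits on a bin edge, so right and include.lowest stop mattering.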
>
> Aren
>

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.