Here is a test I ran, and it looks fine. But I created the data myself,
so the problem may have something to do with your data:

> x <- sample(0:23, 100000, TRUE)
> a <- hist(x, breaks = 24)
> a[1:5]
$breaks
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

$counts
 [1] 8262 4114 4186 4106 4153 4234 4206 4155 4157 4203 4186 4158 4132 4139 4231 4216 4158 4054 4185 4153
[21] 4281 4110 4221

$intensities
 [1] 0.08262 0.04114 0.04186 0.04106 0.04153 0.04234 0.04206 0.04155 0.04157 0.04203 0.04186 0.04158
[13] 0.04132 0.04139 0.04231 0.04216 0.04158 0.04054 0.04185 0.04153 0.04281 0.04110 0.04221

$density
 [1] 0.08262 0.04114 0.04186 0.04106 0.04153 0.04234 0.04206 0.04155 0.04157 0.04203 0.04186 0.04158
[13] 0.04132 0.04139 0.04231 0.04216 0.04158 0.04054 0.04185 0.04153 0.04281 0.04110 0.04221

$mids
 [1]  0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5 10.5 11.5 12.5 13.5 14.5 15.5 16.5 17.5 18.5 19.5
[21] 20.5 21.5 22.5

> table(x)
x
   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20
4168 4094 4114 4186 4106 4153 4234 4206 4155 4157 4203 4186 4158 4132 4139 4231 4216 4158 4054 4185 4153
  21   22   23
4281 4110 4221
>
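
One thing worth double-checking in the output above: there are 24 breaks
(0 to 23) but only 23 counts, and the first count, 8262, equals
4168 + 4094 from table(x), so hours 0 and 1 appear to share a cell. The
sketch below is one way to check this on simulated data. The seed, the
object names (tab, h_right, h_left, h_half) and the explicit
breaks = 0:23 are my own assumptions, chosen so the cells are fixed rather
than left to hist's choice of "pretty" breaks.

```r
# Simulated hours 0..23; seed and object names are illustrative assumptions.
set.seed(1)
x <- sample(0:23, 100000, replace = TRUE)
tab <- as.vector(table(factor(x, levels = 0:23)))  # one count per hour, 0..23

# Breaks 0..23 define only 23 unit-width cells, so two hours must share one.
# Default right = TRUE: cells are (a, b], and the first cell is closed, [0, 1].
h_right <- hist(x, breaks = 0:23, plot = FALSE)
# right = FALSE: cells are [a, b), and the last cell is closed, [22, 23].
h_left <- hist(x, breaks = 0:23, right = FALSE, plot = FALSE)

# Breaks at the half-hours give one cell per integer hour.
h_half <- hist(x, breaks = seq(-0.5, 23.5, by = 1), plot = FALSE)

# Compare cell counts with the table counts.
h_right$counts[1] == tab[1] + tab[2]     # hours 0 and 1 merged
h_left$counts[23] == tab[23] + tab[24]   # hours 22 and 23 merged
all(h_half$counts == tab)                # 24 cells, one per hour

# The same arithmetic on the numbers posted in this thread:
8262 == 4168 + 4094    # TRUE
11669 == 6000 + 5669   # TRUE
```

If that holds, the merging comes from where the breaks sit rather than
from the data, and half-integer breaks (as in Sarah's last example) would
be the fix.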


On Sat, Dec 31, 2011 at 11:20 AM, Sarah Goslee <sarah.gos...@gmail.com> wrote:
> Hi,
>
> I think you're not understanding quite what's going on with hist. Reread the
> help, and take a look at this small example. The solution I'd use is the last
> item.
>
>> x <- rep(1:10, times=1:10)
>> table(x)
> x
>  1 2 3 4 5 6 7 8 9 10
>  1 2 3 4 5 6 7 8 9 10
>>
>>
>> hist(x, plot=FALSE, right=TRUE)$counts
> [1] 3 3 4 5 6 7 8 9 10
>> hist(x, plot=FALSE, right=TRUE)$breaks
>  [1] 1 2 3 4 5 6 7 8 9 10
>> hist(x, plot=FALSE, right=TRUE)$mids
> [1] 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5
>>
>>
>> hist(x, plot=FALSE, right=FALSE)$counts
> [1]  1  2  3  4  5  6  7  8 19
>> hist(x, plot=FALSE, right=FALSE)$breaks
>  [1] 1 2 3 4 5 6 7 8 9 10
>> hist(x, plot=FALSE, right=FALSE)$mids
> [1] 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5
>>
>>
>> hist(x, plot=FALSE, breaks=seq(.5, 10.5, by=1))$counts
>  [1] 1 2 3 4 5 6 7 8 9 10
>> hist(x, plot=FALSE, breaks=seq(.5, 10.5, by=1))$breaks
>  [1]  0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5 10.5
>> hist(x, plot=FALSE, breaks=seq(.5, 10.5, by=1))$mids
>  [1] 1 2 3 4 5 6 7 8 9 10
>
>
> Sarah
>
> On Sat, Dec 31, 2011 at 10:25 AM, Aren Cambre <a...@arencambre.com> wrote:
>> I have two large datasets (156K and 2.06M records). Each row has the
>> hour that an event happened, represented by an integer from 0 to 23.
>>
>> R's histogram is combining some data.
>>
>> Here's the command I ran to get the histogram:
>>> histinfo <- hist(crashes$hour, right=FALSE)
>>
>> Here's histinfo:
>>> histinfo
>> $breaks
>>  [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
>>
>> $counts
>>  [1]  4755  4618  5959  3292  2378  2715  4592  6144  6860  5598  5601  6596  7152  7490  8166
>> [16]  9758 11301 11745  9943  7494  6272  6220 11669
>>
>> $intensities
>>  [1] 0.03041876 0.02954234 0.03812101 0.02105963 0.01521258 0.01736844 0.02937602 0.03930449
>>  [9] 0.04388490 0.03581161 0.03583081 0.04219604 0.04575289 0.04791515 0.05223967 0.06242403
>> [17] 0.07229494 0.07513530 0.06360752 0.04794074 0.04012334 0.03979068 0.07464911
>>
>> $density
>>  [1] 0.03041876 0.02954234 0.03812101 0.02105963 0.01521258 0.01736844 0.02937602 0.03930449
>>  [9] 0.04388490 0.03581161 0.03583081 0.04219604 0.04575289 0.04791515 0.05223967 0.06242403
>> [17] 0.07229494 0.07513530 0.06360752 0.04794074 0.04012334 0.03979068 0.07464911
>>
>> $mids
>>  [1]  0.5  1.5  2.5  3.5  4.5  5.5  6.5  7.5  8.5  9.5 10.5 11.5 12.5 13.5 14.5 15.5 16.5 17.5
>> [19] 18.5 19.5 20.5 21.5 22.5
>>
>> $xname
>> [1] "crashes$hour"
>>
>> $equidist
>> [1] TRUE
>>
>> attr(,"class")
>> [1] "histogram"
>>
>> Note that the last value in counts is 11669. Compare it with the
>> output of table(crashes$hour):
>>     0     1     2     3     4     5     6     7     8     9    10    11    12    13    14
>>  4755  4618  5959  3292  2378  2715  4592  6144  6860  5598  5601  6596  7152  7490  8166
>>    15    16    17    18    19    20    21    22    23
>>  9758 11301 11745  9943  7494  6272  6220  6000  5669
>>
>> Notice that the counts for 22 and 23 from table(crashes$hour) sum to
>> 11669. Is it correct for the histogram to combine hours 22 and 23?
>> Since I specified right = FALSE, I figured there was no way 23 would
>> be combined with 22.
>>
>> Adding breaks=24 to the hist call makes no difference; it's still
>> stuck at 23 bins. I also tried breaks=25 and 23 and several other
>> values, in case I am misinterpreting the meaning of breaks, but none
>> of them makes a difference.
>>
>> I imagine this is a n00b question, so my apologies if this is obvious.
>>
>> Aren
>>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
