Hello,


At 19:28 on 17/08/21, Bert Gunter wrote:
Inline below.



On Tue, Aug 17, 2021 at 4:09 AM Rui Barradas <ruipbarra...@sapo.pt> wrote:

Hello,

I had forgotten about plot.histogram, it does make everything simpler.
To have percentages on the bars, in the code below I use package scales.

Note that it seems to me that you do not want densities. To get
percentages, the proportions of counts are given by either of

Under the default of equal width bins -- which is what Sturges gives

Right.

if I read the docs correctly -- since the densities sum to 1,

The "densities" do not sum to 1. From ?hist, section Value:

density 
values f^(x[i]), as estimated density values. If all(diff(breaks) == 1), they are the relative frequencies counts/n and in general satisfy
sum[i; f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] = breaks[i].


If all(diff(breaks) == 1) is FALSE, the density list member must be multiplied by diff(h$breaks) to get the proportions:


h <- hist(datasetregs$Amount, plot = FALSE)
sum(h$density)
#[1] 1e-04
diff(h$breaks)
#[1] 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
sum(h$density*diff(h$breaks))
#[1] 1
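The same identity holds when the bins have unequal widths. A minimal sketch, assuming made-up data (x) and arbitrary break points that are not from this thread:

```r
# Hypothetical data and breaks, for illustration only
set.seed(1)
x <- rexp(200, rate = 1/20000)
brks <- c(0, 5000, 15000, 40000, max(x, 40000) + 1)  # unequal bin widths

h <- hist(x, breaks = brks, plot = FALSE)

# Two ways to get the proportion of observations per bin
prop_from_counts  <- h$counts / sum(h$counts)
prop_from_density <- h$density * diff(h$breaks)

all.equal(prop_from_counts, prop_from_density)
sum(prop_from_density)
```

With unequal widths, sum(h$density) on its own is not a meaningful total; only the density-times-width products add up to 1.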


Hope this helps,

Rui Barradas

they are
already the proportion of counts in each histogram bin, no?

-- Bert



h$counts/sum(h$counts)
h$density*diff(h$breaks)



# One histogram for all dates
h <- hist(datasetregs$Amount, plot = FALSE)
plot(h, labels = scales::percent(h$counts/sum(h$counts)),
       ylim = c(0, 1.1*max(h$counts)))



# Histograms by date
sp <- split(datasetregs, datasetregs$Date)
old_par <- par(mfrow = c(1, 3))
h_list <- lapply(seq_along(sp), function(i){
    hist_title <- paste("Histogram of", names(sp)[i])
    h <- hist(sp[[i]]$Amount, plot = FALSE)
    plot(h, main = hist_title, xlab = "Amount",
         labels = scales::percent(h$counts/sum(h$counts)),
         ylim = c(0, 1.1*max(h$counts)))
})
par(old_par)
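If installing scales is not an option, the percent labels can also be built with base R. This is only a sketch: pct_labels() is a hypothetical helper name, and amounts is made-up data rather than the data set from this thread.

```r
# pct_labels() is a hypothetical helper, not part of any package.
# It turns bin counts into percent strings and blanks out empty bins.
pct_labels <- function(counts, digits = 1) {
  pct <- 100 * counts / sum(counts)
  fmt <- paste0("%.", digits, "f%%")   # e.g. "%.1f%%"
  out <- sprintf(fmt, pct)
  out[counts == 0] <- ""               # no label on empty bins
  out
}

# Made-up amounts, for illustration only
set.seed(2)
amounts <- round(runif(60, 15000, 60000))

h <- hist(amounts, plot = FALSE)
plot(h, labels = pct_labels(h$counts),
     ylim = c(0, 1.1 * max(h$counts)))
```

Blanking the empty bins mirrors the earlier lbls idea of not labelling counts of zero.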


Hope this helps,

Rui Barradas

At 01:49 on 17/08/21, Bert Gunter wrote:
I may well misunderstand, but proffered solutions seem more complicated
than necessary.
Note that the return of hist() can be saved as a list of class "histogram"
and then plotted with  plot.histogram(), which already has a "labels"
argument that seems to be what you want. A simple example is:

dat <- runif(50, 0, 10)
myhist <- hist(dat, freq = TRUE, breaks ="Sturges")

plot(myhist, col = "darkgray",
       labels = as.character(round(myhist$density*100,1) ),
       ylim = c(0, 1.1*max(myhist$counts)))
## note that this is plot.histogram because myhist has class "histogram"

Note that I expanded the y axis a bit to be sure to include the labels. You
can, of course, plot your separate years as Rui has indicated or via e.g.
?layout.

Apologies if I have misunderstood. Just ignore this in that case.
Otherwise, I leave it to you to fill in details.
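As a rough sketch of the ?layout route mentioned above, assuming a made-up data frame dat with a grouping column grp and a numeric column val (both names are hypothetical):

```r
# Hypothetical data: three groups of 30 values each
set.seed(42)
dat <- data.frame(grp = rep(c("A", "B", "C"), each = 30),
                  val = runif(90, 0, 10))

# One row of three plotting regions
layout(matrix(1:3, nrow = 1))

# Compute the histograms first, then plot each with count labels
h_list <- lapply(split(dat$val, dat$grp), function(v) hist(v, plot = FALSE))
for (g in names(h_list)) {
  h <- h_list[[g]]
  plot(h, main = paste("Histogram of", g), xlab = "val",
       labels = TRUE, ylim = c(0, 1.1 * max(h$counts)))
}

layout(1)  # back to a single plotting region
```

layout() also accepts widths and heights, which par(mfrow = ...) does not, should the panels need different sizes.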

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Aug 16, 2021 at 4:14 PM Paul Bernal <paulberna...@gmail.com> wrote:

Dear Jim,

Thank you so much for your kind reply. Yes, this is what I am looking for;
however, I can't see clearly how the bars correspond to the bins on the
x-axis. Maybe there is a way to align the amounts so that they match the
columns. Sorry if I sound picky, but I just want to learn if there is a way
to accomplish this.

Best regards,

Paul

On Mon, 16 Aug 2021 at 17:57, Jim Lemon (<drjimle...@gmail.com>) wrote:

Hi Paul,
I just worked out your first request:

datasetregs<-structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class =
"factor"),
      Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
      15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
      15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
      15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
      15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
      16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
      15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
      15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
      15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class =
"data.frame")
# groups= and scale= are arguments of Hist(), not of graphics::hist(), so they are dropped
histval<-with(datasetregs, hist(Amount, breaks="Sturges", col="darkgray"))
library(plotrix)
histpcts<-paste0(round(100*histval$counts/sum(histval$counts),1),"%")
barlabels(histval$mids,histval$counts,histpcts)

I think that's what you asked for:

Jim

On Tue, Aug 17, 2021 at 8:44 AM Paul Bernal <paulberna...@gmail.com>
wrote:

This is way better. Now, how could I put the frequency labels in the
columns as percentages, instead of presenting them as counts?

Thank you so much.

Paul

On Mon, 16 Aug 2021 at 17:33, Rui Barradas (<ruipbarra...@sapo.pt>) wrote:

Hello,

You forgot to cc the list.

Here are two ways; both of them apply hist() and text() to Amount split
by Date. The return value of hist() is saved because it is a list whose
members include the bars' midpoints and the counts. Those are used to
know where to put the text labels.
A vector lbls is created so that counts of zero are not labelled.

The main difference between the two ways is the histograms' titles.


old_par <- par(mfrow = c(1, 3))
h_list <- with(datasetregs, tapply(Amount, Date, function(x){
     h <- hist(x)
     lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
     text(h$mids, h$counts/2, labels = lbls)
}))
par(old_par)



old_par <- par(mfrow = c(1, 3))
sp <- split(datasetregs, datasetregs$Date)
h_list <- lapply(seq_along(sp), function(i){
     hist_title <- paste("Histogram of", names(sp)[i])
     h <- hist(sp[[i]]$Amount, main = hist_title)
     lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
     text(h$mids, h$counts/2, labels = lbls)
})
par(old_par)


Hope this helps,

Rui Barradas

At 23:16 on 16/08/21, Paul Bernal wrote:
Dear Rui,

The hist() function comes from the graphics package, from what I could
see. The thing is that I want to divide the Amount column into several
bins and then generate three different histograms, one for each AF
period (AF refers to fiscal years). As you can see, the data contains
three fiscal years (2017, 2020 and 2021). I want to see the percentage
of cases that fall into different amount categories, from 15,000 and
below, 16,000 to 17,000, from 18,000 to 19,000, and so on.
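One way to tabulate such categories without plotting is cut() followed by prop.table(). The sketch below uses a few made-up rows and hypothetical break points; the real categories would need to be filled in.

```r
# Made-up rows, for illustration only
amount <- c(15000, 15000, 16600, 20100, 35000, 101100)
year   <- factor(c("AF 2020", "AF 2020", "AF 2021",
                   "AF 2020", "AF 2017", "AF 2017"))

# Hypothetical category limits; intervals are closed on the right,
# so 15000 falls into the "15,000 and below" bin
brks <- c(0, 15000, 17000, 19000, 25000, 50000, Inf)
bins <- cut(amount, breaks = brks, dig.lab = 6)

# Counts per fiscal year and bin, then row-wise percentages
tab <- table(year, bins)
pct <- round(100 * prop.table(tab, margin = 1), 1)
pct
```

The same brks vector (with a finite upper limit instead of Inf) could be passed as the breaks argument of hist() to draw the matching histograms.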

Thanks for your kind help.

Paul

On Mon, 16 Aug 2021 at 17:07, Rui Barradas (<ruipbarra...@sapo.pt>) wrote:

      Hello,

      The function Hist comes from what package?

      Are you sure you don't want a bar plot?


      agg <- aggregate(Amount ~ Date, datasetregs, sum)
      bp <- barplot(Amount ~ Date, agg)
      with(agg, text(bp, Amount/2, labels = Amount))


      Hope this helps,

      Rui Barradas

      At 22:54 on 16/08/21, Paul Bernal wrote:
       > Hello everyone,
       >
       > I am currently working with R version 4.1.0 and I am trying to include
       > (inside the columns of the histogram) the percentage distribution, and I
       > want to generate three histograms, one for each fiscal year (in the Date
       > column, there are three fiscal years: AF 2017, AF 2020 and AF 2021). However,
       > I can't seem to accomplish this.
       >
       > Here is my data:
       >
       > structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L,
       > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L,
       > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L,
       > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L,
       > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L,
       > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"),
class =
       > "factor"),
       >      Amount = c(40100, 101100, 35000, 40100, 15000, 45100,
40200,
       >      15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100,
15000,
       >      15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000,
15000,
       >      15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
15000,
       >      15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000,
15000,
       >      16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000,
15000,
       >      15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000,
15000,
       >      15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000,
15000,
       >      15000, 15000, 15000, 15000)), row.names = c(NA, -74L),
class
=
       > "data.frame")
       >
       > I would like to modify the following script:
       >
       > with(datasetregs, Hist(Amount, groups=Date, scale="frequency",
       >     breaks="Sturges", col="darkgray"))
       >
       > # The only thing missing here are the percentages corresponding to each bin
       > # (I would like to see the percentages inside each column, or on top outside
       > # if possible)
       >
       > Any help will be greatly appreciated.
       >
       > Best regards,
       >
       > Paul.
       >
       >       [[alternative HTML version deleted]]
       >
       > ______________________________________________
       > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
       > https://stat.ethz.ch/mailman/listinfo/r-help
       > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
       > and provide commented, minimal, self-contained, reproducible code.
       >


