For further examples, people might check out the histograms in Leland Wilkinson's
book The Grammar of Graphics.  He presents a "gap histogram" with unequal bin
widths (breakpoints determined by partial Voronoi tessellation in 1 dimension)
such that the area of a bar is determined by the count of cases in the bar times the
width, just as with the ordinary histogram made with equal bin widths.
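
For anyone who wants to experiment, here is a rough Python sketch of the
breakpoint idea. In one dimension, a full Voronoi tessellation puts a boundary
at the midpoint between each pair of adjacent sorted values. How the "partial"
version chooses which boundaries to keep is not spelled out above, so the rule
used here (keep the midpoints of the widest gaps) and the names gap_breaks and
n_breaks are assumptions made only for illustration, not Wilkinson's procedure.
The data values are made up.

```python
import numpy as np

def gap_breaks(x, n_breaks):
    """Bin edges from 1-D Voronoi midpoints, keeping only the widest gaps.

    The "keep the widest gaps" rule is a hypothetical stand-in for the
    partial tessellation; it is not taken from The Grammar of Graphics.
    """
    xs = np.unique(np.asarray(x, dtype=float))   # sorted distinct values
    gaps = np.diff(xs)                           # distance between neighbours
    mids = (xs[:-1] + xs[1:]) / 2.0              # full 1-D Voronoi boundaries
    keep = np.sort(np.argsort(gaps, kind="stable")[::-1][:n_breaks])
    return np.concatenate(([xs[0]], mids[keep], [xs[-1]]))

x = [1, 2, 3, 10, 11, 12, 30]                    # made-up illustrative data
edges = gap_breaks(x, 2)
counts, _ = np.histogram(x, bins=edges)
widths = np.diff(edges)
density = counts / (len(x) * widths)             # bar height so that area = share of cases
```

With bar heights set to density, each bar's area (height times width) equals
the share of cases in that bin, whatever the bin widths turn out to be.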

Brian S. Cade

USGS
[EMAIL PROTECTED]

Tim Erickson wrote:

> First, a comment about the recent conversation; then some new, related
> stuff:
>
> I agree with Donald Burrill; there is no assumption that histograms have
> equal bar widths.
>
> See (I think) Tukey in EDA. The example I remember is for looking at an
> income distribution, where you might well have bins of, say, 0-5K, 5-10K,
> 10-25K, 25-50K, 50-100K, etc. This is an example where the natural categories,
> the natural bins, are unequal, and it would make sense to create a display
> that shows the categories _on a continuous axis_.
>
> If I remember right, the official, orthodox view of histograms is that they
> can have arbitrary bar widths, with DENSITY on the other axis. That is, (as
> Donald described) AREA is proportional to frequency, not the height of the
> bar. Furthermore, it's easy for people to interpret these charts informally
> (the distributions look right). The hard part is answering detailed
> questions such as, "how many people earn from $25-50K?" -- which might be
> better answered by a table anyhow.
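
A small Python sketch of the density idea, using the income bins from the
example above. The counts are invented purely for illustration (they are not
real income data), and the variable names are my own. The point is that bar
height is count divided by (total times bin width), so area, not height,
carries the frequency.

```python
import numpy as np

# Bin edges in $K, following the income example; counts are made up.
edges = np.array([0, 5, 10, 25, 50, 100], dtype=float)
counts = np.array([40, 60, 150, 200, 50], dtype=float)

widths = np.diff(edges)                       # 5, 5, 15, 25, 50
total = counts.sum()
density = counts / (total * widths)           # fraction of cases per $K
areas = density * widths                      # relative frequency of each bin

# Answering "how many people earn from $25-50K?" means going back from
# area to count, which is why a table may serve that question better.
recovered = areas * total
```

Note that in this made-up data the 10-25K bar is taller than the 25-50K bar
even though it holds fewer cases, because it is narrower.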
>
> It is only in the special case -- where bar widths are equal -- that the
> heights of the bars are proportional to frequency. Alas, since this "special
> case" is so common, we get used to "histogram" == "a frequency chart, rather
> like a bar chart but for continuous variables."
>
> -----------------------
> Now for the new part.
>
> A common histogram -- equal bin width, different frequencies -- is one
> special case for a histogram.
>
> A few years ago, I helped implement another special case:
>
>    equal frequencies, changing bin widths.
>
> We called this an "Ntigram" (pronounced "EN-ti-gram") (someone else must
> have invented it too, but it's only been practical since microcomputers).
> That is, if you have ten bins of equal frequency, each represents a decile.
> Four bins, a quartile, N bins an N-tile, whence N-tigram.
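
One way to sketch the equal-frequency construction in Python is to take the
bin edges at evenly spaced quantiles of the data. This is only an
illustration under my own assumptions, not how Fathom implements it: the
function name ntigram is hypothetical, and the code assumes continuous data
without heavy ties (tied values can produce zero-width bins).

```python
import numpy as np

def ntigram(x, n_bins):
    """Equal-count bins: edges at the 0, 1/N, 2/N, ..., 1 quantiles of x."""
    x = np.asarray(x, dtype=float)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    counts, _ = np.histogram(x, bins=edges)   # last bin includes the maximum
    widths = np.diff(edges)
    density = counts / (x.size * widths)      # height so that area = 1 / n_bins
    return edges, counts, density

# Four bins on 100 evenly spaced values: each bin holds a quarter of the cases.
edges, counts, density = ntigram(np.arange(100), 4)
```

Since every bin holds the same share of cases, every bar has the same area,
and all of the shape information is carried by the widths and heights.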
>
> These graphs are pretty interesting, especially if you have more than about
> 10 cases in each bin. Here's why, I think: When you look at the distribution
> of a population with a common histogram, you're always asking, is this
> feature I see real?
>
> Consider an age distribution of a sample from a community. It humps up in
> the middle (baby boomers, college students) and trails off at the end. If
> you bin by five years, up at the top end, you see a peak of 70-75 year olds
> and a gap at 75-80. Is it real? If there are only five people in the "peak,"
> probably not. So we want to smooth that out.
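
A back-of-the-envelope way to see why five people is too few: if a bin count
is treated as roughly Poisson (an assumption on my part, not something stated
above), its standard deviation is about the square root of the count. The
helper name count_noise is hypothetical.

```python
import math

def count_noise(count):
    """Rough sd of a bin count under a Poisson assumption, and sd / count."""
    sd = math.sqrt(count)
    return sd, sd / count

for c in (5, 50, 500):
    sd, rel = count_noise(c)
    print(f"count={c:4d}  sd~{sd:5.1f}  relative~{rel:.0%}")
```

Under that assumption a bin of 5 can easily swing by 2 or so from sampling
alone, which is comparable to the "peak" and "gap" themselves, while a bin of
500 is far more stable in relative terms.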
>
> But when you smooth it out by increasing the bin size, you lose possibly
> real structure in the more populous areas of the distribution.
>
> With an Ntigram, by contrast, if you plot the 20-iles, you get skinny bins
> (lots of structure) where you have lots of people, and wide bins (less
> structure, more smoothing) where the population is low.
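
The width behaviour is easy to check numerically. The sketch below uses a
synthetic bell-shaped sample as a stand-in (it is not census data, and the
seed and sample size are arbitrary choices of mine), cuts it into 20
equal-count bins, and compares a bin near the middle with the last bin in
the tail.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=2000)   # synthetic stand-in sample

# 20 equal-count bins ("20-iles"): edges at every 5th percentile.
edges = np.quantile(x, np.linspace(0.0, 1.0, 21))
widths = np.diff(edges)

middle_width = widths[9:11].mean()   # two bins straddling the median
tail_width = widths[-1]              # top 5% of the sample
print(f"middle bin width ~ {middle_width:.3f}, tail bin width ~ {tail_width:.3f}")
```

The bins near the centre, where cases are dense, come out much narrower than
the bin covering the sparse upper tail, which is the adaptive smoothing
described above.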
>
> Anyhow, these graphs are part of Fathom (www.keypress.com/fathom). If you
> want to play with them, download the demo. Load in some Census microdata,
> make a graph, and display "age." You can change the default dot plot to a box
> plot, a histogram, or to an Ntigram; drag on the bar edges to change the
> widths. (Windows only for now, but I'm playing with the nascent Mac version
> and it's great.)
>
> --
> Tim Erickson * eeps media * [EMAIL PROTECTED]
> 5269 Miles Avenue, Oakland CA 94618 * 510.653.3377
>
> =================================================================
> Instructions for joining and leaving this list and remarks about
> the problem of INAPPROPRIATE MESSAGES are available at
>                   http://jse.stat.ncsu.edu/
> =================================================================




