You should be using barplot and not hist. I think this produces what you want:
x <- "filename,last_modified,email_addr,country_residence
file1,3/4/2006 13:54,email1,Korea (South)
file2,3/4/2006 14:33,email2,United States
file2,3/4/2006 16:03,email2,United States
file2,3/4/2006 16:17,email3,United States
file2,3/4/2006 16:28,email3,United States
file3,3/4/2006 19:13,email4,United States
file2,3/4/2006 21:22,email5,India
file4,3/4/2006 21:46,email6,United States
file1,3/4/2006 22:04,email7,Japan
file2,3/4/2006 22:09,email8,Croatia
file1,3/4/2006 22:22,email7,Japan
file1,3/4/2006 22:29,email9,India
file1,3/4/2006 23:06,email6,United States
file1,3/4/2006 23:33,email6,United States
file5,3/4/2006 23:44,email10,China
file1,3/5/2006 0:13,email9,India
file2,3/5/2006 0:52,email8,Croatia
file2,3/5/2006 0:54,email8,Croatia
file2,3/5/2006 1:10,email5,India
file6,3/5/2006 2:17,email9,India
file2,3/5/2006 2:24,email11,Italy
file7,3/5/2006 2:36,email12,Italy
file8,3/5/2006 2:52,email12,Italy
file2,3/5/2006 3:09,email13,United Kingdom
file2,3/5/2006 4:02,email14,India
file2,3/5/2006 4:07,email14,India
file2,3/5/2006 4:14,email14,India
file2,3/5/2006 4:37,email5,India
file2,3/5/2006 4:44,email15,Belgium
file1,3/5/2006 5:02,email9,India
file1,3/5/2006 5:24,email16,Taiwan
file2,3/5/2006 6:06,email17,Saudi Arabia
file2,3/5/2006 7:32,email17,Saudi Arabia
file2,3/5/2006 8:12,email18,Brazil
file2,3/5/2006 8:26,email18,Brazil
file2,3/5/2006 9:49,email19,United Kingdom
file1,3/5/2006 10:49,email11,Italy
file1,3/5/2006 11:16,email13,United Kingdom
file1,3/5/2006 11:16,email13,United Kingdom
file1,3/5/2006 11:45,email13,United Kingdom
file1,3/5/2006 14:34,email20,Australia
file9,3/5/2006 14:56,email20,Australia
file9,3/5/2006 14:56,email20,Australia
file5,3/5/2006 16:43,email21,United States
file1,3/5/2006 17:17,email7,Japan
file2,3/5/2006 17:26,email22,Japan
file2,3/5/2006 17:27,email22,Japan
file2,3/5/2006 17:33,email23,China
file1,3/5/2006 17:45,email22,Japan
file2,3/5/2006 17:45,email22,Japan
file2,3/5/2006 17:59,email23,China
file1,3/5/2006 18:27,email24,Japan
file1,3/5/2006 18:47,email25,Taiwan
file2,3/5/2006 18:48,email26,New Zealand
file2,3/5/2006 19:15,email27,Canada
file2,3/5/2006 19:23,email28,Canada
file2,3/5/2006 19:24,email28,Canada
file10,3/5/2006 19:49,email29,Japan
file10,3/5/2006 19:52,email29,Japan
file10,3/5/2006 19:57,email29,Japan
file2,3/5/2006 20:01,email29,Japan
file2,3/5/2006 20:02,email29,Japan
file2,3/5/2006 20:06,email29,Japan"
d <- read.csv(textConnection(x))
barplot(table(d$filename), main="All Files", las=2)  # plot counts for all the files
# generate plots for each file name showing which emails used them
counts <- table(d$filename, d$email_addr)
for (i in seq(nrow(counts))){
    .index <- which(counts[i,] > 0)
    barplot(counts[i, .index], las=2, names.arg=colnames(counts)[.index],
        main=rownames(counts)[i])
}


On 6/18/07, Matthew Trunnell <[EMAIL PROTECTED]> wrote:
>
> Hello R gurus,
>
> I just spent my first weekend wrestling with R, but so far have come
> up empty handed.
>
> I have a dataset that represents file downloads; it has 4 dimensions:
> date, filename, email, and country. (sample data below)
>
> My first goal is to get an idea of the frequency of repeated
> downloads. Let me explain that. Some people tend to download
> multiple times, e.g. if the download fails they keep trying over and
> over. I'm trying to build a histogram that shows the repeat count
> along the x-axis, that is, how many people downloaded once, twice,
> three times, etc. I plan to compare the median of that before and
> after we switched ISPs.
>
> To accomplish this, I'm assuming that I'll first need to combine the
> email and filename columns so as to represent a single download
> attempt by an individual. Does that sound right? Later, it would be
> nice to limit the histogram to a single filename, country, or company.
> I can probably figure that out myself after I understand how to write
> this funky histogram expression.
>
> With the help of Verzani's introductory text, I've learned how to read
> in the CSV data and do some simple tables, like this:
>
> hist(table(d$filename))
> hist(table(d$filename[substring(d$filename, 1, 5)=="file1"]))
> hist(sort(table(d$filename[substring(d$filename, 1, 5)=="file1"])))
>
> Obviously, these commands count the frequency of the files. What I'd
> like to see are the repeats grouped along the x-axis; I'd like to
> find, for all files, the distribution of retries. I hope that makes
> sense. :)
>
> Can someone point me in the right direction? I'm very new to R and to
> statistics, but I write code for a living. At this point I'd almost
> be better off writing a program to do this kind of simple counting... but
> I have a feeling R would be so useful if I could just get past the
> initial learning curve.
>
> Thank you in advance,
> Matt
>
> Here's some real data, with the private info replaced :)
>
> d<-read.table(file="C:\\users\\trunnellm\\downloads\\statistics\\downloads.csv",
>   sep=",", quote="\"", header=TRUE)
>
> filename,last_modified,email_addr,country_residence
> file1,3/4/2006 13:54,email1,Korea (South)
> file2,3/4/2006 14:33,email2,United States
> file2,3/4/2006 16:03,email2,United States
> file2,3/4/2006 16:17,email3,United States
> file2,3/4/2006 16:28,email3,United States
> file3,3/4/2006 19:13,email4,United States
> file2,3/4/2006 21:22,email5,India
> file4,3/4/2006 21:46,email6,United States
> file1,3/4/2006 22:04,email7,Japan
> file2,3/4/2006 22:09,email8,Croatia
> file1,3/4/2006 22:22,email7,Japan
> file1,3/4/2006 22:29,email9,India
> file1,3/4/2006 23:06,email6,United States
> file1,3/4/2006 23:33,email6,United States
> file5,3/4/2006 23:44,email10,China
> file1,3/5/2006 0:13,email9,India
> file2,3/5/2006 0:52,email8,Croatia
> file2,3/5/2006 0:54,email8,Croatia
> file2,3/5/2006 1:10,email5,India
> file6,3/5/2006 2:17,email9,India
> file2,3/5/2006 2:24,email11,Italy
> file7,3/5/2006 2:36,email12,Italy
> file8,3/5/2006 2:52,email12,Italy
> file2,3/5/2006 3:09,email13,United Kingdom
> file2,3/5/2006 4:02,email14,India
> file2,3/5/2006 4:07,email14,India
> file2,3/5/2006 4:14,email14,India
> file2,3/5/2006 4:37,email5,India
> file2,3/5/2006 4:44,email15,Belgium
> file1,3/5/2006 5:02,email9,India
> file1,3/5/2006 5:24,email16,Taiwan
> file2,3/5/2006 6:06,email17,Saudi Arabia
> file2,3/5/2006 7:32,email17,Saudi Arabia
> file2,3/5/2006 8:12,email18,Brazil
> file2,3/5/2006 8:26,email18,Brazil
> file2,3/5/2006 9:49,email19,United Kingdom
> file1,3/5/2006 10:49,email11,Italy
> file1,3/5/2006 11:16,email13,United Kingdom
> file1,3/5/2006 11:16,email13,United Kingdom
> file1,3/5/2006 11:45,email13,United Kingdom
> file1,3/5/2006 14:34,email20,Australia
> file9,3/5/2006 14:56,email20,Australia
> file9,3/5/2006 14:56,email20,Australia
> file5,3/5/2006 16:43,email21,United States
> file1,3/5/2006 17:17,email7,Japan
> file2,3/5/2006 17:26,email22,Japan
> file2,3/5/2006 17:27,email22,Japan
> file2,3/5/2006 17:33,email23,China
> file1,3/5/2006 17:45,email22,Japan
> file2,3/5/2006 17:45,email22,Japan
> file2,3/5/2006 17:59,email23,China
> file1,3/5/2006 18:27,email24,Japan
> file1,3/5/2006 18:47,email25,Taiwan
> file2,3/5/2006 18:48,email26,New Zealand
> file2,3/5/2006 19:15,email27,Canada
> file2,3/5/2006 19:23,email28,Canada
> file2,3/5/2006 19:24,email28,Canada
> file10,3/5/2006 19:49,email29,Japan
> file10,3/5/2006 19:52,email29,Japan
> file10,3/5/2006 19:57,email29,Japan
> file2,3/5/2006 20:01,email29,Japan
> file2,3/5/2006 20:02,email29,Japan
> file2,3/5/2006 20:06,email29,Japan
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
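
The quoted message also asks how many people downloaded once, twice, three times, and so on. One possible way to get at that, offered only as a rough sketch and not as part of the answer above: count the downloads for each email/filename pair, then tabulate those counts. The small data frame here is a hypothetical stand-in that reuses the column names filename and email_addr from the quoted data; the names pair_counts, attempts and repeat_dist are made up for illustration.

```r
# Hypothetical miniature of the quoted data: one row per download attempt.
d <- data.frame(
  filename   = c("file1", "file2", "file2", "file2", "file1", "file1", "file3"),
  email_addr = c("email1", "email2", "email2", "email3", "email7", "email7", "email4")
)

# Downloads per (email, filename) pair.
pair_counts <- table(d$email_addr, d$filename)

# Keep only the pairs that actually occur; the result is a plain vector of counts.
attempts <- pair_counts[pair_counts > 0]

# How many pairs downloaded once, twice, three times, ...
repeat_dist <- table(attempts)
barplot(repeat_dist, xlab = "Downloads per email/file pair",
        ylab = "Number of pairs", main = "Repeat downloads")

# Median number of downloads per pair, e.g. to compare two time periods.
median(attempts)
```

With the real data, the same steps should work on the d read in earlier. Restricting to one file or country would be a matter of subsetting d before calling table().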