Re: [R] stripping #s in a text file prior to reading into table or dataframe
Thanks for your advice! I still get the same error, though -- not sure why.

  read.table('don.5.clusters.txt', header = TRUE, comment.char = '', quote = '')
  Error in read.table("don.5.clusters.txt", header = TRUE, comment.char = "", :
    more columns than column names

Any other thoughts?

--
Donald Braman
http://ssrn.com/author=286206
http://www.culturalcognition.net/braman/
http://www.law.gwu.edu/Faculty/profile.aspx?id=10123

Henrique Dallazuanna, Tue, 26 Oct 2010 09:11:33 -0700:

Try this:

  read.table('don.5.clusters.txt', header = TRUE, comment.char = '', quote = '')

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] stripping #s in a text file prior to reading into table or dataframe
read.delim2 did the trick -- many thanks!!!

On Wed, Oct 27, 2010 at 10:01 AM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:
> ?read.delim2
>
> HTH,
> Jorge
>
> On Wed, Oct 27, 2010 at 9:51 AM, Donald Braman dbra...@law.gwu.edu wrote:
>> Thanks for your advice! I still get the same error, though -- not sure why.
>>
>>   read.table('don.5.clusters.txt', header = TRUE, comment.char = '', quote = '')
>>   Error in read.table("don.5.clusters.txt", header = TRUE, comment.char = "", :
>>     more columns than column names
>>
>> Any other thoughts?
[R] stripping #s in a text file prior to reading into table or dataframe
I'm importing a lot of text tables of data (from Latent Gold) that include hashes in some of the column names (Cluster#1, Cluster#2, etc.). Is there an easy way to strip the offending hashes out before pushing the text into a table or data frame?

I thought I'd use gsub, for example, but I can't figure out how to read in a text file without reading it into a table or data frame (which would be ill structured, given the hashes). I could do it in another scripting language or a shell script, but I would like to try to do it in R.
Re: [R] stripping #s in a text file prior to reading into table or dataframe
That's one of the things I tried, but it didn't work. I get the following error when I do that:

  Error in read.table(file = "don.5.clusters.txt", header = TRUE, comment.char = "", :
    more columns than column names

If I remove the hashes by other means, I don't get that error.

On Tue, Oct 26, 2010 at 10:49 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
> On 26/10/2010 10:33 AM, Donald Braman wrote:
>> I'm importing a lot of text tables of data (from Latent Gold) that include
>> hashes in some of the column names (Cluster#1, Cluster#2, etc.). Is there
>> an easy way to strip the offending hashes out before pushing the text into
>> a table or data frame?
>
> readLines() will read it, but you may not need to do that. Set
> comment.char="" to turn off the special meaning of # in read.table() and
> related functions.
>
> Duncan
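For readers following the thread, the two routes discussed above (stripping the hashes with gsub after readLines, or switching off comment handling) can be sketched as below. This is only an illustration: the file contents and the tab separator are assumptions, since the actual Latent Gold export is not shown in the thread.

```r
# Hypothetical stand-in for a Latent Gold export; the real file layout is not
# shown in the thread, so the tab separator and the values are assumptions.
path <- tempfile(fileext = ".txt")
writeLines(c("id\tCluster#1\tCluster#2",
             "1\t0.90\t0.10",
             "2\t0.25\t0.75"), path)

# Route 1: read the raw lines, strip the hashes with gsub(), then parse.
raw   <- readLines(path)
clean <- gsub("#", "", raw, fixed = TRUE)
d1 <- read.table(text = clean, header = TRUE, sep = "\t")

# Route 2: leave the file alone and switch off comment handling.
# check.names = FALSE keeps "Cluster#1" as the literal column name.
d2 <- read.table(path, header = TRUE, sep = "\t",
                 comment.char = "", check.names = FALSE)

names(d1)
names(d2)
```

Note that read.delim() and read.delim2() default to comment.char = "", sep = "\t" and fill = TRUE, which may be why read.delim2 succeeded elsewhere in this thread where read.table did not.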
[R] dirichlet models
Does anyone know of a package (or workaround) for fitting a Dirichlet distribution by maximum likelihood? I'm looking for something like this: http://repec.org/bocode/d/dirifit.html, which allows for both dependent variables summing to 1 and predictive variables of any sort.

Don
[R] latent class analysis with mixed variable types
As an alternative to Latent GOLD, I'm wondering if anyone knows of an R package that can manage Latent Class Analysis with mixed variable types (continuous, ordinal, and nominal/binary).
Re: [R] repeating values in levels()
Thanks! I've figured out how to fix it, but how I got here is still a puzzle. :-)

Cheers,
Don

On Sat, Oct 17, 2009 at 5:36 PM, Peter Ehlers ehl...@ucalgary.ca wrote:
> Donald Braman wrote:
>> Can someone help me understand this result?
>>
>>   levels(as.factor(miset1$facts_convict))
>>   [1] "1" "1" "2" "3" "4" "5" "6"
>
> Don't know how you got your data that way, but I wonder if you've done
> str() on your data after whatever procedure you used to get to this stage.
> Here's one way to get this pathological state:
>
>   set.seed(2)
>   x <- sample(5, 15, rep=TRUE)
>   y <- factor(x, levels=c(1, 1:5))  ## repeating level 1
>   levels(y)
>   [1] "1" "1" "2" "3" "4" "5"
>
>> converting to numeric and back doesn't seem to help:
>>
>>   levels(as.factor(as.numeric(miset1$facts_convict)))
>>   [1] "1" "1" "2" "3" "4" "5" "6"
>
> I suspect that miset1$facts_convict is already a factor [str() would tell
> you] and that the following comment from ?factor applies:
>
>   In particular, as.numeric applied to a factor is meaningless ...
>
> If my guess is correct, you should be able to fix things with
>
>   newy <- factor(y)
>   levels(newy)
>   [1] "1" "2" "3" "4" "5"
>
> -Peter Ehlers
>
>> It's messing up my ologits. Any way to correct this?
[R] repeating values in levels()
Can someone help me understand this result?

  levels(as.factor(miset1$facts_convict))
  [1] "1" "1" "2" "3" "4" "5" "6"

Converting to numeric and back doesn't seem to help:

  levels(as.factor(as.numeric(miset1$facts_convict)))
  [1] "1" "1" "2" "3" "4" "5" "6"

It's messing up my ologits. Any way to correct this?
[R] standard error associated with correlation coefficient
I want the standard error associated with a correlation. I can calculate it using cor and var, but I am wondering if there are libraries that already provide this function.
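As a rough sketch of what such a calculation could look like in base R: the usual standard error of a Pearson correlation under the null hypothesis of zero correlation is sqrt((1 - r^2) / (n - 2)), and cor.test() in the stats package reports the matching t statistic along with a confidence interval. The data below are simulated purely for illustration.

```r
set.seed(1)                      # simulated data, purely for illustration
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)

r  <- cor(x, y)
n  <- length(x)
se <- sqrt((1 - r^2) / (n - 2))  # standard error used in the t test of r = 0

ct <- cor.test(x, y)             # base R (stats); no extra package needed
c(r = r, se = se, t = unname(ct$statistic))
ct$conf.int                      # confidence interval based on Fisher's z
```

Dividing r by this standard error reproduces the t statistic that cor.test() reports, which is a quick way to check the hand calculation.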
[R] simple graph question: manipulating variable names
This is a simple problem that has stumped me: I'm trying to loop through a few dozen variable names in graphs. I've tried various approaches like this:

  attach(mydata)
  ivs <- c("oneiv", "anotheriv", "yetanotheriv")
  dvs <- c("onedv", "anotherdv", "yetanotherdv")
  for (iv in ivs) {
    for (dv in dvs) {
      graphname <- paste(iv, dv, ".png", sep = "")
      png(file=graphname, width=300, height=300)
      plot(dv ~ iv, pch=".")
      lines(loess.smooth(iv, dv), lty=1)
      dev.off()
    }
  }

Clearly that doesn't work. I'm not sure how to make R see the iv and dv strings as variables. Advice?

Donald Braman
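One possible way to make the loop above treat the strings as column names is to index the data frame with [[ ]] instead of relying on attach(). The sketch below uses the built-in mtcars data and a temporary directory as stand-ins; the column names and output location are assumptions chosen only so the example runs.

```r
mydata <- mtcars                 # built-in data standing in for the poster's
ivs <- c("wt", "hp")             # column names held as strings
dvs <- c("mpg", "qsec")
outdir <- tempdir()              # assumed output location for the sketch
files <- character(0)

for (iv in ivs) {
  for (dv in dvs) {
    graphname <- file.path(outdir, paste0(iv, "_", dv, ".png"))
    png(filename = graphname, width = 300, height = 300)
    # mydata[[iv]] looks the column up by its name; no attach() needed
    plot(mydata[[iv]], mydata[[dv]], pch = ".", xlab = iv, ylab = dv)
    lines(loess.smooth(mydata[[iv]], mydata[[dv]]), lty = 1)
    dev.off()
    files <- c(files, graphname)
  }
}
files
```

The key change is mydata[[iv]]: inside the loop, iv is just a character string, so plot(dv ~ iv) sees two strings rather than two columns.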
[R] lowess puzzle
I was trying to fit a curve to the number of people who identify as liberal by age. I got some puzzling results which suggested to me that I don't really understand how local polynomial fitting works. Why, I am wondering, is lowess producing a local fit of zero for every age? liberal.bin [1] 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 [57] 1 1 1 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 [113] 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 [169] 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 1 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1 0 1 [225] 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1 [281] 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 0 0 0 1 1 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 [337] 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 1 [393] 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 [449] 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 1 1 0 0 1 0 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 [505] 1 0 1 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 1 0 1 0 1 1 0 0 1 0 0 1 [561] 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 [617] 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 [673] 1 0 0 0 0 0 0 1 0 1 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 1 0 1 1 [729] 1 0 1 0 0 0 0 0 1 1 0 0 0 1 0 0 1 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 [785] 0 0 0 1 1 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 0 0 
1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 [841] 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 0 [897] 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 [953] 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 1 [1009] 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 [1065] 0 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 0 1 0 1 0 1 1 0 0 0 0 0 0 0 [1121] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 1 0 1 1 0 0 0 0 1 1 1 0 0 0 0 0 [1177] 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 [1233] 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 1 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 [1289] 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0 0 0 [1345] 1 0 1 1 0 1 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [1401] 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [1457] 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 age [1] 62 62 27 61 44 53 50 71 57 43 66 60 45 71 52 61 32 62 51 50 48 35 73 63 48 38 25 28 66 81 45 50 52 23 55 62 72 [38] 64 63 56 52 69 62 41 40 51 28 58 44 48 53 32 28 48 29 58 31 40 57 65 23 75 39 57 30 66 66 52 52 65 52 44 50 36 [75] 25 55 36 49 51 23 80 51 60 42 45 38 46 50 63 64 68 25 33 37 60 46 82 26 55 55 46 24 59 23 33 60 79 48 60 68 40 [112] 45 55 61 57 29 75 30 71 51 46 52 38 33 65 68 63 43 64 58 38 71 73 62 43 83 21 66 60 46 60 59 61 52 50 48 42 64 [149] 50 24 23 60 61 52 59 24 22 70 63 65 74 50 80 54 55 47 75 67 41 46 57 
63 50 62 31 58 75 21 28 69 67 62 47 56 38 [186] 79 69 52 54 32 70 58 50 35 39 34 59 43 54 54 57 28 43 47 56 48 51 57 72 34 57 51 46 50 48 40 59 66 29 50 33 45 [223] 65 57 65 69 45 46 65 76 74 54 61 43 43 38 36 58 68 54 65 42 53 72 45 39 61 40 44 79 79 50 27 63 50 70 34 32 27 [260] 49 72 50 53 47 49 44 41 24 22 41 25 27 64 70 49 50 22 35 26 62 45 19 31 61 62 39 80 44 42 41 66 59 67 41 64 47 [297] 75 24 37 33 46 47 36 53 59 36 26 24 37 69 49 44 73 53 50 79 19 42 54 46 31 55 45 53 56 67 50 47 43 77 32 60 28 [334] 59 71 48 50 39 31 31 32 59 60 51 62 39 38 31 28 58 20 70 57 60 55 25 38 31 61 71 33 57 60 68 64 20 46 38 68 56 [371] 23 49 54 56 22 38 31 46 46 28 40 33 43 63 49 30 36 60 50 61 64 31 30 34 26 25 66 55 57 56 31 28 65 31 72 32 44 [408] 58 50 49 56 38 50 65 57 72 40 64 74 73 36 60 67 78 67 66 38 28 79 55 56 65 69 34 51 31 61 69 19 38 72 32 55 56 [445] 51 25 27 45 48 60 41 27 45 47 62 23 68 57 23 47 59 50 36 43 59 81 27 40 50 76 45 26 46 53
Re: [R] lowess puzzle
Resolved. It works if I set iter=0.

On Thu, Aug 6, 2009 at 9:03 PM, Donald Braman dbra...@law.gwu.edu wrote:
> I was trying to fit a curve to the number of people who identify as liberal
> by age. I got some puzzling results which suggested to me that I don't
> really understand how local polynomial fitting works. Why, I am wondering,
> is lowess producing a local fit of zero for every age?
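A plausible reading of why iter=0 helps: lowess() by default runs three robustness iterations that downweight points with large residuals. With a 0/1 response in which most values are 0, the 1s can end up treated as outliers and weighted away, pulling the fit toward zero. The sketch below uses simulated data (not the data posted in this thread) to compare the two settings.

```r
set.seed(42)                                  # simulated stand-in data
age <- sample(19:83, 1500, replace = TRUE)
liberal <- rbinom(1500, 1, 0.25)              # about one respondent in four

fit_robust <- lowess(age, liberal)            # default iter = 3
fit_plain  <- lowess(age, liberal, iter = 0)  # no robustness iterations

range(fit_robust$y)   # fitted values with robustness iterations
range(fit_plain$y)    # fitted values from the plain local fit
mean(liberal)         # overall share of 1s, for comparison
```

With iter = 0 the fitted values stay close to the local proportion of 1s, which is usually what is wanted when smoothing a binary outcome.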
[R] recoding strings containing colons
Curious to know if recode can work with strings containing colons. I haven't gotten it to work yet, but perhaps there is a way?

Donald Braman
http://www.culturalcognition.com/braman/
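The post does not say which recode is meant; assuming it is car::recode, a colon in the recode specification denotes a range (lo:hi), which may be what gets in the way. One workaround that sidesteps the specification string entirely is a named lookup vector in base R. The values below are made up for illustration.

```r
x <- c("10:30", "11:45", "10:30", "other")     # made-up values with colons

# Names are the old values, elements are the new ones.
lookup <- c("10:30" = "morning", "11:45" = "midday")

recoded <- unname(lookup[x])                   # NA where there is no match
recoded[is.na(recoded)] <- x[is.na(recoded)]   # keep unmatched values as-is
recoded
```

Because the old values are used only as vector names, the colons are never parsed as part of a range expression.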
[R] mapping states with colors
Hi folks,

I'm just learning how to use maps. As an initial foray, I'm mapping the states that have duty to retreat (blue) and stand your ground (red) self-defense standards. Here is my extremely naive script:

  dtr <- c('alabama', 'arizona', 'connecticut', 'delaware', 'dist of columbia',
           'hawaii', 'maryland', 'massachusetts', 'minnesota', 'missouri',
           'nebraska', 'new hampshire', 'new jersey', 'new mexico', 'new york',
           'north carolina', 'north dakota', 'ohio', 'pennsylvania',
           'rhode island', 'virginia', 'wyoming', 'arkansas', 'vermont')
  syg <- c('alaska', 'california', 'colorado', 'florida', 'georgia', 'idaho',
           'illinois', 'indiana', 'iowa', 'kansas', 'kentucky', 'louisiana',
           'maine', 'michigan', 'mississippi', 'montana', 'nevada', 'oklahoma',
           'oregon', 'south carolina', 'south dakota', 'tennessee', 'texas',
           'utah', 'washington', 'west va', 'wisconsin')
  map('state', proj='bonne', param=50, region = c(syg, dtr), fill=TRUE,
      col=c('red', 'blue'))

Obviously that doesn't work. A couple of questions:

1. How do I get Alaska and Hawaii on the map?
2. How do I set the col attribute for a subset of the states I'm mapping?

Many thanks in advance for any help!

Don
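For the second question, one common idiom is to ask map() for its polygon names and build a colour vector of the same length. The sketch below assumes the maps package is installed (the plotting step is skipped otherwise), uses shortened state lists for illustration, and introduces a hypothetical helper, state_colors(), that is not part of any package. As far as I know, the "state" database covers only the contiguous states and DC, so Alaska and Hawaii would need a different data source.

```r
# State lists shortened for illustration; names must match the spellings
# used by the "state" database (e.g. "west virginia", not "west va").
dtr <- c("alabama", "new york", "massachusetts")
syg <- c("florida", "texas", "michigan")

# Hypothetical helper: map polygon names such as "new york:manhattan"
# to a fill colour, with grey for states in neither list.
state_colors <- function(poly_names, dtr, syg) {
  state <- sub(":.*$", "", poly_names)        # drop the ":sub-region" suffix
  ifelse(state %in% dtr, "blue",
         ifelse(state %in% syg, "red", "grey90"))
}

if (requireNamespace("maps", quietly = TRUE)) {
  poly_names <- maps::map("state", plot = FALSE, namesonly = TRUE)
  maps::map("state", fill = TRUE,
            col = state_colors(poly_names, dtr, syg))
}
```

The colour vector is matched to polygons by position, which is why it is built from the names that map() itself returns rather than from the order of the two state lists.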
[R] newbie query: simple crosstabs
I've been playing around with various table tools, trying to construct a fairly simple cross-tab. It shouldn't be hard, but for some reason it's turning out to be (for me).

If I want to see how many men and how many women agree with an agree/disagree question (coded 1,0), I can do this:

  attach(mydata)
  mytable <- table(male, q1.bin)  # gender and a binary response variable
  prop.table(mytable, 1)          # row percentages

      q1.bin
  male      0      1
     0 0.3988 0.6012
     1 0.2879 0.7121

I can repeat that for each of the items I want gender breakdowns for (q2, q3, q4, ...). But what I really want is a table that shows the percentage answering yes (coded as 1) across many, many binary response items. E.g.,

  male q1.bin q2.bin q3.bin ...
  0    0.6012 0.3421 0.9871 ...
  1    0.7121 0.6223 0.0198 ...

I've tried various combinations of apply and cbind, but to no avail. It would be easy in SPSS crosstabs, but darnit, I want to use R!
Re: [R] newbie query: simple crosstabs
Thanks for the help everyone! I'm new to vectors, and don't quite get it. This works for me:

  binary.vars <- c("q1", "q2", "q3", ...)
  apply(mydata[binary.vars], 2, tapply, mydata["male"], mean)

but this doesn't:

  other.vars <- c("male", "race", "religion")
  apply(mydata[other.binary.vars], 2, tapply, mydata[other.vars], mean)

What am I missing?

On Tue, Apr 7, 2009 at 6:14 PM, hadley wickham h.wick...@gmail.com wrote:
> On Tue, Apr 7, 2009 at 4:41 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:
>> Hi Eik,
>> You're absolutely right. My bad. Here is the correction of the code I sent:
>>
>>   apply(mydata[,-1], 2, tapply, mydata[,1], function(x) sum(x)/length(x))
>
> Or more simply:
>
>   apply(mydata[,-1], 2, tapply, mydata[,1], mean)
>
> Hadley
>
> --
> http://had.co.nz/
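One way to approach the multi-variable case asked about above is to separate the two situations: sapply() with tapply() for a single grouping variable, and aggregate() when there are several. The sketch below uses simulated survey data; the column names mirror the thread, but the values are invented. Note also that the second call in the post refers to other.binary.vars, which is never defined there and may be part of the problem.

```r
set.seed(7)                                   # simulated survey data
mydata <- data.frame(
  male = rbinom(200, 1, 0.5),
  race = sample(c("a", "b"), 200, replace = TRUE),
  q1   = rbinom(200, 1, 0.6),
  q2   = rbinom(200, 1, 0.4),
  q3   = rbinom(200, 1, 0.8)
)
binary.vars <- c("q1", "q2", "q3")

# One grouping variable: share answering 1, by gender, for every item.
by_gender <- sapply(mydata[binary.vars],
                    function(v) tapply(v, mydata$male, mean))

# Several grouping variables: aggregate() returns one row per combination.
by_groups <- aggregate(mydata[binary.vars],
                       by = mydata[c("male", "race")], FUN = mean)
by_gender
by_groups
```

The first result is a matrix with one row per gender and one column per item, which matches the layout requested in the original question; the second is a data frame that stays readable as more grouping variables are added.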
[R] inverting a table
Is there an easy way to invert a table? (Not to solve for the inverted matrix, just to swap rows for columns and vice versa.) I've gone through my data manipulation bible (Phil Spector's book), but to no avail.
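Swapping rows and columns is a transpose, and base R's t() accepts a table as well as a matrix. A minimal sketch on made-up data:

```r
# Small made-up table: 2 genders by 3 answers
tab <- table(
  gender = c("m", "f", "m", "f", "m"),
  answer = c("yes", "no", "maybe", "no", "yes")
)

# t() swaps the two dimensions and keeps the dimnames with them
flipped <- t(tab)
flipped
```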
[R] quantile / centile
I'm wondering if there is a simple way to assign a quantile to a vector in a data frame, much like one could in Stata using centile. Let's say I want 100 slices in my assignation. I can easily see the limits of each slice by using quantile:

  quantile(my.df$my.var, probs=seq(0, 1, 0.01))

But how do I assign the appropriate value to each row/record in my data frame? Clearly the following won't work, but what will?

  my.df$my.new.var <- quantile(my.df$my.var, probs=seq(0, 1, 0.01))
Re: [R] quantile / centile
Thanks for the response! Unfortunately, I was unclear; my problem is not that I need to know what the percentile ranges are, but that I need to assign the appropriate percentile range to each of the records in my dataframe. My dataframe contains somewhere between 1000 and 9000 rows/records (depending on context), not a hundred rows. That is, I'd like to assign to each row the quantile its value falls into, based on the quantile() result computed over the whole 1000-9000 row data frame. Thanks again for any help!

On Sat, Sep 27, 2008 at 8:54 AM, Henrique Dallazuanna [EMAIL PROTECTED] wrote:
> Try this:
>   my.df$my.newvar <- quantile(my.df$my.var, probs = seq(0.01, 1, 0.01))
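One common way to label each row with the slice it falls into is to feed the quantile() break points to cut(). This is a sketch rather than an answer given in the thread; the data are simulated and the column name `centile` is made up. It assumes the break points are all distinct; with heavily tied data cut() will complain about non-unique breaks.

```r
# Simulated stand-in for my.df with 1000 rows
set.seed(42)
my.df <- data.frame(my.var = rnorm(1000))

# 101 break points delimit 100 slices
breaks <- quantile(my.df$my.var, probs = seq(0, 1, 0.01))

# labels = FALSE returns the slice number (1..100) for every row;
# include.lowest = TRUE keeps the minimum value inside the first slice
my.df$centile <- cut(my.df$my.var, breaks = breaks,
                     labels = FALSE, include.lowest = TRUE)
head(my.df)
```

The result has one value per row, which is what the assignment to a data frame column needs; quantile() alone returns one value per requested probability instead.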
[R] sem testing multiple hypotheses with BIC
I'm coming from the AMOS world and am wondering if there is a simple way to do multiple hypothesis testing in the manner of BIC analyses in AMOS using the sem package in R. I've read the documentation, but don't see anything in there except for basic BIC scores. Perhaps someone has devised a simple way to compare the relative likelihood of all possible path-fittings within a specified set of paths?
[R] two newbie questions
# I've tried to make this easy to paste into R, though it's probably so simple you won't need to.
# I have some data (there are many more variables, but this is a reasonable approximation of it).
# Here's a fabricated data frame that is similar in form to mine:

my.df <- data.frame(replicate(10, round(rnorm(100, mean=3.5, sd=1))))
var.list <- c("dv1", "dv2", "dv3", "iv1", "iv2", "iv3", "iv4", "iv5", "intv1", "intv2")
names(my.df) <- var.list

# Some are DVs:
dvs <- c("dv1", "dv2", "dv3")
# some IVs:
ivs <- c("iv1", "iv2", "iv3", "iv4", "iv5")
# and some binary interaction variables:
intvs <- c("intv1", "intv2")

library(car)
my.df[intvs] <- lapply(my.df[intvs], function(x) recode(x, recodes = "lo:3.5=0; 3.5:hi=1; ", as.factor.result = FALSE))

# now I loop through a series of interactions using the vector numbers:
for(dv in 1:3) {
  for(iv in 4:8) {
    for (intv in 9:10) {
      jpeg(paste(names(my.df[iv]), names(my.df[dv]), names(my.df[intv]), ".jpg", sep="_"))
      with(data.frame(my.df), {
        my.fit <- lm(my.df[[dv]] ~ my.df[[iv]] + my.df[[intv]] + my.df[[iv]]:my.df[[intv]])
        colors <- ifelse(my.df[[intv]] == 1, "black", "grey")
        plot(my.df[[iv]], my.df[[dv]], xlab=names(my.df[iv]), ylab=names(my.df[dv]), col=colors, pch=".")
        curve(cbind(1, 1, x, 1*x) %*% coef(my.fit), add=TRUE, col="black")
        curve(cbind(1, 0, x, 0*x) %*% coef(my.fit), add=TRUE, col="gray")
      })
      dev.off()
    }
  }
}

# Question 1: Works fine, but using the vector numbers feels kludgy -- especially if the variables in question aren't consecutive.
# Is there a more elegant way of doing this with my lists of variable names? Something like this, for example:

for(dv in dvs) {
  for(iv in ivs) {
    for (intv in intvs) {
      jpeg(paste(dv, iv, intv, ".jpg", sep="_"))
      with(data.frame(my.df), {
        my.fit <- lm(my.df[dv] ~ my.df[iv] + my.df[intv] + my.df[iv]:my.df[intv])
        colors <- ifelse(my.df[[intv]] == 1, "black", "grey")
        plot(my.df[iv], my.df[dv], xlab=iv, ylab=names(dv), col=colors, pch=".")
        curve(cbind(1, 1, x, 1*x) %*% coef(my.fit), add=TRUE, col="black")
        curve(cbind(1, 0, x, 0*x) %*% coef(my.fit), add=TRUE, col="gray")
      })
      dev.off()
    }
  }
}

# Clearly that's wrong -- why it's wrong is obscure to me, though! Please educate me!

# Question 2: Could this be done by using apply rather than a loop?
# Or is looping better here because there are several actions performed at each iteration?
# I'm still trying to get my head around all the ways to ditch looping in R.

Donald Braman http://www.law.gwu.edu/Faculty/profile.aspx?id=10123 http://research.yale.edu/culturalcognition http://ssrn.com/author=286206
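A likely reason the name-based loop misbehaves is that `my.df[dv]` with single brackets returns a one-column data frame, while `my.df[[dv]]` with double brackets returns the column vector that lm() and plot() expect. One way to loop over names without indexing the data frame inside the formula is to build the formula from the names with reformulate() and pass `data =`. The sketch below uses a small invented data frame and skips the plotting; it is an illustration, not the code from the thread.

```r
# Invented data with one DV, one IV and one binary moderator
set.seed(1)
my.df <- data.frame(dv1 = rnorm(50), iv1 = rnorm(50),
                    intv1 = rbinom(50, 1, 0.5))

class(my.df["dv1"])    # "data.frame": single brackets keep the frame
class(my.df[["dv1"]])  # "numeric": double brackets extract the column

dvs <- "dv1"; ivs <- "iv1"; intvs <- "intv1"
fits <- list()
for (dv in dvs) for (iv in ivs) for (intv in intvs) {
  # Build dv ~ iv + intv + iv:intv from the character names
  f <- reformulate(c(iv, intv, paste(iv, intv, sep = ":")), response = dv)
  fits[[paste(dv, iv, intv, sep = "_")]] <- lm(f, data = my.df)
}
names(coef(fits[["dv1_iv1_intv1"]]))
```

A side benefit is that the coefficient names then carry the variable names, which makes it easier to check which coefficient multiplies which term when drawing fitted lines.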
Re: [R] two newbie questions
Wow -- many thanks for the mind-*expanding* help! I'm really impressed by R's ability to handle this so concisely. It's going to take me a while to get used to applying things to vectors, but the more I understand, the nicer R looks.

On Sun, Jun 22, 2008 at 6:59 PM, jim holtman [EMAIL PROTECTED] wrote:
> This does away with the 'for' loops and uses 'expand.grid' to create the combinations. I think I got the right variables substituted:
>
> my.df <- data.frame(replicate(10, round(rnorm(100, mean=3.5, sd=1))))
> var.list <- c("dv1", "dv2", "dv3", "iv1", "iv2", "iv3", "iv4", "iv5", "intv1", "intv2")
> names(my.df) <- var.list
> # Some are DVs:
> dvs <- c("dv1", "dv2", "dv3")
> # some IVs:
> ivs <- c("iv1", "iv2", "iv3", "iv4", "iv5")
> # and some binary interaction variables:
> intvs <- c("intv1", "intv2")
> library(car)
> my.df[intvs] <- lapply(my.df[intvs], function(x) recode(x, recodes = "lo:3.5=0; 3.5:hi=1; ", as.factor.result = FALSE))
>
> # create a dataframe of values to check
> xpnd <- expand.grid(dvs, ivs, intvs)  # create combinations
> invisible(apply(xpnd, 1, function(.row) {
>   jpeg(paste(paste(.row, collapse="_"), ".jpg", sep=''))
>   my.fit <- lm(my.df[[.row[1]]] ~ my.df[[.row[2]]] + my.df[[.row[3]]] + my.df[[.row[2]]]:my.df[[.row[3]]])
>   colors <- ifelse(my.df[[.row[3]]] == 1, "black", "grey")
>   plot(my.df[[.row[2]]], my.df[[.row[1]]], xlab=.row[2], ylab=.row[1], col=colors, pch=".")
>   curve(cbind(1, 1, x, 1*x) %*% coef(my.fit), add=TRUE, col="black")
>   curve(cbind(1, 0, x, 0*x) %*% coef(my.fit), add=TRUE, col="gray")
>   dev.off()
> }))
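To see what expand.grid() contributes in the reply above, it may help to look at the grid on its own. The sketch below uses shortened name vectors (an assumption for brevity) and only builds file names; no models are fitted and nothing is written to disk.

```r
# Shortened, illustrative name vectors
dvs   <- c("dv1", "dv2")
ivs   <- c("iv1", "iv2", "iv3")
intvs <- c("intv1")

# One row per combination; the first argument varies fastest.
# stringsAsFactors = FALSE keeps the names as plain character strings.
combos <- expand.grid(dv = dvs, iv = ivs, intv = intvs,
                      stringsAsFactors = FALSE)
nrow(combos)  # 2 * 3 * 1 = 6 combinations

# One output file name per row of the grid
file_names <- paste0(combos$dv, "_", combos$iv, "_", combos$intv, ".jpg")
file_names
```

Iterating over the rows of `combos` then replaces the three nested loops with a single pass.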
[R] imputationlist, update, and recode
I'm stumbling my way through manipulating data in multiply imputed datasets, and have run into a problem translating code I used to run on my pre-imputed dataset to multiple datasets. The imputation runs just fine, as does the reading of the mi data sets into an imputationList. I run into trouble, though, when I try to construct a scale across all the data sets. Is there a simple way to do this? Here's what I've been trying:

  vars_to_impute = c("var1", ... "var50")
  imputed <- amelia(data=vars_to_impute, m=5, outname="miset")
  files.allmisets <- list.files(getwd(), pattern="miset*", full=TRUE)
  allmis <- imputationList(lapply(files.allmisets, read.csv))
  scale1_vars <- c("var1", "var2", "var3", ... "var20")
  scale2_vars <- c("var21", "var22", "var23", ... "var34")
  allmis <- update(allmis, myscale1 = rowMeans(allmis[scale1_vars], na.rm=TRUE))
  allmis <- update(allmis, myscale2 = rowMeans(allmis[scale2_vars], na.rm=TRUE))

Any help with this or general pointers about how to manage scale construction across multiple data sets would be much appreciated.

Don
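One base-R way to think about this, independent of any particular imputation package, is to keep the imputed data sets in a plain list and apply the same scale-building function to each element. The sketch below simulates five small data sets instead of reading Amelia output; the helper `make_set()` and all column names are hypothetical.

```r
# Simulate five "imputed" data sets (stand-ins for the miset files)
set.seed(2)
make_set <- function() data.frame(var1 = rnorm(20), var2 = rnorm(20),
                                  var3 = rnorm(20))
imputed_sets <- replicate(5, make_set(), simplify = FALSE)

scale1_vars <- c("var1", "var2")

# Add the scale to every data set; each call works on a copy,
# so the modified copies are collected back into the list
imputed_sets <- lapply(imputed_sets, function(d) {
  d$myscale1 <- rowMeans(d[scale1_vars], na.rm = TRUE)
  d
})
names(imputed_sets[[1]])
```

A list prepared this way is the kind of input that mitools::imputationList() takes, so the scale construction can happen before the list is wrapped.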
[R] manipulating multiply imputed data sets
Hi folks, I have five imputed data sets and would like to apply the same recoding routines to each. I could do this sort of thing pretty easily in Stata using MIM, but I've decided to go cold turkey on other stats packages as an incentive for learning more about R. Most of the recoding is for nominal variables, like race, religion, urbanicity, and the like. So, for example, to recode race for my first dataset, miset1, I would do the following:

  miset1$white <- recode(miset1$RACE, '1=1; else=0; ')
  miset1$black <- recode(miset1$RACE, '2=1; else=0; ')
  miset1$asian <- recode(miset1$RACE, '3=1; else=0; ')
  miset1$hispanic <- recode(miset1$RACE, '4=1; else=0; ')
  miset1$raceother <- recode(miset1$RACE, '5=1; else=0; ')

I've tried a number of variations, e.g., on the following using recode (from the car package) with imputationList (from the mitools package), though without success:

  files.allmisets <- list.files(getwd(), pattern="miset*.csv$", full=TRUE)
  allmis <- imputationList(lapply(files.allmisets, read.csv))
  allmis <- update(allmis, white <- recode(RACE, '1=1; else=0; '))

I've also tried some basic loops. I guess I'm also a bit confused as to when R references the original object and when it creates a new one. I suppose I could do this in Python and then use PyR, but I'd really like to learn a bit more about R syntax. Any help on this specific problem or general advice on manipulating data in multiply imputed datasets in R would be much appreciated.

-- Donald Braman http://www.law.gwu.edu/Faculty/profile.aspx?id=10123 http://research.yale.edu/culturalcognition http://ssrn.com/author=286206
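On the question of when R modifies the original object: a function receives a copy of its argument, so changes made inside it are lost unless the function returns the modified object and the caller assigns it back. The sketch below applies that idea to dummy-coding a race variable across a list of data sets. It uses plain comparisons instead of car's recode(), simulated data instead of the miset files, and a hypothetical helper `add_race_dummies()`.

```r
# Five simulated data sets with a RACE code in 1..5
set.seed(3)
misets <- replicate(5,
  data.frame(RACE = sample(1:5, 30, replace = TRUE)),
  simplify = FALSE)

# New variable name -> RACE code it flags
race_codes <- c(white = 1, black = 2, asian = 3, hispanic = 4, raceother = 5)

# Returns a modified copy; the argument itself is left untouched
add_race_dummies <- function(d) {
  for (nm in names(race_codes)) {
    d[[nm]] <- as.integer(d$RACE == race_codes[[nm]])
  }
  d
}

# Assigning the result back is what makes the change stick
misets <- lapply(misets, add_race_dummies)
head(misets[[1]])
```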
[R] recoding data with loops
# I'm new to R and am trying to get the hang of how it handles
# dataframes & loops. If anyone can help me with some simple tasks,
# I'd be much obliged.

# First, I'd like to generate some random data in a dataframe
# to efficiently illustrate what I'm up to.
# Let's say I have six variables as listed below (I really
# have hundreds, but a few will illustrate the point).
# I want to generate my dataframe (mdf)
# with the 6 variables X 100 values with rnorm(7).
# How do I do this? I tried many variations on the following:

var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
for(i in 1:length(var_list)) {var_list[1] <- rnorm(100)}
mdf <- data.frame(cbind(varlist[1:length(var_list)])
mdf

# Then, I'd like to recode the variables that begin with the letter H.
# I've tried many variations of the following, but to no avail:

reverse_list <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
reversed_list <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
for(i in 1:length(reverse_list)) {mdf[ ,e_reversed_list][[i]] <- recode(mdf[ ,e_reverse_list][[i]], '5:99=NA; 1=4; 2=3; 3=2; 4=1; ', as.factor.result=FALSE)

# I'm sure I have many deep misunderstandings about the R language, but
# if I can get this much done, I think I'll be well on my way to understanding R
# and how it works with loops and dataframes.
# Thanks!
Re: [R] recoding data with loops
Thanks for the quick response! I have done so. I have tried many configurations of the various examples given there, but the examples are pretty short and none explain how to loop through nonconsecutive variables in a data frame. I've also read dozens of pages that come up when I google "data.frame rnorm" and "data.frame loops", but to no avail.

On Mon, May 19, 2008 at 3:54 PM, Bert Gunter [EMAIL PROTECTED] wrote:
> If you're serious, start by reading the docs, especially An Introduction to R. There are also other learning resources listed on CRAN.
>
> -- Bert Gunter, Genentech
Re: [R] recoding data with loops
Many thanks -- You are right; I had rnorm() and sample() mixed up in my code. I'll work on generating a normal ordinal sample next.

Cheers, Don

On Mon, May 19, 2008 at 4:07 PM, Erik Iverson [EMAIL PROTECTED] wrote:
> Hello -
>
> Donald Braman wrote:
>> # I want to generate my dataframe (mdf)
>> # with the 6 variables X 100 values with rnorm(7).
>> # How do I do this? I tried many variations on the following:
>> var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
>> for(i in 1:length(var_list)) {var_list[1] <- rnorm(100)}
>> mdf <- data.frame(cbind(varlist[1:length(var_list)])
>> mdf
>
> There are many ways to do this. Do you mean that you want 6 columns, 100 observations in each column, each a sample from a normal distribution with mean = 7 and sd = 1? You can do this without looping in one of several ways. If you are coming from a SAS environment (my guess since you talk of looping over data.frames), you may be used to looping through a data object. In R, you can usually avoid this since many functions are vectorized, or take a 'whole object' approach.
>
>   var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
>   mdf <- data.frame(replicate(6, rnorm(100, 7)))  ## generate random data
>   names(mdf)              ## default names
>   names(mdf) <- var_list  ## use our names
>
>> # Then, I'd like to recode the variables that begin with the letter H.
>> # I've tried many variations of the following, but to no avail:
>> reverse_list <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
>> reversed_list <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
>> for(i in 1:length(reverse_list)) {mdf[ ,e_reversed_list][[i]] <- recode(mdf[ ,e_reverse_list][[i]], '5:99=NA; 1=4; 2=3; 3=2; 4=1; ', as.factor.result=FALSE)
>
> I'm not quite sure what you are after here. What do you mean by recode? What package is your 'recode' function located in? It appears that you may be under the impression that the data.frame contains integers, but it certainly will not, since it was generated with rnorm. sample can generate samples of the type you may be after, for example:
>
>   sample(7, 100, replace = TRUE)
>
> Best, Erik Iverson
Re: [R] recoding data with loops
Erik,

Your example was just what I needed to generate the data -- many, many thanks! The names() function was something I had not grasped fully. I now have this and it works very nicely:

  var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")
  mdf <- data.frame(replicate(length(var_list), sample(7, 100, replace = TRUE)))  ## generate random data
  names(mdf)              ## default names
  names(mdf) <- var_list  ## use our names
  mdf

I'm still trying to figure out how to recode (using the car package) data into new variables using a similar loop. Basically, I'm not sure how to call the variable name and append it to the dataframe name in a loop. In Stata I'd do this using single quotes, but clearly that's not how R works. I tried several variations on this:

  reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
  reversed_varnames <- c("RHEQUAL", "RHREVDIS1", "RHREVDIS2")
  for(i in 1:length(reverse_me_varnames)) {mdf$reversed_varnames[i] <- recode(mdf$reverse_me_varnames[i], '5:7=NA; 1=4; 2=3; 3=2; 4=1;', as.factor.result=FALSE)

While I don't get an error message, the data don't change. Any advice on reverse coding non-contiguous variables?
Re: [R] recoding data with loops
Many, many thanks Erik! For anyone who is searching around looking for a way to recode in R, here's the full code Erik provided:

  var_list <- c("HEQUAL", "EWEALTH", "ERADEQ", "HREVDIS1", "EDISCRIM", "HREVDIS2")  ## my original list of variables
  mdf <- data.frame(replicate(length(var_list), sample(7, 100, replace = TRUE)))  ## generate 100 records of random numbers sampled from 1:7
  names(mdf)              ## unnecessary, but helpful to see what R supplies as default names
  names(mdf) <- var_list  ## substitutes my variable names
  mdf                     ## lovely!
  reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")  ## these are the variables I want to reverse code
  reversed_varnames <- paste("R", reverse_me_varnames, sep = "")  ## this generates the names of the reversed variables by tacking on an R
  mdf[reversed_varnames] <- lapply(mdf[reverse_me_varnames], function(x) recode(x, recodes = "5:7=NA; 1=4; 2=3; 3=2; 4=1;", as.factor.result = FALSE))  ## this applies the recode function to all the variables I want to recode and stores them in the new R___ variables
  mdf                     ## lovely!

I really like that R doesn't even need to use loops to do this -- seems very efficient to me!

On Mon, May 19, 2008 at 6:49 PM, Erik Iverson [EMAIL PROTECTED] wrote:
> Got it, I did not know of the 'recode' function in car. So you would like to recode those specific columns then? Once again, we can do it without a loop, this time with the help of a function called lapply, which applies a function to each item in a list in turn. Try:
>
>   reverse_me_varnames <- c("HEQUAL", "HREVDIS1", "HREVDIS2")
>   reversed_varnames <- paste("R", reverse_me_varnames, sep = "")  ## See ?paste
>   mdf[reversed_varnames] <- lapply(mdf[reverse_me_varnames], function(x) recode(x, recodes = "5:7=NA; 1=4; 2=3; 3=2; 4=1;", as.factor.result = FALSE))
>
> Now what does this actually mean? To the left of '<-' are simply the new columns of our data.frame. We then want to use lapply to apply some function to a list of objects. The first argument to lapply is that list. In this case, it is simply the columns of the data.frame you want reversed. A data.frame is a list in R. See ?list and ?data.frame.
>
> Then, the next argument to lapply is a function that we want to perform on each element in our list. So, we create a function that accepts as input a variable I simply call 'x'. This 'x' is going to be an item from the list we passed lapply, which is one of the columns of mdf in 'reverse_me_varnames'. We then use the recode function in the car package to recode x, in a similar way to what you tried before.
>
> This function of x we define will get called three times in the above example, once for each of reverse_me_varnames. It will then assign those three new columns to the left-hand side of the <- operator, which are three newly-named columns.
>
> To see why what you tried before with the for loop did not work, try:
>
>   mdf$HEQUAL
>
> contrasted with
>
>   t1 <- c("HEQUAL")
>   mdf$t1
>
> From the help for ?Extract, $ does not allow 'computed' indices.
>
> I hope this helps!
> Erik
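The point about `$` and computed indices can be checked on a tiny example. The sketch below also shows a reverse-coding step written with plain arithmetic, as an alternative to car's recode() for readers who do not have that package loaded; the helper `reverse_4pt()` and the data are made up for illustration.

```r
# Small made-up data frame of 1..7 responses
set.seed(4)
mdf <- data.frame(HEQUAL  = sample(7, 10, replace = TRUE),
                  EWEALTH = sample(7, 10, replace = TRUE))

col <- "HEQUAL"
is.null(mdf$col)                   # `$` looks for a column literally named "col"
identical(mdf[[col]], mdf$HEQUAL)  # `[[` uses the value stored in `col`

# Reverse a 1..4 scale (1<->4, 2<->3); anything else becomes NA,
# mirroring the '5:7=NA; 1=4; 2=3; 3=2; 4=1' recode specification
reverse_4pt <- function(x) ifelse(x %in% 1:4, 5L - x, NA_integer_)

old <- c("HEQUAL")
new <- paste0("R", old)
mdf[new] <- lapply(mdf[old], reverse_4pt)
mdf
```

Because the columns are selected by a character vector, they do not need to be adjacent in the data frame.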