Re: [R] Populating then sorting a matrix and/or data.frame
All values in a matrix are the same type, so if you've set up a matrix with a character column then your numeric values will also be stored as character. That would explain why they are being converted to factors. It would also explain why your query isn't working.

Michael

On 11 November 2010 18:40, Noah Silverman n...@smartmediacorp.com wrote:
> That was a typo. It should have read:
>
>     results[results$one 100,]
>
> It does still fail.
>
> There is ONE column that is text. So my guess is that R is seeing that and assuming that the entire data.frame should be factors.
>
> -N
>
> On 11/10/10 11:16 PM, Michael Bedward wrote:
>> Hello Noah,
>>
>> If you set these names...
>>
>>     names(results) <- c("one", "two", "three")
>>
>> this won't work...
>>
>>     results[results$c 100,]
>>
>> because you don't have a column called c (unless that's just a typo in your post).
>>
>>> I tried making it a data.frame with
>>>     foo <- data.frame(results)
>>> But that converted all the numeric values to factors!!!
>>
>> Not sure what's going on there. If 'results' is a numeric matrix you should get a data.frame with numeric cols since under the hood this is just calling the as.data.frame function.
>>
>> Michael
>>
>> On 11 November 2010 16:02, Noah Silverman n...@smartmediacorp.com wrote:
>>> Hi,
>>>
>>> I have a process in R that produces a lot of output. My plan was to build up a matrix or data.frame row by row, so that I'll have a nice object with all the resulting data.
>>>
>>> I started with:
>>>
>>>     results <- matrix(ncol=3)
>>>     names(results) <- c("one", "two", "three")
>>>
>>> Then, when looping through the data:
>>>
>>>     results <- rbind(results, c(a,b,c))
>>>
>>> This seems to work fine. BUT, my problem arises when I want to filter, sort, etc.
>>>
>>> I tried (thinking like a data.frame):
>>>
>>>     results[results$c 100,]
>>>
>>> But that fails.
>>>
>>> I tried making it a data.frame with
>>>
>>>     foo <- data.frame(results)
>>>
>>> But that converted all the numeric values to factors!!! Which causes a whole mess of problems.
>>>
>>> Any ideas??
>>>
>>> -N

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
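The point about matrices holding a single type can be checked with a small sketch. The values and the column names below are made up for illustration and are not the poster's data. Note also that `names()` on a matrix sets element names rather than column names, so the sketch uses `colnames()`, and that `$` is not defined for matrices, so columns are selected with `[, "name"]`.

```r
# Binding rows that mix a label with numbers gives a character matrix:
# every element, including the numbers, is stored as a string.
m <- rbind(c("A", 1, 5), c("B", 2, 8))
colnames(m) <- c("one", "two", "three")   # colnames(), not names(), for a matrix
typeof(m)

# Index matrix columns by name with [, "two"]; m$two would be an error.
two_num <- as.numeric(m[, "two"])

# Building the data.frame column by column keeps the numeric columns numeric.
df <- data.frame(one   = m[, "one"],
                 two   = two_num,
                 three = as.numeric(m[, "three"]),
                 stringsAsFactors = FALSE)
subset_rows <- df[df$two > 1, ]
```

With typed columns in place, filtering on `df$two` behaves like ordinary numeric comparison.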
Re: [R] Populating then sorting a matrix and/or data.frame
That makes perfect sense.

Since I need to build up the results table sequentially as I iterate through the data, how would you recommend it??

Thanks,

-N
Re: [R] Populating then sorting a matrix and/or data.frame
You can use rbind as in your original post, but if you've got a mix of character and numeric data start with a data.frame rather than a matrix.

Michael
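One possible reading of this advice, as a minimal sketch: each iteration builds a one-row data.frame with typed columns and binds that, instead of binding a vector made with `c()`. The column names `label`, `x` and `y` and the loop body are hypothetical stand-ins for the poster's real computation.

```r
results <- data.frame()
for (i in 1:3) {
  # One typed row per iteration: a label plus two numbers.
  row <- data.frame(label = LETTERS[i],
                    x = i,
                    y = 3 * i + 2,
                    stringsAsFactors = FALSE)
  results <- rbind(results, row)
}

# The numeric columns stay numeric, so filtering works as expected.
big <- results[results$y > 5, ]
```

Growing a data.frame one row at a time copies it on every iteration, so this is convenient rather than fast.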
Re: [R] Populating then sorting a matrix and/or data.frame
Still doesn't work.

When using rbind to build the data.frame, I get a structure mostly full of NA. The data is correct, so something about pushing into the data.frame is breaking.

Example code:

    results <- data.frame()
    for(i in 1:n){
      # do all the work
      # a is a test label. b,c,d are numeric.
      results <- rbind(results, c(a,b,c,d))
    }
Re: [R] Populating then sorting a matrix and/or data.frame
On Thu, Nov 11, 2010 at 11:33 AM, Noah Silverman n...@smartmediacorp.com wrote:
> Still doesn't work.
>
> When using rbind to build the data.frame, it get a structure mostly full of NA.

Works for me:

    results <- data.frame()
    n = 10
    for(i in 1:n){
      a = LETTERS[i];
      b = i;
      c = 3*i + 2
      d = rnorm(1);
      results <- rbind(results, c(a,b,c,d))
    }
    results

       X.A. X.1. X.5. X.0.142223304589023.
    1     A    1    5    0.142223304589023
    2     B    2    8    0.243612305595176
    3     C    3   11    0.476795513990516
    4     D    4   14      1.0278220664213
    5     E    5   17    0.916608672305205
    6     F    6   20     1.61075985995586
    7     G    7   23    0.370423691258896
    8     H    8   26  -0.0528603547004191
    9     I    9   29    -2.07888666920403
    10    J   10   32    -1.87980721733655

Maybe there's something wrong with the calculation you do?

Peter
Re: [R] Populating then sorting a matrix and/or data.frame
Peter,

Your example doesn't work for me unless I set options(stringsAsFactors=TRUE) first. (If I do set that, then all columns of 'results' have class character, which I doubt the user wants.)

    > results <- data.frame()
    > n = 10
    > for(i in 1:n){
    +    a = LETTERS[i];
    +    b = i;
    +    c = 3*i + 2
    +    d = rnorm(1);
    +    results <- rbind(results, c(a,b,c,d))
    + }
    There were 36 warnings (use warnings() to see them)
    > warnings()[1:5]
    $`invalid factor level, NAs generated`
    `[<-.factor`(`*tmp*`, ri, value = "B")

    $`invalid factor level, NAs generated`
    `[<-.factor`(`*tmp*`, ri, value = "2")

    $`invalid factor level, NAs generated`
    `[<-.factor`(`*tmp*`, ri, value = "8")

    $`invalid factor level, NAs generated`
    `[<-.factor`(`*tmp*`, ri, value = "-0.305558353507095")

    $`invalid factor level, NAs generated`
    `[<-.factor`(`*tmp*`, ri, value = "C")

    > results
       X.A. X.1. X.5. X.1.43055780028799.
    1     A    1    5    1.43055780028799
    2  <NA> <NA> <NA>                <NA>
    3  <NA> <NA> <NA>                <NA>
    4  <NA> <NA> <NA>                <NA>
    5  <NA> <NA> <NA>                <NA>
    6  <NA> <NA> <NA>                <NA>
    7  <NA> <NA> <NA>                <NA>
    8  <NA> <NA> <NA>                <NA>
    9  <NA> <NA> <NA>                <NA>
    10 <NA> <NA> <NA>                <NA>

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
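The "invalid factor level" warnings in this message are consistent with how factors behave in general: a factor only accepts values that are already among its levels. A minimal sketch of that mechanism, independent of the loop above (the vectors `f` and `s` are made up for illustration):

```r
f <- factor("A")                 # a factor whose only level is "A"
suppressWarnings(f[2] <- "B")    # "B" is not a level, so R warns and stores NA
f_missing <- is.na(f)

s <- "A"                         # a plain character vector has no fixed levels
s[2] <- "B"                      # so a new value is simply stored
```

When the first bound row is turned into factor columns, every later row runs into the same restriction, which would explain a first row of data followed by rows of NA.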
Re: [R] Populating then sorting a matrix and/or data.frame
--- On Thu, 11/11/10, William Dunlap wdun...@tibco.com wrote:
> Peter,
>
> Your example doesn't work for me unless I set options(stringsAsFactors=TRUE) first.

Don't you mean stringsAsFactors=FALSE here? At least I get the same results you do but with stringsAsFactors=FALSE. The TRUE condition is giving multiple NAs and error messages.

> (If I do set that, then all columns of 'results' have class character, which I doubt the user wants.)
Re: [R] Populating then sorting a matrix and/or data.frame
On Thu, Nov 11, 2010 at 1:19 PM, William Dunlap wdun...@tibco.com wrote:
> Peter,
>
> Your example doesn't work for me unless I set options(stringsAsFactors=TRUE) first. (If I do set that, then all columns of 'results' have class character, which I doubt the user wants.)

You probably mean stringsAsFactors=FALSE. What you say makes sense, because the c() function produces a vector in which all components have the same type, and it will be character.

If you don't want to have characters, my solution would be

    n = 10
    results <- data.frame(a = rep("", n), b = rep(0, n), c = rep(0, n), d = rep(0, n))
    for(i in 1:n){
      a = LETTERS[i];
      b = i;
      c = 3*i + 2
      d = rnorm(1);
      results$a[i] = a
      results$b[i] = b
      results$c[i] = c
      results$d[i] = d
    }
    results

       a  b  c           d
    1  A  1  5 -1.31553805
    2  B  2  8  0.09198054
    3  C  3 11 -0.05860804
    4  D  4 14  0.77796136
    5  E  5 17  1.28924697
    6  F  6 20  0.47631483
    7  G  7 23 -1.23727076
    8  H  8 26  0.83595295
    9  I  9 29  0.69435349
    10 J 10 32 -0.30922930

    mode(results[, 1])
    [1] "character"
    mode(results[, 2])
    [1] "numeric"
    mode(results[, 3])
    [1] "numeric"
    mode(results[, 4])
    [1] "numeric"

or alternatively

    n = 10
    num <- data.frame(b = rep(0, n), c = rep(0, n), d = rep(0, n))
    labels = rep("", n);
    for(i in 1:n){
      a = LETTERS[i];
      b = i;
      c = 3*i + 2
      d = rnorm(1);
      labels[i] = a
      num[i, ] = c(b, c, d)
    }
    results = data.frame(a = labels, num)
    results

       a  b  c           d
    1  A  1  5 -0.47150097
    2  B  2  8 -1.30507313
    3  C  3 11 -1.09860425
    4  D  4 14  0.91326330
    5  E  5 17 -0.09732841
    6  F  6 20 -0.75134162
    7  G  7 23  0.31360908
    8  H  8 26 -1.54406716
    9  I  9 29 -0.36075743
    10 J 10 32 -0.23758269

    mode(results[, 1])
    [1] "character"
    mode(results[, 2])
    [1] "numeric"
    mode(results[, 3])
    [1] "numeric"
    mode(results[, 4])
    [1] "numeric"

Peter
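The remark about `c()` forcing a single type can be seen directly. This is a small sketch with made-up values; `row_c` and `row_list` are illustrative names, not objects from the thread.

```r
# c() returns an atomic vector, so one string turns every element into a string.
row_c <- c("A", 1, 5, 0.25)
class(row_c)

# list() keeps each element's own type.
row_list <- list(a = "A", b = 1, c = 5, d = 0.25)
types <- sapply(row_list, class)

# A named list converts to a one-row data.frame with typed columns.
one_row <- as.data.frame(row_list, stringsAsFactors = FALSE)
```

This is why a row assembled with `c(a, b, c, d)` arrives in the data.frame as four strings.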
Re: [R] Populating then sorting a matrix and/or data.frame
You are right, I mistyped it.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Re: [R] Populating then sorting a matrix and/or data.frame
On Thu, Nov 11, 2010 at 1:19 PM, William Dunlap wdun...@tibco.com wrote:
> Peter,
>
> Your example doesn't work for me unless I set options(stringsAsFactors=TRUE) first.

Yes, you need to set options(stringsAsFactors=FALSE) (note the FALSE). I do it always so I forgot about that, sorry.
Re: [R] Populating then sorting a matrix and/or data.frame
Your errors look exactly like mine.

Changing the option flag does allow me to create the data.frame without any errors. A quick look confirms that all the values are there and correct. However, R has coerced all of my numeric values to strings.

Using your sample code also turns all the numeric values to strings. Perhaps it has something to do with the way R is interpreting the first column?

Thanks!

-N
Re: [R] Populating then sorting a matrix and/or data.frame
That makes perfect sense. All of my numbers are being coerced into strings by the c() function. Subsequently, my data.frame contains all strings.

I can't know the length of the data.frame ahead of time, so can't predefine it like your example.

One thought would be to make it arbitrarily long filled with 0 and delete off the unused rows. But this seems rather wasteful.

-N
Re: [R] Populating then sorting a matrix and/or data.frame
I see 4 ways to write the code:

1. make the frame very long at the start and use my code - this is practical if you know that your data frame will not be longer than a certain number of rows, be it a million;

2a. use something like

    result1 = data.frame(a=a, b=b, c=c, d=d)

within the loop to create a 1x4 data frame that you can rbind to results within the loop;

2b. make the code a bit more intelligent, for example by allocating blocks of say n=1000 at a time as needed and rbind-ing them to result;

3. fill up results with characters using your rbind(results, c(a,b,c,d)), then use something like

    results[, c(2:4)] = apply(apply(results[, c(2:4)], 2, as.character), 2, as.numeric)

to convert the characters in columns 2:4 to numbers (this construct also works with factors).

The difference between 2a and 2b is that 2b may be faster if n is large, because 2a grows 4 objects by 1 unit n times, which is quite slow. The same holds for solution 3. In that sense solution 1 may be less wasteful than solutions 2a or 3 although it may not look like that.

Peter
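Option 2b is described only in words, so here is one way it might look. This is a sketch under assumptions: the helper `empty_block()`, the counter `used` and the block size of 4 are all hypothetical (the post suggests blocks of about 1000), and the loop body stands in for the real work.

```r
block <- 4   # rows allocated at a time; tiny here so the growth step is exercised

# Hypothetical helper: a block of n empty rows with the right column types.
empty_block <- function(n) {
  data.frame(a = character(n), b = numeric(n), c = numeric(n), d = numeric(n),
             stringsAsFactors = FALSE)
}

results <- empty_block(block)
used <- 0                         # number of rows filled so far
for (i in 1:10) {
  if (used == nrow(results)) {    # out of space: append another empty block
    results <- rbind(results, empty_block(block))
  }
  used <- used + 1
  results$a[used] <- LETTERS[i]
  results$b[used] <- i
  results$c[used] <- 3 * i + 2
  results$d[used] <- rnorm(1)
}
results <- results[seq_len(used), ]   # drop the unused rows at the end
```

The frame is copied once per block instead of once per row, which is the saving the post points to when n is large.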
Re: [R] Populating then sorting a matrix and/or data.frame
On Nov 11, 2010, at 6:38 PM, Noah Silverman wrote:
> That makes perfect sense. All of my numbers are being coerced into strings by the c() function. Subsequently, my data.frame contains all strings.
>
> I can't know the length of the data.frame ahead of time, so can't predefine it like your example. One thought would be to make it arbitrarily long filled with 0 and delete off the unused rows. But this seems rather wasteful.

It might be faster, though. Here is a non-c() method that uses the list function instead (with options(stringsAsFactors=FALSE)). list() does not coerce its elements to the same mode, and rbind.data.frame will accept a list as a row argument:

    results <- data.frame(a=vector(mode="character", length=0),
                          b=vector(mode="numeric", length=0),
                          cc=vector(mode="numeric", length=0),  # note: avoid c as name
                          d=vector(mode="numeric", length=0))
    n = 10
    for(i in 1:n){
      a = LETTERS[i]; b = i; cc = 3*i + 2
      d = rnorm(1)
      results <- rbind(results, list(a=a, b=b, cc=cc, d=c))
    }
    results
       a  b cc  d
    2  A  1  5  5
    21 B  2  8  8
    3  C  3 11 11
    4  D  4 14 14
    5  E  5 17 17
    6  F  6 20 20
    7  G  7 23 23
    8  H  8 26 26
    9  I  9 29 29
    10 J 10 32 32

OOOPs: I used d=c, and there was a c vector hanging around to be picked up. The last argument should have been d=d.

-- 
David.

> -N
>
> On 11/11/10 2:02 PM, Peter Langfelder wrote:
>> On Thu, Nov 11, 2010 at 1:19 PM, William Dunlap wdun...@tibco.com wrote:
>>> Peter,
>>>
>>> Your example doesn't work for me unless I set options(stringsAsFactors=TRUE) first. (If I do set that, then all columns of 'results' have class character, which I doubt the user wants.)
>>
>> You probably mean stringsAsFactors=FALSE. What you say makes sense, because the c() function produces a vector in which all components have the same type, and it will be character.
>>
>> If you don't want to have characters, my solution would be
>>
>>     n = 10
>>     results <- data.frame(a = rep("", n), b = rep(0, n), c = rep(0, n), d = rep(0, n))
>>     for(i in 1:n){
>>       a = LETTERS[i]; b = i; c = 3*i + 2
>>       d = rnorm(1)
>>       results$a[i] = a
>>       results$b[i] = b
>>       results$c[i] = c
>>       results$d[i] = d
>>     }
>>     results
>>        a  b  c           d
>>     1  A  1  5 -1.31553805
>>     2  B  2  8  0.09198054
>>     3  C  3 11 -0.05860804
>>     4  D  4 14  0.77796136
>>     5  E  5 17  1.28924697
>>     6  F  6 20  0.47631483
>>     7  G  7 23 -1.23727076
>>     8  H  8 26  0.83595295
>>     9  I  9 29  0.69435349
>>     10 J 10 32 -0.30922930
>>     mode(results[, 1])
>>     [1] "character"
>>     mode(results[, 2])
>>     [1] "numeric"
>>     mode(results[, 3])
>>     [1] "numeric"
>>     mode(results[, 4])
>>     [1] "numeric"
>>
>> or alternatively
>>
>>     n = 10
>>     num <- data.frame(b = rep(0, n), c = rep(0, n), d = rep(0, n))
>>     labels = rep("", n)
>>     for(i in 1:n){
>>       a = LETTERS[i]; b = i; c = 3*i + 2
>>       d = rnorm(1)
>>       labels[i] = a
>>       num[i, ] = c(b, c, d)
>>     }
>>     results = data.frame(a = labels, num)
>>     results
>>        a  b  c           d
>>     1  A  1  5 -0.47150097
>>     2  B  2  8 -1.30507313
>>     3  C  3 11 -1.09860425
>>     4  D  4 14  0.91326330
>>     5  E  5 17 -0.09732841
>>     6  F  6 20 -0.75134162
>>     7  G  7 23  0.31360908
>>     8  H  8 26 -1.54406716
>>     9  I  9 29 -0.36075743
>>     10 J 10 32 -0.23758269
>>     mode(results[, 1])
>>     [1] "character"
>>     mode(results[, 2])
>>     [1] "numeric"
>>     mode(results[, 3])
>>     [1] "numeric"
>>     mode(results[, 4])
>>     [1] "numeric"
>>
>> Peter

David Winsemius, MD
West Hartford, CT
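For reference, a sketch of the list-based rbind approach from this message with the d=c slip corrected to d=d. It assumes a reasonably recent R; stringsAsFactors = FALSE is passed explicitly to data.frame() so that the sketch does not depend on the global option. The loop length of 10 and the column names follow the message.

```r
# Zero-row data frame with an explicit type for every column.
results <- data.frame(a  = character(0),
                      b  = numeric(0),
                      cc = numeric(0),   # "cc" rather than "c", to avoid masking c()
                      d  = numeric(0),
                      stringsAsFactors = FALSE)

n <- 10
for (i in 1:n) {
  # A list keeps each element's own type, unlike c(), which would
  # coerce everything to character.
  results <- rbind(results,
                   list(a = LETTERS[i], b = i, cc = 3 * i + 2, d = rnorm(1)))
}

sapply(results, class)   # one class per column
```

Growing a data frame one row at a time is still slow for long loops, so this is most suitable when the number of rows is modest.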
Re: [R] Populating then sorting a matrix and/or data.frame
David,

Great solution. While a bit longer to enter, it lets me explicitly define a type for each column.

Thanks!!!

-N

On 11/11/10 4:02 PM, David Winsemius wrote:
> On Nov 11, 2010, at 6:38 PM, Noah Silverman wrote:
>> That makes perfect sense. All of my numbers are being coerced into strings by the c() function. Subsequently, my data.frame contains all strings.
>>
>> I can't know the length of the data.frame ahead of time, so can't predefine it like your example. One thought would be to make it arbitrarily long filled with 0 and delete off the unused rows. But this seems rather wasteful.
>
> Although it might be faster, though. Here is a non-c() method using instead the list function (with options(stringsAsFactors=FALSE)). List does not coerce to same mode and rbind.data.frame will accept a list as a row argument:
>
>     results <- data.frame(a=vector(mode="character", length=0),
>                           b=vector(mode="numeric", length=0),
>                           cc=vector(mode="numeric", length=0),  # note: avoid c as name
>                           d=vector(mode="numeric", length=0))
>     n = 10
>     for(i in 1:n){
>       a = LETTERS[i]; b = i; cc = 3*i + 2
>       d = rnorm(1)
>       results <- rbind(results, list(a=a, b=b, cc=cc, d=c))
>     }
>     results
>        a  b cc  d
>     2  A  1  5  5
>     21 B  2  8  8
>     3  C  3 11 11
>     4  D  4 14 14
>     5  E  5 17 17
>     6  F  6 20 20
>     7  G  7 23 23
>     8  H  8 26 26
>     9  I  9 29 29
>     10 J 10 32 32
>
> OOOPs used d=c and there was a c vector hanging around to be picked up.
[R] Populating then sorting a matrix and/or data.frame
Hi,

I have a process in R that produces a lot of output. My plan was to build up a matrix or data.frame row by row, so that I'll have a nice object with all the resulting data.

I started with:

    results <- matrix(ncol=3)
    names(results) <- c("one", "two", "three")

Then, when looping through the data:

    results <- rbind(results, c(a,b,c))

This seems to work fine. BUT, my problem arises when I want to filter, sort, etc.

I tried (thinking like a data.frame):

    results[results$c 100,]

But that fails.

I tried making it a data.frame with

    foo <- data.frame(results)

But that converted all the numeric values to factors!!! Which causes a whole mess of problems.

Any ideas??

-N
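The symptom described in this post can be reproduced in a few lines. This is only a sketch with made-up values ("A", 1, 5 and so on are assumptions); it shows that c() and matrices hold a single type, so one text value turns every number in the row into a string.

```r
# Mixing text and numbers in c() coerces every element to character.
row <- c("A", 1, 5)
class(row)

# A matrix also holds a single type, so rbind-ing such rows
# gives a character matrix.
results <- rbind(c("A", 1, 5), c("B", 2, 8))

# colnames(), not names(), is what labels the columns of a matrix.
colnames(results) <- c("one", "two", "three")
is.character(results)

# Numbers stored as strings have to be converted back explicitly.
as.numeric(results[, "two"])
```

In the R versions current at the time of the thread, data.frame() turned character columns into factors by default, which is why the converted object appeared to contain only factors.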
Re: [R] Populating then sorting a matrix and/or data.frame
Hello Noah,

If you set these names...

    names(results) <- c("one", "two", "three")

this won't work...

    results[results$c 100,]

because you don't have a column called c (unless that's just a typo in your post).

> I tried making it a data.frame with
> foo <- data.frame(results)
> But that converted all the numeric values to factors!!!

Not sure what's going on there. If 'results' is a numeric matrix you should get a data.frame with numeric cols, since under the hood this is just calling the as.data.frame function.

Michael

On 11 November 2010 16:02, Noah Silverman n...@smartmediacorp.com wrote:
> Hi,
>
> I have a process in R that produces a lot of output. My plan was to build up a matrix or data.frame row by row, so that I'll have a nice object with all the resulting data.
>
> I started with:
>
>     results <- matrix(ncol=3)
>     names(results) <- c("one", "two", "three")
>
> Then, when looping through the data:
>
>     results <- rbind(results, c(a,b,c))
>
> This seems to work fine. BUT, my problem arises when I want to filter, sort, etc.
>
> I tried (thinking like a data.frame):
>
>     results[results$c 100,]
>
> But that fails.
>
> I tried making it a data.frame with
>
>     foo <- data.frame(results)
>
> But that converted all the numeric values to factors!!! Which causes a whole mess of problems.
>
> Any ideas??
>
> -N
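A short sketch of the point made in this reply, using made-up numbers: a purely numeric matrix converts to a data frame with numeric columns. It also shows name-based indexing on the matrix itself, since $ is not available for matrices. The values and the > 100 comparison are assumptions chosen for illustration.

```r
# A purely numeric matrix with column names set via colnames().
m <- matrix(c(  1,   2,   3,
              150, 250, 350), ncol = 3, byrow = TRUE)
colnames(m) <- c("one", "two", "three")

# Conversion keeps the columns numeric.
df <- as.data.frame(m)
sapply(df, class)

# On the matrix itself, select rows by indexing the named column;
# drop = FALSE keeps the result a matrix even when one row matches.
m[m[, "one"] > 100, , drop = FALSE]
```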
Re: [R] Populating then sorting a matrix and/or data.frame
That was a typo. It should have read:

    results[results$one 100,]

It does still fail.

There is ONE column that is text. So my guess is that R is seeing that and assuming that the entire data.frame should be factors.

-N

On 11/10/10 11:16 PM, Michael Bedward wrote:
> Hello Noah,
>
> If you set these names...
>
>     names(results) <- c("one", "two", "three")
>
> this won't work...
>
>     results[results$c 100,]
>
> because you don't have a column called c (unless that's just a typo in your post).
>
>> I tried making it a data.frame with
>> foo <- data.frame(results)
>> But that converted all the numeric values to factors!!!
>
> Not sure what's going on there. If 'results' is a numeric matrix you should get a data.frame with numeric cols, since under the hood this is just calling the as.data.frame function.
>
> Michael
>
> On 11 November 2010 16:02, Noah Silverman n...@smartmediacorp.com wrote:
>> Hi,
>>
>> I have a process in R that produces a lot of output. My plan was to build up a matrix or data.frame row by row, so that I'll have a nice object with all the resulting data.
>>
>> I started with:
>>
>>     results <- matrix(ncol=3)
>>     names(results) <- c("one", "two", "three")
>>
>> Then, when looping through the data:
>>
>>     results <- rbind(results, c(a,b,c))
>>
>> This seems to work fine. BUT, my problem arises when I want to filter, sort, etc.
>>
>> I tried (thinking like a data.frame):
>>
>>     results[results$c 100,]
>>
>> But that fails.
>>
>> I tried making it a data.frame with
>>
>>     foo <- data.frame(results)
>>
>> But that converted all the numeric values to factors!!! Which causes a whole mess of problems.
>>
>> Any ideas??
>>
>> -N
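Once the numeric columns really are numeric, filtering and sorting work as expected even with a text column alongside them. The sketch below uses invented data; the column named label, the threshold of 100 and the direction of the comparison are assumptions, because the comparison operator did not survive in the archived message.

```r
# A data frame with one text column and two numeric columns.
results <- data.frame(one   = c(250, 50, 120),
                      two   = c(3, 1, 2),
                      label = c("x", "y", "z"),
                      stringsAsFactors = FALSE)

# Filter: keep the rows where column "one" exceeds the threshold.
big <- results[results$one > 100, ]

# Sort: order() returns the row permutation that sorts by "one".
sorted <- big[order(big$one), ]
sorted
```

Passing stringsAsFactors = FALSE keeps the text column as character, so it does not interfere with the numeric columns.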