Re: [R] loop testing unidentified columns
Thank you!

> On Jun 20, 2016, at 12:41 PM, David L Carlson <dcarl...@tamu.edu> wrote:
>
> It does not test the first column, but a vector must have consecutive
> indices. Since you did not assign a value, R inserts a missing value. If you
> don't want to see it use
>
> > results.pc.all[, -1]
>           [,1] [,2]
> results.2    1    2
> results.3    2    3
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Brittany Demmitt
> Sent: Monday, June 20, 2016 12:15 PM
> To: r-help@r-project.org
> Subject: [R] loop testing unidentified columns
>
> Hello,
>
> I want to compare all of the columns of one data frame to another to see if
> any of the columns are equivalent to one another. The first column in both of
> my data frames holds the sample IDs and does not need to be compared. Below is
> an example of the loop I am using to compare the two data frames; it counts
> the number of equivalent values between two columns. So in this example the
> value of 3 means that all three observations for the two columns being
> compared were equivalent. The loop works fine but I do not understand why it
> tests the first column of the sample IDs, providing “NA” for the sum of
> matching, when my loop is specifying to only test columns 2-3.
>
> Thank you!
>
> #create dataframe A
> A = matrix(c("a",3,4,"b",5,7,"c",3,7),nrow=3, ncol=3,byrow = TRUE)
> A <- as.data.frame(A)
> A$V2 <- as.numeric(A$V2)
> A$V3 <- as.numeric(A$V3)
> str(A)
>
> #create dataframe B
> B = matrix(c("a",1,1,"b",6,2,"c",2,2),nrow=3, ncol=3,byrow = TRUE)
> B <- as.data.frame(B)
> B$V2 <- as.numeric(B$V2)
> B$V3 <- as.numeric(B$V3)
> str(B)
>
> results.2 <- numeric()
> results.3 <- numeric()
>
> #compare columns to identify those that are identical in the two dataframes
> for(i in 2:3){
>   results.2[i] <- sum(A[,2]==B[,i])
>   results.3[i] <- sum(A[,3]==B[,i])
>   results.pc.all <- rbind(results.2,results.3)
> }
> results.pc.all

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] loop testing unidentified columns
Hello,

I want to compare all of the columns of one data frame to another to see if any of the columns are equivalent to one another. The first column in both of my data frames holds the sample IDs and does not need to be compared. Below is an example of the loop I am using to compare the two data frames; it counts the number of equivalent values between two columns. So in this example the value of 3 means that all three observations for the two columns being compared were equivalent. The loop works fine but I do not understand why it tests the first column of the sample IDs, providing “NA” for the sum of matching, when my loop is specifying to only test columns 2-3.

Thank you!

#create dataframe A
A = matrix(c("a",3,4,"b",5,7,"c",3,7),nrow=3, ncol=3,byrow = TRUE)
A <- as.data.frame(A)
A$V2 <- as.numeric(A$V2)
A$V3 <- as.numeric(A$V3)
str(A)

#create dataframe B
B = matrix(c("a",1,1,"b",6,2,"c",2,2),nrow=3, ncol=3,byrow = TRUE)
B <- as.data.frame(B)
B$V2 <- as.numeric(B$V2)
B$V3 <- as.numeric(B$V3)
str(B)

results.2 <- numeric()
results.3 <- numeric()

#compare columns to identify those that are identical in the two dataframes
for(i in 2:3){
  results.2[i] <- sum(A[,2]==B[,i])
  results.3[i] <- sum(A[,3]==B[,i])
  results.pc.all <- rbind(results.2,results.3)
}
results.pc.all
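A minimal sketch of where the NA column comes from, using small numeric data frames whose values are illustrative assumptions rather than the poster's data. Assigning to element 2 of an empty vector makes R create element 1 as well, and that unassigned slot is filled with NA. Storing results by position within the set of compared columns, instead of by data-frame column number, avoids the padding; the names `cols` and `matches` are made up for this example.

```r
# Hypothetical data: column 1 is the sample ID, columns 2-3 are compared.
A <- data.frame(id = c("a", "b", "c"), V2 = c(1, 2, 1), V3 = c(1, 2, 2))
B <- data.frame(id = c("a", "b", "c"), V2 = c(1, 3, 2), V3 = c(1, 2, 2))

# Why the NA appears: writing to index 2 of a length-0 vector
# forces index 1 to exist, and it is filled with NA.
res <- numeric()
res[2] <- 5
res   # first element is NA, second is 5

# One way to avoid it: build the result over the compared columns only.
cols <- 2:3
matches <- sapply(cols, function(b) {
  sapply(cols, function(a) sum(A[[a]] == B[[b]]))
})
# Rows index the columns of A, columns index the columns of B.
dimnames(matches) <- list(paste0("A.", names(A)[cols]),
                          paste0("B.", names(B)[cols]))
matches
```

Here a count equal to nrow(A) in a cell means the two columns agree on every observation.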
Re: [R] write.csv file= question
The trial “test” script worked, and adding getwd() to my current script also fixed the problem. So it seems “file=” isn’t necessary after all to run from the terminal.

Thanks everyone for your help! :-)

On Aug 4, 2015, at 9:20 AM, Ista Zahn istaz...@gmail.com wrote:

> On Tue, Aug 4, 2015 at 11:12 AM, Ista Zahn istaz...@gmail.com wrote:
>> On Tue, Aug 4, 2015 at 11:04 AM, John Kane jrkrid...@inbox.com wrote:
>>> You probably need to ask this on a RStudio forum but my guess is it is
>>> just a little 'refinement' that the RStudio people added. Similar in
>>> concept to the matching .
>>
>> Really?
>>
>> write.csv(data,"/home/data.csv")
>>
>> works for me in Rstudio, ESS, Terminal, Rscript etc.
>
> Well, actually I misspoke. I don't actually have permission to write
> to /home on my system. But
>
> write.csv(data, "~/data.csv")
>
> works.
>
>>> John Kane
>>> Kingston ON Canada
>>>
>>> -----Original Message-----
>>> From: demmi...@gmail.com
>>> Sent: Tue, 4 Aug 2015 08:51:24 -0600
>>> To: r-help@r-project.org
>>> Subject: [R] write.csv file= question
>>>
>>> Hello,
>>>
>>> I have a quick question about the “file=” specification for the command
>>> write.csv. When I run this command in Rstudio I do not need the “file=”
>>> specified. For example the below command works just fine.
>>>
>>> write.csv(data,"/home/data.csv")
>>>
>>> However when I am running an Rscript from the terminal and putting it in
>>> the background I need to specify “file=”. So for the example above I need
>>> to instead have
>>>
>>> write.csv(data,file="/home/data.csv")
>>>
>>> Any ideas why this is the case? Writing file= isn’t a problem, just
>>> trying to get a better idea of how R works.
>>>
>>> Thanks!
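A short sketch of the two points this thread converges on, under the assumption of a stock base R installation: the second positional argument of write.csv is matched to file, so naming it is optional, and a relative path is resolved against the working directory that getwd() reports. The data frame and file names are made up for illustration, and tempdir() is used so that nothing is written to /home.

```r
# Hypothetical data frame standing in for `data` in the thread.
df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

out_dir <- tempdir()
path_positional <- file.path(out_dir, "positional.csv")
path_named      <- file.path(out_dir, "named.csv")

# Positional and named forms of the file argument.
write.csv(df, path_positional)
write.csv(df, file = path_named)

# Both calls should produce the same file contents.
same_content <- identical(readLines(path_positional), readLines(path_named))

# A relative path lands in whatever directory getwd() reports,
# which can differ between RStudio and a script run from a terminal.
old_wd <- setwd(out_dir)
write.csv(df, "relative.csv")
setwd(old_wd)
relative_landed_in_wd <- file.exists(file.path(out_dir, "relative.csv"))
```

If a script behaves differently between an IDE and Rscript, printing getwd() near the top is a cheap first check.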
[R] write.csv file= question
Hello,

I have a quick question about the “file=” specification for the command write.csv. When I run this command in Rstudio I do not need the “file=” specified. For example the below command works just fine.

write.csv(data,"/home/data.csv")

However when I am running an Rscript from the terminal and putting it in the background I need to specify “file=”. So for the example above I need to instead have

write.csv(data,file="/home/data.csv")

Any ideas why this is the case? Writing file= isn’t a problem, just trying to get a better idea of how R works.

Thanks!
Re: [R] write.csv file= question
Thanks!

On Aug 4, 2015, at 9:04 AM, John Kane jrkrid...@inbox.com wrote:

> You probably need to ask this on a RStudio forum but my guess is it is just
> a little 'refinement' that the RStudio people added. Similar in concept to
> the matching .
>
> John Kane
> Kingston ON Canada
>
> -----Original Message-----
> From: demmi...@gmail.com
> Sent: Tue, 4 Aug 2015 08:51:24 -0600
> To: r-help@r-project.org
> Subject: [R] write.csv file= question
>
> Hello,
>
> I have a quick question about the “file=” specification for the command
> write.csv. When I run this command in Rstudio I do not need the “file=”
> specified. For example the below command works just fine.
>
> write.csv(data,"/home/data.csv")
>
> However when I am running an Rscript from the terminal and putting it in
> the background I need to specify “file=”. So for the example above I need
> to instead have
>
> write.csv(data,file="/home/data.csv")
>
> Any ideas why this is the case? Writing file= isn’t a problem, just trying
> to get a better idea of how R works.
>
> Thanks!
Re: [R] powerTransform warning message?
Thank you so much for the explanation. That was very helpful! :-)

Thanks!
Brittany

On Jul 16, 2015, at 6:16 PM, John Fox j...@mcmaster.ca wrote:

> Dear Brittany,
>
> On Thu, 16 Jul 2015 17:35:38 -0600 Brittany Demmitt demmi...@gmail.com wrote:
>> Hello,
>>
>> I have a series of 40 variables that I am trying to transform via the
>> Box-Cox method using the powerTransform function in R. I have no zero
>> values in any of my variables. When I run the powerTransform function on
>> the full data set I get the following warning.
>>
>> Warning message:
>> In sqrt(diag(solve(res$hessian))) : NaNs produced
>>
>> However, when I analyze the variables in groups, rather than all 40 at a
>> time, I do not get this warning message. Why would this be? And does this
>> mean this warning is safe to ignore?
>
> No, it is not safe to ignore the warning, and the problem has nothing to do
> with non-positive values in the data -- when you say that there are no 0s in
> the data, I assume that you mean that the data values are all positive.
>
> The square roots of the diagonal entries of the inverse of the Hessian at
> the (pseudo-) ML estimates are the SEs of the estimated transformation
> parameters. If the Hessian can't be inverted, that usually implies that the
> maximum of the (pseudo-) likelihood isn't well defined.
>
> This isn't surprising when you're trying to transform as many as 40
> variables at a time to multivariate normality. It's my general experience
> that people often throw their data into the Box-Cox black box and hope for
> the best without first examining the data, and, e.g., ensuring a reasonable
> ratio of maximum/minimum values for each variable, checking for extreme
> outliers, etc. Of course, I don't know that you did that, and it's perfectly
> possible that you were careful.
>
>> I would like to add that all of my lambda values are in the -5 to 5 range.
>> I also get different lambda values when I analyze the variables together
>> versus in groups. Is this to be expected?
>
> Yes. It's very unlikely that both are right. If, e.g., the variables are
> multivariate normal within groups then their marginal distribution is a
> mixture of multivariate normals, which almost surely isn't itself normal.
>
> I hope this helps,
> John
>
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
>
>> Thank you so much!
>> Brittany
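To make the warning concrete, here is a small base-R sketch of the expression named in it, sqrt(diag(solve(hessian))), applied to two made-up 2 x 2 matrices. The matrices `H_good` and `H_bad` are hypothetical stand-ins, not anything produced by powerTransform. When the matrix is positive definite the inverse has a positive diagonal and the square roots are ordinary standard errors; when it is indefinite, a diagonal entry of the inverse can be negative and sqrt() returns NaN.

```r
# Positive definite case: a well-defined optimum.
H_good <- matrix(c(4, 1,
                   1, 3), nrow = 2, byrow = TRUE)
se_good <- sqrt(diag(solve(H_good)))

# Indefinite case (one negative eigenvalue): the point is not a proper
# optimum, and one diagonal entry of the inverse is negative.
H_bad <- matrix(c(4,  1,
                  1, -3), nrow = 2, byrow = TRUE)
inv_bad <- solve(H_bad)
se_bad <- suppressWarnings(sqrt(diag(inv_bad)))  # sqrt of a negative -> NaN

# A quick diagnostic: eigenvalues of the Hessian should all share one sign.
eig_bad <- eigen(H_bad, only.values = TRUE)$values
```

An NaN in the standard errors is therefore a symptom of the fitted point itself being suspect, which is why the warning should not be dismissed.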
[R] powerTransform warning message?
Hello,

I have a series of 40 variables that I am trying to transform via the Box-Cox method using the powerTransform function in R. I have no zero values in any of my variables. When I run the powerTransform function on the full data set I get the following warning.

Warning message:
In sqrt(diag(solve(res$hessian))) : NaNs produced

However, when I analyze the variables in groups, rather than all 40 at a time, I do not get this warning message. Why would this be? And does this mean this warning is safe to ignore?

I would like to add that all of my lambda values are in the -5 to 5 range. I also get different lambda values when I analyze the variables together versus in groups. Is this to be expected?

Thank you so much!

Brittany
Re: [R] powerTransform Convergence error
Hi John,

That does help, thanks!

Brittany

On Jun 11, 2015, at 4:02 PM, John Fox j...@mcmaster.ca wrote:

> Dear Brittany,
>
> There is an essentially perfect linear dependency among the variables in
> your data (note the last eigenvalue, which is 0 within rounding error):
>
>   eigen(cor(problem.data.boxcox[,-1]), only.values=TRUE)
>   $values
>    [1]  3.644257e+00  1.821582e+00  1.712152e+00  1.205091e+00  1.007231e+00  9.231163e-01  9.048724e-01
>    [8]  8.718398e-01  8.379187e-01  7.371353e-01  6.334100e-01  5.235629e-01  4.757997e-01  4.246831e-01
>   [15]  2.773471e-01 -2.802502e-16
>
> In addition, some of your variables have many tied values at the bottom of
> their distributions, making them very poor candidates for normalizing power
> transformations; for example,
>
>   sum(problem.data.boxcox$variable2 == 1)
>   [1] 626
>
> I hope this helps,
> John
>
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
>
> On Thu, 11 Jun 2015 09:37:57 -0600 Brittany Demmitt demmi...@gmail.com wrote:
>> Hi John,
>>
>> Thank you so much for the info! I have attached the data in .csv format
>> that is giving me the warning, along with the command that I am running.
>> It is a data frame with 1510 sample IDs and then their values for 16
>> variables. I am trying to transform the 16 variables. I do not receive the
>> warning when I run each variable independently, just when I run the entire
>> data frame at once. However, I have run this command with other, larger
>> data frames all at once with no warnings, so I am not sure why it is not
>> working now. Any help is appreciated!
>>
>> Thanks! :-)
>>
>> Britt
>>
>> Commands Run:
>>
>>   #read in the data frame
>>   problem.data.boxcox <- read.csv("problem.data.boxcox.csv")
>>
>>   #run a power transformation (I do not run that on the first column
>>   #because it is just sample ids)
>>   problem.data.boxcox.pT <- powerTransform(problem.data.boxcox[,-1])
>>
>>   Warning message:
>>   In estimateTransform(x, y, NULL, ...) :
>>     Convergence failure: return code = 1
>>
>> On Jun 10, 2015, at 2:15 PM, John Fox j...@mcmaster.ca wrote:
>>> Dear Brittany,
>>>
>>> As explained in ?powerTransform, this function uses optim() to optimize
>>> a generalized Box-Cox criterion. For explanation of return codes, see
>>> ?optim. In particular, code 1 indicates that the maximum number of
>>> iterations was exceeded. Although you might try increasing the permitted
>>> number of iterations or otherwise tweaking the arguments to optim(), your
>>> problem is probably ill-conditioned in some manner that is impossible to
>>> know without more information, such as your data.
>>>
>>> I hope this helps,
>>> John
>>>
>>> John Fox, Professor
>>> McMaster University
>>> Hamilton, Ontario, Canada
>>> http://socserv.mcmaster.ca/jfox/
>>>
>>> On Wed, 10 Jun 2015 10:54:30 -0600 Brittany Demmitt demmi...@gmail.com wrote:
>>>> Hello,
>>>>
>>>> I am trying to use the powerTransform function in the package car to
>>>> identify the lambda to transform my data. However, I receive the
>>>> following warning:
>>>>
>>>> Warning message:
>>>> In estimateTransform(x, y, NULL, ...) :
>>>>   Convergence failure: return code = 1
>>>>
>>>> I cannot find a description of what return code = 1 means for the car
>>>> package. How do I look that up, or does anyone know what the warning
>>>> means?
>>>>
>>>> Thank you so much!
>>>>
>>>> Brittany
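The eigenvalue check used in the reply above can be reproduced on simulated data. This is only a sketch: the variables `x1`, `x2`, `x3` and the vector `counts` are invented for the example, with `x3` built as an exact linear combination so that the correlation matrix is singular. The second part mirrors the tied-values check by counting how many observations sit at the minimum.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2                     # exact linear dependency, by construction
dat <- data.frame(x1, x2, x3)

# Eigenvalues of the correlation matrix; a value that is zero up to
# rounding error signals a (near-)perfect linear dependency.
ev <- eigen(cor(dat), only.values = TRUE)$values
smallest <- min(ev)

# Tied values at the bottom of a distribution: count observations
# equal to the minimum (here 120 of 200, by construction).
counts <- c(rep(1, 120), 2:81)
n_tied <- sum(counts == min(counts))
```

Dropping one of the dependent variables, or inspecting which combination is redundant, would be the natural next step before attempting a joint transformation.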
[R] powerTransform Convergence error
Hello,

I am trying to use the powerTransform function in the package car to identify the lambda to transform my data. However, I receive the following warning:

Warning message:
In estimateTransform(x, y, NULL, ...) :
  Convergence failure: return code = 1

I cannot find a description of what return code = 1 means for the car package. How do I look that up, or does anyone know what the warning means?

Thank you so much!

Brittany
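For readers who want to see a return code of 1 first-hand, the sketch below calls optim() directly on the Rosenbrock test function with a deliberately tiny iteration limit. It says nothing about how car sets up its own optimization; the function `fr` and the starting values are simply the usual textbook example, and the only point is the meaning of the `convergence` element documented in ?optim (0 for success, 1 when maxit is reached).

```r
# Rosenbrock "banana" function, a standard optimization test problem.
fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2
start <- c(-1.2, 1)

# Far too few iterations allowed: optim() stops early and reports code 1.
fit_short <- optim(start, fr, method = "Nelder-Mead",
                   control = list(maxit = 10))
code_short <- fit_short$convergence

# A generous limit: the same problem converges and reports code 0.
fit_long <- optim(start, fr, method = "Nelder-Mead",
                  control = list(maxit = 5000))
code_long <- fit_long$convergence
```

Raising the limit only helps when the problem is otherwise well behaved; if the objective is ill-conditioned, the code may stay at 1 no matter how many iterations are allowed.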