[R] How to change the order of columns in a data frame?
Dear all,

I have a data frame whose columns need to be reordered. The first column, X, is in the right position, but the remaining columns X1-Xn should be ordered X1, X2, X3, etc., instead of as below.

> colnames(pos1)
 [1] "X"   "X1"  "X10" "X11" "X12" "X13" "X14" "X15" "X16" "X17" "X18" "X19" "X2"  "X20" "X3"  "X4"  "X5"  "X6"  "X7"  "X8"  "X9"

> pos1[1:5, 1:5]
      X       X1       X10       X11       X12
1 100.5 7949.469 18509.064  8484.969 17401.056
2 101.5 3080.058  7794.691  3211.323  8211.058
3 102.5 1854.347  4347.571  1783.846  4827.338
4 103.5 2064.441  8421.746  2012.536  8363.785
5 104.5 9650.402 26637.926 10730.647 27053.421

I am trying to first change the first column name to something without an X and save the names as a vector. I would then remove the X from each name and use the vector to rename the columns. Then columns 2-n could be ordered, I hope...

colnames(pos)[1] <- "Mass"
columnNames <- colnames(pos)

Does anyone have an idea how to do this, or is there perhaps a smoother solution? Would it be easier if the contents of the first column were extracted and used as row names instead?

Best regards,
Joel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to change the order of columns in a data frame?
It does not work when using more variables, and my data frames usually contain about a thousand columns...

Best,
Joel

> fakedata <- data.frame(A=c(0,0,0), X1=c(1,1,1), X6=c(6,6,6), X7=c(7,7,7), X3=c(3,3,3), X4=c(4,4,4), X9=c(9,9,9), X2=c(2,2,2), X8=c(8,8,8), X5=c(5,5,5))
> fakedata
  A X1 X6 X7 X3 X4 X9 X2 X8 X5
1 0  1  6  7  3  4  9  2  8  5
2 0  1  6  7  3  4  9  2  8  5
3 0  1  6  7  3  4  9  2  8  5
> pos <- colnames(fakedata)[2:ncol(fakedata)]
> pos
[1] "X1" "X6" "X7" "X3" "X4" "X9" "X2" "X8" "X5"
> pos <- c(1, 1+as.numeric(gsub("X", "", pos)))
> pos
 [1]  1  2  7  8  4  5 10  3  9  6
> fakedata[, pos]
  A X1 X9 X2 X7 X3 X5 X6 X8 X4
1 0  1  9  2  7  3  5  6  8  4
2 0  1  9  2  7  3  5  6  8  4
3 0  1  9  2  7  3  5  6  8  4

Sarah Goslee sarah.gos...@gmail.com 17-02-2012 14:36:

> fakedata <- data.frame(A=c(0,0,0), X2=c(2,2,2), X1=c(1,1,1), X3=c(3,3,3))
> fakedata
  A X2 X1 X3
1 0  2  1  3
2 0  2  1  3
3 0  2  1  3
> pos <- colnames(fakedata)[2:ncol(fakedata)]
> pos <- c(1, 1+as.numeric(gsub("X", "", pos)))
> fakedata[, pos]
  A X1 X2 X3
1 0  1  2  3
2 0  1  2  3
3 0  1  2  3

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] How to change the order of columns in a data frame?
@Alfredo

The X is removed, but the reordering does not work:

> colnames(df)[1] <- "Mass"
> columnNames <- colnames(df)
> colnames(df)
 [1] "Mass" "X1" "X10" "X11" "X12" "X13" "X14" "X15" "X16" "X17" "X18" "X19" "X2" "X20" "X3" "X4" "X5" "X6" "X7" "X8" "X9"
> colnames(df) <- gsub("X", "", colnames(df))
> colnames(df)
 [1] "Mass" "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2" "20" "3" "4" "5" "6" "7" "8" "9"
> df <- df[, colnames(df)]
> colnames(df)
 [1] "Mass" "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2" "20" "3" "4" "5" "6" "7" "8" "9"

Best,
Joel

Alfredo Alessandrini caveneb...@gmail.com 17-02-2012 14:40:

Hi Joel,

to replace the colnames:

colnames(dataframe) <- gsub("X", "", colnames(dataframe))

to order by colnames:

dataframe <- dataframe[, colnames(dataframe)]

Alfredo
Re: [R] How to change the order of columns in a data frame?
@Jim

That would work for just a few columns, but I will have around 1000 of them, so I need something more generic.

Best,
Joel

jim holtman jholt...@gmail.com 17-02-2012 14:44:

pos2 <- pos1[, c("X", "X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10", "X11", "X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X20")]

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Re: [R] How to change the order of columns in a data frame?
Thank you Sarah, now it works!!

Sarah Goslee sarah.gos...@gmail.com 17-02-2012 15:13:

Sorry, it should be:

> fakedata[, order(pos)]
  A X1 X2 X3 X4 X5 X6 X7 X8 X9
1 0  1  2  3  4  5  6  7  8  9
2 0  1  2  3  4  5  6  7  8  9
3 0  1  2  3  4  5  6  7  8  9

Using order also ensures that non-sequential column ids will work:

> fakedata <- data.frame(A=c(0,0,0), X1=c(1,1,1), X6=c(6,6,6), X7=c(7,7,7), X3=c(3,3,3), X4=c(4,4,4), X9=c(9,9,9), X2=c(2,2,2), X8=c(8,8,8))
> pos <- colnames(fakedata)[2:ncol(fakedata)]
> pos <- c(1, 1+as.numeric(gsub("X", "", pos)))
> fakedata
  A X1 X6 X7 X3 X4 X9 X2 X8
1 0  1  6  7  3  4  9  2  8
2 0  1  6  7  3  4  9  2  8
3 0  1  6  7  3  4  9  2  8
> fakedata[, order(pos)]
  A X1 X2 X3 X4 X6 X7 X8 X9
1 0  1  2  3  4  6  7  8  9
2 0  1  2  3  4  6  7  8  9
3 0  1  2  3  4  6  7  8  9

Sarah

-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org
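For reference, the order() approach from this thread can be condensed into a few lines. This is only a minimal sketch: the data frame below is made up to mimic the column names in the question, and the names `suffix` and `ordered` are illustrative, not from the original posts.

```r
# Hypothetical data with the same kind of column names as in the question
pos1 <- data.frame(X = c(100.5, 101.5), X1 = 1:2, X10 = 3:4, X11 = 5:6,
                   X2 = 7:8, X20 = 9:10, X3 = 11:12)

# Numeric suffix of every column except the first one
suffix <- as.numeric(sub("^X", "", colnames(pos1)[-1]))

# Keep column 1 in place and reorder the remaining columns by their suffix
ordered <- pos1[, c(1, 1 + order(suffix))]
colnames(ordered)
```

Because order() returns positions rather than the suffix values themselves, this should also cope with gaps in the numbering and with a thousand or so columns.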
Re: [R] Normality tests on groups of rows in a data frame, grouped based on content in other columns
Hi Dennis,

Thanks for your prompt response.

Best,
Joel

Dennis Murphy djmu...@gmail.com 30-10-2011 21:11:

Hi:

Here are a few ways (untested, so caveat emptor):

# plyr package
library('plyr')
ddply(df, .(Plant, Tissue, Gene), summarise, ntest = shapiro.test(ExpressionLevel))

# data.table package
library('data.table')
dt <- data.table(df, key = 'Plant, Tissue, Gene')
dt[, list(ntest = shapiro.test(ExpressionLevel)), by = key(dt)]

# aggregate() function
aggregate(ExpressionLevel ~ Plant + Tissue + Gene, data = df, FUN = shapiro.test)

# doBy package:
summaryBy(ExpressionLevel ~ Plant + Tissue + Gene, data = df, FUN = shapiro.test)

There are others, too...

HTH,
Dennis
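Since shapiro.test() returns a list-like "htest" object rather than a single number, it can help to pull out the statistic and p-value per group explicitly. The following is a minimal base-R sketch under stated assumptions: the data frame is simulated (only the column names Plant, Tissue, Gene and ExpressionLevel come from the question), and `groups` and `res` are illustrative names.

```r
# Simulated data: 2 plants x 2 tissues x 2 genes, 10 replicates per group
set.seed(1)
df <- expand.grid(Rep = 1:10, Plant = c("p1", "p2"),
                  Tissue = c("t1", "t2"), Gene = c("g1", "g2"))
df$ExpressionLevel <- rlnorm(nrow(df))

# One numeric vector per Plant/Tissue/Gene combination
groups <- split(df$ExpressionLevel, df[c("Plant", "Tissue", "Gene")], drop = TRUE)

# Run the test in each group and keep only W and the p-value
res <- do.call(rbind, lapply(groups, function(x) {
  st <- shapiro.test(x)
  c(W = unname(st$statistic), p.value = st$p.value)
}))
res
```

Each row of `res` is named after its group (for example p1.t1.g1). Note that shapiro.test() needs at least 3 non-missing values per group.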
[R] Normality tests on groups of rows in a data frame, grouped based on content in other columns
Dear R users,

I have a data frame in the form below, and I would like to run normality tests on the values in the ExpressionLevel column.

> head(df)
  ID Plant Tissue Gene ExpressionLevel
1  1    p1     t1   g1          366.53
2  2    p1     t1   g2            0.57
3  3    p1     t1   g3           11.81
4  4    p1     t2   g1          498.43
5  5    p1     t2   g2            2.14
6  6    p1     t2   g3            7.85

I would like to run the tests on every group defined by the content of the Plant, Tissue and Gene columns. My first problem is how to run a function for all these subgroups. I first thought of making subsets:

group1 <- subset(df, Plant=="p1" & Tissue=="t1" & Gene=="g1")
group2 <- subset(df, Plant=="p1" & Tissue=="t1" & Gene=="g2")
group3 <- subset(df, Plant=="p1" & Tissue=="t1" & Gene=="g3")
group4 <- subset(df, Plant=="p1" & Tissue=="t2" & Gene=="g1")
group5 <- subset(df, Plant=="p1" & Tissue=="t2" & Gene=="g2")
group6 <- subset(df, Plant=="p1" & Tissue=="t2" & Gene=="g3")
etc...

But that would be very time consuming, and I would like to be able to use the code for other data frames...

I have also tried to store these in a list, which I loop through while running the tests, something like this:

alist=list(group1, group2, group3, group4, group5, group6)
for(i in alist)
{
  print(shapiro.test(i$ExpressionLevel))
  print(pearson.test(i$ExpressionLevel))
  print(pearson.test(i$ExpressionLevel, adjust=FALSE))
}

But there must be an easier and more elegant way of doing this... I found the example below at http://stackoverflow.com/questions/4716152/why-do-r-objects-not-print-in-a-function-or-a-for-loop. I think it might be used for printing the results, but I do not know how to adapt it to my data frame, since there the functions are applied to several columns instead of to certain rows in one column.

DF <- data.frame(A = rnorm(100), B = rlnorm(100))
obj2 <- lapply(DF, shapiro.test)
tab2 <- lapply(obj2, function(x) c(W = unname(x$statistic), p.value = x$p.value))
tab2 <- data.frame(do.call(rbind, tab2))
printCoefmat(tab2, has.Pvalue = TRUE)

Finally, I have found several different functions for testing for normality, but which one(s) should I choose? As far as I can see in the help files, they only differ in the minimum number of samples required.

Thanks in advance!

Kind regards,
Joel
[R] NA Replacement by lowest value?
Hi all,

I need to replace missing values in a matrix by 10 % of the lowest available value in the matrix. I've got a function I've used earlier to replace negative values by a fraction of the lowest value, in a data frame, but I'm not sure how to modify it...

nonNeg = as.data.frame(apply(orig.df, 2, function(col) # Change negative values to a small value, close to zero
{
  min.val = min(col[col > 0])
  col[col < 0] = (min.val / 10)
  col # Return the modified column
}))

I think this is how to start, but the NA replacement part doesn't work...

newMatrix = as.matrix(apply(oldMatrix, 2, function(col)
{
  min.val = min(mData, na.rm = T) # Find the smallest value in the dataset
  col[col == NA] = (min.val / 10) # Doesn't work...
  col # Return the modified column
}))

Do any of you have any suggestions?

Best regards,
Joel
Re: [R] NA Replacement by lowest value?
Thanks a lot Paul!!

Best,
Joel

Date: Thu, 28 Jan 2010 10:48:37 +0100
From: p.hiems...@geo.uu.nl
To: joel_furstenberg_h...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] NA Replacement by lowest value?

Joel Fürstenberg-Hägg wrote:
> I think this is how to start, but the NA replacement part doesn't work...
>
> newMatrix = as.matrix(apply(oldMatrix, 2, function(col)
> {
>   min.val = min(mData, na.rm = T) # Find the smallest value in the dataset
>   col[col == NA] = (min.val / 10) # Doesn't work...

Use is.na(col) to find the NA's.

cheers,
Paul

-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
Re: [R] NA Replacement by lowest value?
Hi Jim,

That's what Paul suggested too, works great!

Best,
Joel

Date: Thu, 28 Jan 2010 20:57:57 +1100
From: j...@bitwrit.com.au
To: joel_furstenberg_h...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] NA Replacement by lowest value?

On 01/28/2010 08:35 PM, Joel Fürstenberg-Hägg wrote:
> I think this is how to start, but the NA replacement part doesn't work...
> ...
> Do any of you have any suggestions?

Hi Joel,

You probably want to use:

col[is.na(col)] <- min.val/10

Jim
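The is.na() suggestion from both replies can be applied to the whole matrix at once, without apply(). A minimal sketch, assuming a small made-up matrix (the names oldMatrix, newMatrix and min.val follow the question; the values are invented). The reason `col == NA` cannot work is that any comparison with NA itself yields NA.

```r
# Hypothetical matrix with two missing values
oldMatrix <- matrix(c(4, NA, 2, 8, NA, 6), nrow = 2)

# Smallest available value in the whole matrix, ignoring NAs
min.val <- min(oldMatrix, na.rm = TRUE)

# Replace every NA by 10 % of that minimum
newMatrix <- oldMatrix
newMatrix[is.na(newMatrix)] <- min.val / 10
newMatrix
```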
[R] Remove part of string in colname and calculate mean for columns groups
Hi all,

I have two questions. First, I wonder how to remove a part of the column names in a matrix. I would like to remove the _ACCX or _NAX part below. Is there a method by which the _ as well as all characters after it can be removed?

> dim(exprdata)
[1]  88 512
> colnames(exprdata[,c(1:20)])
 [1] "Akita_ACC1" "Akita_ACC2" "Akita_ACC3" "Akita_ACC4" "Alc.0_ACC1" "Alc.0_ACC2" "Alc.0_ACC3"
 [8] "Alc.0_ACC4" "Alc.0_ACC5" "Bl.1_ACC1"  "Bl.1_ACC2"  "Bl.1_ACC3"  "Bl.1_ACC4"  "Bla.1_ACC1"
[15] "Bla.1_ACC2" "Bla.1_ACC3" "Bla.1_ACC4" "Blh.1_ACC1" "Blh.1_ACC2" "Blh.1_ACC3"

Secondly, I would like to calculate the mean of each column group in the matrix, for instance all columns beginning with Akita, and save all new columns as a new matrix. For instance, use:

> head(exprdata[,c(1:4)])
            Akita_ACC1 Akita_ACC2 Akita_ACC3 Akita_ACC4
A15-101       6.668931         NA         NA         NA
A122001-101  10.562564  11.706395  11.608989   8.289093
A128001-101  14.946749   8.112625   8.176438  10.104254
A133001-101   5.186679   6.089870   4.119589   3.168841
A133003-101         NA         NA  19.825480   2.587695
A134001-101   3.259402   4.835642   4.679607   4.490254

To get something like:

               Akita
A15-101     6.668931
A122001-101 10.54176
A128001-101 10.10425
A133001-101 3.168841
A133003-101 2.587695
A134001-101 4.490254

However, the column groups are of different sizes (3-10 columns), so I guess I'll need a method based on the column names. Can anyone help me?

Best regards,
Joel
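For the first question, one option is a regular expression that deletes the underscore and everything after it. A minimal sketch, assuming a tiny made-up matrix whose column names follow the pattern shown above:

```r
# Hypothetical matrix with column names in the same style as exprdata
exprdata <- matrix(1:6, nrow = 2,
                   dimnames = list(c("A15-101", "A122001-101"),
                                   c("Akita_ACC1", "Akita_ACC2", "Alc.0_ACC1")))

# Drop the first underscore and all characters that follow it
colnames(exprdata) <- sub("_.*$", "", colnames(exprdata))
colnames(exprdata)
```

After this step all columns of a group share one name, which is what a name-based group mean needs.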
[R] How to delete matrix rows based on NA frequency?
Hi all,

I would like to remove rows from a matrix based on the frequency of missing values. If a row has more than 10 % missing values, it should be deleted. I use the following to calculate the frequencies, which gives a new vector with one frequency per row:

freqNA=rowMeans(is.na(exprdata))

But is there a shorter way to remove the rows based on (1-freqNA)0.1 than looping through the whole matrix using a for loop?

All the best,
Joel
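A logical row index built from freqNA avoids the loop. A minimal sketch, assuming a made-up 3 x 20 matrix (only exprdata and freqNA are names from the post; `filtered` is illustrative):

```r
# Hypothetical matrix: 3 rows, 20 columns
exprdata <- matrix(as.numeric(1:60), nrow = 3,
                   dimnames = list(c("g1", "g2", "g3"), NULL))
exprdata["g2", 1] <- NA      # 1 of 20 values missing = 5 %
exprdata["g3", 1:3] <- NA    # 3 of 20 values missing = 15 %

# Fraction of missing values per row
freqNA <- rowMeans(is.na(exprdata))

# Keep only rows with at most 10 % missing values
filtered <- exprdata[freqNA <= 0.1, , drop = FALSE]
rownames(filtered)
```

`drop = FALSE` keeps the result a matrix even if only one row survives.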
[R] How to calculate the row wise means for grouped columns in matrix?
Hi all,

I want to calculate the row-wise mean of groups of columns in a matrix M. All columns belonging to the same group have the same column name. My idea is to create a new vector V containing these column names, after first removing the duplicates. Then I would calculate the means using, for instance, rowMeans(), comparing the column names of M with the vector V to get the indices of the columns to use.

What do you think, is it a good idea or not? If yes, any suggestions on how to do it? If no, is there an alternative solution that might work better?

All the best,
Joel
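The idea described above can be sketched directly with unique(), sapply() and rowMeans(). This is one possible reading of it, on a made-up matrix; M and V are the names from the post, `groupMeans` is illustrative, and ignoring NAs via na.rm = TRUE is an assumption.

```r
# Hypothetical matrix in which columns of the same group share a name
M <- matrix(c(1, 2, 3, 4, NA, 6, 7, 8), nrow = 2,
            dimnames = list(c("r1", "r2"), c("A", "A", "B", "B")))

# Group names without duplicates
V <- unique(colnames(M))

# For each group, average its columns row by row (NAs ignored)
groupMeans <- sapply(V, function(g) {
  rowMeans(M[, colnames(M) == g, drop = FALSE], na.rm = TRUE)
})
groupMeans
```

The result is a matrix with one column per group, and group sizes are free to differ.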
[R] Exchange NAs for mean
Hi all,

I have a matrix (X) with observations as rows and parameters as columns. I'm trying to replace all missing values in a column by the column mean using the code below, but so far nothing happens to the NAs... Can anyone see where the problem is?

N <- nrow(X) # Calculate number of rows = 108
p <- ncol(X) # Calculate number of columns = 88

# Replace by columnwise mean
for (i in colnames(X)) # Do for all columns in the matrix
{
  for (j in rownames(X)) # Go through all rows
  {
    if(is.na(X[j,i])) # Search for missing value in the given position
    {
      X[j,i]=mean(X[1:p, i]) # Change missing value to the mean of the column
    }
  }
}

All the best,
Joel
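Two things in the loop above look suspicious: mean() returns NA whenever its input contains an NA unless na.rm = TRUE is given, and X[1:p, i] selects rows by the number of columns p rather than the number of rows N. A loop-free sketch on a made-up matrix (`col_means` and `idx` are illustrative names):

```r
# Hypothetical matrix with one NA per column
X <- matrix(c(1, NA, 3, 10, 20, NA), nrow = 3,
            dimnames = list(NULL, c("a", "b")))

# Column means computed from the non-missing values only
col_means <- colMeans(X, na.rm = TRUE)

# Row/column positions of all NAs, then fill each with its column mean
idx <- which(is.na(X), arr.ind = TRUE)
X[idx] <- col_means[idx[, "col"]]
X
```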
[R] Get rid of automatic title from Anova plots
Hi all,

I'm having a problem when putting the four plots (Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage) from an Anova using plot(aov()) into a pdf. The string aov(df[,i] ~ cat) is added as the main title, so when I use mtext() the two titles overlap. Does anyone know how to get rid of that title? I've tried plot(aov(df[,i] ~ cat), main="") but without success.

pdf("Anova plots 0809.pdf", height=10, width=10)
par(mfrow=c(2,2), oma=c(2,2,4,2))

# Anova test for LT50 with different anthocyanin scores
df=fieldTrial[idx0809, c(31:32)]
cat=fieldTrial[idx0809, c(14)]
for (i in colnames(df))
{
  print(i)
  print(summary(aov(df[,i] ~ cat)))
  plot(aov(df[,i] ~ cat))
  mtext(text=paste("Anova test for", i, "with different anthocyanin scores"), cex=1.5, side=3, outer=TRUE)
}
dev.off()

Best regards,
Joel
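One thing that may help here: plot() on an aov object is handled by plot.lm(), which writes the model call as a "sub.caption" into the outer margin when several panels share a page. Passing an empty sub.caption should suppress it. A minimal sketch, assuming the built-in ToothGrowth data in place of fieldTrial and a temporary output file:

```r
# Write to a temporary pdf; the file name is only for illustration
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, height = 10, width = 10)
par(mfrow = c(2, 2), oma = c(2, 2, 4, 2))

# Stand-in model; the real code would use df[, i] ~ cat
fit <- aov(len ~ factor(dose), data = ToothGrowth)

# sub.caption = "" removes the automatic call string above the panels
plot(fit, sub.caption = "")
mtext("Anova test for len with different doses", cex = 1.5, side = 3, outer = TRUE)
dev.off()
```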
[R] Multiple regression script
Hi all,

I'm doing multiple linear regression on a data set. However, it takes a lot of time, as I would like to check every possible combination of factors, evaluate the results based, for instance, on their p-values, and then choose the best regression model. So I wonder if anyone might have a script for that? If not, do you have suggestions on how to create such a script? I've been told there is a similar function in SAS, but I'm not sure how it works.

Furthermore, I'm not sure how to evaluate the results. Are there other criteria I should consider, such as R squared?

All the best,
Joel
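Base R ships a stepwise selection helper, step(), which may be close to what is described, with two caveats: it compares models by AIC rather than by p-values, and it searches stepwise rather than trying every combination. A minimal sketch on the built-in mtcars data (the choice of predictors is arbitrary and only for illustration):

```r
# Full model with a handful of candidate predictors
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Backward elimination by AIC; trace = 0 silences the step-by-step log
best <- step(full, direction = "backward", trace = 0)

formula(best)                   # terms that survived the selection
summary(best)$adj.r.squared     # adjusted R squared of the selected model
```

Adjusted R squared, AIC and residual diagnostics are commonly looked at together; automated selection of this kind is best treated as exploratory.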
[R] Decision trees with factors and numericals
Hi all,

Do any of you know how to make a decision tree when the data set contains both factors and numerical variables? I've got a data frame with 3 columns, where y and x1 are numerical and x2 is a factor. Is it possible to use the rpart package, and if so, how? Otherwise, is there an alternative?

This is what I've tried so far:

rpart(LT50_NA ~ Raf + Antho, data=decTreeNA, method=anova) # Have tried method=class as well
Error in as.character(x) : cannot coerce type 'closure' to vector of type 'character'

Best regards,
Joel
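rpart accepts numeric and factor predictors in the same formula. One possible cause of the error above, if the call really was typed without quotes, is that `anova` and `class` are then taken as the R functions of those names (closures) instead of as method names; rpart expects a character string. A minimal sketch on simulated data (only the column names LT50_NA, Raf and Antho come from the post; the values are invented):

```r
library(rpart)

# Simulated data: numeric response, one numeric and one factor predictor
set.seed(42)
decTreeNA <- data.frame(
  Raf   = runif(60),
  Antho = factor(sample(c("low", "mid", "high"), 60, replace = TRUE))
)
decTreeNA$LT50_NA <- -10 + 3 * decTreeNA$Raf + rnorm(60, sd = 0.3)

# The method name is passed as a quoted string
fit <- rpart(LT50_NA ~ Raf + Antho, data = decTreeNA, method = "anova")
fit
```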
[R] From R to LaTeX to pdf?
Hi all,

Is anyone experienced with LaTeX? I'm trying to use the xtable package to create nice anova tables, but how do I produce a pdf from the resulting LaTeX table? I've tried WinShell and MiKTeX, but I couldn't get either of them working...

Here's an example of the output in R:

% latex table generated in R 2.9.2 by xtable 1.5-6 package
% Tue Nov 24 14:17:32 2009
\begin{tabular}{lrrrrr}
  \hline
 & Df & Sum Sq & Mean Sq & F value & Pr($>$F) \\
  \hline
cat & 2 & 40.50 & 20.25 & 6.66 & 0.0019 \\
  Residuals & 107 & 325.13 & 3.04 & & \\
  \hline
\end{tabular}

Best regards,
Joel
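A tabular fragment like the one above is not a complete LaTeX document; it needs at least \documentclass, \begin{document} and \end{document} around it before pdflatex (or similar) can compile it. A minimal sketch that writes such a wrapper from R, assuming the table lines are available as a character vector (here typed in by hand; the file name is illustrative):

```r
# Table fragment as produced by xtable (typed in here for illustration)
table_lines <- c(
  "\\begin{tabular}{lrrrrr}",
  "\\hline",
  " & Df & Sum Sq & Mean Sq & F value & Pr($>$F) \\\\",
  "\\hline",
  "cat & 2 & 40.50 & 20.25 & 6.66 & 0.0019 \\\\",
  "Residuals & 107 & 325.13 & 3.04 & & \\\\",
  "\\hline",
  "\\end{tabular}"
)

# Wrap the fragment in a minimal standalone document
doc <- c("\\documentclass{article}", "\\begin{document}",
         table_lines, "\\end{document}")

tex_file <- file.path(tempdir(), "anova_table.tex")
writeLines(doc, tex_file)

# With a working LaTeX installation this should produce the pdf:
# tools::texi2pdf(tex_file)
```

The compile step is left commented out because it depends on a local LaTeX installation such as MiKTeX.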
[R] Error in text.rpart(fit) : fit is not a tree, just a root
Hi all,

I've tried to make a decision tree for the following data set:

    Level         X Response
279     C 2.4728646   -9.445
341     B 0.5986398   -9.413
343     B 1.1786271   -9.413
384     D 1.4797870   -9.413
390     C 2.0364569   -9.133
391     D 0.9365739   -9.133
452     A 1.2858741  -11.480
455     C 1.3256245   -9.413
510     C 0.5758865   -9.413
537     D 1.9289431   -9.413
540     C 1.8646144   -9.413
554     B 1.3903752  -10.080

Using these commands:

> fit=rpart(Response ~ X + Level, data=decTree, method="anova", control=rpart.control(minsplit=1))
> printcp(fit) # display cp table

Regression tree:
rpart(formula = Response ~ X + Level, data = decTree, method = "anova",
    control = rpart.control(minsplit = 1))

Variables actually used in tree construction:
character(0)

Root node error: 4.4697/12 = 0.37247

n= 12

    CP nsplit rel error
1 0.01      0         1

I don't get a tree...

> plot(fit) # plot decision tree
Error in plot.rpart(fit) : fit is not a tree, just a root
> text(fit) # label the decision tree plot
Error in text.rpart(fit) : fit is not a tree, just a root

Can anyone tell me what's going wrong and give a hint on how to solve it?

Best regards,
Joel
[R] 3D plot, rotatable and with adjustable symbols
Hi all,

I've tried to make a 3D plot, but have run into some problems. I'd like to have a plot that I can rotate interactively using the mouse, which is possible using plot3d() {R.basic}. However, I would like to change the symbols used for the points, but there's no pch argument in plot3d(). If I use the scatterplot3d package, I'm able to change the symbols, but not to rotate the plot interactively.

Does anyone know a solution to this? Maybe another package is better?

Best regards,
Joel
[R] Trellis settings get lost when printing to pdf
Hi all,

I've got some problems when changing the trellis settings for lattice plots. The plots look exactly as I want them to when calling show.settings(), as well as when plotting them in the graphics window. But when printing to a pdf file, none of the settings are used!? Does anyone know what might have happened? Because when the trellis settings are changed, they should remain in the new state until you close R, right?

    # Change settings for the boxplot appearance
    new.dot=trellis.par.get("box.dot")
    new.rectangle=trellis.par.get("box.rectangle")
    new.umbrella=trellis.par.get("box.umbrella")
    new.symbol=trellis.par.get("plot.symbol")
    new.strip.background=trellis.par.get("strip.background")
    new.strip.shingle=trellis.par.get("strip.shingle")
    new.dot$pch="|"
    new.dot$col="black"
    new.rectangle$col="black"
    new.rectangle$fill="grey65"
    new.umbrella$col="black"
    new.umbrella$lty=1 # Continuous line, not dotted
    new.symbol$col="black"
    new.strip.background$col="grey87" # Background colour in the upper label
    new.strip.shingle$col="black" # Border colour around the upper label
    trellis.par.set(box.dot=new.dot, box.rectangle=new.rectangle, box.umbrella=new.umbrella, plot.symbol=new.symbol, strip.background=new.strip.background, strip.shingle=new.strip.shingle)

Best regards,
Joel
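A likely explanation, as far as I understand lattice: trellis settings are stored per graphics device, so settings changed while the screen device is active do not carry over to a pdf device opened afterwards. Two ways around this are to call `trellis.par.set()` after `pdf()` has been opened, or to attach the settings to the plot itself with `par.settings`. The sketch below uses the second option; the data frame `d` and the file name are made up.

```r
# Sketch: carry boxplot settings into a pdf by attaching them to the plot.
library(lattice)

# The same settings as in the post, collected in one list
my_theme <- list(
  box.dot          = list(pch = "|", col = "black"),
  box.rectangle    = list(col = "black", fill = "grey65"),
  box.umbrella     = list(col = "black", lty = 1),
  plot.symbol      = list(col = "black"),
  strip.background = list(col = "grey87"),
  strip.shingle    = list(col = "black")
)

# Made-up data for illustration
set.seed(1)
d <- data.frame(g = gl(3, 20, labels = c("A", "B", "C")), y = rnorm(60))

out_file <- file.path(tempdir(), "boxplots.pdf")
pdf(out_file)
# Alternative: trellis.par.set(theme = my_theme) at this point, i.e. after
# the pdf device has been opened
print(bwplot(y ~ g, data = d, par.settings = my_theme))
dev.off()
```

Attaching the theme to each plot has the advantage that the appearance no longer depends on which device happens to be open.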
Re: [R] Loadings and scores from fastICA?
Ok, so then S gives the individual components, good. Thanks Tony! But what about the principal components from the PCA plot, how are they calculated? And is the linear mixing matrix A really the same as the loadings/weights? There must be different loadings for the PCA and the ICA, right?

Best regards,
Joel

Date: Wed, 11 Nov 2009 14:29:06 -0700
From: tpl...@acm.org
To: joel_furstenberg_h...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Loadings and scores from fastICA?

The help for fastICA says: "The data matrix X is considered to be a linear combination of non-Gaussian (independent) components i.e. X = SA where columns of S contain the independent components and A is a linear mixing matrix." The value of fastICA is a list with components S (the estimated source matrix) and A (the estimated mixing matrix). Are these what you want?

-- Tony Plate
[R] prcomp() PCA vs fastICA() PCA?
Hi all,

I wonder what the difference is between the function prcomp and the PCA plotting method used in example 3 from the fastICA package. They give totally different plots. The reason for asking is that I've used prcomp earlier, but now I should do an ICA, and I guess I cannot compare the PCA plot from prcomp with the ICA plot if the two PCA plots look different? Does anyone know anything about this? Maybe there's a different approach that's better?

    if(require(MASS)) {
      x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
      x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
      X <- rbind(x, x1)
      a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1, method = "R", row.norm = FALSE, maxit = 200, tol = 0.0001, verbose = TRUE)
      par(mfrow = c(1, 3))
      plot(a$X, main = "Pre-processed data")
      plot(a$X%*%a$K, main = "PCA components")
      plot(a$S, main = "ICA components")
    }

    PC=prcomp(X, center=T, scale=T)
    hcl=hclust(dist(df))
    plot(PC$x[,1],PC$x[,2], main="PCA components (prcomp)")

Best regards,
Joel
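One possible reason for the different pictures, offered as an assumption and not a diagnosis: the fastICA help describes K as a pre-whitening matrix that projects the data onto the principal components, so `a$X %*% a$K` would show component scores rescaled to unit variance, whereas `prcomp` returns unscaled scores and, with `scale=T`, also standardises the input columns first. The base-R sketch below does not call fastICA at all; it only illustrates how whitening and column scaling each change what a PCA score plot looks like. The data are simulated.

```r
# Sketch: plain PCA scores vs. whitened scores vs. scores from scaled input.
set.seed(42)

# Simulated correlated two-column data (made up for illustration)
z <- matrix(rnorm(2000), ncol = 2)
X <- z %*% matrix(c(3, 1, 0, 0.5), 2, 2)

# Plain PCA on centred, unscaled data
pc <- prcomp(X, center = TRUE, scale. = FALSE)

# Whitened scores: every score column divided by its standard deviation
white <- sweep(pc$x, 2, pc$sdev, "/")

# PCA after standardising the input columns, as with scale=T in the post
pc_scaled <- prcomp(X, center = TRUE, scale. = TRUE)

apply(pc$x, 2, sd)   # differs between components
apply(white, 2, sd)  # equal for both components after whitening

par(mfrow = c(1, 3))
plot(pc$x, main = "prcomp, unscaled")
plot(white, main = "whitened scores")
plot(pc_scaled$x, main = "prcomp, scale. = TRUE")
```

The whitened scores are the same components as the unscaled ones, only stretched to equal spread, which is why the point clouds look round instead of elongated.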
[R] Loadings and scores from fastICA?
Hi all,

Does anyone know how to get the independent components and loadings from an Independent Component Analysis (ICA), as well as the principal components and loadings from a Principal Component Analysis (PCA), using the fastICA package? Or perhaps there's another way to do ICAs in R?

Below is an example from the fastICA manual (http://cran.r-project.org/web/packages/fastICA/fastICA.pdf):

    if(require(MASS)) {
      x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
      x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
      X <- rbind(x, x1)
      a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1, method = "R", row.norm = FALSE, maxit = 200, tol = 0.0001, verbose = TRUE)
      par(mfrow = c(1, 3))
      plot(a$X, main = "Pre-processed data")
      plot(a$X%*%a$K, main = "PCA components")
      plot(a$S, main = "ICA components")
    }

Best regards,
Joel
[R] PCA with two response variables
Hi all,

I'm new to PCA in R, so this might be a basic thing, but I cannot find anything on the net about it. I need to make a PCA plot with two response variables (df$resp1 and df$resp2) against eight metabolites (df$met1, df$met2, ...) and I don't have a clue how to do it... I've only used the simplest PCAs before, like this:

    pcaObj=prcomp(t(df[idx, c(40:47)]))
    biplot(pcaObj)

Does anyone know how to do this?

Best regards,
Joel
[R] Change negative values in column
Hi all,

I'm trying to write a script that changes all negative values in a data frame column to a small positive value, based on the minimum value of the column. However, I get the following error:

    Error in if (x[i] < 0) { : argument is of length zero

As well, I would like the minimum to be the smallest of the non-negative values...

    Aa_non_neg=(fieldTrial0809$Aa) # Copy column from data frame to manipulate

    nonNegative = function(x) {
      minimum=min(x) # Should only use positive minimum!
      for (i in x) {
        if(x[i]<0) # Found a negative value
        {
          x[i]=minimum/10 # Change to a new non-negative value
        }
      }
    }

    nonNegative(Aa_non_neg) # Apply function on column
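A few things seem to be going on in the function above. `for (i in x)` loops over the values of `x`, not over positions, so `x[i]` uses data values as indices; an index of 0, for example, returns a zero-length vector, which is what `if` complains about. The function also never returns the modified vector. A vectorised version avoids the loop entirely. The sketch below assumes that "small positive value" means one tenth of the smallest strictly positive value, and it leaves zeros and NAs untouched; the vector `Aa` is made up.

```r
# Sketch: replace negative values by (smallest positive value) / 10.
nonNegative <- function(x) {
  positive <- x[!is.na(x) & x > 0]
  if (length(positive) == 0) {
    stop("no positive values to derive a replacement from")
  }
  # Logical indexing picks out the negative entries in one step
  x[!is.na(x) & x < 0] <- min(positive) / 10
  x  # return the modified copy
}

# Made-up example column
Aa <- c(0.5, -1.2, 3.0, -0.1, 2.0)
Aa_non_neg <- nonNegative(Aa)
Aa_non_neg
# 0.50 0.05 3.00 0.05 2.00
```

Because R passes arguments by value, the result has to be assigned, for instance `fieldTrial0809$Aa <- nonNegative(fieldTrial0809$Aa)`.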
[R] Exclude rows in xyplot
Hi all,

I'm searching for a way to exclude outliers from my dataset while making xyplots. When plotting with pairs(), I exclude specific rows of my data frame and save the selection as a variable which I later use as an index:

    # Discard outliers and save settings as idx
    idx=with(fieldTrial0809, which(Pro>0 & Pro<0.95 & Fum>0 & Fum<0.4 & Mal>0.1 & Mal<2.5 & Glc>2 & Glc<20 & Fru>1 & Fru<30 & Raf>0 & Raf<3 & Suc>1 & Suc<14))

    # Plot the numerical and ranked columns pair wise using pairs()
    pdf("ranked.pdf", height=20, width=20)
    pairs(fieldTrial0809[idx, c(21,26,30,32,34,36,38,40,42,44,46,48,50)], main="Ranked", col="blue", pch="°", gap=0.2)
    dev.off()

Now I'm trying to make xyplots to compare the results from three different categories:

    # Plot Pro against Glc for each of the three categories
    xyplot(Pro ~ Glc | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 3), aspect=1, index.cond=list(3:1))

I would like to exclude outliers like above. I've found that limits can be used in a similar manner as xlim and ylim, but though I've read about them I don't understand how to solve it...

Can anyone help me?

All the best,
Joel
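The same row index used for pairs() should work for xyplot() as well, since the `data` argument accepts any data frame, including a row subset. A minimal sketch with invented data follows; the column names mimic the post, but `fake`, `Category` and the thresholds are assumptions for illustration.

```r
# Sketch: drop outlier rows before handing the data to xyplot().
library(lattice)

# Invented stand-in for fieldTrial0809
set.seed(7)
fake <- data.frame(
  Pro = runif(90, 0, 1.2),
  Glc = runif(90, 0, 25),
  Category = gl(3, 30, labels = c("early", "mid", "late"))
)

# Rows that pass all range checks
idx <- with(fake, which(Pro > 0 & Pro < 0.95 & Glc > 2 & Glc < 20))

# Plot only the retained rows
p <- xyplot(Pro ~ Glc | Category, data = fake[idx, ],
            layout = c(1, 3), aspect = 1)
print(p)
```

Unlike xlim and ylim, which only crop the visible region, subsetting removes the rows from the plot altogether. xyplot() also has a `subset` argument that takes the logical condition directly, which avoids building `idx` first.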
[R] Print several xyplots to the same page in a pdf file
Hello everybody,

I'm using the lattice package and xyplot to make several graphs like the ones below. However, I can only print the three grouped plots onto one page when putting them into a pdf file, which gives me a huge number of pages... Is it possible to put them all, or at least more than one, on the same page, for instance three groups beside each other like columns?

    ...
    xyplot(Pro ~ Glc | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 3), aspect=1, index.cond=list(3:1))
    xyplot(Pro ~ Raf | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 3), aspect=1, index.cond=list(3:1))
    xyplot(Pro ~ Suc | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 3), aspect=1, index.cond=list(3:1))
    xyplot(Fum ~ Aa | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 3), aspect=1, index.cond=list(3:1))
    xyplot(Fum ~ Pro | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 3), aspect=1, index.cond=list(3:1))
    etc...

All the best,
Joel
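One approach that may fit here: the print method for lattice plots takes `split = c(column, row, n_columns, n_rows)` and `more = TRUE`, which places several plots on one page. The sketch below puts three three-panel plots side by side on a single pdf page. The data frame `fake` and the file name are invented.

```r
# Sketch: three lattice plots as columns on one pdf page.
library(lattice)

# Invented stand-in for fieldTrial0809
set.seed(3)
fake <- data.frame(
  Pro = rnorm(60), Glc = rnorm(60), Raf = rnorm(60), Suc = rnorm(60),
  Category = gl(3, 20, labels = c("early", "mid", "late"))
)

p1 <- xyplot(Pro ~ Glc | Category, data = fake, layout = c(1, 3), aspect = 1)
p2 <- xyplot(Pro ~ Raf | Category, data = fake, layout = c(1, 3), aspect = 1)
p3 <- xyplot(Pro ~ Suc | Category, data = fake, layout = c(1, 3), aspect = 1)

out_file <- file.path(tempdir(), "three_columns.pdf")
pdf(out_file, width = 12, height = 10)
# split = c(x, y, nx, ny): position (x, y) in a grid of nx columns, ny rows
print(p1, split = c(1, 1, 3, 1), more = TRUE)
print(p2, split = c(2, 1, 3, 1), more = TRUE)
print(p3, split = c(3, 1, 3, 1), more = FALSE)  # last plot closes the page
dev.off()
```

With many plots, the three print() calls can be put in a loop in which `more` is FALSE for every third plot, so that each page holds three columns.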
[R] Change positions of columns in data frame
Hi all,

Probably a simple question, but I just can't find a simple answer in the older threads or anywhere else. I've added some new vectors as columns to a data frame using cbind(). As they all end up as the last columns in the data frame, I would like to move them to specific positions. How do you change the position of a column in a data frame?

I know I can use

    fieldTrial0809=data.frame(Sample_ID=as.factor(fieldTrial0809$Sample_ID), Plant_ID=as.factor(fieldTrial0809$Plant_ID), ...)

to create a new data frame with the given columns in the specified order, but there must be an easier way..?

All the best,
Joel
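Column order can be changed by indexing the data frame with the column names (or numbers) in the desired order. The sketch below shows that, plus a small helper for moving a single column; `move_col` is a hypothetical name introduced here, not an existing R function, and the data are made up.

```r
# Sketch: reorder columns by indexing with names.
df <- data.frame(
  Sample_ID = 1:3,
  Weight    = c(76.1, 77.0, 78.1),
  Age_days  = c(106, 175, 121)
)

# A new column added with cbind() ends up last
df <- cbind(df, Plant_ID = c("a", "b", "c"))

# Put the columns in the wanted order
df <- df[, c("Sample_ID", "Plant_ID", "Weight", "Age_days")]

# Hypothetical helper: move one column to a given position
move_col <- function(d, name, pos) {
  others <- setdiff(names(d), name)
  d[, append(others, name, after = pos - 1), drop = FALSE]
}

df2 <- move_col(df, "Age_days", 1)
names(df2)
# "Age_days" "Sample_ID" "Plant_ID" "Weight"
```

Indexing by name keeps working when the number of columns changes, which is harder to guarantee with numeric positions.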
[R] Automatization of non-linear regression
Hi everybody,

I'm using the method described here to make a non-linear regression: http://www.apsnet.org/education/advancedplantpath/topics/Rmodules/Doc1/05_Nonlinear_regression.html

    ## Input the data that include the variables time, plant ID, and severity
    time <- c(seq(0,10),seq(0,10),seq(0,10))
    plant <- c(rep(1,11),rep(2,11),rep(3,11))

    ## Severity represents the number of
    ## lesions on the leaf surface, standardized
    ## as a proportion of the maximum
    severity <- c(
      42,51,59,64,76,93,106,125,149,171,199,
      40,49,58,72,84,103,122,138,162,187,209,
      41,49,57,71,89,112,146,174,218,250,288)/288

    data1 <- data.frame(cbind(time, plant, severity))

    ## Plot severity versus time
    ## to see the relationship between
    ## the two variables for each plant
    plot(data1$time, data1$severity, xlab="Time", ylab="Severity", type="n")
    text(data1$time, data1$severity, data1$plant)
    title(main="Graph of severity vs time")

    getInitial(severity ~ SSlogis(time, alpha, xmid, scale), data = data1)
        alpha      xmid     scale
     2.212468 12.506960  4.572391

    ## Using the initial parameters above,
    ## fit the data with a logistic curve.
    para0.st <- c(
      alpha=2.212,
      beta=12.507/4.572, # beta in our model is xmid/scale
      gamma=1/4.572      # gamma (or r) is 1/scale
    )
    fit0 <- nls(severity~alpha/(1+exp(beta-gamma*time)), data1, start=para0.st, trace=T)
    0.1621433 : 2.212 2.7355643 0.2187227
    0.1621427 : 2.2124095 2.7352979 0.2187056

    ## Plot to see how the model fits the data; plot the
    ## logistic curve on a scatter plot
    plot(data1$time, data1$severity, type="n")
    text(data1$time, data1$severity, data1$plant)
    title(main="Graph of severity vs time")
    curve(2.21/(1+exp(2.74-0.22*x)), from=time[1], to=time[11], add=TRUE)

As you can see, I have to do some work manually, such as entering the numbers used to calculate alpha, beta and gamma. I wonder if you might have an idea how to automate this? I suppose it should be possible to save the output from getInitial() and reach the elements via an index or something, but how? I guess a similar approach could be used for the values of fit0?

Or even better, the variables alpha, beta and gamma could be used right away, for instance in curve(), instead of adding the values manually. But just exchanging the values for the variables (alpha instead of 2.21 etc.) doesn't seem to work. What is the reason for that? Any solution?

A last, general but somewhat related question. If I set variables in a function such as para0.st <- c(alpha=2.212, ...), are they just stored locally, or can they be used globally? I mean, can I use the variable anywhere (for instance in curve()) or just in the function where it was created? I'm asking because I'm used to Java, where the lifetime of local variables only extends to the closing brace, while global variables can be reached everywhere.

The reason for automating is that I'll have to repeat the procedure more than a hundred times, while making overview pairwise plots of my data, with both this logarithmic regression and several others (exponential, monomolecular, logistic, Gompertz and Weibull).

Wish you all the best,
Joel
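The manual steps can probably be removed by storing the results in objects: getInitial() returns a named numeric vector, and coef() extracts the estimates from an nls fit in the same form. In the original code, alpha, beta and gamma are names of elements inside para0.st and of parameters in the model formula, not variables in the workspace, which would explain why curve() cannot find them. The sketch below repeats the fit from the post with no hand-typed numbers; the object names `init`, `start_vals`, `est` and `grid` are my own. On scope: objects assigned at the top level of a script are global, while objects assigned inside a function body are local to that call unless they are returned.

```r
# Sketch: logistic fit without copying numbers by hand.
time <- rep(0:10, times = 3)
plant <- rep(1:3, each = 11)
severity <- c(
  42, 51, 59, 64, 76, 93, 106, 125, 149, 171, 199,
  40, 49, 58, 72, 84, 103, 122, 138, 162, 187, 209,
  41, 49, 57, 71, 89, 112, 146, 174, 218, 250, 288
) / 288
data1 <- data.frame(time, plant, severity)

# Named vector with elements alpha, xmid and scale
init <- getInitial(severity ~ SSlogis(time, alpha, xmid, scale), data = data1)

# Convert to the parametrisation used in the model formula
start_vals <- c(
  alpha = init[["alpha"]],
  beta  = init[["xmid"]] / init[["scale"]],  # beta is xmid/scale
  gamma = 1 / init[["scale"]]                # gamma is 1/scale
)

fit0 <- nls(severity ~ alpha / (1 + exp(beta - gamma * time)),
            data = data1, start = start_vals)

# Named vector of fitted parameters, e.g. est[["alpha"]]
est <- coef(fit0)

# Draw the fitted curve from the model object itself
grid <- data.frame(time = seq(0, 10, length.out = 101))
plot(data1$time, data1$severity, type = "n", xlab = "Time", ylab = "Severity")
text(data1$time, data1$severity, data1$plant)
lines(grid$time, predict(fit0, newdata = grid))
title(main = "Graph of severity vs time")
```

Using predict() instead of curve() means the same plotting lines work for any of the other models, since only the nls() call changes. For a hundred repetitions the block from `init` onwards could be wrapped in a function that takes the data frame and returns the fit.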
[R] Division of data frame and deletion of values from column
Hi all,

I guess this might be an easy question, but I've searched multiple help pages without finding any answer... so now I put my trust in you!

I have a data frame (36 variables and 556 observations). One column is a factor with three levels, and I would like to divide the data frame into three new ones based on the value of that factor, so that each of the new data frames has only one value in that column. The reason is that I will later create plots and do statistical analyses on these data frames, and I don't want the factor to affect the result.

      ID Weight Age_days ...
    1 18   76.1      106
    2 19   77.0      175
    3 20   78.1      121
    4 21   78.2      121
    5 22   78.8      106
    6 23   76.3      106
    . . .

I also have another column containing several factor levels, of which I would like to exclude one (get NA instead).

      ID Weight Age_days Value_ID ...
    1 18   76.1      106     high
    2 19   77.0      175      low
    3 20   78.1      121   middle
    4 21   78.2      121     high
    5 22   78.8      106     high
    6 23   76.3      106   number  <-- exclude
    7 24   76.9      175      low
    . . .

I really hope someone can help me, though you might think it's too easy...

Best regards,
Joel
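Both steps can be done with base R. The sketch below rebuilds the seven example rows and assumes, since the post does not say, that Age_days is the column with three levels. split() returns a list with one data frame per level, and logical indexing sets the unwanted level to NA.

```r
# Sketch: set one factor level to NA, then split by another factor.
d <- data.frame(
  ID       = 18:24,
  Weight   = c(76.1, 77.0, 78.1, 78.2, 78.8, 76.3, 76.9),
  Age_days = factor(c(106, 175, 121, 121, 106, 106, 175)),
  Value_ID = factor(c("high", "low", "middle", "high", "high", "number", "low"))
)

# Replace the level "number" by NA and drop the now unused level
d$Value_ID[d$Value_ID == "number"] <- NA
d$Value_ID <- droplevels(d$Value_ID)

# One data frame per level of Age_days (assumed grouping column)
parts <- split(d, d$Age_days)

names(parts)   # "106" "121" "175"
parts[["106"]] # rows with Age_days == 106
```

The pieces can then be processed in turn with lapply(), for example `lapply(parts, function(p) summary(p$Weight))`, instead of keeping three separately named data frames.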
[R] Plot overview xy plots from data frame?
Hi,

I've got a data frame (556 rows and 36 columns) from which I need to create several xy plots and print them to pdf, in order to detect outliers and trends in the data. 16 of the columns contain numerical values, and I would like to create graphs for all combinations. It can be done manually, but creating 256 plots by hand takes time...

I guess I have to iterate through the data frame, but I'm not used to doing that in R. Below I've written down my thoughts, trying to combine my knowledge of Java and R, just to give you the idea:

    pdf("FieldTrial0809Overview.pdf")
    int colWidth = fieldTrial[0].length;
    for(i=0, i<colWidth, i++) {
      for(j=0, j<colWidth, j++) {
        String colI=get.fieldTrial$i;
        String colJ=get.fieldTrial$j;
        plot(fieldTrial$i~fieldTrial$j, main=colI + " vs " + colJ, xlab=colI, ylab=colJ)
      }
    }
    dev.off()

Does anyone know how to solve this? Do I have to copy the 16 numerical columns to a new dataset? They are not grouped, and there are 20 additional non-numerical columns in the data frame. By the way, can the iterations be made using R, or do you have to combine it with, for instance, Perl?

All the best,
Joel
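The loop can be written in R alone, and the numeric columns do not need to be copied by hand: `sapply(df, is.numeric)` finds them wherever they sit. A minimal sketch with an invented data frame follows; `fieldTrial` here is a small stand-in, not the real data, and plots of a column against itself are skipped.

```r
# Sketch: one scatter plot per pair of numeric columns, all in one pdf.
set.seed(11)
fieldTrial <- data.frame(
  Sample = letters[1:20],
  Pro = rnorm(20), Glc = rnorm(20), Suc = rnorm(20),
  stringsAsFactors = FALSE
)

# Names of the numeric columns, regardless of their position
num_cols <- names(fieldTrial)[sapply(fieldTrial, is.numeric)]

out_file <- file.path(tempdir(), "FieldTrialOverview.pdf")
pdf(out_file)
for (yname in num_cols) {
  for (xname in num_cols) {
    if (xname == yname) next  # skip a column plotted against itself
    plot(fieldTrial[[xname]], fieldTrial[[yname]],
         main = paste(yname, "vs", xname),
         xlab = xname, ylab = yname)
  }
}
dev.off()
```

For a quick overview on a single page, `pairs(fieldTrial[num_cols])` draws the whole scatter plot matrix in one call.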