Re: [R] How to use bash command in R script?
It's a great way. You told me a lot of things that I needed to ask. Thank you very much!

Best wishes,
Wei-Wei

2007/2/27, Peter Dalgaard [EMAIL PROTECTED]:
> Guo Wei-Wei wrote:
>> Thank you all! I solved my problem with your help.
>
> Come to think of it, it might be more to the point to use scan() on a pipe():
>
>     con <- pipe("mxresult.sh ABC.mx", "r")
>     mynum <- scan(con)
>     close(con)
>
> --
> Peter Dalgaard               Øster Farimagsgade 5, Entr.B
> Dept. of Biostatistics       PO Box 2099, 1014 Cph. K
> University of Copenhagen     Denmark      Ph:  (+45) 35327918
> ([EMAIL PROTECTED])                       FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
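The scan()-on-a-pipe idea quoted above can be tried without the poster's script. The following is only a sketch: `echo 126.128 29` is a stand-in (an assumption) for `mxresult.sh ABC.mx`, which is not available here, and the variable names `cmd`, `mynum` and `mynum2` are illustrative.

```r
# Stand-in for the poster's script: `echo` prints two numbers,
# as mxresult.sh is reported to do. Replace with the real command.
cmd <- "echo 126.128 29"

# Variant 1: read the command's standard output through a pipe connection.
con <- pipe(cmd, "r")
mynum <- scan(con, quiet = TRUE)
close(con)

# Variant 2: capture the output lines as character strings, then parse them.
out <- system(cmd, intern = TRUE)
mynum2 <- scan(text = out, quiet = TRUE)
```

Both variants should leave a numeric vector of length two; the pipe version avoids the intermediate character vector, while `system(..., intern = TRUE)` is handy when the raw text is also needed.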
[R] How to use bash command in R script?
Dear All:

Maybe this is too basic a question, but I don't know how to find the answer. Sorry for that.

What I want to do is call a shell command, which will provide two numbers, and assign those numbers to a vector. For example, the following command:

    $ mxresult.sh ABC.mx

mxresult.sh is a script written by myself and ABC.mx is an Mx script. I can get two numbers, 126.128 and 29, with this command. Is there any way to do it like this:

    c <- somefunction("mxresult.sh ABC.mx")

Or is there any other way to do this?

Thanks in advance!

Best wishes,
Wei-Wei
Re: [R] How to use bash command in R script?
Thank you all! I solved my problem with your help.

Best wishes,
Wei-Wei
Re: [R] Is there a better way for inputing data manually?
Thank you, Duncan and Michael. Your information is all very helpful to me.

Wei-Wei
Re: [R] Question on Chi-square of null model in sem package
Dear Prof. Fox,

Sorry, I didn't receive your reply. I tried the new sem package. It's great. The following are the results that I got. The fit indices are fine.

    Model Chisquare = 208   Df = 98   Pr(>Chisq) = 6.6e-10
    Chisquare (null model) = 1741   Df = 120
    Goodness-of-fit index = 0.9
    Adjusted goodness-of-fit index = 0.87
    RMSEA index = 0.066   90 % CI: (0.054, 0.079)
    Bentler-Bonnett NFI = 0.88
    Tucker-Lewis NNFI = 0.92
    Bentler CFI = 0.93
    BIC = -336

Thank you very much. You have helped me with so many problems.

Best wishes,
Wei-Wei

2006/9/4, John Fox [EMAIL PROTECTED]:
> Dear Wei-Wei,
>
> As I explained to you in private email yesterday (perhaps you didn't receive my reply?), the problem that you point out is due to a bug in the sem function that I fixed some time ago and then inadvertently reintroduced. Yesterday, I sent a corrected version of the sem package (0.9-5) to CRAN; the source package is there now and I'm sure that the compiled Windows package will appear in due course.
>
> Thank you once more for bringing the problem to my attention.
>
> John
[R] Question on Chi-square of null model in sem package
Dear all,

I ran into a problem while doing SEM with the sem package: I got a negative chi-square for the null model. Because the theoretical value of chi-square cannot be negative, I checked the source code of sem.R in the sem package and found that the chi-square of the null model was computed by the following expression:

    result$chisqNull <- (N - 1) * (sum(diag(S %*% diag(1/diag(S)))) + log(prod(diag(S))))

I think the reason for the negative chi-square is the very small value of prod(diag(S)) for my data. I'm working on a data.frame named emc.data from a sample of a 16-item questionnaire. The variances of the items are

    diag(cov(emc.data))
         EMC1      EMC2      EMC3      EMC4      EMC5      EMC6      EMC7      EMC8
        0.364 0.2350041 0.2488009 0.2901653 0.3195399 0.3107343 0.3436622 0.2345912
         EMC9     EMC10     EMC11     EMC12     EMC13     EMC14     EMC15     EMC16
    0.2621680 0.3230400 0.4039245 0.3803105 0.2773370 0.4348342 0.2757216 0.3405252

The fit indices RMSEA and GFI are good, so I think the problem might be solved by another way of computing the chi-square of the null model. I'm not well trained in maths, so I have come for help. Any advice is appreciated.

Best wishes,
Wei-Wei
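To see why an expression of the quoted form can turn negative, it may help to compare it with the maximum-likelihood discrepancy for the independence (null) model, log(prod(diag(S))) - log(det(S)), which is non-negative by Hadamard's inequality. The sketch below uses simulated data as an assumption (it is not the poster's emc.data), and it is not the sem package's own code; the names `quoted` and `ml_null` are illustrative.

```r
set.seed(1)
# Hypothetical data: 200 cases on 16 items with variances near 0.3,
# roughly the magnitude reported in the post.
X <- matrix(rnorm(200 * 16, sd = 0.55), nrow = 200, ncol = 16)
S <- cov(X)
N <- nrow(X)

# Expression as quoted in the post: trace term (equal to the number of
# items) plus the log of the product of the variances.
quoted <- (N - 1) * (sum(diag(S %*% diag(1 / diag(S)))) + log(prod(diag(S))))

# ML discrepancy for the independence model: log|diag(S)| - log|S|.
ml_null <- (N - 1) * (log(prod(diag(S))) - log(det(S)))
```

With variances well below 1, the log of their product is strongly negative and outweighs the trace term, so `quoted` drops below zero, whereas `ml_null` cannot.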
[R] Question on partial effect
Dear all,

I don't know what my question is called. I have a performance variable A, such as sales, and another variable B, let's say the time since the firm was established. I want to create a third variable that is sales without the effect of establishment time. Maybe it can be called a partial effect problem. I'm not sure.

Does anyone have any suggestion? Thank you in advance!

All the best,
Wei-Wei
Re: [R] Question on partial effect
Thank you, Gavin. I think that might be what I need. But I'm wondering a little what the scale of resid(mod) is. Is it scale(dist)/scale(speed), for example kilometer / (kilometer per hour), or something else?

Thank you very much!
Wei-Wei

2006/7/12, Gavin Simpson [EMAIL PROTECTED]:
> On Tue, 2006-07-11 at 23:51 +0800, Guo Wei-Wei wrote:
>> Dear all,
>>
>> I don't know what's my question is called. I have a performance variable A, such as sales. And I have another variable B, let's say establish time of firm. I want to create the third variable that is sales without the effect of establish time. Maybe it can be called partial effect problem. I'm not sure.
>>
>> Does anyone have any suggestion? Thank you in advance!
>>
>> All the best,
>> Wei-Wei
>
> Do you mean?
>
>     ## dummy data
>     A <- rnorm(100)
>     B <- rnorm(100)
>     C <- resid(lm(A ~ B))
>
> C now contains the residual variation in A after fitting B.
>
> e.g. with some real data
>
>     ?cars
>     data(cars) # not sure this is needed now, I forget
>     mod <- lm(dist ~ speed, data = cars)
>     summary(mod)
>     partial <- resid(mod)
>     ## check
>     mod2 <- lm(dist ~ partial, data = cars)
>     summary(mod2)
>     ## from the two R^2 from mod1 and mod2 - partial contains dist minus
>     ## the effects of speed
>     0.6511 + 0.3489
>     [1] 1
>
> HTH
>
> G
> --
> Gavin Simpson                 [t] +44 (0)20 7679 0522
> ECRC ENSIS, UCL Geography,    [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/cv/
> London, UK. WC1E 6BT.         [w] http://www.ucl.ac.uk/~ucfagls/
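The question about the scale of the residuals can be checked numerically. The sketch below, using the built-in cars data from the quoted reply, suggests that the residuals are simply observed minus fitted dist (so they carry the units of dist), that they are uncorrelated with speed, and that the two R^2 values add up to 1. The names `same_units`, `r_with_speed` and `r2_sum` are illustrative only.

```r
mod <- lm(dist ~ speed, data = cars)
partial <- resid(mod)

# Residuals are observed dist minus fitted dist, hence in the units of dist.
same_units <- all.equal(unname(partial), cars$dist - unname(fitted(mod)))

# After removing the linear effect of speed, nothing linear in speed remains.
r_with_speed <- cor(partial, cars$speed)

# R^2 of dist on speed plus R^2 of dist on the residuals.
mod2 <- lm(dist ~ partial, data = cars)
r2_sum <- summary(mod)$r.squared + summary(mod2)$r.squared
```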
Re: [R] Question on partial effect
Thank you, Gavin. You have helped me with a lot of problems. Thank you very much!

Wei-Wei

2006/7/12, Gavin Simpson [EMAIL PROTECTED]:
> On Wed, 2006-07-12 at 00:51 +0800, Guo Wei-Wei wrote:
>> Thank you, Gavin. I think that might be what I need. But I'm a little bit wandering what's the scale of resid(mod). Is it scale(dist)/scale(speed), for example kilometer / (kilometer per hour)? or something else?
>>
>> Thank you very much!
>> Wei-Wei
>
> The scale of dist - they are just the differences between observed dist and fitted dist (based on speed).
>
>     mod <- lm(dist ~ speed, data = cars)
>     resid(mod)
>             1         2         3         4         5
>      3.849460 11.849460 -5.947766 12.052234  2.119825
>             6         7         8         9        10
>     -7.812584 -3.744993  4.255007 12.255007 -8.677401
>     # visualise the residuals
>     plot(resid(mod) ~ dist, data = cars)
>     abline(h = 0, col = "grey")
>     ## length of blue line represents the residual
>     lines(cars$dist, resid(mod), type = "h", col = "blue")
>
> So you see that for the 1st residual it is 3.849 ft (the distances are measured in feet, see ?cars)
>
> Does this help?
>
> G
> --
> Gavin Simpson                 [t] +44 (0)20 7679 0522
> ECRC ENSIS, UCL Geography,    [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/cv/
> London, UK. WC1E 6BT.         [w] http://www.ucl.ac.uk/~ucfagls/
Re: [R] A possible too old question on significant test of correlation matrix
Hi, Gavin,

Your program is excellent. Thank you very much! I have two further questions.

1. Since the data may well contain missing values, and the program fails on missing values, I have to delete all the cases that contain NA. Can the deletion be done pairwise instead?

2. Can the program show t values instead of p values?

Best regards,
Wei-Wei

2006/7/10, Gavin Simpson [EMAIL PROTECTED]:
> On Mon, 2006-07-10 at 13:27 +0800, Guo Wei-Wei wrote:
>> Dear all,
>>
>> I'm working on a data.frame named en.data, which has n cases and m columns. I generate the correlation matrix of en.data by
>>
>>     cor(en.data)
>>
>> I find that there is no p-value on each correlation in the correlation matrix. I searched in the R-help mail list and found some related posts, but I didn't find direct way to solve the problem. Someone said to use cor.test() or t.test(). The problem is that cor.test() and t.test() can only apply on two vectors, not on a data.frame or a matrix.
>>
>> My solution is
>>
>>     for (i in 1:(ncol(en.data) -1)) {
>>         cor.test(en.data[,i], en.data[, i+1])
>>     }
>>
>> I think it is a stupid way. Is there a direct way to do so? After all, it is a basic function to generate significant level of a correlation in a correlation matrix.
>>
>> Thank you in advance!
>> Wei-Wei
>
> Hi,
>
> Bill Venables posted a solution to this on the R-Help list in Jan 2000. I made a minor modification to add a class to the result and wrote a print method (which could probably do with some tidying but it works).
>
> E.g.:
>
>     # paste in the functions below, then
>     data(iris)
>     corProb(iris[,1:4])
>
>     ## prints
>     Correlations are shown below the diagonal
>     P-values are shown above the diagonal
>
>                  Sepal.Length Sepal.Width Petal.Length Petal.Width
>     Sepal.Length       1.0000      0.1519       0.0000      0.0000
>     Sepal.Width       -0.1176      1.0000       0.0000      0.0000
>     Petal.Length       0.8718     -0.4284       1.0000      0.0000
>     Petal.Width        0.8179     -0.3661       0.9629      1.0000
>
> Is this what you want?
>
> HTH
>
> G
>
>     # correlation function
>     # based on post by Bill Venables on R-Help
>     # Date: Tue, 04 Jan 2000 15:05:39 +1000
>     # https://stat.ethz.ch/pipermail/r-help/2000-January/009758.html
>     # modified by G L Simpson, September 2003
>     # version 0.2: added print.cor.prob
>     #              added class statement to cor.prob
>     # version 0.1: original function of Bill Venables
>     corProb <- function(X, dfr = nrow(X) - 2) {
>         R <- cor(X)
>         above <- row(R) < col(R)
>         r2 <- R[above]^2
>         Fstat <- r2 * dfr / (1 - r2)
>         R[above] <- 1 - pf(Fstat, 1, dfr)
>         class(R) <- "corProb"
>         R
>     }
>
>     print.corProb <- function(x, digits = getOption("digits"), quote = FALSE,
>                               na.print = "", justify = "none", ...) {
>         xx <- format(unclass(round(x, digits = 4)), digits = digits,
>                      justify = justify)
>         if (any(ina <- is.na(x)))
>             xx[ina] <- na.print
>         cat("\nCorrelations are shown below the diagonal\n")
>         cat("P-values are shown above the diagonal\n\n")
>         print(xx, quote = quote, ...)
>         invisible(x)
>     }
>
> --
> *Note new Address and Fax and Telephone numbers from 10th April 2006*
> Gavin Simpson                 [t] +44 (0)20 7679 0522
> ECRC                          [f] +44 (0)20 7679 0565
> UCL Department of Geography
> Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street
> London, UK                    [w] http://www.ucl.ac.uk/~ucfagls/cv/
> WC1E 6BT                      [w] http://www.ucl.ac.uk/~ucfagls/
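The two follow-up questions (pairwise deletion and t values) could be approached along the following lines. This is only a sketch and not part of the posted corProb(): the function name `corT` is hypothetical, and it assumes that each pair of columns should use its own complete cases, with t = r * sqrt(df / (1 - r^2)) and df = n - 2 computed per pair.

```r
# Hypothetical helper: correlations below the diagonal, t statistics above,
# with missing values handled pairwise.
corT <- function(X) {
    X <- as.matrix(X)
    R <- cor(X, use = "pairwise.complete.obs")
    ok <- ifelse(is.na(X), 0, 1)   # 1 where a value is present
    n <- crossprod(ok)             # complete cases available for each pair
    above <- row(R) < col(R)
    r <- R[above]
    dfr <- n[above] - 2            # per-pair degrees of freedom
    R[above] <- r * sqrt(dfr / (1 - r^2))
    R
}

# Example: iris with a few values set to missing in the first column.
dat <- iris[, 1:4]
dat[1:5, 1] <- NA
res <- corT(dat)
```

The entry res[1, 2] should agree with the t statistic that cor.test() reports for the first two columns, since cor.test() also drops incomplete pairs.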
Re: [R] A possible too old question on significant test of correlationmatrix
Thank you, Dimitris. The function rcor.test() is very nice. It can pass arguments to cor() and so solves my first problem of pairwise case deletion.

Best regards,
Wei-Wei

2006/7/10, Dimitris Rizopoulos [EMAIL PROTECTED]:
> you can use function rcor.test() from package 'ltm', e.g.,
>
>     help(rcor.test, package = "ltm")
>     ###
>     library(ltm)
>     dat <- data.frame(matrix(rnorm(1000), 100, 10))
>     rcor.test(dat)
>     rcor.test(dat, method = "kendall")
>     rcor.test(dat, method = "spearman")
>
> I hope it helps.
>
> Best,
> Dimitris
>
> Dimitris Rizopoulos
> Ph.D. Student
> Biostatistical Centre
> School of Public Health
> Catholic University of Leuven
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/(0)16/336899
> Fax: +32/(0)16/337015
> Web: http://med.kuleuven.be/biostat/
>      http://www.student.kuleuven.be/~m0390867/dimitris.htm
[R] A possible too old question on significant test of correlation matrix
Dear all,

I'm working on a data.frame named en.data, which has n cases and m columns. I generate the correlation matrix of en.data by

    cor(en.data)

I find that there is no p-value for each correlation in the correlation matrix. I searched the R-help mailing list and found some related posts, but I didn't find a direct way to solve the problem. Someone said to use cor.test() or t.test(). The problem is that cor.test() and t.test() can only be applied to two vectors, not to a data.frame or a matrix.

My solution is

    for (i in 1:(ncol(en.data) -1)) {
        cor.test(en.data[,i], en.data[, i+1])
    }

I think it is a stupid way. Is there a direct way to do so? After all, generating the significance level of each correlation in a correlation matrix is a basic task.

Thank you in advance!
Wei-Wei
Re: [R] Problems on testing moderating effect (or interactive effect).
Thank you Jonathan,

Can I use the variance-covariance matrix as the input data, just as SEM does? My mentor told me to avoid separating the operation into two steps, that is, getting the factors' means first and then testing the relationships.

I'm used to using the sem package and I'm not familiar with lm(). I tried

    summary(lm(B ~ A*C))

and failed to get any result. Can sem deal with mediation? And could you tell me the command for generating an interaction term of A (n x p) and C (m x p)?

And you gave a nice reference. Thank you very much!

2006/7/4, Jonathan Baron [EMAIL PROTECTED]:
> On 07/04/06 11:38, Guo Wei-Wei wrote:
>> Hi everyone,
>>
>> I want to do test on moderating effect. I have three factors, A, B, and C. A has influence on B, and C moderating the influence. The relationship looks like this:
>>
>>     A --> B
>>        ^
>>        |
>>        C
>>
>> A, B, and C are all scale variables. I think I can test the moderating effect by adding a interactive variable between A and C. But I'm not sure how to do. Is there a default way to do it in package sem?
>>
>> I'm also thinking about create a interaction variable of A and C, but I don't know how to it. A has n (n = 27) items and p (p = 288) cases and C has m (m = 16) iterms and p (p = 288) cases.
>
> Moderation is usually tested with an interaction. You would use lm() not sem. For example,
>
>     summary(lm(B ~ A*C))
>
> which will report the main effects of A and C as well as their interaction. (Of course, main effects may be meaningless if there is an interaction.) See the help page for formula.
>
> So far I'm assuming that you are interested in individual differences (cases). So A, B, and C would be the means of each case. If, for example, A is actually a matrix in which each row is a case, you would use something like rowMeans(A), etc., for each variable, so you could say
>
>     summary(lm(rowMeans(B) ~ rowMeans(A)*rowMeans(C)))
>
> (or else compute each of these first).
>
> However, you may be interested in moderation WITHIN cases, across items. If you look up moderation on Google, you find
>
>     http://davidakenny.net/cm/moderation.htm
>
> which cites
>
> Judd, C. M., Kenny, D. A., McClelland, G. H. (2001). Estimating and testing mediation and moderation in within-participant designs. Psychological Methods, 6, 115-134.
>
> I have not read this article, but other articles by the same authors are both clear and well reasoned.
>
> --
> Jonathan Baron, Professor of Psychology, University of Pennsylvania
> Home page: http://www.sas.upenn.edu/~baron
> Editor: Judgment and Decision Making (http://journal.sjdm.org)
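The rowMeans()-then-lm() approach described in the quoted reply can be sketched with simulated data. Everything below is an assumption made for illustration: the item matrices are random (288 cases in rows, 27 and 16 items in columns, matching the sizes in the question), the coefficients used to generate B are arbitrary, and names such as `A_items`, `C_items` and `AC` are hypothetical.

```r
set.seed(42)
p <- 288                                   # number of cases
A_items <- matrix(rnorm(p * 27), p, 27)    # hypothetical item scores for A
C_items <- matrix(rnorm(p * 16), p, 16)    # hypothetical item scores for C

# One score per case: the mean over items (na.rm skips missing items).
A <- rowMeans(A_items, na.rm = TRUE)
C <- rowMeans(C_items, na.rm = TRUE)

# Simulated outcome with a built-in A-by-C interaction.
B <- 0.5 * A + 0.3 * C + 0.8 * A * C + rnorm(p, sd = 0.1)

# A*C in a formula expands to A + C + A:C; the A:C row tests moderation.
fit <- lm(B ~ A * C)
coefs <- coef(summary(fit))

# Equivalent fit with the interaction variable created by hand.
AC <- A * C
fit2 <- lm(B ~ A + C + AC)
```

Creating `AC` explicitly answers the question of how to generate an interaction variable, but the formula interface gives the same estimates without the extra column.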
Re: [R] Problems on testing moderating effect (or interactive effect).
Thank you Jonathan. I do need to read the R documentation. The extra problem is time.

Thank you for your reply.

2006/7/4, Jonathan Baron [EMAIL PROTECTED]:
> On 07/04/06 19:51, Guo Wei-Wei wrote:
>> Can I use the variance-covariance matrix as the input data? Just like what SEM does. My mentor told me to avoid sperate the operation into two step, that is to get the factors' means first and then to test the relationships.
>>
>> I'm used to use sem package. I'm not familiar with lm(). I trid summary(lm(B ~ A*C)) and failed to get any result. Can sem deal with mediation? And could you tell me the command of generating a interaction item of A (nxp) and C (mxp)?
>
> I don't use sem, but I don't see why you need it for this. This is a simple regression problem, so far as I can tell.
>
> I think you need to do some reading of the R documentation. What are A, B, and C? They should be vectors. That was the point of my comment about rowMeans.
>
> You seem to be guessing and relying on authority instead of trying to understand.
>
> --
> Jonathan Baron, Professor of Psychology, University of Pennsylvania
> Home page: http://www.sas.upenn.edu/~baron
[R] Problems on testing moderating effect (or interactive effect).
Hi everyone,

I want to test a moderating effect. I have three factors, A, B, and C. A has an influence on B, and C moderates the influence. The relationship looks like this:

    A --> B
       ^
       |
       C

A, B, and C are all scale variables. I think I can test the moderating effect by adding an interaction variable between A and C, but I'm not sure how to do it. Is there a default way to do it in the sem package?

I'm also thinking about creating an interaction variable of A and C, but I don't know how to do it. A has n (n = 27) items and p (p = 288) cases, and C has m (m = 16) items and p (p = 288) cases.

Does anyone have any suggestion? Thanks in advance.
[R] aggregate data.frame by one column
Hi, everyone,

I have a data.frame named eva like this:

    IND PARTNO VC1 EO1 EO2 EO3 EO4 EO5
    114 114001   2   5   4   4   5   4
    114 114001   2   4   4   4   4   4
    114 114001   2   4  NA  NA  NA  NA
    112 112002   2   3   3   6   2   6
    112 112002   2   1   1   3   4   4
    112 112003   2   6   6   6   5   6
    112 112003   2   5   7   6   6   6
    112 112003   2   6   6   6   4   5
    114 114004   2   2   3   3   2   4
    114 114004   2   5   3   4   4   2
    114 114004   2  NA  NA  NA  NA  NA
    113 113005   2   5   5   6   6   5
    113 113005   2   7   7   4   7   6
    111 111006   2   5   7   7   7   7
    112 112007   2   7   7   7   2   2
    112 112007   2   6   6   6   1   2
    112 112007   2   7   6   6   2   2
    111 111008   2   4   1   3   1   4
    111 111008   2   3   1   5   3   2

This is only a small part of the whole data. PARTNO is a digit variable and I want to use it as a group variable to aggregate the other variables. What I want to get looks like this:

    IND PARTNO NUM VC1 EO1 EO2 EO3 EO4 EO5
    114 114001   3   2 4.3   4   4 4.5   4
    112 112002   2   2   2   2 4.5   3   5
    112 112003   3   2 5.7 6.3   6   5 5.7
    114 114004   3   2 3.5   3 3.5   3   3
    113 113005   2   2   6   6   5 6.5 5.5
    111 111006   1   2   5   7   7   7   7
    112 112007   3   2 6.7 6.3 6.3 1.7   2
    111 111008   2   2 3.5   1   4   2   3

NUM is a newly added variable which indicates the number of cases in each group defined by PARTNO.

I have two questions about this manipulation. The first is how to get the newly added variable NUM. I have no idea how to do this. The second is how to average the other variables by group. If there are NAs, I want the average to be taken over the remaining cases. For example, the variable EO1 has values of 2, 5, and NA for case 114004.

What I have done is

    aggregate(eva[,-2], by=eva[,-2], mean)

But it seems that, because there are NAs, aggregate cannot process the data. Because the NA values are not a small part of the data, I cannot use imputation methods. I'm not sure whether my operation is right.

Does anyone have any suggestion on the two problems? Thanks in advance!
Re: [R] aggregate data.frame by one column
Hi Andrew,

Thank you very much! It works better than I could have expected.

All the best,
Wei-Wei

2006/6/30, Andrew Robinson [EMAIL PROTECTED]:
> Hi Wei-Wei,
>
> try this:
>
>     eva.agg <- aggregate(x = list(
>                              VC1=eva$VC1,
>                              EO1=eva$EO1,
>                              EO2=eva$EO2,
>                              EO3=eva$EO3,
>                              EO4=eva$EO4,
>                              EO5=eva$EO5
>                          ),
>                          by = list(PARTNO=eva$PARTNO),
>                          FUN = mean, na.rm = TRUE)
>
>     eva.agg$NUM <- aggregate(eva$PARTNO, list(eva$PARTNO), length)$x
>
> Cheers
>
> Andrew
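The group means and the per-group case count can also be built separately and then joined, which keeps NUM as an ordinary numeric column. The following is a sketch on a five-row subset of the posted data with only two of the EO columns; the intermediate names `means` and `counts` are illustrative.

```r
# Small subset of the posted data (first five rows, two EO columns).
eva <- data.frame(
    IND    = c(114, 114, 114, 112, 112),
    PARTNO = c(114001, 114001, 114001, 112002, 112002),
    VC1    = 2,
    EO1    = c(5, 4, 4, 3, 1),
    EO2    = c(4, 4, NA, 3, 1)
)

# Group means, skipping missing values within each group.
means <- aggregate(eva[, c("IND", "VC1", "EO1", "EO2")],
                   by = list(PARTNO = eva$PARTNO),
                   FUN = mean, na.rm = TRUE)

# Number of rows (cases) per group.
counts <- aggregate(list(NUM = eva$PARTNO),
                    by = list(PARTNO = eva$PARTNO),
                    FUN = length)

# Join on the grouping variable; the result is sorted by PARTNO.
eva.agg <- merge(counts, means, by = "PARTNO")
```

Note that `length` counts all rows in a group, including rows whose EO values are NA, which matches the NUM column requested in the original question.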
[R] How to choose columns in data.frame by parts of columns' names?
Dear all,

I have a data.frame which has the following names:

    [1] "XG1"     "YG1"      "XEST"    "YEST"
    [2] "XNOEMP1" "XNOEMP2"  "YNOEMP1" "YNOEMP2"
    [3] "XBUS10"  "XBUS10A"  "XBUS10B" "XBUS10C"
    [4] "YBUS10"  "YBUS10A"  "YBUS10B" "YBUS10C"
    [5] "XOWNBUS" "XSELFEST" "YOWNBUS" "YSELFEST"

Those columns have names beginning with X or Y. Each X is paired with a Y, e.g. XG1 and YG1, but they are not in the order X Y X Y.

I want to combine X* and Y* like this:

    data.new[, "G1"] <- (data.old[, "XG1"] + endata.use[, "YG1"])/2

How can I choose columns by parts of their names? For example, I can pick out XG1 and YG1 because they have the common part G1.

Thank you.
Wei-Wei
Re: [R] How to choose columns in data.frame by parts of columns' names?
Thank you. I made a mistake in my previous email. What I mean is:

    data.new[, "G1"] <- (data.old[, "XG1"] + data.old[, "YG1"])/2

    data.old[, regexpr("G1", colnames(data.old)) > 0]

is a nice way, but there are about 100 X*s and Y*s. Can I do some comparison on all those column names and get the columns with similar parts?

2006/5/31, Gabor Grothendieck [EMAIL PROTECTED]:
> On 5/30/06, Guo Wei-Wei [EMAIL PROTECTED] wrote:
>> Dear all,
>>
>> I have a data.frame which has names as following.
>>
>>     [1] "XG1"     "YG1"      "XEST"    "YEST"
>>     [2] "XNOEMP1" "XNOEMP2"  "YNOEMP1" "YNOEMP2"
>>     [3] "XBUS10"  "XBUS10A"  "XBUS10B" "XBUS10C"
>>     [4] "YBUS10"  "YBUS10A"  "YBUS10B" "YBUS10C"
>>     [5] "XOWNBUS" "XSELFEST" "YOWNBUS" "YSELFEST"
>>
>> Those columns have names beginning with X or Y. Each X is paired by a Y, e.g. XG1 and YG1, but they are not in the order of X Y X Y
>>
>> I want to combine X* and Y* like this:
>>
>>     data.new[, "G1"] <- (data.old[, "XG1"] + endata.use[, "YG1"])/2
>>
>> How to choose columns by parts of names? For example, I can pick out XG1 and YG1 because they have the common part G1.
>
> This gives all columns whose column name contains G1:
>
>     data.old[, regexpr("G1", colnames(data.old)) > 0]
Re: [R] How to choose columns in data.frame by parts of columns' names?
Peter,

Thank you. I made a mistake in my previous email. What I mean is:

    data.new[, "G1"] <- (data.old[, "XG1"] + data.old[, "YG1"])/2

Does your way have an effect on the data, or only on the column names? I tried it on my data and got a list of numbers. Can I rearrange the order of the columns of a data.frame in your way?

2006/5/31, Peter Alspach [EMAIL PROTECTED]:
> Wei-wei
>
>     yourNames
>      [1] "XG1"      "YG1"      "XEST"     "YEST"     "XNOEMP1"  "XNOEMP2"
>      [7] "YNOEMP1"  "YNOEMP2"  "XBUS10"   "XBUS10A"  "XBUS10B"  "XBUS10C"
>     [13] "YBUS10"   "YBUS10A"  "YBUS10B"  "YBUS10C"  "XOWNBUS"  "XSELFEST"
>     [19] "YOWNBUS"  "YSELFEST"
>     yourNames[order(substring(yourNames,2), substring(yourNames, 1,1))]
>      [1] "XBUS10"   "YBUS10"   "XBUS10A"  "YBUS10A"  "XBUS10B"  "YBUS10B"
>      [7] "XBUS10C"  "YBUS10C"  "XEST"     "YEST"     "XG1"      "YG1"
>     [13] "XNOEMP1"  "YNOEMP1"  "XNOEMP2"  "YNOEMP2"  "XOWNBUS"  "YOWNBUS"
>     [19] "XSELFEST" "YSELFEST"
>
> gives an idea of what I mean ...
>
> Peter Alspach
Re: [R] How to choose columns in data.frame by parts of columns' names?
Gabor and Peter,

Thank you. Both of you have given me excellent ways. I have a further problem: how can I get the common parts of the column names as the column names in a new data.frame? For example, I combine the data of XG1 and YG1 in data.old and get a new column in data.new named G1. Can it be done automatically?

    data.new[, "G1"] <- (data.old[, "XG1"] + data.old[, "YG1"])/2

2006/5/31, Gabor Grothendieck [EMAIL PROTECTED]:
> This is not restricted to single matches:
>
>     colnames(iris)
>     [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
>     regexpr("Sepal", colnames(iris)) > 0
>     [1]  TRUE  TRUE FALSE FALSE FALSE
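One way the pairing might be automated is to strip the leading X or Y, treat what remains as the common stem, and average each X/Y pair under that stem. This is only a sketch under the assumption that every stem has exactly one X column and one Y column; the toy `data.old` below and the name `stems` are illustrative, not the poster's data.

```r
# Toy stand-in for data.old with two X/Y pairs, deliberately not adjacent.
data.old <- data.frame(XG1  = c(1, 2),
                       XEST = c(10, 20),
                       YG1  = c(3, 4),
                       YEST = c(30, NA))

# Common parts of the names: everything after the first character.
stems <- unique(substring(colnames(data.old), 2))

# For each stem, average the matching X and Y columns; the list names
# become the column names of the new data.frame.
data.new <- as.data.frame(lapply(setNames(stems, stems), function(s) {
    (data.old[[paste0("X", s)]] + data.old[[paste0("Y", s)]]) / 2
}))
```

With about 100 pairs the same call applies unchanged; a missing value in either member of a pair propagates to the average, as it does in the hand-written `(X + Y)/2` form.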