[R] LDA: normalization of eigenvectors (see SPSS)
Hi dear R-users,

I am trying to reproduce the steps of an LDA. Concerning the eigenvectors, there is a difference from SPSS. My textbook (Bortz) says that the matrix of eigenvectors V is usually not normalized to length 1, but so that the following holds (SPSS does the same thing):

    t(Vstar) %*% Derror %*% Vstar = I

where Vstar are the normalized eigenvectors. Derror is the error (within-groups) sums-of-squares and cross-products matrix: the sums of squares of the p variables are on the diagonal, and the off-diagonal elements are the sums of cross-products. For Derror the following holds: Dtotal = Dtreat + Derror.

Since I assume many of you are familiar with this transformation: can anybody tell me how to carry it out in R? That would be very nice.

Thanks a lot
Cheers
Christoph
--
Christoph Lehmann [EMAIL PROTECTED]
__ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Ordering long vectors
On Sat, 7 Jun 2003, Göran Broström wrote:

I need to order a long vector of integers with rather few unique values. This is very slow:

    x <- sample(rep(c(1:10), 5))
    system.time(ord <- order(x))
    [1] 189.18 0.09 190.48 0.00 0.00

But with no ties

    y <- sample(50)
    system.time(ord1 <- order(y))
    [1] 1.18 0.00 1.18 0.00 0.00

it is very fast! This gave me the following idea: since I don't care about keeping the order within tied values, why not add some small disturbance to x, and indeed,

    unix.time(ord2 <- order(x + runif(length(x), -0.1, 0.1)))
    [1] 1.66 0.00 1.66 0.00 0.00

An even better way is

    system.time(ord3 <- order(x + seq(0, 0.9, length = length(x))))
    [1] 1.32 0.05 1.37 0.00 0.00

Faster, but more importantly, it keeps the original ordering for tied values. Thanks to James Holtman.

Göran [...]
[R] daylight saving time problems
Hello,

sorry for my mail yesterday about the POSIXct problems. I was a bit tired, and now I have found the real problem. When importing time data across a daylight saving time shift, R shifts twice. I don't know whether it is a bug or a (wrongly used) feature.

If you execute the following code:

    test <- c("31/03/2002 0:00", "31/03/2002 0:15", "31/03/2002 0:30", "31/03/2002 0:45",
              "31/03/2002 1:00", "31/03/2002 1:15", "31/03/2002 1:30", "31/03/2002 1:45",
              "31/03/2002 2:00", "31/03/2002 2:15", "31/03/2002 2:30", "31/03/2002 2:45",
              "31/03/2002 3:00", "31/03/2002 3:15", "31/03/2002 3:30", "31/03/2002 3:45",
              "31/03/2002 4:00", "31/03/2002 4:15", "31/03/2002 4:30", "31/03/2002 4:45",
              "31/03/2002 5:00", "31/03/2002 5:15", "31/03/2002 5:30", "31/03/2002 5:45",
              "31/03/2002 6:00");
    timetest <- strptime(as.character(test), format = "%d/%m/%Y %H:%M");
    timetest2 <- as.POSIXct(timetest);

then R 1.7.0 gives on my Mandrake 9.1:

    test
     [1] 31/03/2002 0:00 31/03/2002 0:15 31/03/2002 0:30 31/03/2002 0:45
     [5] 31/03/2002 1:00 31/03/2002 1:15 31/03/2002 1:30 31/03/2002 1:45
     [9] 31/03/2002 2:00 31/03/2002 2:15 31/03/2002 2:30 31/03/2002 2:45
    [13] 31/03/2002 3:00 31/03/2002 3:15 31/03/2002 3:30 31/03/2002 3:45
    [17] 31/03/2002 4:00 31/03/2002 4:15 31/03/2002 4:30 31/03/2002 4:45
    [21] 31/03/2002 5:00 31/03/2002 5:15 31/03/2002 5:30 31/03/2002 5:45
    [25] 31/03/2002 6:00

    timetest
     [1] 2002-03-31 00:00:00 2002-03-31 00:15:00 2002-03-31 00:30:00
     [4] 2002-03-31 00:45:00 2002-03-31 01:00:00 2002-03-31 01:15:00
     [7] 2002-03-31 01:30:00 2002-03-31 01:45:00 2002-03-31 03:00:00
    [10] 2002-03-31 03:15:00 2002-03-31 03:30:00 2002-03-31 03:45:00
    [13] 2002-03-31 03:00:00 2002-03-31 03:15:00 2002-03-31 03:30:00
    [16] 2002-03-31 03:45:00 2002-03-31 04:00:00 2002-03-31 04:15:00
    [19] 2002-03-31 04:30:00 2002-03-31 04:45:00 2002-03-31 05:00:00
    [22] 2002-03-31 05:15:00 2002-03-31 05:30:00 2002-03-31 05:45:00
    [25] 2002-03-31 06:00:00

    timetest2
     [1] 2002-03-31 00:00:00 CET  2002-03-31 00:15:00 CET
     [3] 2002-03-31 00:30:00 CET  2002-03-31 00:45:00 CET
     [5] 2002-03-31 01:00:00 CET  2002-03-31 01:15:00 CET
     [7] 2002-03-31 01:30:00 CET  2002-03-31 01:45:00 CET
     [9] 2002-03-31 03:00:00 CEST 2002-03-31 03:15:00 CEST
    [11] 2002-03-31 03:30:00 CEST 2002-03-31 03:45:00 CEST
    [13] 2002-03-31 03:00:00 CEST 2002-03-31 03:15:00 CEST
    [15] 2002-03-31 03:30:00 CEST 2002-03-31 03:45:00 CEST
    [17] 2002-03-31 04:00:00 CEST 2002-03-31 04:15:00 CEST
    [19] 2002-03-31 04:30:00 CEST 2002-03-31 04:45:00 CEST
    [21] 2002-03-31 05:00:00 CEST 2002-03-31 05:15:00 CEST
    [23] 2002-03-31 05:30:00 CEST 2002-03-31 05:45:00 CEST
    [25] 2002-03-31 06:00:00 CEST

There is a clear time shift between timetest[8] and timetest[9], and another one between timetest[12] and timetest[13]. That is, timetest[9:12] are wrongly converted. In October (the reverse daylight saving shift) there is no shift at all. It seems that this was once a feature that has been badly patched.

I'm using R 1.7.0 on Mandrake Linux in Belgium (CEST?). It does not occur on my Mac OS X box (both Darwin and Carbon versions); I don't know about the Windows version.

Thanks,
Wouter Buytaert
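A note for readers of this thread: the local times 02:00 to 02:45 on 31 March 2002 do not exist in Central European time, so how they are converted depends on the platform's time-zone code. If the recording device ran on a fixed clock without daylight saving (an assumption about the data, not something stated above), one way to sidestep the gap is to parse the strings in a zone that has no DST, such as UTC. A minimal sketch:

```r
# Rebuild the 25 quarter-hourly time stamps from the message above.
test <- sprintf("31/03/2002 %d:%02d",
                rep(0:6, each = 4), c(0L, 15L, 30L, 45L))[1:25]

# Parse in UTC, which has no daylight saving transitions,
# so no clock time is skipped or repeated.
times <- as.POSIXct(test, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Every step should now be exactly 15 minutes.
steps <- as.numeric(diff(times), units = "mins")
print(table(steps))
```

Whether UTC is the right zone depends on how the instrument's clock was set; the point is only that a DST-free zone keeps the series equally spaced.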
Re: [R] LDA: normalization of eigenvectors (see SPSS)
The following satisfies some of your constraints, but I don't know if it satisfies all of them.

Let V = eigenvectors normalized so t(V) %*% V = I. Also, let D.5 = some square root matrix, so t(D.5) %*% D.5 = Derror, and Dm.5 = solve(D.5) = inverse of D.5. The Choleski decomposition (chol) provides one such solution, but you can construct a symmetric square root using eigen. Then Vstar = Dm.5 %*% V will have the property you mentioned below. Consider the following:

    (Derror <- array(c(1,1,1,4), dim=c(2,2)))
         [,1] [,2]
    [1,]    1    1
    [2,]    1    4
    D.5 <- chol(Derror)
    t(D.5) %*% D.5
         [,1] [,2]
    [1,]    1    1
    [2,]    1    4
    (Dm.5 <- solve(D.5))
         [,1]       [,2]
    [1,]    1 -0.5773503
    [2,]    0  0.5773503
    (t(Dm.5) %*% Derror %*% Dm.5)
         [,1] [,2]
    [1,]    1    0
    [2,]    0    1

Thus, t(Vstar) %*% Derror %*% Vstar = t(V) %*% t(Dm.5) %*% Derror %*% Dm.5 %*% V = t(V) %*% V = I.

hope this helps.
spencer graves

Christoph Lehmann wrote:

> Hi dear R-users I try to reproduce the steps included in a LDA. Concerning the eigenvectors there is a difference to SPSS. In my textbook (Bortz) it says, that the matrix with the eigenvectors V usually are not normalized to the length of 1, but in the way that the following holds (SPSS does the same thing): t(Vstar)%*%Derror%*%Vstar = I where Vstar are the normalized eigenvectors. Derror is an error or within squaresum- and crossproduct matrix (squaresum of the p variables on the diagonale, and the non-diagonal elements are the sum of the crossproducts). For Derror the following holds: Dtotal = Dtreat + Derror. Since I assume that many of you are familiar with this transformation: can anybody of you tell me, how to conduct this transformation in R? Would be very nice. Thanks a lot Cheers Christoph
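For archive readers, here is a sketch of how the recipe above can be carried through to the end. The key point is that V has to be orthonormal, which is guaranteed if it is taken from the symmetric matrix t(Dm.5) %*% Dtreat %*% Dm.5 rather than from a non-symmetric product such as Dtreat %*% solve(Derror). Derror is the toy matrix from the message; Dtreat is a hypothetical between-groups matrix invented purely for illustration.

```r
# Toy within-groups (error) SSCP matrix from the message above.
Derror <- matrix(c(1, 1, 1, 4), nrow = 2)
# Hypothetical between-groups SSCP matrix (made up for this sketch).
Dtreat <- matrix(c(2, 1, 1, 3), nrow = 2)

R    <- chol(Derror)   # upper triangular, t(R) %*% R equals Derror
Rinv <- solve(R)       # plays the role of Dm.5

# Symmetric matrix with the same eigenvalues as solve(Derror) %*% Dtreat.
M <- t(Rinv) %*% Dtreat %*% Rinv
e <- eigen(M, symmetric = TRUE)   # e$vectors is orthonormal

# Back-transform: columns are eigenvectors of solve(Derror) %*% Dtreat,
# scaled so that t(Vstar) %*% Derror %*% Vstar is the identity.
Vstar <- Rinv %*% e$vectors

check_norm  <- t(Vstar) %*% Derror %*% Vstar
check_eigen <- max(abs(solve(Derror, Dtreat) %*% Vstar -
                       Vstar %*% diag(e$values)))
print(round(check_norm, 10))
print(check_eigen)
```

The signs of the columns of Vstar are arbitrary, so a comparison with SPSS or textbook output should allow for sign flips.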
[R] RMySQL errors
I have RMySQL installed on my OS X implementation of R, but get the following errors when trying to use it:

    Error in dyn.load(x, as.logical(local), as.logical(now)) :
      unable to load shared library /usr/local/lib/R/library/RMySQL/libs/RMySQL.so:
      dlcompat: dyld: /usr/local/lib/R/bin/R.bin Undefined symbols:
      _getopt_long _load_defaults _mysql_affected_rows _mysql_close _mysql_errno
      _mysql_error _mysql_fetch_fields _mysql_fetch_lengths _mysql_fetch_row
      _mysql_field_count _mysql_free_result _mysql_g
    Error in library(RMySQL) : .First.lib failed

I'm hoping there is an RMySQL guru out there somewhere who can help me out.

TIA, cjf
Re: [R] LDA: normalization of eigenvectors (see SPSS)
Hi, Christoph:

1. I didn't see in your original email that you wanted V to be orthogonal, only that its columns have length 1. You have a solution satisfying the latter constraint, but not the former.

2. I don't have time now to sort out the details, and I don't have them on the top of my head. I just entered lda into R 1.6.2 [after library(MASS)] and got the following:

    lda
    function (x, ...)
    {
        if (is.null(class(x)))
            class(x) <- data.class(x)
        UseMethod("lda", x, ...)
    }

To decode 'UseMethod("lda", ...)', I requested 'methods(lda)' with the following result:

    methods(lda)
    [1] lda.data.frame lda.default lda.formula lda.matrix

Have you tried listing each of these 4 functions and working through them step by step? I think this should answer your question. Also see Venables and Ripley (2002) Modern Applied Statistics with S, index entry for lda.

hth. spencer graves

Christoph Lehmann wrote:

thanks a lot, Spencer

The problem is the following: my textbook has an example with the data X:

    x
       x1 x2 x3
    1   3  3  4
    2   4  4  3
    3   4  4  6
    4   2  5  5
    5   2  4  5
    6   3  4  6
    7   3  4  4
    8   2  5  5
    9   4  3  6
    10  5  5  6
    11  4  5  7
    12  4  6  4
    13  3  6  6
    14  4  7  6
    15  6  5  6

    y
     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
     1  1  1  1  1  1  2  2  2  2  3  3  3  3  3

    Dtot <- (t(x)%*%x-t(xbar)%*%xbar)
    Dtot
          x1    x2    x3
    x1 17.73  2.67  4.87
    x2  2.67 17.33  4.33
    x3  4.87  4.33 16.93

    A <- cbind(tapply(x[,1],y,sum), tapply(x[,2],y,sum), tapply(x[,3],y,sum))
    A
      [,1] [,2] [,3]
    1   18   24   29
    2   14   17   21
    3   21   29   29
    G <- apply(x,2,sum)
    G
    x1 x2 x3
    53 70 79
    p <- ncol(x)
    k <- length(freq)
    N <- sum(freq)
    Dtreat <- array(0,c(p,p))
    k <- length(freq)
    for (i in 1:p)
    {
      for (j in 1:k)
      {
        for (h in 1:k)
        {
          Dtreat[i,j] <- Dtreat[i,j] + A[h,i]*A[h,j]/freq[h]
        }
        Dtreat[i,j] <- Dtreat[i,j] - G[i]*G[j]/N
      }
    }
    Dtreat
         [,1] [,2] [,3]
    [1,] 3.93 5.97 3.17
    [2,] 5.97 9.78 4.78
    [3,] 3.17 4.78 2.55

    Derror <- Dtot-Dtreat
    Derror
         x1    x2       x3
    x1 13.8 -3.30  1.7
    x2 -3.3  7.55 -0.45000
    x3  1.7 -0.45 14.38333

    eigen(Dtreat%*%solve(Derror))
    $values
    [1]  2.300398e+00  2.039672e-02 -1.907034e-15
    $vectors
               [,1]       [,2]       [,3]
    [1,] -0.4870772  0.6813155 -0.6076020
    [2,] -0.7809602 -0.4342229  0.1539928
    [3,] -0.3909693  0.5892874  0.7791701

    V <- eigen(Dtreat%*%solve(Derror))$vectors
    V
               [,1]       [,2]       [,3]
    [1,] -0.4870772  0.6813155 -0.6076020
    [2,] -0.7809602 -0.4342229  0.1539928
    [3,] -0.3909693  0.5892874  0.7791701

The textbook (SPSS) has similar eigenvalues, but only two: lambda1 = 2.30048, lambda2 = 0.02091. But, as I wrote in the last mail, the eigenvectors are different.

Let's start here with your recommendation. First, it seems, since the last eigenvalue is almost 0, that the eigenvectors V are not orthogonal:

    t(V)%*%V
               [,1]        [,2]        [,3]
    [1,]  1.000     -0.22313575 -0.12894473
    [2,] -0.2231357  1.         -0.02168078
    [3,] -0.1289447 -0.02168078  1.

Let's continue anyway?

    D.5 <- chol(Derror)
    t(D.5) %*% D.5
         x1    x2       x3
    x1 13.8 -3.30  1.7
    x2 -3.3  7.55 -0.45000
    x3  1.7 -0.45 14.38333
    Dm.5 <- solve(D.5)
    t(Dm.5) %*% Derror %*% Dm.5
                  x1            x2            x3
    x1  1.00e+00     -2.523481e-17 -1.097755e-18
    x2 -6.625163e-18  1.00e+00     -2.120970e-18
    x3  4.501901e-18  4.460942e-19  1.00e+00

perfectly orthogonal

    t(V)%*%t(Dm.5)%*%Dfehler%*%Dm.5%*%V
               [,1]        [,2]        [,3]
    [1,]  1.000     -0.22313575 -0.12894473
    [2,] -0.2231357  1.         -0.02168078
    [3,] -0.1289447 -0.02168078  1.

Again, this equals t(V)%*%V: not orthogonal.

I think it has to do with the fact that the textbook treats the third eigenvalue as 0 and then gets the Vstar eigenvectors (which I am trying to reproduce):

    Vstar =
           [,1]    [,2]    [,3]
    [1,] 0.1689  0.1419 -0.1825
    [2,] 0.3498 -0.1597  0.0060
    [3,] 0.0625  0.1422  0.2154

Spencer, if you can find a few minutes to help me reproduce this example, it would be very nice (the data are from Jones 1961, who investigated whether essays written by children from the lower, middle and upper classes differ in sentence length, choice of words, and sentence complexity).

Cheers
Christoph

## The following satisfies some of your constraints but I don't know if it satisfies all of them. Let V = eigenvectors normalized so t(V) %*% V = I. Also, let D.5 = some square root matrix, so t(D.5) %*% D.5 = Derror, and Dm.5 = solve(D.5) = inverse of D.5. The Choleski decomposition (chol) provides one such solution, but you can construct a symmetric
Re: [R] Ordering long vectors
On Sun, 8 Jun 2003, Göran Broström wrote:

> On Sat, 7 Jun 2003, Göran Broström wrote:
>
> > I need to order a long vector of integers with rather few unique values. This is very slow:
> >
> >   x <- sample(rep(c(1:10), 5))
> >   system.time(ord <- order(x))
> >   [1] 189.18 0.09 190.48 0.00 0.00
> >
> > But with no ties
> >
> >   y <- sample(50)
> >   system.time(ord1 <- order(y))
> >   [1] 1.18 0.00 1.18 0.00 0.00
> >
> > it is very fast! This gave me the following idea: Since I don't care about keeping the order within tied values, why not add some small disturbance to x,

Another option:

    system.time(a <- sapply(sort(unique(x)), function(i) which(x==i)))

This turns out to be slightly slower than your method, but it doesn't require that you know the smallest difference between values (and it works for characters as well as numbers).

-thomas
Re: [R] Ordering long vectors
On Sat, 7 Jun 2003, Göran Broström wrote:

> I need to order a long vector of integers with rather few unique values. This is very slow:

I think the culprit is src/main/sort.c: orderVector1

    /* Shell sort isn't stable, but it proves to be somewhat faster to run a
       final insertion sort to re-order runs of ties when comparison is cheap. */

This also explains:

    aa <- sample(rep(1:10,5))
    system.time( order(aa, 1:length(aa)))
    [1] 3.67 0.01 3.68 0.00 0.00
    system.time( order(aa))
    ^C
    Timing stopped at: 49.33 0.01 49.34 0 0

which is perhaps the simplest work-around :).

-thomas
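For archive readers, the work-around above written out as a small self-contained sketch. The vector size here is an arbitrary choice for illustration, and the timing problem itself is specific to the R version discussed in the thread, so no timings are shown. Passing the positions as a second key breaks every tie and, as a side effect, keeps tied values in their original order.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
x <- sample(rep(1:10, 50))         # many ties: 10 distinct values, 500 elements

# Second key = original position, so no two elements compare as equal.
ord <- order(x, seq_along(x))

sorted <- x[ord]
# Within each run of tied values the original positions stay increasing.
stable <- all(tapply(ord, sorted, function(i) !is.unsorted(i)))
print(head(sorted, 12))
print(stable)
```

Unlike adding a small numeric disturbance to x, this needs no knowledge of the smallest gap between values and works for character vectors too.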
Re: [R] converting by to a data.frame?
Thanks to Thomas Lumley, Sundar Dorai-Raj, and Don MacQueen for their suggestions. I need the INDICES as part of the output data.frame, which MacQueen's solution provided. I generalized his method as follows:

    by.to.data.frame <- function(x, INDICES, FUN){
      # Split data.frame x on x[,INDICES]
      # and lapply FUN to each data.frame subset,
      # returning a data.frame
      #
      # Internal functions
      get.Index <- function(x, INDICES){
        Ind <- as.character(x[,INDICES[1]])
        k <- length(INDICES)
        if(k > 1)
          Ind <- paste(Ind, get.Index(x, INDICES[-1]), sep=":")
        Ind
      }
      FUN2 <- function(data., INDICES, FUN){
        vec <- FUN(data.)
        Vec <- matrix(vec, nrow=1)
        dimnames(Vec) <- list(NULL, names(vec))
        cbind(data.[1,INDICES], Vec)
      }
      # Combine INDICES
      Ind <- get.Index(x, INDICES)
      # Apply ...: Do the work.
      Split <- split(x, Ind)
      byFits <- lapply(Split, FUN2, INDICES, FUN)
      # Convert to a data.frame
      do.call('rbind',byFits)
    }

Applying this to my toy problem produces the following:

    by.df <- data.frame(A=rep(c("A1", "A2"), each=3),
                        B=rep(c("B1", "B2"), each=3), x=1:6, y=rep(0:1, length=6))
    by.to.data.frame(by.df, c("A", "B"), function(data.)coef(lm(y~x, data.)))
           A  B (Intercept)             x
    A1:B1 A1 B1       0.333 -1.517960e-16
    A2:B2 A2 B2       0.667  3.282015e-16

Thanks for the assistance. I can now tackle the real problem that generated this question.

Best Wishes,
Spencer Graves

Don MacQueen wrote:

Since I don't have your by.df to test with I may not have it exactly right, but something along these lines should work:

    byFits <- lapply(split(by.df, paste(by.df$A, by.df$B)),
                     FUN=function(data.) {
                       tmp <- coef(lm(y~x, data.))
                       data.frame(A=unique(data.$A), B=unique(data.$B),
                                  intercept=tmp[1], slope=tmp[2])
                     })
    byFitsDF <- do.call('rbind', byFits)

That's assuming I've got all the closing parentheses in the right places, since my email software (Eudora) doesn't do R syntax checking!

This approach can get rather slow if by.df is big, or when the computations in FUN are extensive (or both). If by.df$A has mode character (as opposed to being a factor), then replacing A=unique(data.$A) with A=I(unique(data.$A)) might improve performance. You want to avoid character to factor conversions when using an approach like this.

-Don

At 2:54 PM -0700 6/5/03, Spencer Graves wrote:

Dear R-Help:

I want to (a) subset a data.frame by several columns, (b) fit a model to each subset, and (c) store a vector of results from the fit in the columns of a data.frame. In the past, I've used for loops to do this. Is there a way to use by? Consider the following example:

    byFits <- by(by.df, list(A=by.df$A, B=by.df$B),
                 function(data.)coef(lm(y~x, data.)))
    byFits
    A: A1
    B: B1
    (Intercept)             x
       3.33e-01 -1.517960e-16

    A: A2
    B: B1
    NULL

    A: A1
    B: B2
    NULL

    A: A2
    B: B2
    (Intercept)             x
       6.67e-01  3.282015e-16

    # Desired output:
    data.frame(A=c("A1","A2"), B=c("B1", "B2"),
               .Intercept.=c(1/3, 2/3), x=c(-1.5e-16, 3.3e-16))

What's the simplest way to do this?

Thanks,
Spencer Graves
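For archive readers, a compact variant of the split/lapply/rbind idea from this thread, written as a sketch rather than a replacement for either function above. It assumes the grouping columns can be handed to split() as a list, with drop = TRUE discarding the empty A/B combinations that produced the NULL entries in the by() output. The column names intercept and slope follow Don MacQueen's suggestion.

```r
# Toy data from the thread: two groups of three points each.
by.df <- data.frame(A = rep(c("A1", "A2"), each = 3),
                    B = rep(c("B1", "B2"), each = 3),
                    x = 1:6, y = rep(0:1, length.out = 6))

# Split on both columns at once; drop = TRUE removes empty combinations.
pieces <- split(by.df, by.df[c("A", "B")], drop = TRUE)

# Fit one model per piece and return one row per piece, keys included.
fits <- do.call(rbind, lapply(pieces, function(d) {
  cf <- coef(lm(y ~ x, data = d))
  data.frame(A = d$A[1], B = d$B[1],
             intercept = cf[[1]], slope = cf[[2]])
}))
print(fits)
```

Because each piece is reduced to a one-row data.frame before rbind, the grouping values travel with the coefficients, which was the requirement stated at the top of the thread.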
[R] Need help on data frame
Dear Sir/Madam,

I am new to R. I have data corresponding to every day. The problem is that there are some gaps, i.e. no observation could be made on some particular days. I want to place these data into a data frame that covers every calendar day (this changes from year to year: February has 28 or 29 days). Maybe I need to make one column of dates (for each year; call this data frame 'frame1') and then place my data set onto frame1, with the missing entries as NA. Then I want to replace each NA by the mean of the preceding and following entries.

I hope this is possible using R. I will greatly appreciate any help.

Thanks,
Pratibha
Re: [R] Ordering long vectors
On Sun, 8 Jun 2003, Thomas Lumley wrote:

> On Sat, 7 Jun 2003, Göran Broström wrote:
>
> > I need to order a long vector of integers with rather few unique values. This is very slow:
>
> I think the culprit is src/main/sort.c: orderVector1
>
>   /* Shell sort isn't stable, but it proves to be somewhat faster to run a
>      final insertion sort to re-order runs of ties when comparison is cheap. */
>
> This also explains:
>
>   aa <- sample(rep(1:10,5))
>   system.time( order(aa, 1:length(aa)))
>   [1] 3.67 0.01 3.68 0.00 0.00
>   system.time( order(aa))
>   ^C
>   Timing stopped at: 49.33 0.01 49.34 0 0
>
> which is perhaps the simplest work-around :).

Thanks. This is really surprising: it is *much* faster to break ties by a second condition than not to break them. I think it should be mentioned in the help. And could 'order/sort' be modified to check for 'tieness'? But I guess the overhead would be too heavy.

    (if (length(unique(x)) < alpha * length(x)) then ... else ...)

Göran
Re: [R] Need help on data frame
I'm sorry, but I don't understand enough of your problem to be able to comment. If you can give us a toy example, small and easy to understand in a few seconds, that illustrates the difficulty, it should be easier for others to help.

spencer graves

Pratibha Murthy wrote:

> Dear Sir/Madam, I am new in R.I have data corresponding to every day. Problem is that there are some gap i.e. observation couldn't be done on some particular day. I want to place this data frame like exact data frame (every year it will change, Feb 28 or feb29) Maybe I need to make one coulmn of date (for each year, say this dataframe 'frame1'), then I need to place data set on frame1 with missing entry as NA. Then I want to change this NA as mean of precceeding and following entries (for EACH NA) Hope it is possible by using R. I will greatly appreciate any help. Thanks, Pratibha
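For archive readers, one possible reading of the question as a toy example. All data values and the column names date and value are invented for illustration. The sketch builds a complete daily calendar (seq() on dates handles February 28 versus 29 automatically), merges the observations onto it so that missing days become NA, and fills each NA by linear interpolation, which for a single missing day equals the mean of the preceding and following entries.

```r
# Hypothetical observations with two missing days (28 Feb and 2 Mar 2003).
obs <- data.frame(
  date  = as.Date(c("2003-02-26", "2003-02-27", "2003-03-01", "2003-03-03")),
  value = c(10, 12, 16, 20)
)

# 'frame1': one row per calendar day over the observed range.
frame1 <- data.frame(date = seq(min(obs$date), max(obs$date), by = "day"))

# Left join: days without an observation get value = NA.
full <- merge(frame1, obs, by = "date", all.x = TRUE)

# Interpolate over the row index; known points are the non-NA rows.
idx  <- seq_len(nrow(full))
have <- !is.na(full$value)
full$filled <- approx(idx[have], full$value[have], xout = idx)$y
print(full)
```

For runs of several consecutive missing days, linear interpolation gives evenly spaced values between the two neighbours rather than repeating their mean; whether that is wanted depends on the data.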
[R] Basic question on applying a function to each row of a dataframe
Hi,

I have a function foo(x,y) and a dataframe, DF, comprised of two vectors, x and w, as follows:

      x w
    1 1 1
    2 2 1
    3 3 1
    4 4 1
    etc.

I would like to apply the function foo to each 'pair' within DF, e.g. foo(1,1), foo(2,1), foo(3,1) etc. I have tried

    apply(DF,foo)
    apply(DF[,],foo)
    apply(DF[DF$x,DF$w],foo)

However, none of the above worked. Can anyone help?

Thanks in advance,
Peter
Re: [R] Basic question on applying a function to each row of a dataframe
How about the following:

    DF <- data.frame(x=1:4, y=rep(1,4))
    foo <- function(x, y)x+y
    foo(DF$x, DF$y)
    [1] 2 3 4 5

hth. spencer graves

peter leonard wrote:

> Hi, I have a function foo(x,y) and a dataframe, DF, comprised of two vectors, x w, as follows : x w 1 1 1 2 2 1 3 3 1 4 4 1 etc I would like to apply the function foo to each 'pair' within DF e.g foo(1,1), foo(2,1), foo(3,1) etc I have tried apply(DF,foo) apply(DF[,],foo) apply(DF[DF$x,DF$w],foo) However, none of the above worked. Can anyone help ? Thanks in advance, Peter
Re: [R] Basic question on applying a function to each row of a dataframe
Hi, You need to tell the apply() whether you want to apply the function to rows (1) or columns (2). So in your case you may want to try something like: apply(DF, 1, foo) On Sun, 8 Jun 2003, peter leonard wrote: I have a function foo(x,y) and a dataframe, DF, comprised of two vectors, x w, as follows : x w 1 1 1 2 2 1 3 3 1 4 4 1 etc I would like to apply the function foo to each 'pair' within DF e.g foo(1,1), foo(2,1), foo(3,1) etc I have tried apply(DF,foo) apply(DF[,],foo) apply(DF[DF$x,DF$w],foo) -- Cheers, Kevin -- On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question. -- Charles Babbage (1791-1871) From Computer Stupidities: http://rinkworks.com/stupid/ -- Ko-Kang Kevin Wang Master of Science (MSc) Student SLC Tutor and Lab Demonstrator Department of Statistics University of Auckland New Zealand Homepage: http://www.stat.auckland.ac.nz/~kwan022 Ph: 373-7599 x88475 (City) x88480 (Tamaki) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Basic question on applying a function to each row of a dataframe
Hi Kevin,

This returns:

    Error in FUN(newX[, i], ...) : Argument y is missing, with no default

E.g.

    x <- c(1,2,3,4)
    w <- c(1,1,1,1)
    DF <- data.frame(x,w)
    foo <- function(x, y)x+y
    apply(DF, 1, foo)
    Error in FUN(newX[, i], ...) : Argument y is missing, with no default

Regards
Peter

From: Ko-Kang Kevin Wang [EMAIL PROTECTED]
To: peter leonard [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Re: [R] Basic question on applying a function to each row of a dataframe
Date: Mon, 9 Jun 2003 08:54:02 +1200 (NZST)

> Hi, You need to tell the apply() whether you want to apply the function to rows (1) or columns (2). So in your case you may want to try something like: apply(DF, 1, foo)
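For archive readers: the error arises because apply() hands each row to FUN as one vector, so foo receives a single argument and y is missing. A sketch of three ways around this, using the data frame from the message above; the result names are arbitrary labels for this example.

```r
DF  <- data.frame(x = c(1, 2, 3, 4), w = c(1, 1, 1, 1))
foo <- function(x, y) x + y

# 1. If foo is vectorized (as here), pass the columns directly.
res_vec <- foo(DF$x, DF$w)

# 2. If foo only handles scalars, mapply() pairs the columns element-wise.
res_map <- mapply(foo, DF$x, DF$w)

# 3. With apply(), unpack the row vector inside a wrapper function.
res_apply <- apply(DF, 1, function(row) foo(row[["x"]], row[["w"]]))

print(res_vec)
```

Option 1 is the simplest when it applies; options 2 and 3 call foo once per row, which matters only if foo cannot take whole vectors.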
[R] executable R scripts
Hi,

I'm a newbie trying to make an R program executable on UNIX, just as one would write an executable perl script by putting #!/usr/bin/perl in the first line, and so on. It seems, though, that this would only work if I use the BATCH command to tell R to execute the program in its first argument. This would have the unfortunate side-effect of dumping all output to a file rather than stdout. Additionally, I'd want to see only the results of print statements on stdout, not all of R's output, just as when you source a script with echo=FALSE.

This seems like it would be a pretty common problem, but I haven't found any explanations in the docs. Does somebody have a sample script that I could look at for advice? Or should I just bite the bullet and write a wrapper shell script?

Thanks!
--JRZ
Re: [R] executable R scripts
On 06/08/03 17:35, John Zedlewski wrote:

> Hi, I'm a newbie trying to make an R program executable on UNIX, just like one would write an executable perl script by putting #!/usr/bin/perl in the first line, and so on. It seems, though, that this would only work if I use the BATCH command to tell R to execute the program in its first argument. This would have the unfortunately side-effect of dumping all output to a file rather than stdout. Additionally, I'd want to see only the results of print statements on stdout, not all off R's output, just as when you source a script with echo=FALSE.

See man R for how to do it, although I'm not sure where it says the following: to get just the print output and nothing else, it helps to have print()'s in the script itself. Then you can use

    R --slave < myfile.R > printoutput.txt

I also use

    R --vanilla < myfile.R

for an R file that has write.table()'s in it. For this you do not need to pipe the output anywhere.
Re: [R] executable R scripts
On Sun, Jun 08, 2003 at 05:35:14PM -0700, John Zedlewski wrote:

> Hi, I'm a newbie trying to make an R program executable on UNIX, just like one would write an executable perl script by putting #!/usr/bin/perl in the first line, and so on.

This is not currently supported, but with some luck may be supported in a later version of R.

> It seems, though, that this would only work if I use the BATCH command to tell R to execute the program in its first argument. This would have the unfortunately side-effect of dumping all output to a file rather than stdout.

My personal favourite currently is to arrange everything (loading of packages, code, ...) in a file which I can read with source() from within R. Then

    $ echo "source(\"foo.R\")" | R --slave

works quite well, and you can redirect etc. It works on windows/cygwin too, using Rterm.exe.

> Additionally, I'd want to see only the results of print statements on stdout, not all off R's output, just as when you source a script with echo=FALSE.

I think the above fits that bill.

> This seems like it would be a pretty common problem, but I haven't found any explanations in the docs. Does somebody have a sample script that I could look at for advice? Or should I just bite the bullet and write a wrapper shell script?

That's where the above leads to as well.

Dirk

--
Don't drink and derive. Alcohol and analysis don't mix.
Re: [R] executable R scripts
Dirk and Jonathan--

Thanks a lot for the fast and helpful comments, guys. I ended up writing a wrapper script that uses the trick of echoing source("filename") into R --slave, and it works well.

Thanks again!
--JRZ
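A note for archive readers: releases of R later than the one discussed here ship an Rscript front end, which makes the shebang approach from the original question work directly. The sketch below assumes Rscript is on the PATH; the function name sum_args and the idea of summing the arguments are made up for illustration. Saved as a file and marked executable, it prints only what the script itself writes.

```r
#!/usr/bin/env Rscript
# Arguments after the script name, e.g. ./sumargs.R 1 2.5
args <- commandArgs(trailingOnly = TRUE)

# Hypothetical task: add up the numeric command-line arguments.
sum_args <- function(a) sum(as.numeric(a))

# Only this line reaches stdout; no prompts or echoed commands.
cat(sum_args(args), "\n")
```

Output can then be redirected or piped like that of any other command-line program, without a wrapper shell script.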
[R] early R messages to stdout
Hi,

I have an R script that takes its input in the form of command-line parameters. It works fine, but R complains about every unknown arg with the "ARGUMENT %s ignored" message, and this goes to stdout instead of stderr because R_ConsoleFile isn't set yet. Is it really necessary to process all command line args before setting R_ConsoleFile? It seems that only Aqua systems care about their arguments when choosing the console file.

I've attached a diff (against 1.7.0) that fixes this issue, so that non-Aqua unix folks can redirect stderr to /dev/null and not have to worry about those annoying "argument ignored" errors anymore.

--JRZ
Re: [R] Basic question on applying a function to each row of a dataframe
This works fine. Thanks
Peter

From: Spencer Graves [EMAIL PROTECTED]
To: peter leonard [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Re: [R] Basic question on applying a function to each row of a dataframe
Date: Sun, 08 Jun 2003 13:48:04 -0700
[R] Questions for package ts prediction
Dear helpers,

I am trying to write a function that returns prediction values using package ts. I have written three different versions, since I am not sure what is wrong with my func2. func and func1 return the same results, but func1 and func2 don't. In particular, the only difference between func1 and func2 is the name of the function argument: y and data, respectively. But running the last line of the following script gives the message: Error in ts(x): object is not a matrix. I am confused.

Also, could somebody kindly let me know the answer, if there is one, to my question about the following sunspot example from the package help:

    data(sunspot)
    (sunspot.ar <- ar(sunspot.year))   # why not just sunspot.ar <- ar(sunspot.year) ?
    predict(sunspot.ar, n.ahead=25)

Thanks in advance.

Zhu Wang
Statistical Science Department
Southern Methodist University
(214)768-2453
--
zhu wang [EMAIL PROTECTED]

    # time series prediction
    func <- function(data) {
      (esti <- ar(data))
      return(predict(object=esti, newdata=data, n.head=5))
    }
    func1 <- function(y) {
      (esti <- ar(y))
      return(predict(esti, n.head=5))
    }
    func2 <- function(data) {
      (esti <- ar(data))
      return(predict(esti, n.head=5))
    }
    y <- arima.sim(model=list(ar=c(1.7,-0.8)), n=100)
    func(y)
    func1(y)
    func2(y)
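For archive readers, a hedged sketch of a version that does not depend on the argument's name. Two observations feed into it. First, the argument of predict() for ar fits is spelled n.ahead; n.head is silently ignored. Second, when newdata is omitted, predict() has to recover the series by the name stored in the fit. A plausible explanation for the error (an assumption about that R version's internals, not verified here) is that the name data then resolved to the function data() rather than to the series, while y happened to exist in the workspace. Passing newdata explicitly avoids the lookup altogether. The function name forecast_ar and its argument h are made up for this example.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
y <- arima.sim(model = list(ar = c(1.7, -0.8)), n = 100)

forecast_ar <- function(series, h = 5) {
  fit <- ar(series)                       # order chosen by AIC
  # Supply the data explicitly and spell the argument 'n.ahead'.
  predict(fit, newdata = series, n.ahead = h)
}

fc <- forecast_ar(y)
print(fc$pred)   # point forecasts
print(fc$se)     # their standard errors
```

On the sunspot question: wrapping an assignment in parentheses, as in (sunspot.ar <- ar(sunspot.year)), makes R print the assigned value; without the parentheses the assignment is silent.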
[R] looking for Prof Bates' file
Hello,

I'm reading up on fitting a truncated Weibull distribution to data. There are posts from 2002 that point to this presentation by Prof Bates:

    http://www.stat.wisc.edu/~bates/JSM2001.pdf

but the file is no longer there. I can't find it anywhere else, and Google doesn't have a cached copy of it. Could someone please send me a copy of this file, if they have it?

Thanks and regards,
viet.