Re: [R] R Beginner : Loop and adding row to dataframe
Hi Phil,

I think you want:

  merge(listA, listB, by = "NACE")

which will give you:

    NACE Name aaa bbb ccc
  1    1    a   a   a   c
  2    1    a   a   a   c
  3    1    a   a   a   c
  4    2    b   a   a   c
  5    2    b   a   a   c
  6    3    c   a   a   c

If you want to get rid of the Name column, the following should help:

  tmp <- merge(listA, listB, by = "NACE")
  tmp[, -2]
    NACE aaa bbb ccc
  1    1   a   a   c
  2    1   a   a   c
  3    1   a   a   c
  4    2   a   a   c
  5    2   a   a   c
  6    3   a   a   c

Cheers,
Henrik

On 22.07.2012 18:35, ph!l wrote:

Hi everybody,

I am currently quite inexperienced with R. I am trying to write a function that takes a value from one data frame, looks up that value in a second data frame, and copies every row that contains it. This example is a simplified version of what I am doing, but it is enough to illustrate the problem.

listA
  Name NACE
     a    1
     b    2
     c    3

ListB
  NACE aaa bbb ccc
     1   a   a   c
     1   a   a   c
     1   a   a   c
     2   a   a   c
     2   a   a   c
     3   a   a   c
     4   a   a   c
     4   a   a   c
     4   a   a   c

The output I would like to have:

  NACE aaa bbb ccc
     1   a   a   c
     1   a   a   c
     1   a   a   c
     2   a   a   c
     2   a   a   c
     3   a   a   c

Code:

  listpeer <- function(x) {
    for (i in 1:length(listA$NACE))
      TriNACE[i] <- subset.data.frame(ListB, NACE == NACEsample$NACE[i])
    TriNACE
  }

But the result is:

  Warning message:
  In `[<-.data.frame`(`*tmp*`, i, value = list(NACE = c(3L, 3L, 3L :
    provided xx variables to replace x variables

I guess there is something wrong with TriNACE[i]; instead I should use something to add rows, but I really can't find anything. Does somebody have any clue?

Thank you for your time and help!

--
View this message in context: http://r.789695.n4.nabble.com/R-Beginner-Loop-and-adding-row-to-dataframe-tp4637360.html
Sent from the R help mailing list archive at Nabble.com.

--
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
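Henrik's one-liner can be checked end to end. The sketch below rebuilds listA and ListB from the example in the thread and shows that merge() plus dropping the Name column yields the desired six rows:

```r
# Rebuild the two example data frames from the thread
listA <- data.frame(Name = c("a", "b", "c"), NACE = c(1, 2, 3))
ListB <- data.frame(NACE = c(1, 1, 1, 2, 2, 3, 4, 4, 4),
                    aaa = "a", bbb = "a", ccc = "c")

# merge() keeps only the ListB rows whose NACE occurs in listA,
# which replaces the explicit loop over listA$NACE entirely
tmp <- merge(listA, ListB, by = "NACE")

# Drop the Name column that merge() carried over from listA
out <- tmp[, -2]
nrow(out)  # 6: the three NACE = 4 rows are gone
```

The key point for a beginner: growing a data frame row by row inside a loop is rarely needed in R; a single merge (or `ListB[ListB$NACE %in% listA$NACE, ]`) does the whole job.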
Re: [R] [RsR] How does rlm in R decide its w weights for each IRLS iteration?
Hi Valentin,

If the contamination is mainly in the response direction, the M-estimator provides good estimates for the parameters, and rlm can be used.

Rohan

-----Original Message-----
From: r-sig-robust-boun...@r-project.org [mailto:r-sig-robust-boun...@r-project.org] On Behalf Of Valentin Todorov
Sent: Saturday, 21 July 2012 6:57 a.m.
To: S Ellison
Cc: r-sig-rob...@r-project.org; r-help
Subject: Re: [RsR] How does rlm in R decide its w weights for each IRLS iteration?

Hi Michael, S Ellison,

I do not actually understand what you want to achieve with the M estimates of rlm in MASS, but why don't you give lmrob in 'robustbase' a try? Please have a look in the references (?lmrob) for the advantages of MM estimators over M estimators.

Best regards,
Valentin

On Fri, Jul 20, 2012 at 5:11 PM, S Ellison s.elli...@lgcgroup.com wrote:

> Subject: [RsR] How does rlm in R decide its w weights for each IRLS iteration?
>
> I am also confused about the manual:
> a. The input arguments: wt.method are the weights case weights (giving the relative importance of each case, so a weight of 2 means there are two of these) or the inverse of the variances, so a weight of two means this error is half as variable?

When you give rlm weights (called 'weights', not 'w', on input, though you can abbreviate to 'w'), you need to tell it which of these two possibilities you used. If you gave it case numbers, say wt.method = "case"; if you gave it inverse-variance weights, say wt.method = "inv.var". The default is "inv.var".

> The input argument w is used for the initial values of the rlm IRLS weighting and the output value w is the converged w.

There is no input argument 'w' for rlm (see above). The output w values are calculated using the psi function, so they lie between 0 and 1. The effective weights for the final estimate would then be something like w*weights, using the full name of the input argument (and if I haven't forgotten a square root somewhere).
At least, that would work for a simple location estimate (e.g. rlm(x ~ 1)).

> If my understanding above is correct, how does rlm decide its w for each IRLS iteration then?

It uses the given psi functions to calculate the iterative weights based on the scaled residuals.

> Any pointers/tutorials/notes to the calculation of these w's in each IRLS iteration?

Read the cited references for a detailed guide. Or, of course, MASS - the package is, after all, intended to support the book, not replace it.

S Ellison

***
This email and any attachments are confidential. Any u...{{dropped:10}}
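The distinction S Ellison draws between the input 'weights' (prior case or inverse-variance weights) and the output 'w' (the converged IRLS robustness weights from the psi function) can be seen in a small simulation. A sketch with made-up data, using MASS::rlm as discussed:

```r
library(MASS)

set.seed(1)
x <- rnorm(40)
y <- 2 + 3 * x + rnorm(40)
y[1] <- 25                              # one gross outlier in the response

# Prior weights on input, declared as inverse variances (the default)
fit <- rlm(y ~ x, weights = rep(1, 40), wt.method = "inv.var")

# fit$w holds the converged IRLS robustness weights, computed by the
# psi function from the scaled residuals; they lie in [0, 1]
range(fit$w)
fit$w[1]     # the outlying observation receives a small weight
```

Because the contamination here is in the response direction, this matches Rohan's point that an M-estimator (the rlm default) handles it well.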
Re: [R] Reorder in decreasing order
On 12-07-22 5:33 PM, arun wrote:
> Hi Duncan,
> That was my original suggestion. His reply suggests that it is not what he wanted.

I didn't see your reply. Maybe you sent it privately? In any case, I think it is up to Sverre to give an example of what he wants, since your suggestion, Weidong's and mine all appear to do what he asked for.

Duncan Murdoch

> Not quite. It still orders the values in an increasing order; you've just changed the values here. I'm using reorder() to prepare for plotting the values, so I can't change the values.
>
> bymean2 <- with(InsectSprays, reorder(spray, count, function(x) -mean(x)))
> bymean2
>  [1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C D D
> [39] D D D D D D D D D D E E E E E E E E E E E E F F F F F F F F F F F F
> attr(,"scores")
>          A          B          C          D          E          F
> -14.500000 -15.333333  -2.083333  -4.916667  -3.500000 -16.666667
> Levels: F B A D E C
>
> A.K.

----- Original Message -----
From: Duncan Murdoch murdoch.dun...@gmail.com
To: Sverre Stausland john...@fas.harvard.edu
Cc: r-help@r-project.org
Sent: Sunday, July 22, 2012 4:56 PM
Subject: Re: [R] Reorder in decreasing order

On 12-07-22 12:27 PM, Sverre Stausland wrote:
> reorder() is probably the best way to order the levels in a vector without manually specifying the order. But reorder() orders by default in an increasing order: "The levels are ordered such that the values returned by 'FUN' are in increasing order." Is there a way to do what reorder() does, but order the levels according to a _decreasing_ order of the values?
Yes, as Weidong suggested:

x <- rnorm(20)
y <- factor(sample(letters[1:3], 20, replace = TRUE))
reorder(y, x, mean)
 [1] a a c c c b b a b a c c b b a a a a c a
attr(,"scores")
         a          b          c
-0.2012975  0.6117830  0.2180352
Levels: a c b

reorder(y, x, function(x) -mean(x))
 [1] a a c c c b b a b a c c b b a a a a c a
attr(,"scores")
         a          b          c
 0.2012975 -0.6117830 -0.2180352
Levels: b c a

Duncan Murdoch
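Sverre's objection is that negating inside FUN changes the recorded scores. Note, though, that plotting functions use the factor's level order, not the scores attribute, and the observations themselves are never changed by reorder(). If the negated scores are still a concern, the level order can also be reversed after the fact. A small sketch:

```r
x <- c(3, 1, 2, 3, 1, 2)
y <- factor(c("a", "b", "c", "a", "b", "c"))

# Increasing order of group means: b (1) < c (2) < a (3)
inc <- reorder(y, x, mean)
levels(inc)  # "b" "c" "a"

# Decreasing order, leaving both the data and the scores alone:
# simply reverse the level order of the reordered factor
dec <- factor(inc, levels = rev(levels(inc)))
levels(dec)  # "a" "c" "b"

# The observations are identical either way; only the level order differs
identical(as.character(dec), as.character(y))  # TRUE
```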
Re: [R] taylor expansions with real vectors
Hi,

Thanks a lot for your answer. It is what I meant, but the code does not seem to work.

On Jul 19, 2012, at 8:52 AM, Petr Savicky [via R] wrote:

On Wed, Jul 18, 2012 at 06:02:27PM -0700, bilelsan wrote:
> Leave the Taylor expansion aside; how is it possible to compute with [R]:
>
> f(e) = e1 + e2                                              # for r = 1
>      + 1/2!*e1^2 + 1/2!*e2^2 + 1/2!*e1*e2                   # for r = 2, excluding e2*e1
>      + 1/3!*e1^3 + 1/3!*e1^2*e2 + 1/3!*e2^2*e1 + 1/3!*e2^3  # for r = 3, excluding e2*e1^2 and e1*e2^2
>      + ...                                                  # for r = k
>
> In other words, I am trying to figure out how to compute all the possible combinations as exposed above.

Hi.

For a general r, do you mean the following sum of products?

  1/r! * (e1^r + e1^(r-1)*e2 + ... + e1*e2^(r-1) + e2^r)

If this is correct, then try

  f <- 0
  for (r in 1:k) {
    f <- f + 1/factorial(r) * sum(e1^(r:0) * e2^(0:r))
  }

Hope this helps.

Petr Savicky.

--
View this message in context: http://r.789695.n4.nabble.com/Re-taylor-expansions-with-real-vectors-tp4636948p4637388.html
Sent from the R help mailing list archive at Nabble.com.
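A possible reason the one-liner fails here: the thread is about real vectors, and `e1^(r:0) * e2^(0:r)` is only well defined when e1 and e2 are scalars; for vectors, the exponent vector recycles against the data and `sum()` collapses everything to one number. Below is a sketch of an elementwise variant of Petr's formula, so that e1 and e2 may be equal-length numeric vectors; whether this is the combinatorics the original poster intended is an assumption:

```r
# f = sum_{r=1}^{k} 1/r! * sum_{j=0}^{r} e1^(r-j) * e2^j,
# evaluated elementwise so e1 and e2 may be real vectors
taylor_sum <- function(e1, e2, k) {
  f <- 0
  for (r in 1:k) {
    term <- 0
    for (j in 0:r) term <- term + e1^(r - j) * e2^j
    f <- f + term / factorial(r)
  }
  f
}

# Scalar check against the r = 1 and r = 2 terms written out by hand:
# (e1 + e2) + (e1^2 + e1*e2 + e2^2)/2 = 2 + 3/2
taylor_sum(1, 1, 2)   # 3.5
```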
[R] setar function error message
Hi all,

I have a problem estimating a SETAR model: I always get an error message. Here is the code:

## there are 4175 observations in the series (a)
a[1:10, 1]
 [1] 1.496498 1.496602 1.496636 1.496515 1.496515 1.496463 1.496429 1.496549 1.496480
[10] 1.496498

library(tsDyn)
selectSETAR(a, m = 2)
Using maximum autoregressive order for low regime: mL = 2
Using maximum autoregressive order for high regime: mH = 2
Searching on 1552 possible threshold values within regimes with sufficient (15%) number of observations
Searching on 6208 combinations of thresholds (1552), thDelay (1), mL (2) and MM (2)
Results of the grid search for 1 threshold
   thDelay mL mH       th pooled-AIC
1        0  2  2 1.674402  -43590.12
2        0  2  2 1.674218  -43588.81
3        0  2  2 1.673758  -43588.24
4        0  2  2 1.674011  -43588.19
5        0  2  2 1.674480  -43587.85
6        0  2  2 1.673988  -43587.62
7        0  2  2 1.673666  -43587.56
8        0  2  2 1.674471  -43587.16
9        0  2  2 1.673620  -43586.51
10       0  2  2 1.673574  -43585.83

mod.star <- setar(a, m = 2, steps = 5, thDelay = 0, th = 1.674402)
Error in rep(NA, length(x) - length(reg)) : invalid 'times' argument

Would someone tell me the meaning of this error message, and how to solve the problem?

Kind Regards,
Ario
[R] Workspace Question.
I recently installed R version 2.14.1. After a session (not my first) in 2.14.1, I saved the 'workspace image'. I then opened earlier R versions, 2.14.0 and 2.12.0, and the only objects listed were from the 2.14.1 session. All the work under those previous versions was gone, to my dismay. This did not happen when I was working in 2.14.0, as 2.12.0 objects were not affected when saving the workspace image in 2.14.0.

Are the 2.14.0 and 2.12.0 objects retrievable or permanently deleted?

Thanks for any input,
Mike
Re: [R] npindex: fitted values of the function itself?
Dear list,

I realized my mistake! For those who are interested: what I predicted was in fact G*(X'b*). A single-index model assumes a linear index function, for which I obtain the estimated coefficients b*; the predicted probabilities are then G*(X'b*). Indeed, this is equivalent to the probit case, where I only need G_hat = pnorm(x_fit) if x_fit is the linear prediction.

Best,
Kristin

--
View this message in context: http://r.789695.n4.nabble.com/npindex-fitted-values-of-the-function-itself-tp4637070p4637408.html
Sent from the R help mailing list archive at Nabble.com.
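The probit analogy Kristin describes can be made concrete with base R: for a probit glm, transforming the linear prediction with pnorm() reproduces the fitted probabilities exactly. A sketch on simulated data:

```r
# Probit analogue of G(X'b): the linear index pushed through pnorm()
set.seed(42)
x <- rnorm(200)
y <- rbinom(200, 1, pnorm(0.5 + 1.5 * x))

fit <- glm(y ~ x, family = binomial(link = "probit"))

x_fit <- predict(fit, type = "link")      # X'b, the linear index
G_hat <- pnorm(x_fit)                     # fitted probabilities, by hand
p_hat <- predict(fit, type = "response")  # the same thing, done by glm

all.equal(G_hat, p_hat)  # TRUE
```

In the single-index case the only difference is that G is estimated nonparametrically rather than fixed at pnorm.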
Re: [R] The best solver for non-smooth functions?
There is a new blog post that is pertinent to this question:

http://www.portfolioprobe.com/2012/07/23/a-comparison-of-some-heuristic-optimization-methods/

Pat

On 18/07/2012 21:00, Cren wrote:

# Hi all,
# consider the following code (please, run it:
# it's fully working and requires just a few minutes
# to finish):

require(CreditMetrics)
require(clusterGeneration)
install.packages("Rdonlp2", repos = c("http://R-Forge.R-project.org", getOption("repos")))
install.packages("Rsolnp2", repos = c("http://R-Forge.R-project.org", getOption("repos")))
require(Rdonlp2)
require(Rsolnp)
require(Rsolnp2)

N <- 3
n <- 10
r <- 0.0025
ead <- rep(1/3, 3)
rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
lgd <- 0.99
rating <- c("BB", "BB", "BBB")
firmnames <- c("firm 1", "firm 2", "firm 3")
alpha <- 0.99

# One-year empirical migration matrix from the Standard & Poor's website
rc <- c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
M <- matrix(c(90.81, 8.33, 0.68, 0.06, 0.08, 0.02, 0.01, 0.01,
              0.70, 90.65, 7.79, 0.64, 0.06, 0.13, 0.02, 0.01,
              0.09, 2.27, 91.05, 5.52, 0.74, 0.26, 0.01, 0.06,
              0.02, 0.33, 5.95, 85.93, 5.30, 1.17, 1.12, 0.18,
              0.03, 0.14, 0.67, 7.73, 80.53, 8.84, 1.00, 1.06,
              0.01, 0.11, 0.24, 0.43, 6.48, 83.46, 4.07, 5.20,
              0.21, 0, 0.22, 1.30, 2.38, 11.24, 64.86, 19.79,
              0, 0, 0, 0, 0, 0, 0, 100) / 100,
            8, 8, dimnames = list(rc, rc), byrow = TRUE)

# Correlation matrix
rho <- rcorrmatrix(N)
dimnames(rho) <- list(firmnames, firmnames)

# Credit Value at Risk
cm.CVaR(M, lgd, ead, N, n, r, rho, alpha, rating)

# Risk-neutral yield rates
Y <- cm.cs(M, lgd)
y <- c(Y[match(rating[1], rc)], Y[match(rating[2], rc)], Y[match(rating[3], rc)])
y

# The function to be minimized
sharpe <- function(w) {
  - (t(w) %*% y) / cm.CVaR(M, lgd, ead, N, n, r, rho, alpha, rating)
}

# The linear constraints
constr <- function(w) {
  sum(w)
}

# Results matrix (it's empty by now)
Results <- matrix(NA, nrow = 3, ncol = 4)
rownames(Results) <- list('donlp2', 'solnp', 'solnp2')
colnames(Results) <- list('w_1', 'w_2', 'w_3', 'Sharpe')

# See the differences between different solvers
rho
Results[1, 1:3] <- round(donlp2(fn = sharpe, par = rep(1/N, N),
                                par.lower = rep(0, N), par.upper = rep(1, N),
                                A = t(rep(1, N)), lin.lower = 1, lin.upper = 1)$par, 2)
Results[2, 1:3] <- round(solnp(pars = rep(1/N, N), fun = sharpe, eqfun = constr,
                               eqB = 1, LB = rep(0, N), UB = rep(1, N))$pars, 2)
Results[3, 1:3] <- round(solnp2(par = rep(1/N, N), fun = sharpe, eqfun = constr,
                                eqB = 1, LB = rep(0, N), UB = rep(1, N))$pars, 2)
for (i in 1:3) {
  Results[i, 4] <- abs(sharpe(Results[i, 1:3]))
}
Results

# In fact the sharpe function I previously defined
# is not smooth because of the cm.CVaR function.
# If you change the correlation matrix, ratings or yields,
# you see how different solvers produce different
# parameter estimates.
# So the main issue is: how can I know which is the
# best solver for dealing with non-smooth functions
# such as this one?

--
View this message in context: http://r.789695.n4.nabble.com/The-best-solver-for-non-smooth-functions-tp4636934.html
Sent from the R help mailing list archive at Nabble.com.

--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
Re: [R] easy way to fit saturated model in sem package?
I will check out the lavaan package.

> Dear Joshua,
>
> The lavaan package may help you. The FIML estimator typically starts with the EM algorithm to estimate the moments of the unrestricted model. There is no 'one-shot' function for it at the moment, but if you only need those moments, you can do something like this. Suppose your data is a data.frame called 'HS.missing'; then the following commands can be used to get the estimated moments:
>
> library(lavaan)
> Data <- lavaan:::lavData(HS.missing, ov.names = names(HS.missing), missing = "fiml")
>
> # we assume only 1 group
> Missing <- lavaan:::getMissingPatternStats(X = Data@X[[1L]], Mp = Data@Mp[[1L]])
>
> # compute moments using the EM algorithm
> Moments <- lavaan:::estimate.moments.EM(X = Data@X[[1L]], M = Missing, verbose = TRUE)
>
> # estimated covariance matrix
> Moments$sigma
>
> # estimated mean vector
> Moments$mu
>
> Hope this helps,
>
> Yves Rosseel
> http://lavaan.org
Re: [R] Workspace Question.
On 23/07/12 19:36, Michael Trianni wrote:
> I recently installed R version 2.14.1. After a session (not my first) in 2.14.1, I saved the 'workspace image'. I then opened earlier R versions, 2.14.0 and 2.12.0, and the only objects listed were from the 2.14.1 session. All the work under those previous versions was gone, to my dismay. This did not happen when I was working in 2.14.0, as 2.12.0 objects were not affected when saving the workspace image in 2.14.0.

This has nothing to do with versions of R. Whenever you start R, any previously saved image of the workspace is loaded (from a file called .RData). Your workspace/global environment will then include any objects that were in the workspace when you saved it. If you save the workspace when you exit R, then the previously saved image of the workspace (the .RData file) is overwritten.

If objects that were in a previously saved image of the workspace are no longer present when you start R, then you must have removed them at some stage before saving the workspace. Either that, or you removed the file .RData from the directory in which you are starting R. That's all there is to it. The objects did not disappear by magic.

> Are the 2.14.0 and 2.12.0 objects retrievable or permanently deleted?

If you have backups of your file system, then a copy of .RData (one that was made when you were using 2.12.0 or 2.14.0) can be restored from the backup, and that will have the objects that you are looking for. If you don't have backups, then I am afraid that you are SOL. Those objects are gone for good.

cheers,
Rolf Turner
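One way to avoid the silent overwrite Rolf describes is to stop relying on the automatic .RData file and snapshot explicitly with save()/save.image(). A minimal sketch (tempfile() is used here only to keep the example self-contained; in practice you would pick a dated, per-version file name):

```r
# .RData is just a file written by save.image()/save() and read back
# by load(); keeping explicit copies means nothing is overwritten on exit
f <- tempfile(fileext = ".RData")

x <- 1:10
save(x, file = f)   # explicit snapshot instead of the automatic .RData

rm(x)               # simulate the "lost" workspace ...
load(f)             # ... and restore the saved objects
x
```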
Re: [R] setar function error message
Hello,

Your problem is not reproducible without the complete 'a' dataset.

Regards,
Pascal

On 23/07/2012 09:49, Ario Ario wrote:

Hi all,

I have a problem estimating a SETAR model: I always get an error message. Here is the code:

## there are 4175 observations in the series (a)
a[1:10, 1]
 [1] 1.496498 1.496602 1.496636 1.496515 1.496515 1.496463 1.496429 1.496549 1.496480
[10] 1.496498

library(tsDyn)
selectSETAR(a, m = 2)
Using maximum autoregressive order for low regime: mL = 2
Using maximum autoregressive order for high regime: mH = 2
Searching on 1552 possible threshold values within regimes with sufficient (15%) number of observations
Searching on 6208 combinations of thresholds (1552), thDelay (1), mL (2) and MM (2)
Results of the grid search for 1 threshold
   thDelay mL mH       th pooled-AIC
1        0  2  2 1.674402  -43590.12
2        0  2  2 1.674218  -43588.81
3        0  2  2 1.673758  -43588.24
4        0  2  2 1.674011  -43588.19
5        0  2  2 1.674480  -43587.85
6        0  2  2 1.673988  -43587.62
7        0  2  2 1.673666  -43587.56
8        0  2  2 1.674471  -43587.16
9        0  2  2 1.673620  -43586.51
10       0  2  2 1.673574  -43585.83

mod.star <- setar(a, m = 2, steps = 5, thDelay = 0, th = 1.674402)
Error in rep(NA, length(x) - length(reg)) : invalid 'times' argument

Would someone tell me the meaning of this error message, and how to solve the problem?

Kind Regards,
Ario
[R] 3D scatterplot, using size of symbols for the fourth variable
Dear R fans, I would like to create a scatterplot showing the relationship between 4 continuous variables. I thought of using the package scatterplot 3d to have a 3-dimensional plot and then using the size of the symbols to represent the 4th variable. Does anybody know how to do this? I already tried to create this graph using the colour of the symbols, but I was unable to generate a legend. Can anyone perhaps tell me why? #dataframe structure(list(Shannon = c(2.707, 2.525, 2.52, 2.447, 2.655, 2.404, 2.63, 3.427, 3.381, 2.968, 3.247, 3.063, 3.284, 3.24, 2.948, 3.133, 3.482, 3.467, 3.474, 3.659, 3.294, 3.394, 3.293, 3.312, 3.386, 3.755, 3.419, 3.586, 3.539, 3.464, 3.461), H0 = c(19L, 14L, 17L, 13L, 17L, 14L, 17L, 46L, 49L, 29L, 45L, 51L, 65L, 65L, 43L, 56L, 65L, 58L, 59L, 68L, 43L, 55L, 52L, 51L, 55L, 81L, 74L, 86L, 57L, 53L, 58L), H1 = c(14.98425525, 12.49089526, 12.42859666, 11.55363377, 14.22498606, 11.06735739, 13.8737699, 30.78415163, 29.40015657, 19.45297471, 25.71308484, 21.3916359, 26.68228868, 25.53372175, 19.06778001, 22.94270452, 32.52470648, 32.04047669, 32.26554685, 38.82250095, 26.95045014, 29.78485372, 26.92351316, 27.43995053, 29.54752547, 42.73421981, 30.53886089, 36.08942911, 34.4324695, 31.9444993, 31.8488094), H2 = c(11.755, 11.255, 9.507, 10.125, 12.042, 8.397, 11.791, 23.171, 20.917, 13.902, 15.813, 13.739, 15.654, 14.692, 12.078, 13.546, 21.211, 20.437, 22.152, 26.635, 18.077, 19.381, 18.042, 18.856, 19.848, 28.671, 18.047, 21.797, 24.508, 22.898, 21.368), Hinf = c(5.128205128, 7.692307692, 4.504504505, 5.988023952, 6.802721088, 3.831417625, 6.493506494, 9.708737864, 9.615384615, 7.142857143, 5.376344086, 5.988023952, 6.25, 6.060606061, 4.975124378, 5.291005291, 10, 6.756756757, 9.090909091, 10.1010101, 6.41025641, 7.299270073, 8, 9.174311927, 9.345794393, 12.65822785, 7.407407407, 8.130081301, 10.1010101, 9.803921569, 9.523809524), J = c(0.92, 0.957, 0.889, 0.954, 0.937, 0.911, 0.928, 0.895, 0.869, 0.881, 0.853, 0.779, 0.787, 0.776, 0.784, 
0.778, 0.834, 0.854, 0.852, 0.867, 0.876, 0.847, 0.833, 0.842, 0.845, 0.855, 0.794, 0.805, 0.875, 0.872, 0.852), EG_18 = c(11.89625579, 12.12535291, 10.52322645, 13, 11.94721408, 11.74138905, 11.41864537, 13.51174611, 13.08380677, 11.98338672, 12.50012399, 11.33153181, 12.05860699, 11.79889475, 10.978733, 11.56314923, 13.04645068, 13.19345708, 13.24637756, 13.8929676, 12.76104192, 12.92668353, 12.46911655, 12.62784458, 12.87743784, 14.10318753, 12.5553675, 13.19308414, 13.6559399, 13.38873489, 13.18810233), TD = c(3.584221748, 3.327044025, 3.260869565, 3.52173913, 3.635220126, 2.890710383, 3.136082474, 3.433096849, 3.49944009, 3.482109228, 3.561288523, 2.994932583, 3.36222162, 3.252259735, 2.986886848, 3.037585943, 3.181734279, 3.012922623, 3.194701667, 3.280983686, 3.114578401, 3.092463169, 3.352683024, 3.283793025, 3.358701828, 3.445251948, 2.873097033, 3.03782087, 3.706619585, 3.325312529, 3.34325186 ), MI = c(2.780487805, 2.52173913, 2.6, 2.8, 2.911764706, 2.434782609, 2.435897436, 2.75, 2.748, 2.720930233, 2.818584071, 2.377319588, 2.502692998, 2.565891473, 2.391644909, 2.418439716, 2.663179916, 2.630177515, 2.550724638, 2.696103896, 2.75464684, 2.703583062, 2.825072886, 2.721088435, 2.782334385, 2.8, 2.618834081, 2.67, 2.743589744, 2.766917293, 2.678321678 ), D = c(71.81184669, 58.1027668, 70.89466089, 71.0550887, 65.69900688, 70.1863354, 61.76980914, 69.90865312, 67.91876076, 69.8495212, 70.04579295, 64.99869296, 67.95894908, 68.35197804, 64.01693209, 64.74605873, 69.7579387, 68.87333164, 67.97582936, 71.24014378, 71.19738387, 71.54931462, 71.16698452, 70.27831123, 69.36640407, 70.84656085, 66.61085878, 68.05601309, 70.77177025, 71.58583791, 70.39500159), Dstar = c(76.57440089, 60.99585062, 77.46767581, 74.46183953, 69.54177898, 76.21091355, 65.7635468, 72.70359475, 71.04353504, 74.38811189, 74.44360179, 70.00021147, 72.46614626, 73.26103888, 69.67652555, 69.76706304, 73.05627881, 72.20245655, 70.9831013, 73.8268811, 75.1172186, 75.19617965, 75.12340981, 
73.96171487, 72.8167, 73.22565624, 70.36018789, 71.22055825, 73.51203798, 74.57343, 73.61652018), Dplus = c(79.03091061, 63.89324961, 75.94537815, 74.35897436, 70.27310924, 76.13814757, 72.89915966, 72.10489993, 71.81729835, 72.06192822, 73.14574315, 73.91253644, 74.60851648, 73.50481859, 75.84204413, 73.58345358, 74.11401099, 73.79656037, 73.38231611, 73.45415778, 75.39406006, 75.74795575, 72.89377289, 74.42016807, 74.15103415, 73.55820106, 71.94689797, 73.71748699, 72.52953813, 74.98444951, 74.6151092), Lplus = c(435.1033529, 252.705357, 668.3739672, 828.6707188, 504.3671881, 493.6306125, 577.0690629, 458.7755483, 507.6234298, 470.0924753, 481.281377, 455.4709007, 533.7458772, 520.7742317, 504.3544853, 482.9535419, 514.2758555, 503.053443, 513.1260705, 536.5610267, 523.9890434, 508.1071602, 489.4344326, 529.9859238, 505.6095669, 510.3928711, 482.2434835, 517.598572, 464.1093393, 525.4025899,
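The dput() output posted above is truncated, so the sketch below uses made-up values under a few of the posted column names. It assumes the scatterplot3d package is installed and that cex.symbols accepts a vector of per-point sizes (it is handed to the point-drawing routine, but treat vector recycling as an assumption worth checking on your version); the legend is drawn with base legend() on the same device:

```r
# A sketch: 3 variables on the axes, a 4th (TD) mapped to symbol size
library(scatterplot3d)

df <- data.frame(Shannon = runif(31, 2.4, 3.8),
                 H0      = sample(13:86, 31, replace = TRUE),
                 J       = runif(31, 0.77, 0.96),
                 TD      = runif(31, 2.8, 3.8))

# Rescale the 4th variable to a readable cex range, [0.5, 3]
sz <- 0.5 + 2.5 * (df$TD - min(df$TD)) / diff(range(df$TD))

scatterplot3d(df$Shannon, df$H0, df$J,
              pch = 16, cex.symbols = sz,
              xlab = "Shannon", ylab = "H0", zlab = "J")

# Legend with a few reference sizes for TD, rescaled the same way
ref <- pretty(df$TD, 3)
legend("topright", legend = ref, pch = 16,
       pt.cex = 0.5 + 2.5 * (ref - min(df$TD)) / diff(range(df$TD)),
       title = "TD", bty = "n")
```

For the colour-legend problem in the original question, the same legend() call works: pass the colour vector to `col =` instead of sizes to `pt.cex =`.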
Re: [R] setar function error message
Hi,

I apologize for not attaching a complete dataset. The dataset contains only 1000 observations. Please find it in the attachment.

Thanks,
Ario

On Mon, Jul 23, 2012 at 3:46 PM, Pascal Oettli kri...@ymail.com wrote:

Hello,

Your problem is not reproducible without the complete 'a' dataset.

Regards,
Pascal

On 23/07/2012 09:49, Ario Ario wrote:

Hi all,

I have a problem estimating a SETAR model: I always get an error message. Here is the code:

## there are 4175 observations in the series (a)
a[1:10, 1]
 [1] 1.496498 1.496602 1.496636 1.496515 1.496515 1.496463 1.496429 1.496549 1.496480
[10] 1.496498

library(tsDyn)
selectSETAR(a, m = 2)
Using maximum autoregressive order for low regime: mL = 2
Using maximum autoregressive order for high regime: mH = 2
Searching on 1552 possible threshold values within regimes with sufficient (15%) number of observations
Searching on 6208 combinations of thresholds (1552), thDelay (1), mL (2) and MM (2)
Results of the grid search for 1 threshold
   thDelay mL mH       th pooled-AIC
1        0  2  2 1.674402  -43590.12
2        0  2  2 1.674218  -43588.81
3        0  2  2 1.673758  -43588.24
4        0  2  2 1.674011  -43588.19
5        0  2  2 1.674480  -43587.85
6        0  2  2 1.673988  -43587.62
7        0  2  2 1.673666  -43587.56
8        0  2  2 1.674471  -43587.16
9        0  2  2 1.673620  -43586.51
10       0  2  2 1.673574  -43585.83

mod.star <- setar(a, m = 2, steps = 5, thDelay = 0, th = 1.674402)
Error in rep(NA, length(x) - length(reg)) : invalid 'times' argument

Would someone tell me the meaning of this error message, and how to solve the problem?

Kind Regards,
Ario

RUPEE.US
1 1.496498 2 1.496602 3 1.496636 4 1.496515 5 1.496515 6 1.496463 7 1.496429 8 1.496549 9 1.496480 10 1.496498 11 1.496498 12 1.496533 13 1.496533 14 1.496498 15 1.496515 16 1.496498 17 1.496533 18 1.496498 19 1.496498 20 1.496463 21 1.496533 22 1.496463 23 1.496445 24 1.496515 25 1.496480 26 1.496498 27 1.496463 28 1.496515 29 1.496515 30 1.496498 31 1.496480 32 1.496480 33 1.496445 34 1.496498 35 1.496480 36 1.496463 37 1.496463 38 1.496515 39 1.496480 40 1.496533 41 1.496515 42 1.496445 43 1.496515 44 1.496895 45 1.497344 46 1.496895 47 1.496533 48 1.496480 49 1.496498 50 1.496515 51 1.496515 52 1.496515 53 1.496480 54 1.496533 55 1.496515 56 1.496515 57 1.496533 58 1.496498 59 1.496515 60 1.496498 61 1.496463 62 1.496498 63 1.496498 64 1.496515 65 1.496515 66 1.496445 67 1.496498 68 1.496480 69 1.496515 70 1.496498 71 1.496445 72 1.496376 73 1.496480 74 1.496515 75 1.496480 76 1.496463 77 1.496515 78 1.496515 79 1.496515 80 1.496498 81 1.496515 82 1.496480 83 1.496498 84 1.496445 85 1.496480 86 1.496480 87 1.496533 88 1.496480 89 1.496480 90 1.496498 91 1.496515 92 1.496480 93 1.496498 94 1.496515 95 1.496515 96 1.496498 97 1.496515 98 1.496498 99 1.496533 100 1.496480 101 1.496515 102 1.496515 103 1.496515 104 1.496515 105 1.496515 106 1.496533 107 1.496533 108 1.496515 109 1.496602 110 1.496515 111 1.496498 112 1.496480 113 1.496515 114 1.496498 115 1.496515 116 1.496480 117 1.496515 118 1.496533 119 1.496498 120 1.496498 121 1.496498 122 1.496480 123 1.496498 124 1.496480 125 1.496498 126 1.496498 127 1.496480 128 1.496498 129 1.496515 130 1.496515 131 1.496498 132 1.496480 133 1.496498 134 1.496636 135 1.496549 136 1.496515 137 1.496533 138 1.496498 139 1.496480 140 1.496498 141 1.496463 142 1.496480 143 1.496533 144 1.496498 145 1.496567 146 1.496238 147 1.496498 148 1.496498 149 1.496480 150 1.496498 151 1.496515 152 1.496498 153 1.496515 154 1.496533 155 1.496533 156 1.496515 157 1.496480 158 1.496515 159 1.496533 160 1.496480 161 1.496533 162 1.496515 
163 1.496498 164 1.496480 165 1.496515 166 1.496515 167 1.496411 168 1.496429 169
[R] mgcv: Extract random effects from gam model
Hi everyone,

I can't figure out how to extract by-factor random effect adjustments from a gam model (mgcv package). Example (from ?gam.vcomp):

library(mgcv)
set.seed(3)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
a <- factor(sample(1:10, 400, replace = TRUE))
b <- factor(sample(1:7, 400, replace = TRUE))
Xa <- model.matrix(~ a - 1)    ## random main effects
Xb <- model.matrix(~ b - 1)
Xab <- model.matrix(~ a:b - 1) ## random interaction
dat$y <- dat$y + Xa %*% rnorm(10) * .5 + Xb %*% rnorm(7) * .3 + Xab %*% rnorm(70) * .7
dat$a <- a; dat$b <- b

mod <- gam(y ~ s(a, bs = "re") + s(x2, k = 15), data = dat)

When I run plot(mod) I can see the adjustments for the a factor, but I don't know which adjustment is associated with which factor level. Is it possible to extract the information underlying this plot numerically?

Thanks!
Jan

--
View this message in context: http://r.789695.n4.nabble.com/mgcv-Extract-random-effects-from-gam-model-tp4637415.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] setar function error message
Hello, It works for me (with a warning message), by adding this line before the setar procedure: a - as.ts(a) As I don't know the time step, thus it is only a pseudo time-series. Regards, Pascal Le 23/07/2012 18:18, Ario Ario a écrit : Hi, I apologize for not attaching a complete dataset. The dataset contains only 1000 observations. Please find it in the attachment. Thanks (Ario) On Mon, Jul 23, 2012 at 3:46 PM, Pascal Oettli kri...@ymail.com wrote: Hello, Your problem is nto reproducible without the complete a dataset. Regards, Pascal Le 23/07/2012 09:49, Ario Ario a écrit : Hi all, I have problem to estimate a SETAR model. I always get an error message. Here is the code: ## there are 4175 observation in the series (a). a[1:10,1] [1] 1.496498 1.496602 1.496636 1.496515 1.496515 1.496463 1.496429 1.496549 1.496480 [10] 1.496498 library(tsDyn) selectSETAR(a, m=2)Using maximum autoregressive order for low regime: mL = 2 Using maximum autoregressive order for high regime: mH = 2 Searching on 1552 possible threshold values within regimes with sufficient ( 15% ) number of observations Searching on 6208 combinations of thresholds ( 1552 ), thDelay ( 1 ), mL ( 2 ) and MM ( 2 ) Results of the grid search for 1 threshold thDelay mL mH th pooled-AIC 10 2 2 1.674402 -43590.12 20 2 2 1.674218 -43588.81 30 2 2 1.673758 -43588.24 40 2 2 1.674011 -43588.19 50 2 2 1.674480 -43587.85 60 2 2 1.673988 -43587.62 70 2 2 1.673666 -43587.56 80 2 2 1.674471 -43587.16 90 2 2 1.673620 -43586.51 10 0 2 2 1.673574 -43585.83 mod.star - setar(a, m=2, steps=5, thDelay=0, th=1.674402)Error in rep(NA, length(x) - length(reg)) : invalid 'times' argument Would someone tell me the meaning of this error message? and how to solve the problem? 
Kind Regards, Ario
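For reference, a minimal sketch of the suggested fix (assuming the attached data were read into a one-column data frame 'a'; the conversion just makes a pseudo time series, since the true time step is unknown):

```r
library(tsDyn)

## setar() expects a univariate ts-like object, so convert the
## data frame column to a (pseudo) time series first, then refit
a <- as.ts(a[, 1])
mod.star <- setar(a, m = 2, steps = 5, thDelay = 0, th = 1.674402)
summary(mod.star)
```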
Re: [R] mgcv: Extract random effects from gam model
Hi Jan, coef(mod) will extract the coefficients, including those for a. They are labelled, and for 'a' they are in the same order as levels(a). best, Simon On 23/07/12 10:08, janvanhove wrote: [...] -- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK +44 (0)1225 386603 http://people.bath.ac.uk/sw283
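A sketch of pulling the adjustments out numerically ('mod' and 'dat' are from the question; the "s(a).1" ... "s(a).10" coefficient names are how mgcv labels random-effect smooth coefficients, but do check names(coef(mod)) on your own model first):

```r
cf <- coef(mod)
## pick out the coefficients belonging to the s(a) smooth
re <- cf[grepl("s(a)", names(cf), fixed = TRUE)]
## per Simon's reply, they are in the same order as the factor levels
names(re) <- levels(dat$a)
re
```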
Re: [R] 3D scatterplot, using size of symbols for the fourth variable
On 07/23/2012 06:20 PM, elpape wrote: Dear R fans, I would like to create a scatterplot showing the relationship between 4 continuous variables. I thought of using the package scatterplot3d to have a 3-dimensional plot and then using the size of the symbols to represent the 4th variable. Does anybody know how to do this? Hi Ellen, The plot is pretty crowded, but you could try this (newdata is your data frame):

library(plotrix)
size_n_color(newdata$MI, newdata$TD, size = (newdata$EG_18 - 10.5)/1000,
 col = color.scale(newdata$Dstar, 1, c(0,1,0), c(0,0,1), xrange = c(60,80)),
 yat = c(2.8,3.2,3.6), ylim = c(2.75,3.8), xlab = "MI", ylab = "TD",
 main = "Four dimensional plot")
color.legend(2.3, 2.52, 2.5, 2.57, seq(60,80,by=4),
 color.scale(seq(60,80,by=4), 1, c(0,1,0), c(0,0,1)))
mtext("Size = EG_18", side = 1, at = 2.83, line = 3)

Jim
[R] marginal effect lmer
Hi everybody, I am trying to calculate and display the marginal effect(s) in a hierarchical model using lmer. Here is my model:

m1 <- lmer(vote2011 ~ Catholic + Attendance + logthreshold + West + Catholicproportion +
           (Catholic * Catholicproportion) + (Attendance * Catholicproportion) +
           Catholicproportion² + (Catholic * Catholicproportion²) +
           (Attendance * Catholicproportion²) +
           (1 + Attendance + Catholic | canton), data = dat2011, verbose = T)

I want to display the marginal effect of the individual-level variable Catholic depending on the contextual variable Catholicproportion (showing also the 95% CI). So far I have tried the effects package, but without success. Does anybody know how to display that? Thanks in advance for your help! Andigo -- View this message in context: http://r.789695.n4.nabble.com/marginal-effect-lmer-tp4637421.html Sent from the R help mailing list archive at Nabble.com.
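One way to get this by hand, sketched under the assumption that the relevant fixed-effect terms are named as below (the names in 'idx' are hypothetical; check names(fixef(m1)) and adjust them to match your fitted model):

```r
## marginal effect of Catholic as a function of Catholicproportion (cp):
## d(vote2011)/d(Catholic) = b1 + b2*cp + b3*cp^2, with a delta-method CI
b  <- fixef(m1)
V  <- as.matrix(vcov(m1))
cp <- seq(0, 1, length.out = 100)
X  <- cbind(1, cp, cp^2)                        # derivative w.r.t. Catholic
idx <- c("Catholic", "Catholic:Catholicproportion",
         "Catholic:I(Catholicproportion^2)")    # hypothetical names -- adjust!
me <- X %*% b[idx]
se <- sqrt(diag(X %*% V[idx, idx] %*% t(X)))
plot(cp, me, type = "l", ylim = range(me - 2*se, me + 2*se),
     xlab = "Catholicproportion", ylab = "Marginal effect of Catholic")
lines(cp, me - 1.96*se, lty = 2); lines(cp, me + 1.96*se, lty = 2)
```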
[R] help with Loop
hello there, I'm an R beginner and got plunged into this. I guess my attempts are hopeless so far, so I won't even show them. I want to write a loop which prints all erroneous values. My definition of erroneous: if the current counts (partridge counts in a hunting district) differ from last year's counts by more than 50 percent and the absolute values differ by more than 5 animals, I want R to print these values. I have a grouping variable District D, the year Y and the counts C. Example table:

D Y    C
a 2005 10
a 2006 0
a 2007 9
b 2005 1
b 2006 0
b 2007 1
c 2005 5
c 2006 NA
c 2007 4

Although the difference in a and b is 100 percent, I would doubt a's population breakdown, whereas district b is credible. To confuse things, I want the loop to skip missing values and instead look at the year after. Any help is very much appreciated! Thanks, Katrin
Re: [R] 3D scatterplot, using size of symbols for the fourth variable
Hello, Maybe the plot argument 'cex' will do it. ?par # in particular the entry for 'cex' s3d$points(MI, TD, EG_18, pch = 16, col = Dstarscale, cex=2*(Dstar - min(Dstar) + 0.3)) Multiply 2 into it because the points are very small. Hope this helps, Rui Barradas Em 23-07-2012 09:20, elpape escreveu: Dear R fans, I would like to create a scatterplot showing the relationship between 4 continuous variables. I thought of using the package scatterplot 3d to have a 3-dimensional plot and then using the size of the symbols to represent the 4th variable. Does anybody know how to do this? I already tried to create this graph using the colour of the symbols, but I was unable to generate a legend. Can anyone perhaps tell me why? #dataframe structure(list(Shannon = c(2.707, 2.525, 2.52, 2.447, 2.655, 2.404, 2.63, 3.427, 3.381, 2.968, 3.247, 3.063, 3.284, 3.24, 2.948, 3.133, 3.482, 3.467, 3.474, 3.659, 3.294, 3.394, 3.293, 3.312, 3.386, 3.755, 3.419, 3.586, 3.539, 3.464, 3.461), H0 = c(19L, 14L, 17L, 13L, 17L, 14L, 17L, 46L, 49L, 29L, 45L, 51L, 65L, 65L, 43L, 56L, 65L, 58L, 59L, 68L, 43L, 55L, 52L, 51L, 55L, 81L, 74L, 86L, 57L, 53L, 58L), H1 = c(14.98425525, 12.49089526, 12.42859666, 11.55363377, 14.22498606, 11.06735739, 13.8737699, 30.78415163, 29.40015657, 19.45297471, 25.71308484, 21.3916359, 26.68228868, 25.53372175, 19.06778001, 22.94270452, 32.52470648, 32.04047669, 32.26554685, 38.82250095, 26.95045014, 29.78485372, 26.92351316, 27.43995053, 29.54752547, 42.73421981, 30.53886089, 36.08942911, 34.4324695, 31.9444993, 31.8488094), H2 = c(11.755, 11.255, 9.507, 10.125, 12.042, 8.397, 11.791, 23.171, 20.917, 13.902, 15.813, 13.739, 15.654, 14.692, 12.078, 13.546, 21.211, 20.437, 22.152, 26.635, 18.077, 19.381, 18.042, 18.856, 19.848, 28.671, 18.047, 21.797, 24.508, 22.898, 21.368), Hinf = c(5.128205128, 7.692307692, 4.504504505, 5.988023952, 6.802721088, 3.831417625, 6.493506494, 9.708737864, 9.615384615, 7.142857143, 5.376344086, 5.988023952, 6.25, 6.060606061, 4.975124378, 
5.291005291, 10, 6.756756757, 9.090909091, 10.1010101, 6.41025641, 7.299270073, 8, 9.174311927, 9.345794393, 12.65822785, 7.407407407, 8.130081301, 10.1010101, 9.803921569, 9.523809524), J = c(0.92, 0.957, 0.889, 0.954, 0.937, 0.911, 0.928, 0.895, 0.869, 0.881, 0.853, 0.779, 0.787, 0.776, 0.784, 0.778, 0.834, 0.854, 0.852, 0.867, 0.876, 0.847, 0.833, 0.842, 0.845, 0.855, 0.794, 0.805, 0.875, 0.872, 0.852), EG_18 = c(11.89625579, 12.12535291, 10.52322645, 13, 11.94721408, 11.74138905, 11.41864537, 13.51174611, 13.08380677, 11.98338672, 12.50012399, 11.33153181, 12.05860699, 11.79889475, 10.978733, 11.56314923, 13.04645068, 13.19345708, 13.24637756, 13.8929676, 12.76104192, 12.92668353, 12.46911655, 12.62784458, 12.87743784, 14.10318753, 12.5553675, 13.19308414, 13.6559399, 13.38873489, 13.18810233), TD = c(3.584221748, 3.327044025, 3.260869565, 3.52173913, 3.635220126, 2.890710383, 3.136082474, 3.433096849, 3.49944009, 3.482109228, 3.561288523, 2.994932583, 3.36222162, 3.252259735, 2.986886848, 3.037585943, 3.181734279, 3.012922623, 3.194701667, 3.280983686, 3.114578401, 3.092463169, 3.352683024, 3.283793025, 3.358701828, 3.445251948, 2.873097033, 3.03782087, 3.706619585, 3.325312529, 3.34325186 ), MI = c(2.780487805, 2.52173913, 2.6, 2.8, 2.911764706, 2.434782609, 2.435897436, 2.75, 2.748, 2.720930233, 2.818584071, 2.377319588, 2.502692998, 2.565891473, 2.391644909, 2.418439716, 2.663179916, 2.630177515, 2.550724638, 2.696103896, 2.75464684, 2.703583062, 2.825072886, 2.721088435, 2.782334385, 2.8, 2.618834081, 2.67, 2.743589744, 2.766917293, 2.678321678 ), D = c(71.81184669, 58.1027668, 70.89466089, 71.0550887, 65.69900688, 70.1863354, 61.76980914, 69.90865312, 67.91876076, 69.8495212, 70.04579295, 64.99869296, 67.95894908, 68.35197804, 64.01693209, 64.74605873, 69.7579387, 68.87333164, 67.97582936, 71.24014378, 71.19738387, 71.54931462, 71.16698452, 70.27831123, 69.36640407, 70.84656085, 66.61085878, 68.05601309, 70.77177025, 71.58583791, 70.39500159), Dstar = 
c(76.57440089, 60.99585062, 77.46767581, 74.46183953, 69.54177898, 76.21091355, 65.7635468, 72.70359475, 71.04353504, 74.38811189, 74.44360179, 70.00021147, 72.46614626, 73.26103888, 69.67652555, 69.76706304, 73.05627881, 72.20245655, 70.9831013, 73.8268811, 75.1172186, 75.19617965, 75.12340981, 73.96171487, 72.8167, 73.22565624, 70.36018789, 71.22055825, 73.51203798, 74.57343, 73.61652018), Dplus = c(79.03091061, 63.89324961, 75.94537815, 74.35897436, 70.27310924, 76.13814757, 72.89915966, 72.10489993, 71.81729835, 72.06192822, 73.14574315, 73.91253644, 74.60851648, 73.50481859, 75.84204413, 73.58345358, 74.11401099, 73.79656037, 73.38231611, 73.45415778, 75.39406006, 75.74795575, 72.89377289, 74.42016807, 74.15103415, 73.55820106, 71.94689797, 73.71748699, 72.52953813, 74.98444951, 74.6151092), Lplus = c(435.1033529, 252.705357, 668.3739672, 828.6707188, 504.3671881, 493.6306125, 577.0690629, 458.7755483, 507.6234298, 470.0924753, 481.281377,
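On the unanswered legend question: scatterplot3d draws on an ordinary 2D device, so a plain legend() call after plotting works. A rough sketch with made-up data (only the variable names are taken from the posted data frame):

```r
library(scatterplot3d)
set.seed(1)
dd <- data.frame(MI = runif(31, 2.3, 3), TD = runif(31, 2.8, 3.8),
                 EG_18 = runif(31, 10, 15), Dstar = runif(31, 60, 80))
## bin the 4th variable into 5 colour classes
brks <- cut(dd$Dstar, breaks = 5)
cols <- heat.colors(5)[as.integer(brks)]
scatterplot3d(dd$MI, dd$TD, dd$EG_18, pch = 16, color = cols,
              xlab = "MI", ylab = "TD", zlab = "EG_18")
legend("topright", legend = levels(brks), pch = 16,
       col = heat.colors(5), title = "Dstar", cex = 0.7)
```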
Re: [R] help with Loop
Hello, Try the following.

d <- read.table(text = "
D Y C
a 2005 10
a 2006 0
a 2007 9
b 2005 1
b 2006 0
b 2007 1
c 2005 5
c 2006 NA
c 2007 4
", header = TRUE)
d
prn <- lapply(split(d, d$D), function(x){
    x <- x[!is.na(x$C), ]
    x[c(FALSE, diff(x$C)/x$C[-length(x$C)] < -0.5 & diff(x$C) < -5), ]
})
do.call(rbind, prn)

Hope this helps, Rui Barradas Em 23-07-2012 11:08, Katrin Ronnenberg escreveu: [...]
Re: [R] car::Anova - Can it be used for ANCOVA with repeated-measures factors.
On Jul 23, 2012, at 02:48 , John Fox wrote: [snip long discussion which I admit not to have studied in every detail...] Unfortunately, my involvement with this issue has led me to another question. Winer and Kirk both discuss a split-plot ANCOVA in which one has measured a covariate for each observation. That is, a second matrix like the original data matrix, e.g. the body temperature of each person at each measurement for the OBrienKaiser dataset: OBK.cov <- OBrienKaiser; OBK.cov[, -(1:2)] <- runif(16*15, 36, 41) Would it be possible to fit the data using this temperature matrix as a covariate using car::Anova (I thought about this but couldn't find any idea of how to specify the imatrix)? I'm afraid that Anova() won't handle repeated measures on covariates. I agree that it would be desirable to do so, and this capability is on my list of features to add to Anova(), but I can't promise when, or if, I'll get to it. Here There Be Tygers... These models very easily get into territory that does not fall within the realm of standard (multivariate) linear modeling, and I'm not sure you really want it to be handled by a tool like Anova(). There is some risk that I will find myself writing half a treatise in email, but let's look at a simple example: a simple randomized block design with treatments (say, Variety) and a covariate (say, Eelworm). In much of the ANCOVA ideology there is an assumption that the covariate is independent of treatment, typically a pre-randomization measurement. Now, using standard univariate theory, you can easily fit a model like Yield ~ Variety + Eelworm + Block in which there is a single regression coefficient on Eelworm, and the Variety effects are said to be adjusted for differences in eelworm count. You can do this with lm(), or with aov() as you please. However, in the latter case, you might formulate the model with a random Block effect, i.e.
Yield ~ Variety + Eelworm + Error(Block) In that case, you will find that you get two estimates of the Eelworm effect, one from each stratum. This comes about via interblock information: If there's a high average Yield in blocks where the average Eelworm is low, then this says something about the effect of Eelworm. The estimate from the within-Block stratum will be the same as in the model with non-random Block effects. If you believe in a mechanistic explanation for the Eelworm effect, you would likely believe that the two regression coefficients estimate the same quantity and you could try combining the estimates into one (recovery of interblock information). However, this messes up all standard theory and since the interblock estimate is usually quite inaccurate, one often decides to discard it. (Mixed-effects software happily fits such models, at the expense of precise degrees of freedom-theory.) There's an alternative interpretation in the form of a two-dimensional model, cbind(Yield, Eelworm) ~ Variety + Error(Block) In that model, you get two-dimensional contrasts, and covariance matrices for each stratum. Then you can utilize the fact that if it is known that the contrasts for the covariate are zero, then the mean of the response (i.e. Yield) is the same as the conditional mean given the covariate equals zero, which is the intercept in the conditional regression model, which is the adjusted ANCOVA contrast yet again. One difference is that in the two-dimensional response model, it is not obvious that the between and within covariance matrices need to share a common regression coefficient. If you think about this with a view to potential measurement errors in the covariate, it becomes clear that the two regression coefficients could well be different. In the repeated measurements setting, we make a shift from an additive Block effect to a multidimensional Yield-response (corresponding to a reshape from long to wide). 
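The two-strata point above can be illustrated with simulated data (all numbers, the variable names, and the -0.1 slope are invented for the sketch): the fixed-Block fit gives one Eelworm coefficient, while the Error(Block) fit reports Eelworm in both the Block stratum (the interblock estimate) and the Within stratum.

```r
set.seed(42)
## a randomized block design: 3 Varieties in each of 8 Blocks
d <- expand.grid(Variety = factor(1:3), Block = factor(1:8))
d$Eelworm <- rpois(nrow(d), 20)
d$Yield <- 10 + as.numeric(d$Variety) - 0.1 * d$Eelworm +
           rnorm(8)[d$Block] + rnorm(nrow(d), sd = 0.5)

## fixed Block: a single Eelworm regression coefficient
summary(lm(Yield ~ Variety + Eelworm + Block, data = d))

## random Block: Eelworm is estimated once per error stratum
summary(aov(Yield ~ Variety + Eelworm + Error(Block), data = d))
```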
Let us say, for convenience, that there are 3 Varieties; then we are looking at a 3-dimensional response. If we want to study Variety effects, we can decompose the response into a set of contrasts and an average, discard the latter, and use multivariate tests for zero mean of the contrasts. To introduce a covariate at this point gets tricky because the standard linear model assumes the same design matrix for all responses, so you cannot have Yield1 depend on Eelworm1 only, Yield2 on Eelworm2, etc., although you could potentially have all responses depend on all covariates, leaving you with 9 regression coefficients. However, it is not at all clear that you can compare the intercepts between varieties in such a model. One viewpoint is that this is really a (2x3=6)-dimensional response problem if we consider Yield and Eelworm simultaneously. However, it is possible to transform both the Yield and the Eelworm variables to contrasts, for a 4-dimensional response, consisting of two Yield contrasts and two Eelworm contrasts. If the
Re: [R] Bug in my code (finding nonzero min)
Strictly, it's calculating the minimum of the positive values in each row. And you could do that with a bit less code using minn[,1] <- apply(remain, 1, function(x) min(x[x > 0])) -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of wwreith Sent: 23 July 2012 05:35 To: r-help@r-project.org Subject: [R] Bug in my code (finding nonzero min) Can someone verify for me if the for loop below is really calculating the nonzero min for each row of a matrix? I have a bug somewhere in this section of code. My first guess is how I am finding the nonzero min of each row of my matrix. The overall idea is to make sure I am investing all of my money, i.e. new.set is a set of indicator variables for each stock for a particular portfolio, i.e. 0 = did not buy, 1 = bought. y are the stocks I could still buy, assuming I have the money, data3[,5] are their costs, so for each portfolio, i.e. the rows of new.set, I have the option to purchase another stock at a cost listed in the rows of the variable remain. Obviously the cheapest stock needs to have a cost > 0 in order for me to be allowed to buy it. My code is intended to weed out portfolios where I could have bought another stock, by taking budget - portfolio cost - (cheapest available stock) and subsetting new.set when this is negative, i.e. buying the cheapest available stock puts me over budget. My problem is that my code is still allowing examples like the following: budget of 10, portfolio cost 6, cheapest available stock 3, despite the diff variable being negative. Any ideas?

y <- 1 - new.set[, 6:26]
remain <- y * data3[, 5]
minn <- matrix(0, nrow(remain), 1)
for(q in 1:nrow(remain)) {
  remainc <- remain[q, ]
  minn[q, ] <- min(remainc[which(remainc > 0)])
}
maxcost <- matrix(150, nrow(new.set), 1)
diff <- maxcost[, 1] - new.set[, 5] - minn[, 1]
new.set <- subset(new.set, diff > 0)

-- View this message in context: http://r.789695.n4.nabble.com/Bug-in-my-code-finding-nonzero-min-tp4637399.html Sent from the R help mailing list archive at Nabble.com.
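A toy check of the apply() one-liner suggested above (the 2x3 matrix is made up):

```r
## row-wise minimum of the positive entries only
remain <- matrix(c(0, 3, 1,
                   2, 0, 5), nrow = 2, byrow = TRUE)
apply(remain, 1, function(x) min(x[x > 0]))
## row 1: min of {3, 1} = 1; row 2: min of {2, 5} = 2
```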
Re: [R] 3D scatterplot, using size of symbols for the fourth variable
Hi, Both options worked fine! Thanks for the help! Regards Ellen On 23 July 2012 13:19, Rui Barradas [via R] ml-node+s789695n4637424...@n4.nabble.com wrote: [...]
[R] Fwd: 3D scatterplot, using size of symbols for the fourth variable
Hi, Both options work fine! Thanks! I'll play around with the scripts some more to see which way is the best to visualise the data. Regards Ellen -- Forwarded message -- From: ellen pape ellen.p...@gmail.com Date: 23 July 2012 14:40 Subject: Re: 3D scatterplot, using size of symbols for the fourth variable To: Rui Barradas [via R] ml-node+s789695n4637424...@n4.nabble.com [...]
Re: [R] [RsR] How does rlm in R decide its w weights for each IRLSiteration?
rlm includes an MM estimator. S Ellison -----Original Message----- From: Maheswaran Rohan [mailto:mro...@doc.govt.nz] Sent: 22 July 2012 23:08 To: Valentin Todorov; S Ellison Cc: r-sig-rob...@r-project.org; r-help Subject: RE: [RsR] How does rlm in R decide its w weights for each IRLS iteration? Hi Valentin, If the contamination is mainly in the response direction, an M-estimator provides good estimates for the parameters and rlm can be used. Rohan -----Original Message----- From: r-sig-robust-boun...@r-project.org [mailto:r-sig-robust-boun...@r-project.org] On Behalf Of Valentin Todorov Sent: Saturday, 21 July 2012 6:57 a.m. To: S Ellison Cc: r-sig-rob...@r-project.org; r-help Subject: Re: [RsR] How does rlm in R decide its w weights for each IRLS iteration? Hi Michael, S Ellison, I do not actually understand what you want to achieve with the M estimates of rlm in MASS, but why don't you give lmrob in 'robustbase' a try? Please have a look in the references (?lmrob) for the advantages of MM estimators over M estimators. Best regards, Valentin On Fri, Jul 20, 2012 at 5:11 PM, S Ellison s.elli...@lgcgroup.com wrote: -----Original Message----- Subject: [RsR] How does rlm in R decide its w weights for each IRLS iteration? I am also confused about the manual: a. The input argument wt.method: are the weights case weights (giving the relative importance of each case, so a weight of 2 means there are two of these), or the inverse of the variances (so a weight of two means this error is half as variable)? When you give rlm weights (called 'weights', not 'w', on input, though you can abbreviate to 'w'), you need to tell it which of these two possibilities you used. If you gave it case numbers, say wt.method="case"; if you gave it inverse variance weights, say wt.method="inv.var". The default is "inv.var". The input argument w is used for the initial values of the rlm IRLS weighting and the output value w is the converged w. There is no input argument 'w' for rlm (see above).
The output w values are calculated using the psi function, so they lie between 0 and 1. The effective weights for the final estimate would then be something like w*weights, using the full name of the input argument (and if I haven't forgotten a square root somewhere). At least, that would work for a simple location estimate (e.g. rlm(x~1)). If my understanding above is correct, how does rlm decide its w for each IRLS iteration then? It uses the given psi functions to calculate the iterative weights based on the scaled residuals. Any pointers/tutorials/notes on the calculation of these w's in each IRLS iteration? Read the cited references for a detailed guide. Or, of course, MASS - the package is, after all, intended to support the book, not replace it. S Ellison ___ r-sig-rob...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-robust
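To make the two weight interpretations concrete, a small sketch using the built-in stackloss data (the all-ones input weights are just for illustration, so both fits coincide here):

```r
library(MASS)

## inverse-variance weights (the default interpretation)
fit.inv  <- rlm(stack.loss ~ ., data = stackloss,
                weights = rep(1, nrow(stackloss)), wt.method = "inv.var")

## case weights (a weight of 2 = "two copies of this observation")
fit.case <- rlm(stack.loss ~ ., data = stackloss,
                weights = rep(1, nrow(stackloss)), wt.method = "case")

## the converged IRLS (psi-based) weights, between 0 and 1
round(fit.inv$w, 3)
```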
Re: [R] Speeding up a loop
1.15 60 0.553555415 0.574892872
1.15 60 0.563183983 0.564029359
Shouldn't the function rule out the second one, since it is higher in position 3 and lower in position 4, i.e. it should not all be yes? -- View this message in context: http://r.789695.n4.nabble.com/Speeding-up-a-loop-tp4637201p4637438.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Workspace Question.
The objects from an R session are saved in the working directory of the session (in the file .RData). The answer to your question will depend on whether you started the different versions of R using shortcuts located in different folders or the same folder. The objects should not be automatically deleted, so I would think that unless you deleted them yourself, you are dealing with a situation where you are not starting the previous version of R with the same working directory as before. Sorry this is only general advice, but there is not enough detail to say more. Rob On 7/23/2012 2:36 AM, Michael Trianni wrote: I recently installed R version 2.14.1. After a session (not my first) in 2.14.1, I saved the 'workspace image'. I then opened earlier R versions, 2.14.0 and 2.12.0, and the only objects listed were from the 2.14.1 session. All the work under those previous versions was gone, to my dismay. This did not happen when I was working in 2.14.0, as 2.12.0 objects were not affected when saving the workspace image in 2.14.0. Are the 2.14.0 and 2.12.0 objects retrievable or permanently deleted? Thanks for any input, Mike
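One way to avoid depending on whichever working directory a particular shortcut starts in is to save the workspace to an explicit file and load it back by path. A minimal sketch (the file name is illustrative, not Mike's actual path):

```r
x <- 1:10
f <- file.path(tempdir(), "session-2.14.1.RData")  # hypothetical path

save.image(file = f)   # writes every object in the global environment
rm(x)                  # simulate starting a fresh session
load(f)                # restores the saved objects into this session
x
```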
Re: [R] marginal effect lmer
Dear Andigo, You don't say what problems you encountered, and I don't know how to interpret the superscript 2s in your lmer() command, but if the intention is to fit a quadratic in Catholicproportion, then you can use poly(Catholicproportion, 2) in formulating the model. That way, effect() will be able to figure out the structure of the model. BTW, the effects computed by effect() are not what are commonly called marginal effects, but rather fitted values under the model for particular combinations of predictors. I hope this helps, John John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ On Mon, 23 Jul 2012 02:46:32 -0700 (PDT) Andigo andreas.goldb...@unige.ch wrote: Hi everybody, I am trying to calculate and display the marginal effect(s) in a hierarchical model using lmer. Here is my model: m1 <- lmer(vote2011 ~ Catholic + Attendance + logthreshold + West + Catholicproportion + (Catholic * Catholicproportion) + (Attendance * Catholicproportion) + Catholicproportion² + (Catholic * Catholicproportion²) + (Attendance * Catholicproportion²) + (1 + Attendance + Catholic | canton), data = dat2011, verbose = T) I want to display the marginal effect of the individual-level variable Catholic depending on the contextual variable Catholicproportion (also showing the 95% CI). So far I tried a bit with the effects package, but without success. Does anybody know how to display that? Thanks in advance for your help! Andigo -- View this message in context: http://r.789695.n4.nabble.com/marginal-effect-lmer-tp4637421.html Sent from the R help mailing list archive at Nabble.com. 
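John's poly() suggestion can be illustrated with plain lm() on made-up data (the variable names x, z, y are hypothetical): a model written with poly(x, 2, raw = TRUE) gives exactly the same fit as one with hand-built squared terms, but records the quadratic as a single term that functions such as effects::effect() can recognize.

```r
set.seed(42)
d <- data.frame(x = runif(100), z = rnorm(100))
d$y <- 1 + 2 * d$x - 3 * d$x^2 + 0.5 * d$z + rnorm(100, sd = 0.1)

# quadratic in x, interacted with z, written two ways
m_poly <- lm(y ~ z * poly(x, 2, raw = TRUE), data = d)
m_hand <- lm(y ~ z * x + z * I(x^2), data = d)

# the fitted values agree; only the term bookkeeping differs
all.equal(fitted(m_poly), fitted(m_hand))
```

The same poly() idiom carries over to an lmer() formula, which is the point of John's advice.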
[R] Sparse Logistic PCA package?
Hi UseRs, Has anyone come across an R package (or any other software) that can implement Sparse Logistic PCA (an extension to Sparse PCA that works in the presence of binary-type data)? Specifically, I have read the 2010 paper by Lee et al. called Sparse Logistic Principal Component Analysis for Binary Data. Thanks, Ash
[R] CSV format issues
Dear all, I have some encoding problem which I'm not familiar with. Here is the case: I read data files which can have been generated from a computer with global settings either in French or in English. Here is an example of a data file:
* English output
Time,Value
17,-0.0753953
17.05,-6.352454E-02
* French output
Time,Value
32,-7,183246E-02
32,05,3,469364E-02
In the first case, I can totally retrieve both columns, splitting each line using the comma as a separator. In the second case, it's impossible, since the comma (in French) is also used as the decimal separator. Usually, the French CSV file format adds quotes to distinguish the comma used as a column separator from the comma used as a decimal, like the following:
Time,Value
"32","-7,183246E-02"
"32,05","3,469364E-02"
Since I'm expecting 2 numbers, I can say that if there are 3 commas, the first two fields are to be joined, as well as the remaining two. But in the case of only two commas, which number is the floating one? (I know that it is the second one, but this is a great source of bugs...) The unix tool file returns:
===
$ file P23_RD06_High\ Sensitivity\ DNA\ Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv
$ P23_RD06_High Sensitivity DNA Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv: ASCII text, with CRLF line terminators
===
Unfortunately, the raw file doesn't contain the precious quotes. Sorry to bother you with this question, which is not totally related to R (which I'm using). Do you know if there are any facilities using R to get the data into the right format? Bests, -- Guillaume Meurice - PhD
[R] Help with Portfolio Optmization
Hi, I need some help with a portfolio optimization problem. I am trying to find the minimum variance portfolio subject to constraints on the weights like x1 <= w1 <= x2 and x3 <= w2 <= x4. I need help with solving for the minimum variance portfolio, as solve.QP doesn't allow me to specify the lower boundaries. Thanks Mahesh -- View this message in context: http://r.789695.n4.nabble.com/Help-with-Portfolio-Optmization-tp4637425.html Sent from the R help mailing list archive at Nabble.com.
[R] Large data set
Hi all, Have a problem. Trying to read in a data set that has about 112,000,000 rows and 8 columns, and obviously enough it was too big for R to handle. The columns are made up of 2 integer columns and 6 logical columns. The text file is about 4.2 Gb in size. Also I have 4 Gb of RAM and 218 Gb of available space on the hard drive. I tried the dumpDF function but it was too big. Also tried bringing in the data in 10 sets of about 12,000,000. Are there other ways of getting around the size of the data? Regards, Lorcan
[R] date conversion
Hello, when I convert a factor like a = "01.10.2009" into a date using b = strptime(a, format = "%d.%m.%Y"), it works very well. But when I try to convert c = "2009/10" into a date using d = strptime(c, format = "%Y/%m"), it doesn't work at all. I get d = NA. What did I do wrong? Thank you very much for your help! best regards Claudia
[R] accumulation curve
Hello, I am trying to perform accumulation curves with the function accumcomp (package BiodiversityR). I am working on a data set of species - sampling date - sampling sites, and I would like to see the impact of sampling pressure (increasing sampling dates) on species richness. Nevertheless, all my sites belong to different climatic and edaphic contexts, which naturally leads to differences in the size of the habitat species pools, i.e. the number of species potentially able to colonize my site, and consequently in the species richness occurring in my different sites. So, when plotting my accumulation curves, instead of having species richness in y and number of sampling dates in x, I would like to have y = % of total species richness encountered within the site and x = number of sampling dates. I don't know how to change the code of accumcomp or accumresult (BiodiversityR package) or specaccum (vegan package; the first two functions of BiodiversityR are based on the third one). Please, does anyone have any idea? Thanks Michaël -- Michaël Aubert Maître de Conférences Groupe de Recherche Ecodiv, UPRES-EA 1293 Fédération de Recherche SCALE (FED 4116) UFR Sciences Techniques, Université de Rouen 76821 Mont Saint Aignan Tel: 33+(0)232769447 - Fax: 33+(0)235146655
Re: [R] FIML using lavaan returns zeroes for coefficients
Thanks for the helpful explanation. As to your question, I sometimes use lavaan to fit univariate regressions simply because it can handle missing data using FIML rather than listwise deletion. Are there reasons to avoid this? BTW, thanks for the update in the development version. Andrew Miles On Jul 21, 2012, at 12:59 PM, yrosseel wrote: On 07/20/2012 10:35 PM, Andrew Miles wrote: Hello! I am trying to reproduce (for a publication) analyses that I ran several months ago using lavaan, I'm not sure which version, probably 0.4-12. A sample model is given below: pathmod <- 'mh30days.log.w2 ~ mh30days.log + joingroup + leavegroup + alwaysgroup + grp.partic.w2 + black + age + bivoc + moved.conf + local.noretired + retired + ds + ministrytime + hrswork + nomoralescore.c + negint.c + cong.conflict.c + nomoraleXjoin + nomoraleXleave + nomoraleXalways + negintXjoin + negintXleave + negintXalways + conflictXjoin + conflictXleave + conflictXalways' mod1 <- sem(pathmod, data = sampledat, missing = "fiml", se = "robust") At the time, the model ran fine. Now, using version 0.4-14, the model returns all 0's for coefficients. What happened is that since 0.4-14, lavaan tries to 'detect' models that are just univariate regressions, and internally calls lm.fit instead of the lavaan estimation engine, at least when the missing = "ml" argument is NOT used. (BTW, I fail to understand why you would use lavaan if you just want to fit a univariate regression.) When missing = "ml" is used, lavaan normally checks if you have fixed x covariates (which you do), and if fixed.x = TRUE (which is the default). In 0.4, lavaan internally switches to fixed.x = FALSE (which implicitly assumes that all your predictors are continuous, but I assume you would not be using missing = "ml" otherwise). Unfortunately, for the 'special' case of univariate regression, it fails to do this. This behavior will likely change in 0.5, where, by default, only endogenous/dependent variables will be handled by missing = "ml", not exogenous 'x' covariates. 
To fix it: simply add the fixed.x=FALSE argument, or revert to 0.4-12 to get the old behavior. Hope this helps, Yves. http://lavaan.org
Re: [R] Adding gamma and 3-parameter log normal distributions to L-moments ratio diagram lmrd()
On 2012-07-22 03:31, otsvetkova wrote: How to adapt this piece of code but for: - gamma distribution - 3-parameter log-normal More specifically, where can I find the specification of the parameter (lmom) for pelgam() and pelln3()? The lmom package info just gives: pelgam(lmom), pelln3(lmom), where lmom is a numeric vector containing the L-moments of the distribution or of a data sample. # Draw an L-moment ratio diagram and add a line for the # Weibull distribution lmrd() weimom <- sapply( seq(0, 0.9, by=0.01), function(tau3) lmrwei(pelwei(c(0,1,tau3)), nmom=4) ) lmrdlines(t(weimom), col='darkgreen', lwd=2) source: http://rgm2.lab.nig.ac.jp/RGM2/func.php?rd_id=lmom:lmrdpoints Thank you. lmrd() by default draws lines for the Pearson type III distribution (which is a 3-parameter form of the gamma distribution) and the generalized normal distribution (a differently parametrized version of the 3-parameter lognormal distribution). Is this what you want? If not, I don't know what else you want to know about the specification of the parameter (lmom). J. R. M. Hosking
[R] Circular predictor in MARS
Hello, I am currently trying to build a statistical model using MARS (multivariate adaptive regression splines) with the earth package, v3.2-3. However, I could not find a way to add a circular predictor, namely the wind direction [0-360°]. In GAMs (package mgcv) there is the possibility to use special spline constructs such as s(X, bs="cp") or s(X, bs="cc"). Would you know of anything similar for MARS? Thanks and regards, Pierre master student Geneva
Re: [R] CSV format issues
try this; it looks for strings of numbers with commas and quotes them:
x <- readLines(textConnection("Time,Value
+ 32,-7,183246E-02
+ 32,05,3,469364E-02"))
# process the data, putting quotes around numbers in scientific notation
x.new1 <- gsub("(-?[0-9]+,[0-9]+E-?[0-9]+)", '"\\1"', x)
x.new1
[1] "Time,Value"             "32,\"-7,183246E-02\""   "32,05,\"3,469364E-02\""
# put quotes around the remaining plain numbers
x.new2 <- gsub("(-?[0-9]+,[0-9]+)(,|$)", '"\\1"\\2', x.new1)
x.new2
[1] "Time,Value"                 "32,\"-7,183246E-02\""       "\"32,05\",\"3,469364E-02\""
temp <- tempfile()
writeLines(x.new2, temp)
x.input <- read.csv(temp)
x.input
   Time         Value
1    32 -7,183246E-02
2 32,05  3,469364E-02
On Mon, Jul 23, 2012 at 9:06 AM, Guillaume Meurice guillaume.meur...@igr.fr wrote: Dear all, I have some encoding problem which I'm not familiar with. Here is the case : I'm read data files which can have been generated from a computer either with global settings in french or in english. Here is an exemple ouf data file : * English output Time,Value 17,-0.0753953 17.05,-6.352454E-02 * French output. Time,Value 32,-7,183246E-02 32,05,3,469364E-02 In the first case, I can totally retrieve both columns, splitting each line using the comma as a separator. In the second case, it's impossible, since the comma (in french) is also used to separate decimal. Usually, the CSV french file format add some quote, to distinguish the comma used as column separator from comma used as decimal, like the following : Time,Value 32,-7,183246E-02 32,05,3,469364E-02 Since I'm expecting 2 numbers, I can set that if there is 3 comma, the first two number are to be gathered as well as the two lefting ones. But in case of only two comma, which number is the floating one (I know that it is the second one, but who this is a great source of bugs ...). 
the unix tools file returns : === $ file P23_RD06_High\ Sensitivity\ DNA\ Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv $ P23_RD06_High Sensitivity DNA Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv: ASCII text, with CRLF line terminators === Unfortunately, the raw file doesn't contains the precious quote. So sorry to bother with this question which is not totally related to R (which I'm using). Do you know if there any facilities using R to get the data in the good format ? Bests, -- Guillaume Meurice - PhD -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Re: [R] date conversion
You have to supply a 'day' or the date is not known:
c = '2009/10'
d = strptime(paste0(c, '/1'), format = "%Y/%m/%d")
d
[1] "2009-10-01"
On Mon, Jul 23, 2012 at 9:05 AM, paladini palad...@beuth-hochschule.de wrote: Hello, when I convert a factor like a = "01.10.2009" into a date using b = strptime(a, format = "%d.%m.%Y"), it works very well. But when I try to convert c = "2009/10" into a date using d = strptime(c, format = "%Y/%m"), it doesn't work at all. I get d = NA. What did I do wrong? Thank you very much for your help! best regards Claudia -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Re: [R] Large data set
First of all, try to determine the smallest file you can read with an empty workspace. Once you have done that, break your file up into sets of that size and read them in. The next question is what you want to do with 112M rows of data. Can you process them a set at a time and then aggregate the results? I have no problem reading in files with 10M rows on a 32-bit version of R on Windows with 3GB of memory. So a little more information on the problem you are trying to solve would be useful. On Mon, Jul 23, 2012 at 8:02 AM, Lorcan Treanor lorcan.trea...@idiro.com wrote: Hi all, Have a problem. Trying to read in a data set that has about 112,000,000 rows and 8 columns and obviously enough it was too big for R to handle. The columns are mode up of 2 integer columns and 6 logical columns. The text file is about 4.2 Gb in size. Also I have 4 Gb of RAM and 218 Gb of available space on the hard drive. I tried the dumpDF function but it was too big. Also tried bring in the data is 10 sets of about 12,000,000. Are there are other ways of getting around the size of the data. Regards, Lorcan -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
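Jim's read-a-set-at-a-time suggestion can be sketched with read.table() on an open connection; specifying colClasses also avoids costly type guessing. The file and chunk size below are toy stand-ins for the real 112M-row data:

```r
# write a small example file standing in for the huge data set
f <- tempfile()
write.table(data.frame(a = 1:100, b = rnorm(100)), f,
            row.names = FALSE, col.names = FALSE)

chunk <- 30  # in practice: the largest number of rows that fits in memory
con <- file(f, open = "r")
total <- 0
repeat {
  d <- read.table(con, nrows = chunk,
                  colClasses = c("integer", "numeric"))
  total <- total + sum(d$V1)  # aggregate each chunk, then discard it
  if (nrow(d) < chunk) break  # a short read means the file is exhausted
}
close(con)
total  # sum of the first column, accumulated chunk by chunk
```

Each chunk is processed and released, so memory use is bounded by the chunk size rather than the file size.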
Re: [R] CSV format issues
On 23/07/2012 9:06 AM, Guillaume Meurice wrote: Dear all, I have some encoding problem which I'm not familiar with. Here is the case : I'm read data files which can have been generated from a computer either with global settings in french or in english. Here is an exemple ouf data file : * English output Time,Value 17,-0.0753953 17.05,-6.352454E-02 * French output. Time,Value 32,-7,183246E-02 32,05,3,469364E-02 In the first case, I can totally retrieve both columns, splitting each line using the comma as a separator. In the second case, it's impossible, since the comma (in french) is also used to separate decimal. Usually, the CSV french file format add some quote, to distinguish the comma used as column separator from comma used as decimal, like the following : You could convert columns to character to get this. R also has the read.csv2 and write.csv2 functions, which take a different approach, i.e. using a semicolon as a separator. If you want to automate this, you can use the Sys.localeconv() function to find the details of the current locale. Duncan Murdoch Time,Value 32,-7,183246E-02 32,05,3,469364E-02 Since I'm expecting 2 numbers, I can set that if there is 3 comma, the first two number are to be gathered as well as the two lefting ones. But in case of only two comma, which number is the floating one (I know that it is the second one, but who this is a great source of bugs ...). the unix tools file returns : === $ file P23_RD06_High\ Sensitivity\ DNA\ Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv $ P23_RD06_High Sensitivity DNA Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv: ASCII text, with CRLF line terminators === Unfortunately, the raw file doesn't contains the precious quote. So sorry to bother with this question which is not totally related to R (which I'm using). Do you know if there any facilities using R to get the data in the good format ? 
Bests, -- Guillaume Meurice - PhD
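As a footnote to Duncan's point: when the French files use the conventional semicolon separator, read.csv2() (or read.table() with dec = ",") parses the decimal commas directly. A minimal sketch with inline data:

```r
# semicolon-separated, comma-decimal: the usual French CSV convention
txt <- "Time;Value
32;-7,183246E-02
32,05;3,469364E-02"

d <- read.csv2(text = txt)   # sep = ";" and dec = "," are the defaults
d$Time                       # 32.00 32.05
d$Value                      # parsed as ordinary numerics
```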
Re: [R] CSV format issues
On Jul 23, 2012, at 15:06 , Guillaume Meurice wrote: Dear all, I have some encoding problem which I'm not familiar with. Here is the case : I'm read data files which can have been generated from a computer either with global settings in french or in english. Here is an exemple ouf data file : * English output Time,Value 17,-0.0753953 17.05,-6.352454E-02 * French output. Time,Value 32,-7,183246E-02 32,05,3,469364E-02 In the first case, I can totally retrieve both columns, splitting each line using the comma as a separator. In the second case, it's impossible, since the comma (in french) is also used to separate decimal. Usually, the CSV french file format add some quote, to distinguish the comma used as column separator from comma used as decimal, like the following : Time,Value 32,-7,183246E-02 32,05,3,469364E-02 Since I'm expecting 2 numbers, I can set that if there is 3 comma, the first two number are to be gathered as well as the two lefting ones. But in case of only two comma, which number is the floating one (I know that it is the second one, but who this is a great source of bugs ...). the unix tools file returns : === $ file P23_RD06_High\ Sensitivity\ DNA\ Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv $ P23_RD06_High Sensitivity DNA Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv: ASCII text, with CRLF line terminators === Unfortunately, the raw file doesn't contains the precious quote. So sorry to bother with this question which is not totally related to R (which I'm using). Do you know if there any facilities using R to get the data in the good format ? As you already observe, there can't be. There's just no way of seeing whether 32,7,8 is 32.7, 8.0 or 32.0, 7.8. That's why the usual CSV format in jurisdictions with comma as decimal separator has semicolon as the field separator. 
You may be able to scrape through with various heuristics, such as:
- a leading 0 likely belongs to a decimal part
- if there's an exponent, then there is likely a decimal point
- there can't be a sign in the decimal part
- times should be (roughly) equidistant and in increasing sequence
R is fairly well equipped with tools to let you create code to do this; look at strsplit(), count.fields(), grep(), etc., but it will be a fair bit of work, and you may still end up with a handful of truly ambiguous cases. Ultimately, however, the issue is that someone messed up the collection of data and now tries to make it your problem. In a consulting situation, that should cost serious extra money. Other options include:
- redo the data, this time with all computers set to English (if there's an internal storage format, this could be more realistic than you think)
- return the faulty data collecting software (some time back in the 1990's, the Paradox database had the same stupid bug of double usage of commas, but it is really unheard of in 2012)
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
[R] Creating panel data
I'm new to R. I am trying to create panel data with a slight alteration from a typical dataset. At present, I have data on a few hundred people with the dates of occurrence for several events (like marriage and employment). The dates are in year/quarter format, so 68.0 equals the 1st quarter of 1968 and 68.25 equals the 2nd quarter of 1968. If the event never occurred, 0 is recorded for the Year Of Occurrence. Somewhat redundantly, I also have separate dichotomous variables indicating whether the event ever occurred (0/1 format). For example:
x <- data.frame( id = c(1,2), Event1Occur = c(1,0), YearOfOccurEvent1 = c(68.25,0), Event2Occur = c(0,1), YearOfOccurEvent2 = c(0,68.5))
I need to transform that dataframe so that I have a separate row for each time period (year/quarter) for each person, with variables for whether the event had already occurred during that time period. If the event occurred during an earlier time, it is presumed to still be occurring at later times. E.g., if the person got married in the first quarter of 1968, they are presumed to still be married at all later time periods. I need those time periods marked (0/1). For example:
y <- data.frame( id = c( rep(1,5), rep(2,5)), Year = c(68.0,68.25,68.50,68.75,69.0))
y$Event1 <- c(0,1,1,1,1,0,0,0,0,0)
y$Event2 <- c(0,0,0,0,0,0,0,1,1,1)
Can someone get me started? Thanks Jeff
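A base-R starting point, reusing Jeff's example data: expand.grid() builds the person-by-quarter skeleton, and a vectorized comparison fills in the 0/1 indicators.

```r
# events recorded as ever-occurred flags plus year/quarter of occurrence
x <- data.frame(id = c(1, 2),
                Event1Occur = c(1, 0), YearOfOccurEvent1 = c(68.25, 0),
                Event2Occur = c(0, 1), YearOfOccurEvent2 = c(0, 68.5))

quarters <- seq(68, 69, by = 0.25)
y <- expand.grid(Year = quarters, id = x$id)[, c("id", "Year")]

i <- match(y$id, x$id)  # row of x corresponding to each panel row
y$Event1 <- as.integer(x$Event1Occur[i] == 1 & y$Year >= x$YearOfOccurEvent1[i])
y$Event2 <- as.integer(x$Event2Occur[i] == 1 & y$Year >= x$YearOfOccurEvent2[i])
y  # the indicator stays 1 from the quarter of occurrence onward
```

The >= comparison encodes the "once occurred, still occurring" rule; the Occur flag guards against treating the 0 placeholder as a real date.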
Re: [R] contingency tables in R
Your first example creates a four-way table (sex, familyhist, numrisk, hypertension) and your second example adds four more risk factors. From the second example, you can use margin.table() to sum across any set of dimensions and ftable() to present the results of different slices. You should be able to accomplish what you want without making separate data.frames. -- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Jin Choi Sent: Sunday, July 22, 2012 11:32 AM To: r-help@r-project.org Subject: [R] contingency tables in R Hello! I am interested in creating contingency tables, namely one that would let me find the frequency and proportion of patients with specific risk factors (dyslipidemia, diabetes, obesity, smoking, hypertension). There are 3 dimensions I would like to divide the population into: sex, family history, and number of risk factors. In R, I used the following code:
mytable <- xtabs(~ sex + familyhist + numrisk + hypertension, data = mydata)
ftable(mytable)
a <- ftable(mytable)
prop.table(a, 1)
However, when I run the following code:
mytable <- xtabs(~ sex + familyhist + numrisk + hypertension + diabetes + obesity + smoking + dyslipidemia, data = mydata)
the table simply treats the additional risk factors as new dimensions, which I do not want. I would like to find a way where the dimensions are sex, family history, and number of risk factors, and I am finding the frequency and prevalence of each risk factor (dyslipidemia, diabetes, obesity, smoking, hypertension) in each of these subgroups. The only way to get around this problem I could think of is to create new data frames for each number-of-risk-factors subgroup: numrisk1, numrisk2, numrisk3... where numrisk1 indicates the population with 1 risk factor. Then I could calculate the prevalence of each risk factor separately. 
This approach would take a very long time, so I was hoping to ask if anyone knew of a solution to this issue I am having with contingency tables... perhaps a useful R package? Thank you for your help! Jin Choi Masters Student (Epidemiology) McGill University
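David's margin.table()/ftable() advice, sketched on made-up data (the variable names follow Jin's code; the five-way table is collapsed back to the four dimensions of interest):

```r
set.seed(7)
n <- 200
mydata <- data.frame(
  sex          = sample(c("F", "M"), n, replace = TRUE),
  familyhist   = sample(0:1, n, replace = TRUE),
  numrisk      = sample(1:3, n, replace = TRUE),
  hypertension = sample(0:1, n, replace = TRUE),
  diabetes     = sample(0:1, n, replace = TRUE)
)

tab <- xtabs(~ sex + familyhist + numrisk + hypertension + diabetes,
             data = mydata)

# sum over diabetes (dimension 5) to study hypertension by subgroup
hyp <- margin.table(tab, 1:4)
ftable(prop.table(hyp, margin = 1:3))  # hypertension prevalence per subgroup
```

Repeating the margin.table() call with a different fourth dimension gives the prevalence of each risk factor in turn, with no separate data frames needed.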
Re: [R] Creating panel data
Sounds like you would find it worthwhile to read a good Intro R tutorial -- like the one that comes shipped with R. Have you done so? If not, why not? If so, how about the data import/export manual? I certainly wouldn't guarantee that these will answer all your questions. They're just places to start BEFORE posting here. Setting up proper data structures can be tricky (have you considered what form the functions/packages with which you are going to analyze the data want?). You might also find it useful to use Hadley Wickham's plyr and/or reshape2 packages, whose aim is to standardize and simplify data manipulation tasks. Vignettes/tutorials are available for both. Cheers, Bert On Mon, Jul 23, 2012 at 8:21 AM, Jeff r...@jp.pair.com wrote: I'm new to R. I am trying to create panel data with a slight alteration from a typical dataset. At present, I have data on a few hundred people with the dates of occurrence for several events (like marriage and employment). The dates are in year/quarter format, so 68.0 equals the 1st quarter of 1968 and 68.25 equals the 2nd quarter of 1968. If the event never occurred, 0 is recorded for the Year Of Occurrence. Somewhat redundantly, I also have separate dichotomous variables indicating whether the event ever occurred (0/1 format). For example: x <- data.frame( id = c(1,2), Event1Occur = c(1,0), YearOfOccurEvent1 = c(68.25,0), Event2Occur = c(0,1), YearOfOccurEvent2 = c(0,68.5)) I need to transform that dataframe so that I have a separate row for each time period (year/quarter) for each person, with variables for whether the event had already occurred during that time period. If the event occurred during an earlier time, it is presumed to still be occurring at later times. E.g., if the person got married in the first quarter of 1968, they are presumed to still be married at all later time periods. I need those time periods marked (0/1). 
For example:

y <- data.frame(id = c(rep(1, 5), rep(2, 5)),
                Year = c(68.0, 68.25, 68.50, 68.75, 69.0))
y$Event1 <- c(0,1,1,1,1, 0,0,0,0,0)
y$Event2 <- c(0,0,0,0,0, 0,0,1,1,1)

Can someone get me started? Thanks, Jeff __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
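A minimal base-R sketch of the expansion Jeff asks for (hedged: the quarter grid, the expand-then-flag approach, and the use of merge() are my assumptions, not code from the thread; column names follow his example):

```r
# Expand each person to one row per quarter, then flag whether each
# event had occurred by that quarter
x <- data.frame(id = c(1, 2),
                Event1Occur = c(1, 0), YearOfOccurEvent1 = c(68.25, 0),
                Event2Occur = c(0, 1), YearOfOccurEvent2 = c(0, 68.5))
quarters <- seq(68.0, 69.0, by = 0.25)
y <- merge(data.frame(id = rep(x$id, each = length(quarters)),
                      Year = quarters), x, by = "id")
y <- y[order(y$id, y$Year), ]
# an event counts from its quarter onward; never-occurred rows stay 0
y$Event1 <- as.integer(y$Event1Occur == 1 & y$Year >= y$YearOfOccurEvent1)
y$Event2 <- as.integer(y$Event2Occur == 1 & y$Year >= y$YearOfOccurEvent2)
y[, c("id", "Year", "Event1", "Event2")]
```

This reproduces the y data frame shown above; more event/year column pairs could be handled with a loop over their names.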
Re: [R] EM for missing data
This is chopped off. What happens next is important. EM can be used to compute covariance matrices and conduct the statistical analysis without imputing any missing values, or it can be used to impute missing values to create one or multiple data sets. You might find Missing Data by Paul D. Allison to be useful. -- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ya Sent: Sunday, July 22, 2012 5:11 AM To: r-help Subject: Re: [R] EM for missing data hi Greg, David, and Tal, Thank you very much for the information. I found this in the SPSS 17.0 missing value manual: EM Method: This method assumes a distribution for the partially missing data and bases inferences on the likelihood under that distribution. Each iteration consists of an E step and an M step. The E step finds the conditional expectation of the b
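To make the E/M alternation in the quoted description concrete, here is a toy iteration for one variable with missing values, filled in by regression on a covariate. This is only an illustrative sketch of the idea (the data, model, starting values, and iteration count are all invented; it is not the SPSS algorithm):

```r
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50)
y[1:10] <- NA                        # make some values missing
miss <- is.na(y)
yfill <- ifelse(miss, mean(y, na.rm = TRUE), y)  # crude starting fill-in
for (i in 1:20) {
  fit <- lm(yfill ~ x)               # M step: re-estimate with current fill-ins
  yfill[miss] <- predict(fit)[miss]  # E step: expected y for the missing cases
}
c(mean = mean(yfill), sd = sd(yfill))
```

After convergence the missing cases sit on the fitted regression line, which is the "inference without imputed data sets" flavour David contrasts with multiple imputation.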
Re: [R] Excel file Import - to Matrix format
Use read.table() with the row.names= argument set to the column number that contains the row names (presumably 1) and then convert the data.frame to a matrix:

A <- read.table(file="fname.csv", row.names=1)
B <- as.matrix(A)

If you are using Windows, you can open Excel, select the data, Copy, and then change "fname.csv" to "clipboard-128" and R will read the data from the Windows clipboard. -- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Saturday, July 21, 2012 8:16 PM To: biostat1 Cc: R help Subject: Re: [R] Excel file Import - to Matrix format Hello, Try this: You can use read.table with sep="," if it is comma separated to read the file.

test1 <- read.table(text="
0.141 0.242 0.342
0.224 0.342 0.334
0.652 0.682 0.182
", sep="", header=FALSE)  # read data
test1
#     V1    V2    V3
# 1 0.141 0.242 0.342
# 2 0.224 0.342 0.334
# 3 0.652 0.682 0.182
# convert to matrix as you wanted
test2 <- as.matrix(test1)
colnames(test2) <- NULL
genelist <- c("Fkh2", "Swi5", "Sic1")
rownames(test2) <- genelist
test2
#       [,1]  [,2]  [,3]
# Fkh2 0.141 0.242 0.342
# Swi5 0.224 0.342 0.334
# Sic1 0.652 0.682 0.182

# 2nd case: as in your example,
test1 <- read.table(text="
IMAGE:152 0.141 0.242 0.342
IMAGE:262 0.224 0.342 0.334
IMAGE:342 0.652 0.682 0.182
", sep="", header=FALSE)
test2 <- as.matrix(test1[-1])
colnames(test2) <- NULL
genelist <- c("Fkh2", "Swi5", "Sic1")
rownames(test2) <- genelist
test2
#       [,1]  [,2]  [,3]
# Fkh2 0.141 0.242 0.342
# Swi5 0.224 0.342 0.334
# Sic1 0.652 0.682 0.182

A.K. - Original Message - From: biostat1 sridi...@gmail.com To: r-help@r-project.org Cc: Sent: Saturday, July 21, 2012 7:29 PM Subject: [R] Excel file Import - to Matrix format Hi, New to R. Need a bit of help. Thanks. I am trying to import data from an Excel file. The package I am trying to use requires data in Matrix format.
Excel -> R matrix, with the constraint that the very first column in the Excel file should become the names for the rows of the matrix. Example: the data has 1000 rows and 11 columns. The 1st column is the names of genes; 10 columns of numerical data. I need to read this into R: it should be a 1000 (row) x 10 (column) matrix, with the 1st column of the Excel file becoming the names of the rows. I am experimenting with reading as a data frame (as I am unable to find any info on reading into a matrix), splitting the data frame, etc. Thanks, truly appreciate your help. What I need: http://r.789695.n4.nabble.com/file/n4637332/thisWorks2.png http://r.789695.n4.nabble.com/file/n4637332/doesNotWork1.png -- View this message in context: http://r.789695.n4.nabble.com/Excel-file-Import-to-Matrix-format-tp4637332.html Sent from the R help mailing list archive at Nabble.com.
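David's read.table()/as.matrix() recipe can be verified end to end on a throwaway file; the file contents below are invented to mirror arun's three-gene example:

```r
# Write a small CSV whose first column holds the gene names, then read
# it back with row.names = 1 and coerce the data.frame to a matrix
tmp <- tempfile(fileext = ".csv")
writeLines(c("Fkh2,0.141,0.242,0.342",
             "Swi5,0.224,0.342,0.334",
             "Sic1,0.652,0.682,0.182"), tmp)
A <- read.table(tmp, sep = ",", row.names = 1)
B <- as.matrix(A)
B   # 3 x 3 numeric matrix with Fkh2/Swi5/Sic1 as row names
```

For a real 1000 x 10 sheet, exporting from Excel as CSV and pointing read.table() at that file is the same operation at scale.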
[R] extracting and combining statistics like BIC, Rsquare
Dear all, I'd like to ask whether there is a way to combine R-squared, BIC, and AIC values after making imputations in R. There are five data sets I imputed with mice, and when I create a new variable and apply an ologit model, I can extract the beta coefficients and their standard errors, but I don't know how to extract the statistics above, because they cannot be pooled with a variance or covariance function. Do you know alternative ways? Bests, Niklas [[alternative HTML version deleted]]
Re: [R] Speeding up a loop
Hello, I think this is a boundary issue. In your original post you said "less than", not "less than or equal to". Try using >= and <= to see what happens; I bet it solves it. Rui Barradas Em 23-07-2012 14:43, wwreith escreveu: 1.15 60 0.553555415 0.574892872 1.15 60 0.563183983 0.564029359 Shouldn't the function rule out the second one, since it is higher in position 3 and lower in position 4, i.e. it should not all be yes? -- View this message in context: http://r.789695.n4.nabble.com/Speeding-up-a-loop-tp4637201p4637438.html Sent from the R help mailing list archive at Nabble.com.
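A two-line check of the boundary point Rui raises (the numbers are made up to sit exactly on a boundary):

```r
x <- c(0.563, 0.564)
x <  0.564   # strict comparison drops the boundary value itself
x <= 0.564   # inclusive comparison keeps it
```

Whether a row "on the line" should count as dominated is exactly the difference between the two operators.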
Re: [R] Creating panel data
At 10:38 AM 7/23/2012, you wrote: You might also find it useful to use Hadley Wickham's plyr and/or reshape2 packages, whose aim is to standardize and simplify data manipulation tasks. Cheers, Bert I have already used R enough to have correctly imported the actual data. After import, it is in the approximate format of the x dataframe I previously posted. I already found the plyr and reshape2 packages and had assumed that the cast (or dcast) options might be the correct ones. melt() seemed to get me only what I already have. The examples I have seen thus far start with data in various formats and end up in the format that I am starting with. In other words, they seem to do the exact opposite of what I'm trying to do. So I'm still stuck with how to get started and whether the functions in reshape2 are actually the correct ones to consider. ...still looking for some help on this. Jeff
[R] drop1, 2-way Unbalanced ANOVA
Hi all, I've spent quite a lot of time searching through the help lists and reading about how best to perform a 2-way ANOVA with unbalanced data. I realize this has been covered a great deal, so I was trying to avoid adding yet another entry to the long list considering the use of different SS, etc. Unfortunately, I have come to the point where I feel I have to wade in and see if someone can help me out. Hopefully I'll phrase this properly and it will end up only requiring a simple response. I have an experiment where I have measured a response variable (such as water content) following exposure to two treatments (oxygen content and medium). Oxygen content has three levels (5, 20, 35) and medium has two levels (Air, Water). I am interested in whether water content is different under the two treatments and whether the effect of oxygen content depends upon the medium in which the experiment was conducted (Air or Water). Unfortunately, the design is unbalanced as some experimental subjects had to be removed from the experiment. I realize that if I just use aov() to perform a two-way ANOVA, the order in which the terms (oxygen content and medium) are entered will give different results because of the sequential SS. What I have done in the past is utilize drop1() in conjunction with aov(), drop1(aov(WaterContent ~ Oxygen * Medium, data), test = "F"), to see if the interaction term was significant (F, p-value) and if its inclusion improved model fit (AIC). If from this I determine that the interaction term can be removed, the model can be rerun without it, and I am able to test for main effects and get F and p-values that I can report in a manuscript. However, if the interaction term is significant and its inclusion is warranted, drop1() only provides me with the SS, F, and p-value for the interaction term.
Now this is fine, because I do not wish to interpret the main effects with a significant interaction, but in a manuscript reviewers will request an ANOVA table where I will be asked to report SS, F, and p-values for the other terms. I don't have those because I used drop1(), which only provides these for the highest-order term in the model. How best should I calculate the values that I know I will be asked to provide in a manuscript? I don't wish to come across as a scientist who is simply a slave to the F and p-values with little regard for the data, the hypotheses, and the actual statistical interpretation. I am interested in doing this right, but I also know that, practically, in the current status of our field, while I focus on doing statistics that address my hypotheses of interest and can choose not to discuss the main effects in isolation when an interaction exists, I will be asked to provide the ANOVA table with all the degrees of freedom, SS, F-values, p-values, etc. for the entire model, not just the highest-order term. Can anyone provide advice here? Should I just use the car package and Type III SS with an appropriate contrast and not use the drop1() function, even though I'm really not interested in using the Type III SS and I kinda like drop1()? I am not opposed to Type II SS, but clearly if the interaction is important then Type II SS, which do not consider interactions, are not appropriate. Hopefully this is somewhat clear and doesn't simply sound like a rehashing of the same old ANOVA and SS story. Maybe I should be doing something completely different. I greatly appreciate constructive comments. Thanks, Nate
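A self-contained sketch of the drop1()/aov() workflow Nate describes, on invented unbalanced data (the cell sizes and the pure-noise response are my assumptions; only the mechanics are illustrated, not any real result):

```r
set.seed(1)
d <- expand.grid(Oxygen = factor(c(5, 20, 35)),
                 Medium = factor(c("Air", "Water")))
d <- d[rep(1:6, times = c(4, 5, 3, 4, 4, 5)), ]  # unbalanced cell sizes
d$WaterContent <- rnorm(nrow(d))
fit <- aov(WaterContent ~ Oxygen * Medium, data = d)
drop1(fit, test = "F")   # marginal F test for the interaction only
summary(fit)             # sequential (Type I) table for comparison
```

As Nate notes, the drop1() table here carries F and p only for the Oxygen:Medium row, while the sequential table reports every term but depends on entry order.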
Re: [R] Creating panel data
This looks really ugly but it 'may' do what you want. I was too lazy to generate enough raw data to check. Note I changed the names in x as they were a bit clumsy.

x <- data.frame(id = c(1, 2),
                Event1 = c(1, 0), YEvent1 = c(68.25, 0),
                Event2 = c(0, 1), YEvent2 = c(0, 68.5))
y <- data.frame(id = c(rep(1, 5), rep(2, 5)),
                Year = c(68.0, 68.25, 68.50, 68.75, 69.0))
y$Event1 <- c(0,1,1,1,1, 0,0,0,0,0)
y$Event2 <- c(0,0,0,0,0, 0,0,1,1,1)

library(reshape2)  # for melt()
dd <- melt(x, id = c("id", "Event1", "Event2"), value.name = "year.quarter")
dd1 <- subset(dd, dd[, 5] != 0)
dd1 <- dd1[, c(1, 2, 3, 5)]

John Kane Kingston ON Canada -Original Message- From: r...@jp.pair.com Sent: Mon, 23 Jul 2012 11:33:37 -0500 To: gunter.ber...@gene.com Subject: Re: [R] Creating panel data At 10:38 AM 7/23/2012, you wrote: You might also find it useful to use Hadley Wickham's plyr and/or reshape2 packages, whose aim is to standardize and simplify data manipulation tasks. Cheers, Bert I have already used R enough to have correctly imported the actual data. After import, it is in the approximate format of the x dataframe I previously posted. I already found the plyr and reshape2 packages and had assumed that the cast (or dcast) options might be the correct ones. Melt seemed to get me only what I already have. The examples I have seen thus far start with data in various formats and end up in the format that I am starting with. In other words, they seem to do the exact opposite of what I'm trying to do. So I'm still stuck with how to get started and whether the functions in reshape2 are actually the correct ones to consider. ...still looking for some help on this. Jeff
Re: [R] Hire person to convert R code to SAS
It would make much more sense to convert in the opposite direction. Frank willsurg wrote: I am looking to hire someone to convert a small bit of R code into SAS code. The R code is about 2 Word pages long and uses Snell's law to convert Likert scales. If you are willing to look at this, or could point me to someone who would, it would be very much appreciated. Thanks in advance! -will - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Hire-person-to-convert-R-code-to-SAS-tp4637436p4637473.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] image function
Did you try this? image(x, xlim = c(0, 100), ylim = c(0, 100)) Jean li li hannah@gmail.com wrote on 07/22/2012 09:28:33 PM: Dear all, I have a question regarding changing the xlim and ylim in the function image(). For example, in the following code, how can I have a heatmap with xlim = ylim = c(0, 100) instead of (0, 1)? Thank you very much. x <- matrix(rnorm(10000, 0, 1), 100, 100) image(x) Hannah
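If passing xlim/ylim alone does not give the desired axes, another option (my suggestion, not from Jean's reply) is to hand image() explicit coordinate vectors, so the heatmap itself is drawn on a 0-100 scale:

```r
z <- matrix(rnorm(10000, 0, 1), 100, 100)
image(x = seq(0.5, 99.5, length.out = 100),   # cell-centre coordinates
      y = seq(0.5, 99.5, length.out = 100),
      z = z, xlim = c(0, 100), ylim = c(0, 100))
```

With x and y supplied, the axis labels come from the coordinate vectors rather than image()'s default 0-1 grid.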
Re: [R] Hire person to convert R code to SAS
On Mon, Jul 23, 2012 at 6:19 PM, Frank Harrell f.harr...@vanderbilt.edu wrote: It would make much more sense to convert in the opposite direction. Frank I got the fear when I saw that the unit of R code length was '[presumably MS] Word pages'. We had one student who wrote all her R code in MS Word (and coloured bits of it nicely sometimes) but was then surprised that it didn't cut'n'paste into R properly. And then I tried to run one from the command line... willsurg wrote: I am looking to hire someone to convert a small bit of R code into SAS code. The R code is about 2 Word pages long and uses Snell's law to convert Likert scales. If you are willing to look at this, or could point me to someone who would, it would be very much appreciated. Thanks in advance! -will - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Hire-person-to-convert-R-code-to-SAS-tp4637436p4637473.html Sent from the R help mailing list archive at Nabble.com. -- blog: http://geospaced.blogspot.com/ web: http://www.maths.lancs.ac.uk/~rowlings web: http://www.rowlingson.com/ twitter: http://twitter.com/geospacedman pics: http://www.flickr.com/photos/spacedman
Re: [R] Unique Values per Column
Bert, Is it important that you end up with a data frame? If not, it would be very easy to generate a list with the unique values for each column. For example:

df <- data.frame(v1 = sample(5, 20, TRUE), v2 = sample(7, 20, TRUE),
                 v3 = sample(9, 20, TRUE), v4 = sample(11, 20, TRUE))
lapply(df, unique)

Jean Bert Jacobs bert.jac...@figurestofacts.be wrote on 07/20/2012 02:37:37 PM: Hi, I was wondering what the best way is to create a new dataframe based on an existing dataframe with only the unique available levels for each column (22 columns in total) in it. If some columns have fewer unique values than others, then those columns can be filled with blanks for the remaining part. Is it possible to make it a generic function, which is independent of the column names? Thx for helping me out. Kind regards, Bert
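Since the original question asked for a data frame with blanks filling the shorter columns, the lapply() result can be padded out; the choice of "" as filler (which forces the columns to character) is my assumption:

```r
df <- data.frame(v1 = sample(5, 20, TRUE), v2 = sample(7, 20, TRUE),
                 v3 = sample(9, 20, TRUE), v4 = sample(11, 20, TRUE))
uniq <- lapply(df, unique)
n <- max(lengths(uniq))                  # rows needed for the widest column
padded <- data.frame(lapply(uniq, function(u)
  c(as.character(u), rep("", n - length(u)))))
padded
```

Because it iterates over whatever columns df has, this is generic in the sense Bert asked for: it does not depend on the column names or their number.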
Re: [R] Hire person to convert R code to SAS
Why not just call the R code from SAS, since SAS is supposed to have an interface for using R within SAS? On Mon, Jul 23, 2012 at 1:26 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Mon, Jul 23, 2012 at 6:19 PM, Frank Harrell f.harr...@vanderbilt.edu wrote: It would make much more sense to convert in the opposite direction. Frank I got the fear when I saw the unit of R code length was '[presumably MS] Word pages'. We had one student who wrote all her R code in MS Word (and coloured bits of it nicely sometimes) but was then surprised that it didn't cut'n'paste into R properly. And then I tried to run one from the command line... willsurg wrote: I am looking to hire someone to convert a small bit of R code into SAS code. The R code is about 2 Word pages long and uses Snell's law to convert Likert scales. If you are willing to look at this, or could point me to someone who would, it would be very much appreciated. Thanks in advance! -will - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Hire-person-to-convert-R-code-to-SAS-tp4637436p4637473.html Sent from the R help mailing list archive at Nabble.com. -- blog: http://geospaced.blogspot.com/ web: http://www.maths.lancs.ac.uk/~rowlings web: http://www.rowlingson.com/ twitter: http://twitter.com/geospacedman pics: http://www.flickr.com/photos/spacedman -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Re: [R] drop1, 2-way Unbalanced ANOVA
Dear Nate, I don't want to repeat everything that I previously said on this subject, both on this email list and more generally, but briefly: The reason to do so-called type-II tests is that they're maximally powerful for the main effects if the interactions are nil, which is precisely the circumstance in which main effects are generally of interest. The argument for doing so-called type-III tests, properly formulated of course, is that they are valid (if not maximally powerful) when the interactions are nil and also test hypotheses that can be constructed to be about main effects (i.e., the effect of each factor averaged over the levels of the other) when interactions are not nil. I personally prefer the first approach. If you use drop1(), removing the interaction after testing it, you'll get something similar to a type-II test, but pooling the interaction into the estimate of error variance. I hope this helps, John On Mon, 23 Jul 2012 09:58:50 -0700 Nathan Miller natemille...@gmail.com wrote: Hi all, I've spent quite a lot of time searching through the help lists and reading about how best to perform a 2-way ANOVA with unbalanced data. I realize this has been covered a great deal, so I was trying to avoid adding yet another entry to the long list considering the use of different SS, etc. Unfortunately, I have come to the point where I feel I have to wade in and see if someone can help me out. Hopefully I'll phrase this properly and it will end up only requiring a simple response. I have an experiment where I have measured a response variable (such as water content) following exposure to two treatments (oxygen content and medium). Oxygen content has three levels (5, 20, 35) and medium has two levels (Air, Water). I am interested in whether water content is different under the two treatments and whether the effect of oxygen content depends upon the medium in which the experiment was conducted (Air or Water).
Unfortunately, the design is unbalanced as some experimental subjects had to be removed from the experiment. I realize that if I just use aov() to perform a two-way ANOVA, the order in which the terms (oxygen content and medium) are entered will give different results because of the sequential SS. What I have done in the past is utilize drop1() in conjunction with aov(), drop1(aov(WaterContent ~ Oxygen * Medium, data), test = "F"), to see if the interaction term was significant (F, p-value) and if its inclusion improved model fit (AIC). If from this I determine that the interaction term can be removed, the model can be rerun without it, and I am able to test for main effects and get F and p-values that I can report in a manuscript. However, if the interaction term is significant and its inclusion is warranted, drop1() only provides me with the SS, F, and p-value for the interaction term. Now this is fine, because I do not wish to interpret the main effects with a significant interaction, but in a manuscript reviewers will request an ANOVA table where I will be asked to report SS, F, and p-values for the other terms. I don't have those because I used drop1(), which only provides these for the highest-order term in the model. How best should I calculate the values that I know I will be asked to provide in a manuscript? I don't wish to come across as a scientist who is simply a slave to the F and p-values with little regard for the data, the hypotheses, and the actual statistical interpretation. I am interested in doing this right, but I also know that, practically, in the current status of our field, while I focus on doing statistics that address my hypotheses of interest and can choose not to discuss the main effects in isolation when an interaction exists, I will be asked to provide the ANOVA table with all the degrees of freedom, SS, F-values, p-values, etc. for the entire model, not just the highest-order term. Can anyone provide advice here?
Should I just use the car package and Type III SS with an appropriate contrast and not use the drop1() function, even though I'm really not interested in using the Type III SS and I kinda like drop1()? I am not opposed to Type II SS, but clearly if the interaction is important then Type II SS, which do not consider interactions, are not appropriate. Hopefully this is somewhat clear and doesn't simply sound like a rehashing of the same old ANOVA and SS story. Maybe I should be doing something completely different. I greatly appreciate constructive comments. Thanks, Nate
Re: [R] Help with Portfolio Optmization
On Mon, Jul 23, 2012 at 6:24 AM, MaheshT mahesh.m...@gmail.com wrote: Hi, I need some help with a portfolio optimization problem. I am trying to find the minimum variance portfolio subject to constraints on the weights like x1 <= w1 <= x2 and x3 <= w2 <= x4. I need help with solving for the minimum variance portfolio, as solve.QP doesn't allow me to specify the lower boundaries. Well, no: you have to be a little smarter about it though: try to walk through the example in ?solve.QP again. The trick is to split the conditions and then note that

a <= x <= b  <=>  (a <= x) and (x <= b)  <=>  (-x <= -a) and (x <= b)

both of which are now of the form constant * x <= bound, so you can wrap them up in a single matrix:

c(-1, 1) * x <= c(-a, b)

Incidentally, this has already been solved for you here (inter alia): https://systematicinvestor.wordpress.com/2011/12/13/backtesting-minimum-variance-portfolios/ For this sort of problem, you might also get better help on the R-SIG-Finance list. Cheers, Michael Thanks Mahesh -- View this message in context: http://r.789695.n4.nabble.com/Help-with-Portfolio-Optmization-tp4637425.html Sent from the R help mailing list archive at Nabble.com.
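Michael's sign-flip can be sanity-checked numerically with toy bounds (the numbers are invented, not Mahesh's actual constraints):

```r
a <- 0.1; b <- 0.6                  # lower / upper bound on one weight
all(c(-1, 1) * 0.25 <= c(-a, b))    # weight inside [a, b]: TRUE
all(c(-1, 1) * 0.05 <= c(-a, b))    # weight below a: FALSE, since -0.05 > -0.1
```

When building Amat/bvec for solve.QP, note that its help page states the inequality constraints in >= form, so each of these rows gets its sign flipped once more before it goes into the matrix.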
Re: [R] drop1, 2-way Unbalanced ANOVA
On Jul 23, 2012, at 18:58 , Nathan Miller wrote: Hi all, I've spent quite a lot of time searching through the help lists and reading about how best to perform a 2-way ANOVA with unbalanced data. I realize this has been covered a great deal, so I was trying to avoid adding yet another entry to the long list considering the use of different SS, etc. Unfortunately, I have come to the point where I feel I have to wade in and see if someone can help me out. Hopefully I'll phrase this properly and it will end up only requiring a simple response. I have an experiment where I have measured a response variable (such as water content) following exposure to two treatments (oxygen content and medium). Oxygen content has three levels (5, 20, 35) and medium has two levels (Air, Water). I am interested in whether water content is different under the two treatments and whether the effect of oxygen content depends upon the medium in which the experiment was conducted (Air or Water). Unfortunately, the design is unbalanced as some experimental subjects had to be removed from the experiment. I realize that if I just use aov() to perform a two-way ANOVA, the order in which the terms (oxygen content and medium) are entered will give different results because of the sequential SS. What I have done in the past is utilize drop1() in conjunction with aov(), drop1(aov(WaterContent ~ Oxygen * Medium, data), test = "F"), to see if the interaction term was significant (F, p-value) and if its inclusion improved model fit (AIC). If from this I determine that the interaction term can be removed, the model can be rerun without it, and I am able to test for main effects and get F and p-values that I can report in a manuscript. However, if the interaction term is significant and its inclusion is warranted, drop1() only provides me with the SS, F, and p-value for the interaction term.
Now this is fine, because I do not wish to interpret the main effects with a significant interaction, but in a manuscript reviewers will request an ANOVA table where I will be asked to report SS, F, and p-values for the other terms. I don't have those because I used drop1(), which only provides these for the highest-order term in the model. How best should I calculate the values that I know I will be asked to provide in a manuscript? I don't wish to come across as a scientist who is simply a slave to the F and p-values with little regard for the data, the hypotheses, and the actual statistical interpretation. I am interested in doing this right, but I also know that, practically, in the current status of our field, while I focus on doing statistics that address my hypotheses of interest and can choose not to discuss the main effects in isolation when an interaction exists, I will be asked to provide the ANOVA table with all the degrees of freedom, SS, F-values, p-values, etc. for the entire model, not just the highest-order term. Can anyone provide advice here? Should I just use the car package and Type III SS with an appropriate contrast and not use the drop1() function, even though I'm really not interested in using the Type III SS and I kinda like drop1()? I am not opposed to Type II SS, but clearly if the interaction is important then Type II SS, which do not consider interactions, are not appropriate. Hopefully this is somewhat clear and doesn't simply sound like a rehashing of the same old ANOVA and SS story. Maybe I should be doing something completely different. I greatly appreciate constructive comments. (1) Do yourself a favor and restrict aov() to balanced analyses. It probably won't do anything badly wrong if there is only one error term, but it offers little above plain lm() in those cases. (2) Give the reviewers what they want, unless clearly ridiculous. In this case it probably isn't.
(For one thing, some may want to disregard a weakly significant interaction if one or both tests of main effects are not significant. Also, stating the df etc. gives some indication that the author knows what he is doing and guards against silliness like using category codes as numerical.) I'd avoid Type-III SS since they have fooled me so often. If you can get away with it, see if they'll accept the _two_ Type-I tables corresponding to

anova(lm(WaterContent ~ Oxygen * Medium, data), test = "F")
anova(lm(WaterContent ~ Medium * Oxygen, data), test = "F")

If that doesn't work, you can consider selecting the Type-I table that is most relevant, e.g., if everybody knows that there is a Medium effect, then ensure that it enters the model first. Alternatively, you can give Type II SS for the main effects, Medium | Oxygen and Oxygen | Medium. In both cases, it would be appropriate to insert a note that the table does not tell the whole story because of unbalancedness. However, first and foremost: If there is an interaction, you need to spend some time explaining what its nature
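On invented unbalanced data, the two Type-I tables Peter suggests can be produced side by side; the data are random noise, so only the mechanics (each order adjusting one main effect for the other) are being illustrated:

```r
set.seed(2)
d <- expand.grid(Oxygen = factor(c(5, 20, 35)),
                 Medium = factor(c("Air", "Water")))
d <- d[rep(1:6, times = c(4, 6, 3, 5, 4, 5)), ]   # unbalanced cells
d$WaterContent <- rnorm(nrow(d))
a1 <- anova(lm(WaterContent ~ Oxygen * Medium, data = d))  # Medium | Oxygen
a2 <- anova(lm(WaterContent ~ Medium * Oxygen, data = d))  # Oxygen | Medium
a1
a2
```

With balanced data the two tables would agree term by term; with unbalanced data the main-effect SS generally differ between them, which is the point of reporting both.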
[R] looking to hire person to convert R to SAS
Hi, I am looking to hire someone to convert a small bit of R code into SAS code. The R code is about 2 Word pages long and uses Snell's law to convert Likert scales. If you are willing to look at this, or could point me to someone who would, it would be very much appreciated. Thanks in advance! -will __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum number of patterns and speed in grep
Hi, I have a minor follow-up question: In the example below, "ann" and "nn" in the third element of text are matched. I would like to ignore all matches in which the character following the match is one of [:alpha:]. How do I do this without removing the ignore.case = TRUE argument of the strapply function? So the output should be: [[1]] [1] "Santa Fe Gold Corp" [[2]] [1] "Starpharma Holdings" [[3]] NULL Rather than: [[1]] [1] "Santa Fe Gold Corp" [[2]] [1] "Starpharma Holdings" [[3]] [1] "ann" "nn" Thanks! require(gsubfn) # read in data data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header = TRUE, sep = ",") # define the object to be searched text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings", "the annual earnings exceed those of last year") k <- 3000 # chunk size f <- function(from, text) { to <- min(from + k - 1, nrow(data)); r <- paste(data[seq(from, to), 1], collapse = "|"); r <- gsub("[().*?+{}]", "", r); strapply(text, r, ignore.case = TRUE) } ix <- seq(1, nrow(data), k) out <- lapply(text, function(text) unlist(lapply(ix, f, text)))
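One way to express "not followed by a letter" is a negative lookahead, which needs a perl-compatible engine. I am not certain how strapply exposes perl regexes, so the sketch below uses base R's gregexpr/regmatches on a tiny stand-in pattern list (the four patterns are made up to mimic the poster's company-name file):

```r
# Alternation of the patterns, followed by a negative lookahead: a match
# only counts if the NEXT character is not a letter (perl = TRUE enables (?!...))
pats <- c("Santa Fe Gold Corp", "Starpharma Holdings", "ann", "nn")
r <- paste0("(", paste(pats, collapse = "|"), ")(?![[:alpha:]])")

text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings",
          "the annual earnings exceed those of last year")

out <- regmatches(text, gregexpr(r, text, perl = TRUE, ignore.case = TRUE))
out
# "ann" and "nn" inside "annual" are each followed by a letter,
# so the third element comes back empty
```

If strapply accepts perl patterns in your gsubfn version, the same `(?![[:alpha:]])` suffix should work there too; otherwise filtering with base R as above is a workable fallback.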
Re: [R] FIML using lavaan returns zeroes for coefficients
Hi Miles and Yves, I agree with Miles to use lavaan to do regression so that FIML can be used. This is one of the amazing things that lavaan can do. If lavaan can handle missing values for the univariate regression using FIML, I don't see why you want to avoid it. ya From: Andrew Miles Date: 2012-07-23 16:07 To: yrosseel CC: r-help Subject: Re: [R] FIML using lavaan returns zeroes for coefficients Thanks for the helpful explanation. As to your question, I sometimes use lavaan to fit univariate regressions simply because it can handle missing data using FIML rather than listwise deletion. Are there reasons to avoid this? BTW, thanks for the update in the development version. Andrew Miles On Jul 21, 2012, at 12:59 PM, yrosseel wrote: On 07/20/2012 10:35 PM, Andrew Miles wrote: Hello! I am trying to reproduce (for a publication) analyses that I ran several months ago using lavaan, I'm not sure which version, probably 0.4-12. A sample model is given below: pathmod <- 'mh30days.log.w2 ~ mh30days.log + joingroup + leavegroup + alwaysgroup + grp.partic.w2 + black + age + bivoc + moved.conf + local.noretired + retired + ds + ministrytime + hrswork + nomoralescore.c + negint.c + cong.conflict.c + nomoraleXjoin + nomoraleXleave + nomoraleXalways + negintXjoin + negintXleave + negintXalways + conflictXjoin + conflictXleave + conflictXalways' mod1 <- sem(pathmod, data = sampledat, missing = "fiml", se = "robust") At the time, the model ran fine. Now, using version 0.4-14, the model returns all 0's for coefficients. What happened is that since 0.4-14, lavaan tries to 'detect' models that are just univariate regression, and internally calls lm.fit instead of the lavaan estimation engine, at least when the missing = "ml" argument is NOT used. (BTW, I fail to understand why you would use lavaan if you just want to fit a univariate regression.) When missing = "ml" is used, lavaan normally checks if you have fixed x covariates (which you do), and if fixed.x = TRUE (which is the default).
In 0.4, lavaan internally switches to fixed.x = FALSE (which implicitly assumes that all your predictors are continuous, but I assume you would not be using missing = "ml" otherwise). Unfortunately, for the 'special' case of univariate regression, it fails to do this. This behavior will likely change in 0.5, where, by default, only endogenous/dependent variables will be handled by missing = "ml", not exogenous 'x' covariates. To fix it: simply add the fixed.x = FALSE argument, or revert to 0.4-12 to get the old behavior. Hope this helps, Yves. http://lavaan.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] marginal effect lmer
Dear John, yes, the superscript is meant to say that the quadratic term of Catholicproportion is included as well (plus the interaction terms with it). In my dataset the quadratic Catholicproportion is already included as a variable of its own, so it should be fine, or do I have to use the syntax with poly... you mentioned? If so, how do I have to rewrite the command for the whole model? From your description of the package I just tried the example you mentioned for an lmer model, so I tried: plot(effect("Catholic:Catholicproportion", m1), grid = TRUE) The result is ten (don't know why 10 and not just 1) little plots, which look rather linear and not the curvilinear pattern it should be. So far I have always calculated the marginal effects in Stata, which works fine for rather simple hierarchical models. But when the models become more complicated (more random effects, more interaction terms, ...) Stata often fails to calculate the models, so I cannot use it for displaying the marginal effects. That's why I am trying to find a solution in R to calculate the marginal effects exactly the same way as I do in Stata. In Stata I followed the syntax by Brambor, Clark and Golder (https://files.nyu.edu/mrg217/public/interaction.html). Now I just wonder if there is a way to calculate the marginal effects in R exactly the same way as they do in Stata? Best, Andigo
Re: [R] Remove row.names column in dataframe
Try row.names(your.data.frame) <- NULL Salut!
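A minimal check of what that assignment does (toy data frame invented here): it does not delete the row names so much as reset them to the automatic sequence, and if the real goal is to keep the column out of an exported file, write.table has a switch for that.

```r
df <- data.frame(x = c(10, 20, 30), row.names = c("a", "b", "c"))
row.names(df) <- NULL
row.names(df)    # back to the automatic "1", "2", "3"

# To keep row names out of a written file entirely:
# write.table(df, "out.txt", row.names = FALSE)
```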
[R] extract values from summary of function indval of the package labdsv
Hi everybody, I am doing indicator species analysis using the function indval from the package labdsv. For further analysis I need the values "Number of Significant Indicators" and "Sum of Indicator Values" that are calculated in the summary of my indval object. indication3 <- indval(Veg, caver3, numitr = 4999) summary(indication3) cluster indicator_value probability X141 0.9423 0.00020004 X101 0.8993 0.00040008 ... Sum of probabilities = 22.7643528705741 Sum of Indicator Values = 57.55 Sum of Significant Indicator Values = 17.03 Number of Significant Indicators = 22 Significant Indicator Distribution 1 2 3 10 10 2 I tried to extract them using summary(indication3)$"Sum of Indicator Values", for example, but it doesn't work. Is there a way to get the values? Does someone have an idea? Thanks katse
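The printed summary is usually not the place to harvest numbers from; extract them from the fitted object itself. The indval component names sketched in the comments ($pval, $indcls) are my reading of labdsv's documentation, so verify them with str() on your own object; the runnable part below demonstrates the same technique on a base-R lm fit:

```r
# For any fitted object, str() shows which components are extractable by name
fit <- lm(mpg ~ wt, data = mtcars)
str(summary(fit), max.level = 1)
summary(fit)$r.squared          # pull one component out directly

# Sketch for the indval object (component names assumed from ?indval --
# check str(indication3) first):
# sig <- indication3$pval <= 0.05
# sum(sig)                      # Number of Significant Indicators
# sum(indication3$indcls[sig])  # Sum of Significant Indicator Values
```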
[R] VARMA
Hi all, I need to estimate the coefficients of a VARMA model with linear constraints. The packages that I found that deal with VARMA are dlm and dse. Can either of them be used in that case? I will be very grateful for any help. Anna Dudek AGH University of Science and Technology in Krakow Poland
[R] How to use color shade in Stacked bar plot?
Hi, I am working on a stacked bar plot in R, but I want to use shades of red for each stack and a corresponding legend. How can I do that? Regards
[R] Solving equations in R
Hi there, I would like to solve the following equation in R to estimate 'a'. I have amp, d, x and y. amp*y^2 = 2*a*(1-a)*(-a*d+(1-a)*x)^2 Test data: amp = 0.2370, y = 0.0233, d = 0.002, x = 0.091 Can anyone suggest how I can set this up? Thanks, Diviya
[R] help building dataset
I'm having trouble building a dataset. I'm working with Census data from Brazil, and the particular set I'm trying to get into right now is a microdata sample which has 4 data files that are saved as .txt files full of numbers. The folder also has a lot of Excel sheets and other text files describing the data, and (I'm assuming) helping to organize everything. Unfortunately there isn't much help in the description about how to construct the dataset and avoid messing things up (since it's Census data, I need to make sure I avoid associating data with the wrong city/state, etc.). I basically just need to be able to put the data in a readable format, because there's literally 1 variable in the set that I can't find anywhere else and need to get at for some analysis. However, when I've tried to get the data straight into R (copying from Notepad), it overloads R, and R stops responding. Any suggestions? Or, if there isn't enough information about the set to be helpful, what else do you need to know? Or if you'd like to take a look at the data, let me know and I can attach it. Thanks!
Re: [R] tm_map help
I encountered this error when I used readLines on a text file and some of the lines were empty. Once I removed the empty lines it worked fine, e.g. Text <- readLines("~/text.txt") extracts <- which(Text == "") Text <- Text[-extracts]
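A self-contained version of that fix, using a throw-away temp file instead of the poster's ~/text.txt:

```r
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "", "third line", ""), tmp)

txt <- readLines(tmp)
txt <- txt[nzchar(txt)]   # keep only non-empty lines; same as txt[txt != ""]
txt
```

nzchar() avoids the edge case where which() returns integer(0) and `Text[-integer(0)]` silently drops everything the wrong way around in other constructions; indexing by the logical vector is also a little faster.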
Re: [R] FIML using lavaan returns zeroes for coefficients
Hi Andrew, I do not think there is a reason to avoid it for univariate regression other than: 1) as was stated, the predictors must be continuous; 2) it will be slower (a non-issue for a handful of regressions on a few thousand cases, but for people doing thousands of regressions on millions of observations, a big concern). In the next month or so, I may have a beta version of a package primarily providing helper functions for fitting SEM models, but it could also include an lm()-ish wrapper to lavaan. It would use the traditional formula interface, but issue a warning if factor variables or variables with insufficient unique values were used (as either predictors or outcomes). If anyone would be interested in beta testing, feel free to email me. Once I have a basic package working, it will go up on github. Cheers, Josh On Mon, Jul 23, 2012 at 6:07 AM, Andrew Miles rstuff.mi...@gmail.com wrote: Thanks for the helpful explanation. As to your question, I sometimes use lavaan to fit univariate regressions simply because it can handle missing data using FIML rather than listwise deletion. Are there reasons to avoid this? BTW, thanks for the update in the development version. Andrew Miles On Jul 21, 2012, at 12:59 PM, yrosseel wrote: On 07/20/2012 10:35 PM, Andrew Miles wrote: Hello! I am trying to reproduce (for a publication) analyses that I ran several months ago using lavaan, I'm not sure which version, probably 0.4-12. A sample model is given below: pathmod <- 'mh30days.log.w2 ~ mh30days.log + joingroup + leavegroup + alwaysgroup + grp.partic.w2 + black + age + bivoc + moved.conf + local.noretired + retired + ds + ministrytime + hrswork + nomoralescore.c + negint.c + cong.conflict.c + nomoraleXjoin + nomoraleXleave + nomoraleXalways + negintXjoin + negintXleave + negintXalways + conflictXjoin + conflictXleave + conflictXalways' mod1 <- sem(pathmod, data = sampledat, missing = "fiml", se = "robust") At the time, the model ran fine.
Now, using version 0.4-14, the model returns all 0's for coefficients. What happened is that since 0.4-14, lavaan tries to 'detect' models that are just univariate regression, and internally calls lm.fit instead of the lavaan estimation engine, at least when the missing = "ml" argument is NOT used. (BTW, I fail to understand why you would use lavaan if you just want to fit a univariate regression.) When missing = "ml" is used, lavaan normally checks if you have fixed x covariates (which you do), and if fixed.x = TRUE (which is the default). In 0.4, lavaan internally switches to fixed.x = FALSE (which implicitly assumes that all your predictors are continuous, but I assume you would not be using missing = "ml" otherwise). Unfortunately, for the 'special' case of univariate regression, it fails to do this. This behavior will likely change in 0.5, where, by default, only endogenous/dependent variables will be handled by missing = "ml", not exogenous 'x' covariates. To fix it: simply add the fixed.x = FALSE argument, or revert to 0.4-12 to get the old behavior. Hope this helps, Yves. http://lavaan.org -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving equations in R
On 23/07/2012 2:46 PM, Diviya Smith wrote: Hi there, I would like to solve the following equation in R to estimate 'a'. I have the amp, d, x and y. amp*y^2 = 2*a*(1-a)*(-a*d+(1-a)*x)^2 test data: amp = 0.2370 y= 0.0233 d= 0.002 x= 0.091 Can anyone suggest how I can set this up? See ?polyroot. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
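Spelling out Duncan's ?polyroot pointer: expanded in a, the equation is a quartic, so polyroot() returns all four roots at once. The hand expansion below (with c = d + x) is mine, not from the thread, so the last line checks the residuals of the real roots against the original equation:

```r
amp <- 0.2370; y <- 0.0233; d <- 0.002; x <- 0.091
cc <- d + x    # note: -a*d + (1 - a)*x == x - a*(d + x)

# 2a(1-a)(x - a*cc)^2 - amp*y^2 = 0, coefficients in increasing powers of a
coefs <- c(-amp * y^2,                # a^0
           2 * x^2,                   # a^1
           -(4 * cc * x + 2 * x^2),   # a^2
           2 * cc^2 + 4 * cc * x,     # a^3
           -2 * cc^2)                 # a^4

roots <- polyroot(coefs)
real_roots <- Re(roots)[abs(Im(roots)) < 1e-7]

# sanity check: plug the real roots back into the original equation
f <- function(a) 2 * a * (1 - a) * (-a * d + (1 - a) * x)^2 - amp * y^2
f(real_roots)   # should all be ~0
```

For these test data two of the four roots are real (both between 0 and 1); the other two form a complex pair, which polyroot reports as well.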
Re: [R] How to use color shade in Stacked bar plot?
Far too vague a question, but perhaps something like barplot(1:12, col = c("brown1", "brown3", "brown2", "firebrick", "firebrick1", "firebrick4", "firebrick3", "firebrick2", "darkred", "red4", "red3")) legend("topleft", col = c("brown1", "brown3", "brown2", "firebrick", "firebrick1", "firebrick4", "firebrick3", "firebrick2", "darkred", "red4", "red3"), legend = 1:12, pch = 15) Michael On Mon, Jul 23, 2012 at 11:34 AM, Manish Gupta mandecent.gu...@gmail.com wrote: Hi, I am working on a stacked R plot but I want to use shades of red for each stack and a corresponding legend. How can I use it? Regards
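An alternative to hand-picking named colors is to let colorRampPalette() generate exactly as many graded red shades as the plot has stacks (the matrix values below are made up for illustration):

```r
m <- matrix(1:12, nrow = 3)   # 3 stacked segments per bar, 4 bars
reds <- colorRampPalette(c("mistyrose", "darkred"))(nrow(m))

barplot(m, col = reds, names.arg = paste("bar", 1:ncol(m)))
legend("topleft", fill = reds, legend = paste("level", 1:nrow(m)))
```

Because the palette function takes the number of colors as its argument, the code keeps working unchanged if the number of stacks changes.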
Re: [R] help building dataset
Weren't you told to take a look at read.table() (both the function help and the manual describing it)? If the rows correspond in each data file, something like do.call(cbind, lapply(dir(), read.table)) will possibly align the results of read.table()-ing each file in your directory. To parse that further: dir() gives a list of all files in the directory. lapply(x, FUN) takes a set of values (x) and does FUN to them. Here it would run read.table() on each file name. do.call(cbind, x) will call the cbind() function on the results of lapply(). It's sort of like doing cbind(x[1], x[2], x[3], ...) but doesn't require as much typing, or require you to know how many columns are going in. Michael On Mon, Jul 23, 2012 at 1:32 PM, walcotteric walco...@msu.edu wrote: I'm having trouble building a dataset. I'm working with Census data from Brazil, and the particular set I'm trying to get into right now is a microdata sample which has 4 data files that are saved as .txt files full of numbers. The folder also has a lot of Excel sheets and other text files describing the data, and (I'm assuming) helping to organize everything. Unfortunately there isn't much help in the description about how to construct the dataset and avoid messing things up (since it's Census data, I need to make sure I avoid associating data with the wrong city/state, etc.). I basically just need to be able to put the data in a readable format, because there's literally 1 variable in the set that I can't find anywhere else and need to get at for some analysis. However, when I've tried to get the data straight into R (copying from Notepad), it overloads R, and R stops responding. Any suggestions? Or, if there isn't enough information about the set to be helpful, what else do you need to know? Or if you'd like to take a look at the data, let me know and I can attach it. Thanks!
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
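Michael's do.call(cbind, lapply(...)) pattern, demonstrated end to end with two throw-away files (the file names and values here are invented; note the caveat that cbind only makes sense if row i refers to the same record in every file):

```r
# Two small stand-in files in a temp directory
tmp <- tempfile(); dir.create(tmp)
write.table(data.frame(a = 1:3), file.path(tmp, "f1.txt"), row.names = FALSE)
write.table(data.frame(b = 4:6), file.path(tmp, "f2.txt"), row.names = FALSE)

files <- dir(tmp, full.names = TRUE)
combined <- do.call(cbind, lapply(files, read.table, header = TRUE))
combined   # columns a and b side by side -- valid only if rows correspond
```

One caution for census microdata specifically: such files are often fixed-width rather than delimited, in which case read.fwf() with the column widths from the layout documentation is the tool to reach for instead of read.table(), and either function avoids pasting the data into the console at all.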
Re: [R] Solving equations in R
Diviya Smith wrote: Hi there, I would like to solve the following equation in R to estimate 'a'. I have amp, d, x and y. amp*y^2 = 2*a*(1-a)*(-a*d+(1-a)*x)^2 Test data: amp = 0.2370, y = 0.0233, d = 0.002, x = 0.091 Can anyone suggest how I can set this up? This should help you to get started. library(nleqslv) f <- function(a, amp = 0.2370, y = 0.0233, d = 0.002, x = 0.091) { amp*y^2 - ( 2*a*(1-a)*(-a*d+(1-a)*x)^2 ) } # draw the function curve(f, from = -0.2, to = 1.5, col = "blue") abline(h = 0, col = "red") # use nleqslv # use two different starting values to see if we can solve for the two roots # you can leave out the control=... argument astart <- 1 nleqslv(astart, f, control = list(trace = 1)) astart <- 0 nleqslv(astart, f, control = list(trace = 1)) # use uniroot # limit the lower/upper interval to the two roots # ranges chosen from the plot so that function values at the endpoints have different signs uniroot(f, c(-0.5, 0.1)) uniroot(f, c(0.5, 1.5)) Berend
[R] igraph node placement
Hello R users, I've just defended my PhD dissertation, and I'm engaging in discussions with the ruler lady. She has some problems with my figures. The problem is that my nodes overlap the text that I've put inside of them. Here is a url for the figure: https://www.msu.edu/~wolfste4/igraph/sorter34_graphvfr.pdf (As you can see, the cluster at the center left of this graph is quite problematic.) Also, I've got a file with the weighted adjacency matrix in it here: https://www.msu.edu/~wolfste4/igraph/sorter34 This is the code that I'm using to make the graph: _ require(igraph) # Get weighted adjacency matrix: fname <- "sorter34" # file location load(fname) # Now make graphs and plot them: gg <- graph.adjacency(wadj, mode = "max") V(gg)$name <- 1:50 V(gg)$label <- V(gg)$name E(gg)$weight <- count.multiple(gg) gg <- simplify(gg) wt <- 2*E(gg)$weight*E(gg)$weight par(mar = c(0, 0, 0, 0)) # c(bottom, left, top, right) l <- layout.fruchterman.reingold(gg, coolexp = 10) plot.igraph(gg, layout = l, vertex.color = gray(0.8), edge.width = wt, vertex.label.color = 'black', edge.color = gray(0.2), vertex.label.cex = 1.5, vertex.size = 12) Thanks in advance for your help! -Steve ___ Steven Wolf Michigan State University Lyman Briggs College 35 E. Holmes Hall East Lansing, MI 48823
Re: [R] turning R expressions into functions?
One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] EM for missing data
I am not a professional, and sorry if I bring in any incorrect concept. When you say impute, I guess you want to replace the missing part by its conditional expectation. This is not safe if you do non-linear operations later. A simple example is the expectation of a quadratic form, E(X'AX) != E(X)'AE(X). In other words, for example, if you want to estimate the covariance, it is better to avoid using the imputed values directly, since they will not give good results; but if you want to estimate the mean, then it is maybe OK (maybe, I do not remember my results). Please correct me if I am wrong. I am also interested in those packages. But if you are stuck with finding a proper package, you may write your own. The clearest explanation I have read is on the wiki. In order to use EM, you need to specify the distribution of the data; whether that is proper depends on what your next step is. Also, Bayesian methods and packages exist in R, I think, to impute the data. Best wishes, Jie On Mon, Jul 23, 2012 at 11:39 AM, David L Carlson dcarl...@tamu.edu wrote: This is chopped off. What happens next is important. EM can be used to compute covariance matrices and conduct the statistical analysis without imputing any missing values, or it can be used to impute missing values to create one or multiple data sets. You might find Missing Data by Paul D. Allison to be useful. -- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of ya Sent: Sunday, July 22, 2012 5:11 AM To: r-help Subject: Re: [R] EM for missing data hi Greg, David, and Tal, Thank you very much for the information. I found this in the SPSS 17.0 missing values manual: EM Method This method assumes a distribution for the partially missing data and bases inferences on the likelihood under that distribution. Each iteration consists of an E step and an M step.
The E step finds the conditional expectation of the b __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
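Jie's warning about non-linear operations can be seen directly in a toy example (data simulated here): filling in missing values with the observed mean leaves the estimated mean untouched, but it must deflate the variance, because every imputed point sits exactly at the center.

```r
set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)
x_mis <- x
x_mis[sample(200, 50)] <- NA            # 25% missing completely at random

# naive single imputation: replace each NA with the observed mean
x_imp <- ifelse(is.na(x_mis), mean(x_mis, na.rm = TRUE), x_mis)

mean(x_imp) - mean(x_mis, na.rm = TRUE)   # 0: the mean is unaffected
var(x_imp) < var(x_mis, na.rm = TRUE)     # TRUE: the variance is shrunk
```

This is exactly why EM-based covariance estimation works with the likelihood rather than with a mean-imputed data set.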
Re: [R] marginal effect lmer
Dear Andigo, On Mon, 23 Jul 2012 07:42:49 -0700 (PDT) Andigo andreas.goldb...@unige.ch wrote: Dear John, yes, the superscript shall say that the quadratic term of Catholicproportion is included as well (plus the interaction terms with it). In my dataset the quadratic Catholicproportion is already included as a variable of its own, so it should be fine or do I have to use the syntax with poly... you mentioned? If so how do I have to rewrite the command for the whole model? It's hard to know where to begin, particularly since you deleted the earlier discussion. Retrieving that from my reply to you, the model you fit was m1 <- lmer(vote2011 ~ Catholic + Attendance + logthreshold + West + Catholicproportion + (Catholic * Catholicproportion) + (Attendance * Catholicproportion) + Catholicproportion² + (Catholic * Catholicproportion²) + (Attendance * Catholicproportion²) + (1 + Attendance + Catholic | canton), data = dat2011, verbose = T) I don't really know what these variables are but I'll assume that all are numeric variables and not, e.g., 0/1 dummy regressors that you put in the model instead of factors in the same way that you generated the quadratic regressor for Catholicproportion -- both of these are bad ideas in R because they disguise the structure of the model. Anyway, I suppose that what you want in the fixed-effects part of the model is logthreshold + West + (Catholic + Attendance)*poly(Catholicproportion, 2) This will give you an orthogonal quadratic in Catholicproportion that interacts both with Catholic and Attendance and will allow effect() to figure out the structure of the model; if you want a raw quadratic (which would provide the same fit to the data but be somewhat less numerically stable), you could substitute poly(Catholicproportion, 2, raw=TRUE).
Out of your description of the package I just tried the example you mentioned for an lmer model, so I tried: plot(effect("Catholic:Catholicproportion", m1), grid=TRUE) I'm not sure what you read, but what you got is what you asked for; with the respecified model, you could ask for the Catholic:poly(Catholicproportion, 2) effect, or more simply plot(allEffects(m1)) to plot all high-order fixed-effects terms in the model. The result is ten (don't know why 10 and not just 1) little plots, which look rather linear and not the curvilinear pattern it should be. Catholic and Catholicproportion are presumably numeric variables, and as documented in ?effect, each will by default be set to 10 levels spanning the range of the variable. There are then 10*10 == 100 combinations of values of these variables at which the effect will be estimated. That you expected one plot suggests to me that you don't quite understand what effect() computes (though all the lines could be put on one plot by setting the argument multiline=TRUE -- ?plot.eff). The lines are straight because effect() couldn't read your mind to know that Catholicproportion² and Catholicproportion are related, which is why using poly() instead is a good idea. So far I have always calculated the marginal effects in Stata, which works fine for rather simple hierarchical models. But when the models become more complicated (more random effects, more interaction terms, ...) Stata often fails to calculate the models, so I cannot use it for displaying the marginal effects. That's why I try to find a solution in R to calculate the marginal effects exactly the same way as I do in Stata. In Stata I followed the syntax by Brambor, Clark and Golder (https://files.nyu.edu/mrg217/public/interaction.html). Now I just wonder if there is a way to calculate the marginal effects in R exactly the same way as they do in Stata? I already explained that effect() doesn't compute marginal effects.
There may well be an R package that does, but I'm not aware of one. It should, however, be a simple matter to compute marginal effects from the lmer() results if you find marginal effects interesting. Best, John Best, Andigo -- View this message in context: http://r.789695.n4.nabble.com/marginal-effect-lmer-tp4637421p4637452.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
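John's recommendation of poly() over a hand-made squared variable changes how the model reports its structure, not the fit itself, which a quick base-R check confirms (data simulated here; this uses plain lm rather than lmer, but the point about the fixed-effects formula is the same):

```r
set.seed(1)
xv <- runif(50)
yv <- 1 + xv - 2 * xv^2 + rnorm(50, sd = 0.1)

m_poly   <- lm(yv ~ poly(xv, 2))    # one quadratic term the model "knows about"
m_manual <- lm(yv ~ xv + I(xv^2))   # two seemingly unrelated regressors

all.equal(fitted(m_poly), fitted(m_manual))   # TRUE: identical fitted values
```

The two parameterizations span the same column space, so predictions agree exactly; the difference is that downstream tools such as effect() can treat poly(xv, 2) as a single term and vary it coherently.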
Re: [R] Hire person to convert R code to SAS
I will see if my SAS programmer can do that. He initially had no idea what R was, which is what prompted me to look for someone who knew R.
Re: [R] Hire person to convert R code to SAS
Interesting idea; the problem is that I have a dataset of 27,000 and a lot already coded in SAS. I have the R code and just need someone to duplicate it in SAS. :) -- View this message in context: http://r.789695.n4.nabble.com/Hire-person-to-convert-R-code-to-SAS-tp4637436p4637496.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] turning R expressions into functions?
One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. ... or use dotlist <- list(...) to get a list of everything included in ... S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}}
Re: [R] turning R expressions into functions?
On Mon, Jul 23, 2012 at 2:12 PM, S Ellison s.elli...@lgcgroup.com wrote: One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. ... or use dotlist <- list(...) to get a list of everything included in ... But that evaluates them, which I don't think the original poster wanted. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
[R] R2wd package wdGet() error
I am having trouble using the R2wd package. The last time I used it successfully, I was running an earlier version of R and an earlier version of Word with an earlier Windows OS. I'm not sure which, if any, of these changes might be contributing to the problem. Right now I'm using R version 2.15.0 (2012-03-30) for Windows, MS Word 2010 version 14.0.6112.5000 (32-bit), and Windows 7 Enterprise 2009 (Service Pack 1). When I submit the following code
library(R2wd)
library(rcom)
wdGet()
I get this error message
Error in if (wdapp[["Documents"]][["Count"]] == 0) wdapp[["Documents"]]$Add() : argument is of length zero
I saw in a previous posting to r-help (http://r.789695.n4.nabble.com/R2wd-error-in-wdGet-td4632737.html) that someone asked if RDCOMClient was installed and working properly. So, I tried submitting this code
install.packages("RDCOMClient", repos = "http://www.omegahat.org/R")
And I got this warning message: package 'RDCOMClient' is not available (for R version 2.15.0) I also ran installstatconnDCOM() and it seemed to install fine. But when I try to run statconn DCOM's "Server 01 - Basic Test" and then select "Start R", I get this message
Loading StatConnector Server... Done
Initializing R...Function call failed
Code: -2147221485
Text: installation problem: unable to load connector
Method '~' of object '~' failed
Releasing StatConnector Server...Done
Any suggestions for how I might get wdGet() to work? Thanks in advance. Jean
Re: [R] Hire person to convert R code to SAS
You actually have a SAS programmer who has never heard of R? On Mon, 23 Jul 2012 11:52:14 -0700 willsurg surg...@mac.com wrote: I will see if my SAS programmer can do that. He initially had no idea what R was, which is what prompted me to look for someone who knew R. -- View this message in context: http://r.789695.n4.nabble.com/Hire-person-to-convert-R-code-to-SAS-tp4637436p4637497.html Sent from the R help mailing list archive at Nabble.com. -- Important Notice: This mailbox is ignored: e-mails are set to be deleted on receipt. For those needing to send personal or professional e-mail, please use appropriate addresses.
Re: [R] turning R expressions into functions?
list(...) evaluates the things in ... E.g.,
f0 <- function(x, ...) list(...)
f0(1, warning("Hmm"), stop("Oops"), cat("some output\n"))[[2]]
Error in f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Oops
In addition: Warning message:
In f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Hmm
You can use the odd idiom substitute(...()) to get the unevaluated ... arguments:
f1 <- function(x, ...) substitute(...())
f1(1, warning("Hmm"), stop("Oops"), cat("some output\n"))
[[1]] warning("Hmm")
[[2]] stop("Oops")
[[3]] cat("some output\n")
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of S Ellison Sent: Monday, July 23, 2012 2:12 PM To: Jochen Voß Cc: r-help@r-project.org Subject: Re: [R] turning R expressions into functions? One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. ... or use dotlist <- list(...) to get a list of everything included in ... S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}}
Re: [R] turning R expressions into functions?
Bill: Is there some reason to prefer your odd idiom to match.call, perhaps as as.list(match.call()), as proposed by Hadley? -- Bert On Mon, Jul 23, 2012 at 2:25 PM, William Dunlap wdun...@tibco.com wrote: list(...) evaluates the things in ... E.g.,
f0 <- function(x, ...) list(...)
f0(1, warning("Hmm"), stop("Oops"), cat("some output\n"))[[2]]
Error in f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Oops
In addition: Warning message:
In f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Hmm
You can use the odd idiom substitute(...()) to get the unevaluated ... arguments:
f1 <- function(x, ...) substitute(...())
f1(1, warning("Hmm"), stop("Oops"), cat("some output\n"))
[[1]] warning("Hmm")
[[2]] stop("Oops")
[[3]] cat("some output\n")
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of S Ellison Sent: Monday, July 23, 2012 2:12 PM To: Jochen Voß Cc: r-help@r-project.org Subject: Re: [R] turning R expressions into functions? One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. ... or use dotlist <- list(...) to get a list of everything included in ... S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}}
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] turning R expressions into functions?
... or better still, the idiom used in update.default: match.call(expand.dots=FALSE)$... ? -- Bert On Mon, Jul 23, 2012 at 2:45 PM, Bert Gunter bgun...@gene.com wrote: Bill: Is there some reason to prefer your odd idiom to match.call, perhaps as as.list(match.call()), as proposed by Hadley? -- Bert On Mon, Jul 23, 2012 at 2:25 PM, William Dunlap wdun...@tibco.com wrote: list(...) evaluates the things in ... E.g.,
f0 <- function(x, ...) list(...)
f0(1, warning("Hmm"), stop("Oops"), cat("some output\n"))[[2]]
Error in f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Oops
In addition: Warning message:
In f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Hmm
You can use the odd idiom substitute(...()) to get the unevaluated ... arguments:
f1 <- function(x, ...) substitute(...())
f1(1, warning("Hmm"), stop("Oops"), cat("some output\n"))
[[1]] warning("Hmm")
[[2]] stop("Oops")
[[3]] cat("some output\n")
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of S Ellison Sent: Monday, July 23, 2012 2:12 PM To: Jochen Voß Cc: r-help@r-project.org Subject: Re: [R] turning R expressions into functions? One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. ... or use dotlist <- list(...) to get a list of everything included in ... S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}}
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] turning R expressions into functions?
I tend not to use match.call for this because it feels like I'm repeating work (argument matching) that has already been done. Also, match.call's output needs to be processed a bit to get the expressions. The following 2 functions give the same results, a pairlist of the unevaluated arguments matching the ... in the function definition.
f1 <- function(x, ..., atEnd) substitute(...())
f2 <- function(x, ..., atEnd) match.call(expand.dots=FALSE)$...
E.g.,
str(f1(1, stop("don't evaluate me!"), Log=log(-10:1), atEnd=Inf))
Dotted pair list of 2
$ : language stop("don't evaluate me!")
$ Log: language log(-10:1)
str(f2(1, stop("don't evaluate me!"), Log=log(-10:1), atEnd=Inf))
Dotted pair list of 2
$ : language stop("don't evaluate me!")
$ Log: language log(-10:1)
The former appears to be about 3 times as fast as the latter, but you need to run it a lot of times (10^4) to see the difference. There may be a bigger speedup if there are lots of non-... arguments, but I haven't tested that. I also use the following, which works in both S+ and R:
f3 <- function(x, ..., atEnd) as.list(substitute(junk(...)))[-1]
str(f3(1, stop("don't evaluate me!"), Log=log(-10:1), atEnd=Inf))
List of 2
$ : language stop("don't evaluate me!")
$ Log: language log(-10:1)
It is probably best to bury this in a utility function with an intuitive name instead of trying to remember the idioms. Perhaps there already is one. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: Bert Gunter [mailto:gunter.ber...@gene.com] Sent: Monday, July 23, 2012 2:45 PM To: William Dunlap Cc: S Ellison; Jochen Voß; r-help@r-project.org Subject: Re: [R] turning R expressions into functions? Bill: Is there some reason to prefer your odd idiom to match.call, perhaps as as.list(match.call()), as proposed by Hadley? -- Bert On Mon, Jul 23, 2012 at 2:25 PM, William Dunlap wdun...@tibco.com wrote: list(...) evaluates the things in ... E.g., f0 <- function(x, ...) list(...)
f0(1, warning("Hmm"), stop("Oops"), cat("some output\n"))[[2]]
Error in f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Oops
In addition: Warning message:
In f0(1, warning("Hmm"), stop("Oops"), cat("some output\n")) : Hmm
You can use the odd idiom substitute(...()) to get the unevaluated ... arguments:
f1 <- function(x, ...) substitute(...())
f1(1, warning("Hmm"), stop("Oops"), cat("some output\n"))
[[1]] warning("Hmm")
[[2]] stop("Oops")
[[3]] cat("some output\n")
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of S Ellison Sent: Monday, July 23, 2012 2:12 PM To: Jochen Voß Cc: r-help@r-project.org Subject: Re: [R] turning R expressions into functions? One of the things I would love to add to my package would be the ability to compare more than two expressions in one call. But unfortunately, I haven't found out so far whether (and if so, how) it is possible to extract the elements of a ... object without evaluating them. Have a look at match.call. ... or use dotlist <- list(...) to get a list of everything included in ... S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}}
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
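Taking up Bill's closing remark about burying the idiom in a utility with an intuitive name: a minimal sketch (the name dots_quote is invented here, not an established function in any package). It forwards ... to an inner function, where substitute(...()) still sees the original unevaluated promise expressions.

```r
## Return the unevaluated expressions supplied through ... of the caller.
## 'dots_quote' is an illustrative name, not part of any package API.
dots_quote <- function(...) substitute(...())

f <- function(x, ...) dots_quote(...)
d <- f(1, stop("don't evaluate me!"), Log = log(-10:1))
length(d)   # 2 -- nothing was evaluated
d$Log       # the call log(-10:1), not its value
```

Forwarding ... works here because R passes the same promises along, so their expression slots are the arguments as written in the call to f.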
Re: [R] Creating panel data
On Jul 23, 2012, at 10:33 AM, Jeff wrote: At 10:38 AM 7/23/2012, you wrote: You might also find it useful to use Hadley Wickham's plyr and/or reshape2 packages, whose aim is to standardize and simplify data manipulation tasks. Cheers, Bert I have already used R enough to have correctly imported the actual data. After import, it is in approximately the format of the x dataframe I previously posted. I have already found the plyr and reshape2 packages and had assumed that the cast (or dcast) functions might be the correct ones. melt seemed to give me only what I already have. The examples I have seen thus far start with data in various formats and end up in the format that I am starting with; in other words, they seem to do the exact opposite of what I'm trying to do. So I'm still stuck on how to get started and on whether the functions in reshape2 are actually the right ones to consider. ...still looking for some help on this. I didn't see a clear way to use either reshape() or the plyr/reshape2 packages to do this (but would enjoy seeing an example that improved my understanding on this path), so I just looked at your x, created a scaffold with the number of rows needed to match your y, and filled in the other columns by first merging to that scaffold and then creating new columns:
y2 <- data.frame(id=rep(1:2, each=5), Year=seq(68,69,by=0.25))
merge(y2, x)
   id  Year Event1Occur YearOfOccurEvent1 Event2Occur YearOfOccurEvent2
1   1 68.00           1             68.25           0               0.0
2   1 68.25           1             68.25           0               0.0
3   1 68.50           1             68.25           0               0.0
4   1 68.75           1             68.25           0               0.0
5   1 69.00           1             68.25           0               0.0
6   2 68.00           0              0.00           1              68.5
7   2 68.25           0              0.00           1              68.5
8   2 68.50           0              0.00           1              68.5
9   2 68.75           0              0.00           1              68.5
10  2 69.00           0              0.00           1              68.5
y2a <- merge(y2, x)
y2a$Event1 <- with(y2a, as.numeric(Event1Occur & Year >= YearOfOccurEvent1))
y2a$Event2 <- with(y2a, as.numeric(Event2Occur & Year >= YearOfOccurEvent2))
# Using negative numeric column indexing to suppress the now-superfluous columns
y2a[, -(3:6)]
   id  Year Event1 Event2
1   1 68.00      0      0
2   1 68.25      1      0
3   1 68.50      1      0
4   1 68.75      1      0
5   1 69.00      1      0
6   2 68.00      0      0
7   2 68.25      0      0
8   2 68.50      0      1
9   2 68.75      0      1
10  2 69.00      0      1
-- David Winsemius, MD Alameda, CA
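Since the poster's x is not shown in this excerpt, here is a self-contained version of David's scaffold-and-merge approach, with a hypothetical x reconstructed from the printed merge() output above (the values are only illustrative):

```r
## One row per id: whether each event occurred and when
## (hypothetical values matching the printed output above).
x <- data.frame(id = 1:2,
                Event1Occur = c(1, 0), YearOfOccurEvent1 = c(68.25, 0),
                Event2Occur = c(0, 1), YearOfOccurEvent2 = c(0, 68.5))

## Scaffold: every id crossed with every quarter-year.
y2 <- data.frame(id = rep(1:2, each = 5), Year = seq(68, 69, by = 0.25))

y2a <- merge(y2, x)
y2a$Event1 <- with(y2a, as.numeric(Event1Occur & Year >= YearOfOccurEvent1))
y2a$Event2 <- with(y2a, as.numeric(Event2Occur & Year >= YearOfOccurEvent2))
panel <- y2a[, -(3:6)]   # drop the now-superfluous helper columns
panel
```

The indicator turns on in the quarter of the event year and stays on afterwards, which is the panel layout the poster described.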
Re: [R] R2wd package wdGet() error
On 7/23/2012 4:25 PM, Jean V Adams wrote: I am having trouble using the R2wd package. The last time I used it successfully, I was running an earlier version of R and an earlier version of Word with an earlier Windows OS. I'm not sure which, if any, of these changes might be contributing to the problem. Right now I'm using R version 2.15.0 (2012-03-30) for Windows, MS Word 2010 version 14.0.6112.5000 (32-bit), and Windows 7 Enterprise 2009 (Service Pack 1). When I submit the following code
library(R2wd)
library(rcom)
wdGet()
I get this error message
Error in if (wdapp[["Documents"]][["Count"]] == 0) wdapp[["Documents"]]$Add() : argument is of length zero
I have a new R on a new Win7 OS since I last used wdGet(), so I decided to try to reproduce your error. I tried your code and got the same error (R 2.15.1) with library(rcom). However, as you note below, that line also told me to install RDCOMClient with the command installstatconnDCOM(). I did this and the install happened seamlessly with only a few user interactions. wdGet() now works just fine. The only missing instruction here is that it is necessary to stop and restart R after installing the RDCOMClient. If you have not tried that, give it a try. It worked well for me. Rob I saw in a previous posting to r-help (http://r.789695.n4.nabble.com/R2wd-error-in-wdGet-td4632737.html) that someone asked if RDCOMClient was installed and working properly. So, I tried submitting this code
install.packages("RDCOMClient", repos = "http://www.omegahat.org/R")
And I got this warning message: package 'RDCOMClient' is not available (for R version 2.15.0) I also ran installstatconnDCOM() and it seemed to install fine. But when I try to run statconn DCOM's "Server 01 - Basic Test" and then select "Start R", I get this message
Loading StatConnector Server... Done
Initializing R...Function call failed
Code: -2147221485
Text: installation problem: unable to load connector
Method '~' of object '~' failed
Releasing StatConnector Server...Done
Any suggestions for how I might get wdGet() to work? Thanks in advance. Jean
[R] translating IDL to R
I would appreciate * guidance regarding translation of IDL routines to R, generally * assistance translating two IDL routines to R, specifically Why I ask: I'm slowly learning how to do atmospheric modeling. One language that has been popular in this space is IDL http://en.wikipedia.org/wiki/IDL_(programming_language), which unfortunately is proprietary (not to mention butt-ugly, IMHO :-) There is an emerging FOSS implementation, GDL http://en.wikipedia.org/wiki/GNU_Data_Language, but I can't presently install GDL on either my own workstation or the cluster on which I do my real work. And, unfortunately, I need to generate some datasets for which instructions are provided only in IDL (see listings following my .sig). However I have R in both work environments, and love it. So if there are any experts in IDL-to-R porting out there, I'd appreciate your assistance. TIA, Tom Roche tom_ro...@pobox.com ---2 IDL routines follow to EOF--- from ftp://gfed3:dailyandhou...@zea.ess.uci.edu/GFEDv3.1/Readme.pdf
; idl code to generate daily emissions +
nlon=720 ; number of grid points by longitude
nlat=360 ; number of grid points by latitude
gfedmly=fltarr(nlon,nlat) ; array containing monthly emissions
gfeddly=fltarr(nlon,nlat) ; array containing daily emissions
; You must read monthly emissions to generate daily fluxes.
; For example, if you want daily emissions for January 21st, 2004,
; you need read monthly data in January 2004 first:
file0_in='GFED3.1_200401_C.txt'
file0_in=strcompress(file0_in, /REMOVE_ALL)
gfedmly = read_ascii( file0_in )
gfedmly = gfedmly.field001
; reverse the direction of latitude with monthly emissions
; to combine with daily fire fractions.
for j=1, nlat/2 do begin
  tmp = gfedmly[*,j-1]
  gfedmly[*,j-1] = gfedmly[*,nlat-j]
  gfedmly[*,nlat-j] = tmp
endfor
undefine, tmp
; Then, you can read daily fire fractions from the netcdf file.
file1_in = string('fraction_emissions_20040121.nc')
file1_in=strcompress(file1_in, /REMOVE_ALL)
fid=NCDF_OPEN(file1_in)
varid=NCDF_VARID(fid,'Fraction_of_Emissions')
NCDF_VARGET, fid, varid, DATA
NCDF_CLOSE, fid
gfeddly=gfedmly*DATA
; the end for daily emissions ++
; idl code to generate 3-hourly emissions ++
nlon=720 ; number of grid points by longitude
nlat=360 ; number of grid points by latitude
nhor=8 ; numbers of 3-hourly intervals each day
gfeddly=fltarr(nlon,nlat) ; array containing daily emissions
gfed3hly=fltarr(nlon,nlat,nhor) ; array containing 3-hourly emissions
file_in = string('fraction_emissions_200401.nc')
file_in=strcompress(file_in, /REMOVE_ALL)
fid=NCDF_OPEN(file_in)
varid=NCDF_VARID(fid,'Fraction_of_Emissions')
NCDF_VARGET, fid, varid, DATA
NCDF_CLOSE, fid
for nh=0,nhor-1 do begin
  gfed3hly[*,*,nh]=gfeddly*DATA[*,*,nh]
endfor
; the end for 3-hourly emissions +++
Re: [R] translating IDL to R
R can do all this, but you'll need to get into specifics a little more. Most of your code is covered by R's ?read.table, ?data.frame, ?array, ?file and ?Extract. See the ncdf (or ncdf4 or RNetCDF) package for examples that will look quite similar to the IDL code that opens a NetCDF file. You could do a more or less direct translation of this IDL code to R, but for us to help easily we probably need example data files. These are fairly basic I/O and array manipulation operations, and the Introduction to R and R Data Import/Export manuals cover them well. A small example translation:
nlon = 720 ## number of grid points by longitude
nlat = 360 ## number of grid points by latitude
gfedmly = array(0.0, c(nlon,nlat)) ## array containing monthly emissions
gfeddly = array(0.0, c(nlon,nlat)) ## array containing daily emissions
To start populating those arrays from your files we would need much more detail about them, and as ever with arrays you need to be aware of the orientation conventions. Depending on your data you should explore the spatial support functions in sp, raster and rgdal. See the R Spatial Task View; you'll find much more integrated support in R than this IDL code shows, and it's probably easier for you to jump straight into that level. Cheers, Mike. On Tue, Jul 24, 2012 at 9:55 AM, Tom Roche tom_ro...@pobox.com wrote: I would appreciate * guidance regarding translation of IDL routines to R, generally * assistance translating two IDL routines to R, specifically Why I ask: I'm slowly learning how to do atmospheric modeling. One language that has been popular in this space is IDL http://en.wikipedia.org/wiki/IDL_(programming_language), which unfortunately is proprietary (not to mention butt-ugly, IMHO :-) There is an emerging FOSS implementation, GDL http://en.wikipedia.org/wiki/GNU_Data_Language, but I can't presently install GDL on either my own workstation or the cluster on which I do my real work.
And, unfortunately, I need to generate some datasets for which instructions are provided only in IDL (see listings following my .sig). However I have R in both work environments, and love it. So if there are any experts in IDL-to-R porting out there, I'd appreciate your assistance. TIA, Tom Roche tom_ro...@pobox.com ---2 IDL routines follow to EOF--- from ftp://gfed3:dailyandhou...@zea.ess.uci.edu/GFEDv3.1/Readme.pdf
; idl code to generate daily emissions +
nlon=720 ; number of grid points by longitude
nlat=360 ; number of grid points by latitude
gfedmly=fltarr(nlon,nlat) ; array containing monthly emissions
gfeddly=fltarr(nlon,nlat) ; array containing daily emissions
; You must read monthly emissions to generate daily fluxes.
; For example, if you want daily emissions for January 21st, 2004,
; you need read monthly data in January 2004 first:
file0_in='GFED3.1_200401_C.txt'
file0_in=strcompress(file0_in, /REMOVE_ALL)
gfedmly = read_ascii( file0_in )
gfedmly = gfedmly.field001
; reverse the direction of latitude with monthly emissions
; to combine with daily fire fractions.
for j=1, nlat/2 do begin
  tmp = gfedmly[*,j-1]
  gfedmly[*,j-1] = gfedmly[*,nlat-j]
  gfedmly[*,nlat-j] = tmp
endfor
undefine, tmp
; Then, you can read daily fire fractions from the netcdf file.
file1_in = string('fraction_emissions_20040121.nc')
file1_in=strcompress(file1_in, /REMOVE_ALL)
fid=NCDF_OPEN(file1_in)
varid=NCDF_VARID(fid,'Fraction_of_Emissions')
NCDF_VARGET, fid, varid, DATA
NCDF_CLOSE, fid
gfeddly=gfedmly*DATA
; the end for daily emissions ++
; idl code to generate 3-hourly emissions ++
nlon=720 ; number of grid points by longitude
nlat=360 ; number of grid points by latitude
nhor=8 ; numbers of 3-hourly intervals each day
gfeddly=fltarr(nlon,nlat) ; array containing daily emissions
gfed3hly=fltarr(nlon,nlat,nhor) ; array containing 3-hourly emissions
file_in = string('fraction_emissions_200401.nc')
file_in=strcompress(file_in, /REMOVE_ALL)
fid=NCDF_OPEN(file_in)
varid=NCDF_VARID(fid,'Fraction_of_Emissions')
NCDF_VARGET, fid, varid, DATA
NCDF_CLOSE, fid
for nh=0,nhor-1 do begin
  gfed3hly[*,*,nh]=gfeddly*DATA[*,*,nh]
endfor
; the end for 3-hourly emissions +++
-- Michael Sumner Hobart, Australia e-mail: mdsum...@gmail.com
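To make Mike's point concrete, here is a hedged R sketch of the first IDL listing (daily emissions). The commented lines show where the real file reads would go, assuming the GFED file layouts and the ncdf4 package (nc_open / ncvar_get / nc_close); since those data files aren't available here, the latitude flip and element-wise multiply are demonstrated on a tiny synthetic lon-by-lat grid instead of 720 x 360.

```r
## Real-data reads (assumes the GFED file layouts; requires ncdf4):
##   gfedmly <- as.matrix(read.table("GFED3.1_200401_C.txt"))
##   nc   <- ncdf4::nc_open("fraction_emissions_20040121.nc")
##   frac <- ncdf4::ncvar_get(nc, "Fraction_of_Emissions")
##   ncdf4::nc_close(nc)

nlon <- 6; nlat <- 4                                 # stand-in for 720 x 360
gfedmly <- matrix(seq_len(nlon * nlat), nlon, nlat)  # monthly emissions, lon x lat
frac    <- matrix(0.1, nlon, nlat)                   # daily fire fraction

## The IDL j-loop swaps rows j and nlat-j+1, i.e. it reverses latitude:
gfedmly <- gfedmly[, nlat:1]

gfeddly <- gfedmly * frac                            # element-wise, as in IDL
dim(gfeddly)
```

The second listing (3-hourly) is the same pattern with a third dimension: in R, `gfed3hly <- sweep(frac3h, 1:2, gfeddly, "*")` or a simple loop over the 8 intervals would replace the IDL nh-loop.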
[R] Convert Package Interest?
I am thinking about submitting a package to CRAN that contains some units conversion functions that I use on a regular basis. Would this be helpful to the community, or would it be better to keep this as a personal package? I don't want to clutter CRAN. many thanks, -- Stephen Sefick ** Auburn University Biological Sciences 331 Funchess Hall Auburn, Alabama 36849 ** sas0...@auburn.edu http://www.auburn.edu/~sas0025 ** Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis A big computer, a complex algorithm and a long time does not equal science. -Robert Gentleman
Re: [R] Convert Package Interest?
On Mon, Jul 23, 2012 at 8:12 PM, Stephen Sefick sas0...@auburn.edu wrote: I am thinking about submitting a package to CRAN that contains some units conversion functions that I use on a regular basis. Would this be helpful to the community, or would it be better to keep this as a personal package? I don't want to clutter CRAN. many thanks, You might want to check out the udunits2 package on CRAN.
library(udunits2)
ud.convert(1, "miles", "km")
[1] 1.609344
-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
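The conversion above can be sanity-checked without udunits2, since 1 international mile is defined as exactly 1609.344 m:

```r
## Base-R check of the constant udunits2 reports (no packages needed).
mi_to_km <- function(x) x * 1.609344
mi_to_km(1)   # 1.609344, matching ud.convert(1, "miles", "km")
```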
Re: [R] Creating panel data
At 06:33 PM 7/23/2012, David Winsemius wrote: I didn't see a clear way to use either reshape() or the plyr/reshape2 packages to do this (but would enjoy seeing an example that improved my understanding on this path), so I just looked at your x, created a scaffold with the number of rows needed to match your y, and filled in the other columns by first merging to that scaffold and then creating new columns: That did it! Thanks Jeff
Re: [R] Convert Package Interest?
If you haven't already, you might try findFn in the sos package to look for other functions that do things similar to what your functions do. If your package offers some functionality not present in other packages, I'd encourage you to submit it to CRAN. Again, if you haven't already done this, you might wish to add links to functions in other packages that do similar but different things. Spencer On 7/23/2012 5:25 PM, Gabor Grothendieck wrote: On Mon, Jul 23, 2012 at 8:12 PM, Stephen Sefick sas0...@auburn.edu wrote: I am thinking about submitting a package to CRAN that contains some units conversion functions that I use on a regular basis. Would this be helpful to the community, or would it be better to keep this as a personal package? I don't want to clutter CRAN. many thanks, You might want to check out the udunits2 package on CRAN.
library(udunits2)
ud.convert(1, "miles", "km")
[1] 1.609344
-- Spencer Graves, PE, PhD President and Chief Technology Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 web: www.structuremonitoring.com
Re: [R] Convert Package Interest?
On 24/07/12 12:12, Stephen Sefick wrote: I am thinking about submitting a package to CRAN that contains some units conversion functions that I use on a regular basis. Would this be helpful to the community, or would it be better to keep this as a personal package? I don't want to clutter CRAN. Why not? Everybody else does! :-) cheers, Rolf Turner