[R] Extract data from Array to Table
Dear All, I am facing the task to extract data from array to table. here an example of array. ##Get A list of Matrices with unequal rows Disease - NULL Diseases - NULL ListMatByGene - NULL for(i in 1:3){ Disease[[i]] -matrix(sample(-30:30,25+(5* i)),5+i) rownames(Disease[[i]]) - paste0(Sample,1:(5+i)) colnames(Disease[[i]]) - paste0(Gene,1:5) D - paste0(Disease,i) Diseases[[D]] - Disease[[i]] } ## get the same Column from all matrices getColumn - function(x, colNum, len = nrow(x)){ y - x[,colNum] length(y) - len y } ## get Matrices by the same columns of the list of matrices getMatrices - function(colNums, dataList = x){ # the number of rows required n - max(sapply(dataList, nrow)) lapply(colNums, function(x, dat, n) { # iterate along requested columns do.call(cbind, lapply(dat, getColumn,x, len=n)) # iterate along input data list }, dataList, n) } ## Rotate the list of matrices by 90° G - paste0(Gene,1:5) ListMatByGene[G] - getMatrices(c(1:ncol(Diseases[[1]])),dataList=Diseases) ## get Disease correlation by gene DiseaseCorrelation - lapply(ListMatByGene,function(x) cor(x,use=na, method=spearman)) ##convert the list of Matrices to array ArrayDiseaseCor - array(unlist(DiseaseCorrelation), dim = c(nrow(DiseaseCorrelation[[1]]), ncol(DiseaseCorrelation[[1]]), length(DiseaseCorrelation))) dimnames(ArrayDiseaseCor) - list(names(Diseases), names(Diseases), colnames(Diseases[[1]])) ## Select only correlation bigger than 0.5 from the array FilterDiseaseCor - apply(ArrayDiseaseCor,MARGIN=c(1,2) ,function(x) x[abs(x)0.5]) ## Final result FilterDiseaseCor Disease1 Disease2 Disease3 Disease1 Numeric,5 Numeric,2 -0.9428571 Disease2 Numeric,2 Numeric,5 Numeric,2 Disease3 -0.9428571 Numeric,2 Numeric,5 Question is: How can get a table as: D1 D2 Cor Gene Disease1Disease2 -0.94Gene2 Disease1Disease2 0.78Gene4 Disease3Disease2 0.5 Gene5 ... and Disease1 Disease2 Disease3 Disease1510 Disease21 53 Disease30 3 5 Or in general, How can I extract data from Array to Table? Thanks Karim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to solve this complex equation
A~ha~!! Thank you, Prof. Peter Dalgaard, so much for your wonderful lesson!!! Learning new things everyday from this R-help mailing list! Chel Hee Lee On 2/11/2015 10:37 AM, peter dalgaard wrote: On 11 Feb 2015, at 17:11 , Chel Hee Lee chl...@mail.usask.ca wrote: The functional form given in the post written by Ssuhanchen captures my eyes. It is the cumulative distribution function of Poisson when the number of counts is less than or equal to 2 with unknown parameter mu=x/2. Since it is a nonlinear function, there may be multiple solutions but the solution should be greater than 0 (if I am in the right track). I am assuming this functional form is originated from the Poisson. Under this assumption, one solution is found as below: rt - uniroot(function(x) ppois(2, lambda=x)-0.05, interval=c(0.5,1), extendInt=yes) Warning messages: 1: In ppois(2, lambda = x) : NaNs produced 2: In ppois(2, lambda = x) : NaNs produced 3: In ppois(2, lambda = x) : NaNs produced ppois(2, lambda=rt$root) [1] 0.051 rt$root [1] 6.295791 Thus, the solution x would be rt$root*2 (Note that I did not try to find other solutions). I hope this helps. Given the Poisson connection, I would pretty strongly expect the solution to be unique. Notice also that your rt$root comes out as the upper end of the confidence interval in poisson.test(2, alt=l) Exact Poisson test data: 2 time base: 1 number of events = 2, time base = 1, p-value = 0.9197 alternative hypothesis: true event rate is less than 1 95 percent confidence interval: 0.00 6.295794 sample estimates: event rate 2 Chel Hee Lee On 2/10/2015 2:29 AM, Rolf Turner wrote: On 10/02/15 14:04, Ssuhanchen wrote: Hi! I want to use R to calculate the variable x which is in a complex equation in below: 2 Σ[exp(-x/2)*(x^k)/(2^k*k!)]=0.05 k=0 how to solve this equation to get the exact x in R? Is this homework? Sure looks like it. Talk to your prof. Or do a bit of work on learning how to use R --- which is presumably the point of the exercise. cheers, Rolf Turner __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] A portfolio return function?
For finance applications, I'm surprised that I am unable to find a function to compute the portfolio return (sqrt(t(w) %*% V %*% w)) where w are portfolio weights and V is the cov(returns). The Performance Analytics portfolio return function seems to compute something else. Ernie __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting this data
Hi, Here is an implementation. The image is uploaded as Rplot02.png. gen - read.table(geno.txt,header=TRUE) gen Genotype E1 E2E3 E4 E5 E6 E7 E8 E9 E10 E11 E12 E13 E14 E15 E16 E17 E18 1 G1 0.79 2.11 6.21 0.56 4.06 2.13 5.61 0.20 3.32 3.01 5.12 0.77 0.78 0.62 2.00 5.87 2.50 0.49 2 G2 0.59 0.92 10.11 0.74 4.01 1.12 4.58 0.70 2.61 1.49 4.59 0.20 1.33 0.62 2.67 5.58 2.22 0.93 3 G3 1.18 3.44 4.08 0.50 6.91 4.27 6.87 0.30 4.57 2.88 6.61 0.59 0.97 1.02 1.45 5.32 2.65 2.73 4 G4 2.17 4.46 6.51 0.35 4.80 5.99 5.44 1.20 4.21 3.59 4.89 0.39 2.26 2.05 3.33 8.55 2.10 2.39 5 G5 0.99 4.83 6.71 1.43 5.83 2.95 3.66 0.50 1.54 2.55 5.70 1.83 0.39 1.44 1.66 5.36 1.32 1.19 6 G6 1.57 3.94 7.49 0.61 5.88 2.74 7.22 1.19 4.81 3.72 5.59 0.38 1.36 1.43 3.94 9.15 2.50 2.56 7 G7 2.38 4.74 9.94 1.31 5.74 5.59 9.01 0.70 4.47 3.59 4.85 3.48 2.35 1.23 3.22 11.21 2.89 3.01 8 G8 0.98 2.56 12.02 3.31 5.41 2.03 5.92 0.30 4.39 4.19 6.10 0.39 1.97 0.62 4.93 8.89 3.45 2.95 E19 E20 E21 E22 E23 E24 1 0.42 0.73 5.66 4.61 3.98 0.32 2 0.40 0.68 4.28 3.89 2.98 1.78 3 1.31 2.63 6.51 5.42 5.43 0.60 4 0.03 1.81 6.05 6.31 6.53 1.94 5 0.08 1.03 5.38 2.45 5.18 0.54 6 0.08 0.68 6.46 6.02 5.28 2.52 7 0.33 0.91 9.15 7.03 6.39 2.13 8 0.01 0.62 8.35 5.71 5.86 0.38 x - expand.grid(gen[,1],names(gen)[2:dim(gen)[2]]) x Var1 Var2 1 G1 E1 2 G2 E1 3 G3 E1 4 G4 E1 5 G5 E1 6 G6 E1 7 G7 E1 8 G8 E1 9 G1 E2 10G2 E2 11G3 E2 12G4 E2 13G5 E2 14G6 E2 15G7 E2 16G8 E2 17G1 E3 18G2 E3 19G3 E3 20G4 E3 21G5 E3 22G6 E3 23G7 E3 24G8 E3 25G1 E4 26G2 E4 27G3 E4 28G4 E4 29G5 E4 30G6 E4 31G7 E4 32G8 E4 33G1 E5 34G2 E5 35G3 E5 36G4 E5 37G5 E5 38G6 E5 39G7 E5 40G8 E5 41G1 E6 42G2 E6 43G3 E6 44G4 E6 45G5 E6 46G6 E6 47G7 E6 48G8 E6 49G1 E7 50G2 E7 51G3 E7 52G4 E7 53G5 E7 54G6 E7 55G7 E7 56G8 E7 57G1 E8 58G2 E8 59G3 E8 60G4 E8 61G5 E8 62G6 E8 63G7 E8 64G8 E8 65G1 E9 66G2 E9 67G3 E9 68G4 E9 69G5 E9 70G6 E9 71G7 E9 72G8 E9 73G1 E10 74G2 E10 75G3 E10 76G4 E10 77G5 E10 78G6 E10 79G7 E10 80G8 E10 81G1 E11 82G2 E11 83G3 E11 84G4 E11 85G5 E11 86G6 E11 87G7 E11 88G8 E11 89G1 E12 90G2 E12 91G3 E12 92G4 E12 93G5 E12 94G6 E12 95G7 E12 96G8 E12 97G1 E13 98G2 E13 99G3 E13 100 G4 E13 101 G5 E13 102 G6 E13 103 G7 E13 104 G8 E13 105 G1 E14 106 G2 E14 107 G3 E14 108 G4 E14 109 G5 E14 110 G6 E14 111 G7 E14 112 G8 E14 113 G1 E15 114 G2 E15 115 G3 E15 116 G4 E15 117 G5 E15 118 G6 E15 119 G7 E15 120 G8 E15 121 G1 E16 122 G2 E16 123 G3 E16 124 G4 E16 125 G5 E16 126 G6 E16 127 G7 E16 128 G8 E16 129 G1 E17 130 G2 E17 131 G3 E17 132 G4 E17 133 G5 E17 134 G6 E17 135 G7 E17 136 G8 E17 137 G1 E18 138 G2 E18 139 G3 E18 140 G4 E18 141 G5 E18 142 G6 E18 143 G7 E18 144 G8 E18 145 G1 E19 146 G2 E19 147 G3 E19 148 G4 E19 149 G5 E19 150 G6 E19 151 G7 E19 152 G8 E19 153 G1 E20 154 G2 E20 155 G3 E20 156 G4 E20 157 G5 E20 158 G6 E20 159 G7 E20 160 G8 E20 161 G1 E21 162 G2 E21 163 G3 E21 164 G4 E21 165 G5 E21 166 G6 E21 167 G7 E21 168 G8 E21 169 G1 E22 170 G2 E22 171 G3 E22 172 G4 E22 173 G5 E22 174 G6 E22 175 G7 E22 176 G8 E22 177 G1 E23 178 G2 E23 179 G3 E23 180 G4 E23 181 G5 E23 182 G6 E23 183 G7 E23 184 G8 E23 185 G1 E24 186 G2 E24 187 G3 E24 188 G4 E24 189 G5 E24 190 G6 E24 191 G7 E24 192 G8 E24 names(gen) - NULL genfinal - data.frame(x, unlist(gen[,2:dim(gen)[2]])) names(genfinal) - c(Genotype,category,value) genfinal$category - as.numeric(genfinal$category) head(genfinal) Genotype category value 1 G11 0.79 2 G21 0.59 3 G31 1.18 4 G41 2.17 5 G51 0.99 6 G61 1.57 library(ggplot2) ggplot(genfinal,aes(x=category,y=value,colour=Genotype)) + geom_line() Rplot02.png http://r.789695.n4.nabble.com/file/n4703089/Rplot02.png -- View this message in context: http://r.789695.n4.nabble.com/plotting-this-data-tp4702926p4703089.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
[R] Download internet videos
Hello R-helpers, It is possible donwload youtube videos with R? I made a google search and find no options to do that. Thanks in advanced, Raoni [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to solve this complex equation
On 11 Feb 2015, at 17:11 , Chel Hee Lee chl...@mail.usask.ca wrote: The functional form given in the post written by Ssuhanchen captures my eyes. It is the cumulative distribution function of Poisson when the number of counts is less than or equal to 2 with unknown parameter mu=x/2. Since it is a nonlinear function, there may be multiple solutions but the solution should be greater than 0 (if I am in the right track). I am assuming this functional form is originated from the Poisson. Under this assumption, one solution is found as below: rt - uniroot(function(x) ppois(2, lambda=x)-0.05, interval=c(0.5,1), extendInt=yes) Warning messages: 1: In ppois(2, lambda = x) : NaNs produced 2: In ppois(2, lambda = x) : NaNs produced 3: In ppois(2, lambda = x) : NaNs produced ppois(2, lambda=rt$root) [1] 0.051 rt$root [1] 6.295791 Thus, the solution x would be rt$root*2 (Note that I did not try to find other solutions). I hope this helps. Given the Poisson connection, I would pretty strongly expect the solution to be unique. Notice also that your rt$root comes out as the upper end of the confidence interval in poisson.test(2, alt=l) Exact Poisson test data: 2 time base: 1 number of events = 2, time base = 1, p-value = 0.9197 alternative hypothesis: true event rate is less than 1 95 percent confidence interval: 0.00 6.295794 sample estimates: event rate 2 Chel Hee Lee On 2/10/2015 2:29 AM, Rolf Turner wrote: On 10/02/15 14:04, Ssuhanchen wrote: Hi! I want to use R to calculate the variable x which is in a complex equation in below: 2 Σ[exp(-x/2)*(x^k)/(2^k*k!)]=0.05 k=0 how to solve this equation to get the exact x in R? Is this homework? Sure looks like it. Talk to your prof. Or do a bit of work on learning how to use R --- which is presumably the point of the exercise. cheers, Rolf Turner __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to solve this complex equation
The functional form given in the post written by Ssuhanchen captures my eyes. It is the cumulative distribution function of Poisson when the number of counts is less than or equal to 2 with unknown parameter mu=x/2. Since it is a nonlinear function, there may be multiple solutions but the solution should be greater than 0 (if I am in the right track). I am assuming this functional form is originated from the Poisson. Under this assumption, one solution is found as below: rt - uniroot(function(x) ppois(2, lambda=x)-0.05, interval=c(0.5,1), extendInt=yes) Warning messages: 1: In ppois(2, lambda = x) : NaNs produced 2: In ppois(2, lambda = x) : NaNs produced 3: In ppois(2, lambda = x) : NaNs produced ppois(2, lambda=rt$root) [1] 0.051 rt$root [1] 6.295791 Thus, the solution x would be rt$root*2 (Note that I did not try to find other solutions). I hope this helps. Chel Hee Lee On 2/10/2015 2:29 AM, Rolf Turner wrote: On 10/02/15 14:04, Ssuhanchen wrote: Hi! I want to use R to calculate the variable x which is in a complex equation in below: 2 Σ[exp(-x/2)*(x^k)/(2^k*k!)]=0.05 k=0 how to solve this equation to get the exact x in R? Is this homework? Sure looks like it. Talk to your prof. Or do a bit of work on learning how to use R --- which is presumably the point of the exercise. cheers, Rolf Turner __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Download internet videos
The set of things that R can do is not a subset of the set of things packages may have been written for. Have at it. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. On February 11, 2015 11:17:47 AM EST, Raoni Rodrigues caciquesamu...@gmail.com wrote: Hello R-helpers, It is possible donwload youtube videos with R? I made a google search and find no options to do that. Thanks in advanced, Raoni [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] prediction intervals for robust regression
I have created robust regression models using least trimmed squares and MM-regression (using the R package robustbase). I am now looking to create prediction intervals for the predicted results. While I have seen some discussion in the literature about confidence intervals on the estimates for robust regression, I haven't had much success in finding out how to create prediction intervals for the results. I was wondering if anyone would be able to provide some direction on how to create these prediction intervals in the robust regression setting. Thanks, Jonathan Burns Sr. Statistician General Dynamics Information Technology Medicare Medicaid Solutions One West Pennsylvania Avenue Baltimore, MD 21204 (410)-842-1594 jonathan.bur...@gdit.commailto:jonathan.bur...@gdit.com www.gdit.comhttp://www.gdit.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] prediction intervals for robust regression
Presumably you've checked out: http://cran.r-project.org/web/views/Robust.html If you can estimate the variance of parameter estimates, betahat, then you can estimate the variance of a predicted value, X betahat; add the estimated variance of individuals to this if that's what you're looking for (and it's not already available). Further questions should go to a statistics site like stats.stackexchange.com, as statistical questions are off topic here. Cheers, Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. Clifford Stoll On Wed, Feb 11, 2015 at 11:03 AM, Burns, Jonathan (NONUS) jonathan.bur...@gdit.com wrote: I have created robust regression models using least trimmed squares and MM-regression (using the R package robustbase). I am now looking to create prediction intervals for the predicted results. While I have seen some discussion in the literature about confidence intervals on the estimates for robust regression, I haven't had much success in finding out how to create prediction intervals for the results. I was wondering if anyone would be able to provide some direction on how to create these prediction intervals in the robust regression setting. Thanks, Jonathan Burns Sr. Statistician General Dynamics Information Technology Medicare Medicaid Solutions One West Pennsylvania Avenue Baltimore, MD 21204 (410)-842-1594 jonathan.bur...@gdit.commailto:jonathan.bur...@gdit.com www.gdit.comhttp://www.gdit.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package build help
On Sun, Feb 8, 2015 at 5:15 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 08/02/2015 4:06 PM, Glenn Schultz wrote: Hello All, I am in the final stages of building my first package BondLab and the check throughs the following warning. I think this is namespace thing. I have not done anything with the namespace at this point. I am turning my attention to the namespace now. Am I correct this can be a handled by the namespace? I would guess you have imported the lubridate and plyr packages, and also defined your own duration() and here() functions, hiding theirs. You can also see this problem if you have import(plyr) import(plyr, here) etc. Or with import(plyr) import(lubridate) since I think both provide a here() function. Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sqldf() difference between R 3.1.2 and 3.0.1
That seems to have worked, both in the new and old version of R. I'll do more unit testing on other files. Thank you, Gabor. -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Wednesday, February 11, 2015 10:22 AM To: Doran, Harold Cc: r-help@r-project.org Subject: Re: [R] sqldf() difference between R 3.1.2 and 3.0.1 On Wed, Feb 11, 2015 at 9:45 AM, Doran, Harold hdo...@air.org wrote: I have a function written and tested using R 3.0.1 and sqldf_0.4-7.1 that works perfectly. However, using this same code with R 3.1.2 and sqldf_0.4-10 yields the error below that I am having a difficult time deciphering. Hence, same code behaves differently on different versions of R and sqldf(). Error in sqliteSendQuery(con, statement, bind.data) : error in statement: no such column: V1 Reproducible example below as well as complete sessionInfo all provided below. My function and code using the function are below. dorReader - function(dorFile, layout, sepChar = '\n'){ sepChar - as.character(sepChar) dorFile - as.character(dorFile) layout$type2 - ifelse(layout$type == 'C', 'character', ifelse(layout$type == 'N', 'numeric', 'Date')) dor - file(dorFile) attr(dor, file.format) - list(sep = sepChar) getVars - paste(select, paste(substr(V1, , layout$Start, , , layout$Length, ) ', layout$Variable.Name, ', collapse = , ), from dor) dat - sqldf(getVars) classConverter - function(obj, types){ out - lapply(1:length(obj),FUN = function(i){FUN1 - switch(types[i],character = as.character,numeric = as.numeric,factor = as.factor, Date = as.character); FUN1(obj[,i])}) names(out) - colnames(obj) as.data.frame(out) } dat - classConverter(dat, layout$type2) names(dat) - layout$Variable.Name dat } ### contents of fwf file 'sample.txt' 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 layout - data.frame(Variable.Name =c('test1', 'test2'), Length = c(3,4), Start =c(1,4), End = c(3,7), type = c('N', 'N')) tmp - dorReader('sample.txt', layout) sqldf is documented to use the sqliteImportFile defaults for file.format components. It may be that RSQLite 1.0 has changed the default for header in sqliteImportFile. Try replacing your statement that sets file.format with this: attr(dor, file.format) - list(sep = sepChar, header = FALSE) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sqldf() difference between R 3.1.2 and 3.0.1
On Wed, Feb 11, 2015 at 9:45 AM, Doran, Harold hdo...@air.org wrote: I have a function written and tested using R 3.0.1 and sqldf_0.4-7.1 that works perfectly. However, using this same code with R 3.1.2 and sqldf_0.4-10 yields the error below that I am having a difficult time deciphering. Hence, same code behaves differently on different versions of R and sqldf(). Error in sqliteSendQuery(con, statement, bind.data) : error in statement: no such column: V1 Reproducible example below as well as complete sessionInfo all provided below. My function and code using the function are below. dorReader - function(dorFile, layout, sepChar = '\n'){ sepChar - as.character(sepChar) dorFile - as.character(dorFile) layout$type2 - ifelse(layout$type == 'C', 'character', ifelse(layout$type == 'N', 'numeric', 'Date')) dor - file(dorFile) attr(dor, file.format) - list(sep = sepChar) getVars - paste(select, paste(substr(V1, , layout$Start, , , layout$Length, ) ', layout$Variable.Name, ', collapse = , ), from dor) dat - sqldf(getVars) classConverter - function(obj, types){ out - lapply(1:length(obj),FUN = function(i){FUN1 - switch(types[i],character = as.character,numeric = as.numeric,factor = as.factor, Date = as.character); FUN1(obj[,i])}) names(out) - colnames(obj) as.data.frame(out) } dat - classConverter(dat, layout$type2) names(dat) - layout$Variable.Name dat } ### contents of fwf file 'sample.txt' 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 layout - data.frame(Variable.Name =c('test1', 'test2'), Length = c(3,4), Start =c(1,4), End = c(3,7), type = c('N', 'N')) tmp - dorReader('sample.txt', layout) sqldf is documented to use the sqliteImportFile defaults for file.format components. It may be that RSQLite 1.0 has changed the default for header in sqliteImportFile. Try replacing your statement that sets file.format with this: attr(dor, file.format) - list(sep = sepChar, header = FALSE) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Rotate array by 90°
Thanks Lee and Huang. That is useful. Best On Fri, Feb 6, 2015 at 1:53 AM, Chel Hee Lee chl...@mail.usask.ca wrote: lapply(1:2, function(x) t(A[rev(1:3),,x])) [[1]] [,1] [,2] [,3] [1,] g d a [2,] h e b [3,] i f c [[2]] [,1] [,2] [,3] [1,] p m j [2,] q n k [3,] r o l Is this what you are looking for? I hope this helps. Chel Hee Lee On 02/05/2015 03:17 PM, Karim Mezhoud wrote: Dear aal, Is there a way to rotate array or a cube of matrices by Y axis? MatLab example: A = cat(3,{'a' 'b' 'c';'d' 'e' 'f';'g' 'h' 'i'},{'j' 'k' 'l';'m' 'n' 'o';'p' 'q' 'r'}) A(:,:,1) = 'a''b''c' 'd''e''f' 'g''h''i' A(:,:,2) = 'j''k''l' 'm''n''o' 'p''q''r' Rotate the cell array by 270 degrees. B = rot90(A,3) B(:,:,1) = 'g''d''a' 'h''e''b' 'i''f''c' B(:,:,2) = 'p''m''j' 'q''n''k' 'r''o''l' karim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/ posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] factor levels numeric values
Hi, Suppose your data frame is called data and the name of the factor column is named tobeConverted. I have tried this and it worked. Hope this helps. as.numeric(as.character(data$tobeConverted)) -- View this message in context: http://r.789695.n4.nabble.com/factor-levels-numeric-values-tp4699515p4703090.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] library(Rcmdr) sh: otool: command not found
Dear John, Dear Dan, Many thanks for your response. It works perfectly. Best, - Mail original - De : John Fox j...@mcmaster.ca À : 'varin sacha' varinsa...@yahoo.fr Cc : 'r-help@R-project.org' r-help@r-project.org Envoyé le : Mercredi 11 février 2015 3h12 Objet : RE: [R] library(Rcmdr) sh: otool: command not found Dear varin sacha, This problem presumably will also arise if you try to load the tcltk package directly via library(tcltk) and has been discussed before on the R-SIG-Mac email list, for example at https://stat.ethz.ch/pipermail/r-sig-mac/2014-December/011260.html. The problem is fixed in the patched version of R 3.1.2 for Mac OS X, available at http://r.research.att.com/. I hope this helps, John --- John Fox, Professor McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ -Original Message- From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of varin sacha Sent: February-10-15 6:17 PM To: r-help@R-project.org Subject: [R] library(Rcmdr) sh: otool: command not found Hi R experts, I have just updated R and RStudio. I am running OS X 10.6.8 R version 3.1.2 (2014-10-31) -- Pumpkin Helmet Copyright (C) 2014 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin10.8.0 (64-bit) RStudio Version 0.98.1102 – © 2009-2014 RStudio, Inc. Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.59.10 (KHTML, like Gecko) Everything is ok for installing the Rcmdr package. install.packages(Rcmdr) Installing package into ‘/Users/Caro/Library/R/3.1/library’ (as ‘lib’ is unspecified) essai de l'URL 'http://cran.rstudio.com/bin/macosx/contrib/3.1/Rcmdr_2.1- 6.tgz' Content type 'application/x-gzip' length 5340999 bytes (5.1 Mb) URL ouverte == downloaded 5.1 Mb The downloaded binary packages are in /var/folders/V2/V26Pbtc-GsSUTVPWEj6Bdk+++TI/-Tmp- //RtmpgNlyXf/downloaded_packages But once I write library(Rcmdr), it doesn't work. library(Rcmdr) Le chargement a nécessité le package : splines Le chargement a nécessité le package : RcmdrMisc Le chargement a nécessité le package : car Le chargement a nécessité le package : sandwich Error : .onLoad a échoué dans loadNamespace() pour 'tcltk', détails : appel : system2(otool, c(-L, shQuote(DLL)), stdout = TRUE) erreur : erreur lors de l'exécution d'une commande Erreur : le chargement du package ou de l'espace de noms a échoué pour ‘Rcmdr’ sh: otool: command not found Has somebody any idea of how I can solve this problem ? Thanks, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. --- This email has been checked for viruses by Avast antivirus software. http://www.avast.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sqldf() difference between R 3.1.2 and 3.0.1
I have a function written and tested using R 3.0.1 and sqldf_0.4-7.1 that works perfectly. However, using this same code with R 3.1.2 and sqldf_0.4-10 yields the error below that I am having a difficult time deciphering. Hence, same code behaves differently on different versions of R and sqldf(). Error in sqliteSendQuery(con, statement, bind.data) : error in statement: no such column: V1 Reproducible example below as well as complete sessionInfo all provided below. My function and code using the function are below. dorReader - function(dorFile, layout, sepChar = '\n'){ sepChar - as.character(sepChar) dorFile - as.character(dorFile) layout$type2 - ifelse(layout$type == 'C', 'character', ifelse(layout$type == 'N', 'numeric', 'Date')) dor - file(dorFile) attr(dor, file.format) - list(sep = sepChar) getVars - paste(select, paste(substr(V1, , layout$Start, , , layout$Length, ) ', layout$Variable.Name, ', collapse = , ), from dor) dat - sqldf(getVars) classConverter - function(obj, types){ out - lapply(1:length(obj),FUN = function(i){FUN1 - switch(types[i],character = as.character,numeric = as.numeric,factor = as.factor, Date = as.character); FUN1(obj[,i])}) names(out) - colnames(obj) as.data.frame(out) } dat - classConverter(dat, layout$type2) names(dat) - layout$Variable.Name dat } ### contents of fwf file 'sample.txt' 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 layout - data.frame(Variable.Name =c('test1', 'test2'), Length = c(3,4), Start =c(1,4), End = c(3,7), type = c('N', 'N')) tmp - dorReader('sample.txt', layout) ### SessionInfo where functions behaves as expected sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] tcltk stats graphics grDevices utils datasets methods base other attached packages: [1] sqldf_0.4-7.1 RSQLite.extfuns_0.0.1 RSQLite_0.11.4DBI_0.2-7 gsubfn_0.6-5 [6] proto_0.3-10 MiscPsycho_1.6statmod_1.4.18 loaded via a namespace (and not attached): [1] chron_2.3-45 tools_3.0.1 ### SessionInfo for version not working sessionInfo() R version 3.1.2 (2014-10-31) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] tcltk stats graphics grDevices utils datasets methods base other attached packages: [1] sqldf_0.4-10 RSQLite_1.0.0 DBI_0.3.1 gsubfn_0.6-6 proto_0.3-10 loaded via a namespace (and not attached): [1] chron_2.3-45 tools_3.1.2 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MApply and SubStr
That's exactly what I wanted, thank you very much! My intent was to perform the SubStr operation first, and then apply the types to the columns. I wasn't expecting the two types in the same column. I appreciate your response! On Tue, Feb 10, 2015 at 5:03 PM, David Winsemius dwinsem...@comcast.net wrote: On Feb 10, 2015, at 3:58 PM, Brian Trautman wrote: Hi! I'm trying to write a custom function that applies SubStr to a string, and then depending on the arguments, converts the output to a number. The substring part of my code works fine, but it's not converting the way I want to -- options('stringsAsFactors'=FALSE) require(data.table) substr_typeswitch - function(x, start, stop, typeto='chr') { tmpvar - substr(x=x, start=start, stop=stop) tmpvar - switch(typeto, num=as.numeric(tmpvar), tmpvar) return(tmpvar) } startpos - c(01, 03) endpos - c(02, 04) typelist - c('chr', 'num') startdata - as.data.table(c('aa01', 'bb02')) enddata_want - as.data.table(mapply(substr_typeswitch, startdata, startpos, endpos, typelist)) If I examine enddata_want -- str(enddata_want) Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables: $ V1: chr aa bb $ NA: chr 1 2 - attr(*, .internal.selfref)=externalptr 1 and 2 are being stored as character, and not as number. It appears from you code that you might be expecting a vector in a dataframe object to have a character mode in the first postition and a numeric mode in the second position. That wouldn't seem to be a reasonable expectation. But maybe you were hoping the chr and num types were to be applied to columns. I was surprised to get something different from as.data.table: str(enddata_want) Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables: $ V1: Factor w/ 2 levels aa,bb: 1 2 $ NA: Factor w/ 2 levels 1,2: 1 2 - attr(*, .internal.selfref)=externalptr The mapply operation made a matrix which forces all values to be the same mode: str( mapply(substr_typeswitch, startdata, + startpos, endpos, typelist) ) chr [1:2, 1:2] aa bb 1 2 - attr(*, dimnames)=List of 2 ..$ : NULL ..$ : chr [1:2] V1 NA You might have gotten something less homogeneous if you added the SIMPLIFY argument: str( mapply(substr_typeswitch, startdata, + startpos, endpos, typelist, SIMPLIFY=FALSE) ) List of 2 $ V1: chr [1:2] aa bb $ NA: num [1:2] 1 2 Can anyone help me understand what I'm doing wrong? Thank you! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to read this Rpart decision tree?
Hi Kim fancyRpartPlot is a front-end to prp, and you can pass it all of the prp options - it says this in the help for fancyRpartPlot, and that's about all it says. So you need to spend some time reading about prp options, and how to customize your plot to get what you want. There are lots of detailed resources; here's one to get you started. http://www.milbo.org/rpart-plot/prp.pdf Sarah On Wed, Feb 11, 2015 at 3:02 AM, Kim C. minorthre...@hotmail.com wrote: Hi all, In the attachment or this link (http://oi58.tinypic.com/35ic9qc.jpg) you'll find the decision tree I made. I used the Rpart package to make the tree and the rattle package using the fancyRpartPlot to plot it. The data in the tree looks different than about every example I have seen before. I don't understand how I should read it. I want to predict Product (which are productkeys). The variables to predict it contain age, incomegroup, gender, totalchildren, education, occupation, houseownerflag, numberCars.It looks like the upper number is a ProductKey. The n is number of observations? And the percentage of the yes/no question below. This is the code I used. ss.rpart1 - rpart(Product ~ ., data=sstrain, control=rpart.control(minbucket=2,minsplit=1, cp=-1)) spt - which.min(ss.rpart1$cptable[, xerror]) scp - ss.rpart1$cptable[opt, CP] ss.rpart2 - prune(ss.rpart1, cp=cp) fancyRpartPlot(ss.rpart2) So why does the tree looks so different from the most (for example: http://media.tumblr.com/a9f482ff88b0b9cfaffca7ffd46c6a8e/tumblr_inline_mz7pyuaYJQ1s5wtly.png). This is from Trevor Stephen's TItanic tutorial. The first node show that 62% of 100% doesn't survive. If they were male, only 19% of them were survivors. I find that a lot examples look like that. Why does mine predict per ProductKey and every node it has something else. it doesn't make sense to me. And it doesn't have the two numbers like .62 and .38 but it has n=197e+3. So should I read the first node like For 100% of the observations of ProductKey 1074, the incomegroup was moderate)? Thank you! Kim -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help in anova
I suspect that your treatment levels are perfectly nested inside the soilgroup levels. If this were the case you can either use one of the factor in your analysis or the other, depending on what you want to analyze. Il giorno Wed, 11 Feb 2015 14:47:18 + Mir Salam mir.sa...@uef.fi ha scritto: Dear all, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help in anova
Dear all, I have follwing factors in my data set (G) names(G) relative ht, biomass, treatment, variety, soil group relative height biomass treatment (4 levels)- control+lime, control+ash, contaminated soil+lime, contaminated soil+ash variety (4 levels)- Salix swherinii, Salix mursinifolia, Klara, Karin soilgroup (2 level)- control and contaminated if I fit aov function this is ok anova1-aov(ht~treatment*variety,data=G) summary(anova1) R- output Df Sum SqMean Sq F value Pr(F) treatment 314015 4672 65.021 2e-16 *** variety3 5883 196127.295 2.95e-12 *** treatment:variety9 5474 608 8.465 8.93e-09 *** Residuals 80 5748 72 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 above I can see my treatment and variety interaction effect anova3-aov(ht~treatment*variety*soilgroup,data=G) summary(anova3) R-ouput Df Sum SqMean SqF value Pr(F) treatment314015 4672 65.021 2e-16 *** variety 3 5883 1961 27.295 2.95e-12 *** treatment:variety 9 5474 608 8.465 8.93e-09 *** soilgroup -missing treatment:variety:soilgroup - missing Here, I am missing value of soil group and interation of treatment: variety: soilgroup- Can any one please tell me why? Thanks for help Best regards Salam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maxent does not work
Is there any package that can replace maxent for a large number of independent variables? Thanks, Ken Sent from my iPhone On Feb 2, 2015, at 6:07 PM, Kenneth Z kenneth...@gmail.com wrote: Yesterday I installed the most recent R and maxent package, but it stopped working. Even a simple command like Model - maxent(matrix(c(1,2,3,4,5,6,7,8), nrow=2, ncol=4),c(1,-1)) will cause a fatal error in R. I am attaching a screenshot to this email. Any help will be appreciated. Best regards, Ken On Dec 12, 2012, at 4:13 PM, Kenneth Z kenneth...@gmail.com wrote: I found that the optim() function does not always reach the real minimum. Is it because the solution is trapped at a local minimum? Thanks! Ken On Dec 12, 2012, at 2:17 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: ... origin pro? Then why are you here? It is not clear from your message that this has anything to do with R. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. vishal katoch vkatoch...@yahoo.co.in wrote: Hello, i am working in origin pro, i want to plot a graph as like a pdf attached but with black and white lines. here radial axis varies from 0 to 1. and angular axis from 0 degree to 60 degree.and third axis which is depend on both radial and axial gives non intersecting lines. how can i read the data from plot for replot. vikas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mixed-effects model for pre-post randomization design
Marco Barbàra jabbba at gmail.com writes: DeaR userRs, I recently read this Liang-Zeger article: http://sankhya.isical.ac.in/search/62b1/fpaper7.html in which (among other things) they adopt a random intercept model for pre-post designed trials, using a conditional likelihood approach (I didn't think it possible with only two measurements per subject) I'm trying to figure out (if and) how it is possible to reproduce straightforwardly their model using R standard mixed model tools, but I cannot even try to reproduce their work, since they used a non-available dataset (I found an extract on prof. Diggle's web site where it is explicitly reported to be confidential), so I have to review a bit of likelihood theory along with some implementation details. In the meantime, I wonder if anyone here could point out any related documentation to me. This might get more attention on r-sig-mixed-mod...@r-project.org. I took a quick look at the paper, but it's not a case where the answer is immediately obvious. The paper of reference for lme4 (see http://cran.r-project.org/web/packages/lme4/citation.html ) gives technical details of lme4's implementation, in case that's useful. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Nonlinear integer programming question
I am seeking an optimization routine that can deal with the following problem: Maximize g(x), where x is a vector and g is nonlinear, subject to linear constraints of the form h(x)0 and m(x)=0 and subject to the constraint that all values of x are 0 or 1. I can't find a nonlinear optimization program in R that states that it can accommodate 0-1 constraints. Oddly, Excel's Solver will produce a solution to such problems but (1) I don't trust it and (2) it cannot handle a large number of constraints. Thanks, all! Rebecca Zwick (Santa Barbara, California) Statistical Analysis, Data Analysis, and Psychometric Research Educational Testing Service This e-mail and any files transmitted with it may contain privileged or confidential information. It is solely for use by the individual for whom it is intended, even if addressed incorrectly. If you received this e-mail in error, please notify the sender; do not disclose, copy, distribute, or take any action in reliance on the contents of this information; and delete it from your system. Any other use of this e-mail is prohibited. Thank you for your compliance. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Impute time-series data, perhaps with a Kalman filter - do you know of any R code?
Hi, searching for kalman filter R gives this paper that was published in JSS. It may help. @article{Tusell:2010:JSSOBK:v39i02, author = Fernando Tusell, title = Kalman Filtering in R, journal = Journal of Statistical Software, volume = 39, number = 2, pages = 1--27, day = 1, month = 3, year =2011, CODEN = JSSOBK, ISSN =1548-7660, bibdate = 2010-08-17, URL = http://www.jstatsoft.org/v39/i02;, accepted =2010-08-17, acknowledgement = , keywords =, submitted = 2010-01-12, } 15:23:00 -0500 John Sorkin jsor...@grecc.umaryland.edu wrote: Does anyone have code that uses a Kalman filter to impute time-series data? If not, do you know of any software that can be used to impute time-series data? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for ...{{dropped:28}} __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AR1 covariance structure for lme object and R/SAS differences in model output
On 11 Feb 2015, at 16:57 , anord andreas.n...@biol.lu.se wrote: Dear R users, We are working on a data set in which we have measured repeatedly a physiological response variable (y) every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to one of five groups ('group'; 'A' to 'E'). Data are located at: https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0 We are interested to model if the response in y differences with time (i.e. 'x') for the two groups. Thus: require(nlme) m1-lme(y~group*x+group*I(x^2),random=~x|id,data=data.df,na.action=na.omit) But because data are collected repeatedly over short time intervals for each subject, it seemed prudent to consider an autoregressive covariance structure. Thus: m2-update(m1,~.,corr=corCAR1(form=~x|id)) AIC values indicate the latter (i.e. m2) as more appropriate: anova(m1,m2) # Model df AIC BIC logLikTest L.Ratio p-value #m1 1 19 2155.996 2260.767 -1058.9981 #m2 2 20 2021.944 2132.229 -990.9718 1 vs 2 136.0525 .0001 Fixed effects and test statistics differ between models. A look at marginal ANOVA tables suggest inference might differ somewhat between models: anova.lme(m1,type=m) # numDF denDF F-value p-value #(Intercept) 1 1789 63384.80 .0001 #group 445 1.29 0.2893 #x 1 1789 0.05 0.8226 #I(x^2)1 1789 4.02 0.0451 #group:x 4 1789 2.61 0.0341 #group:I(x^2) 4 1789 4.37 0.0016 anova.lme(m2,type=m) # numDF denDF F-value p-value #(Intercept) 1 1789 59395.79 .0001 #group 445 1.33 0.2725 #x 1 1789 0.04 0.8379 #I(x^2)1 1789 2.28 0.1312 #group:x 4 1789 2.09 0.0802 #group:I(x^2) 4 1789 2.81 0.0244 Now, this is all well. But: my colleagues have been running the same data set using PROC MIXED in SAS and come up with substantially different results when comparing SAS default covariance structure (variance components) and AR1. Specifically, there is virtually no change in either test statistics or fitted values when using AR1 instead of Variance Components in SAS, which fits the observation that AIC values (in SAS) indicate both covariance structures fit data equally well. This is not very satisfactory to me, and I would be interesting to know what is happening here. Realizing this might not be the correct forum for this question, I would like to ask you all if anyone would have any input as to what is going on here, e.g. am I setting up my model erroneously, etc.? N.b. I have no desire to replicate SAS results, but I would most certainly be interested to know what could possibly explain such a large discrepancy between the two platforms. Any suggestions greatly welcomed. (Data are located at: https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0) Hmm, does SAS include a nugget effect perchance? At any rate, showing the SAS output (or some of it) might make it easier for someone to help. Also, R-sig-ME is a better choice for discussions of mixed effects models. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem Solved: optim fails when using arima
I am using arima(x, order=c(p,0,q)) function for my project, which deals with a set of large differenced time series data, data size varies from 8000 to 7. I checked their stationarity before applying arima. Occasionally, arima(x, order=c(p,0,q)) gives me error like following (which stops script running): Error in optim(init[mask], armafn, method = optim.method, Hessian = TRUE, : non-finite finite-difference value [16] The last [16] would change anyting from 1 to 16. Using argument method=CSS, or ML, or default did not help. I am using the newest R version 3.1.2 for windows 7. I have done a lot of research on internet for this Error Message, and tried a lot of suggested solutions too. But the results are negative. Then, finally, I used following line which solved my problem. arima(x, order=c(p,0,q), optim.method=Nelder-Mead) Hope this helps others with similar situations. Shuxiang Albert Li [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Impute time-series data, perhaps with a Kalman filter - do you know of any R code?
Does anyone have code that uses a Kalman filter to impute time-series data? If not, do you know of any software that can be used to impute time-series data? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read.table with missing data and consecutive delimiters
Hello, You're missing a dollar sign: 2$$$5, not 2$$5. Hope this helps, Rui Barradas Em 11-02-2015 14:53, Tim Victor escreveu: All, Assume we have data in an ASCII file that looks like Var1$Var2$Var3$Var4 1$2$3$4 2$$5 $$$6 When I execute read.table( 'test.dat', header=TRUE, sep='$' ) I, of course, receive the following error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 2 did not have 4 elements When I set fill=TRUE, e.g., read.table( 'test.dat', header=TRUE, sep='$', fill=TRUE ) I get: Var1 Var2 Var3 Var4 11234 22 NA5 NA 3 NA NA NA6 What I need is Var1 Var2 Var3 Var4 11234 22 NA NA5 3 NA NA NA6 What am I missing? Thanks, Tim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] prediction intervals for robust regression
On 11/02/2015 19:38, Bert Gunter wrote: Presumably you've checked out: http://cran.r-project.org/web/views/Robust.html If you can estimate the variance of parameter estimates, betahat, then you can estimate the variance of a predicted value, X betahat; add the estimated variance of individuals to this if that's what you're looking for (and it's not already available). But that's not too much use without some idea of the error distribution, and using robust statistics assumes it is non-normal, long-tailed. And it is unusual to have enough data to estimate the tail behaviour of such a distribution. It might be better to do this with a parametric model with a long-tailed error distribution, especially if there is evidence (e.g. from other samples) about the latter. Further questions should go to a statistics site like stats.stackexchange.com, as statistical questions are off topic here. Agreed. Cheers, Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. Clifford Stoll On Wed, Feb 11, 2015 at 11:03 AM, Burns, Jonathan (NONUS) jonathan.bur...@gdit.com wrote: I have created robust regression models using least trimmed squares and MM-regression (using the R package robustbase). I am now looking to create prediction intervals for the predicted results. While I have seen some discussion in the literature about confidence intervals on the estimates for robust regression, I haven't had much success in finding out how to create prediction intervals for the results. I was wondering if anyone would be able to provide some direction on how to create these prediction intervals in the robust regression setting. Thanks, Jonathan Burns Sr. Statistician General Dynamics Information Technology Medicare Medicaid Solutions One West Pennsylvania Avenue Baltimore, MD 21204 (410)-842-1594 jonathan.bur...@gdit.commailto:jonathan.bur...@gdit.com www.gdit.comhttp://www.gdit.com/ -- Brian D. Ripley, rip...@stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] read.table with missing data and consecutive delimiters
All, Assume we have data in an ASCII file that looks like Var1$Var2$Var3$Var4 1$2$3$4 2$$5 $$$6 When I execute read.table( 'test.dat', header=TRUE, sep='$' ) I, of course, receive the following error: Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 2 did not have 4 elements When I set fill=TRUE, e.g., read.table( 'test.dat', header=TRUE, sep='$', fill=TRUE ) I get: Var1 Var2 Var3 Var4 11234 22 NA5 NA 3 NA NA NA6 What I need is Var1 Var2 Var3 Var4 11234 22 NA NA5 3 NA NA NA6 What am I missing? Thanks, Tim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] $ operator is invalid for atomic vectors
Hi, If x is a data frame, then x$getmean will try to get the vector named getmean in x. You put () after x$getmean. I think r is confused about it. It appears that you want to call a function named getmean(). -- View this message in context: http://r.789695.n4.nabble.com/operator-is-invalid-for-atomic-vectors-tp4703112p4703114.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] AR1 covariance structure for lme object and R/SAS differences in model output
Dear R users, We are working on a data set in which we have measured repeatedly a physiological response variable (y) every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to one of five groups ('group'; 'A' to 'E'). Data are located at: https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0 We are interested to model if the response in y differences with time (i.e. 'x') for the two groups. Thus: require(nlme) m1-lme(y~group*x+group*I(x^2),random=~x|id,data=data.df,na.action=na.omit) But because data are collected repeatedly over short time intervals for each subject, it seemed prudent to consider an autoregressive covariance structure. Thus: m2-update(m1,~.,corr=corCAR1(form=~x|id)) AIC values indicate the latter (i.e. m2) as more appropriate: anova(m1,m2) # Model df AIC BIC logLikTest L.Ratio p-value #m1 1 19 2155.996 2260.767 -1058.9981 #m2 2 20 2021.944 2132.229 -990.9718 1 vs 2 136.0525 .0001 Fixed effects and test statistics differ between models. A look at marginal ANOVA tables suggest inference might differ somewhat between models: anova.lme(m1,type=m) # numDF denDF F-value p-value #(Intercept) 1 1789 63384.80 .0001 #group 445 1.29 0.2893 #x 1 1789 0.05 0.8226 #I(x^2)1 1789 4.02 0.0451 #group:x 4 1789 2.61 0.0341 #group:I(x^2) 4 1789 4.37 0.0016 anova.lme(m2,type=m) # numDF denDF F-value p-value #(Intercept) 1 1789 59395.79 .0001 #group 445 1.33 0.2725 #x 1 1789 0.04 0.8379 #I(x^2)1 1789 2.28 0.1312 #group:x 4 1789 2.09 0.0802 #group:I(x^2) 4 1789 2.81 0.0244 Now, this is all well. But: my colleagues have been running the same data set using PROC MIXED in SAS and come up with substantially different results when comparing SAS default covariance structure (variance components) and AR1. Specifically, there is virtually no change in either test statistics or fitted values when using AR1 instead of Variance Components in SAS, which fits the observation that AIC values (in SAS) indicate both covariance structures fit data equally well. This is not very satisfactory to me, and I would be interesting to know what is happening here. Realizing this might not be the correct forum for this question, I would like to ask you all if anyone would have any input as to what is going on here, e.g. am I setting up my model erroneously, etc.? N.b. I have no desire to replicate SAS results, but I would most certainly be interested to know what could possibly explain such a large discrepancy between the two platforms. Any suggestions greatly welcomed. (Data are located at: https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0) With all best wishes, Andreas -- View this message in context: http://r.789695.n4.nabble.com/AR1-covariance-structure-for-lme-object-and-R-SAS-differences-in-model-output-tp4703103.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Download internet videos
Hmmm. Well, a youtube video is a file. Therefore search for R download file and you will find the download.file() function. -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 2/11/15, 8:17 AM, Raoni Rodrigues caciquesamu...@gmail.com wrote: Hello R-helpers, It is possible donwload youtube videos with R? I made a google search and find no options to do that. Thanks in advanced, Raoni [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] suggestion for optimal plotting to show significant differences
Petr, My first attempt is to use the simple=TRUE argument to interaction2wt. Then the bwplots in the item|item panel show the behavior of value over day for each item. You get a plot similar to this panel with the growth curve plots from nlme, for example, bwplot(value ~ day | item, data=test, horizontal=FALSE) I am treating set as a replication and each box is cumulated over the three sets. My analysis question is about day. You have it as numeric. My inclination would be to make day factor. Then you could model the interaction of day and item. Rich On Mon, Feb 9, 2015 at 6:01 AM, PIKAL Petr petr.pi...@precheza.cz wrote: Hallo Richard. I tried your suggestion but it seems to be no better than simple ggplot. Let me extend the example a bit to 8 items which is more realistic. item-rep(letters[1:8], each=18) day-rep((0:5)*100, 24) set-rep(rep(1:3, each=6), 8) test-data.frame(item, day, set) set.seed(111) test$value-(test$day/100+1)+rnorm(144) test$value-test$value+(as.numeric(test$item)*1.3) Value is increasing during time (day) for each tested subject (item), each item is measured 3 times (set) each day. Here is some graph p-ggplot(test, aes(x=day, y=value, colour=item)) p+geom_point()+stat_smooth(method=lm, formula= y~poly(x,2)) I can do lm or aov, however I am not sure about proper formula. fit-lm(value~day, data=test) summary(fit) # this shows that value is increasing with day fit-lm(value~day/item, data=test) summary(fit) # this suggests that value is decreasing with day (which is wrong) fit-lm(value~day*item, data=test) summary(fit) # and this tells me that value is increasing with day and items have different intercepts but the same rate of growth (I hope I got it right). I do not have your book available but I went through help pages. Your interaction graph is not much better than ggplot. I can do interaction2wt(value ~ item * day, data=test) which probably is closer to actual problem. The basic problem is that increase of value with days is in fact not linear and actually it can increase in the beginning and then stagnate or it can stagnate in beginning and then increase. I am not aware of any way how to compare time behaviour of different items in such situations if I cannot state some common formula in which case I would use probably nlme. Thank for your insight, I try to go through it more deeply. Best regards Petr -Original Message- From: Richard M. Heiberger [mailto:r...@temple.edu] Sent: Friday, February 06, 2015 6:14 PM To: PIKAL Petr Cc: r-help@r-project.org Subject: Re: [R] suggestion for optimal plotting to show significant differences I would try one of these illustrations for starts. interaction2wt (two-way tables) is designed to be used with aov() for testing. interaction2wt shows all main effects and all two-way interactions for many factors. test - structure(list(item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c(A, B), class = factor), day = c(0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L), set = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), value = c(1.08163365169503, 2.61998412608805, 3.07820466606394, 4.44993419381934, 5.29163171545805, 6.29155990999293, -0.123163011367676, 2.07767236834003, 2.32537052874901, 3.09372794501084, 6.65273721166635, 5.92304962329131, 1.50504697705548, 2.66253728086866, 2.63420157418685, 2.78195098580416, 6.47578642973288, 5.89587443775143, 0.848864231485078, 1.27549677119713, 2.19573089053609, 2.45659926134292, 5.15424403414103, 5.4813151140983, 1.25731482647214, 2.09662105167973, 1.75954023316977, 4.81624002288939, 4.65029189325307, 6.39946904227214, 0.944996929887344, 1.74667265331284, 2.42956264345558, 5.17852980415141, 3.5453435965834, 6.9011238437191)), .Names = c(item, day, set, value), row.names = c(NA, -36L), class = data.frame) library(HH) test$set - factor(test$set) test$day - factor(test$day) test$item - factor(test$item) interaction2wt(value ~ item * day * set, data=test) test$item.day - interaction(test$item, test$day) position(test$item.day) - outer(c(-10,10), as.numeric(levels(test$day)), `+`) xyplot(value ~ as.position(item.day) | set, groups=item, data=test, horizontal=FALSE, pch=c(17,16), xlab=day, scales=list( x=list( alternating=1, at=levels(test$day), ## placement of tick labels and marks tck=1)), key=list( text=list(c(A,B), col=c(blue,red)), points=list(pch=c(17, 16), col=c(blue,red)), space=top,
Re: [R] Nonlinear integer programming question
Hi Rebecca, It will be very helpful if you can provide a set of specific functions g(x), h(x) and m(x). -- View this message in context: http://r.789695.n4.nabble.com/Nonlinear-integer-programming-question-tp4703122p4703128.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] piecewise regression and lm() question
I was playing with some examples of piecewise regression using lm() and have come across a behavior I'm uncertain about. Below is simple 3-segment dataset. I compare predicted output of a model created by one call to lm() to that of 3 models created by 3 calls to lm(). In case A and B, the results are the same. However, in case C the results differ for the middle segment. Is the output of lm() for case C to be expected or not and if so, why? Thank you, Jill set.seed(133) y - c(21:30, rep(15,10), 10:1) + runif(30, -2, 2) x - 1:30 # Case A: 3 segments, each fit with a constant value plot(x, y) # 3 different lm fits p.c3 - c(predict(lm(y[1:10]~1)), predict(lm(y[11:20]~1)), predict(lm(y[21:30]~1))) lines(x, p.c3, col=red) # piecewise via lm p.lm1 - predict(lm(y ~ I(x=10) + I(x10 x=20) + I(x20))) lines(x, p.lm1, col=blue) max(abs(p.c3-p.lm1)) # Case B: 3 segments, each fit with a line plot(x, y) # 3 different lm fits p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~x[11:20])), predict(lm(y[21:30]~x[21:30]))) lines(x, p.c3, col=red) # piecewise via lm p.lm1 - predict(lm(y ~ (x=10)*x + (x10 x=20)*x + (x20)*x)) lines(x, p.lm1, col=blue) max(abs(p.c3-p.lm1)) # Cast C: 3 segments - the middle fit with a constant value, the outer by a line plot(x, y) # 3 different lm fits p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~1)), predict(lm(y[21:30]~x[21:30]))) lines(x, p.c3, col=red) # piecewise via lm p.lm1 - predict(lm(y ~ (x=10)*x + I(x10 x=20) + (x20)*x)) lines(x, p.lm1, col=blue) max(abs(p.c3-p.lm1)) ## the single call to lm does not have a constant value fit to the middle section plot(x, p.c3-p.lm1 ) The information contained in this transmission may contain confidential information. If the reader of this message is not the intended recipient, you are hereby notified that any review, dissemination, distribution or duplication of this communication is strictly prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AR1 covariance structure for lme object and R/SAS differences in model output
anord andreas.nord at biol.lu.se writes: [snip snip] We are working on a data set in which we have measured repeatedly a physiological response variable (y) every 20 min for 12 h (time variable; 'x') in subjects ('id') beloning to one of five groups ('group'; 'A' to 'E'). Data are located at: https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0 We are interested to model if the response in y differences with time (i.e. 'x') for the two groups. Thus: require(nlme) m1-lme(y~group*x+group*I(x^2),random=~x|id, data=data.df,na.action=na.omit) But because data are collected repeatedly over short time intervals for each subject, it seemed prudent to consider an autoregressive covariance structure. Thus: m2-update(m1,~.,corr=corCAR1(form=~x|id)) AIC values indicate the latter (i.e. m2) as more appropriate: anova(m1,m2) # Model df AIC BIC logLikTest L.Ratio p-value #m1 1 19 2155.996 2260.767 -1058.9981 #m2 2 20 2021.944 2132.229 -990.9718 1 vs 2 136.0525 .0001 Fixed effects and test statistics differ between models. A look at marginal ANOVA tables suggest inference might differ somewhat between models: anova.lme(m1,type=m) # numDF denDF F-value p-value #(Intercept) 1 1789 63384.80 .0001 #group 445 1.29 0.2893 #x 1 1789 0.05 0.8226 #I(x^2)1 1789 4.02 0.0451 #group:x 4 1789 2.61 0.0341 #group:I(x^2) 4 1789 4.37 0.0016 anova.lme(m2,type=m) # numDF denDF F-value p-value #(Intercept) 1 1789 59395.79 .0001 #group 445 1.33 0.2725 #x 1 1789 0.04 0.8379 #I(x^2)1 1789 2.28 0.1312 #group:x 4 1789 2.09 0.0802 #group:I(x^2) 4 1789 2.81 0.0244 Now, this is all well. But: my colleagues have been running the same data set using PROC MIXED in SAS and come up with substantially different results when comparing SAS default covariance structure (variance components) and AR1. Specifically, there is virtually no change in either test statistics or fitted values when using AR1 instead of Variance Components in SAS, which fits the observation that AIC values (in SAS) indicate both covariance structures fit data equally well. This is not very satisfactory to me, and I would be interesting to know what is happening here. Realizing this might not be the correct forum for this question, I would like to ask you all if anyone would have any input as to what is going on here, e.g. am I setting up my model erroneously, etc.? N.b. I have no desire to replicate SAS results, but I would most certainly be interested to know what could possibly explain such a large discrepancy between the two platforms. Any suggestions greatly welcomed. (Data are located at: https://www.dropbox.com/s/hf455aev3teb5e0/data.csv?dl=0) Could you repost this on r-sig-mixed-mod...@r-project.org ? It would be useful to see some of the comparisons (fixed effects, RE variance-covariance, correlation parameter) between SAS and R; are they actually fitting the same model? (e.g., does SAS allow for covariance between the slope and intercept random effects?) Maybe they're getting the same estimates but computing df/p-values in different ways? I thought I would try this with orthogonal polynomials in case some of the fits were unstable ... data.df - read.csv2(ar1_data.csv) library(nlme) m1 - lme(y~group*x+group*I(x^2),random=~x|id, data=data.df,na.action=na.omit,method=ML) ## use method=ML so we can compare orthog. and non-orthog. polynomials m1B - update(m1,.~group*poly(x,2)) m2 - update(m1,corr=corCAR1(form=~x|id)) m2B - update(m1B,corr=corCAR1(form=~x|id)) AIC(m1,m1B,m2,m2B) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] piecewise regression and lm() question
On 11/02/2015 4:30 PM, Goldschneider, Jill wrote: I was playing with some examples of piecewise regression using lm() and have come across a behavior I'm uncertain about. Below is simple 3-segment dataset. I compare predicted output of a model created by one call to lm() to that of 3 models created by 3 calls to lm(). In case A and B, the results are the same. However, in case C the results differ for the middle segment. Is the output of lm() for case C to be expected or not and if so, why? Take a look at the fit value, and you'll see what happened: lm(y ~ (x=10)*x + I(x10 x=20) + (x20)*x) Call: lm(formula = y ~ (x = 10) * x + I(x 10 x = 20) + (x 20) * x) Coefficients: (Intercept) x = 10TRUE x I(x 10 x = 20)TRUE 35.0741 -15.0733 -0.1644 -17.7344 x 20TRUEx = 10TRUE:x x:x 20TRUE NA 1.2397 -0.9658 The * in a formula means main effect plus interaction, not multiplication. W hat you want is lm(y ~ I((x=10)*x) + I(x10 x=20) + I((x20)*x)) This doesn't give exactly the same results as the segmented regression, because it uses the same intercept in all three segments; you might want to add I(x = 10) as well if you don't want that. Duncan Murdoch Thank you, Jill set.seed(133) y - c(21:30, rep(15,10), 10:1) + runif(30, -2, 2) x - 1:30 # Case A: 3 segments, each fit with a constant value plot(x, y) # 3 different lm fits p.c3 - c(predict(lm(y[1:10]~1)), predict(lm(y[11:20]~1)), predict(lm(y[21:30]~1))) lines(x, p.c3, col=red) # piecewise via lm p.lm1 - predict(lm(y ~ I(x=10) + I(x10 x=20) + I(x20))) lines(x, p.lm1, col=blue) max(abs(p.c3-p.lm1)) # Case B: 3 segments, each fit with a line plot(x, y) # 3 different lm fits p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~x[11:20])), predict(lm(y[21:30]~x[21:30]))) lines(x, p.c3, col=red) # piecewise via lm p.lm1 - predict(lm(y ~ (x=10)*x + (x10 x=20)*x + (x20)*x)) lines(x, p.lm1, col=blue) max(abs(p.c3-p.lm1)) # Cast C: 3 segments - the middle fit with a constant value, the outer by a line plot(x, y) # 3 different lm fits p.c3 - c(predict(lm(y[1:10]~x[1:10])), predict(lm(y[11:20]~1)), predict(lm(y[21:30]~x[21:30]))) lines(x, p.c3, col=red) # piecewise via lm p.lm1 - predict(lm(y ~ (x=10)*x + I(x10 x=20) + (x20)*x)) lines(x, p.lm1, col=blue) max(abs(p.c3-p.lm1)) ## the single call to lm does not have a constant value fit to the middle section plot(x, p.c3-p.lm1 ) The information contained in this transmission may contain confidential information. If the reader of this message is not the intended recipient, you are hereby notified that any review, dissemination, distribution or duplication of this communication is strictly prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] simple question - mean of a row of a data.frame
Hi all, Simple question I should know: I'm unclear on the logic of why the sum of a row of a data.frame returns a valid sum but the mean of a row of a data.frame returns NA: sum(rock[2,]) [1] 10901.05 mean(rock[2,],trim=0) [1] NA Warning message: In mean.default(rock[2, ], trim = 0) : argument is not numeric or logical: returning NA I get that rock[2,] is itself a data.frame of mode list, but why the inconsistency between functions? How can you figure this out from, e.g., ?mean ?sum Thanks in advance, Matt -- Matthew C Keller Asst. Professor of Psychology University of Colorado at Boulder www.matthewckeller.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] simple question - mean of a row of a data.frame
Hi, rock[2,] is a data frame and you should not use sum() on a data frame, first google hit for the error message gives http://stackoverflow.com/questions/19697498/r-beginner-argument-is-not-numeric-or-logical-returning-na Otherwise I think you should use ?rowSums and ?rowMeans if you have numeric data frames. HTH, daniel Feladó: R-help [r-help-boun...@r-project.org] ; meghatalmaz#243;: Matthew Keller [mckellerc...@gmail.com] Küldve: 2015. február 11. 23:49 To: r help Tárgy: [R] simple question - mean of a row of a data.frame Hi all, Simple question I should know: I'm unclear on the logic of why the sum of a row of a data.frame returns a valid sum but the mean of a row of a data.frame returns NA: sum(rock[2,]) [1] 10901.05 mean(rock[2,],trim=0) [1] NA Warning message: In mean.default(rock[2, ], trim = 0) : argument is not numeric or logical: returning NA I get that rock[2,] is itself a data.frame of mode list, but why the inconsistency between functions? How can you figure this out from, e.g., ?mean ?sum Thanks in advance, Matt -- Matthew C Keller Asst. Professor of Psychology University of Colorado at Boulder www.matthewckeller.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Subsetting data with svyglm
I am aware that it is possible to specify a subset with a single logical operator when constructing a model, such as: svyglm(formula, design=data, subset=variable==value). What I can't figure out is how to specify a subset with two or more logical operators: svyglm(formula, design=data, subset=variable==value a|value b). Is it possible to specify a subset in this way using *glm without having to, in my case, subset the original data, create a survey design, and then fit a model? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] grofit issues with replicates - probit or logit or glmm
Hello I tried use grofit package in our data set. We provide a subset of our data with X iso, and 4 doses, and insect died was count each day for long 5 days. We started with Y insects per dishes. When one is dead, it was counted and removed. Died insect is cumulative in the next days. i.e. day 1 died 1. day 2 no died, so, day 2 is assigned 1 died (from day 1). Here is the script: library(lattice) library(grofit) library(repmis) url.csv - https://dl.dropboxusercontent.com/u/34009642/cepajabo07_wide_acumulado.csv data02 - read.table(url.csv, header=TRUE, sep=\t, dec=,) head(data02) timepoints - 1:5 # 5 days time - t(matrix(rep(timepoints, 120), c(5, 120))) # 5 days and 120 experiments # (6 iso * 4 doses # * 5 rep) time TestRun1$drFit TestRun2$drFit colData - c(black, cyan, magenta, blue) plot(TestRun1$gcFit, opt = s, colData = colData, colSpline = 1, pch = 1:4, cex = 1) plot(TestRun2$gcFit, opt = s, colData = colData, colSpline = 1, pch = 1:4, cex = 1) plot(TestRun1$drFit$drFittedSplines[[1]], colData = colData, pch = 1:4, cex = 1) plot(TestRun2$drFit$drFittedSplines[[1]], colData = colData, pch = 1:4, cex = 1) The problem: grofit didn't deal with replicates and do a curve for each ones. Is it a way to get response curve with the replicates? We are interested in LD50, and dose response curve, and graphs. Any suggestion is very welcome! Thank you! -- Marcelo __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to read this Rpart decision tree?
Hi all, In the attachment or this link (http://oi58.tinypic.com/35ic9qc.jpg) you'll find the decision tree I made. I used the Rpart package to make the tree and the rattle package using the fancyRpartPlot to plot it. The data in the tree looks different than about every example I have seen before. I don't understand how I should read it. I want to predict Product (which are productkeys). The variables to predict it contain age, incomegroup, gender, totalchildren, education, occupation, houseownerflag, numberCars.It looks like the upper number is a ProductKey. The n is number of observations? And the percentage of the yes/no question below. This is the code I used. ss.rpart1 - rpart(Product ~ ., data=sstrain, control=rpart.control(minbucket=2,minsplit=1, cp=-1)) spt - which.min(ss.rpart1$cptable[, xerror]) scp - ss.rpart1$cptable[opt, CP] ss.rpart2 - prune(ss.rpart1, cp=cp) fancyRpartPlot(ss.rpart2) So why does the tree looks so different from the most (for example: http://media.tumblr.com/a9f482ff88b0b9cfaffca7ffd46c6a8e/tumblr_inline_mz7pyuaYJQ1s5wtly.png). This is from Trevor Stephen's TItanic tutorial. The first node show that 62% of 100% doesn't survive. If they were male, only 19% of them were survivors. I find that a lot examples look like that. Why does mine predict per ProductKey and every node it has something else. it doesn't make sense to me. And it doesn't have the two numbers like .62 and .38 but it has n=197e+3. So should I read the first node like For 100% of the observations of ProductKey 1074, the incomegroup was moderate)? Thank you! Kim __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mixed-effects model for pre-post randomization design
DeaR userRs, I recently read this Liang-Zeger article: http://sankhya.isical.ac.in/search/62b1/fpaper7.html in which (among other things) they adopt a random intercept model for pre-post designed trials, using a conditional likelihood approach (I didn't think it possible with only two measurements per subject) I'm trying to figure out (if and) how it is possible to reproduce straightforwardly their model using R standard mixed model tools, but I cannot even try to reproduce their work, since they used a non-available dataset (I found an extract on prof. Diggle's web site where it is explicitly reported to be confidential), so I have to review a bit of likelihood theory along with some implementation details. In the meantime, I wonder if anyone here could point out any related documentation to me. Thank you. Marco. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to read this Rpart decision tree
First: summary(ss.rpart1) or summary(ss.rpart, file=whatever) The printout will be quite long since your tree is so large, so the second form may be best followed by a perusal of the file with your favorite text editor. The file name of whatever above should be something you choose, of course. This will give you a full description of the tree. Read the first node or two very carefully so that you understand what the fit did. Plotting routines for trees have to make display choices, since there simply is not enough space available to list all the details. You have a complicated endpoint with at least 14 different products. The predicted value for the each node of the tree is a vector of percentages (one per product, adds to one); plots often show only the name of the most frequent. The alive/dead endpoint for the Titanic data is a lot easier to fit into a little plotting oval so of course the plotted tree is easier to grasp. Terry T. On 02/11/2015 05:00 AM, r-help-requ...@r-project.org wrote: Hi all, In the attachment or this link (http://oi58.tinypic.com/35ic9qc.jpg) you'll find the decision tree I made. I used the Rpart package to make the tree and the rattle package using the fancyRpartPlot to plot it. The data in the tree looks different than about every example I have seen before. I don't understand how I should read it. I want to predict Product (which are productkeys). The variables to predict it contain age, incomegroup, gender, totalchildren, education, occupation, houseownerflag, numberCars.It looks like the upper number is a ProductKey. The n is number of observations? And the percentage of the yes/no question below. This is the code I used. ss.rpart1 - rpart(Product ~ ., data=sstrain, control=rpart.control(minbucket=2,minsplit=1, cp=-1)) spt - which.min(ss.rpart1$cptable[, xerror]) scp - ss.rpart1$cptable[opt, CP] ss.rpart2 - prune(ss.rpart1, cp=cp) fancyRpartPlot(ss.rpart2) So why does the tree looks so different from the most (for example:http://media.tumblr.com/a9f482ff88b0b9cfaffca7ffd46c6a8e/tumblr_inline_mz7pyuaYJQ1s5wtly.png). This is from Trevor Stephen's TItanic tutorial. The first node show that 62% of 100% doesn't survive. If they were male, only 19% of them were survivors. I find that a lot examples look like that. Why does mine predict per ProductKey and every node it has something else. it doesn't make sense to me. And it doesn't have the two numbers like .62 and .38 but it has n=197e+3. So should I read the first node like For 100% of the observations of ProductKey 1074, the incomegroup was moderate)? Thank you! Kim __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] upgrading issues with Rcpp
gabx@hortensia [R] sessionInfo() R version 3.1.2 (2014-10-31) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 [6] LC_MESSAGES=en_US.UTF-8LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] utils base loaded via a namespace (and not attached): [1] tools_3.1.2 gabx@hortensia [R] search() [1] .GlobalEnvtools:rstudio package:utils package:base -- Now when trying to update Rcpp: gabx@hortensia [R] install.packages(Rcpp) . Error in unloadNamespace(pkg_name) : namespace ‘Rcpp’ is imported by ‘reshape2’, ‘plyr’, ‘dplyr’ so cannot be unloaded .. I can't upgrade my packages because of Rcpp =0.11.3 needed (running actually 0.11.2) What is this namespace imported by 'reshape2’, ‘plyr’, ‘dplyr’ ? How can I get rid of it and upgrade my packages ? Thank you for hints -- google.com/+arnaudgabourygabx __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] upgrading issues with Rcpp
arnaud gaboury arnaud.gaboury at gmail.com writes: Now when trying to update Rcpp: gabx at hortensia [R] install.packages(Rcpp) . Error in unloadNamespace(pkg_name) : namespace ‘Rcpp’ is imported by ‘reshape2’, ‘plyr’, ‘dplyr’ so cannot be unloaded .. I can't upgrade my packages because of Rcpp =0.11.3 needed (running actually 0.11.2) What is this namespace imported by 'reshape2’, ‘plyr’, ‘dplyr’ ? How can I get rid of it and upgrade my packages ? I'd do $ R --vanilla # to ensure nothing is loaded install.packages(Rcpp) # possibly set a repo but what I really do is to use scripts update.r, install.r, install2.r, ... from package littler which makes as this so much easier -- and do it on the command-line with empty R sessions. Dirk __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subsetting data with svyglm
hi brennan, survey design objects can be subsetted with the same subset() syntax as data.frame objects, so following jeff's advice maybe you want svyglm( formula , design = subset( surveydesign , variable %in% c( 'value a' , 'value b' ) ) ) for some examples of how to construct a survey design with public use data, see http://github.com/ajdamico/usgsd On Wed, Feb 11, 2015 at 11:49 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: This seems like a fundamental misunderstanding on your part of how operators, and in particular logical expressions, work in computer languages. Consider some examples: 1+2 has a numeric answer because 1 and 2 are both numeric. 1+a has at the very least not a numeric answer because the values on either side of the + sign are not both numeric. TRUE | FALSE has a logical type of answer because both sides of the logical or operator are logical. However, you are expressing something like TRUE | a string which might mean something but that something generally is not a logical type of answer. Try variable==value a | variable==value b or variable %in% c( value a, value b ) You would probably find that the Introduction to R document that comes with R has some enlightening examples in it. You might also find Pat Burns' The R Inferno entertaining as well (search for it in your favorite search engine). --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. On February 11, 2015 8:42:58 PM EST, Brennan O'Banion brennan.oban...@gmail.com wrote: I am aware that it is possible to specify a subset with a single logical operator when constructing a model, such as: svyglm(formula, design=data, subset=variable==value). What I can't figure out is how to specify a subset with two or more logical operators: svyglm(formula, design=data, subset=variable==value a|value b). Is it possible to specify a subset in this way using *glm without having to, in my case, subset the original data, create a survey design, and then fit a model? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subsetting data with svyglm
This seems like a fundamental misunderstanding on your part of how operators, and in particular logical expressions, work in computer languages. Consider some examples: 1+2 has a numeric answer because 1 and 2 are both numeric. 1+a has at the very least not a numeric answer because the values on either side of the + sign are not both numeric. TRUE | FALSE has a logical type of answer because both sides of the logical or operator are logical. However, you are expressing something like TRUE | a string which might mean something but that something generally is not a logical type of answer. Try variable==value a | variable==value b or variable %in% c( value a, value b ) You would probably find that the Introduction to R document that comes with R has some enlightening examples in it. You might also find Pat Burns' The R Inferno entertaining as well (search for it in your favorite search engine). --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. On February 11, 2015 8:42:58 PM EST, Brennan O'Banion brennan.oban...@gmail.com wrote: I am aware that it is possible to specify a subset with a single logical operator when constructing a model, such as: svyglm(formula, design=data, subset=variable==value). What I can't figure out is how to specify a subset with two or more logical operators: svyglm(formula, design=data, subset=variable==value a|value b). Is it possible to specify a subset in this way using *glm without having to, in my case, subset the original data, create a survey design, and then fit a model? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.