[R] Finding matches in 2 files
I have 2 files containing data analysed by 2 different methods. I would like to find out which genes appear in both analyses. Can someone show me how to do this?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] qda(MASS) function error
Mauro Rossi wrote:
> Dear R user,
> I'm using the qda (quadratic discriminant analysis) function (package MASS) to classify 58 explanatory variables (numeric, with different ranges) using a grouping variable (a factor with 2 levels, 0 and 1). I'm using the qda method for class 'data.frame' (this way I don't need to specify a formula). Using the function
>
>     result.qda <- qda(explanatory.variables, grouping.variable, method="moment")
>
> I obtain the following error message:
>
>     Error in qda.default(x, grouping, ...) : rank deficiency in group 0
>
> I ran the script excluding some variables and identified 2 explanatory variables that cause the problem, but I don't understand why they do. The two excluded variables are numeric with two possible values, 0 and 1, but some similar variables among the remaining ones are accepted. I don't have this problem using the lda function for linear discriminant analysis. What does this error message mean? What types of variables does the qda function accept?

Well, qda assumes real values (and not factors) in the explanatory variables. If you think it makes sense to ignore this assumption (and I doubt it makes sense), then the error message tells you there is a rank deficiency, i.e. some variables might be collinear. Hence at least one of the covariance matrices cannot be inverted.

Uwe Ligges

> Thanks in advance,
> Mauro Rossi
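The rank deficiency described in the reply above can be checked group by group before calling qda. The following is only a sketch on made-up data (the names grp, x1, x2 and X are hypothetical, not from the original post): a 0/1 variable that happens to be constant inside one group makes that group's covariance matrix singular, while the pooled covariance used by lda can still be invertible, which would be consistent with lda running without complaint.

```r
# Hypothetical data: x2 is constant (all 0) inside group 0,
# but varies inside group 1.
set.seed(1)
grp <- factor(rep(c(0, 1), each = 20))
x1  <- rnorm(40)
x2  <- c(rep(0, 20), rep(c(0, 1), 10))
X   <- cbind(x1 = x1, x2 = x2)

# Rank of the covariance matrix within each group; a rank below
# ncol(X) means that group's covariance matrix cannot be inverted.
group.rank <- sapply(split(as.data.frame(X), grp),
                     function(d) qr(cov(d))$rank)
print(group.rank)
```

Any group whose rank is smaller than the number of columns is a candidate for the "rank deficiency in group ..." message.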
Re: [R] Finding matches in 2 files
Maybe with 'merge', but your message is too vague (see http://www.catb.org/~esr/faqs/smart-questions.html).

On 7/26/07, jenny tan [EMAIL PROTECTED] wrote:
> I have 2 files containing data analysed by 2 different methods. I would like to find out which genes appear in both analyses. Can someone show me how to do this?

--
Christophe Pallier (http://www.pallier.org)
Re: [R] For loops
Any time you are calling a function one value at a time, it is worth asking if you can eliminate a loop (or more).

If 'G.fun' is vectorized in its first argument, then you can easily get rid of the three inner loops. Just generate a vector of all of the values and do:

    gj <- sum(G.fun(long.vector, ff))

If 'G.fun' is not vectorized and can't be vectorized, then you might still save some time by creating the vector of first arguments up front. Whether that will be a significant reduction depends on how time consuming 'G.fun' is.

There is a caveat. If 'n' is large, then you could create a vector that strains the amount of memory (RAM) that you have. If that is the case, then there will be some compromise between loops and vectorization that will be optimal.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

Joaquim J. S. Ramalho wrote:
> Hi, is there a way of simplifying the following code:
>
>     G <- rep(NA, n)
>     for(i in 1:n) {
>       gj <- 0
>       for(j in 1:n) {
>         for(l in 1:n) {
>           for(m in 1:n) {
>             gj <- gj + G.fun(XB[i]+p[3]*X[j,3]+p[4]*X[l,4]+p[5]*X[m,5], ff)
>           }
>         }
>       }
>       G[i] <- gj/n^3
>     }
>
> Thanks.
> Joaquim Santos
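As an illustration of the advice above, here is a minimal sketch that builds the "long vector" with outer() and compares it with the original nested loops. The data and the body of G.fun are invented stand-ins (the original post does not show them); the sketch assumes G.fun is vectorized in its first argument.

```r
# Made-up inputs; G.fun is a hypothetical vectorized stand-in.
set.seed(42)
n  <- 4
X  <- matrix(rnorm(n * 5), nrow = n)
p  <- c(0.5, -1, 0.3, 0.7, -0.2)
XB <- rnorm(n)
ff <- 2
G.fun <- function(z, ff) pnorm(z / ff)

# All n^3 sums p[3]*X[j,3] + p[4]*X[l,4] + p[5]*X[m,5], built once.
a  <- p[3] * X[, 3]
b  <- p[4] * X[, 4]
c5 <- p[5] * X[, 5]
combos <- as.vector(outer(outer(a, b, "+"), c5, "+"))

# One vectorized call per i replaces the three inner loops.
G.vec <- sapply(XB, function(xb) sum(G.fun(xb + combos, ff)) / n^3)

# Original nested-loop version, kept only for comparison.
G.loop <- rep(NA_real_, n)
for (i in 1:n) {
  gj <- 0
  for (j in 1:n) for (l in 1:n) for (m in 1:n) {
    gj <- gj + G.fun(XB[i] + a[j] + b[l] + c5[m], ff)
  }
  G.loop[i] <- gj / n^3
}
print(all.equal(G.vec, G.loop))
```

The vector combos has n^3 elements, so the memory caveat in the reply applies directly: for large n it may be better to vectorize only one or two of the inner loops.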
Re: [R] ROC curve in R
You might also want to try the ROCR package (http://rocr.bioinf.mpi-sb.mpg.de/).

Tutorial slides: http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt
Overview paper: http://bioinformatics.oxfordjournals.org/cgi/content/full/21/20/3940

Good luck,
Tobias

On 7/26/07, Rithesh M. Mohan [EMAIL PROTECTED] wrote:
> Hi, I need to build a ROC curve in R; can you please provide data steps / code or guide me through it.
> Thanks and Regards
> Rithesh M Mohan

--
Tobias Sing
Computational Biology and Applied Algorithmics
Max Planck Institute for Informatics
Saarbrucken, Germany
Phone: +49 681 9325 315
Fax: +49 681 9325 399
http://www.tobiassing.net
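For readers who only want to see what a ROC curve is made of, the following base-R sketch computes the points by hand. It does not use ROCR, and the scores and labels are invented for illustration; it assumes higher scores indicate the positive class and that there are no tied scores.

```r
# Invented classifier scores and true 0/1 labels.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(1,   1,   0,   1,   0,    1,   0,   0)

# Sweep the threshold from high to low: each observation in turn
# is added to the set predicted positive.
ord <- order(scores, decreasing = TRUE)
tpr <- c(0, cumsum(labels[ord] == 1) / sum(labels == 1))
fpr <- c(0, cumsum(labels[ord] == 0) / sum(labels == 0))

# Area under the curve by the trapezoid rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
print(auc)
```

Calling plot(fpr, tpr, type = "l") on these vectors draws the curve; a dedicated package such as the one recommended above adds proper handling of ties, averaging and many other performance measures.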
[R] Submatrices Extraction
Hello,

Given a submatrix containing 0 or 1, I need to extract the indexes of all the diagonal submatrices, so that for each submatrix one of the two diagonals contains only 1s ...

Any help? Thanks in advance

Bruno
[R] Regression with Missing values. na.action?
Hi all,

Can you please tell me what the problem is here? My regression equation is

    y = B0 + B1*X1 + B2*X2 + e

and I am interested in the coefficient B1. I am running the regression in two ways:

    1) reg <- lm(y ~ X1 + X2, sam)    where sam is the data
    2) reg <- lm(y ~ X1 + X2, sam, na.action = na.exclude)

I have missing values in X1, but the value of the coefficient is not consistent between the two cases. B1 in case 1 should be smaller than B1 in case 2, but sometimes it comes out greater. I can't figure it out. Is there some problem with na.action?

My sample size is 100.

Regards,
Vaibhav
Re: [R] Regression with Missing values. na.action?
na.exclude should give the same results as na.omit, which is the default na.action. Is the number of complete cases the same in these two regressions?

On 26/07/07, Vaibhav Gathibandhe [EMAIL PROTECTED] wrote:
> Hi all,
> Can you please tell me what the problem is here? My regression equation is
>
>     y = B0 + B1*X1 + B2*X2 + e
>
> and I am interested in the coefficient B1. I am running the regression in two ways:
>
>     1) reg <- lm(y ~ X1 + X2, sam)    where sam is the data
>     2) reg <- lm(y ~ X1 + X2, sam, na.action = na.exclude)
>
> I have missing values in X1, but the value of the coefficient is not consistent between the two cases. B1 in case 1 should be smaller than B1 in case 2, but sometimes it comes out greater. I can't figure it out. Is there some problem with na.action?
> My sample size is 100.
> Regards,
> Vaibhav

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
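The claim in the reply above is easy to check on simulated data. This is a sketch with an invented data frame (sam here is not the poster's data) and it assumes the session still has the default setting options(na.action = "na.omit"): both fits drop the same incomplete rows, so the coefficients agree, and the only difference is that na.exclude pads residuals and fitted values with NA so they line up with the original rows.

```r
# Simulated data with a few missing values in X1.
set.seed(1)
sam <- data.frame(X1 = rnorm(100), X2 = rnorm(100))
sam$y <- 1 + 2 * sam$X1 - sam$X2 + rnorm(100)
sam$X1[c(3, 17, 42)] <- NA

fit.omit    <- lm(y ~ X1 + X2, data = sam)  # default: na.omit
fit.exclude <- lm(y ~ X1 + X2, data = sam, na.action = na.exclude)

# Same rows are used, so the estimates are identical.
print(all.equal(coef(fit.omit), coef(fit.exclude)))

# Residual vectors differ only in length: na.exclude re-inserts NA.
print(length(residuals(fit.omit)))
print(length(residuals(fit.exclude)))
```

If the two coefficients differ in practice, the two calls were most likely run on different data (for example, re-simulated or modified between calls) rather than affected by na.action.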
Re: [R] Finding matches in 2 files
Something like:

    # Sample data
    g1 <- c("gene1", "gene2", "gene3", "gene4", "gene5", "gene9", "gene10", "geneA")
    g2 <- c("gene6", "gene9", "gene1", "gene2", "gene7", "gene8", "gene9", "gene1", "gene10")
    df1 <- data.frame(gene=g1, expr=runif(length(g1)))
    df2 <- data.frame(gene=g2, expr=runif(length(g2)))
    # Merge
    mdf <- merge(df1, df2, by="gene", sort=TRUE)
    # Unique list
    ug <- unique(mdf[,"gene"])

You may find the match command useful and/or the %in% operator.

JS
---
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of jenny tan
Sent: 26 July 2007 04:35
To: r-help@stat.math.ethz.ch
Subject: [R] Finding matches in 2 files

I have 2 files containing data analysed by 2 different methods. I would like to find out which genes appear in both analyses. Can someone show me how to do this?
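The match command and the %in% operator mentioned at the end of the reply above can be sketched as follows, reusing the same sample gene names. In practice the two vectors would come from the gene column of each file (for example via read.table), which is an assumption here since the original question does not describe the file layout.

```r
g1 <- c("gene1", "gene2", "gene3", "gene4", "gene5", "gene9", "gene10", "geneA")
g2 <- c("gene6", "gene9", "gene1", "gene2", "gene7", "gene8", "gene9",
        "gene1", "gene10")

# %in% gives a logical vector: which elements of g1 occur in g2?
common <- unique(g1[g1 %in% g2])
print(common)

# match gives the position of the first occurrence in g2 (NA if absent),
# which is useful for pulling the matching rows out of the second file.
pos <- match(g1, g2)
print(pos)
```

%in% is enough when only the list of shared genes is needed; match (or merge, as in the reply) is the better fit when the accompanying data columns have to be lined up as well.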
[R] Download multiple stock quotes in a loop
Hi all,

this should be a simple question, but I haven't been able to get it right. I am trying to download multiple stock quotes in a loop, so that every time series is saved under the symbol of the stock. Can anybody help me out? Here's the code:

    require(tseries)
    startd <- "2000-06-01"
    stocks <- c("bmw.de", "vow.de", "dte.de")
    for(stock in stocks) {
      stock <- as.timeSeries(get.hist.quote(instrument=stock, start=startd, quote="Close", compress="d"))
    }

Thanks in advance,
Owe

--
Owe Jessen
Diplom-Volkswirt
Hanssenstraße 17
24106 Kiel
[EMAIL PROTECTED]
http://www.econinfo.de
Re: [R] Download multiple stock quotes in a loop
Owe Jessen wrote:
> Hi all,
> this should be a simple question, but I haven't been able to get it right. I am trying to download multiple stock quotes in a loop, so that every time series is saved under the symbol of the stock. Can anybody help me out? Here's the code:
>
>     require(tseries)
>     startd <- "2000-06-01"
>     stocks <- c("bmw.de", "vow.de", "dte.de")
>     for(stock in stocks) {
>       stock <- as.timeSeries(get.hist.quote(instrument=stock, start=startd, quote="Close", compress="d"))
>     }
>
> Thanks in advance,
> Owe

The variable stock is assigned a value twice in each iteration. First it gets the symbol (e.g. "bmw.de"), and immediately after that it is overwritten with the result returned by as.timeSeries( ... ).

If you replace the body of the loop with

    assign(paste("stock.", stock, sep=""), as.timeSeries(get.hist.quote [etc]))

you will get three variables, namely stock.bmw.de, stock.vow.de and stock.dte.de.

--
View this message in context: http://www.nabble.com/Download-multiple-stock-quotes-in-a-loop-tf4150838.html#a11808177
Sent from the R help mailing list archive at Nabble.com.
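The assign() approach from the reply above, and a common alternative that keeps the series together in a named list, might look like this. To keep the sketch runnable without a network connection, fetch.quotes is a hypothetical stand-in for the as.timeSeries(get.hist.quote(...)) call; it is not part of tseries.

```r
stocks <- c("bmw.de", "vow.de", "dte.de")

# Hypothetical stand-in for the real download step.
fetch.quotes <- function(symbol) {
  data.frame(symbol = symbol, close = c(1, 2, 3))
}

# Approach from the reply: one variable per symbol.
for (stock in stocks) {
  assign(paste("stock.", stock, sep = ""), fetch.quotes(stock))
}
print(exists("stock.bmw.de"))

# Alternative: a single named list, indexed by symbol.
quotes <- lapply(stocks, fetch.quotes)
names(quotes) <- stocks
print(names(quotes))
```

The list version avoids creating variables whose names are only known at run time, so later code can loop over quotes or look up quotes[["vow.de"]] directly.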
Re: [R] Finding matches in 2 files
Is this what you want?

    > g1 <- c("gene1", "gene2", "gene3", "gene4", "gene5", "gene9", "gene10",
    +         "geneA")
    > g2 <- c("gene6", "gene9", "gene1", "gene2", "gene7", "gene8", "gene9",
    +         "gene1", "gene10")
    > intersect(g1,g2)
    [1] "gene1"  "gene2"  "gene9"  "gene10"

On 7/25/07, jenny tan [EMAIL PROTECTED] wrote:
> I have 2 files containing data analysed by 2 different methods. I would like to find out which genes appear in both analyses. Can someone show me how to do this?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
[R] multiple graphs
Does anyone have a simple explanation and example of how to add histograms or barcharts to another graph, as in this example from the R graph gallery:

http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=109

Looking at the code, I do not understand very well how to add graphs at an arbitrary/clever position with an adequate scale. If somebody has a simpler example with explanations, it would be highly appreciated.

Best
Daniele
[R] logistic regression
Greetings,

I am working on a logistic regression model in R and I am struggling with the code, as it is a relatively new program for me. In searching Google for 'logistic regression diagnostics' I came across Elizabeth Brown's Lecture 14 from her Winter 2004 Biostatistics 515 course (http://courses.washington.edu/b515/l14.pdf). I found most of the code to be very helpful, but I am struggling with the lines that calculate the observed and expected values in the 10 groups created by the cut function. I get error messages when trying to create the E and O matrices: R won't accept assignment of fi1c==j and it won't calculate the sum.

I am wondering whether someone might be able to offer me some assistance... my search of the archives was not fruitful.

Here is the code that I adapted from the lecture notes:

    fit <- fitted(glm.lyme)
    fitc <- cut(fit, br = c(0, quantile(fit, p = seq(.1, .9, .1)), 1))
    t <- table(fitc)
    fitc <- cut(fit, br = c(0, quantile(fit, p = seq(.1, .9, .1)), 1), labels = F)
    t <- table(fitc)

    # Calculate observed and expected values in each group
    E <- matrix(0, nrow=10, ncol = 2)
    O <- matrix(0, nrow=10, ncol=2)
    for (j in 1:10) {
      E[j, 2] = sum(fit[fitc==j])
      E[j, 1] = sum((1- fit)[fitc==j])
      O[j, 2] = sum(pcdata$lymdis[fitc==j])
      O[j, 1] = sum((1-pcdata$lymdis)[fitc==j])
    }

Here is the error message:

    Error in Summary.factor(..., na.rm = na.rm) : sum not meaningful for factors

I understand what it means; I just can't figure out how to get around it or how to get the output printed in table form. Thank you in advance for any assistance.

Mary Sullivan
Re: [R] substituting dots in the names of the columns (sub, gsub, regexpr)
Use "\\." or "[.]" (with quotes) to denote a literal dot (#1), or use fixed = TRUE to remove the special meaning of the dot (#2), or use a zero-width lookahead assertion (?=[.]), which must match but is not included in the string to be replaced (#3). Try ?regexpr. Also, the links on the gsubfn home page (http://code.google.com/p/gsubfn/) point to a number of good resources on regular expressions.

    Str <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.", "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")

    # 1
    tmp <- gsub("[.]+", ".", Str)
    sub("[.]+$", "", tmp)

    # 2
    tmp <- gsub("..", ".", Str, fixed = TRUE)
    sub("[.]+$", "", tmp)

    # 3 - both done at once using zero-width lookahead
    gsub("[.]*$|[.]*(?=[.])", "", Str, perl = TRUE)

On 7/26/07, 8rino-Luca Pantani [EMAIL PROTECTED] wrote:
> Dear R users,
> I have the following two problems, related to the functions sub, grep, regexpr and the like. The header of the file(s) I have to import looks like this:
>
>     c("y (m)", "BD (g/cm3)", "PR (Mpa)", "Ks (m/s)", "SP g./g.", "P (m3/m3)", "theta1 (g/g)", "theta2 (g/g)", "AWC (g/g)")
>
> To get rid of spaces and symbols in the names of the columns, I use read.table(... check.names=TRUE) and I get:
>
>     str <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.", "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")
>
> Now, my problem is to remove the trailing dots, as well as the double dots, in order to get names like the following:
>
>     c("y.m", "BD.g.cm3", "PR.Mpa", "Ks.m.s", "SP.g.g", "P.m3.m3.", "theta1.g.g", "theta2.g.g", "AWC.g.g")
>
> I've searched the help pages for sub, regexpr and the like, and also searched the help archives. I understand that the dot is a special character, since
>
>     > sub("..", ".", str)
>     [1] "..m."        "...g.cm3."   "...Mpa."     "...m.s."     "..g..g."
>     [6] "..m3.m3."    ".eta1..g.g." ".eta2..g.g." ".C..g.g."
>
> Therefore I tried
>
>     > sub("\\..", ".", str)
>     [1] "y.m."        "BD.g.cm3."   "PR.Mpa."     "Ks.m.s."     "SP...g."
>     [6] "P.m3.m3."    "theta1.g.g." "theta2.g.g." "AWC.g.g."
>
> and I was surprised by the (to me) strange behaviour of "SP.g..g." being turned into "SP...g.". This is the first problem I cannot solve.
>
> Then there's the problem of removing the trailing dot. In http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8665.html I found a somewhat similar problem, but that solution does not work in this case, since:
>
>     > gsub("[.].*", "", str)
>     [1] "y"      "BD"     "PR"     "Ks"     "SP"     "P"      "theta1" "theta2"
>     [9] "AWC"
>
> And this is the second problem.
>
> Apart from these particular problems, I would like to know more about regexp, sub and so on, since each time I have strings to manipulate I must face my ignorance of regular expressions and their syntax. Is there any page with examples where I can improve my knowledge and stop being frustrated each time I have to manipulate strings?
>
> 8rino
>
> --
> Ottorino-Luca Pantani, Università di Firenze
> Dip. Scienza del Suolo e Nutrizione della Pianta
> P.zle Cascine 28 50144 Firenze Italia
> Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273
> [EMAIL PROTECTED]
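To make the thread above concrete, here is a short sketch that runs approach #1 on the column names from the question and also reproduces the surprising "SP...g." result. In the pattern "\\..", only the first dot is escaped; the second is still a wildcard, so the pattern matches a literal dot followed by any character (here ".g") and replaces both with a single dot.

```r
Str <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
         "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")

# Approach #1: collapse runs of dots, then drop any trailing dots.
tmp     <- gsub("[.]+", ".", Str)
cleaned <- sub("[.]+$", "", tmp)
print(cleaned)

# Why "SP.g..g." became "SP...g.": the unescaped second dot
# matches the "g" that follows the first literal dot.
surprise <- sub("\\..", ".", "SP.g..g.")
print(surprise)

# Escaping both dots restricts the match to two literal dots.
fixed.pattern <- sub("\\.\\.", ".", "SP.g..g.")
print(fixed.pattern)
```

Using the character class "[.]" instead of backslashes, as in the reply, sidesteps the double escaping that R string literals require.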
[R] Fit t Copula
Hi,

I am trying to fit a t copula to some data, and I am using the following functions from library(QRMlib):

    Udatac <- apply(datac, 2, edf, adjust=1)
    tcopulac <- fit.tcopula.rank(Udatac)

But the following error message comes up:

    Error in fit.tcopula.rank(Udatac) : Non p.s.d. covariance matrix

Could anyone give me some advice? In fact, I am not sure what adjust=1 is used for. Many thanks.

--
View this message in context: http://www.nabble.com/Fit-t-Copula-tf4152818.html#a11814432
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Create Strings of Column Id's
Is this what you want:

    > paste("-", paste(colnames(MyMatrix)[COL], collapse='-'), sep='')
    [1] "-E-T"

On 7/26/07, Tom.O [EMAIL PROTECTED] wrote:
> Does anyone know how this is done?
>
> I have a large matrix from which I extract specific columns into txt files for further use. To be able to keep track of which txt files contain which columns, I want to build the filenames from the column Id's.
>
> The most basic approach would be to use a for() loop together with paste(), but the result is blank. Not even NULL.
>
> This is the concept of the code I use, for example:
>
>     MyMatrix <- matrix(NA,ncol=4,nrow=1,dimnames=list(NULL,c("E","R","T","Y")))
>     COL <- c(1,3)      # a vector of columns I want to extract
>     Filename <- NULL   # the starting variable, so I can use paste
>     Filename <- for(i in colnames(MyMatrix)[COL]) {paste(Filename,"-",i,sep="")}
>
> The result is -T, but I want it to be -E-T.
>
> Anyone have a clue?
>
> Thanks
> Tom
>
> --
> View this message in context: http://www.nabble.com/Create-Strings-of-Column-Id%27s-tf4153354.html#a11816439
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
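For comparison, here is a sketch of the one-line answer above next to a corrected version of the loop from the question. The loop in the question never stores the result of paste() back into Filename inside the loop body; the fix shown here (initialising with "" and assigning inside the loop) is one possible repair, not something stated in the thread.

```r
MyMatrix <- matrix(NA, ncol = 4, nrow = 1,
                   dimnames = list(NULL, c("E", "R", "T", "Y")))
COL <- c(1, 3)

# Vectorized answer from the reply: collapse the names, then prefix "-".
Filename <- paste("-", paste(colnames(MyMatrix)[COL], collapse = "-"), sep = "")
print(Filename)

# Loop variant: the assignment has to happen inside the loop body.
Filename2 <- ""
for (i in colnames(MyMatrix)[COL]) {
  Filename2 <- paste(Filename2, "-", i, sep = "")
}
print(Filename2)
```

Both forms give the same string; the collapse version is shorter and avoids growing a string step by step.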
Re: [R] Help with Dates
Are you using the latest version of fame? Versions 1.05 and earlier had a bug in tisFromCsv that was fixed in 1.08. Below I show what I get with fame version 1.08. There is still a problem in that the frequency-figuring logic appears to think the frequency is bwsunday (biweekly with weeks ending on Sunday) rather than semimonthly, which would appear to be a better fit. That's why the 19860330 observation is getting filled in with NA's.

Jeff

    > Lines <- "Date Price Open.Int. Comm.Long Comm.Short net.comm
    + 15-Jan-86 673.25 175645 65910 28425 37485
    + 31-Jan-86 677.00 167350 54060 27120 26940
    + 14-Feb-86 680.25 157985 37955 25425 12530
    + 28-Feb-86 691.75 162775 49760 16030 33730
    + 14-Mar-86 706.50 163495 54120 27995 26125
    + 31-Mar-86 709.75 164120 54715 30390 24325
    + "
    > boink <- tisFromCsv(textConnection(Lines), dateFormat = "%d-%b-%y", dateCol = "Date", sep = "")
    > boink
    $Price
               [,1]
    19860119 673.25
    19860202 677.00
    19860216 680.25
    19860302 691.75
    19860316 706.50
    19860330     NA
    19860413 709.75
    class: tis

    $Open.Int.
               [,1]
    19860119 175645
    19860202 167350
    19860216 157985
    19860302 162775
    19860316 163495
    19860330     NA
    19860413 164120
    class: tis

    $Comm.Long
              [,1]
    19860119 65910
    19860202 54060
    19860216 37955
    19860302 49760
    19860316 54120
    19860330    NA
    19860413 54715
    class: tis

    $Comm.Short
              [,1]
    19860119 28425
    19860202 27120
    19860216 25425
    19860302 16030
    19860316 27995
    19860330    NA
    19860413 30390
    class: tis

    $net.comm
              [,1]
    19860119 37485
    19860202 26940
    19860216 12530
    19860302 33730
    19860316 26125
    19860330    NA
    19860413 24325
    class: tis

Gabor Grothendieck [EMAIL PROTECTED] writes:
> On 26 Jul 2007 09:59:31 -0400, Jeffrey J. Hallman [EMAIL PROTECTED] wrote:
>> zoo is nice. 'tisFromCsv()' in the fame package is nicer.
>>
>> Jeff
>
> 1. What am I doing wrong here? I only get one data column.
> 2. I assume the regularized dates which do not exactly match the input ones are intended so as to make this a regularly spaced series. Is that right?
> 3. What is the cause of the warning message?
> 4. Why is a list returned with a single component containing the output?
>
> Thanks.
>
>     > library(fame)
>     > Lines <- "Date Price Open.Int. Comm.Long Comm.Short net.comm
>     + 15-Jan-86 673.25 175645 65910 28425 37485
>     + 31-Jan-86 677.00 167350 54060 27120 26940
>     + 14-Feb-86 680.25 157985 37955 25425 12530
>     + 28-Feb-86 691.75 162775 49760 16030 33730
>     + 14-Mar-86 706.50 163495 54120 27995 26125
>     + 31-Mar-86 709.75 164120 54715 30390 24325
>     + "
>     > tisFromCsv(textConnection(Lines), dateFormat = "%d-%b-%y", dateCol = "Date", sep = "")
>     [[1]]
>                [,1]
>     19860119 673.25
>     19860202 677.00
>     19860216 680.25
>     19860302 691.75
>     19860316 706.50
>     19860330 709.75
>     class: tis
>
>     Warning message:
>     number of items to replace is not a multiple of replacement length in: x[i] <- value

--
Jeff
Re: [R] How to auto-scale cex of y-axis labels in lattice dotplot?
On 7/25/07, Kevin Wright [EMAIL PROTECTED] wrote:
> When I create a dotplot in lattice, I frequently observe overplotting of the labels along the vertical axis. On my screen, this illustrates overplotting of the letters:
>
>     windows()
>     reps=6
>     dat=data.frame(let=rep(letters,each=reps), grp=rep(1:reps, 26), y=runif(26*reps))
>     dotplot(let~y|grp, dat)
>
> Is there a way to automatically scale the labels so that they are not over-plotted?

Not that I can think of.

> I currently do something like this:
>
>     # Calculate or guess the number of panel rows: NumPanelRows
>     cexLab <- min(1, .9*par()$pin[2]/(nlevels(dat$let)*NumPanelRows*strheight("A",units="in")))
>     dotplot(..., scales=list(y=list(cex=cexLab)))
>
> Is there an easier way? Is there a function that I can call which calculates the layout of the panels that will be used in the dotplot?

Not really. The eventual layout is calculated inside print.trellis as follows (where 'x' is the trellis object being plotted):

    panel.layout <- compute.layout(x$layout, dim(x), skip = x$skip)
    [...]
    if (panel.layout[1] == 0) {
        ddim <- par("din")
        device.aspect <- ddim[2] / ddim[1]
        panel.aspect <- panel.height[[1]] / panel.width[[1]]
        plots.per.page <- panel.layout[2]
        m <- max(1, round(sqrt(panel.layout[2] * device.aspect / panel.aspect)))
        n <- ceiling(plots.per.page/m)
        m <- ceiling(plots.per.page/n)
        panel.layout[1] <- n
        panel.layout[2] <- m
    }

-Deepayan
Re: [R] Constructing bar charts with standard error bars
John Zabroski wrote: On 7/25/07, Ben Bolker [EMAIL PROTECTED] wrote: Thanks a lot! I tried all three and they all seem very dependable. Also, I appreciate you rewriting my solution and adding elegance. Is there a way to extend the tick marks to the ylim values, such that the yscale ymax tickmark is something like max(xbar+se)? In the documentation, I thought par(yaxp=c(y0,y1,n)) would do the trick, but after trying to use it I am not sure I understand what yaxp even does. It took me quite a while to figure this out, I'm not surprised you didn't ... The very easiest way to do this is simply to set ylim to (0,0.4) -- since you probably want to extend the axes upward to a pretty number anyway. The other standard way to do this is to use barplot with axes=FALSE and then add the axes yourself, with the ticks specified wherever you want: barplot(...,ylim=c(0,0.4),axes=FALSE) axis(side=1) axis(side=2,at=seq(0,0.4,length=8)) However, I was wondering what was up with yaxp, and why setting it didn't seem to do anything. The answer is lurking in ?par: ## This parameter is reset when a user coordinate system is set ## up, for example by starting a new page or by calling ## 'plot.window' or setting 'par(usr)': 'n' is taken from ## 'par(lab)'. It affects the default behaviour of subsequent ## calls to 'axis' for sides 1 or 3. Thus, when barplot starts up and plots a new set of axes it RESETS par(yaxp). Thus par(yaxp=...)); barplot(...) doesn't work. However, barplot(...,yaxp=...) does work. It would actually be nice to have an axis style (xaxs,yaxs) that extended the axis out beyond the range of the data until it found pretty labels that extended beyond the data range -- for example, set the range according to xaxs=r, find the pretty axis ticks, and then add another tick ... 
cheers Ben Bolker
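The axes=FALSE route described in this reply can be sketched as below. This is only an illustration: the means, standard errors, bar names and the upper limit of 0.4 are made-up values, not John's data.

```r
# Hypothetical group means and standard errors (illustration only)
xbar <- c(0.12, 0.25, 0.31)
se   <- c(0.02, 0.03, 0.04)
top  <- 0.4   # a "pretty" upper limit chosen by hand, above max(xbar + se)

# Draw the bars without axes; barplot() returns the bar midpoints
mids <- barplot(xbar, ylim = c(0, top), axes = FALSE,
                names.arg = c("a", "b", "c"))

# Error bars drawn as flat-headed arrows from xbar - se to xbar + se
arrows(mids, xbar - se, mids, xbar + se, angle = 90, code = 3, length = 0.05)

# Add the y axis by hand, with ticks running all the way up to 'top'
axis(side = 2, at = seq(0, top, by = 0.1))
```

Because the axis is added after the plot region is set up, the reset of par("yaxp") by barplot() no longer matters.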
[R] offset in coxph
The offset argument used in glm and other functions seems to have been removed from the argument list for coxph. I am wondering if there is a reason for this and if there is a possible work-around in order to produce a cox-ph object without fitting coefficients? Thanks, Mike
Re: [R] R CMD check sh: line 1: make: command not found
On Thu, 26 Jul 2007, David Peltier wrote:
> hello, I am using R 2.5.0 under OS X. I am having sh: line 1: make: command not found error message when I run R CMD check : Any help would be appreciated.

Well, that is easy: 'make' is missing. It should be there in the OS, so you need to talk to your OS support for help in finding/installing it. BTW, the list for MacOS-specific questions is r-sig-mac.

> R CMD check backtest
> * checking for working latex ... OK
> * using log directory '/backtest/trunk/backtest.Rcheck'
> * using R version 2.5.0 (2007-04-23)
> * checking for file 'backtest/DESCRIPTION' ... OK
> * checking extension type ... Package
> * this is package 'backtest' version '0.2-0'
> * checking package dependencies ... OK
> * checking if this is a source package ... OK
> * checking whether package 'backtest' can be installed ... OK
> * checking package directory ... OK
> * checking for portable file names ... OK
> * checking for sufficient/correct file permissions ... OK
> * checking DESCRIPTION meta-information ... OK
> * checking top-level files ... OK
> * checking index information ... OK
> * checking package subdirectories ... OK
> * checking R files for non-ASCII characters ... OK
> * checking R files for syntax errors ... OK
> * checking whether the package can be loaded ... OK
> * checking whether the package can be loaded with stated dependencies ... OK
> * checking whether the name space can be loaded with stated dependencies ... OK
> * checking for unstated dependencies in R code ... OK
> * checking S3 generic/method consistency ... OK
> * checking replacement functions ... OK
> * checking foreign function calls ... OK
> * checking R code for possible problems ... OK
> * checking Rd files ... OK
> * checking Rd cross-references ... OK
> * checking for missing documentation entries ... OK
> * checking for code/documentation mismatches ... OK
> * checking Rd \usage sections ... OK
> * checking data for non-ASCII characters ... OK
> * creating backtest-Ex.R ... OK
> * checking examples ... OK
> * checking tests ...
> sh: line 1: make: command not found ERROR

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
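A quick way to confirm the diagnosis from inside R is to ask which 'make', if any, is on the PATH that R sees. This is a minimal sketch using base R's Sys.which(); the variable names are only for illustration.

```r
# Look up 'make' on the PATH of the current R session
make_path <- unname(Sys.which("make"))
has_make  <- nzchar(make_path)   # TRUE if an executable was found

if (has_make) {
  cat("make found at:", make_path, "\n")
} else {
  cat("make not found on PATH; steps of R CMD check that call make will fail\n")
}
```

If this reports that make is missing, the fix is on the operating-system side, as the reply says, not in R.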
Re: [R] Convert string to list?
Is this what you want:

  str <- "P = 0.0, T = 0.0, Q = 0.0"
  x <- eval(parse(text=paste('list(', str, ')')))
  str(x)
  List of 3
   $ P: num 0
   $ T: num 0
   $ Q: num 0

On 7/26/07, Manuel Morales [EMAIL PROTECTED] wrote:
> Let's say I have the following string:
>
>   str <- "P = 0.0, T = 0.0, Q = 0.0"
>
> I'd like to find a function that generates the following object from 'str'.
>
>   list(P = 0.0, T = 0.0, Q = 0.0)
>
> Thanks! -- http://mutualism.williams.edu

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] Convert string to list?
Manuel, Jim's may be what you want. Do you want a list of numerics with names P, T and Q, or a list of character strings?

  str <- "P = 0.0, T = 0.0, Q = 0.0"
  str(as.vector(unlist(strsplit(str, ",")), mode="list"))
  List of 3
   $ : chr "P = 0.0"
   $ : chr " T = 0.0"
   $ : chr " Q = 0.0"

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of jim holtman
Sent: Friday, 27 July 2007 11:20 AM
To: Manuel Morales
Cc: r-help
Subject: Re: [R] Convert string to list?

> Is this what you want:
>
>   str <- "P = 0.0, T = 0.0, Q = 0.0"
>   x <- eval(parse(text=paste('list(', str, ')')))
>   str(x)
>   List of 3
>    $ P: num 0
>    $ T: num 0
>    $ Q: num 0
>
> On 7/26/07, Manuel Morales [EMAIL PROTECTED] wrote:
>> Let's say I have the following string:
>>   str <- "P = 0.0, T = 0.0, Q = 0.0"
>> I'd like to find a function that generates the following object from 'str'.
>>   list(P = 0.0, T = 0.0, Q = 0.0)
>> Thanks! -- http://mutualism.williams.edu
>
> -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
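For completeness, the two replies can be combined: the string can be split as in the strsplit() answer and still end up as a named numeric list like the one eval(parse()) produces, without evaluating the text as code. This is a sketch that assumes every field has the form name = number; the variable names (str_in, keys, vals) are mine, not from the thread.

```r
str_in <- "P = 0.0, T = 0.0, Q = 0.0"

# Split into "name = value" fields, then split each field at "="
fields <- strsplit(strsplit(str_in, ",")[[1]], "=")

# First piece of each field is the name, second piece is the value
keys <- trimws(sapply(fields, `[`, 1))
vals <- as.numeric(sapply(fields, `[`, 2))

# Named list of numerics, same shape as list(P = 0.0, T = 0.0, Q = 0.0)
x <- setNames(as.list(vals), keys)
str(x)
```

The trade-off is that this handles only plain numbers, whereas eval(parse()) accepts any R expression (which is also why it should not be fed untrusted text).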
Re: [R] Minitab Parametric Distribution Analysis in R
After a bit of coaching I found what I was looking for: the fitdistr() function in the MASS package. It appears to be a bit easier to use than mle() for my application. Thanks all. Tom

-----Original Message-----
From: Thomas Lumley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 25, 2007 12:03 PM
To: Tom La Bone
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Minitab Parametric Distribution Analysis in R

> The survival package (survreg() function) will fit quite a few parametric models under censoring. If you aren't doing regression, but just one-sample fitting, you can feed the appropriate censored or truncated likelihood to mle() in the stats4 package. Both packages should be part of your R distribution. -thomas
>
> On Wed, 25 Jul 2007, Tom La Bone wrote:
>> Minitab can perform a Parametric Distribution Analysis - Arbitrary Censoring with one of eight distributions (e.g., weibull), giving the maximum likelihood estimates of the parameters in the distribution for a given dataset. Does R have a package that provides equivalent functionality? Thanks for any advice you can offer. Tom La Bone
>
> Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
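As an illustration of the survreg() route mentioned in the quoted reply, here is a minimal sketch of a one-sample Weibull fit under arbitrary (interval and left) censoring. The data are simulated and the inspection scheme (failure times only known to the nearest unit) is an assumption made for the example, not Tom's dataset.

```r
library(survival)

set.seed(1)
t_true <- rweibull(200, shape = 2, scale = 10)   # unobserved exact times

# Only the unit interval containing each time is observed
left  <- floor(t_true)
right <- ceiling(t_true)
left[left == 0] <- NA   # with type = "interval2", NA on the left means left-censored

# Intercept-only model: a one-sample maximum likelihood fit
fit <- survreg(Surv(left, right, type = "interval2") ~ 1, dist = "weibull")

# Convert survreg's parametrisation to the usual Weibull shape and scale
shape_hat <- 1 / fit$scale
scale_hat <- exp(unname(coef(fit)))
c(shape = shape_hat, scale = scale_hat)
```

fitdistr() in MASS, which Tom settled on, is simpler when all observations are exact; the Surv() object is what lets survreg() take the censoring into account.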
[R] Convert string to list?
Let's say I have the following string:

  str <- "P = 0.0, T = 0.0, Q = 0.0"

I'd like to find a function that generates the following object from 'str'.

  list(P = 0.0, T = 0.0, Q = 0.0)

Thanks! -- http://mutualism.williams.edu
[R] Survival analysis with 60% random censoring
Hello, My study is to predict the likelihood that an insurance policy holder will not renew his policy at the coming expiration date. My data has about 60% censoring and the censoring is random, because customers buy insurance at different times; however, the study has to be terminated on a single date. Any suggestion or reference is greatly appreciated. Thanks in advance. Best Regards Zhongmiao Wang
Re: [R] offset in coxph
Removed? That it was ever there is not my recollection and seems very unlikely given that survival is ported from S where glm() does not have it. As far as I know it has only ever been in glm() and lm() in R: the way which is described in the White Book is to use the offset() function, and this is preferred (it works correctly for prediction, for example). The function form is supported for coxph, and used in the test suite. Please forget you ever knew 'offset' could be an argument, and use offset() in your formulae instead.

On Thu, 26 Jul 2007, Michael Gormley wrote:
> The offset argument used in glm and other functions seems to have been removed from the argument list for coxph. I am wondering if there is a reason for this and if there is a possible work-around in order to produce a cox-ph object without fitting coefficients?

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
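A minimal sketch of the formula form recommended here, using the lung data that ships with the survival package. The fixed coefficient 0.02 for age is a made-up value for illustration, not something from the thread.

```r
library(survival)

# Hypothetical coefficient for age that we want to hold fixed, not estimate
fixed_beta_age <- 0.02

# offset() inside the formula adds fixed_beta_age * age to the linear
# predictor with its coefficient held at 1; only 'sex' is estimated
fit <- coxph(Surv(time, status) ~ sex + offset(fixed_beta_age * age),
             data = lung)

coef(fit)   # a single estimated coefficient, for sex
```

Because the offset is part of the formula, it travels with the model when predict() is called on new data, which is the advantage the reply points to.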
[R] Creating a cross table out of a large dataset
Dear all, I want to make a cross table out of a data set which is 2 columns wide and more than 15 rows long. When I use the table() function I get an error message. This is the code I have used:

  Dataset <- read.table("test.txt", header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
  .T <- table(Dataset$K1, Dataset$K2)

This is the error message I have received:

  Error in vector("integer", length) : vector size specified is too large
  In addition: Warning messages:
  1: NAs introduced by coercion
  2: NAs introduced by coercion

Is it possible to make a cross table with the table() function on a large dataset or should I consider using another function? I have had a look at the ?table help file but I could not find any information on the size of the dataset. Thanks very much in advance for any help :-) Kind regards, Céline.

-- View this message in context: http://www.nabble.com/Creating-a-cross-table-out-of-a-large-dataset-tf4153948.html#a11818590 Sent from the R help mailing list archive at Nabble.com.
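The thread gives no answer, so the following is only a hedged suggestion. table() allocates one cell for every combination of the levels of K1 and K2, so two columns with many distinct values can ask for far more cells than there are rows. One possible alternative is to count only the combinations that actually occur; the toy data frame below stands in for Céline's file, and the helper column n is an illustration name.

```r
# Toy stand-in for the real data (column names K1, K2 as in the post)
Dataset <- data.frame(K1 = c("a", "b", "a", "c"),
                      K2 = c("x", "y", "x", "x"))

# Check how large the full cross table would be before building it
n_cells <- length(unique(Dataset$K1)) * length(unique(Dataset$K2))

# Count only the observed (K1, K2) pairs, in long format
Dataset$n <- 1L
pair_counts <- aggregate(n ~ K1 + K2, data = Dataset, FUN = sum)
pair_counts
```

The result has one row per observed pair rather than one cell per possible pair.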
[R] R CMD check sh: line 1: make: command not found
hello, I am using R 2.5.0 under OS X. I am having sh: line 1: make: command not found error message when I run R CMD check : Any help would be appreciated. R CMD check backtest * checking for working latex ... OK * using log directory '/backtest/trunk/backtest.Rcheck' * using R version 2.5.0 (2007-04-23) * checking for file 'backtest/DESCRIPTION' ... OK * checking extension type ... Package * this is package 'backtest' version '0.2-0' * checking package dependencies ... OK * checking if this is a source package ... OK * checking whether package 'backtest' can be installed ... OK * checking package directory ... OK * checking for portable file names ... OK * checking for sufficient/correct file permissions ... OK * checking DESCRIPTION meta-information ... OK * checking top-level files ... OK * checking index information ... OK * checking package subdirectories ... OK * checking R files for non-ASCII characters ... OK * checking R files for syntax errors ... OK * checking whether the package can be loaded ... OK * checking whether the package can be loaded with stated dependencies ... OK * checking whether the name space can be loaded with stated dependencies ... OK * checking for unstated dependencies in R code ... OK * checking S3 generic/method consistency ... OK * checking replacement functions ... OK * checking foreign function calls ... OK * checking R code for possible problems ... OK * checking Rd files ... OK * checking Rd cross-references ... OK * checking for missing documentation entries ... OK * checking for code/documentation mismatches ... OK * checking Rd \usage sections ... OK * checking data for non-ASCII characters ... OK * creating backtest-Ex.R ... OK * checking examples ... OK * checking tests ... 
sh: line 1: make: command not found ERROR
Re: [R] Function to separate effect in AOV
You may want to look at the interaction function (a quick way to make the single factor with 4 levels that you mention). You can create your own sets of contrasts and set them using the C or contrasts functions, then use the split argument to summary.aov to look at the individual degrees of freedom. You may also be interested in the multcomp package for looking at the comparisons. Hope this helps,

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ronaldo Reis Junior
Sent: Monday, July 23, 2007 4:05 PM
To: R-Help
Subject: [R] Function to separate effect in AOV

> Hi, I have a dummy question. Suppose that I have two explanatory variables, T1 (A, B) and T2 (C, D), and one response variable.
>
>   attach(dados)
>   tapply(Y,list(T1,T2),mean)
>        C        D
>   A 2.20 10.2
>   B 2.22 20.26667
>
> In this case, A and B inside C have no difference, but have differences inside D. I make this model:
>
>   m <- aov(Y~T1*T2)
>   summary(m)
>               Df Sum Sq Mean Sq F value    Pr(>F)
>   T1           1  76.36   76.36  5617.9 1.119e-12 ***
>   T2           1 508.69  508.69 37426.7 5.704e-16 ***
>   T1:T2        1  75.65   75.65  5566.0 1.161e-12 ***
>   Residuals    8   0.11    0.01
>   ---
>   Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> This result doesn't show the reality of the data, because I can't see that A and B inside C are the same. The anova result is the same as for fully different levels, like this:
>
>   attach(dados2)
>   tapply(Y,list(T1,T2),mean)
>        C        D
>   A 6.10 10.2
>   B 2.22 20.26667
>   m <- aov(Y~T1*T2)
>   summary(m)
>               Df Sum Sq Mean Sq F value    Pr(>F)
>   T1           1  28.74   28.74  2114.3 5.529e-11 ***
>   T2           1 367.75  367.75 27056.7 2.088e-15 ***
>   T1:T2        1 145.81  145.81 10728.1 8.433e-14 ***
>   Residuals    8   0.11    0.01
>   ---
>   Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> In this case all levels are different, C to D and A to B. The question is: is the only way to find this real difference to
>
> 1) make T1 and T2 into one Treatment variable with 4 levels (AC,BC,AD,BD)?
> or
> 2) make 3 anovas: a) Anova (A,B) inside C, b) Anova (A,B) inside D, c) full factorial Anova (like the one in this e-mail)?
> or
> 3) is there any other way to do this in only one analysis, to find all differences and interactions? In other words, to find differences in A and B inside C, A and B inside D, C and D inside A, and C and D inside B.
>
> Thanks Ronaldo
>
> -- Prof. Ronaldo Reis Júnior, UNIMONTES/Depto. Biologia Geral/Lab. de Ecologia, Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia, CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil | Fone: (38) 3229-8187 | [EMAIL PROTECTED] | [EMAIL PROTECTED] http://www.ppgcb.unimontes.br/ | ICQ#: 5692561 | LinuxUser#: 205366
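Greg's first suggestion (a single 4-level factor built with interaction()) can be sketched as follows. The data are simulated to resemble the cell means in Ronaldo's first table; the sample size, the noise level and the use of TukeyHSD() for the pairwise comparisons are my assumptions, with multcomp being the alternative Greg mentions.

```r
set.seed(42)

# Simulated stand-in for 'dados': 3 replicates per cell, means roughly as posted
dados <- expand.grid(rep = 1:3, T1 = c("A", "B"), T2 = c("C", "D"))
dados$T1 <- factor(dados$T1)
dados$T2 <- factor(dados$T2)

# One factor with the 4 treatment combinations: A.C, B.C, A.D, B.D
dados$trt <- interaction(dados$T1, dados$T2)

cell_means <- c(A.C = 2.2, B.C = 2.2, A.D = 10.2, B.D = 20.3)
dados$Y <- unname(cell_means[as.character(dados$trt)]) +
  rnorm(nrow(dados), sd = 0.1)

# One-way anova on the combined factor, then all pairwise comparisons
m  <- aov(Y ~ trt, data = dados)
tk <- TukeyHSD(m)$trt
tk[c("B.C-A.C", "B.D-A.D"), ]   # A vs B inside C, and A vs B inside D
```

This answers the "A and B inside C" and "A and B inside D" questions in a single analysis, which is option 1 in the original post.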
[R] Create Strings of Column Id's
Does anyone know how this is done? I have a large matrix where I extract specific columns into txt files for further use. To be able to keep track of which txt files contain which columns I want to name the filenames with the column Id's. The most basic example would be to use a for() loop together with paste(), but the result is blank. Not even NULL. This is the concept of the code I use, for example:

  MyMatrix <- matrix(NA, ncol=4, nrow=1, dimnames=list(NULL, c("E","R","T","Y")))
  COL <- c(1,3) # a vector of columns I want to extract
  Filename <- NULL # the starting variable, so I can use paste
  Filename <- for(i in colnames(MyMatrix)[COL]) {paste(Filename, "-", i, sep="")}

The result is -T, but I want it to be -E-T. Anyone have a clue? Thanks Tom

-- View this message in context: http://www.nabble.com/Create-Strings-of-Column-Id%27s-tf4153354.html#a11816439 Sent from the R help mailing list archive at Nabble.com.
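No reply is included here, so this is only a sketch of one way to get the -E-T string. The posted loop assigns the value of for() itself to Filename instead of updating Filename inside the loop; paste() with collapse avoids the loop altogether. Filename2 is an illustration name.

```r
MyMatrix <- matrix(NA, ncol = 4, nrow = 1,
                   dimnames = list(NULL, c("E", "R", "T", "Y")))
COL <- c(1, 3)   # columns to extract

# Vectorised: prefix each selected column name with "-" and glue them together
Filename <- paste("-", colnames(MyMatrix)[COL], sep = "", collapse = "")

# Loop version: the assignment has to happen inside the loop body
Filename2 <- ""
for (i in colnames(MyMatrix)[COL]) {
  Filename2 <- paste(Filename2, "-", i, sep = "")
}

Filename
Filename2
```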
[R] Large dataset + randomForest
[Please CC me in any replies as I am not currently subscribed to the list. Thanks!]

Dear all, I did a bit of searching on the question of large datasets but did not come to a definite conclusion. What I am trying to do is the following: I want to read in a dataset with approx. 100 000 rows and approx. 150 columns. The file size is ~ 33MB, which one would deem not too big a file for R. To speed up the reading in of the file I do not use read.table but a loop that does reading with scan() into a buffer and some preprocessing and then adds the data into a dataframe. When I then want to run randomForest() R complains that I cannot allocate a vector of size 313.0 MB. I am aware that randomForest needs all data in memory, but

1) why should that suddenly be 10 times the size of the data (I acknowledge the need for some internal data of R, but 10 times seems a bit too much), and
2) there is still physical memory free on the machine (in total 4GB available, even though R is limited to 2GB if I correctly remember the help pages - still 2GB should be enough!) - it doesn't seem to work either with changed settings done via mem.limits(), or run-time arguments --min-vsize --max-vsize - what do these have to be set to to work in my case??

  rf <- randomForest(V1 ~ ., data=df[trainindices,], do.trace=5)
  Error: cannot allocate vector of size 313.0 Mb
  object.size(df)/1024/1024
  [1] 129.5390

Any help would be greatly appreciated, Florian

-- Florian Nigsch [EMAIL PROTECTED] Unilever Centre for Molecular Sciences Informatics Department of Chemistry University of Cambridge http://www-mitchell.ch.cam.ac.uk/ Telephone: +44 (0)1223 763 073
[R] zeroinfl() or zicounts() error
I'm trying to fit a zero-inflated poisson model using zeroinfl() from the pscl library. It works fine for most models I try, but when I include either of 2 covariates, I get an error.

When I include PopulationDensity, I get this error:

  Error in solve.default (as.matrix(fit$hessian)) : system is computationally singular: reciprocal condition number = 1.91306e-34

When I include BuildingArea, I get this error:

  Error in optim(fn = loglikfun, par = c(start$count, start$zero, if (dist == : non-finite finite-difference value [2]

I tried fitting the models using zicounts in the zicounts library as well and had the same difficulty. When I include PopulationDensity, it runs, but outputs only the parameter estimates, not the standard errors or p-values (those have NaN). When I include BuildingArea, I get this error:

  Error in solve.default(z0$hessian) : system is computationally singular: reciprocal condition number = 2.58349e-25

Can anyone suggest what it is about these 2 covariates that might be causing the problem? I don't see any obvious problems with them. They are both nonnegative with smooth probability distributions and no missing (NA) values. The dataset has 3211 observations. It doesn't matter if there are other covariates in the models or not. If one of these is included, I get the errors. Thanks!
Re: [R] Redirecting print output
You may want to look at the R2HTML package as one approach (others have already told you about sink and cat). Another approach is to use the variations on sweave. Here you set up a template file with the code you want run as well as any explanatory text (you can even write an entire report), then process this with sweave and the output will be included. The original sweave works with LaTeX, there is an HTML driver for sweave in the R2HTML package (so the source and final documents are html) and there is an odfWeave package that lets you create the template and output in a word processor (uses the openoffice word processor, but since you can convert from and to Msword from there, this is not much of a limitation). Hope this helps,

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Stan Hopkins
Sent: Monday, July 23, 2007 9:35 PM
To: R help
Subject: [R] Redirecting print output

> I see a rich set of graphic device functions to redirect that output. Are there commands to redirect text as well? I have a set of functions that execute many linear regression tests serially and I want to capture this in a file for printing. Thanks, Stan Hopkins
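For the sink and cat route mentioned in passing above, a minimal base-R sketch might look like this. The regression on the built-in cars data and the temporary file are placeholders for Stan's own models and output file.

```r
out_file <- tempfile(fileext = ".txt")   # placeholder for a real file name

fit <- lm(dist ~ speed, data = cars)     # stand-in for one of the regressions

# sink() diverts everything that would be printed to the console
sink(out_file)
print(summary(fit))
sink()                                   # restore console output

# cat() can append individual lines to the same file
cat("Slope for speed:", coef(fit)[["speed"]], "\n",
    file = out_file, append = TRUE)

# capture.output() returns printed output as a character vector instead
anova_lines <- capture.output(print(anova(fit)))

readLines(out_file)
```

Wrapping the sink()/sink() pair around a loop over models would collect all the regression summaries in one file.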
Re: [R] multiple graphs
One of the nice things about the R Graph Gallery is that if you click on the R logo underneath the graph (may need to scroll down a bit) it will show you the code used to create that particular graph. You may also want to look at the subplot function in the TeachingDemos package for another way to add histograms to a plot. Here is one possible example of this:

  x <- rep(1:10, each=25)
  y <- rexp(250, 1/x)
  library(TeachingDemos)
  tmp1 <- hist(y, plot=FALSE)
  r <- range(tmp1$breaks)
  w <- diff(tmp1$breaks)
  plot(x,y, type='n', xlim=c(0.5,10.5), ylim=r)
  for(i in 1:10){
    tmp2 <- hist( y[x==i], breaks=tmp1$breaks, plot=FALSE )
    subplot( barplot(tmp2$counts, ylim=r, width=w, horiz=TRUE, space=0, xaxt='n', yaxs='i'), c(i-0.45, i+.45), r )
  }
  points(x,y) # just to compare

Hope this helps,

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daniele Amberti
Sent: Thursday, July 26, 2007 1:26 AM
To: r-help
Subject: [R] multiple graphs

> Does anyone have a simple explanation and example on how to add histograms or barcharts to another graph like in the example at the R-graph gallery: http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=109 Looking at the code I do not understand very well how to add graphs in arbitrary/clever positions with an adequate scale. If somebody has a simpler example with explanations it will be highly appreciated. Best Daniele
Re: [R] Help with Dates
Yes, I was using 1.05. I get the same result as you with 1.08.

On 26 Jul 2007 11:39:41 -0400, Jeffrey J. Hallman [EMAIL PROTECTED] wrote:
> Are you using the latest version of fame? 1.05 and earlier had a bug in tisFromCsv that was fixed in 1.08. Below I show what I get with fame version 1.08. There is still a problem in that the frequency-figuring logic appears to think the frequency is bwsunday (biweekly with weeks ending on Sunday) rather than semimonthly, which would appear to be a better fit. That's why the 19860330 observation is getting filled in with NA's.
>
> Jeff
>
>   Lines <- "Date Price Open.Int. Comm.Long Comm.Short net.comm
>   15-Jan-86 673.25 175645 65910 28425 37485
>   31-Jan-86 677.00 167350 54060 27120 26940
>   14-Feb-86 680.25 157985 37955 25425 12530
>   28-Feb-86 691.75 162775 49760 16030 33730
>   14-Mar-86 706.50 163495 54120 27995 26125
>   31-Mar-86 709.75 164120 54715 30390 24325
>   "
>   boink <- tisFromCsv(textConnection(Lines), dateFormat = "%d-%b-%y", dateCol = "Date", sep = "")
>   boink
>   $Price
>              [,1]
>   19860119 673.25
>   19860202 677.00
>   19860216 680.25
>   19860302 691.75
>   19860316 706.50
>   19860330     NA
>   19860413 709.75
>   class: tis
>   $Open.Int.
>              [,1]
>   19860119 175645
>   19860202 167350
>   19860216 157985
>   19860302 162775
>   19860316 163495
>   19860330     NA
>   19860413 164120
>   class: tis
>   $Comm.Long
>             [,1]
>   19860119 65910
>   19860202 54060
>   19860216 37955
>   19860302 49760
>   19860316 54120
>   19860330    NA
>   19860413 54715
>   class: tis
>   $Comm.Short
>             [,1]
>   19860119 28425
>   19860202 27120
>   19860216 25425
>   19860302 16030
>   19860316 27995
>   19860330    NA
>   19860413 30390
>   class: tis
>   $net.comm
>             [,1]
>   19860119 37485
>   19860202 26940
>   19860216 12530
>   19860302 33730
>   19860316 26125
>   19860330    NA
>   19860413 24325
>   class: tis
>
> Gabor Grothendieck [EMAIL PROTECTED] writes:
>> On 26 Jul 2007 09:59:31 -0400, Jeffrey J. Hallman [EMAIL PROTECTED] wrote:
>>> zoo is nice. 'tisFromCsv()' in the fame package is nicer. Jeff
>>
>> 1. What am I doing wrong here? I only get one data column.
>> 2. I assume the regularized dates which do not exactly match the input ones are intended so as to make this a regularly spaced series. Is that right?
>> 3. What is the cause of the warning message?
>> 4. Why is a list returned with a single component containing the output?
>> Thanks.
>>
>>   library(fame)
>>   Lines <- "Date Price Open.Int. Comm.Long Comm.Short net.comm
>>   15-Jan-86 673.25 175645 65910 28425 37485
>>   31-Jan-86 677.00 167350 54060 27120 26940
>>   14-Feb-86 680.25 157985 37955 25425 12530
>>   28-Feb-86 691.75 162775 49760 16030 33730
>>   14-Mar-86 706.50 163495 54120 27995 26125
>>   31-Mar-86 709.75 164120 54715 30390 24325
>>   "
>>   tisFromCsv(textConnection(Lines), dateFormat = "%d-%b-%y", dateCol = "Date", sep = "")
>>   [[1]]
>>              [,1]
>>   19860119 673.25
>>   19860202 677.00
>>   19860216 680.25
>>   19860302 691.75
>>   19860316 706.50
>>   19860330 709.75
>>   class: tis
>>   Warning message: number of items to replace is not a multiple of replacement length in: x[i] <- value
>
> -- Jeff
Re: [R] significance test for difference of two correlations
Let r_1 be the correlation between the two variables for the first group with n_1 subjects and let r_2 be the correlation for the second group with n_2 subjects. Then a simple way to test H0: rho_1 = rho_2 is to convert r_1 and r_2 via Fisher's variance stabilizing transformation

  z = 1/2 * ln[ (1+r)/(1-r) ]

and then calculate:

  (z_1 - z_2) / sqrt( 1/(n_1 - 3) + 1/(n_2 - 3) )

which is (approximately) N(0,1) under H0. So, using alpha = .05, you can reject H0 if the absolute value of the test statistic above is larger than 1.96.

-- Wolfgang Viechtbauer Department of Methodology and Statistics University of Maastricht, The Netherlands http://www.wvbauer.com/

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Timo Stolz
Sent: Thursday, July 26, 2007 16:13
To: r-help@stat.math.ethz.ch
Subject: [R] significance test for difference of two correlations

> Dear R users, how can I test whether two correlations differ significantly? (I want to prove that variables are correlated differently, depending on the group a person is in.) Greetings from Freiburg im Breisgau (Germany), Timo Stolz
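The test described above is short enough to write as a small R function. This is a sketch: the function name compare_cor and the example correlations and group sizes are made up for illustration. atanh(r) is the same quantity as 1/2 * ln[(1+r)/(1-r)].

```r
# Two-sided z test of H0: rho_1 = rho_2 for correlations from independent groups
compare_cor <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                       # Fisher transformation of r_1
  z2 <- atanh(r2)                       # Fisher transformation of r_2
  stat <- (z1 - z2) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  p <- 2 * pnorm(-abs(stat))            # two-sided p-value under N(0,1)
  c(z = stat, p.value = p)
}

# Made-up example: r = 0.5 in one group, r = 0.3 in the other, 103 subjects each
res <- compare_cor(0.5, 103, 0.3, 103)
res
```

With these illustrative inputs the statistic is about 1.70, below 1.96, so H0 would not be rejected at alpha = .05.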
[R] error in using R2WinBUGS on Ubuntu 6.10 Linux
I am trying to run WinBUGS 1.4 from the Ubuntu 6.10 Linux distribution. I am using the R2WinBUGS package with the source file listed below. WinBUGS appears to run properly, but I get the following message after WinBUGS starts in WINE. Does anyone know what may be causing this error and what the correction may be? Thanks

ERROR MESSAGE:

  fixme:ole:GetHGlobalFromILockBytes cbSize is 13824
  err:ole:CoGetClassObject class {0003000a---c000-0046} not registered
  err:ole:CoGetClassObject class {0003000a---c000-0046} not registered
  err:ole:CoGetClassObject no class object {0003000a---c000-0046} could be created for context 0x3
  fixme:keyboard:RegisterHotKey (0x10032,13,0x0002,3): stub
  fixme:ntdll:RtlNtStatusToDosErrorNoTeb no mapping for 800a
  err:ole:local_server_thread Failure during ConnectNamedPipe 317

R SOURCE FILE:

  rm(list=ls(all=TRUE))
  library(R2WinBUGS)
  inits <- function(){ list(alpha0 = 0, alpha1 = 0, alpha2 = 0, alpha12 = 0, sigma = 1) }
  data <- list(r = c(10, 23, 23, 26, 17, 5, 53, 55, 32, 46, 10, 8, 10, 8, 23, 0, 3, 22, 15, 32, 3),
               n = c(39, 62, 81, 51, 39, 6, 74, 72, 51, 79, 13, 16, 30, 28, 45, 4, 12, 41, 30, 51, 7),
               x1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
               x2 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
               N = 21)
  test <- bugs(data, inits, model.file="/home/meyerjp/rasch/test.bug",
               parameters=c("alpha0","alpha1","alpha12","alpha2","sigma"),
               n.chains=2, n.iter=1, n.burnin=1000,
               bugs.directory="/home/meyerjp/.wine/drive_c/Program Files/WinBUGS14/",
               working.directory="/home/meyerjp/rasch/working",
               debug=FALSE, WINEPATH="/usr/bin/winepath", newWINE=TRUE)
[R] significance test for difference of two correlations
Dear R users, how can I test whether two correlations differ significantly? (I want to prove that variables are correlated differently, depending on the group a person is in.) Greetings from Freiburg im Breisgau (Germany), Timo Stolz
[R] lmer and scale parameters....
I'm using lmer to fit mixed-effect logistic regression models. This is for a small data set. First, I fit a constant:

Generalized linear mixed model fit using Laplace
Formula: propm ~ (1 | study)
Data: inducedSR71507.dat
Family: binomial(logit link)
   AIC   BIC logLik deviance
 183.7 189.4 -89.84    179.7
Random effects:
 Groups Name        Variance Std.Dev.
 study  (Intercept) 0.035812 0.18924
number of obs: 127, groups: study, 21

Estimated scale (compare to 1 ) 1.028571

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.11256    0.04979   2.261   0.0238 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So far, so good. Next, I fit a model with a fixed effect:

Generalized linear mixed model fit using Laplace
Formula: propm ~ 1 + c.age + (1 | study)
Data: inducedSR71507.dat
Family: binomial(logit link)
  AIC  BIC logLik deviance
 5339 5348  -2667     5333
Random effects:
 Groups Name        Variance Std.Dev.
 study  (Intercept) 0.44094  0.66404
number of obs: 127, groups: study, 21

Estimated scale (compare to 1 ) 314587114

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.058093   1.033273 0.05622    0.955
c.age       0.007262   0.095393 0.07613    0.939

That is one heck of a large scale parameter! I would be glad to be shown what I am doing wrong, but I am thinking that this is a bug. study is entered as a factor in the data frame. Here is the session info:

sessionInfo()
R version 2.5.1 (2007-06-27)
i386-apple-darwin8.9.1
locale: en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
    mlmRev       lme4       MASS     Matrix    lattice       nlme
   0.995-1  0.99875-4     7.2-34 0.999375-0    0.15-11     3.1-83

Any and all help is very much appreciated!
-- Steven Orzack The Fresh Pond Research Institute 173 Harvey Street Cambridge, MA 02140 617 864-4307 www.freshpond.org
Re: [R] Help with Dates
On 26 Jul 2007 09:59:31 -0400, Jeffrey J. Hallman [EMAIL PROTECTED] wrote:
zoo is nice. 'tisFromCsv()' in the fame package is nicer. Jeff

1. What am I doing wrong here? I only get one data column.
2. I assume the regularized dates which do not exactly match the input ones are intended so as to make this a regularly spaced series. Is that right?
3. What is the cause of the warning message?
4. Why is a list returned with a single component containing the output?
Thanks.

library(fame)
Lines <- "Date Price Open.Int. Comm.Long Comm.Short net.comm
15-Jan-86 673.25 175645 65910 28425 37485
31-Jan-86 677.00 167350 54060 27120 26940
14-Feb-86 680.25 157985 37955 25425 12530
28-Feb-86 691.75 162775 49760 16030 33730
14-Mar-86 706.50 163495 54120 27995 26125
31-Mar-86 709.75 164120 54715 30390 24325
"
tisFromCsv(textConnection(Lines), dateFormat = "%d-%b-%y", dateCol = "Date", sep = "")

[[1]]
           [,1]
19860119 673.25
19860202 677.00
19860216 680.25
19860302 691.75
19860316 706.50
19860330 709.75
class: tis

Warning message: number of items to replace is not a multiple of replacement length in: x[i] <- value
Re: [R] substituting dots in the names of the columns (sub, gsub, regexpr)
Hi, A dot in a regular expression matches any character, so you have to escape each dot with backslash \\ (which itself is escaped in the string, to confuse things...). A plus symbol will match one or more of the preceding characters. A dollar symbol will match the end of a string. So: gsub(\\.$, , gsub(\\.+, ., str)) [1] y.mBD.g.cm3 PR.Mpa Ks.m.s SP.g.g P.m3.m3theta1.g.g [8] theta2.g.g AWC.g.g Learn more at ?regexp Felix On 7/26/07, 8rino-Luca Pantani [EMAIL PROTECTED] wrote: Dear R users, I have the following two problems, related to the function sub, grep, regexpr and similia. The header of the file(s) I have to import is like this. c(y (m), BD (g/cm3), PR (Mpa), Ks (m/s), SP g./g., P (m3/m3), theta1 (g/g), theta2 (g/g), AWC (g/g)) To get rid of spaces and symbols in the names of the columns, I use read.table(... check.names=TRUE) and I get: str - c(y..m., BD..g.cm3., PR..Mpa., Ks..m.s., SP.g..g., P..m3.m3., theta1..g.g., theta2..g.g., AWC..g.g.) Now, my problem is to remove the trailing dots, as well as the double dots, in order to get the names like the following c(y.m, BD.g.cm3, PR.Mpa, Ks.m.s, SP.g.g, P.m3.m3., theta1.g.g, theta2.g.g, AWC.g.g) I've searched the help pages for sub, regexpr and similia, and also searched the help archives. I understand that the dot is a peculiar sign since sub(.., ., str) [1] ..m....g.cm3. ...Mpa. ...m.s. ..g..g. [6] ..m3.m3..eta1..g.g. .eta2..g.g. .C..g.g. Therefore I tried sub(\\.., ., str) [1] y.m.BD.g.cm3. PR.Mpa. Ks.m.s. SP...g. [6] P.m3.m3.theta1.g.g. theta2.g.g. AWC.g.g. and I've been surprised by the (to me) strange behaviour in SP.g..g. modified in SP...g. An this is the first problem I cannot solve. Then there's the problem of trailing dot removal. 
In http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8665.html I've found a somewhat similar problem, but it do not works in this case since: gsub([.].*, , str) [1] y BD PR Ks SP P theta1 theta2 [9] AWC And this the second problem Apart this particular problems I would like to know more on regexp, sub and so on, since each time I have strings to manipulate, I must face my ignorance in the topic of regular expression and its syntax. Is there any page with examples, where I can improve my knowledge and stop being frustrated each time I have to manipulate strings? 8rino -- Ottorino-Luca Pantani, Università di Firenze Dip. Scienza del Suolo e Nutrizione della Pianta P.zle Cascine 28 50144 Firenze Italia Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Felix Andrews / 安福立 PhD candidate Integrated Catchment Assessment and Management Centre The Fenner School of Environment and Society The Australian National University (Building 48A), ACT 0200 Beijing Bag, Locked Bag 40, Kingston ACT 2604 http://www.neurofractal.org/felix/ voice:+86_1051404394 (in China) mobile:+86_13522529265 (in China) mobile:+61_410400963 (in Australia) xmpp:[EMAIL PROTECTED] 3358 543D AAC6 22C2 D336 80D9 360B 72DD 3E4C F5D8 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
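For anyone replaying this thread, here is the two-step substitution from the reply written out with the quoting that the archive stripped. The vector name `nms` is my own choice (the original post calls it `str`, which shadows a base function). As a side note on the first puzzle: `sub("\\..", ".", x)` matches a literal dot followed by any one character, and `sub` replaces only the first match, so in "SP.g..g." the ".g" is replaced and the result is "SP...g.".

```r
# Column names as produced by read.table(..., check.names = TRUE)
nms <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
         "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")

collapsed <- gsub("\\.+", ".", nms)    # runs of dots become a single dot
clean <- gsub("\\.$", "", collapsed)   # drop one trailing dot, if present
print(clean)
```

The inner call has to run first: once runs of dots are collapsed, at most one dot can remain at the end, so a single `\\.$` is enough.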
[R] Average plan
Hello, I'm looking for a method to compute an average plane from 4 or 5 points in a Cartesian space. I'm sure it can be done using a least-squares method, but maybe a function already exists in R to get this plane. Can somebody help me solve this problem? (I have been looking on the net for hours but didn't find anything really satisfying.) Thanks
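One possible reading of the question, sketched in R: fit the plane z = a + b*x + c*y by ordinary least squares with `lm()`. The five points below are invented for illustration. Note the assumption that errors are measured along z only; if the plane may be steep, or all three coordinates are equally noisy, an orthogonal fit (for example via `prcomp()`) would be the more appropriate tool.

```r
# Five hypothetical points scattered around the plane z = 1 + 2x - 3y
pts <- data.frame(
  x = c(0, 1, 0, 1, 0.5),
  y = c(0, 0, 1, 1, 0.5),
  z = c(1.02, 2.97, -2.01, 0.03, 0.49)
)

# Least-squares plane: minimises the squared vertical (z) distances
fit <- lm(z ~ x + y, data = pts)
plane <- coef(fit)   # intercept a, slope b for x, slope c for y
print(plane)
```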
Re: [R] Help with Dates
zoo is nice. 'tisFromCsv()' in the fame package is nicer. Jeff
[R] Odp: multiple graphs
Hi

this particular graph is a combination of several approaches. See

layout # how to split the plot window (or ?split)
par(new=TRUE) # how to plot several times to the same window without erasing the previous plot

and of course sophisticated use of all the other stuff which is available in R. See also par(fig=...):

plot(1:10)
par(fig=c(0.1,.5,0.1,.5), new=T)
boxplot(rnorm(10))

Petr

[EMAIL PROTECTED] wrote on 26.07.2007 09:26:16:

Does anyone have a simple explanation and example of how to add histograms or barcharts to another graph, like in the example at the R graph gallery: http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=109 Looking at the code, I do not understand very well how to add graphs in an arbitrary/clever position with an adequate scale. If somebody has a simpler example with explanations it will be highly appreciated. Best Daniele
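A slightly fuller sketch of the par(fig=...) idea from the reply, using simulated data. The inset position and margins are arbitrary choices of mine, not taken from the gallery example; `fig` is given as c(x1, x2, y1, y2) in fractions of the device area.

```r
set.seed(1)
x <- rnorm(200)
y <- x + rnorm(200)

op <- par(no.readonly = TRUE)   # remember settings so they can be restored
plot(x, y, main = "Scatterplot with inset histogram")

# Restrict the figure region to the lower right corner; new = TRUE keeps
# the scatterplot on the device instead of starting a fresh page.
par(fig = c(0.55, 0.95, 0.1, 0.5), new = TRUE, mar = c(2, 2, 1, 1))
hist(x, main = "", xlab = "", ylab = "", col = "grey")

par(op)                         # back to the original settings
```

The inset gets its own axes and scale automatically, because each high-level plot call sets up a new coordinate system inside the current figure region.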
[R] substituting dots in the names of the columns (sub, gsub, regexpr)
Dear R users, I have the following two problems, related to the function sub, grep, regexpr and similia. The header of the file(s) I have to import is like this. c(y (m), BD (g/cm3), PR (Mpa), Ks (m/s), SP g./g., P (m3/m3), theta1 (g/g), theta2 (g/g), AWC (g/g)) To get rid of spaces and symbols in the names of the columns, I use read.table(... check.names=TRUE) and I get: str - c(y..m., BD..g.cm3., PR..Mpa., Ks..m.s., SP.g..g., P..m3.m3., theta1..g.g., theta2..g.g., AWC..g.g.) Now, my problem is to remove the trailing dots, as well as the double dots, in order to get the names like the following c(y.m, BD.g.cm3, PR.Mpa, Ks.m.s, SP.g.g, P.m3.m3., theta1.g.g, theta2.g.g, AWC.g.g) I've searched the help pages for sub, regexpr and similia, and also searched the help archives. I understand that the dot is a peculiar sign since sub(.., ., str) [1] ..m....g.cm3. ...Mpa. ...m.s. ..g..g. [6] ..m3.m3..eta1..g.g. .eta2..g.g. .C..g.g. Therefore I tried sub(\\.., ., str) [1] y.m.BD.g.cm3. PR.Mpa. Ks.m.s. SP...g. [6] P.m3.m3.theta1.g.g. theta2.g.g. AWC.g.g. and I've been surprised by the (to me) strange behaviour in SP.g..g. modified in SP...g. An this is the first problem I cannot solve. Then there's the problem of trailing dot removal. In http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8665.html I've found a somewhat similar problem, but it do not works in this case since: gsub([.].*, , str) [1] y BD PR Ks SP P theta1 theta2 [9] AWC And this the second problem Apart this particular problems I would like to know more on regexp, sub and so on, since each time I have strings to manipulate, I must face my ignorance in the topic of regular expression and its syntax. Is there any page with examples, where I can improve my knowledge and stop being frustrated each time I have to manipulate strings? 8rino -- Ottorino-Luca Pantani, Università di Firenze Dip. 
Scienza del Suolo e Nutrizione della Pianta P.zle Cascine 28 50144 Firenze Italia Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273 [EMAIL PROTECTED]
Re: [R] logistic regression
Mary, The 10-group approach results in a low-resolution and fairly arbitrary calibration curve. Also, it is the basis of the original Hosmer-Lemeshow goodness of fit statistic which has been superceded by the Hosmer et al single degree of freedom GOF test that does not require any binning. The Design package handles both. Do ?calibrate.lrm, ?residuals.lrm, ?lrm for details. Frank Harrell Sullivan, Mary M wrote: Greetings, I am working on a logistic regression model in R and I am struggling with the code, as it is a relatively new program for me. In searching Google for 'logistic regression diagnostics' I came Elizabeth Brown's Lecture 14 from her Winter 2004 Biostatistics 515 course (http://courses.washington.edu/b515/l14.pdf) . I found most of the code to be very helpful, but I am struggling with the lines on to calculate the observed and expected values in the 10 groups created by the cut function. I get error messages in trying to create the E and O matrices: R won't accept assignment of fi1c==j and it won't calculate the sum. I am wondering whether someone might be able to offer me some assistance...my search of the archives was not fruitful. Here is the code that I adapted from the lecture notes: fit - fitted(glm.lyme) fitc - cut(fit, br = c(0, quantile(fit, p = seq(.1, .9, .1)),1)) t-table(fitc) fitc - cut(fit, br = c(0, quantile(fit, p = seq(.1, .9, .1)), 1), labels = F) t-table(fitc) #Calculate observed and expected values in ea group E - matrix(0, nrow=10, ncol = 2) O - matrix(0, nrow=10, ncol=2) for (j in 1:10) { E[j, 2] = sum(fit[fitc==j]) E[j, 1] = sum((1- fit)[fitc==j]) O[j, 2] = sum(pcdata$lymdis[fitc==j]) O[j, 1] = sum((1-pcdata$lymdis)[fitc==j]) } Here is the error message: Error in Summary.factor(..., na.rm = na.rm) : sum not meaningful for factors I understand what it means; I just can't figure out how to get around it or how to get the output printed in table form. Thank you in advance for any assistance. 
Mary Sullivan -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
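On the specific error in the question: "sum not meaningful for factors" suggests that the outcome column (pcdata$lymdis in the post) is stored as a factor, so it has to be converted to 0/1 numbers before summing. The sketch below shows this on simulated data; all object names and the data are hypothetical, and it only reproduces the 10-group table from the lecture notes. It does not replace the binning-free approach recommended in the reply above.

```r
set.seed(42)
n <- 500
dat <- data.frame(x = rnorm(n))
# Outcome deliberately stored as a factor, as assumed for the original data
dat$y <- factor(rbinom(n, 1, plogis(-0.5 + dat$x)), levels = c(0, 1))

fit_glm <- glm(y ~ x, family = binomial, data = dat)
p_hat <- fitted(fit_glm)

# sum() fails on a factor, so recover the 0/1 coding first
y01 <- as.numeric(as.character(dat$y))

# Ten groups defined by the deciles of the fitted probabilities
grp <- cut(p_hat,
           breaks = c(0, quantile(p_hat, probs = seq(0.1, 0.9, 0.1)), 1),
           labels = FALSE)

# Observed and expected counts of 0s and 1s per group, without a loop
O <- cbind(obs0 = tapply(1 - y01, grp, sum), obs1 = tapply(y01, grp, sum))
E <- cbind(exp0 = tapply(1 - p_hat, grp, sum), exp1 = tapply(p_hat, grp, sum))
print(round(cbind(O, E), 1))
```

`cbind(O, E)` gives the output in table form, one row per decile group.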
[R] ROC curve in R
Hi, I need to build an ROC curve in R. Can you please provide data steps / code or guide me through it? Thanks and Regards Rithesh M Mohan
Re: [R] Convert string to list?
Is this what you want?

as.vector(unlist(strsplit(str, ",")), mode="list")

Ross Darnell

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manuel Morales Sent: Friday, 27 July 2007 10:39 AM To: r-help Subject: [R] Convert string to list?

Let's say I have the following string:

str <- "P = 0.0, T = 0.0, Q = 0.0"

I'd like to find a function that generates the following object from 'str'.

list(P = 0.0, T = 0.0, Q = 0.0)

Thanks! -- http://mutualism.williams.edu
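A note for readers: the strsplit() suggestion yields a list of character pieces such as "P = 0.0", not the named numeric list asked for. Two alternative sketches follow; neither comes from the thread itself. The first hands the text to R's parser and should only be used on trusted input; the second splits the string by hand. The variable name `s` is mine, to avoid masking the base function `str`.

```r
s <- "P = 0.0, T = 0.0, Q = 0.0"

# Option 1: let R parse the text as the arguments of list().
# Only reasonable for trusted input, because eval() runs arbitrary code.
lst1 <- eval(parse(text = paste0("list(", s, ")")))

# Option 2: split manually on "," and "="; no code is evaluated.
parts <- strsplit(strsplit(s, ",")[[1]], "=")
keys  <- trimws(sapply(parts, `[`, 1))       # "P", "T", "Q"
vals  <- as.numeric(sapply(parts, `[`, 2))   # numeric values
lst2  <- setNames(as.list(vals), keys)

str(lst1)
```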
[R] Creating windows binary R package (PowerArchiver vs. zip -r9X)
Hi list, I apologize if you see funny fonts, because I'm using the new Windows Live Hotmail and don't know how to turn off the rich text mode. I have successfully built and installed an R package in Windows XP for R-2.5.1. But when I tried to create a .zip file so that I can use "Packages / Install package(s) from local zip files..." to install it, it seems R only recognizes the .zip file created by zip -r9X, not the one created by PowerArchiver. Do you know why? I vaguely remember I used WinZip before and it worked fine. The two threads I found on R-help and R-devel helped me a lot, but don't really answer my question. http://tolstoy.newcastle.edu.au/R/help/06/06/29587.html http://tolstoy.newcastle.edu.au/R/devel/05/12/3336.html Thanks, ...Tao
[R] princomp error
I am attempting to run principal components analysis on a dataset of spectral reflectance (6 decimal places). I imported the data using read.table and there are both column and row headers. When I run princomp I receive the following error: Error in cov.wt(z) : 'x' must contain finite values only Where am I going wrong? Ross *** Ross Bricklemyer Dept. of Crop and Soil Sciences Washington State University 291D Johnson Hall PO Box 646420 Pullman, WA 99164-6420 Work: 509.335.3661 Cell/Home: 406.570.8576 Fax: 509.335.8674 Email: [EMAIL PROTECTED]
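The message usually means that the matrix passed on to cov.wt() contains NA, NaN or infinite entries. A small diagnostic sketch on made-up data follows; the object `spec` and its columns are hypothetical stand-ins for the reflectance table. If read.table turned a column into text or a factor, `sapply(spec, is.numeric)` would reveal that first.

```r
# Hypothetical stand-in for the reflectance table: 12 samples, 5 bands
set.seed(1)
spec <- as.data.frame(matrix(runif(60), nrow = 12,
                             dimnames = list(NULL, paste0("band", 1:5))))
spec$band3[4] <- NA     # a missing reading
spec$band5[9] <- Inf    # e.g. the result of a division by zero

stopifnot(all(sapply(spec, is.numeric)))  # every column must be numeric

m <- as.matrix(spec)
bad <- !is.finite(m)         # TRUE for NA, NaN, Inf and -Inf
print(colSums(bad))          # number of offending values per column
print(which(rowSums(bad) > 0))  # rows that contain them

# One option: drop the incomplete rows, then run the PCA
pc <- princomp(m[rowSums(bad) == 0, ])
print(summary(pc))
```

Whether dropping rows is acceptable depends on why the values are missing; the sketch only shows how to locate them.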
Re: [R] zeroinfl() or zicounts() error
On Thu, 26 Jul 2007, Rachel Davidson wrote:

I'm trying to fit a zero-inflated poisson model using zeroinfl() from the pscl library. It works fine for most models I try, but when I include either of 2 covariates, I get an error. When I include PopulationDensity, I get this error: Error in solve.default(as.matrix(fit$hessian)) : system is computationally singular: reciprocal condition number = 1.91306e-34 When I include BuildingArea, I get this error: Error in optim(fn = loglikfun, par = c(start$count, start$zero, if (dist == : non-finite finite-difference value [2]

Might be due to some close to linear dependencies in your regressor matrix...

I tried fitting the models using zicounts in the zicounts library as well and had the same difficulty.

If I recall correctly, zicounts() uses a very similar type of optimization compared to zeroinfl(), hence the similar problems.

When I include PopulationDensity, it runs, but outputs only the parameter estimates, not the standard errors or p-values (those have NaN).

This is due to the same problem as above for zeroinfl(): the Hessian matrix is (close to) singular.

When I include BuildingArea, I get this error: Error in solve.default(z0$hessian) : system is computationally singular: reciprocal condition number = 2.58349e-25 Can anyone suggest what it is about these 2 covariates that might be causing the problem? I don't see any obvious problems with them. They are both nonnegative with smooth probability distributions and no missing (NA) values. The dataset has 3211 observations. It doesn't matter if there are other covariates in the models or not. If one of these is included, I get the errors.

Even if you include just one covariate and nothing else? zeroinfl(y ~ PopulationDensity, data = ...)

Z
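One way to probe the "close to linear dependencies" suggestion is to look at the condition number of the regressor matrix. The sketch below uses invented data; the idea that the two covariates are on very large scales is my assumption, not something stated in the thread. If that does turn out to be the case, standardising the covariates before calling zeroinfl() is a common thing to try.

```r
set.seed(3)
# Hypothetical covariates on very different, large scales
d <- data.frame(PopulationDensity = rexp(100, rate = 1e-4),
                BuildingArea      = rexp(100, rate = 1e-6))

X <- model.matrix(~ PopulationDensity + BuildingArea, data = d)
k_raw <- kappa(X, exact = TRUE)      # condition number of the raw design

d_scaled <- as.data.frame(scale(d))  # centre and scale each column
Xs <- model.matrix(~ PopulationDensity + BuildingArea, data = d_scaled)
k_scaled <- kappa(Xs, exact = TRUE)  # condition number after scaling

print(c(raw = k_raw, scaled = k_scaled))
```

A huge condition number for the raw matrix combined with a modest one after scaling would point to a scaling problem; if both are huge, genuine collinearity is the more likely cause.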
Re: [R] Large dataset + randomForest
Florian, The first thing that you should change is how you call randomForest. Instead of specifying the model via a formula, use the randomForest(x, y) interface. When a formula is used, there is a terms object created so that a model matrix can be created for these and future observations. That terms object can get big (I think it would be a matrix of size 151 x 150) and is diagonal. That might not solve it, but it should help. Max -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Florian Nigsch Sent: Thursday, July 26, 2007 2:07 PM To: r-help@stat.math.ethz.ch Subject: [R] Large dataset + randomForest [Please CC me in any replies as I am not currently subscribed to the list. Thanks!] Dear all, I did a bit of searching on the question of large datasets but did not come to a definite conclusion. What I am trying to do is the following: I want to read in a dataset with approx. 100 000 rows and approx 150 columns. The file size is ~ 33MB, which one would deem not too big a file for R. To speed up the reading in of the file I do not use read.table but a loop that does reading with scan() into a buffer and some preprocessing and then adds the data into a dataframe. When I then want to run randomForest() R complains that I cannot allocate a vector of size 313.0 MB. I am aware that randomForest needs all data in memory, but 1) why should that suddenly be 10 times the size of the data (I acknowedge the need for some internal data of R, but 10 times seems a bit too much) and 2) there is still physical memory free on the machine (in total 4GB available, even though R is limited to 2GB if I correctly remember the help pages - still 2GB should be enough!) - it doesn't seem to work either with changed settings done via mem.limits(), or run-time arguments --min-vsize --max-vsize - what do these have to be set to to work in my case?? 
rf <- randomForest(V1 ~ ., data=df[trainindices,], do.trace=5)
Error: cannot allocate vector of size 313.0 Mb
object.size(df)/1024/1024
[1] 129.5390

Any help would be greatly appreciated, Florian -- Florian Nigsch [EMAIL PROTECTED] Unilever Centre for Molecular Sciences Informatics Department of Chemistry University of Cambridge http://www-mitchell.ch.cam.ac.uk/ Telephone: +44 (0)1223 763 073
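A sketch of the x/y interface suggested in the reply, on a small simulated data frame. The data, the sizes, and the `ntree`/`do.trace` values are illustrative only, and the call is guarded so that nothing fails if the randomForest package is not installed. Whether this is enough to fit the real data into memory is not something the sketch can show.

```r
set.seed(1)
# Hypothetical small data frame: response V1 plus five predictors X1..X5
df <- data.frame(V1 = rnorm(200), matrix(rnorm(200 * 5), ncol = 5))
trainindices <- sample(nrow(df), 150)

x <- df[trainindices, -1]   # predictors only (drop the response column)
y <- df$V1[trainindices]    # response as a plain numeric vector

# The x/y interface skips the formula/terms machinery mentioned in the reply
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(x = x, y = y, ntree = 100, do.trace = 25)
  print(rf)
}
```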
Re: [R] ROC curve in R
On Thursday 26 July 2007 06:01, Frank E Harrell Jr wrote: Note that even though the ROC curve as a whole is an interesting 'statistic' (its area is a linear translation of the Wilcoxon-Mann-Whitney-Somers-Goodman-Kruskal rank correlation statistics), each individual point on it is an improper scoring rule, i.e., a rule that is optimized by fitting an inappropriate model. Using curves to select cutoffs is a low-precision and arbitrary operation, and the cutoffs do not replicate from study to study. Probably the worst problem with drawing an ROC curve is that it tempts analysts to try to find cutoffs where none really exist, and it makes analysts ignore the whole field of decision theory. Frank Harrell Frank, This thread has caught may attention for a couple reasons, possibly related to my novice-level experience. 1. in a logistic regression study, where i am predicting the probability of the response being 1 (for example) - there exists a continuum of probability values - and a finite number of {1,0} realities when i either look within the original data set, or with a new 'verification' data set. I understand that drawing a line through the probabilities returned from the logistic regression is a loss of information, but there are times when a 'hard' decision requiring prediction of {1,0} is required. I have found that the ROCR package (not necessarily the ROC Curve) can be useful in identifying the probability cutoff where accuracy is maximized. Is this an unreasonable way of using logistic regression as a predictor? 2. The ROC curve can be a helpful way of communicating false positives / false negatives to other users who are less familiar with the output and interpretation of logistic regression. 3. I have been using the area under the ROC Curve, kendall's tau, and cohen's kappa to evaluate the accuracy of a logistic regression based prediction, the last two statistics based on a some probability cutoff identified before hand. 
How does the topic of decision theory relate to some of the circumstances described above? Is there a better way to do some of these things? Cheers, Dylan [EMAIL PROTECTED] wrote: http://search.r-project.org/cgi-bin/namazu.cgi?query=ROCmax=20result=no rmalsort=scoreidxname=Rhelp02aidxname=functionsidxname=docs there is a lot of help try help.search(ROC curve) gave Help files with alias or concept or title matching 'ROC curve' using fuzzy matching: granulo(ade4) Granulometric Curves plot.roc(analogue)Plot ROC curves and associated diagnostics roc(analogue) ROC curve analysis colAUC(caTools) Column-wise Area Under ROC Curve (AUC) DProc(DPpackage) Semiparametric Bayesian ROC curve analysis cv.enet(elasticnet) Computes K-fold cross-validated error curve for elastic net ROC(Epi) Function to compute and draw ROC-curves. lroc(epicalc) ROC curve cv.lars(lars) Computes K-fold cross-validated error curve for lars roc.demo(TeachingDemos) Demonstrate ROC curves by interactively building one HTH see the help and examples those will suffice Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'. Regards, Gaurav Yadav +++ Assistant Manager, CCIL, Mumbai (India) Mob: +919821286118 Email: [EMAIL PROTECTED] Bhagavad Gita: Man is made by his Belief, as He believes, so He is Rithesh M. Mohan [EMAIL PROTECTED] Sent by: [EMAIL PROTECTED] 07/26/2007 11:26 AM To R-help@stat.math.ethz.ch cc Subject [R] ROC curve in R Hi, I need to build ROC curve in R, can you please provide data steps / code or guide me through it. Thanks and Regards Rithesh M Mohan [[alternative HTML version deleted]] - Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
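For the original request in this thread (how to build an ROC curve at all), here is a base-R sketch on simulated data that needs no extra package. Everything in it is illustrative. The AUC is computed from the rank-sum statistic, which is the link to the Wilcoxon-Mann-Whitney statistic mentioned earlier in the thread. The sketch deliberately does not pick a cutoff, given the concerns raised about doing so.

```r
set.seed(7)
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))                 # simulated 0/1 outcome
p <- fitted(glm(y ~ x, family = binomial))   # predicted probabilities

# Sweep the threshold from high to low over the sorted predictions
ord <- order(p, decreasing = TRUE)
tpr <- c(0, cumsum(y[ord] == 1) / sum(y == 1))   # sensitivity
fpr <- c(0, cumsum(y[ord] == 0) / sum(y == 0))   # 1 - specificity

plot(fpr, tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)                            # chance line

# AUC from the rank-sum (Wilcoxon-Mann-Whitney) statistic
n1 <- sum(y == 1)
n0 <- sum(y == 0)
auc <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
print(auc)
```

The packages listed in the help.search() output above (for example ROCR or caTools) wrap the same computation with more options.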
Re: [R] significance test for difference of two correlations
There is R code for both the Fisher transform and the corresponding bootstrap procedure in the vignette for the proto package: http://cran.r-project.org/doc/vignettes/proto/proto.pdf On 7/26/07, Viechtbauer Wolfgang (STAT) [EMAIL PROTECTED] wrote: Let r_1 be the correlation between the two variables for the first group with n_1 subjects and let r_2 be the correlation for the second group with n_2 subjects. Then a simple way to test H0: rho_1 = rho_2 is to convert r_1 and r_2 via Fisher's variance stabilizing transformation ( z = 1/2 * ln[ (1+r)/(1-r)] ) and then calculate: (z_1 - z_2) / sqrt( 1/(n_1 - 3) + 1/(n_2 - 3) ) which is (approximately) N(0,1) under H0. So, using alpha = .05, you can reject H0 if the absolute value of the test statistic above is larger than 1.96. -- Wolfgang Viechtbauer Department of Methodology and Statistics University of Maastricht, The Netherlands http://www.wvbauer.com/ Original Message From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Timo Stolz Sent: Thursday, July 26, 2007 16:13 To: r-help@stat.math.ethz.ch Subject: [R] significance test for difference of two correlations Dear R users, how can I test, whether two correlations differ significantly. (I want to prove, that variables are correlated differently, depending on the group a person is in.) Greetings from Freiburg im Breisgau (Germany), Timo Stolz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error in using R2WinBUGS on Ubuntu 6.10 Linux
[EMAIL PROTECTED] wrote: I am trying to run WinBUGS 1.4 from the Ubuntu 6.10 Linux distribution. I am using the R2WinBUGS packages with the source file listed below. WinBUGS appears to run properly, but I get the following message after WinBUGS starts in WINE. Does anyone know what may be causing this error and what the correction may be? Thanks ERROR MESSAGE: fixme:ole:GetHGlobalFromILockBytes cbSize is 13824 err:ole:CoGetClassObject class {0003000a---c000-0046} not registered err:ole:CoGetClassObject class {0003000a---c000-0046} not registered err:ole:CoGetClassObject no class object {0003000a---c000-0046} could be created for context 0x3 fixme:keyboard:RegisterHotKey (0x10032,13,0x0002,3): stub fixme:ntdll:RtlNtStatusToDosErrorNoTeb no mapping for 800a err:ole:local_server_thread Failure during ConnectNamedPipe 317 This is wine, not R2WinBUGS nor WinBUGS nor R, I fear, and the fixme: sounds promising that things go away in a more recent version of wine... Uwe Ligges R SOURCE FILE: rm(list=ls(all=TRUE)) library(R2WinBUGS) inits-function(){ list(alpha0 = 0, alpha1 = 0, alpha2 = 0, alpha12 = 0, sigma = 1) } data-list(r = c(10, 23, 23, 26, 17, 5, 53, 55, 32, 46, 10, 8, 10, 8, 23, 0, 3, 22, 15, 32, 3), n = c(39, 62, 81, 51, 39, 6, 74, 72, 51, 79, 13, 16, 30, 28, 45, 4, 12, 41, 30, 51, 7), x1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), x2 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1), N = 21) test-bugs(data,inits, model.file=/home/meyerjp/rasch/test.bug, parameters=c(alpha0,alpha1,alpha12,alpha2,sigma), n.chains=2,n.iter=1,n.burnin=1000, bugs.directory=/home/meyerjp/.wine/drive_c/Program Files/WinBUGS14/, working.directory=/home/meyerjp/rasch/working, debug=FALSE, WINEPATH=/usr/bin/winepath, newWINE=TRUE) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, 
reproducible code.
Re: [R] aggregate.ts
Jeff, I'm really not a fan of subjective "mine is bigger than yours" discussions. Just three comments that I try to keep as objective as possible.

Bottom line: use 'tis' series from the fame package, or 'zoo' stuff from Gabor's zoo package.

The last time I checked, packageDescription("zoo")$Author had more than one entry.

As the author of the fame package, I hope you'll excuse me for asserting that the 'tis' class is easier to understand and use than the zoo stuff,

That surely depends on the user and the task he has to do...

which takes a more general approach. Some day Gabor or I or some other enterprising soul should try combining the best ideas from zoo and fame into a package that is better than either one.

I think combination should be straightforward: zoo is general enough to allow for time indexes of class ti. Overall, ti seems to be well-written and only some methods might need to be added/improved to cooperate fully with zoo. Maybe some of the functionality that is currently available for tis but is not available for all conceivable zoo+arbitrary_index objects might be special cased for zoo+ti or zooreg or zooreg+ti etc.

Best, Z
[R] Problem installing tseries package
Hi,

I'm running R 2.4.1 on Fedora Core 6 and am unable to install the tseries package. I've resolved a few problems getting to this point, by running a yum update and installing the gcc-gfortran dependency, but now I'm stuck. Could someone please point me in the right direction?

R install.packages output:

    > install.packages("tseries")
    trying URL 'http://www.sourcekeg.co.uk/cran/src/contrib/tseries_0.10-11.tar.gz'
    Content type 'application/x-tar' length 182043 bytes
    opened URL
    downloaded 177Kb

    * Installing *source* package 'tseries' ...
    ** libs
    gcc -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include -fpic -O3 -g -std=gnu99 -c arma.c -o arma.o
    gcc -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include -fpic -O3 -g -std=gnu99 -c bdstest.c -o bdstest.o
    gcc -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include -fpic -O3 -g -std=gnu99 -c boot.c -o boot.o
    gfortran -fpic -O2 -g -c dsumsl.f -o dsumsl.o
     In file dsumsl.f:450    IF (IV(1) - 2) 30, 40, 50
       Warning: Obsolete: arithmetic IF statement at (1)
     In file dsumsl.f:3702   10 ASSIGN 30 TO NEXT
       Warning: Obsolete: ASSIGN statement at (1)
     In file dsumsl.f:3707   20 GO TO NEXT,(30, 50, 70, 110)
       Warning: Obsolete: Assigned GOTO statement at (1)
     In file dsumsl.f:3709   ASSIGN 50 TO NEXT
       Warning: Obsolete: ASSIGN statement at (1)
     In file dsumsl.f:3718   ASSIGN 70 TO NEXT
       Warning: Obsolete: ASSIGN statement at (1)
     In file dsumsl.f:3724   ASSIGN 110 TO NEXT
       Warning: Obsolete: ASSIGN statement at (1)
     In file dsumsl.f:4552   IF (IV(1) - 2) 999, 30, 70
       Warning: Obsolete: arithmetic IF statement at (1)
     In file dsumsl.f:4714   IF (IRC) 140, 100, 210
       Warning: Obsolete: arithmetic IF statement at (1)
    gcc -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include -fpic -O3 -g -std=gnu99 -c garch.c -o garch.o
    gcc -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include -fpic -O3 -g -std=gnu99 -c ppsum.c -o ppsum.o
    gcc -I/usr/lib/R/include -I/usr/lib/R/include -I/usr/local/include -fpic -O3 -g -std=gnu99 -c tsutils.c -o tsutils.o
    gcc -shared -Bdirect,--hash-stype=both,-Wl,-O1 -o tseries.so arma.o bdstest.o boot.o dsumsl.o garch.o ppsum.o tsutils.o -L/usr/lib/R/lib -lRblas -lgfortran -lm -lgcc_s -lgfortran -lm -lgcc_s -L/usr/lib/R/lib -lR
    /usr/bin/ld: skipping incompatible /usr/lib/R/lib/libRblas.so when searching for -lRblas
    /usr/bin/ld: skipping incompatible /usr/lib/R/lib/libRblas.so when searching for -lRblas
    /usr/bin/ld: cannot find -lRblas
    collect2: ld returned 1 exit status
    make: *** [tseries.so] Error 1
    ERROR: compilation failed for package 'tseries'
    ** Removing '/usr/lib/R/library/tseries'

I presume the priority is addressing the error:

    /usr/bin/ld: cannot find -lRblas

I have the libRblas.so file with R 2.4. Do I need to upgrade to R 2.5? In which case I'll be asking how to fix the problems I'm having doing that ;)

    [~]# yum provides libRblas.so
    snip
    R.x86_64   2.5.1-2.fc6     extras      Matched from: /usr/lib64/R/lib/libRblas.so  libRblas.so()(64bit)
    R.x86_64   2.5.1-2.fc6     extras      Matched from: /usr/lib64/R/lib/libRblas.so  libRblas.so()(64bit)
    R.i386     2:2.4.1-1.fc6   installed   Matched from: /usr/lib/R/lib/libRblas.so    libRblas.so
    R.x86_64   2.4.1-4.fc6     installed   Matched from: /usr/lib64/R/lib/libRblas.so  libRblas.so()(64bit)

Regards, Mike
Re: [R] dispersion_parameter_GLMM's
I agree with David. A dispersion parameter of 25 suggests that you have mainly 0's in your data set and that your model is not adequate. Perhaps you should dichotomize your data into 0's and 1's and use a logistic mixed model, but be aware of small numbers of events.

> That amount of overdispersion would make the use of a Poisson model very questionable, and will very likely result in estimated standard errors that are too low, hence the change in statistical significance when you switch to quasipoisson.

O
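To make the overdispersion check above concrete, here is a minimal sketch of how a dispersion estimate can be computed. It uses an ordinary Poisson GLM rather than a mixed model, and simulated negative binomial counts; the data frame `d` and all parameter values are assumptions made only for illustration.

```r
set.seed(42)
# Simulated overdispersed counts (negative binomial), purely illustrative
d <- data.frame(x = rnorm(200))
d$y <- rnbinom(200, mu = exp(0.5 + 0.3 * d$x), size = 0.5)

fit <- glm(y ~ x, family = poisson, data = d)

# Pearson chi-square divided by the residual degrees of freedom;
# values far above 1 indicate overdispersion relative to the Poisson model
disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
disp

# A quasipoisson fit reports the same estimate and rescales the standard errors
fit_q <- update(fit, family = quasipoisson)
summary(fit_q)$dispersion
```

The coefficients of the two fits are identical; only the standard errors differ, by a factor of the square root of the dispersion estimate, which is why significance can change when switching families.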
Re: [R] aggregate.ts
Your troubles with 'aggregate' for a ts are one of the reasons I created the 'tis' and 'ti' classes in the fame package. If you do this:

    x1 <- tis(1:24, start = c(2000, 10), freq = 12)
    x2 <- tis(1:24, start = c(2000, 11), freq = 12)
    y1 <- aggregate(x1, nfreq = 4)
    y2 <- aggregate(x2, nfreq = 4)

    > x1
         Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
    2000                                       1   2   3
    2001   4   5   6   7   8   9  10  11  12  13  14  15
    2002  16  17  18  19  20  21  22  23  24
    class: tis

    > x2
         Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
    2000                                           1   2
    2001   3   4   5   6   7   8   9  10  11  12  13  14
    2002  15  16  17  18  19  20  21  22  23  24
    class: tis

    > y1
         Qtr1 Qtr2 Qtr3 Qtr4
    2000                   6
    2001   15   24   33   42
    2002   51   60   69
    class: tis

    > y2
         Qtr1 Qtr2 Qtr3 Qtr4
    2001   12   21   30   39
    2002   48   57   66
    class: tis

Everything pretty much works as you would expect. One thing to notice is that, even using a 'tis' rather than a 'ts', aggregate will only sum up the monthly observations for a quarter if all three of the months are there. That's why y2 starts with 2001Q1, rather than 2000Q4. If you really want the 2000Q4 observation to be the sum of the first two x2 months, the convert() function in fame can handle that.

    > convert(x2, tif = "quarterly", observed = "summed", ignore = T)
          Qtr1   Qtr2   Qtr3       Qtr4
    2000                           4.03
    2001 12.00  21.00  30.00      39.00
    2002 48.00  57.00  66.00  71.225806
    class: tis

Now back to ts. If you look deeper into what's happening here:

    > y3 <- aggregate(as.ts(x2), nf = 4)
    > y3
    Error in rep.int("", start.pad) : invalid number of copies in rep.int()

    Enter a frame number, or 0 to exit
    1: print(c(6, 15, 24, 33, 42, 51, 60, 69))
    2: print.ts(c(6, 15, 24, 33, 42, 51, 60, 69))
    3: matrix(c(rep.int("", start.pad), format(x, ...), rep.int("", end.pad)), nc
    4: as.vector(data)
    5: rep.int("", start.pad)
    Selection: 0

    > unclass(y3)
    [1]  6 15 24 33 42 51 60 69
    attr(,"tsp")
    [1] 2000.833 2002.583    4.000

what you see is that aggregate() did indeed create a quarterly series, but the quarters cover (Nov-Jan, Feb-Apr, May-Jul, Aug-Oct), not the usual (Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec). The author of the print.ts code evidently never even thought of this possibility. Not that I blame him. I work with monthly and quarterly data all the time, and the behavior of aggregate.ts() is so counter-intuitive that I wouldn't have imagined it either.

Bottom line: use 'tis' series from the fame package, or 'zoo' stuff from Gabor's zoo package.

As the author of the fame package, I hope you'll excuse me for asserting that the 'tis' class is easier to understand and use than the zoo stuff, which takes a more general approach. Some day Gabor or I or some other enterprising soul should try combining the best ideas from zoo and fame into a package that is better than either one.

Jeff
Re: [R] logistic regression
Maybe try making sure the data is numeric:

    fac.to.num=function(x) as.numeric(as.character(x))

On 26-Jul-07, at 9:34 AM, Sullivan, Mary M wrote:

> Greetings,
>
> I am working on a logistic regression model in R and I am struggling with the code, as it is a relatively new program for me. In searching Google for 'logistic regression diagnostics' I came across Elizabeth Brown's Lecture 14 from her Winter 2004 Biostatistics 515 course (http://courses.washington.edu/b515/l14.pdf). I found most of the code to be very helpful, but I am struggling with the lines that calculate the observed and expected values in the 10 groups created by the cut function. I get error messages in trying to create the E and O matrices: R won't accept assignment of fitc==j and it won't calculate the sum.
>
> I am wondering whether someone might be able to offer me some assistance... my search of the archives was not fruitful.
>
> Here is the code that I adapted from the lecture notes:
>
>     fit <- fitted(glm.lyme)
>     fitc <- cut(fit, br = c(0, quantile(fit, p = seq(.1, .9, .1)), 1))
>     t <- table(fitc)
>     fitc <- cut(fit, br = c(0, quantile(fit, p = seq(.1, .9, .1)), 1), labels = F)
>     t <- table(fitc)
>     # Calculate observed and expected values in each group
>     E <- matrix(0, nrow=10, ncol = 2)
>     O <- matrix(0, nrow=10, ncol=2)
>     for (j in 1:10) {
>       E[j, 2] = sum(fit[fitc==j])
>       E[j, 1] = sum((1- fit)[fitc==j])
>       O[j, 2] = sum(pcdata$lymdis[fitc==j])
>       O[j, 1] = sum((1-pcdata$lymdis)[fitc==j])
>     }
>
> Here is the error message:
>
>     Error in Summary.factor(..., na.rm = na.rm) : sum not meaningful for factors
>
> I understand what it means; I just can't figure out how to get around it or how to get the output printed in table form. Thank you in advance for any assistance.
>
> Mary Sullivan

-- Mike Lawrence Graduate Student, Department of Psychology, Dalhousie University Website: http://memetic.ca Public calendar: http://icalx.com/public/informavore/Public The road to wisdom? Well, it's plain and simple to express: Err and err and err again, but less and less and less. - Piet Hein
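The suggestion above (convert the factor to numbers before summing) can be sketched end to end as follows. The data here are simulated: `pcdata`, `lymdis` and `glm.lyme` are reused from the thread only as names, the predictor `x` is an assumption, and `tapply()` replaces the explicit loop, so this is one possible way to build the table rather than the lecture's own code.

```r
set.seed(1)
# Hypothetical stand-in for pcdata: the 0/1 outcome is stored as a factor
pcdata <- data.frame(x = rnorm(500))
pcdata$lymdis <- factor(rbinom(500, 1, plogis(-0.5 + pcdata$x)))

glm.lyme <- glm(lymdis ~ x, family = binomial, data = pcdata)
fit <- fitted(glm.lyme)

# Convert the factor labels "0"/"1" back to numbers; sum() fails on factors
y <- as.numeric(as.character(pcdata$lymdis))

# Ten groups defined by deciles of the fitted probabilities
fitc <- cut(fit, breaks = quantile(fit, probs = seq(0, 1, 0.1)),
            include.lowest = TRUE, labels = FALSE)

# Observed and expected counts of 0's and 1's per group
O <- cbind(obs0 = tapply(1 - y, fitc, sum),   obs1 = tapply(y, fitc, sum))
E <- cbind(exp0 = tapply(1 - fit, fitc, sum), exp1 = tapply(fit, fitc, sum))
round(cbind(O, E), 1)
```

Note that `as.numeric()` applied directly to a factor returns the internal level codes (1, 2, ...), not the labels, which is why the conversion goes through `as.character()` first.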
Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.
Thanks so much Jim, Adaikalavan, Gabor and others for the help and suggestions.

The solution will result in a matrix containing nested matrices, so that each variable name, each variable's distinct values and the count of each distinct value are accessible individually. The main matrix will contain the variable names, the first-level nested matrices will consist of each variable's unique values, and each such entry will contain a one-element vector holding the count (occurrence frequency). This matrix can then be used to compare other similar datasets for variable values and their frequencies.

Building on the input received so far, a probable solution for building the matrix will include the following.

1) I read the CSV file (containing column headers):

    my_data=read.table("path/to/my/data.csv",header=TRUE,sep=",",dec=".",fill=TRUE)

2) I group the values in each variable, producing an occurrence count (frequency):

    x.val <- apply(my_data,2,table)

3) I obtain a vector of the names of the variables in the table:

    names(x.val)

4) Now I make use of the names (obtained in step 3) to obtain a vector of distinct values in a given variable (in the example below the variable name is PR14):

    names(x.val$PR14)

5) I obtain a vector (with one element) of the frequency of a value obtained from the step above (in our example the value is V):

    as.vector(x.val$PR14["V"])

Todo: Now I will need to place the steps above in a script (consisting of loops) to build the matrix; steps 4 and 5 seem tricky to do programmatically.

Allan.

- Original Message
From: jim holtman [EMAIL PROTECTED]
To: Allan Kamau [EMAIL PROTECTED]
Cc: Adaikalavan Ramasamy [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Sent: Wednesday, July 25, 2007 1:50:55 PM
Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

Also if you want to access the individual values, you can just leave it as a list:

    > x.val <- apply(x, 2, table)
    > # access each value
    > x.val$PR14["V"]
    V
    8

On 7/25/07, Allan Kamau [EMAIL PROTECTED] wrote:

A subset of the data looks as follows:

    > df[1:10,14:20]
       PR10 PR11 PR12 PR13 PR14 PR15 PR16
    1     V    T    I    K    V    G    D
    2     V    S    I    K    V    G    G
    3     V    T    I    R    V    G    G
    4     V    S    I    K    I    G    G
    5     V    S    I    K    V    G    G
    6     V    S    I    R    V    G    G
    7     V    T    I    K    I    G    G
    8     V    S    I    K    V    E    G
    9     V    S    I    K    V    G    G
    10    V    S    I    K    V    G    G

The result I would like is as follows:

    PR10    PR11       PR12    ...
    [V:10]  [S:7,T:3]  [I:10]

The result can be in a matrix or a vector, and each variable name, value and frequency should be accessible so as to be used for comparisons with another dataset later. The frequency can be a count or a percentage.

Allan.

- Original Message
From: Adaikalavan Ramasamy [EMAIL PROTECTED]
To: Allan Kamau [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Tuesday, July 24, 2007 10:21:51 PM
Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

The name of the table should give you the value. And if you have a matrix, you just need to convert it into a vector first.

    > m <- matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 )
    > m
         [,1] [,2] [,3]
    [1,] "A"  "C"  "B"
    [2,] "B"  "D"  "C"
    [3,] "C"  "E"  "D"
    > tb <- table( as.vector(m) )
    > tb
    A B C D E
    1 2 3 2 1
    > paste( names(tb), ":", tb, sep="" )
    [1] "A:1" "B:2" "C:3" "D:2" "E:1"

If this is not what you want, then please give a simple example.

Regards, Adai

Allan Kamau wrote:

Hi all,

If the question below has been answered before, I apologize for the posting. I would like to get the frequencies of occurrence of all values in a given variable in a multivariate dataset. In short, for each variable (or field) I want a summary of the values it contains as value:frequency pairs; there can be many such pairs for a given variable. I would like to do the same for several such variables. I have used table() but am unable to extract the individual value and frequency values. Please advise.

Allan.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
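Steps 2 to 5 of the plan above can be done without explicit loops. The following is only a sketch on a three-column toy data frame that mimics PR10 to PR12; the `freq` and `summary_strings` names are made up here, and `lapply()` is used instead of `apply()` so that the result is always a named list, whatever the number of distinct values per column.

```r
# Small illustrative data set shaped like the PR10..PR12 columns in the thread
df <- data.frame(
  PR10 = rep("V", 10),
  PR11 = c("T", "S", "T", "S", "S", "S", "T", "S", "S", "S"),
  PR12 = rep("I", 10),
  stringsAsFactors = FALSE
)

# One frequency table per variable, kept as a named list
x.val <- lapply(df, table)

# Long format: one row per (variable, value) pair with its count
freq <- do.call(rbind, lapply(names(x.val), function(v) {
  data.frame(variable = v,
             value    = names(x.val[[v]]),
             count    = as.vector(x.val[[v]]),
             stringsAsFactors = FALSE)
}))
freq

# Compact "value:count" strings, one per variable
summary_strings <- sapply(x.val, function(tb) {
  paste(names(tb), tb, sep = ":", collapse = ",")
})
summary_strings
```

The long data frame is usually the easier form for comparing two datasets later, since two such frames can be combined with `merge()` on `variable` and `value`.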
[R] colored heights in 3D plot (persp)
Hello everybody, I have a matrix with measurement values and plot them with persp. I want to highlight different heights in different colors. At least everything above and under a certain z-level shall have a different color to make the differences in height more obvious. How can I do that or do I have to use another package? Best regards, Juliane
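One way to approach this with base graphics alone, offered as a sketch: `persp()` accepts one colour per facet (the quadrilateral between four neighbouring grid points), so the height of each facet can be averaged from its corners and mapped to colours. The surface `z`, the cut level 0.5 and the palette are assumptions for illustration.

```r
# Illustrative surface; replace z with your own measurement matrix
x <- seq(-3, 3, length.out = 30)
y <- seq(-3, 3, length.out = 30)
z <- outer(x, y, function(a, b) exp(-(a^2 + b^2) / 4))

# persp() colours facets, not grid points: average the four corners of each facet
nx <- nrow(z)
ny <- ncol(z)
zfacet <- (z[-1, -1] + z[-1, -ny] + z[-nx, -1] + z[-nx, -ny]) / 4

# Two colours split at a chosen level (0.5 here is arbitrary) ...
cols2 <- ifelse(zfacet > 0.5, "red", "lightblue")
persp(x, y, z, col = cols2, theta = 30, phi = 30)

# ... or a graded palette over 10 equal-width height classes
pal <- terrain.colors(10)
cols10 <- pal[cut(zfacet, breaks = 10, labels = FALSE)]
persp(x, y, z, col = cols10, theta = 30, phi = 30)
```

For an nx by ny matrix there are (nx - 1) by (ny - 1) facets, which is why `zfacet` is one row and one column smaller than `z`.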
Re: [R] Creating a cross table out of a large dataset
On Thu, 2007-07-26 at 13:32 -0700, celine wrote:

> Dear all,
>
> I want to make a cross table out of a data set which is 2 columns wide and more than 15 rows long. When I use the table() function I get an error message.
>
> This is the code I have used:
>
>     Dataset <- read.table("test.txt", header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
>     .T <- table(Dataset$K1,Dataset$K2)
>
> This is the error message I have received:
>
>     Error in vector("integer", length) : vector size specified is too large
>     In addition: Warning messages:
>     1: NAs introduced by coercion
>     2: NAs introduced by coercion
>
> Is it possible to make a cross table with the table() function on a large dataset, or should I consider using another function? I have had a look at the ?table help file but I could not find any information on the size of the dataset.
>
> Thanks very much in advance for any help :-)
>
> Kind regards, Céline.

A wild guess here, but it sounds like your data does not contain a relatively small set of repeated discrete entries. Thus, your cross-tabulation results in a large number of combinations, the number of which exceeds the largest representable integer in R, which is:

    > .Machine$integer.max
    [1] 2147483647

or

    > 2^31 - 1
    [1] 2147483647

An R table is a two (or possibly more) dimension matrix with additional class attributes. A matrix is, in turn, a vector with 'dim' attributes. A vector is indexed using integers and thus is limited in size to the above number.

If the above assumptions are correct, I am struggling to think of a scenario where the visual representation of a cross-tabulation of your data will be of value, but that may be just due to a severe lack of sleep of late.

You might want to run:

    length(unique(Dataset$K1))

and

    length(unique(Dataset$K2))

which will tell you how many unique values are in each of the two vectors. That will begin to give you some idea as to what you are dealing with.

HTH, Marc Schwartz
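The size check suggested above can be taken one step further by computing how many cells a full cross table would need before calling table(). This is a sketch on simulated data: the column names K1 and K2 come from the thread, everything else (sizes, the `counts` data frame) is assumed for illustration.

```r
set.seed(1)
# Hypothetical stand-in for Dataset with two key columns K1 and K2
Dataset <- data.frame(K1 = sample(1:2000, 5000, replace = TRUE),
                      K2 = sample(1:3000, 5000, replace = TRUE))

n1 <- length(unique(Dataset$K1))
n2 <- length(unique(Dataset$K2))

# Cells a full table(K1, K2) would allocate; computed as doubles
# so that the product itself cannot overflow the integer range
cells <- as.numeric(n1) * as.numeric(n2)
cells
cells > .Machine$integer.max

# Alternative: count only the combinations that actually occur (long format)
counts <- aggregate(n ~ K1 + K2, data = transform(Dataset, n = 1L), FUN = sum)
head(counts)
```

The long-format count has at most as many rows as the data itself, no matter how many distinct values each column holds, whereas the full table grows with the product `n1 * n2`.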
[R] Diagonal Submatrices Extraction
Yes you are right ... an example is mandatory. So ...

I have a matrix of 0's with just a single 1 per row and per column, and I need to extract all maximal 'diagonal' submatrices. Let's say I have the following matrix:

      A B C D E
    a 0 1 0 0 0
    b 1 0 0 0 0
    c 0 0 1 0 0
    d 0 0 0 1 0
    e 0 0 0 0 1

Well, I would like to get, for this example, the two following submatrices:

      A B        C D E
    a 0 1      c 1 0 0
    b 1 0      d 0 1 0
               e 0 0 1

Of course some of the extracted submatrices will have, in some situations, dim=c(1,1) ...

Thanks in advance
Bruno

> Hi: I think you need to give an example, because I don't understand the question below, and my guess is that, since no one else replied, they did not understand it either. I don't mean to be rude. I've just noticed from being on the list that, if something is not clear, people won't even tell you. They just won't respond. The list is great, but people don't want to spend time trying to figure out what you want. An example and possibly code is really helpful in getting responses.
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bruno C.
> Sent: Thursday, July 26, 2007 4:47 AM
> To: R-help
> Subject: [R] Submatrices Extraction
>
> Hello,
> Given a submatrix containing 0 or 1, I need to extract the indexes of all the diagonal submatrices, so one of the two diagonals must contain only 1 for each submatrix ...
> Any help? Thanks in advance
> Bruno
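The question in this thread leaves some room for interpretation. The sketch below assumes one reading of it: the matrix has exactly one 1 per row and per column, and a 'diagonal' block is a run of consecutive rows whose 1's sit in consecutive columns, either increasing (main diagonal) or decreasing (anti-diagonal). The function name `diag_blocks` is made up here, and the scan is greedy from the top row, so it returns non-overlapping blocks rather than every possible maximal block.

```r
diag_blocks <- function(m) {
  p <- apply(m, 1, which.max)      # column holding the 1 in each row
  n <- length(p)
  blocks <- list()
  i <- 1
  while (i <= n) {
    j <- i
    step <- 0                      # +1 main diagonal, -1 anti-diagonal, 0 undecided
    while (j < n) {
      d <- p[j + 1] - p[j]
      if (abs(d) != 1) break       # next 1 is not in an adjacent column
      if (step == 0) step <- d     # the first step fixes the direction
      if (d != step) break         # direction changed: the block ends here
      j <- j + 1
    }
    rows <- i:j
    blocks[[length(blocks) + 1]] <- m[rows, sort(p[rows]), drop = FALSE]
    i <- j + 1
  }
  blocks
}

# The 5 x 5 example from the thread
m <- matrix(0, 5, 5, dimnames = list(letters[1:5], LETTERS[1:5]))
m[cbind(1:5, c(2, 1, 3, 4, 5))] <- 1
b <- diag_blocks(m)
b
```

On the example this yields the 2 x 2 block on rows a, b and the 3 x 3 block on rows c, d, e; rows that cannot be extended end up as 1 x 1 blocks.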
Re: [R] Problem installing tseries package
On Thu, 26 Jul 2007, Michael Cassin wrote:

> Hi, I'm running R 2.4.1 on Fedora Core 6 and am unable to install the tseries package. I've resolved a few problems getting to this point, by running a yum update, installing the gcc-gfortran dependency, but now I'm stuck. Could someone please point me in the right direction?

Please read the posting guide and provide the information you were asked for: only then may we be able to help you.

You seem to have a system which installed R in /usr/lib/R but has x86_64 components on it. So what architecture is it that you are trying to run?

My guess is that you installed an i386 RPM on an x86_64 OS. That will install and R will run *but* you will not be able to use it to install packages. If you installed the i386 RPM after the x86_64 one, it will have overwritten some crucial files including /usr/bin/R.

It is possible to have i386 and x86_64 R coexisting on x86_64 Linux, but not by installing RPMs for different architectures.

> R install.packages output
>
>     install.packages("tseries")
>     [snip: compiler output, identical to the original post]
>     /usr/bin/ld: skipping incompatible /usr/lib/R/lib/libRblas.so when searching for -lRblas
>     /usr/bin/ld: skipping incompatible /usr/lib/R/lib/libRblas.so when searching for -lRblas
>     /usr/bin/ld: cannot find -lRblas
>     collect2: ld returned 1 exit status
>     make: *** [tseries.so] Error 1
>     ERROR: compilation failed for package 'tseries'
>     ** Removing '/usr/lib/R/library/tseries'
>
> I presume the priority is addressing the error: /usr/bin/ld: cannot find -lRblas
>
> I have the libRblas.so file with R 2.4. Do I need to upgrade to R 2.5? In which case I'll be asking how to fix the problems I'm having doing that ;)
>
>     [~]# yum provides libRblas.so
>     snip
>     R.x86_64   2.5.1-2.fc6     extras      Matched from: /usr/lib64/R/lib/libRblas.so  libRblas.so()(64bit)
>     R.x86_64   2.5.1-2.fc6     extras      Matched from: /usr/lib64/R/lib/libRblas.so  libRblas.so()(64bit)
>     R.i386     2:2.4.1-1.fc6   installed   Matched from: /usr/lib/R/lib/libRblas.so    libRblas.so
>     R.x86_64   2.4.1-4.fc6     installed   Matched from: /usr/lib64/R/lib/libRblas.so  libRblas.so()(64bit)
>
> Regards,
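The architecture question raised in the reply can be answered from inside the running R session. This is a small sketch using only base R; the variable names are made up, and the interpretation of the values (a 32-bit R on a 64-bit OS) is the reply's hypothesis, not something this code can confirm on its own.

```r
# Architecture R itself was built for, e.g. an i386-type or x86_64 string
r_arch <- R.version$arch
r_arch

# Pointer size in bytes: 4 for a 32-bit build of R, 8 for a 64-bit build
ptr_bytes <- .Machine$sizeof.pointer
ptr_bytes

# Architecture reported by the operating system
os_arch <- Sys.info()[["machine"]]
os_arch

# Installation directory of the running R (/usr/lib/R vs /usr/lib64/R in the thread)
R.home()
```

If `r_arch` and `os_arch` disagree, the compiler and linker may default to the OS architecture while R's own libraries are built for the other one, which would be consistent with the "skipping incompatible" linker messages quoted above.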
Re: [R] Constructing bar charts with standard error bars
On 7/25/07, Ben Bolker [EMAIL PROTECTED] wrote:

> John Zabroski <johnzabroski at gmail.com> writes:
>
>> The best clue I have so far is Rtips #5.9: http://pj.freefaculty.org/R/Rtips.html#5.9 which is what I based my present solution on. However, I do not understand how this works. It seems like there is no concrete way to determine the arrow drawing parameters x0 and x1 for a barplot. Moreover, the bars seem to be cut off.
>
> barplot() returns the x values you need for x0 and x1. barplot(...,ylim=c(0,max(xbar+se))) will set the upper y limit so the bars don't get cut off.
>
> P.S. I hope you're not hoping to infer a statistically significant difference among these groups ...
>
> cheers
> Ben Bolker

Thanks a lot! I tried all three and they all seem very dependable. Also, I appreciate you rewriting my solution and adding elegance.

Is there a way to extend the tick marks to the ylim values, such that the topmost tick mark on the y scale is something like max(xbar+se)? From the documentation, I thought par(yaxp=c(y0,y1,n)) would do the trick, but after trying to use it I am not sure I understand what yaxp even does.

P.S. I am not looking for statistically significant differences. I am trying to learn how to leverage R's graphing capabilities. I also appreciate Frank Harrell referring me to the link about Dynamite Plots and their associated weaknesses.
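The advice quoted above (use the return value of barplot() for x0 and x1, and set ylim from the means plus standard errors) can be sketched as follows. The group means and standard errors are invented numbers, and `xbar` and `se` are reused from the thread only as names.

```r
# Made-up group means and standard errors, for illustration only
xbar <- c(A = 4.2, B = 5.1, C = 3.7)
se   <- c(0.4, 0.6, 0.3)

# Leave a little headroom above the tallest error bar
ytop <- max(xbar + se)

# barplot() returns the bar midpoints: use them as x0 and x1 for arrows()
mids <- barplot(xbar, ylim = c(0, ytop * 1.05), ylab = "Mean")

# angle = 90 with code = 3 draws a flat cap at both ends of each segment
arrows(x0 = mids, y0 = xbar - se, x1 = mids, y1 = xbar + se,
       angle = 90, code = 3, length = 0.05)
```

The midpoints are not 1, 2, 3 because barplot() leaves gaps between bars, which is why guessing x0 and x1 by hand tends to misplace the error bars.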
Re: [R] ROC curve in R
Note that even though the ROC curve as a whole is an interesting 'statistic' (its area is a linear translation of the Wilcoxon-Mann-Whitney-Somers-Goodman-Kruskal rank correlation statistics), each individual point on it is an improper scoring rule, i.e., a rule that is optimized by fitting an inappropriate model. Using curves to select cutoffs is a low-precision and arbitrary operation, and the cutoffs do not replicate from study to study. Probably the worst problem with drawing an ROC curve is that it tempts analysts to try to find cutoffs where none really exist, and it makes analysts ignore the whole field of decision theory.

Frank Harrell

[EMAIL PROTECTED] wrote:

> http://search.r-project.org/cgi-bin/namazu.cgi?query=ROC&max=20&result=normal&sort=score&idxname=Rhelp02a&idxname=functions&idxname=docs
>
> There is a lot of help. Trying help.search("ROC curve") gave:
>
>     Help files with alias or concept or title matching 'ROC curve' using fuzzy matching:
>
>     granulo(ade4)            Granulometric Curves
>     plot.roc(analogue)       Plot ROC curves and associated diagnostics
>     roc(analogue)            ROC curve analysis
>     colAUC(caTools)          Column-wise Area Under ROC Curve (AUC)
>     DProc(DPpackage)         Semiparametric Bayesian ROC curve analysis
>     cv.enet(elasticnet)      Computes K-fold cross-validated error curve for elastic net
>     ROC(Epi)                 Function to compute and draw ROC-curves.
>     lroc(epicalc)            ROC curve
>     cv.lars(lars)            Computes K-fold cross-validated error curve for lars
>     roc.demo(TeachingDemos)  Demonstrate ROC curves by interactively building one
>
>     Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
>
> HTH. See the help pages and examples; those will suffice.
>
> Regards,
> Gaurav Yadav
> Assistant Manager, CCIL, Mumbai (India) Mob: +919821286118 Email: [EMAIL PROTECTED]
> Bhagavad Gita: Man is made by his Belief, as He believes, so He is
>
> Rithesh M. Mohan [EMAIL PROTECTED] Sent by: [EMAIL PROTECTED] 07/26/2007 11:26 AM To R-help@stat.math.ethz.ch Subject [R] ROC curve in R
>
>> Hi, I need to build an ROC curve in R. Can you please provide data steps / code or guide me through it?
>> Thanks and Regards
>> Rithesh M Mohan

- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
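For readers who want to see what an ROC curve is made of before reaching for one of the packages listed above, here is a base-R sketch on simulated scores. It also illustrates the remark that the area under the curve is a translation of the Wilcoxon rank-sum statistic. The data and variable names are assumptions; the caveats above about choosing cutoffs from the curve still apply.

```r
set.seed(1)
# Simulated scores: cases (y = 1) tend to score higher than controls (y = 0)
y <- rep(c(0, 1), each = 100)
score <- rnorm(200, mean = y)

# True and false positive rates at every observed cutoff, from strict to lenient
cuts <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(cuts, function(k) mean(score[y == 1] >= k))
fpr <- sapply(cuts, function(k) mean(score[y == 0] >= k))

plot(c(0, fpr), c(0, tpr), type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)   # reference line for an uninformative score

# Area under the curve via the Wilcoxon rank-sum statistic
n1 <- sum(y == 1)
n0 <- sum(y == 0)
auc <- (sum(rank(score)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc
```

The value of `auc` is the proportion of (case, control) pairs in which the case has the higher score, which is a summary of the whole curve and does not depend on any single cutoff.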
[R] R codes for g-and-h distribution
Hi! I would like to ask for help on how to generate numbers from the g-and-h distribution. This distribution is like the normal distribution but spans more of the kurtosis and skewness plane. Does R have any package to generate them? Any help will be greatly appreciated. Thank you so much!

Filame Uyaco
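Independently of any package, Tukey's g-and-h family is defined as a transformation of a standard normal variable Z, namely X = A + B * ((exp(g Z) - 1) / g) * exp(h Z^2 / 2), with the factor (exp(g Z) - 1) / g replaced by Z when g = 0. A minimal sketch of a generator based on that definition follows; the function name `rgh` and its argument names are made up for this example.

```r
# Tukey g-and-h variates obtained by transforming standard normal draws.
# A = location, B = scale, g controls skewness, h (>= 0) controls tail heaviness.
rgh <- function(n, A = 0, B = 1, g = 0, h = 0) {
  z <- rnorm(n)
  # (exp(g z) - 1) / g tends to z as g -> 0, so g = 0 is handled separately
  core <- if (g == 0) z else (exp(g * z) - 1) / g
  A + B * core * exp(h * z^2 / 2)
}

set.seed(123)
x <- rgh(10000, g = 0.5, h = 0.1)   # right-skewed and heavier-tailed than normal
summary(x)
```

With g = 0 and h = 0 the function returns plain normal draws, which gives a quick sanity check of the implementation.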
Re: [R] using contrasts on matrix regressions (using gmodels, perhaps): 2 Solutions
Dear list,

I got two responses to my post. One was from Soren, with a follow-up by personal e-mail; the other I leave anonymous since he contacted me by personal e-mail. Anyway, here we go:

The first (Soren):

    library(doBy)
    Y <- as.data.frame(Y)
    lapply(Y, function(y) { reg <- lm(y ~ X); esticon(reg, c(0, 0, 0, 1, 0, -1)) })

    Confidence interval ( WALD ) level = 0.95
    Confidence interval ( WALD ) level = 0.95
    Confidence interval ( WALD ) level = 0.95
    Confidence interval ( WALD ) level = 0.95
    Confidence interval ( WALD ) level = 0.95
    $V1
      beta0   Estimate Std.Error    t.value DF  Pr(>|t|)  Lower.CI  Upper.CI
    1     0  0.6701771  0.517921   1.293976  4 0.2653302 -0.767802  2.108156

    $V2
      beta0   Estimate Std.Error    t.value DF  Pr(>|t|)  Lower.CI  Upper.CI
    1     0 -0.2789954   0.64481 -0.4326784  4     0.687 -2.069275  1.511284

    $V3
      beta0   Estimate Std.Error    t.value DF  Pr(>|t|)  Lower.CI  Upper.CI
    1     0 -0.7677927 0.9219688 -0.8327751  4 0.4518055 -3.327588  1.792003

    $V4
      beta0   Estimate Std.Error    t.value DF  Pr(>|t|)  Lower.CI  Upper.CI
    1     0 -0.6026635 0.4960805  -1.214850  4   0.29123 -1.980004 0.7746768

    $V5
      beta0   Estimate Std.Error    t.value DF  Pr(>|t|)  Lower.CI  Upper.CI
    1     0   2.001558  1.004574   1.992444  4  0.117123 -0.787587  4.790703

One thing I do not know how to handle is the output

    Confidence interval ( WALD ) level = 0.95

which shows up for every regression. When I do millions of regressions, this seriously slows it all down. Any idea how I can suppress that?

The second solution uses gmodels, with a lucid explanation which I reproduce. Thanks!

The second (anon):

For a standard (non-matrix) regression, you could test the hypothesis X3 = X5 using

    estimable(reg, c("(Intercept)"=0, X1=0, X2=0, X3=1, X4=0, X5=-1))

but this won't currently work with the mlm object created by a matrix regression. The best way to solve this problem is to write an estimable.mlm() function that simply extracts the individual regressions from the mlm object and then calls estimable on each of these, pasting the results back together appropriately. Something like this should do the trick:

    `estimable.mlm` <- function (object, ...)
    {
        coef <- coef(object)
        ny <- ncol(coef)
        effects <- object$effects
        resid <- object$residuals
        fitted <- object$fitted
        ynames <- colnames(coef)
        if (is.null(ynames)) {
            lhs <- object$terms[[2]]
            if (mode(lhs) == "call" && lhs[[1]] == "cbind")
                ynames <- as.character(lhs)[-1]
            else ynames <- paste("Y", seq(ny), sep = "")
        }
        value <- vector("list", ny)
        names(value) <- paste("Response", ynames)
        cl <- oldClass(object)
        class(object) <- cl[match("mlm", cl):length(cl)][-1]
        for (i in seq(ny)) {
            object$coefficients <- coef[, i]
            object$residuals <- resid[, i]
            object$fitted.values <- fitted[, i]
            object$effects <- effects[, i]
            object$call$formula[[2]] <- object$terms[[2]] <- as.name(ynames[i])
            value[[i]] <- estimable(object, ...)
        }
        class(value) <- "listof"
        value
    }

Now this all works:

    X <- matrix(rnorm(50), 10, 5)
    Y <- matrix(rnorm(50), 10, 5)
    reg <- lm(Y ~ X)
    estimable(reg, c("(Intercept)"=0, X1=0, X2=0, X3=1, X4=0, X5=-1))

    Response Y1 :
                     Estimate Std. Error    t value DF   Pr(>|t|)
    (0 0 0 1 0 -1) -0.9024065  0.4334235  -2.082043  4  0.1057782

    Response Y2 :
                     Estimate Std. Error    t value DF   Pr(>|t|)
    (0 0 0 1 0 -1) -0.7017988  0.2199234  -3.191106  4 0.03318115

    Response Y3 :
                     Estimate Std. Error    t value DF   Pr(>|t|)
    (0 0 0 1 0 -1)  0.5412863  0.2632527   2.056147  4  0.1089276

    Response Y4 :
                     Estimate Std. Error    t value DF   Pr(>|t|)
    (0 0 0 1 0 -1) -0.1028162  0.5973959 -0.1721073  4    0.87171

    Response Y5 :
                     Estimate Std. Error    t value DF   Pr(>|t|)
    (0 0 0 1 0 -1)  0.2493330  0.2024061   1.231845  4  0.2854716

On Wed, 25 Jul 2007 18:30:36 -0500 Ranjan Maitra [EMAIL PROTECTED] wrote:

> Hi,
>
> I want to test for a contrast from a regression where I am regressing the columns of a matrix. In short, the following.
>
>     X <- matrix(rnorm(50), 10, 5)
>     Y <- matrix(rnorm(50), 10, 5)
>     lm(Y ~ X)
>
>     Call:
>     lm(formula = Y ~ X)
>
>     Coefficients:
>                  [,1]     [,2]     [,3]     [,4]     [,5]
>     (Intercept)   0.3350  -0.1989  -0.1932   0.7528   0.0727
>     X1            0.2007  -0.8505   0.0520   0.1501   0.3248
>     X2            0.3212   0.7008  -0.0963  -0.2584   0.6711
>     X3            0.3781  -0.7321   0.1907  -0.1721   0.3073
>     X4           -0.1778   0.2822  -0.0644  -0.2649  -0.4140
>     X5           -0.1079  -0.0475   0.6047  -0.8369  -0.5928
>
> I want to test for c'b = 0 where c is (let's say) the contrast (0, 0, 1, 0, -1). Is it possible to do so, in one shot, using gmodels or something else?
>
> Many thanks and best wishes,
> Ranjan
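
For readers without doBy or gmodels, the same contrast test can be sketched in base R. This is only an illustration of the textbook t-test for a linear combination c'b, applied to every response column of the matrix regression at once; it is not the code of esticon() or estimable(). The seed and the names cvec, Xd, XtX_inv and contrast_tests are assumptions made for the example.

```r
set.seed(1)                          # assumed seed, for reproducibility only
X <- matrix(rnorm(50), 10, 5)
Y <- matrix(rnorm(50), 10, 5)
fit <- lm(Y ~ X)                     # one "mlm" fit covering all five responses

cvec <- c(0, 0, 0, 1, 0, -1)         # contrast: coefficient of X3 minus coefficient of X5
Xd <- cbind(1, X)                    # design matrix including the intercept column
XtX_inv <- solve(crossprod(Xd))      # (X'X)^{-1}, shared by all responses

B <- coef(fit)                       # 6 x 5 matrix of coefficients
df <- fit$df.residual                # 10 observations - 6 coefficients = 4
est <- drop(cvec %*% B)              # c'b for each response
sigma2 <- colSums(residuals(fit)^2) / df               # residual variance per response
se <- sqrt(sigma2 * drop(t(cvec) %*% XtX_inv %*% cvec))
tval <- est / se
pval <- 2 * pt(abs(tval), df, lower.tail = FALSE)

contrast_tests <- data.frame(Estimate = est, Std.Error = se,
                             t.value = tval, DF = df, p.value = pval)
contrast_tests
```

Because everything is computed from one fit with vectorised arithmetic, nothing is printed per regression, which may matter when the number of response columns is very large.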
Re: [R] ROC curve in R
Dylan Beaudette wrote:
> On Thursday 26 July 2007 06:01, Frank E Harrell Jr wrote:
>> Note that even though the ROC curve as a whole is an interesting 'statistic' (its area is a linear translation of the Wilcoxon-Mann-Whitney-Somers-Goodman-Kruskal rank correlation statistics), each individual point on it is an improper scoring rule, i.e., a rule that is optimized by fitting an inappropriate model. Using curves to select cutoffs is a low-precision and arbitrary operation, and the cutoffs do not replicate from study to study. Probably the worst problem with drawing an ROC curve is that it tempts analysts to try to find cutoffs where none really exist, and it makes analysts ignore the whole field of decision theory.
>>
>> Frank Harrell
>
> Frank,
>
> This thread has caught my attention for a couple of reasons, possibly related to my novice-level experience.
>
> 1. In a logistic regression study, where I am predicting the probability of the response being 1 (for example), there exists a continuum of probability values and a finite number of {1,0} realities when I look either within the original data set or at a new 'verification' data set. I understand that drawing a line through the probabilities returned from the logistic regression is a loss of information, but there are times when a 'hard' decision requiring prediction of {1,0} is required. I have found that the ROCR package (not necessarily the ROC curve) can be useful in identifying the probability cutoff where accuracy is maximized. Is this an unreasonable way of using logistic regression as a predictor?

Logistic regression (with suitable attention to not assuming linearity and to avoiding overfitting) is a great way to estimate P[Y=1]. Given good predicted P[Y=1] and utilities (losses, costs) for incorrect positive and negative decisions, an optimal decision is one that optimizes expected utility. The ROC curve does not play a direct role in this regard. If per-subject utilities are not available, the analyst may make various assumptions about utilities (including the unreasonable but often used assumption that utilities do not vary over subjects) to find a cutoff on P[Y=1].

A very nice feature of P[Y=1] is that error probabilities are self-contained. For example, if P[Y=1] = .02 for a single subject and you predict Y=0, the probability of an error is .02 by definition. One doesn't need to compute an overall error probability over the whole distribution of subjects' risks. If the cost of a false negative is C, the expected cost is .02*C in this example.

> 2. The ROC curve can be a helpful way of communicating false positives / false negatives to other users who are less familiar with the output and interpretation of logistic regression.

What is more useful than that is a rigorous calibration curve estimate to demonstrate the faithfulness of predicted P[Y=1] and a histogram showing the distribution of predicted P[Y=1]. Models that put a lot of predictions near 0 or 1 are the most discriminating. Calibration curves and risk distributions are easier to explain than ROC curves. Too often a statistician will solve for a cutoff on P[Y=1], imposing her own utility function without querying any subjects.

> 3. I have been using the area under the ROC curve, Kendall's tau, and Cohen's kappa to evaluate the accuracy of a logistic regression based prediction, the last two statistics based on some probability cutoff identified beforehand.

ROC area (equiv. to Wilcoxon-Mann-Whitney and Somers' Dxy rank correlation between pred. P[Y=1] and Y) is a measure of pure discrimination, not a measure of accuracy per se. Rank correlation (concordance) measures do not require the use of cutoffs.

> How does the topic of decision theory relate to some of the circumstances described above? Is there a better way to do some of these things?

See above re: expected losses/utilities. Good questions.

Frank

> Cheers,
> Dylan
>
> [EMAIL PROTECTED] wrote:
>> http://search.r-project.org/cgi-bin/namazu.cgi?query=ROC&max=20&result=normal&sort=score&idxname=Rhelp02a&idxname=functions&idxname=docs
>>
>> There is a lot of help. Try help.search("ROC curve"), which gave:
>>
>> Help files with alias or concept or title matching 'ROC curve' using fuzzy matching:
>>
>> granulo(ade4)            Granulometric Curves
>> plot.roc(analogue)       Plot ROC curves and associated diagnostics
>> roc(analogue)            ROC curve analysis
>> colAUC(caTools)          Column-wise Area Under ROC Curve (AUC)
>> DProc(DPpackage)         Semiparametric Bayesian ROC curve analysis
>> cv.enet(elasticnet)      Computes K-fold cross-validated error curve for elastic net
>> ROC(Epi)                 Function to compute and draw ROC-curves.
>> lroc(epicalc)            ROC curve
>> cv.lars(lars)
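
Frank's point about expected cost can be illustrated with a short sketch. The predicted risks and the two cost values below are invented for the example, and the names cost_fn, cost_fp and decision are assumptions; the only idea taken from the reply is that predicting Y=0 for a subject with risk p has error probability p, so its expected cost is p times the cost of a false negative.

```r
# Hypothetical predicted risks P[Y=1] for three subjects (illustrative values only)
p <- c(0.02, 0.30, 0.85)

# Hypothetical costs of the two kinds of wrong decision (assumed, not from the thread)
cost_fn <- 10   # cost of predicting Y=0 when Y=1 (false negative)
cost_fp <- 1    # cost of predicting Y=1 when Y=0 (false positive)

# Expected cost of each decision for each subject
exp_cost_pred0 <- p * cost_fn         # predicting 0 is wrong with probability p
exp_cost_pred1 <- (1 - p) * cost_fp   # predicting 1 is wrong with probability 1 - p

# Choose the decision with the smaller expected cost
decision <- as.integer(exp_cost_pred1 < exp_cost_pred0)

data.frame(p, exp_cost_pred0, exp_cost_pred1, decision)
```

With these assumed costs the implied cutoff on P[Y=1] is cost_fp / (cost_fp + cost_fn), about 0.09, which comes from the stated costs and not from any point on an ROC curve.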
Re: [R] dates() is a great date function in R
Just save the spreadsheet as a csv file and use tisFromCsv() in the fame package. One of the arguments tisFromCsv() takes is a dateFormat, so you can tell it what format the date column is in. You can also tell it the name of the date column if it isn't some variation of DATE, Date, or date.

tisFromCsv() looks at the dates coming in and automatically figures out what frequency the data are (quarterly, monthly, weekly, daily, etc.) and creates a univariate or multivariate (if the spreadsheet has more than one data column) 'tis' (Time Indexed Series) object.

Jeff

Mr Natural [EMAIL PROTECTED] writes:

> Proper calendar dates in R are great for plotting and calculating. However, for the non-wonks among us, they can be very frustrating. I have recently discussed the pains that people in my lab have had with dates in R, especially the frustration of bringing date data into R from Excel, which we have to do a lot.
>
> Please find below a simple analgesic for R date importation that I discovered over the last 1.5 days (learning new stuff in R is calculated in 1/2 days).
>
> The function dates() gives the simplest way to get calendar dates into R from Excel that I can find. But straight importation of Excel dates, via a csv or txt file, can be a huge pain (I'll give details for anyone who cares to know).
>
> My pain killer is: consider that you have Excel columns in month, day, year format. Note that R hates date data that does not lead with the year.
>
> a. Load the chron library by typing library(chron) in the console. You know that you need this library from the information revealed by the query ?dates() in the Console window. This gives the R documentation help file for this and related time and date functions. In the upper left of the documentation, one sees dates(chron). This tells you that you need the library chron.
>
> b. Change the format of the dates in Excel to format General, which gives 5-digit Julian dates. Import the csv file (I use read.csv()) with the Julian dates and other data of interest.
>
> c. Now, change the Julian dates that came in with the csv file into calendar dates with the dates() function.
>
> Below is my code for performing this activity, concerning an R data file called ss. ss holds the Julian dates, illustrated below from the column MPdate:
>
>     ss$MPdate[1:5]
>     [1] 34252 34425 34547 34759 34773
>
> The dates() function makes calendar dates from Julian dates:
>
>     dmp <- dates(ss$MPdate, origin = c(month = 1, day = 1, year = 1900))
>     dmp[1:5]
>     [1] 10/12/93 04/03/94 08/03/94 03/03/95 03/17/95
>
> I would appreciate the comments of more sophisticated programmers who can suggest streamlining or shortcutting this operation.
>
> regards, Don
>
> --
> View this message in context: http://www.nabble.com/dates%28%29-is-a-great-date-function-in-R-tf4105322.html#a11675205
> Sent from the R help mailing list archive at Nabble.com.
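
As a cross-check on step c, here is a minimal base R sketch of the same conversion. It assumes the numbers are Excel serial dates from the Windows 1900 date system, for which the help page of as.Date() suggests the origin "1899-12-30"; a workbook using the 1904 date system would need a different origin. The variable name serial is made up for the example.

```r
# The five serial numbers shown in the post
serial <- c(34252, 34425, 34547, 34759, 34773)

# Assumption: Excel for Windows (1900 date system), so day 0 is taken as 1899-12-30
as.Date(serial, origin = "1899-12-30")
```

Under that assumption the first value converts to 1993-10-10, two days earlier than the 10/12/93 printed in the post, so it is worth comparing a few converted dates against the original spreadsheet before trusting either origin.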
Re: [R] ROC curve in R
http://search.r-project.org/cgi-bin/namazu.cgi?query=ROC&max=20&result=normal&sort=score&idxname=Rhelp02a&idxname=functions&idxname=docs

There is a lot of help. Try help.search("ROC curve"), which gave:

Help files with alias or concept or title matching 'ROC curve' using fuzzy matching:

granulo(ade4)            Granulometric Curves
plot.roc(analogue)       Plot ROC curves and associated diagnostics
roc(analogue)            ROC curve analysis
colAUC(caTools)          Column-wise Area Under ROC Curve (AUC)
DProc(DPpackage)         Semiparametric Bayesian ROC curve analysis
cv.enet(elasticnet)      Computes K-fold cross-validated error curve for elastic net
ROC(Epi)                 Function to compute and draw ROC-curves.
lroc(epicalc)            ROC curve
cv.lars(lars)            Computes K-fold cross-validated error curve for lars
roc.demo(TeachingDemos)  Demonstrate ROC curves by interactively building one

HTH. See the help and examples; those will suffice.

Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.

Regards,
Gaurav Yadav
+++
Assistant Manager, CCIL, Mumbai (India)
Mob: +919821286118
Email: [EMAIL PROTECTED]
Bhagavad Gita: Man is made by his Belief, as He believes, so He is

Rithesh M. Mohan [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
07/26/2007 11:26 AM
To: R-help@stat.math.ethz.ch
Subject: [R] ROC curve in R

> Hi,
>
> I need to build ROC curve in R, can you please provide data steps / code or guide me through it.
>
> Thanks and Regards
> Rithesh M Mohan
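
Since the original question asked for code, here is a minimal base R sketch of building an ROC curve by hand, without any of the packages listed above. The simulated outcome y, the scores, the seed and all variable names are assumptions for illustration; the packages in the search results offer more complete tools.

```r
set.seed(42)                                   # assumed seed, for reproducibility only
y <- rbinom(200, 1, 0.4)                       # hypothetical binary outcome
score <- plogis(-0.5 + 1.5 * y + rnorm(200))   # hypothetical predicted scores

# Sweep the cutoff over every observed score, from strictest to most lenient
thresholds <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(thresholds, function(t) mean(score[y == 1] >= t))  # sensitivity
fpr <- sapply(thresholds, function(t) mean(score[y == 0] >= t))  # 1 - specificity
roc <- data.frame(fpr = c(0, fpr), tpr = c(0, tpr))              # start the curve at (0, 0)

plot(roc$fpr, roc$tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)                          # reference line for a useless score

# Area under the curve via the Wilcoxon-Mann-Whitney rank formula
n1 <- sum(y == 1)
n0 <- sum(y == 0)
auc <- (sum(rank(score)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc
```

The rank formula reflects the relationship mentioned elsewhere in this thread between the ROC area and the Wilcoxon-Mann-Whitney statistic; it needs no cutoff at all.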
Re: [R] Convert string to list?
Try this. It pastes "list(" onto the front and ")" onto the end, giving list( P = 0.0, T = 0.0, Q = 0.0 ), and then parses and evaluates that as an R expression.

    Str <- "P = 0.0, T = 0.0, Q = 0.0"
    eval(parse(text = paste("list(", Str, ")")))

On 7/26/07, Manuel Morales [EMAIL PROTECTED] wrote:

> Let's say I have the following string:
>
>     str <- "P = 0.0, T = 0.0, Q = 0.0"
>
> I'd like to find a function that generates the following object from 'str':
>
>     list(P = 0.0, T = 0.0, Q = 0.0)
>
> Thanks!
>
> --
> http://mutualism.williams.edu
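
Because eval(parse()) will run whatever code the string contains, an alternative is to split the string instead of evaluating it. The following is only a sketch under the assumption that every entry has the form name = number, with commas between entries; the names s, pairs, keys, vals and result are made up for the example.

```r
s <- "P = 0.0, T = 0.0, Q = 0.0"

# Split into "name = value" pieces, then split each piece at "="
pairs <- strsplit(strsplit(s, ",")[[1]], "=")

keys <- trimws(sapply(pairs, `[`, 1))              # "P" "T" "Q"
vals <- as.numeric(trimws(sapply(pairs, `[`, 2)))  # 0 0 0

# Named numeric vector -> named list
result <- as.list(setNames(vals, keys))
str(result)
```

This handles only numeric values; strings that contain nested calls or quoted text would still need the parse-based approach.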
[R] reading stata files: preserving values of variables converted to factors
Hi, I am a Stata user new to R. I am using read.dta to read a Stata file that has variables with value labels. read.dta converts them to factors, but seems to recode them with values from 1 to number of factor levels (looking at the output of unclass(varname)), so the original numerical values are lost. Using convert.factors=FALSE preserves the values, but seems to discard the labels. Is it possible to get these variables into R while preserving both the values and the labels? Thanks, Ben
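
One possible way to keep both pieces of information, sketched in base R only: hold the numeric codes in one column and build a labelled factor next to it from a table that maps labels to codes. The codes, the labels and the names label_table, smoke_code and smoke are all hypothetical. The help page for read.dta describes attributes on the returned data frame that carry the Stata value labels, and it is an assumption here that they can be brought into this named-vector form.

```r
# Hypothetical numeric codes as they would arrive with convert.factors = FALSE
codes <- c(0, 1, 5, 1, 0)

# Hypothetical value-label table: names are the labels, values are the Stata codes
label_table <- c(never = 0, sometimes = 1, always = 5)

# Build a factor whose levels follow the label table, keeping the codes alongside
f <- factor(codes, levels = label_table, labels = names(label_table))
dat <- data.frame(smoke_code = codes, smoke = f)
dat

# The original codes can be recovered from the factor through the same table
recovered <- unname(label_table[as.character(f)])
recovered
```

Keeping the code column and the factor side by side avoids relying on unclass(), which returns positions 1..k in the level set and not the Stata values.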