Re: [R] ANOVA 1 too few degrees of freedom
On May 5, 2011, at 23:30, Rovinpiper wrote:

> Thanks slre, I seem to be making some progress now. Using a colon instead of an asterisk in the code really changes things. I had been getting residual SS and MS of zero, which is ridiculous. Now I get much more plausible values. Also, when I used an asterisk instead of a colon it wouldn't give results for three-way interactions; with colons it will. You are correct about plot being nested within treatment. There are six plots in each of 2 treatments. So I guess I will have to perform a separate analysis to quantify the effect of treatment.

Beware: since you have highly significant effects of plot and its interaction with day, and plot is nested in treatment, you can't test for a treatment or treatment:day effect with a systematic effect of plot and plot:treatment in the model. (You are only getting p-values because of the sequential computation of the ANOVA table -- if you put plot before treatment, you'd get zero df.) More likely, you want to make the plot terms random, as in

    ~ treatment*day + Error(plot/day)

> Thanks again.
>
> Analysis of Variance Table
>
> Response: Combined.Rs
>                                                      Df  Sum Sq Mean Sq  F value    Pr(>F)
> Combined.Trt                                          1   52.80  52.805  96.2601 < 2.2e-16 ***
> Combined.Plot                                        10  677.69  67.769 123.5380 < 2.2e-16 ***
> as.factor(Combined.Day)                              16 2817.47 176.092 321.0041 < 2.2e-16 ***
> Combined.Trt:as.factor(Combined.Day)                 16   47.82   2.989   5.4487 4.048e-10 ***
> Combined.Trt:Combined.Plot:as.factor(Combined.Day)   80  455.42   5.693  10.3776 < 2.2e-16 ***
> Residuals                                           284  155.79   0.549
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

--
View this message in context: http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3499649.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
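For readers following along, the random-plot specification Peter suggests can be sketched as below. This is a minimal illustration on simulated data; all variable names (treatment, day, plot, y) are stand-ins, not the poster's actual columns.

```r
# Sketch of the suggested analysis: plot (nested in treatment) as an
# error stratum rather than a fixed effect, so treatment is tested
# against plot-to-plot variation.
set.seed(1)
d <- expand.grid(plot = factor(1:6),
                 treatment = factor(c("A", "B")),
                 day = factor(1:4))
d$plot <- interaction(d$treatment, d$plot)  # plots are nested in treatment
d$y <- rnorm(nrow(d))

fit <- aov(y ~ treatment * day + Error(plot/day), data = d)
summary(fit)
```

With this formulation the sequential-ANOVA artefact described above disappears, because treatment is no longer tested against the within-plot residual.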
Re: [R] Using functions/loops for repetitive commands
Hello, Derek,

first of all, be very aware of what David Winsemius said; you are about to enter the area of unprincipled data-mining (as he called it) with its trap -- one of many -- of multiple testing. So, *if* you know what the consequences and possible remedies are, a purely R-syntactic solution to your problem might be the (again not fully tested) hack below.

> If so how can I change my code to automate the chisq.test in the same way I did for the wilcox.test?

Try

    lapply( your_data_frame[selection_of_relevant_components],
            function( y) chisq.test( y, your_data_frame$group_name) )

or even shorter:

    lapply( your_data_frame[selection_of_relevant_components],
            chisq.test, your_data_frame$group_name )

However, in the resulting output you will not see the names of the variables that went into the first argument of chisq.test(). This is a little bit more complicated to resolve:

    lapply( names( your_data_frame[selection_of_relevant_components]),
            function( y) eval( substitute(
              chisq.test( your_data_frame$y0, your_data_frame$tension),
              list( y0 = y) ) ) )

Still another possibility is to use xtabs() (with its summary method), which has a formula argument.

Hoping that you know what to do with the results -- Gerrit

-
Dr. Gerrit Eichner, Mathematical Institute, Room 212
gerrit.eich...@math.uni-giessen.de
Justus-Liebig-University Giessen, Arndtstr. 2, 35392 Giessen, Germany
Tel: +49-(0)641-99-32104  Fax: +49-(0)641-99-32109
http://www.uni-giessen.de/cms/eichner
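A lighter-weight way to keep the variable names, without substitute(), is to patch each htest's data.name field. This is a hedged sketch on toy data; the data frame and column names here are invented for illustration.

```r
# Toy data -- names are illustrative, not the poster's.
set.seed(7)
d <- data.frame(a   = factor(sample(letters[1:2], 50, replace = TRUE)),
                b   = factor(sample(letters[1:3], 50, replace = TRUE)),
                grp = factor(sample(c("x", "y"), 50, replace = TRUE)))
vars <- c("a", "b")

# Run the test per column; set data.name so the printout shows which
# column was tested, and name the result list after the variables.
res <- lapply(vars, function(nm) {
  ct <- suppressWarnings(chisq.test(d[[nm]], d$grp))
  ct$data.name <- paste(nm, "and grp")
  ct
})
names(res) <- vars
res$a  # prints "data:  a and grp" in its header
```

The design choice here is simply that an htest object is an ordinary list, so its data.name component can be overwritten after the fact.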
Re: [R] help with a survplot
On Thu, 5 May 2011 18:42:11 -0700 (PDT), Frank Harrell f.harr...@vanderbilt.edu wrote:

> Hi Marco, You're welcome. The number at risk at given time points is a fairly standard thing to add to survival plots.

I know, but last year, as a newbie in biostatistics, I felt the need to read the rms book exactly because there were plenty of standard things that did not convince me.
Re: [R] Vermunt's LEM in R
poLCA is an option, as are randomLCA, flexmix, and depmixS4, and there are likely to be more. What specific models are you interested in?

Best, Ingmar

On Fri, May 6, 2011 at 6:13 AM, Wincent ronggui.hu...@gmail.com wrote:

> I guess LEM is software for latent class analysis. If so, you may want to have a look at the poLCA package.
>
> Regards, Ronggui
>
> On 5 May 2011 23:34, David Joubert jo...@hotmail.com wrote:
>
>> Hello- Does anyone know of packages that could emulate what J. Vermunt's LEM does? What is the closest relative in R? I use both R and LEM but have trouble transforming my multiway tables in R into a .dat file compatible with LEM.
>> Thanks, David Joubert

--
Wincent Ronggui HUANG
Sociology Department of Fudan University
PhD of City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html
[R] rcspline.problem
Dear Dr.;

I am a PhD student at the Epidemiology department of the National University of Singapore. I used the R command rcspline.plot for plotting a restricted cubic spline -- the model is based on Cox. I managed to get a plot without adjustment for other covariates, but I have a problem regarding adjustment for the confounders. I applied the commands below to generate the matrix of covariates:

    m  = as.matrix(age, sex)
    m1 = matrix(age, sex)
    m2 = cbind(age, sex)

But when I pass adj=m, or adj=m1, or adj=m2 to the model, R gives this error:

    Error in pchisq(q, df, lower.tail, log.p) :
      Non-numeric argument to mathematical function
    In addition: Warning message:
    In coxph.fit(cbind(x, xx, adj), cbind(y, event), strata = NULL, :
      Loglik converged before variable 1,2,3,4 ; beta may be infinite.

I would be grateful if you would take my issue into consideration and help me with this case.

Sincerely Yours,
Haleh Ghaem
PhD student, NUS
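For reference, the three constructions above behave very differently, and only one of them yields the two-column numeric matrix an adjustment argument expects. A minimal sketch with made-up vectors (not the poster's data):

```r
# Illustration with invented vectors; 'age' and 'sex' here are stand-ins.
age <- c(30, 45, 50, 62)
sex <- c(1, 0, 1, 0)

as.matrix(age, sex)  # as.matrix() takes one object; 'sex' is silently ignored
                     # -> a 4 x 1 matrix, sex never enters the model
# matrix(age, sex)   # would match 'sex' to the nrow argument, not as data
m2 <- cbind(age, sex)  # a 4 x 2 numeric matrix -- the form an 'adj' matrix needs

# If the error persists, check that every adjustment column is numeric;
# factor or character columns are a common cause of
# "Non-numeric argument to mathematical function".
is.numeric(m2)
```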
[R] for loop
Hi,

I'm hoping someone can offer some advice: I have a matrix x of dimensions 160 by 1. I need to create a matrix y where the first 7 elements are equal to x[1]^1/7, the next 6 equal to x[2]^1/6, the next seven x[3]^1/7, and so on, all the way to the 1040th element. I implemented this with a for loop an hour ago and it is still running; can anyone offer suggestions as to how I can create this matrix without using loops? I would really appreciate any suggestions.

Regards, Andre
Re: [R] Draw a nomogram after glm
Hi,

I use the datadist function in the rms library in order to draw my nomogram. After reading, I try this code:

    f <- lrm(Y~L+P, data=donnee)
    d <- datadist(f, data=donnee)
    options(datadist="d")
    f <- lrm(Y~L+P)
    summary(f, L=c(0,506,10), P=c(45,646,10))
    plot(Predict(f, L=200:800, P=3))

Unfortunately, I get an error after the 2nd line:

    Erreur dans datadist(f, data = donnee) : program logic error

(Error in datadist(f, data = donnee) : program logic error)

Please could you provide me a simpler document which is more understandable for a new R user? Thanks for your help.

Komine
--
View this message in context: http://r.789695.n4.nabble.com/Draw-a-nomogram-after-glm-tp3498144p3501534.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] RV: R question
> What is the maximum number of digits that R has? Because SQL works with 50 digits, I think, and I need software that works with a lot of digits.

The .Machine variable (a list, not a function) will provide some insight into these matters.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
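To make the answer concrete: base R's doubles carry roughly 15-16 significant decimal digits, which .Machine reports directly; 50-digit arithmetic needs an arbitrary-precision add-on package. The Rmpfr line below is a hedged suggestion (assuming that package is installed), not part of base R.

```r
# Precision of base R's numeric type:
.Machine$double.digits  # 53 bits of mantissa, i.e. ~15-16 decimal digits
.Machine$double.eps     # smallest x with 1 + x != 1, about 2.22e-16

# For ~50 decimal digits, an arbitrary-precision package is needed,
# e.g. Rmpfr (assumption: installed from CRAN):
# library(Rmpfr)
# mpfr(2, precBits = 200)   # roughly 60 decimal digits of precision
```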
Re: [R] for loop
On Fri, May 06, 2011 at 02:28:57PM +1000, andre bedon wrote:

> Hi, I'm hoping someone can offer some advice: I have a matrix x of dimensions 160 by 1. I need to create a matrix y, where the first 7 elements are equal to x[1]^1/7, then the next 6 equal to x[2]^1/6, next seven x[3]^1/7 and so on all the way to the 1040th element. I have implemented this with a for loop an hour ago and it is still running, can anyone offer any suggestions as to how I can create this matrix without using loops?

Hi.

Since indexing x[1], x[2], ... is used and the description of y corresponds more to a vector, let me first suggest a solution for vectors.

    x <- rep(42, times=4)   # any vector of even length
    x <- x/c(7, 6)
    rep(x, times=rep(c(7, 6), length=length(x)))
     [1] 6 6 6 6 6 6 6 7 7 7 7 7 7 6 6 6 6 6 6 6 7 7 7 7 7 7

The input vector may be obtained using c() from a matrix. The output vector may be reformatted using matrix(). However, for a matrix solution, a more precise description of the question is needed.

Hope this helps.

Petr Savicky.
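Applying the same rep() pattern to the poster's actual sizes: 160 input values in alternating blocks of 7 and 6 give 80 * (7 + 6) = 1040 output elements. One point to hedge: as literally written, x[1]^1/7 parses in R as (x[1]^1)/7, i.e. division; if seventh roots were intended, use x^(1/sizes) instead. The stand-in data below is simulated.

```r
# Vectorized version of the poster's loop.
x <- runif(160)                               # stand-in for the real 160 x 1 matrix
sizes <- rep(c(7, 6), length.out = length(x)) # 7, 6, 7, 6, ... per input element

# x[i]^1/j is (x[i]^1)/j by R's precedence; swap in x^(1/sizes) for roots.
y <- rep(x / sizes, times = sizes)
length(y)                                     # 1040
y <- matrix(y, ncol = 1)                      # as a 1040 x 1 matrix, if required
```

This runs in microseconds where the element-by-element loop (likely growing y by assignment each iteration) crawls.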
Re: [R] MacKinnon critical value
Hello Lee,

in addition to David's answer, see ?MacKinnonPValues in package 'urca' (CRAN and R-Forge).

Best, Bernhard

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
Sent: Friday, 6 May 2011 04:46
To: Lee Schulz
Cc: R-help@r-project.org
Subject: Re: [R] MacKinnon critical value

On May 5, 2011, at 9:10 PM, Lee Schulz wrote:

> Hello, I am doing an Engle-Granger test on the residuals of two I(1) processes. I would like to get the MacKinnon (1996) critical value, say at 10%. I have 273 observations with 5 integrated explanatory variables, so that k=4. Could someone help me with the procedure in R?

See if this helps:

http://search.r-project.org/cgi-bin/namazu.cgi?query=Engle+Granger+mackinnon&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02

--
David Winsemius, MD
West Hartford, CT
Re: [R] Looping over graphs in igraph
Hi Danielle.

You appear to have two problems:

1) getting the data into R

Because I don't have the file at hand, I'm going to simulate reading it through a text connection:

    orgdata <- textConnection("Graph ID | Vertex1 | Vertex2 | weight\n1 | Alice | Bob | 2\n1 | Alice | Chris | 1\n1 | Alice | Jane | 2\n1 | Bob | Jane | 2\n1 | Chris | Jane | 3\n2 | Alice | Tom | 2\n2 | Alice | Kate | 1\n2 | Kate | Tom | 3\n2 | Tom | Mike | 2")
    dfr <- read.table(orgdata, header=TRUE, sep="|", as.is=TRUE, strip.white=TRUE)

For you, this would probably be more like

    dfr <- read.table("somepath/fileOfInterest.csv", header=TRUE, sep="|",
                      as.is=TRUE, strip.white=TRUE)

2) performing actions per graph id

    require(igraph)
    result <- sapply(unique(dfr$Graph.ID), function(curID){
      # There may be more elegant ways of creating the graphs per ID, but it works
      curDfr <- dfr[dfr$Graph.ID==curID,]
      g <- graph.edgelist(as.matrix(curDfr[,c("Vertex1", "Vertex2")]))
      g <- set.edge.attribute(g, "weight", value=curDfr$weight)
      # return whatever information you're interested in, based on graph object g
      # for now I'm just returning edge and vertex counts
      return(c(v=vcount(g), e=ecount(g)))
    })
    colnames(result) <- unique(dfr$Graph.ID)
    print(result)

HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Danielle Li
Sent: Thursday, 5 May 2011 22:25
To: r-help@r-project.org
Subject: [R] Looping over graphs in igraph

Hi,

I'm trying to do some basic social network analysis with igraph in R, but I'm new to R and haven't been able to find documentation on a couple of basic things: I want to run igraph's community detection algorithms on a couple thousand small graphs, but don't know how to automate igraph looking at multiple graphs described in a single csv file.
My data look like something in ncol format, but with an additional column that has an ID for which graph the edge belongs in:

    Graph ID | Vertex1 | Vertex2 | weight
    1 | Alice | Bob   | 2
    1 | Alice | Chris | 1
    1 | Alice | Jane  | 2
    1 | Bob   | Jane  | 2
    1 | Chris | Jane  | 3
    2 | Alice | Tom   | 2
    2 | Alice | Kate  | 1
    2 | Kate  | Tom   | 3
    2 | Tom   | Mike  | 2

and so on and so forth for about 2000 graph IDs, each with about 20-40 vertices.

I've tried using the split command, but it doesn't recognize my graph id ("object 'graphid' not found") -- this may just be because I don't know how to refer to a column of a csv as an object. Ultimately, I want to run community detection on each graph separately -- to look only at the edges when the graph identifier is 1, make calculations on that graph, then do it again for 2 and so forth. I suspect that this isn't related to igraph specifically -- I just don't know the equivalent command in R for what in pseudo Stata code would read as:

    forvalues i of 1/N {
      temp_graph = subrows of the main csv file for which graphid==`i'
      cs`i' = leading.eigenvector.community.step(temp_graph)
      convert cs`i'$membership into a column in the original csv
    }

I want the output to look something like:

    Graph ID | Vertex1 | Vertex2 | weight | Vertex1 membership | Vertex2 membership | # of communities in the graph
    1 | Alice | Bob   | 2 | A | B | 2
    1 | Alice | Chris | 1 | A | B | 2
    1 | Alice | Jane  | 2 | A | B | 2
    1 | Bob   | Jane  | 2 | B | B | 2
    1 | Chris | Jane  | 3 | B | B | 2
    2 | Alice | Tom   | 2 | A | B | 3
    2 | Alice | Kate  | 1 | A | C | 3
    2 | Kate  | Tom   | 3 | C | B | 3
    2 | Tom   | Mike  | 2 | B | C | 3

Here, the graphs are treated completely separately, so that community A in graph 1 need not have anything to do with community A in graph 2. I would really appreciate any ideas you guys have. Thank you!
Danielle
Re: [R] for loop with global variables
Hi,

sorry for the late response and many thanks. A combination of get() and paste() did the job.

Regards

On Thu, Apr 28, 2011 at 5:06 PM, Petr PIKAL petr.pi...@precheza.cz wrote:

Hi

ivan i.pet...@gmail.com wrote on 28.04.2011 16:16:16:

> Subject: [R] for loop with global variables
>
> Hi, is there a possibility to use global variables in a for loop? More specifically, I want to do the following:
>
>     output.1 <- rbind(a,b)
>     output.2 <- rbind(c,d)
>     output.3 <- rbind(e,f)
>     ...
>     output.n <- rbind(...,...)
>
> Next I want to create a data frame with two columns:
>
>     Outputs    Values
>     output.1   a,b
>     output.2   c,d
>     output.3   e,f
>     ...
>     output.n   ...,...
>
> My problem is that I do not know how to define a loop over global variables. Anybody an idea?

Don't do that. As Jim suggested, use a list inside a loop.

    # declare a list
    lll <- vector("list", 3)
    # do your loop
    for(i in 1:3) lll[[i]] <- letters[i]
    lll
    [[1]]
    [1] "a"
    [[2]]
    [1] "b"
    [[3]]
    [1] "c"

But I feel that you could maybe achieve the desired result without a loop, in a more R-ish way. If what you want to do is reasonable and frequent, I believe somebody has already made a function which does what you want.

Regards, Petr
Re: [R] Insert values to histogram
On 05/05/2011 09:50 PM, matibie wrote:

> I'm trying to add the exact value on top of each column of a histogram; I have been trying with the text function but it doesn't work. The problem is that the program itself decides the exact value to give to each column, and it is not like a bar-plot where I know exactly which values are being plotted.

Hi Matias,

You are probably using the hist function in the graphics package. If so, that function returns a list containing components named "counts" (for frequency histograms) and "density" (for density histograms). So if you collect that list:

    histinfo <- hist(...)
    histinfo$counts

you will see the heights of the bars. As Greg has noted, many people do not agree with adding the counts to the plot, but if you want to do it, there are your numbers.

Jim
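Putting Jim's pieces together, the returned list also has a mids component (bar midpoints), which gives text() its x coordinates. A minimal self-contained sketch on simulated data:

```r
# Label each bar of a frequency histogram with its count.
set.seed(42)
x <- rnorm(200)

# Draw the histogram, leaving headroom above the tallest bar for the labels.
counts <- hist(x, plot = FALSE)$counts
histinfo <- hist(x, ylim = c(0, max(counts) * 1.15))

# mids gives the x position of each bar; pos = 3 places text just above.
text(x = histinfo$mids, y = histinfo$counts,
     labels = histinfo$counts, pos = 3)
```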
Re: [R] Null
On 05/05/2011 10:48 PM, pcc wrote:

> This is probably a very simple question, but I am completely stumped! I am trying to do a Shapiro-Wilk test on a relatively small dataset (75) and each time my variable keeps coming out as 'NULL':
>
>     shapiro.test(fcv)
>     Error in complete.cases(x) : no input has determined the number of cases
>
> my text file looks like this:

Hi pcc,

I think the problem may be in the way you are reading in the data. Try this (I named the data file null.csv):

    fcv <- read.csv("null.csv")
    shapiro.test(fcv[,1])

Jim
Re: [R] [caret package] [trainControl] supplying predefined partitions to train with cross validation
Hi,

I did a similar experiment with my data. Maybe the following code will give you some idea. It might not be the best solution, but for me it worked. Please do share if you get another idea. Thank you.

    ### CODE ###
    library(dismo)
    set.seed(111)
    dd <- read.delim("yourfile.csv", sep=",", header=TRUE)
    # To keep a check on errors
    options(error=utils::recover)
    # dd - data to be split for 10-fold CV; this will split the complete data into 10 folds
    number <- kfold(dd, k=10)
    # case 1: if k == 1
    x <- NULL
    # retrieve all the indices (from your data) for the 1st fold in x, such that
    # you can use it as a test set and the remaining can be used as a train set
    # for the 1st iteration.
    x <- which(number==k)

On Thu, May 5, 2011 at 11:43 PM, Fabon Dzogang fabon.dzog...@lip6.fr wrote:

> Hi all,
>
> I run R 2.11.1 under ubuntu 10.10 and caret version 2.88. I use the caret package to compare different models on a dataset. In order to compare their performances I would like to use the same data partitions for every model. I understand that using a LGOCV or a boot type re-sampling method along with the index argument of the trainControl function, one is able to supply a training partition to the train function. However, I would like to apply a 10-fold cross validation to validate the models and I did not find any way to supply a predefined partition (created with createFolds) in this setting. Any help?
>
> Thank you, and great package by the way!
>
> Fabon Dzogang.
Re: [R] [caret package] [trainControl] supplying predefined partitions to train with cross validation
Hello,

Thank you for your reply, but I'm not sure your code answers my needs; from what I read, it creates a 10-fold partition and then extracts the kth fold for future processing. My question was rather: once I have a 10-fold partition of my data, how do I supply it to the train function of the caret package? Here's some sample code:

    folds <- createFolds(my_dataset_classes, 10)
    # I can't use index=folds on this one: it would train on each 1/k of the
    # data and test on the remaining (k-1)/k
    t_control <- trainControl(method="cv", number=10)
    # here I would like train to take account of my predefined folds
    model <- train(my_dataset_predictors, my_dataset_classes,
                   method="svmLinear", trControl = t_control)

Cheers, Fabon.

On Fri, May 6, 2011 at 10:59 AM, neetika nath nikkiha...@gmail.com wrote: [previous message quoted in full -- snipped]
--
Fabon Dzogang
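For later readers: caret's createFolds() can return the *training* rows directly (returnTrain = TRUE), which is the shape the index argument of trainControl expects, including for method = "cv". This is a hedged sketch against later caret versions than the 2.88 discussed in the thread; behaviour in that old version may differ. The train() call is commented out because it pulls in a model-specific backend (kernlab for svmLinear).

```r
library(caret)

# 'index' must hold the training rows of each resample; createFolds()
# with returnTrain = TRUE yields exactly that (one element per fold).
data(iris)
folds <- createFolds(iris$Species, k = 10, returnTrain = TRUE)

t_control <- trainControl(method = "cv", number = 10, index = folds)

# model <- train(iris[, 1:4], iris$Species,
#                method = "svmLinear", trControl = t_control)
```

With a fixed seed before createFolds(), the same partition can then be reused across every model being compared, which was the original goal.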
Re: [R] Insert values to histogram
An alternative approach:

    library(fdth)
    fd <- fdt(rnorm(1e3, m=10, sd=2))
    plot(fd)
    breaks <- with(fd, seq(breaks["start"], breaks["end"], breaks["h"]))
    mids <- 0.5 * (breaks[-1] + breaks[-length(breaks)])
    y <- fd$table[, 2]
    text(x=mids, y=y, lab=y, pos=3)

HTH, JCFaria
--
View this message in context: http://r.789695.n4.nabble.com/Insert-values-to-histogram-tp3498140p3502237.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Insert values to histogram
Thanks a lot, I owe you all 10 points of my grade!!
--
View this message in context: http://r.789695.n4.nabble.com/Insert-values-to-histogram-tp3498140p3502017.html
Sent from the R help mailing list archive at Nabble.com.
[R] [R-users] Problems with the functions samplingSimple and morris
Dear R-users,

I am trying to run sensitivity and uncertainty analyses with R using the following functions:

- samplingSimple from the package SMURFER
- morris from the package sensitivity

I have a different problem with each of these two functions:

- the function samplingSimple seems to work, but it generates text files where there is no matrix, just a list of values, and I need a matrix for my analysis
- the function morris does not work at all; when I try to run it, I get the following error message:

    Erreur dans cat(list(...), file, sep, fill, labels, append) :
      argument 1 (type 'list') pas encore traité par cat

(Error in cat(...): argument 1 (type 'list') cannot yet be handled by cat.)

Does anybody have a solution? Thanks a lot.

Thibault Charles
Solamen
Audencia - 8 route de la Jonelière, 44300 Nantes
+33 2 40 37 46 76
Re: [R] Using $ accessor in GAM formula
G'day Rolf,

On Fri, 06 May 2011 09:58:50 +1200, Rolf Turner rolf.tur...@xtra.co.nz wrote:

> but it's strange that the dodgy code throws an error with gam(dat1$y ~ s(dat1$x)) but not with gam(dat2$cf ~ s(dat2$s)). Something a bit subtle is going on; it would be nice to be able to understand it.

Well,

    R> traceback()
    3: eval(expr, envir, enclos)
    2: eval(inp, data, parent.frame())
    1: gam(dat$y ~ s(dat$x))

So the lines leading up to the problem seem to be the following from the gam() function:

    vars <- all.vars(gp$fake.formula[-2])
    inp <- parse(text = paste("list(", paste(vars, collapse = ","), ")"))
    if (!is.list(data) && !is.data.frame(data)) data <- as.data.frame(data)

Setting

    R> options(error=recover)

running the code until the error occurs, and then examining the frame number for the gam() call shows that inp is

    expression(list( dat1,x ))

in your first example and

    expression(list( dat2,s ))

in your second example. In both examples, data is list() (not surprisingly). When

    dl <- eval(inp, data, parent.frame())

is executed, it tries to eval inp; in both cases dat1 and dat2 are found, obviously, in the parent frame. In your first example x is (typically) not found and an error is thrown; in your second example an object with name s is found in package:mgcv and the call to eval succeeds. dl becomes a list with two components, the first being, respectively, dat1 or dat2, and the second the body of the function s. (To verify that, you should probably issue the command debug(gam) and step through those first few lines of the function until you reach the above command.)

The corollary is that you can use the name of any object that R will find in the parent frame; if it is another data set, then that data set will become the second component of dl.
E.g.:

    R> dat=data.frame(min=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
    R> gam(dat$cf ~ s(dat$min))

    Family: gaussian
    Link function: identity

    Formula: dat$cf ~ s(dat$min)
    Estimated degrees of freedom: 3.8925  total = 4.892488
    GCV score: 0.002704789

Or

    R> dat=data.frame(BOD=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
    R> gam(dat$cf ~ s(dat$BOD))

    Family: gaussian
    Link function: identity

    Formula: dat$cf ~ s(dat$BOD)
    Estimated degrees of freedom: 3.9393  total = 4.939297
    GCV score: 0.002666985

> Just out of pure academic interest. :-)

Hope your academic curiosity is now satisfied. :)

HTH.

Cheers, Berwin

== Full address
A/Prof Berwin A Turlach
School of Maths and Stats (M019)
The University of Western Australia
35 Stirling Highway, Crawley WA 6009, Australia
Tel.: +61 (8) 6488 3338 (secr), +61 (8) 6488 3383 (self)
FAX: +61 (8) 6488 1028
e-mail: berwin.turl...@gmail.com
http://www.maths.uwa.edu.au/~berwin
[R] maximum likelihood convergence reproducing Anderson Blundell 1982 Econometrica R vs Stata
Dear R-help,

I am trying to reproduce some results presented in a 1982 Econometrica paper by Anderson and Blundell using R. The estimation I want to reproduce concerns maximum likelihood estimation of a singular equation system. I can estimate the static model successfully in Stata, but for the dynamic models I have difficulty getting convergence. My R program, which uses the same likelihood function as in Stata, has convergence problems even for the static case. I have copied my R program and the data below. I realise the code could be made more elegant -- but it is short enough. Any ideas would be highly appreciated.

    ## model 18
    lnl <- function(theta, y1, y2, x1, x2, x3) {
      n <- length(y1)
      beta <- theta[1:8]
      e1 <- y1 - theta[1] - theta[2]*x1 - theta[3]*x2 - theta[4]*x3
      e2 <- y2 - theta[5] - theta[6]*x1 - theta[7]*x2 - theta[8]*x3
      e <- cbind(e1, e2)
      sigma <- t(e)%*%e
      logl <- -1*n/2*(2*(1+log(2*pi)) + log(det(sigma)))
      return(-logl)
    }
    p <- optim(0*c(1:8), lnl, method="BFGS", hessian=TRUE,
               y1=y1, y2=y2, x1=x1, x2=x2, x3=x3)

    year,y1,y2,x1,x2,x3
    1929,0.554779,0.266051,9.87415,8.60371,3.75673
    1930,0.516336,0.297473,9.68621,8.50492,3.80692
    1931,0.508201,0.324199,9.4701,8.27596,3.80437
    1932,0.500482,0.33958,9.24692,7.99221,3.76251
    1933,0.501695,0.276974,9.35356,7.98968,3.69071
    1934,0.591426,0.287008,9.42084,8.0362,3.63564
    1935,0.565047,0.244096,9.53972,8.15803,3.59285
    1936,0.605954,0.239187,9.6914,8.32009,3.56678
    1937,0.620161,0.218232,9.76817,8.42001,3.57381
    1938,0.592091,0.243161,9.51295,8.19771,3.6024
    1939,0.613115,0.217042,9.68047,8.30987,3.58147
    1940,0.632455,0.215269,9.78417,8.49624,3.57744
    1941,0.663139,0.184409,10.0606,8.69868,3.6095
    1942,0.698179,0.164348,10.2892,8.84523,3.4
    1943,0.70459,0.146865,10.4731,8.93024,3.65388
    1944,0.694067,0.161722,10.4465,8.96044,3.62434
    1945,0.674668,0.197231,10.279,8.82522,3.61489
    1946,0.635916,0.204232,10.1536,8.77547,3.67562
    1947,0.642855,0.187224,10.2053,8.77481,3.82632
    1948,0.641063,0.186566,10.2227,8.83821,3.96038
1949,0.646317,0.203646,10.1127,8.82364,4.0447
1950,0.645476,0.187497,10.2067,8.84161,4.08128
1951,0.63803,0.197361,10.2773,8.9401,4.10951
1952,0.634626,0.209992,10.283,9.01603,4.1693
1953,0.631144,0.219287,10.3217,9.06317,4.21727
1954,0.593088,0.235335,10.2101,9.05664,4.2567
1955,0.60736,0.227035,10.272,9.07566,4.29193
1956,0.607204,0.246631,10.2743,9.12407,4.32252
1957,0.586994,0.256784,10.2396,9.1588,4.37792
1958,0.548281,0.271022,10.1248,9.14025,4.42641
1959,0.553401,0.261815,10.2012,9.1598,4.4346
1960,0.552105,0.275137,10.1846,9.19297,4.43173
1961,0.544133,0.280783,10.1479,9.19533,4.44407
1962,0.55382,0.281286,10.197,9.21544,4.45074
1963,0.549951,0.28303,10.2036,9.22841,4.46403
1964,0.547204,0.291287,10.2271,9.23954,4.48447
1965,0.55511,0.281313,10.2882,9.26531,4.52057
1966,0.558182,0.280151,10.353,9.31675,4.58156
1967,0.545735,0.294385,10.3351,9.35382,4.65983
1968,0.538964,0.294593,10.3525,9.38361,4.71804
1969,0.542764,0.299927,10.3676,9.40725,4.76329
1970,0.534595,0.315319,10.2968,9.39139,4.81136
1971,0.545591,0.315828,10.2592,9.34121,4.84082
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
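To make the post self-contained, the CSV block above can be pulled into R and fed to the posted objective like this (a sketch, not from the original thread; `csvtxt` is assumed to hold the data block verbatim, header line included):

```r
## csvtxt: the "year,y1,y2,x1,x2,x3" block above, pasted as one string
dat <- read.csv(text = csvtxt)

## Fit the static model with the lnl() function defined in the post
p <- with(dat,
          optim(rep(0, 8), lnl, method = "BFGS", hessian = TRUE,
                y1 = y1, y2 = y2, x1 = x1, x2 = x2, x3 = x3))

p$convergence  # optim's convergence code: 0 means it reports success
p$par          # the eight coefficient estimates
```

This reproduces the poster's setup exactly; whether optim converges from the zero start is the open question of the thread.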
Re: [R] functions pandit and treebase in the package apTreeshape
I think those functions are now defunct (they were only available in previous versions). S

On Thursday, May 5, 2011 at 6:33 PM, Andrew Robinson wrote: Hi Arnau, please send the output of sessionInfo() and the exact commands and response that you used to install and load apTreeshape. Cheers Andrew

On Thu, May 05, 2011 at 04:42:58PM +0200, Arnau Mir wrote: Hello. I'm trying to use the functions pandit and treebase. They are in the package apTreeshape. Once I've loaded the package, R responds that there is no function pandit/treebase. Does somebody know why, or what the reason is? Thanks, Arnau.

Arnau Mir Torres Edifici A. Turmeda Campus UIB Ctra. Valldemossa, km. 7,5 07122 Palma de Mca. tel: (+34) 971172987 fax: (+34) 971173003 email: arnau@uib.es URL: http://dmi.uib.es/~arnau

-- Andrew Robinson Program Manager, ACERA Department of Mathematics and Statistics Tel: +61-3-8344-6410 University of Melbourne, VIC 3010 Australia (prefer email) http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599 http://www.acera.unimelb.edu.au/ Forest Analytics with R (Springer, 2011) http://www.ms.unimelb.edu.au/FAwR/ Introduction to Scientific Programming and Simulation using R (CRC, 2009): http://www.ms.unimelb.edu.au/spuRs/
Re: [R] Draw a nomogram after glm
Please post the entire script next time, e.g., include require(rms). You have one line duplicated. Put this before the first use of lrm:

d <- datadist(donnee); options(datadist='d')

Frank

Komine wrote: Hi, I use the datadist function in the rms library in order to draw my nomogram. After reading, I tried this code:

f <- lrm(Y~L+P, data=donnee)
f <- lrm(Y~L+P, data=donnee)
d <- datadist(f, data=donnee)
options(datadist=d)
f <- lrm(Y~L+P)
summary(f, L=c(0,506,10), P=c(45,646,10))
plot(Predict(f, L=200:800, P=3))

Unfortunately, I get an error after the 2nd command: Erreur dans datadist(f, data = donnee) : program logic error. Please could you provide me with a simpler document which is more understandable for a new R user. Thanks for your help. Komine

- Frank Harrell Department of Biostatistics, Vanderbilt University
Re: [R] rcspline.problem
Please follow the posting guide. You didn't state which package you are using and didn't include a trivial self-reproducing example that causes the error. For your purpose the rms package is going to plot restricted cubic spline fits (and shaded confidence bands) more flexibly. Frank

Haleh Ghaem Maralani wrote: Dear Dr; I am a PhD student at the Epidemiology department of the National University of Singapore. I used the R command rcspline.plot for plotting a restricted cubic spline - the model is based on Cox. I managed to get a plot without adjustment for other covariates, but I have a problem regarding adjusting for confounders. I applied the commands below to generate the matrix of covariates:

m=as.matrix(age,sex) or m1=matrix(age,sex) or m2=cbind(age,sex)

But when I pass adj=m, adj=m1, or adj=m2 to the model, R gives this error:

Error in pchisq(q, df, lower.tail, log.p) : Non-numeric argument to mathematical function
In addition: Warning message: In coxph.fit(cbind(x, xx, adj), cbind(y, event), strata = NULL, : Loglik converged before variable 1,2,3,4 ; beta may be infinite.

I would be grateful if you would take my issue into consideration and help me with this case. Sincerely Yours, Haleh Ghaem PhD student, NUS

- Frank Harrell Department of Biostatistics, Vanderbilt University
[R] Uniroot - error
Hi, I have tried to use uniroot to solve for the value (value a in my function) that gives f=0, and I repeat this process 1 times (simulations). However, an error occurs from the 4625th simulation:

Error in uniroot(f, c(0, 2), maxiter = 1000, tol = 0.001) : f() values at end points not of opposite sign

I have also tried an interval of (lower=min(U), upper=max(U)) and it won't work either. Can anyone help me, as I have struggled for a few days already and I have to finish it soon. Thanks.

numsim=1
set.seed(12345)
P = c()
for (m in 1:numsim) {
  Y = rnorm(140,0.0125,(0.005^(1/2)))
  U = exp(X1)
  . .(sorry i have to skip the code in between, otherwise my assignment will get penalized for plagiarism by the screening software)
  S = sum(.)
  f = function(a){sum(F*(answer^(910:1))) - S}
  answer = uniroot(f, c(0,2), maxiter=1000,tol=0.001)$root
  P[m] = answer^26 - 1
}

All the vectors are correct; it works without simulation; it also works for loop(1:4624), but from 4625 onward there's an error.
Re: [R] Averaging uneven measurements by time with uneven numbers of measurements
Taking the final value for each 30-minute interval seems like it would get what I want. The problem is that sometimes this value would be 15 minutes before the end of the 30-minute interval. What would I use to pick up this value?

-Original Message- From: ehl...@ucalgary.ca [mailto:ehl...@ucalgary.ca] Sent: Thursday, May 05, 2011 03:58 PM To: Thompson, Adele - adele_thomp...@cargill.com Cc: r-help@r-project.org Subject: Re: [R] Averaging uneven measurements by time with uneven numbers of measurements

On 2011-05-05 14:20, Schatzi wrote: I do not want smoothing, as the data should have jumps (it is weight left in a feeding bunker). I was thinking of maybe using a histogram-like function and then averaging that. Not sure if this is possible.

(It would be useful to include your original request - not everyone uses Nabble.) Actually, averaging *is* smoothing, but I suppose your intent is, for some reason, not to smooth across 30-minute boundaries. Perhaps you could use findInterval() to identify which measurements to average. Peter Ehlers

- In theory, practice and theory are the same. In practice, they are not - Albert Einstein
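A sketch of Peter's findInterval() suggestion applied to the follow-up question - take the last observation in each 30-minute bin, even when it falls well before the bin's end. The data and column names here are made up for illustration:

```r
## times in minutes since start, weights measured at irregular intervals
dat <- data.frame(time   = c(2, 14, 29, 45, 75, 118),
                  weight = c(10, 9, 9, 7, 6, 2))

breaks <- seq(0, 120, by = 30)            # 30-minute bin boundaries
bin    <- findInterval(dat$time, breaks)  # which bin each row falls in

## row index of the final observation within each bin, whether or not
## it sits at the bin's right edge:
last <- tapply(seq_len(nrow(dat)), bin, max)
dat[last, ]
```

This assumes the data are already sorted by time; if not, sort first, or use tapply over time itself with which.max.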
Re: [R] RV: R question
On May 6, 2011, at 4:03 AM, Philipp Pagel wrote:

What is the maximum number of digits that R can handle? Because SQL works with 50 digits, I think. I am wondering if that is binary or decimal, and I need software that works with a lot of digits.

The .Machine() command will provide some insight into these matters.

On my device (and I suspect in all versions of R) .Machine is a built-in list and there is no .Machine() function. .Machine returns a list that includes

$integer.max
[1] 2147483647

And you can look at the FAQ: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f which tells you that R can handle 53 _binary_ digits, in agreement with these components of .Machine:

$double.base
[1] 2
$double.digits
[1] 53

... and the FAQ has some explicit warnings about trusting more than 16 decimal digits.

And look at the package 'gmp'. cu Philipp

-- David Winsemius, MD West Hartford, CT
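For reference, the relevant components of .Machine can be inspected directly (an aside; the values shown in the comments are the usual IEEE-754 doubles and could differ on unusual platforms):

```r
.Machine$integer.max    # largest representable integer, e.g. 2147483647
.Machine$double.base    # radix of doubles, normally 2
.Machine$double.digits  # significand bits, normally 53

## 53 binary digits correspond to roughly 15-16 reliable decimal digits:
53 * log10(2)
```

Anything beyond that precision needs an arbitrary-precision package such as gmp, as suggested above.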
Re: [R] Uniroot - error
You should ask your instructor or teaching assistant for help. R-help is not for doing homework. Duncan Murdoch

On 06/05/2011 9:00 AM, CarJabo wrote: Hi, I have tried to use uniroot to solve a value (value a in my function) that gives f=0, and I repeat this process for 1 times(stimulations). However error occures from the 4625th stimulation - Error in uniroot(f, c(0, 2), maxiter = 1000, tol = 0.001) : f() values at end points not of opposite sign I have also tried interval of (lower=min(U), upper=max(U)) and it won't work as well. Can anyone help me as I have struggled for few days already and I have to finish it soon. Thanks. numsim=1 set.seed(12345) P = c() for (m in 1:numsim) { Y = rnorm(140,0.0125,(0.005^(1/2))) U = exp(X1) . .(sorry i have to skip the code in between otherwise ..my assignment will get penalty for plagarism according to those screening sotware) S = sum(.) f = function(a){sum(F*(answer^(910:1))) - S} answer = uniroot(f, c(0,2), maxiter=1000,tol=0.001)$root P[m] = answer^26 - 1 } all the vectors are correct; it works without stimulation; it also works for loop(1:4624) but after 4625 there's error.
Re: [R] For loop and sqldf
On Fri, Apr 29, 2011 at 4:27 PM, mathijsdevaan mathijsdev...@gmail.com wrote: Hi list, can anyone tell me why the following does not work? Thanks a lot! Your help is very much appreciated.

DF = data.frame(read.table(textConnection("B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1"), head=TRUE, stringsAsFactors=FALSE))

list <- sort(unique(DF$C))
for (t in 1:length(list)) {
  year = as.character(list[t])
  data[year] <- sqldf('select * from DF where C = [year]')
}

I am trying to split up the data.frame into 5 new ones, one for every year.

This has already been answered, but just thought I would point out that the perhaps subtle point is that sqldf automatically loads data frames that it finds in your sql statement into the database, but it does not do anything with non-data-frame variables. Thus DF, which is a data frame in your workspace, is loaded into the database, but year is not. Also, at least in sqlite, you can't put a constant in square brackets. To construct the desired sql string you can use paste, sprintf or gsubfn's perl-like $ string interpolation, which is invoked by prefacing sqldf with fn$ and prefacing the variable to interpolate with a $. gsubfn is automatically loaded by sqldf. See http://gsubfn.googlecode.com for more on fn.

library(sqldf)
# test data
DF <- data.frame(a = 1:10, C = rep(1970:1971, each = 5))
year <- 1970
sqldf(paste("select * from DF where C =", year))
sqldf(sprintf("select * from DF where C = %s", year))
fn$sqldf("select * from DF where C = $year")

-- Statistics Software Consulting GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
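Putting the string-interpolation approach together with the original goal (one data frame per year), a sketch that stores the results in a named list rather than trying to index `data[year]` - the loop structure here is my own, not from the thread:

```r
library(sqldf)  # also loads gsubfn, which provides fn$ interpolation

DF <- data.frame(a = 1:10, C = rep(1970:1971, each = 5))

years <- sort(unique(DF$C))
perYear <- lapply(years, function(y) {
  fn$sqldf("select * from DF where C = $y")  # $y interpolated per iteration
})
names(perYear) <- years

## Base-R equivalent without the database round-trip:
perYear2 <- split(DF, DF$C)
```

split() is worth knowing here: it does the whole job in one call when no SQL filtering is actually needed.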
Re: [R] for loop
On May 6, 2011, at 4:24 AM, Petr Savicky wrote:

On Fri, May 06, 2011 at 02:28:57PM +1000, andre bedon wrote: Hi, I'm hoping someone can offer some advice: I have a matrix x of dimensions 160 by 1. I need to create a matrix y, where the first 7 elements are equal to x[1]^1/7, then the next 6 equal to x[2]^1/6, the next seven x[3]^1/7, and so on all the way to the 1040th element. I have implemented this with a for loop an hour ago and it is still loading; can anyone offer any suggestions as to how I can create this matrix without using loops? I would really appreciate any suggestions.

Hi. Since indexing x[1], x[2], ... is used and the description of y also corresponds more to a vector, let me first suggest a solution for vectors.

x <- rep(42, times=4) # any vector of even length
x <- x/c(7, 6)
rep(x, times=rep(c(7, 6), length=length(x)))
[1] 6 6 6 6 6 6 6 7 7 7 7 7 7 6 6 6 6 6 6 6 7 7 7 7 7 7

The input vector may be obtained using c() from a matrix. The output vector may be reformatted using matrix(). However, for a matrix solution, a more precise description of the question is needed.

I certainly agree there, but you have pointed the way to a possible solution if the OP wants column-wise application of the "7 alternating with 6" rule, applied here to a smaller version that might be called minimal:

mtx <- matrix(1:160, 16, 10)
mtx2 <- mtx^(1/c(7,7,7,7,7,7,7,6,6,6,6,6,6)) # argument recycling results in alternating lengths of 7 and 6

(Takes less than 5 seconds, and you should expect a warning message since 160 (= 16*10) is not evenly divisible by 13.) I am not smart enough to know whether mtx2[160] should be 160^(1/6) or 160^(1/7), but this suggests it is (1/6):

all.equal( mtx2[160], 160^(1/7) )
[1] "Mean relative difference: 0.2883231"
all.equal( mtx2[160], 160^(1/6) )
[1] TRUE

If he wanted it to be row-wise then he could transpose, operate, and back-transpose.
This, of course, assumes that the OP did not really mean x[1]^1/7 but rather meant x[1]^(1/7), since I know of no computer language where exponentiation is lower in precedence than division. Hope this helps. Petr Savicky. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RV: R question
On Fri, May 06, 2011 at 09:17:11AM -0400, David Winsemius wrote: On May 6, 2011, at 4:03 AM, Philipp Pagel wrote: The .Machine() command will provide some insight into these matters. On my device (and I suspect on all versions of R) .Machine is a built-in list and there is no .Machine() function. Oops - my fault. You are right, of course. cu Philipp -- Dr. Philipp Pagel Lehrstuhl für Genomorientierte Bioinformatik Technische Universität München Wissenschaftszentrum Weihenstephan Maximus-von-Imhof-Forum 3 85354 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Uniroot - error
Sorry, I am not asking someone to do my homework, as I have finished all the procedure. I am just wondering why this technical error occurs, so I can fix it myself. By the way, I don't have any instructor or teaching assistant to ask for help, so any suggestion about the error will be appreciated. Thanks very much.
Re: [R] two-way group mean prediction in survreg with three factors
Thank you very much for the reply. I tend to agree with your first suggestion, and that's exactly what I did. In other functions, an easier way to marginalize such a variable C (not necessarily a factor) is to use the option include=c("A","B","A:B"). This essentially sets C at a value such that the contribution from C is 0. Unfortunately, this option is not available in survreg for some reason. Pang

-Original Message- From: Andrew Robinson [mailto:a.robin...@ms.unimelb.edu.au] Sent: Thursday, May 05, 2011 7:55 PM To: Pang Du Cc: r-help@r-project.org Subject: Re: [R] two-way group mean prediction in survreg with three factors

Even then, I think that there's a problem. If C is in the model, then the response varies by C. The simplest way is to pick a value for C, and then evaluate the group mean estimates of A and B (and C). Something in my brain keeps asking whether another way to marginalize C for the purposes of predicting A and B is just to remove it from the model, or alternatively to make it a random effect. Neither idea seems rock solid at this point. Cheers Andrew

On Thu, May 05, 2011 at 09:37:15AM -0400, Pang Du wrote: Oops, I hope not too. Don't know why I had the brackets around B+C. My model is actually A*B+C. And I'm not sure how to obtain the two-way prediction of AB with C marginalized. Thanks. Pang

-Original Message- From: Andrew Robinson [mailto:a.robin...@ms.unimelb.edu.au] Sent: Wednesday, May 04, 2011 10:13 PM To: Pang Du Cc: r-help@r-project.org Subject: Re: [R] two-way group mean prediction in survreg with three factors

I hope not! Facetiousness aside, the model that you have fit contains C, and, indeed, an interaction between A and C. So, the effect of A upon the response variable depends on the level of C. The summary you want must marginalize C somehow, probably by a weighted or unweighted average across its levels. What does that summary really mean?
Can you meaningfully average across the levels of a predictor that is included in the model as a main and an interaction term? Best wishes Andrew

On Wed, May 04, 2011 at 12:24:50PM -0400, Pang Du wrote: I'm fitting a regression model for censored data with three categorical predictors, say A, B, C. My final model based on the survreg function is Surv(..) ~ A*(B+C). I know the three-way group mean estimates can be computed using the predict function. But is there any way to obtain two-way group mean estimates, say the estimated group mean for the (A1, B1) group? The sample group means don't incorporate censoring and thus may not be appropriate here. Pang Du Virginia Tech

-- Andrew Robinson Program Manager, ACERA Department of Mathematics and Statistics Tel: +61-3-8344-6410 University of Melbourne, VIC 3010 Australia (prefer email) http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599 http://www.acera.unimelb.edu.au/ Forest Analytics with R (Springer, 2011) http://www.ms.unimelb.edu.au/FAwR/ Introduction to Scientific Programming and Simulation using R (CRC, 2009): http://www.ms.unimelb.edu.au/spuRs/
[R] pcse package error message concerning nrows of using data
Dear R Community, I am currently facing a seemingly obscure problem with Panel Corrected Standard Errors (PCSE) following Beck & Katz (1995). As the authors suggest, I fitted a linear model (tmodel) with lm() with the option na.action=na.exclude (I have also tried other options here). My data frame is organized in pooled time-series fashion, but with various missing values along the spatial and temporal dimensions.

library(pcse)
tmodel <- lm(var1 ~ var2 + var3 + var4 + as.factor(year), data=dataframe.2, na.action=na.exclude)
pcse.tmodel <- pcse(tmodel, groupN = country, groupT = year, pairwise=TRUE)
Error in pcse(tmodel, groupN = country, groupT = year, pairwise = TRUE) : Length of groupN and groupT must equal nrows of using data.

I have checked the lengths of groupN, groupT and the residuals of tmodel, only to find out they are equally long (N=1667). To my knowledge, the pairwise=TRUE argument should have accounted for the problem of missing values, i.e. an unbalanced panel. But pairwise=FALSE produces the same error message. Would repeating the regression only with cases without missing values solve the problem? I hope I have given sufficient information for you to be able to identify the problem. Thanks in advance, Johann Maier
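On the closing question: fitting on a listwise-complete subset is a plausible workaround, since it guarantees that the rows lm() actually uses and the groupN/groupT vectors line up one-to-one. A sketch under that assumption, with the variable and data-frame names taken from the post:

```r
library(pcse)

## Keep only rows complete in every model variable, so the fitted model
## and the grouping vectors refer to exactly the same observations:
vars <- c("var1", "var2", "var3", "var4", "year", "country")
d2   <- na.omit(dataframe.2[, vars])

tmodel <- lm(var1 ~ var2 + var3 + var4 + as.factor(year), data = d2)
pcse.tmodel <- pcse(tmodel, groupN = d2$country, groupT = d2$year,
                    pairwise = TRUE)
```

Whether this resolves the specific length mismatch reported above would need checking against the actual data.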
[R] Long Term Prediction Time Series
Hello, I'm interested in long-term prediction over time series. Regarding this, I and some other people have developed STRATEGICO, a free and open-source tool, at http://code.google.com/p/strategico/. Please have a look at it, test it online with your own time series, and give us any feedback and suggestions (new models?). Do you have any documentation or R libraries/tools to suggest? Regards, Matteo http://www.redaelli.org/matteo/
Re: [R] Uniroot - error
On Fri, May 6, 2011 at 6:33 AM, CarJabo carly_...@hellokitty.com wrote: sorry I am not asking someone to do my homework, as I have finished all the procedure. I am just wondering why this technical error occurs, so I can fix it myself.

My guess would be that it has something to do with the random data generated at the 4625th simulation, but you have not posted a reproducible example (as requested in the posting guide), so it is not really possible to say.

By the way i don't have any instructor or teaching assistant for help, so any suggestion for the error will be appreciated.

If whatever institution you are taking this at simply gives assignments, grades them and penalizes plagiarism without having any instructors or teachers, I recommend moving to an institution where classes are taught by someone who can answer questions etc. As Dr. Murdoch (and the posting guide) said and say, R-help is not for homework.

Thanks very much.

Good luck, Josh

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] for loop
On Fri, May 06, 2011 at 02:28:57PM +1000, andre bedon wrote: Hi, I'm hoping someone can offer some advice: I have a matrix x of dimensions 160 by 1. I need to create a matrix y, where the first 7 elements are equal to x[1]^1/7, then the next 6 equal to x[2]^1/6, next seven x[3]^1/7 and so on all the way to the 1040th element. I have implemented this with a for loop an hour ago and it is still loading, can anyone offer any suggestions as to how I can create this matrix without using loops? I would really appreciate any suggestions.

Hi. Thanks to a remark by David, I now see that x[1]^1/7 is meant as x[1]^(1/7). The following is a solution modified accordingly:

x <- matrix(100, nrow=2, ncol=2)
x <- x^c(1/7, 1/6)
rep(x, times=rep(c(7, 6), length=length(x)))
[1] 1.930698 1.930698 1.930698 1.930698 1.930698 1.930698 1.930698 2.154435
[9] 2.154435 2.154435 2.154435 2.154435 2.154435 1.930698 1.930698 1.930698
[17] 1.930698 1.930698 1.930698 1.930698 2.154435 2.154435 2.154435 2.154435
[25] 2.154435 2.154435

The output is a vector. You require that the output has 6.5 times more elements than the input, since 1040/160 = 6.5. This corresponds to the understanding that odd elements should repeat 7 times and even elements 6 times. However, it is not clear what the dimension of the output matrix should be. Hope this helps. Petr Savicky.
Re: [R] R CMD check warning
On 05.05.2011 21:20, Ray Brownrigg wrote:

On 6/05/2011 6:06 a.m., swaraj basu wrote: Dear All, I am trying to build a package for a set of functions. I am able to build the package and it's working fine. When I check it with R CMD check I get the following warning: no visible global function definition for 'biocLite'. I have used biocLite to load a user-defined library from within a function if that library is not pre-installed:

if(is.element(annotpkg, installed.packages()[,1]) == FALSE){
  source("http://www.bioconductor.org/biocLite.R")
  biocLite(annotpkg)
  library(annotpkg, character.only=TRUE)
}

Should I ignore this warning, or is there a workaround? My package is working fine, though I still guess the warning has some significance. Please help in clarifying this warning.

Your source() call generates the function biocLite() in such a way that R CMD check cannot know about it, so a workaround is to precede the source() statement with something like:

biocLite <- function(x) 0

[I don't know if there is a 'best' way to do this.] In general, warnings are just that - they point you to a possible fault, but there is not enough information for the code to make a final determination. HTH Ray Brownrigg

Or just ask people (or try yourself) to use install.packages() with the corresponding BioC repository selected. I still do not get the point why install.packages() and friends are not sufficient. Best, Uwe
[R] Confidence intervals and polynomial fits
Hi all! I'm getting a model fit from glm() (a binary logistic regression fit, but I don't think that's important) for a formula that contains powers of the explanatory variable up to the fourth. So the fit looks something like this (typing into mail; the actual fit code is complicated because it involves step-down and so forth):

x_sq <- x * x
x_cb <- x * x * x
x_qt <- x * x * x * x
model <- glm(outcome ~ x + x_sq + x_cb + x_qt, binomial)

This works great, and I get a nice fit. My question regards the confidence intervals on that fit. I am calling:

cis <- confint.default(model, level=0.95)

to get the confidence intervals for each coefficient. This confint.default() call gives me a table like this (where "dispersal" is the real name of what I'm calling x above):

                    2.5 %       97.5 %
(Intercept)     8.7214297    8.9842533
dispersal     -37.5621412  -35.6629981
dispersal_sq   66.8077669   74.4490123
dispersal_cb  -74.5963861  -62.8347057
dispersal_qt   22.6590159   28.6506186

A side note: calling confint() instead of confint.default() is much, much slower - my model has many other terms I'm not discussing here, and is fit to 300,000 data points - and a check indicates that the intervals returned for my model by confint() are virtually identical, so this is not the source of my problem:

                    2.5 %       97.5 %
(Intercept)     8.7216693    8.9844958
dispersal     -37.5643922  -35.6652276
dispersal_sq   66.8121519   74.4534753
dispersal_cb  -74.5995820  -62.8377766
dispersal_qt   22.6592724   28.6509494

These tables show the problem: the 95% confidence limits for the quartic term are every bit as wide as the limits for the other terms. Since the quartic term coefficient gets multiplied by the fourth power of x, this means that the width of the confidence band starts out nice and narrow (when x is close to zero, where the width of the confidence band is pretty much just determined by the confidence limits on the intercept) but balloons out to be tremendously wide for larger values of x.
The width of it looks quite unreasonable to me, given that my fit is constrained by 300,000 data points. Intuitively, I would have expected the confidence limits for the quartic term to be much narrower than for, say, the linear term, because the effect of variation in the quartic term is so much larger. But the way confidence intervals are calculated in R, at least, does not seem to follow my instincts here. In other words, predicted values using the 2.5% value of the linear coefficient and the 97.5% value of the linear coefficient are really not very different, but predicted values using the 2.5% value of the quartic coefficient and the 97.5% value of the quartic coefficient are enormously, wildly divergent -- far beyond the range of variability in the original data, in fact. I feel quite confident, in a seat-of-the-pants sense, that the actual 95% limits of that quartic coefficient are much, much narrower than what R is giving me. Looking at it another way: if I just exclude the quartic term from the glm() fit, I get a dramatically narrower confidence band, even though the fit to the data is not nearly as good (I'm keeping the fourth power for good reasons). This again seems to me to indicate that the confidence band with the quartic term included is artificially, and incorrectly, wide. If a fit with reasonably narrow confidence limits can be produced for the model with terms only up to cubic (or, taking this reductio even further, for a quadratic or a linear model, also), why does adding the quartic term, which improves the quality of the fit to the data, cause the confidence limits to nevertheless balloon outwards? That seems fundamentally illogical to me, but maybe my instincts are bad. So what's going on here? Is this just a limitation of the way confint() computes confidence intervals? Is there a better function to use that is compatible with binomial glm() fits? Or am I misinterpreting the meaning of the confidence limits in some way? Or what?
Thanks for any advice. I've had trouble finding examples of polynomial fits like this; people sometimes fit the square of the explanatory variable, of course, but going up to fourth powers seems to be quite uncommon, so I've had trouble finding out how others have dealt with these sorts of issues that crop up only with higher powers. Ben Haller McGill University http://biology.mcgill.ca/grad/ben/
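The ballooning intervals described above are the classic symptom of near-collinearity among raw polynomial terms: on a limited positive range, x, x^2, x^3 and x^4 are almost linearly dependent, which inflates the standard error of each individual coefficient even when the joint fit is tight. A minimal sketch with illustrative data (not the poster's):

```r
set.seed(1)
x <- runif(1000)                 # explanatory variable on (0, 1)
X <- cbind(x, x^2, x^3, x^4)     # raw polynomial terms
round(cor(X), 3)                 # pairwise correlations are all near 1
```

Because the columns are so correlated, large opposite-signed shifts in, say, the cubic and quartic coefficients nearly cancel in the fitted curve, so wide marginal intervals on individual coefficients can coexist with a narrow band for the predictions themselves.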
Re: [R] Confidence intervals and polynomial fits
Hi Ben, From what you have written, I am not exactly sure what your seat-of-the-pants sense is coming from. My pantseat typically does not tell me much; however, quartic trends tend to be less stable than linear ones, so I am not terribly surprised. As two side notes: x_qt <- x^4 # shorter code-wise and you can certainly just enter a quartic term if the data is just quartic and you are not really interested in the lower trends. Cheers, Josh On Fri, May 6, 2011 at 8:35 AM, Ben Haller rh...@sticksoftware.com wrote: Hi all! I'm getting a model fit from glm() (a binary logistic regression fit, but I don't think that's important) for a formula that contains powers of the explanatory variable up to the fourth. So the fit looks something like this (typing into mail; the actual fit code is complicated because it involves step-down and so forth): x_sq <- x * x x_cb <- x * x * x x_qt <- x * x * x * x model <- glm(outcome ~ x + x_sq + x_cb + x_qt, binomial) This works great, and I get a nice fit. My question regards the confidence intervals on that fit. I am calling: cis <- confint.default(model, level=0.95) to get the confidence intervals for each coefficient.
This confint.default() call gives me a table like this (where dispersal is the real name of what I'm calling x above): 2.5 % 97.5 % (Intercept) 8.7214297 8.9842533 dispersal -37.5621412 -35.6629981 dispersal_sq 66.8077669 74.4490123 dispersal_cb -74.5963861 -62.8347057 dispersal_qt 22.6590159 28.6506186 A side note: calling confint() instead of confint.default() is much, much slower -- my model has many other terms I'm not discussing here, and is fit to 300,000 data points -- and a check indicates that the intervals returned for my model by confint() are virtually identical, so this is not the source of my problem: 2.5 % 97.5 % (Intercept) 8.7216693 8.9844958 dispersal -37.5643922 -35.6652276 dispersal_sq 66.8121519 74.4534753 dispersal_cb -74.5995820 -62.8377766 dispersal_qt 22.6592724 28.6509494 These tables show the problem: the 95% confidence limits for the quartic term are every bit as wide as the limits for the other terms. Since the quartic term coefficient gets multiplied by the fourth power of x, this means that the width of the confidence band starts out nice and narrow (when x is close to zero, where the width of the confidence band is pretty much just determined by the confidence limits on the intercept) but balloons out to be tremendously wide for larger values of x. The width of it looks quite unreasonable to me, given that my fit is constrained by 300,000 data points. Intuitively, I would have expected the confidence limits for the quartic term to be much narrower than for, say, the linear term, because the effect of variation in the quartic term is so much larger. But the way confidence intervals are calculated in R, at least, does not seem to follow my instincts here.
In other words, predicted values using the 2.5% value of the linear coefficient and the 97.5% value of the linear coefficient are really not very different, but predicted values using the 2.5% value of the quartic coefficient and the 97.5% value of the quartic coefficient are enormously, wildly divergent -- far beyond the range of variability in the original data, in fact. I feel quite confident, in a seat-of-the-pants sense, that the actual 95% limits of that quartic coefficient are much, much narrower than what R is giving me. Looking at it another way: if I just exclude the quartic term from the glm() fit, I get a dramatically narrower confidence band, even though the fit to the data is not nearly as good (I'm keeping the fourth power for good reasons). This again seems to me to indicate that the confidence band with the quartic term included is artificially, and incorrectly, wide. If a fit with reasonably narrow confidence limits can be produced for the model with terms only up to cubic (or, taking this reductio even further, for a quadratic or a linear model, also), why does adding the quartic term, which improves the quality of the fit to the data, cause the confidence limits to nevertheless balloon outwards? That seems fundamentally illogical to me, but maybe my instincts are bad. So what's going on here? Is this just a limitation of the way confint() computes confidence intervals? Is there a better function to use, that is compatible with binomial glm() fits? Or am I misinterpreting the meaning of the confidence limits in some way? Or what? Thanks for any advice. I've had trouble finding examples of polynomial fits like this; people sometimes fit the square of the explanatory variable, of course, but going up to fourth powers seems to be
Re: [R] reading a column as a character vector
The strsplit function is probably the closest R function to Perl's split function. For more detailed control the gsubfn package can be useful. -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gamliel Beyderman Sent: Thursday, May 05, 2011 3:35 PM To: r-help@r-project.org Subject: [R] reading a column as a character vector Hi! I have 2 columns (even though the data looks like there's more columns than just two) of data in the following format: 0,58905313R0EOL 229742002R0EOL 58905312R0EOL 1,58905317R0DBL 58905303R0DBL 58905313R0IL 58905313R0VH 58905313R0EOL 223354003R0IL 223354003R0VH 58905308R0DBL 58905308R0VM 58905301R0DBL 229742002R0IL 229742002R0VH 229742002R0EOL I can change the format of the input (remove quotes, add spaces, only put quotes around the entire list of codes...) The first column is numeric, the second column is a character vector of event codes. Ultimately, I want to transform the second column into a factor where each event code (such as 58905313R0EOL or 216918000R0DBL) is a separate level. While the following statement works: reduce2 <- read.table("reduce2.csv", sep=",", colClasses=c("integer","factor")) it does not know to break the event vectors into separate levels; the factor it creates is wrong. Perhaps there's something in R similar to split in Perl... Thanks!
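A minimal sketch of the strsplit() approach suggested above; the data frame here is an illustrative stand-in for the poster's file (column names are assumptions):

```r
## Two columns: an integer and a space-separated string of event codes.
dat <- data.frame(id = c(0, 1),
                  events = c("58905313R0EOL 229742002R0EOL 58905312R0EOL",
                             "58905317R0DBL 58905303R0DBL 58905313R0IL"),
                  stringsAsFactors = FALSE)

## Split each events string on whitespace, like Perl's split.
codes <- strsplit(dat$events, "\\s+")

## One factor over all codes, with one level per distinct event code.
all.codes <- factor(unlist(codes))
levels(all.codes)
```

Reading the second column with colClasses = "character" (rather than "factor") and then splitting it keeps read.table from creating one level per whole row.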
Re: [R] split character vector by multiple keywords simultaneously
Will all the keywords always be present in the same order? Or are you looking for the keywords, but some may be absent or in different orders? Look into the gsubfn package for some tools that could help. -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of sunny Sent: Wednesday, May 04, 2011 5:09 PM To: r-help@r-project.org Subject: [R] split character vector by multiple keywords simultaneously Hi. I have a character vector that looks like this:

temp <- c("Company name: The first company General Manager: John Doe I Managers: John Doe II, John Doe III",
          "Company name: The second company General Manager: Jane Doe I",
          "Company name: The third company Managers: Jane Doe II, Jane Doe III")
temp
[1] "Company name: The first company General Manager: John Doe I Managers: John Doe II, John Doe III"
[2] "Company name: The second company General Manager: Jane Doe I"
[3] "Company name: The third company Managers: Jane Doe II, Jane Doe III"

I know all the keywords, i.e. "Company name:", "General Manager:", "Managers:" etc. I'm looking for a way to split this character vector into multiple character vectors, with one column for each keyword and the corresponding values for each, i.e.

  Company name        General Manager   Managers
1 The first company   John Doe I        John Doe II, John Doe III
2 The second company  Jane Doe I
3 The third company                     Jane Doe II, Jane Doe III

I have tried a lot to find something suitable but haven't so far. Any help will be greatly appreciated. I am running R-2.12.1 on x86_64 linux. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/split-character-vector-by-multiple-keywords-simultaneously-tp3497033p3497033.html Sent from the R help mailing list archive at Nabble.com.
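A base-R sketch of one way to do the keyword split, under the assumption that keywords may be absent or reordered (absent keywords yield NA); the helper name is illustrative:

```r
temp <- c("Company name: The first company General Manager: John Doe I Managers: John Doe II, John Doe III",
          "Company name: The second company General Manager: Jane Doe I",
          "Company name: The third company Managers: Jane Doe II, Jane Doe III")

keys <- c("Company name:", "General Manager:", "Managers:")

## Locate each keyword in one string, then take the substring between
## each keyword and the next one found (or the end of the string).
split.by.keys <- function(s, keys) {
    pos <- sapply(keys, function(k) regexpr(k, s, fixed = TRUE))
    out <- rep(NA_character_, length(keys))
    found <- pos > 0
    ord <- order(pos[found])
    starts <- pos[found][ord] + nchar(keys[found][ord])
    ends <- c(pos[found][ord][-1] - 1, nchar(s))
    out[which(found)[ord]] <- trimws(substring(s, starts, ends))
    out
}

result <- t(sapply(temp, split.by.keys, keys = keys, USE.NAMES = FALSE))
colnames(result) <- keys
result
```

trimws() assumes a reasonably recent R; on older versions an equivalent gsub("^\\s+|\\s+$", "", x) does the same job.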
Re: [R] Confidence intervals and polynomial fits
On May 6, 2011, at 11:35 AM, Ben Haller wrote: Hi all! I'm getting a model fit from glm() (a binary logistic regression fit, but I don't think that's important) for a formula that contains powers of the explanatory variable up to the fourth. So the fit looks something like this (typing into mail; the actual fit code is complicated because it involves step-down and so forth): x_sq <- x * x x_cb <- x * x * x x_qt <- x * x * x * x model <- glm(outcome ~ x + x_sq + x_cb + x_qt, binomial) It might have been easier with: model <- glm(outcome ~ poly(x, 4), binomial) Since the authors of confint might have been expecting a poly() formulation, the results might be more reliable. I'm just using lm() but I think the conclusions are more general:

set.seed(123)
x <- seq(0, 4, by=0.1)
y <- seq(0, 4, by=0.1)^2 + rnorm(41)
x2 <- x^2
x3 <- x^3
summary(lm(y ~ x + x2 + x3))

Call: lm(formula = y ~ x + x2 + x3)
Residuals: Min 1Q Median 3Q Max -1.8528 -0.6146 -0.1164 0.6211 1.8923
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.45737    0.52499   0.871   0.3893
x           -0.75989    1.15080  -0.660   0.5131
x2           1.30987    0.67330   1.945   0.0594 .
x3          -0.03559    0.11058  -0.322   0.7494
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9184 on 37 degrees of freedom
Multiple R-squared: 0.9691, Adjusted R-squared: 0.9666
F-statistic: 386.8 on 3 and 37 DF, p-value: < 2.2e-16

I wouldn't have been able to understand why a highly significant relationship overall could not be ascribed to any of the terms. See what happens with poly(x,3):

summary(lm(y ~ poly(x, 3)))

Call: lm(formula = y ~ poly(x, 3))
Residuals: Min 1Q Median 3Q Max -1.8528 -0.6146 -0.1164 0.6211 1.8923
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.4271     0.1434  37.839  < 2e-16 ***
poly(x, 3)1  30.0235     0.9184  32.692  < 2e-16 ***
poly(x, 3)2   8.7823     0.9184   9.563 1.53e-11 ***
poly(x, 3)3  -0.2956     0.9184  -0.322    0.749
---
Signif.
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9184 on 37 degrees of freedom
Multiple R-squared: 0.9691, Adjusted R-squared: 0.9666
F-statistic: 386.8 on 3 and 37 DF, p-value: < 2.2e-16

confint(lm(y ~ poly(x, 3)))
                 2.5 %     97.5 %
(Intercept)   5.136526   5.717749
poly(x, 3)1  28.162706  31.884351
poly(x, 3)2   6.921505  10.643149
poly(x, 3)3  -2.156431   1.565213

confint(lm(y ~ x + x2 + x3))
                  2.5 %    97.5 %
(Intercept) -0.60636547 1.5211072
x           -3.09164639 1.5718576
x2          -0.05437802 2.6741120
x3          -0.25964705 0.1884609

Neither version exactly captures the simulation input, which would have shown only an effect of the quadratic, but I didn't do any centering. At least the poly version shows the lower order terms up to quadratic as significant, whereas it's difficult to extract much meaningful out of the separately calculated term version. -- David. This works great, and I get a nice fit. My question regards the confidence intervals on that fit. I am calling: cis <- confint.default(model, level=0.95) to get the confidence intervals for each coefficient.
This confint.default() call gives me a table like this (where dispersal is the real name of what I'm calling x above): 2.5 % 97.5 % (Intercept) 8.7214297 8.9842533 dispersal -37.5621412 -35.6629981 dispersal_sq 66.8077669 74.4490123 dispersal_cb -74.5963861 -62.8347057 dispersal_qt 22.6590159 28.6506186 A side note: calling confint() instead of confint.default() is much, much slower -- my model has many other terms I'm not discussing here, and is fit to 300,000 data points -- and a check indicates that the intervals returned for my model by confint() are virtually identical, so this is not the source of my problem: 2.5 % 97.5 % (Intercept) 8.7216693 8.9844958 dispersal -37.5643922 -35.6652276 dispersal_sq 66.8121519 74.4534753 dispersal_cb -74.5995820 -62.8377766 dispersal_qt 22.6592724 28.6509494 These tables show the problem: the 95% confidence limits for the quartic term are every bit as wide as the limits for the other terms. Since the quartic term coefficient gets multiplied by the fourth power of x, this means that the width of the confidence band starts out nice and narrow (when x is close to zero, where the width of the confidence band is pretty much just determined by the confidence limits on the intercept) but balloons out to be tremendously wide for larger values of x. The width of it looks quite unreasonable to me, given that my fit is constrained by 300,000 data points. Intuitively, I would have expected the confidence limits for
[R] How to alter circle size
Hello all, I'm trying to create a heatmap using 2 matrices I have: z and v. Both matrices represent different correlations for the same independent variables. The problem I have is that I wish to have the values from matrix z represented by color intensity while having the values from matrix v represented by circle size. I currently have the following in front of me and am unsure of what to add or change in order to achieve that goal.

panel.corrgram.2 <- function(x, y, z, subscripts, at = pretty(z), scale = 0.8, ...) {
    require(grid, quietly = TRUE)
    x <- as.numeric(x)[subscripts]
    y <- as.numeric(y)[subscripts]
    z <- as.numeric(z)[subscripts]
    zcol <- level.colors(z, at = at, ...)
    for (i in seq(along = z)) {
        lims <- range(0, z[i])
        tval <- 2 * base::pi * seq(from = lims[1], to = lims[2], by = 0.01)
        grid.polygon(x = x[i] + .5 * scale * c(0, sin(tval)),
                     y = y[i] + .5 * scale * c(0, cos(tval)),
                     default.units = "native", gp = gpar(fill = zcol[i]))
        grid.circle(x = x[i], y = y[i], r = .5 * scale, default.units = "native")
    }
}

k <- levelplot(signif(z, 3), scales = list(x = list(rot = 45)),
               col.regions = col.sch, panel = panel.corrgram.2,
               label = num.circ, xlab = "", ylab = "",
               main = paste(output, "receptor response"))
print(k)

-- Best, Dat Mai PhD Rotation Student Albert Einstein College of Medicine
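One way to get what the poster describes (color from z, radius from v) is to pass the second matrix through to the panel function and scale the circle radius by it. A hedged sketch modifying the panel above; the argument plumbing and the max-based normalization are assumptions, not part of the original code:

```r
## Color comes from z (via level.colors), radius comes from v,
## rescaled so the largest |v| fills a grid cell.
panel.corrgram.3 <- function(x, y, z, subscripts, v, at = pretty(z), scale = 0.8, ...) {
    require(grid, quietly = TRUE)
    x <- as.numeric(x)[subscripts]
    y <- as.numeric(y)[subscripts]
    z <- as.numeric(z)[subscripts]
    r <- abs(as.numeric(v))[subscripts] / max(abs(v), na.rm = TRUE)
    zcol <- level.colors(z, at = at, ...)   # level.colors() is from lattice
    for (i in seq(along = z)) {
        grid.circle(x = x[i], y = y[i], r = 0.5 * scale * r[i],
                    default.units = "native", gp = gpar(fill = zcol[i]))
    }
}

## Extra arguments to levelplot() are forwarded to the panel, e.g.:
## levelplot(signif(z, 3), panel = panel.corrgram.3, v = v, ...)
```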
Re: [R] Confidence intervals and polynomial fits
From what you have written, I am not exactly sure what your seat-of-the-pants sense is coming from. My pantseat typically does not tell me much; however, quartic trends tend to be less stable than linear ones, so I am not terribly surprised. My pantseat is not normally very informative either, but if you saw the width of the confidence limits I'm getting for the quartic coefficient, I think your pantseat would agree with mine. :-> The confidence band is staggeringly wide, many times the variation in the data itself; and with 300,000 data points to fit to, that just should not be the case. With that many data points, it seems quite obvious that one can be 95% confident that the true data lies within a band somewhere reasonably close to the band that the sample data actually fall into, not a band many times wider. As two side notes: x_qt <- x^4 # shorter code-wise and you can certainly just enter a quartic term if the data is just quartic and you are not really interested in the lower trends. Yep, for sure. Thanks! Ben Haller McGill University http://biology.mcgill.ca/grad/ben/
Re: [R] Creating binary variable depending on strings of two dataframes
Gabor Grothendieck wrote: On Tue, Dec 7, 2010 at 11:30 AM, Pete Pete lt;noxyp...@gmail.comgt; wrote: Hi, consider the following two dataframes:

x1=c(232,3454,3455,342,13)
x2=c(1,1,1,0,0)
data1=data.frame(x1,x2)
y1=c(232,232,3454,3454,3455,342,13,13,13,13)
y2=c("E1","F3","F5","E1","E2","H4","F8","G3","E1","H2")
data2=data.frame(y1,y2)

I need a new column in dataframe data1 (x3), which is either 0 or 1 depending on whether the value E1 in y2 of data2 is true while x1=y1. The result for data1 should look like this:

    x1 x2 x3
1  232  1  1
2 3454  1  1
3 3455  1  0
4  342  0  0
5   13  0  1

I think a SQL command could help me but I am too inexperienced with it to get there. Try this:

library(sqldf)
sqldf("select x1, x2, max(y2 = 'E1') x3
       from data1 d1 left join data2 d2 on (x1 = y1)
       group by x1, x2 order by d1.rowid")

    x1 x2 x3
1  232  1  1
2 3454  1  1
3 3455  1  0
4  342  0  0
5   13  0  1

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com That works pretty cool but I need to automate this a bit more. Consider the following example:

list1=c("A01","B04","A64","G84","F19")
x1=c(232,3454,3455,342,13)
x2=c(1,1,1,0,0)
data1=data.frame(x1,x2)
y1=c(232,232,3454,3454,3455,342,13,13,13,13)
y2=c("E13","B04","F19","A64","E22","H44","F68","G84","F19","A01")
data2=data.frame(y1,y2)

I now want to create a loop, which creates for every value in list1 a new binary variable in data1. The result should look like:

    x1 x2 A01 B04 A64 G84 F19
1  232  1   0   1   0   0   0
2 3454  1   0   0   1   0   1
3 3455  1   0   0   0   0   0
4  342  0   0   0   0   0   0
5   13  0   1   0   0   1   1

Thanks! -- View this message in context: http://r.789695.n4.nabble.com/Creating-binary-variable-depending-on-strings-of-two-dataframes-tp3076724p3503334.html Sent from the R help mailing list archive at Nabble.com.
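A plain-R sketch of the follow-up request (one indicator column per code in list1), using %in% instead of SQL; this reproduces the table the poster asks for:

```r
list1 <- c("A01", "B04", "A64", "G84", "F19")
data1 <- data.frame(x1 = c(232, 3454, 3455, 342, 13),
                    x2 = c(1, 1, 1, 0, 0))
data2 <- data.frame(y1 = c(232, 232, 3454, 3454, 3455, 342, 13, 13, 13, 13),
                    y2 = c("E13", "B04", "F19", "A64", "E22",
                           "H44", "F68", "G84", "F19", "A01"),
                    stringsAsFactors = FALSE)

## For each code, flag the x1 values that co-occur with it in data2.
for (code in list1) {
    data1[[code]] <- as.integer(data1$x1 %in% data2$y1[data2$y2 == code])
}
data1
```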
[R] understanding error messages in archmCopulaFit in fCopulae
Hello, I'm running R x64 v2.12.2 on a 64-bit Windows 7 PC. I have two data vectors, x and y, and try to run archmCopulaFit. Most of the copulas produce errors. Can you tell me what the errors mean and, if possible, how I can set archmCopulaFit options to make them run? I see in the documentation that there are arguments passed to the optimization function in use, 'nlminb'; however, it's not clear to me how to set them. Thanks. Copulas 2,4,7,8,11,12,14,15,18,19,20,21,22 have the following error:

fit1 = archmCopulaFit(x, y, type=2)
Error in if (alpha < range[1] | alpha > range[2]) { : missing value where TRUE/FALSE needed

Copula 3 has the following error:

fit1 = archmCopulaFit(x, y, type=3)
Error in if (alpha < range[1] | alpha > range[2]) { : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In log(darchmCopula(u = U, v = V, alpha = x, type = type)) : NaNs produced
2: In log(darchmCopula(u = U, v = V, alpha = x, type = type)) : NaNs produced

Copula 10:

fit1 = archmCopulaFit(x, y, type=10)
Error in if (alpha < range[1] | alpha > range[2]) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In log(darchmCopula(u = U, v = V, alpha = x, type = type)) : NaNs produced

Copulas 1,5,9,13,16,17 produce estimates.
[R] Cumsum in Lattice Panel Function
I'm trying to create an xyplot with a groups argument where the y-variable is the cumsum of the values stored in the input data frame. I almost have it, but I can't get it to automatically adjust the y-axis scale. How do I get the y-axis to automatically scale as it would have if the cumsum values had been stored in the data frame? Here is the code I have so far:

require(lattice)
dates <- seq(as.Date("2011-01-01"), as.Date("2011-04-30"), "days")
g <- 1:3
dat <- data.frame(date = rep(dates, length(g)),
                  group = rep(g, each = length(dates)),
                  value = rnorm(length(dates) * length(g)) + 0.05)
xyplot(value ~ date, data = dat, group = group, type = 'l', grid = TRUE,
       panel = panel.superpose,
       panel.groups = function(x, y, ...) { panel.xyplot(x, cumsum(y), ...) })

I want the result to look the same as if I had done

dat$cumvalue <- with(dat, unsplit(lapply(split(value, group), cumsum), group))
xyplot(cumvalue ~ date, data = dat, group = group, type = 'l', grid = TRUE)

Thanks for your help. - Elliot Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC 134 Mount Auburn Street | Cambridge, MA | 02138 Phone: (617) 503-4619 | Email: elliot.bernst...@fdopartners.com
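The scales are computed from the y-variable in the formula before the panel function runs, so the simplest fix is to compute the cumulative sums first and plot those. A sketch using ave(), which is equivalent to the poster's split/unsplit idiom but shorter:

```r
require(lattice)
dates <- seq(as.Date("2011-01-01"), as.Date("2011-04-30"), "days")
g <- 1:3
dat <- data.frame(date = rep(dates, length(g)),
                  group = rep(g, each = length(dates)),
                  value = rnorm(length(dates) * length(g)) + 0.05)

## ave() applies cumsum within each group and returns the result in the
## original row order, so the y-axis is scaled from the values actually drawn.
dat$cumvalue <- ave(dat$value, dat$group, FUN = cumsum)
xyplot(cumvalue ~ date, data = dat, group = group, type = "l", grid = TRUE)
```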
Re: [R] Using $ accessor in GAM formula
Hmmm. After reading that email four times, I think I see what you mean. Checking for variables within particular scopes is probably one of the most challenging things in R, and I would guess in other languages too. In R it's compounded by situations when you're writing a function to accept variables as either a stand-alone global variable, or as an element of a data.frame or list. I think this is a new problem, and I'll switch to the lengthier data.frame[,'x'] syntax in gam for now. By the way, about the $ accessors. I can see why some people don't like them, but they are a part of the language. And, I think they're a good part. They make the code much more readable, and I have yet to make a mistake using them (which makes me think that they can be used responsibly). Making code harder to read is its own source of error, and is not something that I take lightly! Thanks for the replies. And, thank you Rolf for the detailed analysis. Do you think that you or I should submit a bug report to the package maintainer? I'm not sure how that works. Very few of my challenges turn out to be actual bugs, but I think this one is. On Fri, May 6, 2011 at 4:53 AM, Berwin A Turlach berwin.turl...@gmail.com wrote: G'day Rolf, On Fri, 06 May 2011 09:58:50 +1200 Rolf Turner rolf.tur...@xtra.co.nz wrote: but it's strange that the dodgey code throws an error with gam(dat1$y ~ s(dat1$x)) but not with gam(dat2$cf ~ s(dat2$s)) Something a bit subtle is going on; it would be nice to be able to understand it.
Well,

R> traceback()
3: eval(expr, envir, enclos)
2: eval(inp, data, parent.frame())
1: gam(dat$y ~ s(dat$x))

So the lines leading up to the problem seem to be the following from the gam() function:

vars <- all.vars(gp$fake.formula[-2])
inp <- parse(text = paste("list(", paste(vars, collapse = ","), ")"))
if (!is.list(data) && !is.data.frame(data)) data <- as.data.frame(data)

Setting options(error=recover), running the code until the error occurs, and then examining the frame number for the gam() call shows that inp is expression(list( dat1,x )) in your first example and expression(list( dat2,s )) in your second example. In both examples, data is list() (not unsurprisingly). When dl <- eval(inp, data, parent.frame()) is executed, it tries to eval inp; in both cases dat1 and dat2 are found, obviously, in the parent frame. In your first example x is (typically) not found and an error is thrown; in your second example an object with name s is found in package:mgcv and the call to eval succeeds. dl becomes a list with two components, the first being, respectively, dat1 or dat2, and the second the body of the function s. (To verify that, you should probably issue the command debug(gam) and step through those first few lines of the function until you reach the above command.) The corollary is that you can use the name of any object that R will find in the parent frame; if it is another data set, then that data set will become the second component of inp. E.g.:

R> dat=data.frame(min=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
R> gam(dat$cf ~ s(dat$min))

Family: gaussian
Link function: identity
Formula: dat$cf ~ s(dat$min)
Estimated degrees of freedom: 3.8925 total = 4.892488
GCV score: 0.002704789

Or

R> dat=data.frame(BOD=1:100,cf=sin(1:100/50)+rnorm(100,0,.05))
R> gam(dat$cf ~ s(dat$BOD))

Family: gaussian
Link function: identity
Formula: dat$cf ~ s(dat$BOD)
Estimated degrees of freedom: 3.9393 total = 4.939297
GCV score: 0.002666985

Just out of pure academic interest.
:-) Hope your academic curiosity is now satisfied. :) HTH. Cheers, Berwin == Full address A/Prof Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr) School of Maths and Stats (M019) +61 (8) 6488 3383 (self) The University of Western Australia FAX: +61 (8) 6488 1028 35 Stirling Highway Crawley WA 6009 e-mail: berwin.turl...@gmail.com Australia http://www.maths.uwa.edu.au/~berwin
Re: [R] Confidence intervals and polynomial fits
On May 6, 2011, at 12:31 PM, David Winsemius wrote: On May 6, 2011, at 11:35 AM, Ben Haller wrote: Hi all! I'm getting a model fit from glm() (a binary logistic regression fit, but I don't think that's important) for a formula that contains powers of the explanatory variable up to the fourth. So the fit looks something like this (typing into mail; the actual fit code is complicated because it involves step-down and so forth): x_sq <- x * x x_cb <- x * x * x x_qt <- x * x * x * x model <- glm(outcome ~ x + x_sq + x_cb + x_qt, binomial) It might have been easier with: model <- glm(outcome ~ poly(x, 4), binomial) Interesting. The actual model I'm fitting has lots more terms, and needs to be able to be stepped down; sometimes some of the terms of the polynomials will get dropped while others get retained, for example. But more to the point, poly() seems to be doing something tricky involving orthogonal polynomials that I don't understand. I don't think I want whatever that is; I want my original variable x, plus its powers. For example, if I do: x <- runif(10) poly(x, 3) the columns I get are not x, x^2, x^3; they are something else. So the poly() fit is not equivalent to my fit. Since the authors of confint might have been expecting a poly() formulation the results might be more reliable. I'm just using lm() but I think the conclusions are more general: ...

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.45737    0.52499   0.871   0.3893
x           -0.75989    1.15080  -0.660   0.5131
x2           1.30987    0.67330   1.945   0.0594 .
x3          -0.03559    0.11058  -0.322   0.7494

...

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.4271     0.1434  37.839  < 2e-16 ***
poly(x, 3)1  30.0235     0.9184  32.692  < 2e-16 ***
poly(x, 3)2   8.7823     0.9184   9.563 1.53e-11 ***
poly(x, 3)3  -0.2956     0.9184  -0.322    0.749

Here, in your illustration, is underscored what I mean.
Whatever these orthogonal polynomial terms are that you're using, they are clearly not the original x, x^2 and x^3, and they're giving you a different fit than those do. I probably ought to learn about this technique, since it looks interesting; but for my purposes I need the fit to actually be in terms of x, since x is my explanatory variable. And the fit I'm getting is highly significant (all terms < 2e-16), so the lack of fit problem you're illustrating does not seem to apply to my case. Or maybe I'm totally misunderstanding your point...? :-> Thanks! Ben Haller McGill University http://biology.mcgill.ca/grad/ben/
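For completeness: poly() can also return the raw powers rather than orthogonal polynomials, which keeps the fit in terms of x itself while still carrying all four terms in one formula. A small sketch:

```r
set.seed(123)
x <- runif(10)
p <- poly(x, 3, raw = TRUE)

## With raw = TRUE the columns are literally x, x^2, x^3.
stopifnot(all.equal(as.numeric(p[, 1]), x),
          all.equal(as.numeric(p[, 2]), x^2),
          all.equal(as.numeric(p[, 3]), x^3))
```

So glm(outcome ~ poly(x, 4, raw = TRUE), binomial) should reproduce the hand-built x + x_sq + x_cb + x_qt fit, coefficient for coefficient, while poly(x, 4) without raw = TRUE gives the same fitted values but coefficients on an orthogonalized basis.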
Re: [R] Using $ accessor in GAM formula
On Fri, 2011-05-06 at 11:20 -0500, Gene Leynes wrote: Hmmm. After reading that email four times, I think I see what you mean. Checking for variables within particular scopes is probably one of the most challenging things in R, and I would guess in other languages too. In R it's compounded by situations when you're writing a function to accept variables as either a stand-alone global variable, or as an element of a data.frame or list. Dear Gene, I think this is a new problem, and I'll switch to the lengthier data.frame[,'x'] syntax in gam for now. No, No, No, No, No! If there is one thing you **should** take from this thread it is that there is no need to perform subsetting like that in a model formula. Why would you want (or prefer) to do: gam(dat$y ~ s(dat$x)) or gam(dat[, "y"] ~ s(dat[, "x"])) when gam(y ~ s(x), data = dat) will suffice? By the way, about the $ accessors. I can see why some people don't like them, but they are a part of the language. And, I think they're a good part. They make the code much more readable, and I have yet to make a mistake using them (which makes me think that they can be used responsibly). Making code harder to read is its own source of error, and is not something that I take lightly! If you use R's formula notation properly, you'll get cleaner code than either of your suggestions. Thanks for the replies. And, thank you Rolf for the detailed analysis. Do you think that you or I should submit a bug report to the package maintainer? I'm not sure how that works. Very few of my challenges turn out to be actual bugs, but I think this one is. I would consider this a bug - but in the sense that Simon didn't foresee the strange formulas that users of his software might concoct. By the way, I think you perhaps meant Berwin (re the detailed analysis)?
HTH G

On Fri, May 6, 2011 at 4:53 AM, Berwin A Turlach berwin.turl...@gmail.com wrote: G'day Rolf, On Fri, 06 May 2011 09:58:50 +1200 Rolf Turner rolf.tur...@xtra.co.nz wrote: but it's strange that the dodgy code throws an error with gam(dat1$y ~ s(dat1$x)) but not with gam(dat2$cf ~ s(dat2$s)). Something a bit subtle is going on; it would be nice to be able to understand it. Well,

R> traceback()
3: eval(expr, envir, enclos)
2: eval(inp, data, parent.frame())
1: gam(dat$y ~ s(dat$x))

So the lines leading up to the problem seem to be the following from the gam() function:

vars <- all.vars(gp$fake.formula[-2])
inp <- parse(text = paste("list(", paste(vars, collapse = ","), ")"))
if (!is.list(data) && !is.data.frame(data)) data <- as.data.frame(data)

Setting options(error = recover), running the code until the error occurs, and then examining the frame number for the gam() call shows that inp is expression(list(dat1, x)) in your first example and expression(list(dat2, s)) in your second example. In both examples, data is list() (not surprisingly). When

dl <- eval(inp, data, parent.frame())

is executed, it tries to evaluate inp; in both cases dat1 and dat2 are found, obviously, in the parent frame. In your first example x is (typically) not found and an error is thrown; in your second example an object with name s is found in package:mgcv and the call to eval succeeds. dl becomes a list with two components, the first being, respectively, dat1 or dat2, and the second the body of the function s. (To verify that, you should probably issue the command debug(gam) and step through those first few lines of the function until you reach the above command.) The corollary is that you can use the name of any object that R will find in the parent frame; if it is another data set, then that data set will become the second component of dl.
E.g.:

R> dat <- data.frame(min = 1:100, cf = sin(1:100/50) + rnorm(100, 0, .05))
R> gam(dat$cf ~ s(dat$min))

Family: gaussian
Link function: identity

Formula:
dat$cf ~ s(dat$min)

Estimated degrees of freedom:
3.8925  total = 4.892488

GCV score: 0.002704789

Or

R> dat <- data.frame(BOD = 1:100, cf = sin(1:100/50) + rnorm(100, 0, .05))
R> gam(dat$cf ~ s(dat$BOD))

Family: gaussian
Link function: identity

Formula:
dat$cf ~ s(dat$BOD)

Estimated degrees of freedom:
3.9393  total = 4.939297

GCV score: 0.002666985

Just out of pure academic interest. :-) Hope your academic curiosity is now satisfied. :) HTH. Cheers, Berwin

==
Full address:
A/Prof Berwin A Turlach             Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)          +61 (8) 6488 3383 (self)
The University of Western Australia FAX : +61 (8) 6488 1028
35 Stirling Highway                 e-mail: berwin.turl...@gmail.com
Crawley WA 6009, Australia          http://www.maths.uwa.edu.au/~berwin
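Berwin's explanation of the lookup that eval(inp, data, parent.frame()) performs can be reproduced without mgcv at all. The following is a hedged base-R sketch (object names are made up for illustration): the data list is searched first, then the enclosing frame, which is why a bare name like s can accidentally resolve to a package function.

```r
# Base-R sketch of the lookup gam() performs: eval() searches the
# 'data' list first, then the enclosing environment.
dat1 <- data.frame(x = 1:10, y = rnorm(10))
inp  <- parse(text = "list(dat1, x)")[[1]]

# With an empty data list, 'dat1' is found in the enclosing
# environment, but the bare name 'x' is not, so evaluation fails:
res1 <- try(eval(inp, list(), environment()), silent = TRUE)
inherits(res1, "try-error")   # TRUE

# A bare name that happens to match *any* visible object succeeds;
# here 'q' resolves to the base function q(), just as 's' resolved
# to mgcv's s() in Berwin's second example:
inp2 <- parse(text = "list(dat1, q)")[[1]]
res2 <- eval(inp2, list(), environment())
length(res2)                  # 2; the second component is the function q

# Supplying the missing name through the data argument also succeeds:
res3 <- eval(inp, list(x = 1:10), environment())
```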
Re: [R] Averaging uneven measurements by time with uneven numbers of measurements
Here is an example of what I would like to do: meas = measurements, times = time of measurement, measf = measurements in the final, reduced matrix, timesf = time of measurement in the final matrix.

meas <- runif(30)
times <- sort(runif(30))
inputmat <- cbind(times, meas)
colnames(inputmat) <- c("timef", "measf")

I would then like to break the times up into 0.2 increments so the final matrix would look like this:

timef  measf
0.2    mean of meas where (time >= 0) and (time < 0.2)
0.4    .
.
1.0

Instead of measf being the mean, it could be the last measurement taken.

-----Original Message-----
From: ehl...@ucalgary.ca [mailto:ehl...@ucalgary.ca]
Sent: Thursday, May 05, 2011 03:58 PM
To: Thompson, Adele - adele_thomp...@cargill.com
Cc: r-help@r-project.org
Subject: Re: [R] Averaging uneven measurements by time with uneven numbers of measurements

On 2011-05-05 14:20, Schatzi wrote: I do not want smoothing as the data should have jumps (it is weight left in a feeding bunker). I was thinking of maybe using a histogram-like function and then averaging that. Not sure if this is possible. (It would be useful to include your original request - not everyone uses Nabble.) Actually, averaging *is* smoothing, but I suppose your intent is, for some reason, not to smooth across 30-minute boundaries. Perhaps you could use findInterval() to identify which measurements to average. Peter Ehlers

- In theory, practice and theory are the same. In practice, they are not - Albert Einstein
-- View this message in context: http://r.789695.n4.nabble.com/Averaging-uneven-measurements-by-time-with-uneven-numbers-of-measurements-tp3499337p3499386.html Sent from the R help mailing list archive at Nabble.com.
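A hedged sketch of Peter's findInterval() suggestion, using the simulated data from the example above (the 0.2-wide bins and the names timesf/measf follow the poster's description):

```r
# Bin the measurement times into 0.2-wide intervals and average the
# measurements falling in each bin.
set.seed(1)
meas  <- runif(30)
times <- sort(runif(30))

breaks <- seq(0, 1, by = 0.2)
bin    <- findInterval(times, breaks, rightmost.closed = TRUE)

measf  <- tapply(meas, bin, mean)               # mean per interval
timesf <- breaks[-1][as.integer(names(measf))]  # upper edge of each bin
cbind(timesf, measf)

# For "last measurement taken" instead of the mean:
lastf <- tapply(meas, bin, function(v) v[length(v)])
```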
Re: [R] Confidence intervals and polynomial fits
FWIW: Fitting higher order polynomials (say degree > 2) is almost always a bad idea. See e.g. the Hastie, Tibshirani, et al. book on Statistical Learning for a detailed explanation why. The Wikipedia entry on smoothing splines also contains a brief explanation, I believe. Your ~0 P values for the coefficients also suggest problems/confusion (!) -- perhaps you need to consider something along the lines of functional data analysis for your analysis. Having no knowledge of your issues, these remarks are entirely speculative and may well be wrong. So feel free to dismiss. Nevertheless, you may find it useful to consult your local statistician for help. Cheers, Bert P.S. The raw = TRUE option of poly() will fit raw, not orthogonal, polynomials. The fitted projections will be the same up to numerical error whichever basis is chosen, of course. On Fri, May 6, 2011 at 10:08 AM, Ben Haller rh...@sticksoftware.com wrote: On May 6, 2011, at 12:31 PM, David Winsemius wrote: On May 6, 2011, at 11:35 AM, Ben Haller wrote: Hi all! I'm getting a model fit from glm() (a binary logistic regression fit, but I don't think that's important) for a formula that contains powers of the explanatory variable up to the fourth. So the fit looks something like this (typing into mail; the actual fit code is complicated because it involves step-down and so forth):

x_sq <- x * x
x_cb <- x * x * x
x_qt <- x * x * x * x
model <- glm(outcome ~ x + x_sq + x_cb + x_qt, binomial)

It might have been easier with:

model <- glm(outcome ~ poly(x, 4), binomial)

Interesting. The actual model I'm fitting has lots more terms, and needs to be able to be stepped down; sometimes some of the terms of the polynomials will get dropped while others get retained, for example. But more to the point, poly() seems to be doing something tricky involving orthogonal polynomials that I don't understand. I don't think I want whatever that is; I want my original variable x, plus its powers.
For example, if I do:

x <- runif(10)
poly(x, 3)

the columns I get are not x, x^2, x^3; they are something else. So the poly() fit is not equivalent to my fit. Since the authors of confint might have been expecting a poly() formulation, the results might be more reliable. I'm just using lm() but I think the conclusions are more general:

...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.45737    0.52499   0.871   0.3893
x           -0.75989    1.15080  -0.660   0.5131
x2           1.30987    0.67330   1.945   0.0594 .
x3          -0.03559    0.11058  -0.322   0.7494
...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.4271     0.1434  37.839  < 2e-16 ***
poly(x, 3)1  30.0235     0.9184  32.692  < 2e-16 ***
poly(x, 3)2   8.7823     0.9184   9.563 1.53e-11 ***
poly(x, 3)3  -0.2956     0.9184  -0.322    0.749

Here, in your illustration, is underscored what I mean. Whatever these orthogonal polynomial terms are that you're using, they are clearly not the original x, x^2 and x^3, and they're giving you a different fit than those do. I probably ought to learn about this technique, since it looks interesting; but for my purposes I need the fit to actually be in terms of x, since x is my explanatory variable. And the fit I'm getting is highly significant (all terms < 2e-16), so the lack-of-fit problem you're illustrating does not seem to apply to my case. Or maybe I'm totally misunderstanding your point...? :-) Thanks! Ben Haller McGill University http://biology.mcgill.ca/grad/ben/ -- Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions.
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
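Bert's P.S. — that the raw and orthogonal bases give the same fitted projections — is easy to check. A minimal sketch with simulated data (the data here are made up for illustration):

```r
# The coefficient estimates differ between the two bases, but the
# fitted values agree up to numerical error.
set.seed(42)
x <- runif(50)
y <- 1 + 2 * x + 3 * x^2 + rnorm(50, sd = 0.1)

fit_raw  <- lm(y ~ poly(x, 3, raw = TRUE))  # coefficients on x, x^2, x^3
fit_orth <- lm(y ~ poly(x, 3))              # orthogonal polynomial basis

all.equal(fitted(fit_raw), fitted(fit_orth))  # TRUE
```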
Re: [R] Panels order in lattice graphs
Thanks to all who replied to my post. The best solution — one that answers my question entirely and can be used as a general function rather than case by case — is the one sent by the package author. Many thanks to everybody. It was helpful. Cristina On 05/05/2011 10:44, Deepayan Sarkar wrote: On Wed, May 4, 2011 at 9:20 PM, Cristina Silva csi...@ipimar.pt wrote: Hi all, In lattice graphs, panels are drawn from left to right and bottom to top. The flag as.table=TRUE changes this to left to right and top to bottom. Is there any way to change it to first top to bottom and then left to right? I didn't find anything in the help pages or the Lattice book. See ?packet.panel.default. For example,

p <- xyplot(mpg ~ disp | factor(carb), mtcars, as.table = TRUE)
print(p, packet.panel = packet.panel.default)

my.packet.panel <- function(layout, condlevels, page, row, column, ...) {
    tlayout <- layout[c(2, 1, 3)]  # switch rows and columns
    print(packet.panel.default(tlayout, condlevels, page = page,
                               row = column, column = row, ...))
}
print(p, packet.panel = my.packet.panel)

-Deepayan

--
Cristina Silva
INRB/L-IPIMAR
Unidade de Recursos Marinhos e Sustentabilidade
Av. de Brasília, 1449-006 Lisboa, Portugal
Tel.: +351 21 3027096
Fax: +351 21 3015948
csi...@ipimar.pt
Re: [R] Creating binary variable depending on strings of two dataframes
On May 6, 2011, at 11:35 AM, Pete Pete wrote: Gabor Grothendieck wrote: On Tue, Dec 7, 2010 at 11:30 AM, Pete Pete <noxyp...@gmail.com> wrote: Hi, consider the following two dataframes:

x1 <- c(232, 3454, 3455, 342, 13)
x2 <- c(1, 1, 1, 0, 0)
data1 <- data.frame(x1, x2)
y1 <- c(232, 232, 3454, 3454, 3455, 342, 13, 13, 13, 13)
y2 <- c("E1", "F3", "F5", "E1", "E2", "H4", "F8", "G3", "E1", "H2")
data2 <- data.frame(y1, y2)

I need a new column in dataframe data1 (x3), which is either 0 or 1 depending on whether the value E1 in y2 of data2 occurs where x1 = y1. The result in data1 should look like this:

    x1 x2 x3
1  232  1  1
2 3454  1  1
3 3455  1  0
4  342  0  0
5   13  0  1

I think a SQL command could help me but I am too inexperienced with it to get there. Try this:

library(sqldf)
sqldf("select x1, x2, max(y2 = 'E1') x3
       from data1 d1 left join data2 d2 on (x1 = y1)
       group by x1, x2 order by d1.rowid")

    x1 x2 x3
1  232  1  1
2 3454  1  1
3 3455  1  0
4  342  0  0
5   13  0  1

snipped Gabor's sig

That works pretty cool, but I need to automate this a bit more. Consider the following example:

list1 <- c("A01", "B04", "A64", "G84", "F19")
x1 <- c(232, 3454, 3455, 342, 13)
x2 <- c(1, 1, 1, 0, 0)
data1 <- data.frame(x1, x2)
y1 <- c(232, 232, 3454, 3454, 3455, 342, 13, 13, 13, 13)
y2 <- c("E13", "B04", "F19", "A64", "E22", "H44", "F68", "G84", "F19", "A01")
data2 <- data.frame(y1, y2)

I now want to create a loop which creates, for every value in list1, a new binary variable in data1. The result should look like:

  x1 x2 A01 B04 A64 G84 F19
 232  1   0   1   0   0   0
3454  1   0   0   1   0   1
3455  1   0   0   0   0   0
 342  0   0   0   0   0   0
  13  0   1   0   0   1   1

Loops!?! We don't need no steenking loops!

xtb <- with(data2, table(y1, y2))
cbind(data1, xtb[match(data1$x1, rownames(xtb)), ])

       x1 x2 A01 A64 B04 E13 E22 F19 F68 G84 H44
232   232  1   0   0   1   1   0   0   0   0   0
3454 3454  1   0   1   0   0   0   1   0   0   0
3455 3455  1   0   0   0   0   1   0   0   0   0
342   342  0   0   0   0   0   0   0   0   0   1
13     13  0   1   0   0   0   0   1   1   1   0

I am guessing that you were too ... er, busy? ... to complete the table?
--
David Winsemius, MD
West Hartford, CT
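A sketch of the step David leaves to the reader: subsetting the cross-tabulation to just the columns named in list1, in that order (data as defined in the thread above):

```r
list1 <- c("A01", "B04", "A64", "G84", "F19")
data1 <- data.frame(x1 = c(232, 3454, 3455, 342, 13),
                    x2 = c(1, 1, 1, 0, 0))
data2 <- data.frame(
  y1 = c(232, 232, 3454, 3454, 3455, 342, 13, 13, 13, 13),
  y2 = c("E13", "B04", "F19", "A64", "E22", "H44", "F68", "G84", "F19", "A01"))

xtb <- with(data2, table(y1, y2))
# Keep only the columns in list1 when binding onto data1:
res <- cbind(data1, xtb[match(data1$x1, rownames(xtb)), list1])
res
#    x1 x2 A01 B04 A64 G84 F19
#   232  1   0   1   0   0   0
#  3454  1   0   0   1   0   1
#  ...
```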
Re: [R] Confidence intervals and polynomial fits
On Fri, 6 May 2011, Bert Gunter wrote: FWIW: Fitting higher order polynomials (say degree > 2) is almost always a bad idea. See e.g. the Hastie, Tibshirani, et al. book on Statistical Learning for a detailed explanation why. The Wikipedia entry on smoothing splines also contains a brief explanation, I believe. Your ~0 P values for the coefficients also suggest problems/confusion (!) -- perhaps you need to consider something along the lines of functional data analysis for your analysis. Having no knowledge of your issues, these remarks are entirely speculative and may well be wrong. So feel free to dismiss. Nevertheless, you may find it useful to consult your local statistician for help.

That is the main piece of advice I would have given. But if you must DIY, consider the merits of orthogonal polynomials. Computing individual confidence intervals for highly correlated coefficients is very dubious practice. Without the example the posting guide asked for, we can only guess if that is what is happening.

Cheers, Bert P.S. The raw = TRUE option of poly() will fit raw, not orthogonal, polynomials. The fitted projections will be the same up to numerical error whichever basis is chosen, of course. On Fri, May 6, 2011 at 10:08 AM, Ben Haller rh...@sticksoftware.com wrote: On May 6, 2011, at 12:31 PM, David Winsemius wrote: On May 6, 2011, at 11:35 AM, Ben Haller wrote: Hi all! I'm getting a model fit from glm() (a binary logistic regression fit, but I don't think that's important) for a formula that contains powers of the explanatory variable up to the fourth. So the fit looks something like this (typing into mail; the actual fit code is complicated because it involves step-down and so forth):

x_sq <- x * x
x_cb <- x * x * x
x_qt <- x * x * x * x
model <- glm(outcome ~ x + x_sq + x_cb + x_qt, binomial)

It might have been easier with:

model <- glm(outcome ~ poly(x, 4), binomial)

Interesting.
The actual model I'm fitting has lots more terms, and needs to be able to be stepped down; sometimes some of the terms of the polynomials will get dropped while others get retained, for example. But more to the point, poly() seems to be doing something tricky involving orthogonal polynomials that I don't understand. I don't think I want whatever that is; I want my original variable x, plus its powers. For example, if I do:

x <- runif(10)
poly(x, 3)

the columns I get are not x, x^2, x^3; they are something else. So the poly() fit is not equivalent to my fit. Since the authors of confint might have been expecting a poly() formulation, the results might be more reliable. I'm just using lm() but I think the conclusions are more general:

...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.45737    0.52499   0.871   0.3893
x           -0.75989    1.15080  -0.660   0.5131
x2           1.30987    0.67330   1.945   0.0594 .
x3          -0.03559    0.11058  -0.322   0.7494
...
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.4271     0.1434  37.839  < 2e-16 ***
poly(x, 3)1  30.0235     0.9184  32.692  < 2e-16 ***
poly(x, 3)2   8.7823     0.9184   9.563 1.53e-11 ***
poly(x, 3)3  -0.2956     0.9184  -0.322    0.749

Here, in your illustration, is underscored what I mean. Whatever these orthogonal polynomial terms are that you're using, they are clearly not the original x, x^2 and x^3, and they're giving you a different fit than those do. I probably ought to learn about this technique, since it looks interesting; but for my purposes I need the fit to actually be in terms of x, since x is my explanatory variable. And the fit I'm getting is highly significant (all terms < 2e-16), so the lack-of-fit problem you're illustrating does not seem to apply to my case. Or maybe I'm totally misunderstanding your point...? :-) Thanks!
Ben Haller
McGill University
http://biology.mcgill.ca/grad/ben/

-- Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions. -- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
[R] histograms and for loops
The following code works, mostly. It runs fine but...
1. Is there a way to increment the xlab for each graph? I would like to have Graph 1, Graph 2, etc. Right now it just gives me "Graph i" over and over again.
2. Is there a way to get the x-axis and y-axis to be bold or at least a darker color? Right now it is light gray.
3. Is there a way to modify this code to automatically save them? I assume the answer is "do it manually". This is not the most important.

for(i in 1:12){
  hist(zNort1[,i], freq=FALSE, xlab="Graph i", col="blue",
       main="Standardized Residuals Histogram",
       ylim=c(0,1), xlim=c(-3.0,3.0))
  zNortmin <- min(zNort1[,1])
  zNortmax <- max(zNort1[,1])
  zNortmean <- mean(zNort1[,1])
  zNortsd <- sd(zNort1[,1])
  X1 <- seq(-3.0, 3.0, by=.01)
  lines(X1, dnorm(X1, zNortmean, zNortsd), col="black")
}

-- View this message in context: http://r.789695.n4.nabble.com/histograms-and-for-loops-tp3503648p3503648.html Sent from the R help mailing list archive at Nabble.com.
[R] read a netcdf file _Fill_value=-32768
Hello, I am a new user of R and I have a problem with R and netcdf. I succeeded with the installation and could use all the examples. But when I take my own netcdf file it is different. I want to do statistics on this kind of file. 1) First, calculate the mean. My data look like this through ncdump -h test.nc:

netcdf test {
dimensions:
        lat = 301 ;
        lon = 401 ;
        time = UNLIMITED ; // (80 currently)
variables:
        float lat(lat) ;
                lat:long_name = "latitude" ;
                lat:standard_name = "latitude" ;
                lat:units = "degrees_north" ;
                lat:valid_range = -60., 60. ;
        float lon(lon) ;
                lon:long_name = "longitude" ;
                lon:standard_name = "longitude" ;
                lon:units = "degrees_east" ;
                lon:valid_range = -100., 45. ;
        double sst(time, lat, lon) ;
                sst:long_name = "sea surface temperature" ;
                sst:standard_name = "sea_surface_temperature" ;
                sst:units = "K" ;
                sst:_FillValue = -32768. ;
        double time(time) ;
                time:long_name = "time" ;
                time:standard_name = "time" ;
                time:units = "seconds since 1981-01-01 00:00:00" ;
                time:ancillary_variables = "sst_data_processed_flag" ;

I succeeded in reading it with R; there is only one variable, sst, with 80 days, lat = 301 and lon = 401 (this file is already a concatenation made with nco functions). Here is what I do:

nc <- open.ncdf("test.nc")
summary(nc)
data <- get.var.ncdf(nc)
print(data)

Answer (extract):

[331,] -32768.00 -32768.00 -32768.00 -32768.00 -32768.00 -32768.00 -32768.00
[332,] -32768.00 -32768.00 -32768.00 -32768.00 -32768.00 -32768.00 -32768.00
       [,295] [,296] [,297] [,298] [,299] [,300] [,301]
[1,]     0.10   0.10   0.12   0.15   0.15   0.14   0.15
[2,]     0.12   0.12   0.13   0.15   0.15   0.14   0.15
[3,]     0.13   0.13   0.13   0.16   0.14   0.14   0.14
[4,]     0.15   0.13   0.11   0.16   0.15   0.15   0.15
[5,]     0.17   0.16   0.15   0.15   0.16   0.17   0.15
# the _FillValue is at -32768.00
[ reached getOption("max.print") -- omitted 69 row(s) and 79 matrix slice(s) ]

OK, too long.

mean(data, na.rm=TRUE)
[1] -20020.35
summary(data)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-32770.00 -32770.00 -32770.00 -20020.00      0.42      6.59
q()

It is clear that the missing value (_FillValue = -32768.)
is being used here to calculate the mean, the median, etc. But I don't succeed in keeping the NA values out of the sum. I've seen set.missval.ncdf, but I still don't succeed in making R understand that -32768 is a missing value. What are my mistakes? Does someone have an example or a solution? 2) Afterwards, I want to calculate the correlation between this file and another one. My time dimension is 80 (80 days) but my time variable is always the same (because of the concatenation with nco). How can I calculate cor(data1, data2) and be sure that the correlation is not calculated between day 1 and day 50, for instance? Thanks, Françoise Orain.
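A hedged sketch of the usual fix for this situation: recode the fill value to NA before computing statistics. The matrix below is a small stand-in for the poster's sst array; the object name `data` follows the session above.

```r
# Recode the netCDF fill value to NA so that na.rm = TRUE works.
fill <- -32768
data <- matrix(c(fill, fill, 0.10, 0.12, 0.15, 0.14), nrow = 2)

data[data == fill] <- NA    # treat fill cells as missing
mean(data, na.rm = TRUE)    # now averages only the real observations
```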
Re: [R] histograms and for loops
This should work!!

for(i in 1:12){
  xLabel <- paste("Graph", i)
  plotTitle <- paste0("Graph", i, ".jpg")
  jpeg(plotTitle)
  hist(zNort1[,i], freq=FALSE, xlab=xLabel, col="blue",
       main="Standardized Residuals Histogram",
       ylim=c(0,1), xlim=c(-3.0,3.0), axes=FALSE)
  axis(1, col="blue", col.axis="blue")
  axis(2, col="red", col.axis="red")
  zNortmin <- min(zNort1[,1])
  zNortmax <- max(zNort1[,1])
  zNortmean <- mean(zNort1[,1])
  zNortsd <- sd(zNort1[,1])
  X1 <- seq(-3.0, 3.0, by=.01)
  lines(X1, dnorm(X1, zNortmean, zNortsd), col="black")
  dev.off()
}

-- View this message in context: http://r.789695.n4.nabble.com/histograms-and-for-loops-tp3503758p3503758.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] histograms and for loops
1. ?paste ?sprintf
2. ?par (look at col.axis) ?axis
3. ?pdf ?png ?dev.copy

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of wwreith
Sent: Friday, May 06, 2011 11:11 AM
To: r-help@r-project.org
Subject: [R] histograms and for loops

The following code works, mostly. It runs fine but...
1. Is there a way to increment the xlab for each graph? I would like to have Graph 1, Graph 2, etc. Right now it just gives me "Graph i" over and over again.
2. Is there a way to get the x-axis and y-axis to be bold or at least a darker color? Right now it is light gray.
3. Is there a way to modify this code to automatically save them? I assume the answer is "do it manually". This is not the most important.

for(i in 1:12){
  hist(zNort1[,i], freq=FALSE, xlab="Graph i", col="blue",
       main="Standardized Residuals Histogram",
       ylim=c(0,1), xlim=c(-3.0,3.0))
  zNortmin <- min(zNort1[,1])
  zNortmax <- max(zNort1[,1])
  zNortmean <- mean(zNort1[,1])
  zNortsd <- sd(zNort1[,1])
  X1 <- seq(-3.0, 3.0, by=.01)
  lines(X1, dnorm(X1, zNortmean, zNortsd), col="black")
}

-- View this message in context: http://r.789695.n4.nabble.com/histograms-and-for-loops-tp3503648p3503648.html Sent from the R help mailing list archive at Nabble.com.
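Putting Greg's three pointers together: sprintf() builds the per-plot label, col.axis darkens the axis annotation, and pdf() + dev.off() saves each figure automatically. A minimal sketch with stand-in data (zNort1 is simulated here; the file names are made up):

```r
set.seed(1)
zNort1 <- matrix(rnorm(1200), ncol = 12)   # stand-in for the real residuals

for (i in 1:12) {
  pdf(sprintf("graph%02d.pdf", i))         # 3. save each plot automatically
  hist(zNort1[, i], freq = FALSE,
       xlab = sprintf("Graph %d", i),      # 1. per-plot label via sprintf
       col = "blue", main = "Standardized Residuals Histogram",
       ylim = c(0, 1), xlim = c(-3, 3),
       col.axis = "grey20")                # 2. darker axis annotation
  dev.off()
}
```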
Re: [R] histograms and for loops
On May 6, 2011, at 1:11 PM, wwreith wrote: The following code works, mostly. It runs fine but... 1. Is there a way to increment the xlab for each graph? I would like to have Graph 1, Graph 2, etc. Right now it just gives me "Graph i" over and over again. Use the power of bquote. See the modified code below. 2. Is there a way to get the x-axis and y-axis to be bold or at least a darker color? Right now it is light gray. What do you mean by the axis? The bounding box? The ticks? The labels? 3. Is there a way to modify this code to automatically save them? I assume the answer is "do it manually". This is not the most important. ?savePlot

for(i in 1:12){
  hist(zNort1[,i], freq=FALSE, xlab=bquote("Graph "*.(i)), col="blue",
       main="Standardized Residuals Histogram",
       ylim=c(0,1), xlim=c(-3.0,3.0))
  zNortmin <- min(zNort1[,1])
  zNortmax <- max(zNort1[,1])
  zNortmean <- mean(zNort1[,1])
  zNortsd <- sd(zNort1[,1])
  X1 <- seq(-3.0, 3.0, by=.01)
  lines(X1, dnorm(X1, zNortmean, zNortsd), col="black")
}

--
David Winsemius, MD
West Hartford, CT
[R] Conditional plot length in R
Hello All, Let's say I have data spanning all quadrants of the x-y plane. If I plot data with a certain x and y range using xlim and ylim, or by using plot.formula as described in this link: http://www.mathkb.com/Uwe/Forum.aspx/statistics/5684/plotting-in-R

DF <- data.frame(x = rnorm(1000), y = rnorm(1000))
str(DF)
'data.frame': 1000 obs. of 2 variables:
 $ x: num -0.0265 0.1554 -0.1050 -0.9697 -0.3430 ...
 $ y: num 1.386 -1.356 -1.170 0.426 0.204 ...

Now, let's plot the data meeting the criteria you indicated above:

plot(y ~ x, data = DF, subset = (x > 0) & (y > 0))

How then can I get the length of that data? If I have 1000 data points and 200 lie in x, y > 0, how do I find that the length is 200? Thanks! Why-so-serious
Re: [R] Draw a nomogram after glm
Don't attach the Design package. Use only rms. Please provide the output of lrm (print the f object). With such a strong model, make sure you do not have a circularity somewhere. With nomogram you can specify ranges for the predictors; the default is 10th smallest to 10th largest. rms will not make customized nomograms such as the one you included. Frank

Hi all R users, Thanks Frank for your advice. In fact I posted all my script. In the R help, the script for nomogram is long and I took only the part that I think is relevant in my case. I used information from datadist {Design} and rms {rms} in the R help to write my code. I see that I am making progress with my nomogram, because of the following code, where I put my real variable names:

library(Design)
library(rms)
d <- datadist(Fire)
options(datadist = 'd')
f <- lrm(Ignition ~ FMC + Charge, data = Fire)
summary(f, FMC = c(0, 506.40), Charge = c(45, 646))
# min and max of FMC: 0, 506.40; min and max of Charge: 45, 646
plot(Predict(f, FMC = 0:506.40, Charge = 646))
plot(nomogram(f, interact = list(Charge = c(.2, .7))))
# sorry, I do not understand the vector c(.2, .7) from the R help

As a result, I get figure 1 and then figure 2, but there is a problem: the 3rd line of figure 2 (Charge) must go from 0 to 650. Also, the linear predictor must go from 0 to 1.

http://r.789695.n4.nabble.com/file/n3503828/nomo.jpeg
http://r.789695.n4.nabble.com/file/n3503828/nomogram.jpeg

Afterwards, is it possible to draw my nomogram like the 3rd graph that I found on the Internet? It is easier to understand.

http://r.789695.n4.nabble.com/file/n3503828/nogramme.bmp

NB: I apologize for my bad English. Thanks for your help. Komine, PhD student, Dakar, Sénégal, West Africa

Komine wrote: Hi all R users, I did a logistic regression with my binary variable Y (0/1) and 2 explanatory variables. Now I am trying to draw my nomogram with predictive values. I visited the R help but I have problems understanding the example well. When I use the glm function, I have a problem, thus I use lrm.
My code is:

modele <- lrm(Y ~ L + P, data = donnee)
fun <- function(x) plogis(x - modele$coef[1] + modele$coef[2])
f <- Newlabels(modele, c(L = "poids", P = "taille"))
nomogram(f, fun = list('Prob Y=1' = plogis),
         fun.at = c(seq(0, 1, by = .1), .95, .99), lmgp = .1, cex.axis = .6)
fun.at = c(.01, .05, seq(.1, .9, by = .1), .95, .99), lmgp = .2, cex.axis = .6)
options(Fire = NULL)

The result is bad and I get the following error message:

Error in value.chk(at, i, NA, -nint, Limval, type.range = "full") :
  variable L does not have limits defined by datadist

Could you help me with the code to draw the nomogram? NB: my English is poor, I apologize. Thanks for your help. Komine
Re: [R] Conditional plot length in R
On May 6, 2011, at 2:28 PM, Pavan G wrote: Hello All, Let's say I have data spanning all quadrants of the x-y plane. If I plot data with a certain x and y range using xlim and ylim, or by using plot.formula as described in this link: http://www.mathkb.com/Uwe/Forum.aspx/statistics/5684/plotting-in-R

DF <- data.frame(x = rnorm(1000), y = rnorm(1000))
str(DF)
'data.frame': 1000 obs. of 2 variables:
 $ x: num -0.0265 0.1554 -0.1050 -0.9697 -0.3430 ...
 $ y: num 1.386 -1.356 -1.170 0.426 0.204 ...

Now, let's plot the data meeting the criteria you indicated above:

plot(y ~ x, data = DF, subset = (x > 0) & (y > 0))

How then can I get the length of that data? If I have 1000 data points and 200 lie in x, y > 0, how do I find that the length is 200?

with(DF, sum((x > 0) & (y > 0), na.rm=TRUE))  # since it is 1 for TRUE and 0 for FALSE after coercion

--
David Winsemius, MD
West Hartford, CT
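David's counting idiom as a self-contained sketch: logicals coerce to 0/1, so sum() counts the points satisfying the plotting condition, and the count agrees with the number of rows the subset would plot.

```r
# Count the points in the upper-right quadrant, i.e. those that the
# subset = (x > 0) & (y > 0) call would plot.
set.seed(7)
DF <- data.frame(x = rnorm(1000), y = rnorm(1000))

n <- with(DF, sum((x > 0) & (y > 0), na.rm = TRUE))
n  # roughly a quarter of the 1000 points

# The same count via the subset used for plotting:
n == nrow(subset(DF, (x > 0) & (y > 0)))  # TRUE
```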
[R] editor: not possible to change the variable name
Hi again everybody, I have a new problem concerning the editor of R. It is possible to add a new variable column, but they all have the name var1. I read somewhere that it should be possible to change the variable name by clicking on it, but that doesn't work. Is that a bug, or how is it possible to change the variable header? Many thanks Matthias -- View this message in context: http://r.789695.n4.nabble.com/editor-not-possible-to-change-the-variable-name-tp3503864p3503864.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Draw a nomogram after glm
Hi all R users, Thanks Frank for your advice. In fact I posted all my script. In the R help, the script for nomogram is long and I took only the part that I think is relevant in my case. I used information from the help pages for datadist {Design} and rms {rms} to write my code. I see that I am making progress with my nomogram, because of the following code where I put my real variable names:

library(Design)
library(rms)
d <- datadist(Fire)
options(datadist='d')
f <- lrm(Ignition ~ FMC + Charge, data=Fire)
summary(f, FMC=c(0,506.40), Charge=c(45,646)) # min and max of FMC: 0, 506.40; min and max of Charge: 45, 646
plot(Predict(f, FMC=0:506.40, Charge=646))
plot(nomogram(f, interact=list(Charge= c(.2,.7 # sorry, I do not understand the vector c(.2,.7) from the R help

As a result I get figure 1 and then figure 2, but there is a problem: the 3rd line of figure 2, Charge, must go from 0 to 650. Also the linear predictor must go from 0 to 1. http://r.789695.n4.nabble.com/file/n3503828/nomo.jpeg http://r.789695.n4.nabble.com/file/n3503828/nomogram.jpeg After that, is it possible to draw my nomogram like the 3rd graph that I found on the Internet? It is easier to understand. http://r.789695.n4.nabble.com/file/n3503828/nogramme.bmp NB: I apologize for my bad English. Thanks for your help. Komine PhD student, Dakar, Sénégal, West Africa

Komine wrote: Hi all R users, I did a logistic regression with my binary variable Y (0/1) and 2 explanatory variables. Now I try to draw my nomogram with predictive values. I visited the R help pages but I have trouble understanding the example. When I use the glm function I have a problem, so I use lrm.
My code is:

modele <- lrm(Y ~ L + P, data = donnee)
fun <- function(x) plogis(x - modele$coef[1] + modele$coef[2])
f <- Newlabels(modele, c(L = 'poids', P = 'taille'))
nomogram(f, fun = list('Prob Y=1' = plogis), fun.at = c(seq(0, 1, by = .1), .95, .99), lmgp = .1, cex.axis = .6)
# alternative settings tried: fun.at = c(.01, .05, seq(.1, .9, by = .1), .95, .99), lmgp = .2, cex.axis = .6
options(Fire = NULL)

The result is bad and I get the following error message: Erreur dans value.chk(at, i, NA, -nint, Limval, type.range = full) : variable L does not have limits defined by datadist. Could you help me with the code to draw the nomogram? NB: my English is poor, I apologize. Thanks for your help. Komine
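The error above ("variable L does not have limits defined by datadist") usually means no datadist was registered for the fitting data before calling nomogram(). A hedged sketch of the usual rms/Design idiom, using the poster's (hypothetical here) data frame and variable names donnee, Y, L, P; this assumes the rms package is installed:

```r
library(rms)  # assumed installed; Design is its predecessor

# nomogram() needs per-variable ranges, which rms reads from a datadist object;
# it must be registered via options() before plotting.
dd <- datadist(donnee)        # 'donnee' is the poster's data frame
options(datadist = "dd")

modele <- lrm(Y ~ L + P, data = donnee)
plot(nomogram(modele, fun = plogis, funlabel = "Prob Y=1"))
```

This is a sketch, not a definitive fix; the key point is that datadist() and options(datadist=...) come before the nomogram call.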
[R] create arrays
In Matlab, an array can be created from 1 - 30 using the command similar to R which is 1:30. Then, to make the array step by 0.1 the command is 1:0.1:30 which is 1, 1.1, 1.2,...,29.9,30. How can I do this in R? - In theory, practice and theory are the same. In practice, they are not - Albert Einstein -- View this message in context: http://r.789695.n4.nabble.com/create-arrays-tp3503988p3503988.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create arrays
I can get around it by doing something like: as.matrix(rep(1,291))*row(as.matrix(rep(1,291)))/10+.9 I was just hoping for a simple command. Schatzi wrote: In Matlab, an array can be created from 1 - 30 using the command similar to R which is 1:30. Then, to make the array step by 0.1 the command is 1:0.1:30 which is 1, 1.1, 1.2,...,29.9,30. How can I do this in R? - In theory, practice and theory are the same. In practice, they are not - Albert Einstein -- View this message in context: http://r.789695.n4.nabble.com/create-arrays-tp3503988p3503998.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create arrays
On Fri, May 06, 2011 at 12:11:30PM -0700, Schatzi wrote: In Matlab, an array can be created from 1 - 30 using the command similar to R which is 1:30. Then, to make the array step by 0.1 the command is 1:0.1:30 which is 1, 1.1, 1.2,...,29.9,30. How can I do this in R? ... This may well be a hack, but 10:300/10 seemed to do it for me. Peace, david -- David H. Wolfskill r...@catwhisker.org Depriving a girl or boy of an opportunity for education is evil. See http://www.catwhisker.org/~david/publickey.gpg for my public key. pgpGpFjB08hOL.pgp Description: PGP signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create arrays
?seq -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org On Behalf Of Schatzi Sent: Friday, May 06, 2011 1:12 PM To: r-help@r-project.org Subject: [R] create arrays In Matlab, an array can be created from 1 - 30 using the command similar to R which is 1:30. Then, to make the array step by 0.1 the command is 1:0.1:30 which is 1, 1.1, 1.2,...,29.9,30. How can I do this in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create arrays
On Fri, May 6, 2011 at 12:11 PM, Schatzi adele_thomp...@cargill.com wrote: In Matlab, an array can be created from 1 - 30 using the command similar to R which is 1:30. Then, to make the array step by 0.1 the command is 1:0.1:30 which is 1, 1.1, 1.2,...,29.9,30. How can I do this in R? Hmm, in this case, I would do it slightly differently: seq(from = 1, to = 30, by = .1) Cheers, Josh -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create arrays
Beautiful. -Original Message- From: greg.s...@imail.org Sent: Friday, May 06, 2011 02:17 PM To: Thompson, Adele - adele_thomp...@cargill.com; r-help@r-project.org Subject: RE: [R] create arrays ?seq -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] editor: not possible to change the variable name
Hi Matthias, What do you mean by that doesn't work? What platform are you using? Using: sessionInfo() R version 2.13.0 (2011-04-13) Platform: x86_64-pc-mingw32/x64 (64-bit) fix(mydataframe) brings up the data editor; then if I click a variable name, change it, and shut down the data editor, it works just fine for me. You could also use the menu to bring up the editor. More information about the steps you are trying, the version of R, and exactly what is going wrong will help us help you. Best regards, Josh On Fri, May 6, 2011 at 11:32 AM, tornanddesperate matthiasn...@gmx.ch wrote: Hi again everybody I have a new problem concerning the editor of R. It is possible to add a new variable column, but they all have the name var1. I read somewhere that it should be possible to change the variable name by clicking on it, but that doesn't work. Is that a bug or how is it possible to change the variable header? Many thanks Matthias -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extracting months and years from Dates while keeping order
Hello! I'd like to take Dates and extract from them months and years - but so that it sorts correctly. For example:

x1 <- seq(as.Date("2009-01-01"), length = 14, by = "month")
x1
order(x1) # produces correct order based on full dates

# Of course, I could use format - but this way I am losing the Date quality of the data:
x2 <- format(x1, "%Y.%m")
x2
order(x2)

Not sure it's possible at all: but is there some other way to extract just months and years from Dates (while ignoring days) - but so that it's still somehow a Date object? Thank you very much! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting months and years from Dates while keeping order
On Fri, May 6, 2011 at 4:07 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I'd like to take Dates and extract from them months and years - but so that it sorts correctly. For example: x1 <- seq(as.Date("2009-01-01"), length = 14, by = "month"); order(x1) # produces correct order based on full dates. Of course, I could use format - but this way I am losing the Date quality of the data: x2 <- format(x1, "%Y.%m"); order(x2). Is there some other way to extract just months and years from Dates (while ignoring days) - but so that it's still somehow a Date object? Try using the yearmon class in zoo.

library(zoo)
y <- as.yearmon(x1)
y
[1] Jan 2009 Feb 2009 Mar 2009 Apr 2009 May 2009 Jun 2009 [7] Jul 2009 Aug 2009 Sep 2009 Oct 2009 Nov 2009 Dec 2009 [13] Jan 2010 Feb 2010
order(y)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] replace NA
Hello all, I have a geology map that has three levels below geology: lithology, landscape, landform. The landform level is used as a covariate (with codes 1,2,3,4,5) for training a neural network, but this level has missing data as NA. I want to replace the missing data of the landform level with 0 (zero). Finally, landform will have codes 0,1,2,3,4,5. Please help me. Thanks a lot. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fw: grid
Dear All, I trained a neural network on 200 data points and I made predictions for a grid file (e.g. 100 points) as below:

snn <- predict(nn, newdata = data.frame(wetness = wetnessgrid$band1, ndvi = ndvigrid$band1))

The pixels of snn are the same as wetnessgrid or ndvigrid. I want to convert this file (snn) to a map that I can open in GIS software. Thanks a lot. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confidence intervals and polynomial fits
On May 6, 2011, at 1:58 PM, Prof Brian Ripley wrote: On Fri, 6 May 2011, Bert Gunter wrote: FWIW: Fitting higher order polynomials (say > 2) is almost always a bad idea. See e.g. the Hastie, Tibshirani, et al. book on Statistical Learning for a detailed explanation why. The Wikipedia entry on smoothing splines also contains a brief explanation, I believe. Your ~0 P values for the coefficients also suggest problems/confusion (!) -- perhaps you need to consider something along the lines of functional data analysis for your analysis. Having no knowledge of your issues, these remarks are entirely speculative and may well be wrong. So feel free to dismiss. Nevertheless, you may find it useful to consult your local statistician for help. That is the main piece of advice I would have given. But if you must DIY, consider the merits of orthogonal polynomials. Computing individual confidence intervals for highly correlated coefficients is very dubious practice. Without the example the posting guide asked for, we can only guess if that is what is happening. Thanks to both of you. Yes, I am admittedly out of my depth; the statistician I would normally ask is on sabbatical, and I'm a bit at sea. Of course McGill has a whole department of mathematics and statistics; I guess I ought to try to make a friend over there (I'm in the biology department). Anyhow, I've just downloaded the Hastie et al. book and will try to figure out whether my use of higher order polynomials is incorrect in my situation. Eliminating those would certainly solve my problem with the confidence intervals. I was figuring that the ~0 P-values for coefficients were just the result of my having 300,000 data points; I figured the regression procedure was thus able to pin down very accurate estimates of them. I'll look into functional data analysis as you recommend, though; I'm entirely unfamiliar with it. As for correlated coefficients: x, x^2, x^3 etc.
would obviously be highly correlated, for values close to zero. Is this what you mean, as a potential source of problems? Or if you mean that the various other terms in my model might be correlated with x, that is not the case; each independent variable is completely uncorrelated with the others (this data comes from simulations, so the independent variables for each data point were in fact chosen by random drawing). It didn't seem easy to post an example, since my dataset is so large, but if either of you would be willing to look at this further, I could upload my dataset to a web server somewhere and post a link to it. In any case, thanks very much for your help; I'll look into the things you mentioned. Ben Haller McGill University http://biology.mcgill.ca/grad/ben/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] replace NA
Hi, If your geology map is a special kind of object, this may not work, but if you are just dealing with a data frame or matrix type object named geology with named columns, something like this ought to do the trick:

geology[is.na(geology[, "landform"]), "landform"] <- 0

is.na() returns a logical vector of TRUE/FALSE indicating whether a value is missing, which is used to index the data, and then 0 is assigned to all those positions. HTH, Josh On Fri, May 6, 2011 at 1:12 PM, azam jaafari azamjaaf...@yahoo.com wrote: Hello all, I have a geology map that has three levels below geology: lithology, landscape, landform. The landform level is used as a covariate (with codes 1,2,3,4,5) for training a neural network, but this level has missing data as NA. I want to replace the missing data of the landform level with 0 (zero). Finally, landform will have codes 0,1,2,3,4,5. Please help me. Thanks a lot. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Averaging uneven measurements by time with uneven numbers of measurements
I figured out a poor way to do what I want.

meas <- runif(30)
times <- sort(runif(30))
timesdec <- seq(0, 1, 0.2)
ltim <- length(timesdec)
storing <- rep(0, ltim)
for (i in 1:ltim) {
  if (i == 1) {rowstart <- 1} else {rowstart <- findInterval(timesdec[i-1], times) + 1}
  rowfinal <- findInterval(timesdec[i], times)
  storing[i] <- mean(meas[rowstart:rowfinal])
} # end i-for loop

-Original Message- From: ehl...@ucalgary.ca Sent: Thursday, May 05, 2011 03:58 PM To: Thompson, Adele - adele_thomp...@cargill.com Cc: r-help@r-project.org Subject: Re: [R] Averaging uneven measurements by time with uneven numbers of measurements On 2011-05-05 14:20, Schatzi wrote: I do not want smoothing as the data should have jumps (it is weight left in a feeding bunker). I was thinking of maybe using a histogram-like function and then averaging that. Not sure if this is possible. (It would be useful to include your original request - not everyone uses Nabble.) Actually, averaging *is* smoothing, but I suppose your intent is, for some reason, not to smooth across 30-minute boundaries. Perhaps you could use findInterval() to identify which measurements to average. Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
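The loop above can be collapsed with findInterval() plus tapply(): assign each time to a bin, then average the measurements per bin. A sketch on the same simulated data (set.seed added for reproducibility; note the bins here are left-closed, which differs slightly from the loop's right-closed convention):

```r
set.seed(42)  # added so the sketch is reproducible
meas  <- runif(30)
times <- sort(runif(30))
timesdec <- seq(0, 1, 0.2)

# findInterval() gives the bin index of each time;
# tapply() then averages the measurements within each bin.
bins <- findInterval(times, timesdec, rightmost.closed = TRUE)
avg  <- tapply(meas, bins, mean)
avg
```

tapply() only returns bins that actually contain observations, whereas the loop produces NaN (mean of an empty slice) for empty bins.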
Re: [R] Confidence intervals and polynomial fits
On May 6, 2011, at 4:16 PM, Ben Haller wrote: As for correlated coefficients: x, x^2, x^3 etc. would obviously be highly correlated, for values close to zero. Not just for x close to zero: cor( (10:20)^2, (10:20)^3 ) [1] 0.9961938 cor( (100:200)^2, (100:200)^3 ) [1] 0.9966219 Is this what you mean, as a potential source of problems? Or if you mean that the various other terms in my model might be correlated with x, that is not the case; each independent variable is completely uncorrelated with the others (this data comes from simulations, so the independent variables for each data point were in fact chosen by random drawing). It didn't seem easy to post an example, since my dataset is so large, but if either of you would be willing to look at this further, I could upload my dataset to a web server somewhere and post a link to it. In any case, thanks very much for your help; I'll look into the things you mentioned. Ben Haller McGill University http://biology.mcgill.ca/grad/ben/ David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
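The orthogonal polynomials Prof. Ripley suggested are what poly() produces by default: the raw powers shown above are nearly collinear, while the orthogonal basis columns are uncorrelated by construction. A quick base-R illustration:

```r
x <- 100:200

# Raw powers: nearly collinear (cf. the cor() values quoted above)
cor(x^2, x^3)

# Orthogonal polynomial basis: columns are uncorrelated by construction
P <- poly(x, degree = 3)
round(cor(P), 10)   # identity matrix up to rounding

# In a model formula: y ~ poly(x, 3) instead of y ~ x + I(x^2) + I(x^3)
```

Using poly(x, 3) in the formula gives the same fitted values as the raw powers but with well-conditioned, interpretable per-term tests.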
Re: [R] understanding error messages in archmCopulaFit in fCopulae
On May 6, 2011, at 12:14 PM, Lee, Eric wrote: Hello, I'm running R x64 v2.12.2 on a 64-bit Windows 7 PC. I have two data vectors, x and y, and try to run archmCopulaFit. Most of the copulas produce errors. Can you tell me what the errors mean and if possible, how I can set archmCopulaFit options to make them run? I see in the documentation that there are arguments passed to the optimization function in use, 'nlminb', however, it's not clear to me how to set them. Thanks. Copulas 2, 4,7,8,11,12,14,15,18,19,20,21,22 have the following error: fit1 = archmCopulaFit(x,y,type=2) Error in if (alpha < range[1] | alpha > range[2]) { : missing value where TRUE/FALSE needed Seems like a fairly explanatory error message. The function is testing the assumptions going into it, possibly against default values, using some sort of range calculation, and is finding that your data doesn't meet those requirements. Copula 3 has the following error: fit1 = archmCopulaFit(x,y,type=3) Error in if (alpha < range[1] | alpha > range[2]) { : missing value where TRUE/FALSE needed In addition: Warning messages: 1: In log(darchmCopula(u = U, v = V, alpha = x, type = type)) : NaNs produced 2: In log(darchmCopula(u = U, v = V, alpha = x, type = type)) : NaNs produced That message arises when you ask for the log of a number less than or equal to 0. Copula 10: fit1 = archmCopulaFit(x,y,type=10) Error in if (alpha < range[1] | alpha > range[2]) { : missing value where TRUE/FALSE needed In addition: Warning message: In log(darchmCopula(u = U, v = V, alpha = x, type = type)) : NaNs produced See above. Copulas 1,5,9,13,16,17 produce estimates. Those errors appear to be arising because of specific aspects of your data ... which seems to be missing in action.
-- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting months and years from Dates while keeping order
Perfect - that's it, Gabor, thanks a lot! Dimitri On Fri, May 6, 2011 at 4:11 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Try using the yearmon class in zoo. -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
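If pulling in zoo is not an option, a base-R alternative is to truncate each date to the first of its month with cut(): converting the result back to Date keeps the Date class, so it still sorts chronologically, and format() can render the "%Y.%m" display when needed:

```r
x1 <- seq(as.Date("2009-01-01"), length = 14, by = "month")

# cut.Date() with breaks = "month" labels each date by the first day of its
# month; converting back to Date keeps the class and chronological ordering.
ym <- as.Date(cut(x1, breaks = "month"))
ym[1:3]
stopifnot(identical(order(ym), 1:14), inherits(ym, "Date"))

format(ym, "%Y.%m")   # for display only; ym itself stays a Date
```

Unlike the "%Y.%m" character approach, ym remains a Date object, so arithmetic and axis formatting keep working.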
Re: [R] editor: not possible to change the variable name
Hi Matthias, If you know the column number you want to change, it is pretty straightforward.

## use the builtin mtcars dataset as an example
## and store it in variable 'x'
x <- mtcars
## change the second column name to "cylinder"
colnames(x)[2] <- "cylinder"
## compare the column names of 'x' with the unchanged 'mtcars'
colnames(mtcars)
colnames(x)

You might also consider R-SIG-Mac if it is at all related to your OS. I have never really used Macs so I am unfamiliar with R's default interface on them, though I would assume it is similar. Good luck, Josh On Fri, May 6, 2011 at 3:05 PM, Matthias Neff matthiasn...@gmx.ch wrote: Hello Joshua, Thank you for your prompt reply. I am using: R version 2.13.0 (2011-04-13) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit). The problem is that the part where a window should pop up just doesn't happen: I can click on the variable name as often as I want, nothing happens. Maybe this is a problem specific to the Mac version of R. Do you know if there is an alternative way to change the variable name? Have a good day, Matthias On 06.05.2011, at 21:25, Joshua Wiley wrote: Hi Matthias, What do you mean by that doesn't work? What platform are you using? Using: sessionInfo() R version 2.13.0 (2011-04-13) Platform: x86_64-pc-mingw32/x64 (64-bit) fix(mydataframe) brings up the data editor; then if I click a variable name, change it, and shut down the data editor, it works just fine for me. You could also use the menu to bring up the editor. More information about the steps you are trying, the version of R, and exactly what is going wrong will help us help you. Best regards, Josh On Fri, May 6, 2011 at 11:32 AM, tornanddesperate matthiasn...@gmx.ch wrote: Hi again everybody I have a new problem concerning the editor of R. It is possible to add a new variable column, but they all have the name var1. I read somewhere that it should be possible to change the variable name by clicking on it, but that doesn't work.
Is that a bug or how is it possible to change the variable header? Many thanks Matthias -- View this message in context: http://r.789695.n4.nabble.com/editor-not-possible-to-change-the-variable-name-tp3503864p3503864.html Sent from the R help mailing list archive at Nabble.com. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create arrays
Some good suggestions, just (as always) be aware of floating-point imprecision. See FAQ 7.31.

s <- seq(1, 30, 0.1)
s[8]
[1] 1.7
s[8] == 1.7
[1] FALSE

Just trying to forestall future questions :-) Dan Daniel Nordlund Bothell, WA USA -Original Message- From: r-help-boun...@r-project.org On Behalf Of adele_thomp...@cargill.com Sent: Friday, May 06, 2011 12:18 PM To: greg.s...@imail.org; r-help@r-project.org Subject: Re: [R] create arrays Beautiful. -Original Message- From: greg.s...@imail.org Sent: Friday, May 06, 2011 02:17 PM Subject: RE: [R] create arrays ?seq -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111
[R] Generalized Hyperbolic distribution
How can I use the generalized hyperbolic distribution package to estimate the four parameters of the NIG distribution? I have stock-return data to which I want to fit the parameters. -- View this message in context: http://r.789695.n4.nabble.com/Generalized-Hyperbolic-distribution-tp3504369p3504369.html Sent from the R help mailing list archive at Nabble.com.
[R] Weighted box or violin plots in Lattice?
Is it possible to create weighted box plots or violin plots in lattice? It seems that you can specify weights for panel.histogram() and panel.densityplot(), but not for panel.bwplot() or panel.violin(). Please let me know if I've missed something in the package documentation. Thanks! -- Raphael D. Mazor Freshwater Biologist Southern California Coastal Water Research Project 3535 Harbor Blvd, Suite 110 Costa Mesa, CA 92626 Tel. (714) 755-3235 Fax (714) 755-3299
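One workaround - an assumption on my part, not a documented lattice feature - is to expand the data by integer weights before plotting, so each row appears in proportion to its weight; the result is an ordinary unweighted bwplot() of the expanded data:

```r
library(lattice)

# Toy data with hypothetical integer weights in column 'w'
df <- data.frame(group = rep(c("a", "b"), each = 4),
                 value = c(1, 2, 3, 4, 2, 4, 6, 8),
                 w     = c(1, 1, 2, 1, 3, 1, 1, 1))

# Repeat each row w times, then box-plot the expanded data
df.w <- df[rep(seq_len(nrow(df)), df$w), ]
p <- bwplot(value ~ group, data = df.w)
print(p)
```

Note that this only works for integer (or rescaled-to-integer) weights and inflates the apparent sample size, so it is a visualization trick rather than a statistical fix.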
[R] Replacing missing values with values before it
I'm using the survey API. I am taking 1000 samples of size 100 and replacing 20 of those values with missing values. I'm trying to use sequential hot-deck imputation, so I need to replace each missing value with the value before it. I also have to handle two missing values side by side: both should be replaced with the last observed value before them. And if the first value of a sample of 100 is missing, I will replace it with the mean of the population. I'm pretty sure I have to write a loop, but if anyone can help me figure out how to write this I would appreciate it greatly. Thank you, Nick Manginelli
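A minimal sketch of the loop described above (the helper name and the pop.mean argument are mine, taken from the first-value rule in the question); zoo::na.locf() implements the carry-forward part as well:

```r
# Sequential hot-deck / last-observation-carried-forward, as described:
# each NA takes the previous value; a leading NA takes the population mean.
impute_locf <- function(x, pop.mean) {
  if (is.na(x[1])) x[1] <- pop.mean
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]  # consecutive NAs inherit the same value
  }
  x
}

impute_locf(c(NA, 2, NA, NA, 5), pop.mean = 10)
# [1] 10  2  2  2  5
```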
[R] Custom Zero equivalent in R
Hi all, I am trying to find a way to tell R to treat the small number 10^-20 as zero by default. That means if any number is below this threshold it should be treated as negligible, and if I divide something by a number less than that (in absolute terms), Inf should be displayed, etc. I have gone through the help page of the options() function but could not find anything on how to handle this. Can somebody help me in this regard? Thanks
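There is indeed no options() setting for this - options() controls things like printing, not arithmetic - but the threshold can be applied explicitly with a small helper (the function names here are hypothetical, just to sketch the idea):

```r
eps <- 1e-20  # the custom "zero" threshold from the question

# Hypothetical helper: zap magnitudes below the threshold to exact zero
zap <- function(x, tol = eps) ifelse(abs(x) < tol, 0, x)

# Division then behaves as if the tiny denominator were zero (Inf/NaN)
safe_div <- function(a, b, tol = eps) a / zap(b, tol)

zap(1e-25)          # 0
safe_div(1, 1e-25)  # Inf
```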
[R] Sweave: no eps fig
Hi everyone, I've been using R, LaTeX, and Sweave for some years now, but today it confuses me a lot: running Sweave produces only figures in .pdf format, no .eps figures. The chunk header looks like this: <<echo=TRUE, fig=TRUE, label=Fig1>>= There was no error message. Does anybody have an idea? Any changes in the Sweave package? Or a missing driver or something like that? I recently switched to another OS (Fedora) - might that be the reason? Grateful for any suggestions, Lena --
Re: [R] Bigining with a Program of SVR
Thanks Max. I'm now using the caret library with my data, but the models showed a correlation under 0.7. Maybe the problem is with the variables that I'm using to generate the model. For that reason I'm asking for packages that would allow me to reduce the number of features and remove the worst ones. I recently read an article that combines a genetic algorithm with support vector regression to do that. Best Regards, Yasset -- View this message in context: http://r.789695.n4.nabble.com/Bigining-with-a-Program-of-SVR-tp3484476p3503918.html Sent from the R help mailing list archive at Nabble.com.
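The thread doesn't settle on a method, but a simple baseline for pruning features - before reaching for the genetic-algorithm approach in the article - is a correlation filter. This base-R sketch (with simulated data standing in for Yasset's) keeps the k predictors most correlated with the response:

```r
set.seed(1)
# Simulated stand-in data: 10 predictors, response driven by V1 and V2
x <- as.data.frame(matrix(rnorm(100 * 10), ncol = 10))
y <- x[[1]] - 2 * x[[2]] + rnorm(100, sd = 0.1)

# Score each predictor by absolute correlation with the response
scores <- sapply(x, function(col) abs(cor(col, y)))

# Keep the top k features
k <- 4
keep <- names(sort(scores, decreasing = TRUE))[1:k]
x.reduced <- x[keep]
```

caret itself also offers wrapper methods (e.g. rfe(), recursive feature elimination) that search feature subsets against model performance rather than a marginal score.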
[R] convert count data to binary data
Is there a way to generate a new dataframe that produces x lines based on the contents of a column? For example, I would like to generate a new dataframe with 70 lines of data[1, 1:3], 67 lines of data[2, 1:3], 75 lines of data[3, 1:3], and so on up to numrow = sum(count).

data
pop fam yesorno count
  1 126       1    70
  1 127       1    67
  1 128       1    75
  1 126       0    20
  1 127       0    23
  1 128       0    15

Thanks, Chris Department of Biological Science Florida State University 319 Stadium Drive Tallahassee, FL 32306-4295
Re: [R] Draw a nomogram after glm
Hi Frank, To answer your request: print(f)

Logistic Regression Model

lrm(formula = Ignition ~ FMC + Charge, data = Fire)

                     Model Likelihood      Discrimination    Rank Discrim.
                        Ratio Test             Indexes          Indexes
Obs          231    LR chi2     231.58    R2        0.852    C       0.976
 0            96    d.f.             2    g         8.972    Dxy     0.953
 1           135    Pr(> chi2) <0.0001    gr     7878.577    gamma   0.953
max |deriv| 1e-06                         gp        0.466    tau-a   0.465
                                          Brier     0.054

          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept  9.6937 1.5863  6.11  <0.0001
FMC       -0.0828 0.0138 -6.02  <0.0001
Charge    -0.0047 0.0021 -2.28   0.0223

I continue to try my nomogram. Thanks again, Komine -- View this message in context: http://r.789695.n4.nabble.com/Draw-a-nomogram-after-glm-tp3498144p3504208.html Sent from the R help mailing list archive at Nabble.com.
[R] coxph and survfit issue - strata
Dear users, In a study with recurrent events, my objective is to get estimates of survival (obtained through a Cox model) by rank of recurrence and by treatment group. With the following code (corresponding to a model with a global effect of the treatment, rx), I get no error and obtain what I want:

data(bladder)
model1 <- coxph(Surv(stop, event) ~ rx + strata(enum) + cluster(id), data = bladder)
data1 <- data.frame(rx = 1)
survfit(model1, newdata = data1)

But with the strata-by-treatment interaction model (corresponding to a model with an effect of the treatment by rank), I get the following error:

model2 <- coxph(Surv(stop, event) ~ rx * strata(enum) + cluster(id), data = bladder)
data1 <- data.frame(rx = 1)
survfit(model2, newdata = data1)
Error in strata(enum) : object 'enum' not found

Would you have any idea to help me? Thanks in advance, Eva
[R] My First Attempt at Screen Scraping with R
Hello Folks, I'm working on scraping my first web site and ran into an issue because I really don't know anything about regular expressions in R.

library(XML)
library(RCurl)
site <- "http://thisorthat.com/leader/month"
site.doc <- htmlParse(site, ?, xmlValue)

At the ?, I realize that I need to insert a regex command which will decipher the contents of the web page... right? First, I'm not sure whether the contents of the site would be considered a table, and I'm also not sure how to disregard pictures when scraping the site.

sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i686-pc-linux-gnu (32-bit)

Please help! Abraham
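Regular expressions are usually not needed here: once htmlParse() builds a DOM tree, an XPath query selects the elements, and images are simply never matched. A sketch against a small in-memory fragment (the div class is made up; for the real site you would first fetch the page with getURL(site)):

```r
library(XML)

# A stand-in for the fetched page; in practice: htmlParse(getURL(site))
html <- "<html><body>
           <div class='leader'>alice</div>
           <div class='leader'>bob</div>
           <img src='photo.png'/>
         </body></html>"
doc <- htmlParse(html, asText = TRUE)

# XPath pulls just the nodes of interest - no regex required
leaders <- xpathSApply(doc, "//div[@class='leader']", xmlValue)
leaders
# [1] "alice" "bob"

# If the content is a real HTML <table>, readHTMLTable(doc) is simpler
```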
Re: [R] Sweave: no eps fig
On May 6, 2011, at 6:00 PM, mk90...@gmx.de wrote: Hi everyone, I've been using R, LaTeX, and Sweave for some years now, but today it confuses me a lot: running Sweave produces only figures in .pdf format, no .eps figures. The chunk header looks like this: <<echo=TRUE, fig=TRUE, label=Fig1>>= There was no error message. Does anybody have an idea? Any changes in the Sweave package? Or a missing driver or something like that? I recently switched to another OS (Fedora) - might that be the reason? Grateful for any suggestions, Lena

Starting with R 2.13.0, the default in Sweave is to produce only PDF and not EPS. You really only need EPS plots if you are doing postscript magic (e.g. PSTricks, which I use) and need to use latex rather than pdflatex. From news():

o The default for Sweave() is to produce only PDF figures (rather than both EPS and PDF).

Always a good idea to read news() when running a new version... :-) See ?RweaveLatex. You can either set per chunk:

<<echo=TRUE, fig=TRUE, label=Fig1, eps=TRUE>>=

or globally:

\SweaveOpts{eps=TRUE}

HTH, Marc Schwartz
Re: [R] convert count data to binary data
On May 6, 2011, at 3:15 PM, Christopher G Oakley wrote: Is there a way to generate a new dataframe that produces x lines based on the contents of a column? For example, I would like to generate a new dataframe with 70 lines of data[1, 1:3], 67 lines of data[2, 1:3], 75 lines of data[3, 1:3], and so on up to numrow = sum(count). Thanks, Chris

# Better not to use 'data' as the name of an R object, to avoid
# confusion with functions where 'data' is the name of an argument,
# such as regression models. R is generally smart enough to know the
# difference, but it can make reading code confusing.

DF
  pop fam yesorno count
1   1 126       1    70
2   1 127       1    67
3   1 128       1    75
4   1 126       0    20
5   1 127       0    23
6   1 128       0    15

Use rep() to generate a vector of repeated indices (?rep):

rep(1:nrow(DF), DF$count)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [34] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [67] 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[100] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[133] 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[166] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[199] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
[232] 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6
[265] 6 6 6 6 6 6

table(rep(1:nrow(DF), DF$count))
 1  2  3  4  5  6
70 67 75 20 23 15

Now use that vector as the row index:

DF.New <- DF[rep(1:nrow(DF), DF$count), 1:3]
str(DF.New)
'data.frame': 270 obs. of 3 variables:
 $ pop    : int 1 1 1 1 1 1 1 1 1 1 ...
 $ fam    : int 126 126 126 126 126 126 126 126 126 126 ...
 $ yesorno: int 1 1 1 1 1 1 1 1 1 1 ...
with(DF.New, table(fam, yesorno))
     yesorno
fam    0  1
  126 20 70
  127 23 67
  128 15 75

If you need something more generalized to handle generating 'raw' data of various types from a contingency table, search the list archives for the function expand.dft, which I posted a few years ago; I think it found its way into a couple of CRAN packages. HTH, Marc Schwartz
Re: [R] Generalized Hyperbolic distribution
On May 6, 2011, at 5:17 PM, claire wrote: How can I use the generalized hyperbolic distribution package to estimate the four parameters of the NIG distribution? I have stock-return data to which I want to fit the parameters.

On StackOverflow you have already been told: "The ghyp package has functions fit.NIGuv (for univariate data) and fit.NIGmv (for multivariate data), and it's all very clearly described in the doc for the package. Did you look at it or try it out?" - Prasad Chalasani

Why are you now cross-posting this question here? -- David Winsemius, MD West Hartford, CT
Re: [R] coxph and survfit issue - strata
On May 6, 2011, at 6:22 PM, Eva Bouguen wrote: Dear users, In a study with recurrent events, my objective is to get estimates of survival (obtained through a Cox model) by rank of recurrence and by treatment group. With the following code (corresponding to a model with a global effect of the treatment, rx), I get no error and obtain what I want:

data(bladder)
model1 <- coxph(Surv(stop, event) ~ rx + strata(enum) + cluster(id), data = bladder)
data1 <- data.frame(rx = 1)
survfit(model1, newdata = data1)

But with the strata-by-treatment interaction model (corresponding to a model with an effect of the treatment by rank), I get the following error:

model2 <- coxph(Surv(stop, event) ~ rx * strata(enum) + cluster(id), data = bladder)
data1 <- data.frame(rx = 1)
survfit(model2, newdata = data1)
Error in strata(enum) : object 'enum' not found

Would you have any idea to help me? Thanks in advance, Eva

Right. That looks like a perfectly clear error message to me: you created a newdata data frame that did not have all the necessary columns that you used on the RHS of your formula. -- David Winsemius, MD West Hartford, CT
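The fix David implies is to include the stratum variable in newdata as well - a sketch, one row per rank of recurrence (survival ships with R, though I have not matched this against Eva's exact session):

```r
library(survival)
data(bladder)

# Treatment-by-stratum interaction model, as in the question
model2 <- coxph(Surv(stop, event) ~ rx * strata(enum) + cluster(id),
                data = bladder)

# newdata must supply every RHS variable, including 'enum'
data1 <- data.frame(rx = 1, enum = 1:4)
sf <- survfit(model2, newdata = data1)
```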