Re: [R] Serverless databases in R
On Sun, Apr 18, 2010 at 11:30 PM, kMan kchambe...@gmail.com wrote:

> It was my understanding that .Rdata files were not very portable, and do not natively handle queries. Otherwise we'd all just use .RData files instead of farming the work out to SQL drivers external libraries, and colleagues who use, e.g. SAS or SPSS would also have no trouble with them.

The platform in cross-platform to me generally means the operating system on which a program is running - and .Rdata files are perfectly portable between R on Linux, MacOSX, Windows, Solaris etc versions. You didn't mention portability to other statistical packages. You also didn't mention needing SQL, or what you wanted to do with your databases. I figured I'd just mention .Rdata files for completeness!

There's also RJDBC and RODBC which can interface to anything with a JDBC or ODBC interface on your system.

A .RData file could be considered as a serverless NoSQL database. There's a GSOC proposal to investigate interfaces to NoSQL databases and some info here: http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:nosql_interface

Isn't it odd that the open-source R community has developed functions for reading in proprietary SAS and SPSS format files, but (AFAIK) the commercial sector doesn't seem to support reading data from open-sourced and open-specced R .Rdata files?

Barry

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Scanning only specific columns into R from a VERY large file
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Josh B
Sent: Saturday, April 17, 2010 0:12
To: R Help
Subject: [R] Scanning only specific columns into R from a VERY large file

Hi, I turn to you, the R Sages, once again for help. You've never let me down!

(1) Please make the following toy files:

    x <- read.table(textConnection("var.1 var.2 var.3 var.1000
    indv.1 1 5 9 7
    indv.21 2 9 3 8"), header = TRUE)
    y <- read.table(textConnection("var.3 var.1000"), header = TRUE)
    write.csv(x, file = "x.csv")
    write.csv(y, file = "y.csv")

(2) Pretend you are starting with the files x.csv and y.csv. They come from another source -- an online database. Pretend that these files are much, much, much larger. Specifically:
(a) Pretend that x.csv contains 1000 columns by 210,000 rows.
(b) y.csv contains just header titles. Pretend that there are 90 header titles in y.csv in total. These header titles are a subset of the header titles in x.csv.

(3) What I want to do is scan (or import, or whatever the appropriate word is) only a subset of the columns from x.csv into R. Specifically, I only want to scan the columns of data from x.csv into R that are indicated in the file y.csv. I still want to scan in all 21 rows from x.csv, but only for the aforementioned columns listed in y.csv.

Can you guys recommend a strategy for me? I think I need to use the scan command, based on the hugeness of x.csv, but I don't know what exactly to do. Specific code that gets the job done would be the most useful.

Thank you very much in advance!
Josh

---

Try with something like

    do.call(cbind, scan(file="yourfile.csv", what=list(NULL,NULL,,0,NULL,0,NULL,NULL,...,NULL), flush=TRUE))

You have to work out how to set up the list for the parameter 'what' from the headers in 'y'. In the above, the only columns read are those indicated by a '0'.

HTH
Ruben

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia) SPAIN
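Editor's sketch, not part of the thread: the same idea (skip unwanted columns while reading) can also be expressed with the colClasses argument of read.csv, where "NULL" drops a column. The file, the toy data and the vector `wanted` are stand-ins made up for illustration; in practice `wanted` would hold the header names read from y.csv.

```r
# Toy stand-in for x.csv, written to a temporary file
tmp <- tempfile(fileext = ".csv")
x <- data.frame(var.1 = c(1, 2), var.2 = c(5, 9), var.3 = c(9, 3),
                var.1000 = c(7, 8), row.names = c("indv.1", "indv.2"))
write.csv(x, file = tmp)

# Assumed: the column names to keep (would come from y.csv)
wanted <- c("var.3", "var.1000")

# Read one row only, just to learn the column names of the big file
hdr <- names(read.csv(tmp, nrows = 1))

# "NULL" skips a column; NA lets read.csv work out the type itself
cls <- rep("NULL", length(hdr))
cls[hdr %in% wanted] <- NA

sub <- read.csv(tmp, colClasses = cls)
sub
```

Skipped columns are never converted or stored, which is what keeps memory use down on a wide file; the row names column is dropped here as well because it is not listed in `wanted`.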
[R] Odp: multiple variables pointing to single dataframe?
Hi

r-help-boun...@r-project.org wrote on 16.04.2010 16:15:40:

> Hi, I have a need to have 2 variables point to the same dataframe (d1), I

What does it mean to point to a data frame? It seems to me that this is something from C++. You can reference a data frame by $ or by square brackets with as many variables as you want. See ?"["

Regards
Petr

> don't want to simply copy the dataframe ( d2 <- d1 ) as my understanding is that this will create a second dataframe. Any suggestions on best practice here?
> Thank You,
> Alex Bryant
> Software Developer
> Integrated Clinical Systems, Inc.
> 908-996-7208
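Editor's sketch, not part of the thread: R has no pointers to data frames, but an environment is not copied on assignment, so it can serve as a shared container. The names `store` and `alias` and the toy data are invented for illustration.

```r
d1 <- data.frame(id = 1:3, value = c(10, 20, 30))

# Plain assignment: d2 behaves as an independent copy once it is modified
d2 <- d1
d2$value[1] <- 99
d1$value[1]            # d1 is unchanged

# Environments have reference semantics: two names, one object
store <- new.env()
store$df <- d1
alias <- store         # 'alias' and 'store' are the same environment
alias$df$value[1] <- -1
store$df$value[1]      # the change is visible through both names
```

Note that `d2 <- d1` does not duplicate the data immediately; R copies only when one of the two is modified, so a plain assignment is often cheap enough if the second name is read-only.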
[R] Truncated Normal Distribution and Truncated Pareto distribution
Dear R helpers,

I have a bimodal dataset dealing with loss amounts. I have divided this dataset into two: the first, dataset-A, has bounds of 5,000$ to 100,000$, and dataset-B deals with the losses exceeding 100,000$, i.e. dataset-B is left truncated.

I need to fit a truncated normal distribution to dataset-A, with a lower bound of 5,000 and an upper bound of 100,000, and a truncated Pareto to the losses exceeding 100,000$.

Is there any package in R which will guide me to fit these two distributions, also giving KS (Kolmogorov-Smirnov) test and Anderson-Darling test results?

Please guide
Julia

Only a man of Worth sees Worth in other men
Re: [R] Truncated Normal Distribution and Truncated Pareto distribution
The truncreg package fits the truncated normal model.

On Monday 19 April 2010 at 00:21 -0700, Julia Cains wrote:

> Dear R helpers, I have a bimodal dataset dealing with loss amounts. I have divided this dataset into two: the first, dataset-A, has bounds of 5,000$ to 100,000$, and dataset-B deals with the losses exceeding 100,000$, i.e. dataset-B is left truncated.
> I need to fit a truncated normal distribution to dataset-A, with a lower bound of 5,000 and an upper bound of 100,000, and a truncated Pareto to the losses exceeding 100,000$.
> Is there any package in R which will guide me to fit these two distributions, also giving KS (Kolmogorov-Smirnov) test and Anderson-Darling test results?
> Please guide
> Julia
> Only a man of Worth sees Worth in other men
Re: [R] Truncated Normal Distribution and Truncated Pareto distribution
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Julia Cains
Sent: Monday, April 19, 2010 9:22
To: r-help@r-project.org
Subject: [R] Truncated Normal Distribution and Truncated Pareto distribution

> Dear R helpers, I have a bimodal dataset dealing with loss amounts. I have divided this dataset into two: the first, dataset-A, has bounds of 5,000$ to 100,000$, and dataset-B deals with the losses exceeding 100,000$, i.e. dataset-B is left truncated.
> I need to fit a truncated normal distribution to dataset-A, with a lower bound of 5,000 and an upper bound of 100,000, and a truncated Pareto to the losses exceeding 100,000$.
> Is there any package in R which will guide me to fit these two distributions, also giving KS (Kolmogorov-Smirnov) test and Anderson-Darling test results?
> Please guide
> Julia
> Only a man of Worth sees Worth in other men

---

See

    library(MASS)
    ?fitdistr

You can define your customized truncated density as a function and pass it in the parameter densfun of fitdistr. See also
http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg34540.html
http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg34548.html

HTH

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia) SPAIN
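Editor's sketch, not part of the thread: one way the fitdistr suggestion might look for the truncated normal case. Everything here is an assumption made for illustration: the losses are simulated, they are expressed in thousands of dollars so the bounds become 5 and 100, and the helper names `dtnorm` and `ptnorm` are invented. The standard deviation is optimised on the log scale so the optimiser cannot step to a negative value.

```r
library(MASS)

set.seed(1)
lo <- 5; hi <- 100                        # truncation bounds, in thousands
z <- rnorm(5000, mean = 50, sd = 20)
loss <- z[z >= lo & z <= hi]              # simulated truncated sample

# Truncated normal density on [lo, hi]; 'logsd' is log(sd)
dtnorm <- function(x, mean, logsd) {
  sd <- exp(logsd)
  dnorm(x, mean, sd) / (pnorm(hi, mean, sd) - pnorm(lo, mean, sd))
}

fit <- fitdistr(loss, densfun = dtnorm,
                start = list(mean = mean(loss), logsd = log(sd(loss))))
est <- c(mean = fit$estimate[["mean"]], sd = exp(fit$estimate[["logsd"]]))
est

# Kolmogorov-Smirnov test against the fitted truncated normal CDF
ptnorm <- function(q, mean, sd) {
  (pnorm(q, mean, sd) - pnorm(lo, mean, sd)) /
    (pnorm(hi, mean, sd) - pnorm(lo, mean, sd))
}
ks <- ks.test(loss, ptnorm, mean = est[["mean"]], sd = est[["sd"]])
ks$p.value
```

Because the parameters are estimated from the same data, the KS p-value is only approximate and tends to be too large. The truncated Pareto part of the question would need its own density written along the same lines.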
[R] R's unfortunate treatment of X11 failures
Hi,

I use R in a terminal environment on a linux box, using X11 for graphics, often tunnelled to the terminal I'm using. When an internet connection dies (say, if the ssh connection dies when forwarding), R usually gives a scary warning like

    Error: X11 fatal IO error: please save work and shut down R.

That's all well and good. Then, if I ignore it and try to plot() something, R dies quite ungracefully, producing (e.g.) something like this:

    R: ../../src/xcb_io.c:385: _XAllocID: Assertion `ret != inval_id' failed.
    Aborted

...and returning me to the shell. If I have not saved, I have lost my work. Clearly, this is my fault--I ignored the warning. However, I just had the above happen to me without my being warned...I lost some data, but not a lot. I am working on creating a reproducible case currently.

That said, in the meantime, I would suggest a broader fix. I'm sure we can agree that R aborting due to failed assertions like this is pretty unfortunate. In the case of X11 failures like this, however, there is an alternative: print an error message and take a preventative action.

    Error: X11 fatal IO error: X11 device disabled.

...and then call dev.off(). Indeed, if I just run dev.off() when I see this error, nothing about R seems corrupt at all. I can start a new X11 window with X11("localhost:10") and things (such as plotting) function fine.

Cordially,
Adam
[R] BRugs
Hi. I am new here, and I am writing this Winbugs code with BRugs.

    n=length(bi.bmi)
    Lagegp=13
    Lgen=2
    Lrace=5
    Lstra=15
    Lpsu=2

    #model gen x race
    bi.bmi.model=function(){
      # likelihood
      for (i in 1:n){
        bi.bmi[i]~ dbern(p[i])
        logit(p[i])<- a0 + a1[agegp[i]]+a2[gen[i]]+a3[race[i]] + a12[agegp[i], gen[i]] + gam[stra[i]]+ u[psu[i],stra[i]]
      }
      # constraints for a1, a2, a3, a12
      a1[1]<-0.0
      a2[1]<-0.0
      a3[1]<-0.0
      a12[1,1]<-0.0
      #
      for(k in 2:Lgen){ a12[1,k]<-0.0}
      for(j in 2:13){ a12[j,1]<-0.0}
      # priors
      a0~ dnorm(0.0, 1.0E-4)
      for(i in 2:13){a1[i]~dnorm(0.0, 1.0E-4)}
      for(j in 2:Lgen){ a2[j]~ dnorm(0.0, 1.0E-4)}
      for(k in 2:Lrace){ a3[k]~ dnorm(0.0, 1.0E-4)}
      for(i in 2:Lagegp){
        for(j in 2:Lgen){
          a12[i,j]~ dnorm(0.0, 1.0E-4)
        }}
      for(i in 1:Lstra){gam[i]~dunif(0, 1000)}
      for( i in 1:Lpsu){
        for(j in 1:Lstra){
          u[i,j]~ dnorm(0.0, tau.u)
        }}
      tau.u<-pow(sigma.u, -2)
      sigma.u~ dunif(0.0,100)
    }

    library(BRugs)
    writeModel(bi.bmi.model, con='bi.bmi.model2.txt')
    bi.bmi.model.data=list('n', 'Lagegp','Lgen', 'Lrace', 'Lstra', 'Lpsu', 'stra', 'psu','bi.bmi','agegp', 'gen', 'race')
    bi.bmi.model.init=function(){
      list( sigma.u=runif(1), a0<-rnorm(1), a1<-c(NA,rep(0, 12)), a2<-c(NA, rep(0, Lgen-1)), a3<-c(NA, rep(0, Lrace-1)),
            a12<-matrix( c(rep(NA, 13), NA,rep(0, 12)), ncol=2), gam<-rep(1,Lstra), u<-matrix(rep(0, 30), nrow=2))
    }
    bi.bmi.model.parameters=c( 'a0', 'a1', 'a2', 'a3', 'a12')
    bi.bmi.model.bugs=BRugsFit(modelFile='bi.bmi.model2.txt', data=bi.bmi.model.data, inits=bi.bmi.model.init,
                               numChains=1, para=bi.bmi.model.parameters, nBurnin=20, nIter=40)

When I run this I get this message.

    model is syntactically correct
    data loaded
    array index is greater than array upper bound for a1
    [1] C:\\DOCUME~1\\Owner\\LOCALS~1\\Temp\\RtmpNvSdyb/inits1.txt
    Initializing chain 1: model must be compiled before initial values loaded
    model must be initialized before updating
    model must be initialized before DIC an be monitored
    Error in samplesSet(parametersToSave) : model must be initialized before monitors used

I checked the main effects alone, and it works fine, so I don't really understand why it's saying, array index is greater than array upper bound for a1. Anyone who could help me with this would be greatly appreciated. Thanks.
[R] Kaplan-Meier survfit problem
When I try to run the code from library(survival) or library(ISwR), the following code, which should produce Kaplan-Meier estimates,

    survfit(Surv(days,status==1))

shows the following error:

    Error in survfit(Surv(days, status == 1)) : Survfit requires a formula or a coxph fit as the first argument

How can it be done in R 2.10?
Re: [R] Kaplan-Meier survfit problem
you need:

    survfit(Surv(days, status == 1) ~ 1)

I hope it helps.

Best,
Dimitris

On 4/19/2010 4:44 AM, ericyujin99 wrote:
> When I try to the code from library(survival) of library(ISwR), the following code survfit(Surv(days,status==1)) that could produce Kaplan-Meier estimates shows the following error
> Error in survfit(Surv(days, status == 1)) : Survfit requires a formula or a coxph fit as the first argument
> How it can be done in R.2.10

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
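Editor's sketch, not part of the thread: the formula form on a data set that ships with the survival package. The `aml` data and its columns `time` and `status` stand in for the poster's `days` and `status`, which are not shown in the thread.

```r
library(survival)

# "~ 1" asks for a single overall Kaplan-Meier curve (no grouping variable)
fit <- survfit(Surv(time, status == 1) ~ 1, data = aml)

summary(fit)$surv   # Kaplan-Meier estimates at the observed event times
```

Replacing `1` on the right-hand side with a grouping variable (for `aml`, the column `x`) would give one curve per group.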
Re: [R] xtabs() of proportions, and naming a dimension (not a row)
Thanks a lot, David and Dennis! Also, your suggestions for how I could have better stated my question are duly noted, and appreciated.
[R] Natural cubic splines produced by smooth.Pspline and predict function in the package pspline
Hello,

I am using R and the smooth.Pspline function in the pspline package to smooth some data by using natural cubic splines. After fitting a sufficiently smooth spline using the following call:

    (ps=smooth.Pspline(x,y,norder=2,spar=0.8,method=1))

[the values of x are age in years from 1 to 100]

I tried to check that R in fact had fitted a natural cubic spline by checking that the resulting spline was LINEAR outside the knots. I did this by plotting the predicted values from the spline fitting in the following way:

    plot(predict(ps,c(seq(100,150,1))))

Unfortunately, the trend beyond the region of knots (i.e. over x values of 100) was far from linear - it was some sort of exponentially increasing trend.

My understanding of natural cubic splines (from Green and Silverman - Nonparametric regression and generalized linear models, 1994) is that a natural cubic spline is a series of cubic polynomials joined at a set of knots in such a way that first and second derivatives are equal at all knots. Furthermore, a natural cubic spline has a knot at every data point, and is LINEAR on the range outside of its knots.

This leads me to the question of what smooth.Pspline actually does to the data and whether it actually fits a natural cubic spline as stated in the help file? Or does smooth.Pspline work as I expect it, but I am using predict incorrectly?

I thank you for your time.
Szymon
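Editor's sketch, not part of the thread: the linearity check described above, applied to an interpolating natural cubic spline from base R (splinefun in the stats package). This is not the pspline package and not a smoothing spline, so it only illustrates the check itself and the behaviour a natural spline should show beyond its knots. The data are invented.

```r
x <- 1:100
y <- log(x) + sin(x / 10)

# Interpolating natural cubic spline (stats), used here as a reference case
f <- splinefun(x, y, method = "natural")

# On an equally spaced grid a straight line has zero second differences
outside <- f(seq(100, 150, by = 1))
curvature <- max(abs(diff(outside, differences = 2)))
curvature
```

Running the same second-difference check on the output of predict() for the smooth.Pspline fit would put a number on how far its extrapolation is from linear.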
[R] dataframe
Hi all,

I'm trying to load a csv file in which all the variables must be of type number. The object is a dataframe. When I load the file, what I get is a dataframe in which the variables are of type factor. How can I get variables of type number?

Thanks all
Re: [R] Unwanted boxes in legend
Dear all,

Thanks for the response, however I'm getting the following error message when I execute the legend command using the 'border' argument:

    Error in legend(10, par("usr")[4], c("A", "B", : unused argument(s) (border = FALSE)

Is anyone aware of any alternative means of switching off boxes around all but one of the elements in a legend?

Many thanks for any input,
Steve

> Date: Thu, 15 Apr 2010 12:13:40 -0600
> From: ehl...@ucalgary.ca
> To: smurray...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Unwanted boxes in legend
>
> On 2010-04-15 11:10, Steve Murray wrote:
>> Dear all, I am using the following code to generate a legend in my plot (consisting of both bars and points), but end up with boxes around my points:
>>
>>     legend(10, par("usr")[4], c("A", "B", "C", "D"), fill=c(NA, NA, "grey28", NA), pch=c(16,4,NA,18), col=c("red","blue","grey28","yellow"), lty=FALSE, bty="n", horiz=FALSE)
>>
>> I want a box around the third element of the legend (to represent the bar 'fill' colour), but not for the others, where points are shown instead. What am I doing wrong above and how do I correct it?
>
> Add the 'border' argument: either
>
>     border = FALSE # in which case no box is drawn for any element
>
> or
>
>     border = c(NA, NA, "black", NA)
>
> -Peter Ehlers
>
>> Many thanks, Steve
>
> --
> Peter Ehlers
> University of Calgary
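Editor's sketch, not part of the thread: a possible workaround when the installed legend() does not accept 'border'. Instead of 'fill', the bar entry is drawn as a filled square plotting symbol (pch = 22) whose interior colour comes from pt.bg. The plot, the sizes and the null pdf device are assumptions chosen so the example runs without a display.

```r
pdf(NULL)                        # null device: nothing is written to disk
plot(1:20, type = "n")

lg <- legend(10, par("usr")[4], c("A", "B", "C", "D"),
             pch    = c(16, 4, 22, 18),              # 22 = square with fill
             col    = c("red", "blue", "black", "yellow"),
             pt.bg  = c(NA, NA, "grey28", NA),       # only used by pch 22
             pt.cex = c(1, 1, 2, 1),                 # enlarge the square a little
             bty    = "n")
invisible(dev.off())
```

Only the third entry gets a box, because only its symbol is a square; the other entries remain plain points.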
Re: [R] BRugs
On 19 April 2010 06:04, flipha23 neungsoo...@gmail.com wrote:

> Hi. I am new here, and I am writing this Winbugs code with BRugs.
> <snip>
> When I run this I get this message.
>
>     model is syntactically correct
>     data loaded
>     array index is greater than array upper bound for a1
>     [1] C:\\DOCUME~1\\Owner\\LOCALS~1\\Temp\\RtmpNvSdyb/inits1.txt
>     Initializing chain 1: model must be compiled before initial values loaded
>     model must be initialized before updating
>     model must be initialized before DIC an be monitored
>     Error in samplesSet(parametersToSave) : model must be initialized before monitors used
>
> I checked the main effects alone, and it works fine, so I don't really understand why it's saying, array index is greater than array upper bound for a1. Anyone who could help me with this would be greatly appreciated. Thanks.

I wonder if it meant "array index is greater than array upper bound for a12", and the 2 got chopped off. Anyway, try to compile without your arrays (u and a12), and see which one causes the trap. After that, you might have to do some digging to find the problem. Sometimes you have to check the data files that are saved, to make sure they contain what you want.

BTW, you have some loops written as 2:13. It might be better to write them as 2:Lagegp, just for consistency.

Bob

--
Bob O'Hara
Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main, Germany
Tel: +49 69 798 40216
Mobile: +49 1515 888 5440
WWW: http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh
Google Wave: rni@googlewave.com
Journal of Negative Results - EEB: www.jnr-eeb.org
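Editor's sketch, not part of the thread: one way to do the digging Bob suggests from the R side, by comparing the largest value of each index variable with the array size declared for it. The data below are invented; in the thread the index variables are agegp, gen, race, stra and psu, and the declared sizes are Lagegp, Lgen and so on.

```r
# Hypothetical index vectors (stand-ins for the survey data)
idx <- list(agegp = c(1, 5, 13, 14),
            gen   = c(1, 2, 2, 1))

# Declared array sizes, as passed to the BUGS model
dims <- c(agegp = 13, gen = 2)

# TRUE marks an index variable whose values run past its array bound
too_big <- sapply(names(idx), function(v) max(idx[[v]]) > dims[[v]])
too_big
```

A TRUE for any variable would produce exactly the kind of "array index is greater than array upper bound" message, reported against the array that the variable indexes.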
Re: [R] Serverless databases in R
Interesting info, Barry. Thank you.

Sorry about the vagueness. I was concerned about restricting responses by imposing SQL terminology on options I've overlooked. I got .RData and workspace images mixed up. I do not know about the private sector's willingness to read open-source; I was guessing (and hoping to be corrected otherwise) that they might offer importing from common dbm/rdbms.

So RSQLite and RH2 provide functionality for serverless dbs, and Gabor's sqldf sounds like a sweet way to interact with those. The NoSQL (non-relational dbm) projects sound interesting and permit queries. A case can be made for .RData files to be NoSQL and queryless serverless databases. Perhaps I could use sqldf to pull off queries on .RData files?

I thought ODBC was windows specific? Should XML be in the NoSQL project list, or is it considered too far along?

Sincerely,
KeithC.

-----Original Message-----
From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On Behalf Of Barry Rowlingson
Sent: Monday, April 19, 2010 12:33 AM
To: kMan
Cc: r-help@r-project.org
Subject: Re: [R] Serverless databases in R

On Sun, Apr 18, 2010 at 11:30 PM, kMan kchambe...@gmail.com wrote:

> It was my understanding that .Rdata files were not very portable, and do not natively handle queries. Otherwise we'd all just use .RData files instead of farming the work out to SQL drivers external libraries, and colleagues who use, e.g. SAS or SPSS would also have no trouble with them.

The platform in cross-platform to me generally means the operating system on which a program is running - and .Rdata files are perfectly portable between R on Linux, MacOSX, Windows, Solaris etc versions. You didn't mention portability to other statistical packages. You also didn't mention needing SQL, or what you wanted to do with your databases. I figured I'd just mention .Rdata files for completeness!

There's also RJDBC and RODBC which can interface to anything with a JDBC or ODBC interface on your system.

A .RData file could be considered as a serverless NoSQL database. There's a GSOC proposal to investigate interfaces to NoSQL databases and some info here: http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:nosql_interface

Isn't it odd that the open-source R community has developed functions for reading in proprietary SAS and SPSS format files, but (AFAIK) the commercial sector doesn't seem to support reading data from open-sourced and open-specced R .Rdata files?

Barry
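Editor's sketch, not part of the thread: what treating a .RData file as a small serverless store might look like with base R only. The two tables, their columns and the environment name `db` are invented for illustration.

```r
path <- tempfile(fileext = ".RData")

patients <- data.frame(id = 1:4, age = c(34, 51, 29, 62))
visits   <- data.frame(id = c(1, 1, 3), cost = c(120, 80, 45))
save(patients, visits, file = path)       # several objects in one file

# Load into a private environment instead of the workspace
db <- new.env()
load(path, envir = db)
ls(db)                                    # the stored "tables"

# A "query" is then ordinary R subsetting on a stored object
older <- subset(db$patients, age > 40)
older
```

The whole file is read into memory by load(), so unlike a real database there is no way to fetch only the rows that match; that is the sense in which it is queryless.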
[R] Equivalent to Python os.walk?
Hi,

I would like to recursively loop through all subfolders of a directory and do stuff with certain file types in those dirs. Is there a package/function that could do this? So it's more than Sys.glob. I'm looking for an equivalent of Python's os.walk *) and I don't want to reinvent the wheel.

Thank you.

Cheers!!
Albert-Jan

*) http://www.saltycrane.com/blog/2007/03/python-oswalk-example/

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~
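Editor's sketch, not part of the thread: base R's list.files() with recursive = TRUE covers much of what os.walk is used for. The directory layout is created in a temporary folder purely for illustration, and grouping by directory with split() is one possible way to mimic the per-directory iteration.

```r
# Build a small throwaway directory tree
root <- file.path(tempdir(), "walk_demo")
dir.create(file.path(root, "a", "b"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(root, "top.csv"),
                      file.path(root, "a", "mid.csv"),
                      file.path(root, "a", "b", "deep.txt")))

# All .csv files below 'root', at any depth
csv_files <- list.files(root, pattern = "\\.csv$",
                        recursive = TRUE, full.names = TRUE)

# Group file names by containing directory, roughly one os.walk step each
by_dir <- split(basename(csv_files), dirname(csv_files))
for (d in names(by_dir)) {
  cat(d, ":", by_dir[[d]], "\n")
}
```

The `pattern` argument is a regular expression, not a shell glob, hence the escaped dot and the `$` anchor.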
[R] Formatting data, adding column names, use reshape, a newbie question
Hi all,

I'm an R novice. I have data that's already formatted as molten that reshape should be able to work with. For example, the following was read in with read.csv(filename,sep= , header=FALSE)

    V1 V2 V3 V4 V5
    1originalbookbook.source1.txt3289004943039.525
    2originalbookbook.source1.txt3289004943057.952

I would like to add column names so I can use reshape's cast method. How do I go about that?

Thanks,
Paul
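Editor's sketch, not part of the thread: two ways of giving names to the V1..V5 columns. The sample text, the space separator and all five column names are assumptions, since the original data layout did not survive in the archive.

```r
# Hypothetical stand-in for the file in the post
txt <- "original book book.source1.txt 10 3039.525
original book book.source1.txt 11 3057.952"

# Option 1: rename after reading
df <- read.csv(text = txt, sep = " ", header = FALSE)
names(df) <- c("version", "kind", "source", "step", "value")

# Option 2: name the columns while reading
df2 <- read.csv(text = txt, sep = " ", header = FALSE,
                col.names = c("version", "kind", "source", "step", "value"))
df2
```

For cast() in the reshape package, the measured column is conventionally called `value` and the column to spread over `variable`, so those may be the most useful names to assign.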
[R] Odp: dataframe
Hi

r-help-boun...@r-project.org wrote on 19.04.2010 10:12:59:

> Hi all, I'm trying to load a csv file in which all the variables must be of type number. The object is a dataframe. When i load the file what i get is a dataframe

You probably have non-numeric data in your original CSV. You can either correct it before reading the file, or try to force the colClasses parameter of whichever read.* function you use (you did not tell us which). Another option is to remove the non-numeric items and change the factors to numeric; see ?as.numeric and ?as.character

Regards
Petr

> in which the variables are of type factor. How can I get variables of type number???
> Thanks all
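Editor's sketch, not part of the thread: how one stray entry produces a factor column, and the as.character/as.numeric route Petr points to. The CSV text and the token "oops" are invented; stringsAsFactors = TRUE is set explicitly to reproduce the behaviour that was the default when this thread was written.

```r
csv <- "a,b
1,2
3,oops
5,6"

# One non-numeric entry turns the whole column into a factor
df <- read.csv(text = csv, stringsAsFactors = TRUE)
class(df$b)

# Convert via character; as.numeric() on the factor itself would
# return the internal level codes, not the printed values
b_num <- suppressWarnings(as.numeric(as.character(df$b)))
b_num                      # the non-numeric entry becomes NA

# Alternatively, declare the offending token as missing while reading
df2 <- read.csv(text = csv, na.strings = c("NA", "oops"))
class(df2$b)
```

Looking at levels(df$b) is a quick way to spot which entries are preventing the column from being read as numbers.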
Re: [R] how to use Excel VBA's Shell() to call and execute R file
Hi KZ,

I don't think that I can answer what I think is the precise question - how to run the R file from VBA but without using RExcel. However with RExcel installed, I have found it very straightforward to run R code from within VBA (thanks Erich Neuwirth - RExcel is great). The VBA code is:

    RInterface.StartRServer
    RInterface.RunRFile "C:\Excel_R_script.txt"
    RInterface.StopRServer

That is all that it takes. Excel_R_script.txt is just standard R code, in a text file.

Is there a reason you don't want to use RExcel?

Guy
Re: [R] glmer with non integer weights
hi emmanuel,

thanks a lot for your extensive answer. do you think using the asin(sqrt()) transformation can be justified for publishing purposes, or do i have to expect criticism? naively i excluded that possibility because of violated anova assumptions, but if i got you right, the finite range rather poses a problem here. why is it an advantage in this special case?

greetings,
kay
[R] comparing attitudes of 2 groups / likert scales?
Hi,

I have just found this forum, and it looks like a great place to get some help (I hope).

For my dissertation, which is due way too soon, I am doing a survey comparing the attitudes of 2 independent groups, with 5-point Likert questions. Basically I want to show whether they have similar or different attitudes. I am testing 4 hypotheses, and have in total about 20 questions.

I have to say my statistics skills are very basic and very rusty; we had some lectures two years ago, where we were introduced to R. I looked through my notes, and back then we did a one sample t-test to analyse Likert type questions. I believe I would need to do a 2 sample unpaired t-test.

It would be great if someone could give me some feedback on whether this test is the most suitable one for my purpose, and maybe could explain to me what's the easiest way to do this in R? You would help me loads!!

Many thanks in advance
Mona
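Editor's sketch, not part of the thread: what a two-sample comparison of one Likert item might look like in R. The answers are simulated and the group labels are invented. Both the unpaired t-test Mona mentions and a rank-based alternative are shown; which of the two is appropriate for her data is a judgment the thread does not settle.

```r
set.seed(42)

# Simulated answers (1..5) to one item for two independent groups of 30
group <- factor(rep(c("A", "B"), each = 30))
score <- c(sample(1:5, 30, replace = TRUE, prob = c(.1, .2, .3, .3, .1)),
           sample(1:5, 30, replace = TRUE, prob = c(.3, .3, .2, .1, .1)))

tt <- t.test(score ~ group)                      # Welch two-sample t-test
wt <- wilcox.test(score ~ group, exact = FALSE)  # rank-based, copes with ties

c(t.test = tt$p.value, wilcoxon = wt$p.value)
```

With about 20 questions, running one test per question raises the multiple-testing issue; p.adjust() in base R offers standard corrections.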
Re: [R] Serverless databases in R
Barry Rowlingson wrote:
> On Sun, Apr 18, 2010 at 11:30 PM, kMan kchambe...@gmail.com wrote:
>> It was my understanding that .Rdata files were not very portable, and do not natively handle queries. Otherwise we'd all just use .RData files instead of farming the work out to SQL drivers external libraries, and colleagues who use, e.g. SAS or SPSS would also have no trouble with them.
>
> The platform in cross-platform to me generally means the operating system on which a program is running - and .Rdata files are perfectly portable between R on Linux, MacOSX, Windows, Solaris etc versions. You didn't mention portability to other statistical packages. You also didn't mention needing SQL, or what you wanted to do with your databases. I figured I'd just mention .Rdata files for completeness!
>
> There's also RJDBC and RODBC which can interface to anything with a JDBC or ODBC interface on your system.
>
> A .RData file could be considered as a serverless NoSQL database. There's a GSOC proposal to investigate interfaces to NoSQL databases and some info here: http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:nosql_interface
>
> Isn't it odd that the open-source R community has developed functions for reading in proprietary SAS and SPSS format files, but (AFAIK) the commercial sector doesn't seem to support reading data from open-sourced and open-specced R .Rdata files?
>
> Barry

Hi Barry,

Stat Transfer can read and write R binary data frames (.rda files).

Frank

--
Frank E Harrell Jr
Professor and Chairman, School of Medicine
Department of Biostatistics
Vanderbilt University
[R] logit() etc {was Re: glmer with non integer weights}
EC == Emmanuel Charpentier charp...@bacbuc.dyndns.org on Sun, 18 Apr 2010 11:29:29 +0200 writes:

EC On Friday 16 April 2010 at 00:15 -0800, Kay Cichini wrote:

thanks thierry, i considered this transformations already, but variance is not stabilized and/or normality is neither achieved. i guess i'll have to look out for non-parametrics?

EC Or (maybe) a model based on a non-Gaussian likelihood? A beta distribution comes to mind, either fitted by maximum likelihood or (if relevant prior information is available) in a Bayesian framework?

EC But beware: you have a not-so-small problem ... Your data have zeroes and ones, which, if you have no information on a sample size, are sharp zeroes and ones, and are therefore theoretically bound to infinite linear predictors (in plain English: bloody unlikely). These values make a fixed effect analysis impossible: these points at infinity will make regression essentially impossible. Consider:

    logit <- function(x) log(x/(1-x))
    ilogit <- function(x) 1/(1+exp(-x))

Hmmm, and some CRAN packages even define these ...

Now, please, the help page ?Logistic has contained for a long time now

    Note: 'qlogis(p)' is the same as the well known 'logit' function, logit(p) = log(p/(1-p)), and 'plogis(x)' has consequently been called the 'inverse logit'.

So please note, and do use qlogis() and plogis() instead of logit() and ilogit() ... or if you really really must (e.g. for didactical reasons), use

    logit <- qlogis

Using the logistic functions directly may also remind you or your user that sometimes it will be advantageous to use the 'log.p=TRUE' or 'lower.tail=FALSE' "coordinate systems".

Martin

[...]
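To make the advice about qlogis() and plogis() concrete, here is a minimal sketch. The probabilities and the value 40 are arbitrary illustration values, not taken from the thread; only base R (stats) functions are used.

```r
# qlogis() is the logit, plogis() is its inverse
p <- c(0.1, 0.5, 0.9)
x <- qlogis(p)
all.equal(x, log(p / (1 - p)))   # same as the hand-written logit
all.equal(plogis(x), p)          # round trip back to the probabilities

# Why 'lower.tail' can matter: far in the upper tail the naive
# complement loses everything to rounding, the direct call does not.
naive  <- 1 - plogis(40)                  # plogis(40) rounds to exactly 1
direct <- plogis(40, lower.tail = FALSE)  # tiny but strictly positive

# 'log.p = TRUE' returns the log-probability directly
log_direct <- plogis(40, lower.tail = FALSE, log.p = TRUE)
```

The hand-written ilogit() has no equivalent of these two arguments, which is one practical reason to prefer the built-in functions.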
Re: [R] Serverless databases in R
On Mon, 19 Apr 2010, Frank E Harrell Jr wrote: Barry Rowlingson wrote: On Sun, Apr 18, 2010 at 11:30 PM, kMan kchambe...@gmail.com wrote: It was my understanding that .Rdata files were not very portable, and do not natively handle queries. Otherwise we'd all just use .RData files instead of farming the work out to SQL drivers external libraries, and colleagues who use, e.g. SAS or SPSS would also have no trouble with them. The platform in cross-platform to me generally means the operating system on which a program is running - and .Rdata files are perfectly portable between R on Linux, MacOSX, Windows, Solaris etc versions. You didn't mention portability to other statistical packages. You also didn't mention needing SQL, or what you wanted to do with your databases. I figured I'd just mention .Rdata files for completeness! There's also RJDBC and RODBC which can interface to anything with a JDBC or ODBC interface on your system. A .RData file could be considered as a serverless NoSQL database. There's a GSOC proposal to investigate interfaces to NoSQL databases and some info here: http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:nosql_interface Isn't it odd that the open-source R community has developed functions for reading in proprietary SAS and SPSS format files, but (AFAIK) the commercial sector doesn't seem to support reading data from open-sourced and open-specced R .Rdata files? Barry Hi Barry, Stat Transfer can read and write R binary data frames (.rda files). Yes, but that is a considerable restriction (and other programs can do similar things). I suspect it means 'data frames with columns from a prespecified small set of types' saved in an RDA2 gzipped binary xdr format. BTW, .rda and .RData are simply convenient file extensions: the first is more convenient in the Windows world. They are from one of a collection of many different formats, identified by the file 'magic' headers. I am not so sure about 'open-specced R .Rdata files'. 
In so far as there is a spec, I wrote it in 'R Internals' and it is not a full spec. Mainly because many of the details are only relevant to R itself, such as how you read environments and some of the details of the object headers. Had the RDA formats been written with the intent that they would be used other than to read all the objects they contain into R, they would have been structured differently with a lot more metadata. That has been noted for RDA3, but introducing such a format would be a major step and is not imminent. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
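The remark that .rda and .RData are only file extensions, with the actual format identified by a 'magic' header, can be checked from R itself. A minimal sketch, assuming an uncompressed save() so that the header is readable as plain bytes; the object and file names are made up for illustration:

```r
# A small data frame to store (illustrative data only)
df <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.0))

# The extension is arbitrary; compress = FALSE keeps the header readable
path <- tempfile(fileext = ".whatever")
save(df, file = path, compress = FALSE)

# The first five bytes identify the format (binary XDR, version 2 or 3)
magic <- rawToChar(readBin(path, what = "raw", n = 5))

# Reading the file back restores the object under its original name
env <- new.env()
load(path, envir = env)
identical(env$df, df)
```

Which version number appears in the header depends on the R release that wrote the file, so the sketch only assumes it is one of the XDR variants.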
[R] Writing methods for existing generic function
Dear All, Suppose I want to write a method for the generic function confint(): args(confint) function (object, parm, level = 0.95, ...) So, it looks like the second and third argument have been predefined in the generic function. Suppose one or several of the predefined arguments don't apply or fit (in some sense) with the design of the rest of the package. What should one do? I see several options: 1) Write a new generic function. 2) Rewrite the package (if possible) so that the arguments of the existing generic function do apply (in some sense). 3) Write a method for the existing generic function, adding the predefined arguments to the method call, but just ignore some/all of them in whatever is being done inside of the method function. Are there any other options? Is there some recommended practice for this? Thanks in advance for any feedback! Best, -- Wolfgang Viechtbauerhttp://www.wvbauer.com/ Department of Methodology and StatisticsTel: +31 (43) 388-2277 School for Public Health and Primary Care Office Location: Maastricht University, P.O. Box 616 Room B2.01 (second floor) 6200 MD Maastricht, The Netherlands Debyeplein 1 (Randwyck) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
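For reference, option 3 in the question could look roughly like the following. This is only a sketch under stated assumptions: the class "myfit", its fields est and se, and the choice to warn when 'parm' is supplied are all hypothetical, and the thread does not say which of the options is recommended practice.

```r
# A toy fitted object; class name and fields are hypothetical
fit <- structure(list(est = 1.2, se = 0.3), class = "myfit")

# The method keeps the generic's arguments (object, parm, level, ...)
# but does not use 'parm', since the toy object has a single estimate.
confint.myfit <- function(object, parm, level = 0.95, ...) {
  if (!missing(parm))
    warning("'parm' is ignored by confint.myfit()")
  z  <- qnorm(1 - (1 - level) / 2)
  ci <- cbind(object$est - z * object$se, object$est + z * object$se)
  dimnames(ci) <- list("est", c("lower", "upper"))
  ci
}

ci <- confint(fit)               # dispatches to confint.myfit()
ci90 <- confint(fit, level = 0.90)
```

Keeping the argument list compatible with the generic is what allows confint(fit) to dispatch without complaints from R CMD check about mismatched method signatures.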
[R] merge
I have a problem with the merge function. I have to merge two big data frames which look like the following example. The problem is that I get duplicated columns.

First data frame:

CODPROD  N1  N3  N4
     23   3  55   4
     24   5  67  36
     25   3  73  24

Second data frame:

CODPROD  N1  N2
     30  34  45
     45   0  78
     65   0  56

The result that I get looks like this:

CODPROD  N1  N2  N3  N4  N1.1
     23   3  NA  55   4     3
     24   5  NA  67  36     0
     25   3  NA  73  24     0
     30  34  45  NA  NA     0
     45   0  78  NA  NA     0
     65   0  56  NA  NA     0

So N1.1 is a duplication of N1. I think I could solve the problem by specifying the shared columns, but I have a lot of columns with the same names in the two data frames, so I think that is not the right way to solve it. Does anyone know how to avoid the duplication?
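One way to avoid typing the shared column names by hand is to compute them. A minimal sketch using the example data from the question; the object names a, b, common and m are made up here, and it assumes that every column with the same name in both data frames is meant to be a key:

```r
a <- data.frame(CODPROD = c(23, 24, 25),
                N1 = c(3, 5, 3),
                N3 = c(55, 67, 73),
                N4 = c(4, 36, 24))
b <- data.frame(CODPROD = c(30, 45, 65),
                N1 = c(34, 0, 0),
                N2 = c(45, 78, 56))

# All column names that occur in both data frames
common <- intersect(names(a), names(b))

# Merging on every shared column leaves nothing to be renamed
# with suffixes; all = TRUE keeps rows that appear in only one input.
m <- merge(a, b, by = common, all = TRUE)
names(m)
```

A caveat: if the same CODPROD occurred in both data frames with different N1 values, this would produce two rows for that product rather than one, so it only fits when the shared columns really agree.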
Re: [R] Equivalent to Python os.walk?
You can use 'list.files(..., recursive=TRUE)' to get a list of the file names and then process them sequentially. It all depends on the type of processing that you want to do. You can also write a recursive function to do the same thing; it only takes a couple of lines of code.

On Mon, Apr 19, 2010 at 5:25 AM, Albert-Jan Roskam fo...@yahoo.com wrote:

Hi, I would like to recursively loop through all subfolders of a directory and do stuff with certain file types in those dirs. Is there a package/function that could do this? So it's more than Sys.glob. I'm looking for an equivalent of Python's os.walk *) and I don't want to reinvent the wheel. Thank you.

Cheers!!
Albert-Jan

*) http://www.saltycrane.com/blog/2007/03/python-oswalk-example/

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
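The list.files() suggestion can be sketched as follows. The directory layout is created in a temporary folder purely for illustration, and the ".csv" pattern stands in for "certain file types"; grouping by directory is one way to approximate the per-folder view that os.walk gives.

```r
# Build a small throwaway directory tree (illustration only)
root <- tempfile("walk_demo")
dir.create(file.path(root, "a", "b"), recursive = TRUE)
file.create(file.path(root, "top.csv"),
            file.path(root, "a", "mid.txt"),
            file.path(root, "a", "b", "deep.csv"))

# All .csv files below root, at any depth, with their full paths
csv_files <- list.files(root, pattern = "\\.csv$",
                        recursive = TRUE, full.names = TRUE)

# Group the file names by the directory that contains them
by_dir <- split(basename(csv_files), dirname(csv_files))

# Process the files one at a time
sizes <- vapply(csv_files, function(f) file.info(f)$size, numeric(1))

unlink(root, recursive = TRUE)   # clean up the demo tree
```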
Re: [R] Interacting with dendrogram plots, locator() or click()
The two functions below will see if the current graphics device appears to contain a dendrogram plot, and if so will return the subtree when a user clicks on a node.

Note: the function name identify.dendrogram allows the dendrogram class to bind this function for identify(dnd).

If someone wants to incorporate this into the public dendrogram distribution, please feel free to do so.

David

Example:

# plot a dendrogram of US arrest data
# and print the subtree that a user selects
dnd = as.dendrogram(hclust(dist(USArrests)))
plot(dnd, horiz=TRUE)
dnd2 = identify(dnd)
str(dnd2)

identify.dendrogram = function(dnd) {
    #
    # Return the dendrogram corresponding to the node a user clicks on.
    #
    # First verify that it is a dendrogram plot corresponding to the
    # input dendrogram, and determine if horizontal or vertical
    #
    n = attributes(dnd)$members
    h = attributes(dnd)$height
    usr = par()$usr
    dx = usr[1] + usr[2]
    dy = usr[3] + usr[4]
    ok = FALSE
    horiz = FALSE
    if (abs(n + 1 - dx) < 0.001 && abs((h - dy)/dy) < 0.001) {
        ok = TRUE
        horiz = FALSE
    } else {
        if (abs(n + 1 - dy) < 0.001 && abs((h - dx)/dx) < 0.001) {
            ok = TRUE
            horiz = TRUE
        }
    }
    #
    # If the plot matches, call locator() for user input and match the node
    #
    if (ok) {
        crd = locator(1)
        if (is.null(crd)) {
            return(NULL)
        } else {
            return(find.node(dnd, 1, crd, horiz))
        }
    } else {
        warning("plot that does not correspond to the dendrogram in the call.")
        return(NULL)
    }
}

find.node = function(dnd, offset, crd, horiz) {
    #
    # find a node in a dendrogram matching the coordinates in crd
    # horiz is the plot orientation, see plot(dendrogram)
    #
    # First see if this node matches the coordinates
    #
    h = attributes(dnd)$height
    n = attributes(dnd)$members
    usr = par()$usr
    ok.x = FALSE
    ok.y = FALSE
    if (horiz) {
        ok.x = (abs(crd$x - h) / (usr[1] - usr[2])) < 0.05
        ok.y = round(crd$y,0) >= offset && round(crd$y,0) <= offset + n - 1
    } else {
        ok.y = (abs(crd$y - h) / (usr[4] - usr[3])) < 0.05
        ok.x = round(crd$x,0) >= offset && round(crd$x,0) <= offset + n - 1
    }
    if (ok.x && ok.y) {
        attr = attributes(dnd)
        attr$offset = offset
        attributes(dnd) = attr
        return(dnd)
    }
    #
    # No, so see if there are children of this node that match
    #
    if (!is.leaf(dnd)) {
        nc = length(dnd)
        child.offset = offset
        for (i in 1:nc) {
            #
            # Return the matched subtree or descend further if no match
            #
            ret = find.node(dnd[[i]], child.offset, crd, horiz)
            if (!is.null(ret)) {
                return(ret)
            } else {
                child.offset = child.offset + attributes(dnd[[i]])$members
            }
        }
    }
    #
    # None of the children matched so return NULL
    #
    return(NULL)
}

David J. States, M.D. Ph.D
University of Texas Health Science Center at Houston

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David J. States
Sent: Saturday, April 17, 2010 8:35 AM
To: r-help@R-project.org
Subject: [R] Interacting with dendrogram plots, locator() or click()

I would like to explore dendrogram plots interactively. For example, click on a node and return information about all of the children of that node. Is there a high level wrapper for locator() or click() that will return the nearest dendrogram node on a plot? If not, is there a way to obtain the [x,y] coordinates of all the nodes on a plot?

Thanks,
David

David J. States, M.D., Ph.D.
Professor of Health Information Science
School of Health Information Sciences
Brown Foundation Institute of Molecular Medicine
University of Texas Health Science Center at Houston
Sarofim Research Building Room 437C
1825 Pressler St.
Houston, TX 77030
Telephone: 713 500 3845
email: david.j.sta...@uth.tmc.edu
URL: http://www.stateslab.org
Re: [R] comparing attitudes of 2 groups / likert scales?
Mona_m wrote:

For my dissertation, which is due way too soon, I am doing a survey, comparing attitudes of 2 independent groups, with 5 scale likert questions. Basically I want to show if they have similar or different attitudes. I am testing 4 hypotheses, and have in total about 20 questions.

Using an unpaired (in your case) t-test on a Likert scale is a bit risky, because the Gaussian assumption might be severely violated. It might be OK if your data are reasonably centered around "moderate", but frequently we have responses where all but one subject replied with "very good".

If you can create a sum of scores, these are frequently more suitable for being analyzed by some quasi-continuous method, and using a non-parametric Wilcoxon test might avoid reviewer comments in some areas of research.

If you only have a few levels, something like polr (in MASS) and the plots created from it by Fox/Anderson might be an alternative (google for Fox/polytomous effects). These results are more difficult to interpret, and I have seen cases where papers using this were rejected in medical journals ("why don't you use Wilcoxon?").

Overall, when you have more complex cross-over designs with additional crossed variables, violating Gaussian assumptions seems to me to be the lesser evil compared to violating independence. "Assess independence, equal variance and normality - in that order" (van Belle, Statistical Rules of Thumb). I remember Douglas Bates mumbling something along the same lines, but he mentioned a 10 level scale.

Dieter

-- View this message in context: http://n4.nabble.com/comparing-attitudes-of-2-groups-likert-scales-tp2015738p2015812.html Sent from the R help mailing list archive at Nabble.com.
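As a rough illustration of the two tests discussed above, the following sketch runs both on simulated answers to a single 5-point item. The data, group sizes and response probabilities are invented for the example and say nothing about the actual survey; with many ties, exact = FALSE asks wilcox.test() for the normal approximation.

```r
set.seed(1)

# Simulated answers (1 = strongly disagree ... 5 = strongly agree)
group_a <- sample(1:5, 40, replace = TRUE,
                  prob = c(0.05, 0.10, 0.20, 0.35, 0.30))
group_b <- sample(1:5, 40, replace = TRUE,
                  prob = c(0.20, 0.30, 0.25, 0.15, 0.10))

# Look at the raw distributions first
counts <- rbind(A = table(factor(group_a, levels = 1:5)),
                B = table(factor(group_b, levels = 1:5)))

# Non-parametric comparison of the two independent groups
w <- wilcox.test(group_a, group_b, exact = FALSE)

# Unpaired (Welch) t-test, for comparison only
tt <- t.test(group_a, group_b)

c(wilcoxon = w$p.value, t.test = tt$p.value)
```

With about 20 questions, running one test per item raises a multiple-testing issue, which is one more argument for the sum-of-scores approach mentioned in the reply.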
[R] Follow up on installing formatR...
Good morning folks:

Made a second go at installing this and succeeded, but with some strange behaviours along the way. First the system back story:

Using up to date edition of Ubuntu 8.04
Using up to date edition of R 2.10.1
Using the Toronto, Ontario repository to draw from.

The first attempt ended up with two of the four dependencies installed, so only the Rgtk related items would be needed this time. Went into synaptic and found r-cran-rgtk2. Okay. Sounded likely, so I installed it successfully.

Using a root terminal session I went back into R and ran install.packages("formatR"). It pulled in two dependencies plus the main choice. It rejected one and kept two when it came time to install, but formatR still installed successfully, or so it seemed when I ran library(formatR). Go figure.

A copy of the message run follows. Any comments or suggestions will be of interest.

install.packages("formatR")
Warning in install.packages("formatR") :
  argument 'lib' is missing: using '/usr/local/lib/R/site-library'
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
also installing the dependencies ‘RGtk2’, ‘gWidgetsRGtk2’

trying URL 'http://probability.ca/cran/src/contrib/RGtk2_2.12.18.tar.gz'
Content type 'application/x-gzip' length 2206504 bytes (2.1 Mb)
opened URL
==
downloaded 2.1 Mb

trying URL 'http://probability.ca/cran/src/contrib/gWidgetsRGtk2_0.0-64.tar.gz'
Content type 'application/x-gzip' length 138192 bytes (134 Kb)
opened URL
==
downloaded 134 Kb

trying URL 'http://probability.ca/cran/src/contrib/formatR_0.1-3.tar.gz'
Content type 'application/x-gzip' length 2672 bytes
opened URL
==
downloaded 2672 bytes

* installing *source* package ‘RGtk2’ ...
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for LIBGLADE... no
configure: WARNING: libglade not found
checking for INTROSPECTION... no
checking for GTK... no
configure: error: GTK version 2.8.0 required
ERROR: configuration failed for package ‘RGtk2’
* removing ‘/usr/local/lib/R/site-library/RGtk2’
* installing *source* package ‘gWidgetsRGtk2’ ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
* DONE (gWidgetsRGtk2)
* installing *source* package ‘formatR’ ...
** R
** preparing package for lazy loading
Loading required package: gWidgets
Loading required package: MASS
** help
*** installing help indices
** building package indices ...
* DONE (formatR)

The downloaded packages are in
  ‘/tmp/RtmpNL20Be/downloaded_packages’
Warning message:
In install.packages("formatR") :
  installation of package 'RGtk2' had non-zero exit status

--
Brian Lunergan
Nepean, Ontario
Canada
[R] fit a deterministic function to observed data
Hi all,

I am not a mathematician and I am trying to find a function which could fit my observed data. Which function should I use and how could I fit it to the data in R? Below are the data:

x <- c(0, 9, 17, 24, 28, 30)
y <- c(500, 480, 420, 300, 160, 5)

I use R for Mac OS, version 2.10-1 2009-08-24

Thank you for your help.
Vincent.
Re: [R] Comparing data frames
Thank you all for your help and suggestions.
L

On Sun, Apr 18, 2010 at 8:51 PM, Tal Galili tal.gal...@gmail.com wrote:

Would ?merge work for you?

Contact Details: Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Sun, Apr 18, 2010 at 7:30 PM, Laura Ferrero-Miliani laur...@gmail.com wrote:

Dear very helpful friends,

It is Sunday, there is no air traffic in Europe, what better to do than try and learn me some more R. I have the following example:

owner <- c(1:4)
animal <- c("cat", "dog", "cat", "dog")
char.1 <- c("fluffy", "playful", "mean", "stupid")
food <- c("cat food", "left-overs", "cat food", "dog food")
char.2 <- c("lazy", "destructive", "antisocial", "goofy")
color <- c("white", "brown", "black", "black")
char.3 <- c("fat", "tiny", "evil", "big")
age <- c(16, 2, 5, 10)
pet.data <- data.frame(owner, animal, char.1, food, char.2, color, char.3, age)

animal <- c("cat", "dog")
v1 <- c("fluffy", "big")
v2 <- c("fat", "stupid")
pet.key <- data.frame(animal, v1, v2)

Now I would like to compare my pet.key to my pet.data and add a variable to pet.data with that result. So for each cat in pet.data, char.1, char.2 and char.3 should contain "fluffy" AND "fat" for a complete match, or "fluffy" OR "fat" for a partial match, and so on.

I don't know where to start. I *think* I should be using %in%, but I don't know how to build the expression so it works (so far I have tried and have gotten from lists to a weird array as a result!)

btw, what if my data and my key contain repeated values, e.g.

animal  v1      v2
cat     fluffy  fluffy

Any suggestions?

Thanks in advance,
Laura
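One possible way to combine the ?merge hint with %in% is sketched below. This is not taken from the thread: the result column name "match", the labels "complete", "partial" and "none", and the decision to count a repeated key value only once are all assumptions made for illustration.

```r
pet.data <- data.frame(
  owner  = 1:4,
  animal = c("cat", "dog", "cat", "dog"),
  char.1 = c("fluffy", "playful", "mean", "stupid"),
  char.2 = c("lazy", "destructive", "antisocial", "goofy"),
  char.3 = c("fat", "tiny", "evil", "big"),
  stringsAsFactors = FALSE
)
pet.key <- data.frame(
  animal = c("cat", "dog"),
  v1 = c("fluffy", "big"),
  v2 = c("fat", "stupid"),
  stringsAsFactors = FALSE
)

# Attach the key values for each animal to every row of the data
m <- merge(pet.data, pet.key, by = "animal")

chars <- as.matrix(m[, c("char.1", "char.2", "char.3")])
keys  <- as.matrix(m[, c("v1", "v2")])

# For each row: how many distinct key values occur among the char columns?
n_hit <- sapply(seq_len(nrow(m)),
                function(i) sum(unique(keys[i, ]) %in% chars[i, ]))
n_key <- apply(keys, 1, function(k) length(unique(k)))

m$match <- ifelse(n_hit == n_key, "complete",
                  ifelse(n_hit > 0, "partial", "none"))
m[order(m$owner), c("owner", "animal", "match")]
```

Because the key values are de-duplicated per row, a key such as "cat fluffy fluffy" counts as a complete match as soon as "fluffy" appears once; whether that is the desired behaviour is a judgment call.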
Re: [R] help
Hi,

Thank you for your help. The function 'combvec' takes any number of inputs in Matlab, so you can pass more than two matrices. The Matlab help for 'combvec' reads:

help combvec

COMBVEC Create all combinations of vectors.

Syntax
  combvec(a1,a2,...)

Description
  COMBVEC(A1,A2,...) takes any number of inputs,
    A1 - Matrix of N1 (column) vectors.
    A2 - Matrix of N2 (column) vectors.
  and returns a matrix of (N1*N2*...) column vectors, where the columns consist of all possibilities of A2 vectors, appended to A1 vectors, etc.

Example
  a1 = [1 2 3; 4 5 6];
  a2 = [7 8; 9 10];
  a3 = combvec(a1,a2)

2010/4/19 Dennis Murphy djmu...@gmail.com

Hi: This is a simplistic version of combvec that works for two input matrices; I don't have Matlab, and I don't understand how the function generalizes to more than two input matrices, so this is the best I can offer, for what it's worth...

combvec2 <- function(m1, m2) {
    c1 <- ncol(m1)
    c2 <- ncol(m2)
    k1 <- kronecker(matrix(rep(1, c2), nrow = 1), m1)
    k2 <- kronecker(m2, matrix(rep(1, c1), nrow = 1))
    rbind(k1, k2)
}

a1 <- matrix(1:6, nrow = 2, byrow = TRUE)
a1
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
a2 <- matrix(7:10, nrow = 2, byrow = TRUE)
combvec2(a1, a2)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    1    2    3
[2,]    4    5    6    4    5    6
[3,]    7    7    7    8    8    8
[4,]    9    9    9   10   10   10

HTH,
Dennis

On Sun, Apr 18, 2010 at 3:00 AM, anderson nuel anderson@gmail.com wrote:

Hello, I would like to create all combinations of vectors. I found the Matlab function 'combvec', which creates all combinations of vectors. Could you please help me find the function in R that corresponds to 'combvec'? For example, in Matlab:

a1 = [1 2 3; 4 5 6]
a1 =
     1     2     3
     4     5     6
a2 = [7 8; 9 10]
a2 =
     7     8
     9    10
a3 = combvec(a1,a2)
a3 =
     1     2     3     1     2     3
     4     5     6     4     5     6
     7     7     7     8     8     8
     9     9     9    10    10    10

Best Regards
Re: [R] help
Hi,

Thank you for your help. I tried your function 'combvec2', but it gives me an error:

Error in rep(1, c2) : invalid 'times' argument

The function 'combvec' takes any number of inputs in Matlab, so you can pass more than two matrices. The Matlab help for 'combvec' reads:

help combvec

COMBVEC Create all combinations of vectors.

Syntax
  combvec(a1,a2,...)

Description
  COMBVEC(A1,A2,...) takes any number of inputs,
    A1 - Matrix of N1 (column) vectors.
    A2 - Matrix of N2 (column) vectors.
  and returns a matrix of (N1*N2*...) column vectors, where the columns consist of all possibilities of A2 vectors, appended to A1 vectors, etc.

Example
  a1 = [1 2 3; 4 5 6];
  a2 = [7 8; 9 10];
  a3 = combvec(a1,a2)

2010/4/19 Dennis Murphy djmu...@gmail.com

Hi: This is a simplistic version of combvec that works for two input matrices; I don't have Matlab, and I don't understand how the function generalizes to more than two input matrices, so this is the best I can offer, for what it's worth...

combvec2 <- function(m1, m2) {
    c1 <- ncol(m1)
    c2 <- ncol(m2)
    k1 <- kronecker(matrix(rep(1, c2), nrow = 1), m1)
    k2 <- kronecker(m2, matrix(rep(1, c1), nrow = 1))
    rbind(k1, k2)
}

a1 <- matrix(1:6, nrow = 2, byrow = TRUE)
a1
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
a2 <- matrix(7:10, nrow = 2, byrow = TRUE)
combvec2(a1, a2)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    1    2    3
[2,]    4    5    6    4    5    6
[3,]    7    7    7    8    8    8
[4,]    9    9    9   10   10   10

HTH,
Dennis

On Sun, Apr 18, 2010 at 3:00 AM, anderson nuel anderson@gmail.com wrote:

Hello, I would like to create all combinations of vectors. I found the Matlab function 'combvec', which creates all combinations of vectors. Could you please help me find the function in R that corresponds to 'combvec'? For example, in Matlab:

a1 = [1 2 3; 4 5 6]
a1 =
     1     2     3
     4     5     6
a2 = [7 8; 9 10]
a2 =
     7     8
     9    10
a3 = combvec(a1,a2)
a3 =
     1     2     3     1     2     3
     4     5     6     4     5     6
     7     7     7     8     8     8
     9     9     9    10    10    10

Best Regards
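Going by the Matlab help text quoted above, a version for any number of inputs could be sketched as below. The name combvec is reused here for a hypothetical R function, not an existing one, and the treatment of a plain vector as a single column (via as.matrix) is an assumption. On the reported error: the failing call is not shown, but one possible cause is passing a plain vector to combvec2(), since ncol() of a vector is NULL.

```r
# Hypothetical R counterpart of Matlab's combvec for any number of inputs.
# Each input is a matrix whose columns are the vectors to combine.
combvec <- function(...) {
  mats <- lapply(list(...), as.matrix)   # a plain vector becomes one column
  out <- mats[[1]]
  for (m in mats[-1]) {
    n_out <- ncol(out)
    n_m   <- ncol(m)
    # Cycle through the columns built so far, and hold each column of the
    # new input fixed for one full cycle, then stack the two pieces.
    out <- rbind(out[, rep(seq_len(n_out), times = n_m), drop = FALSE],
                 m[,   rep(seq_len(n_m),   each  = n_out), drop = FALSE])
  }
  out
}

a1 <- matrix(1:6,  nrow = 2, byrow = TRUE)
a2 <- matrix(7:10, nrow = 2, byrow = TRUE)
a3 <- combvec(a1, a2)          # 4 x 6, same layout as the Matlab example

b  <- matrix(c(0L, 1L), nrow = 1)
a4 <- combvec(a1, a2, b)       # 5 x 12: three inputs
```

For two inputs this reproduces the column order of combvec2() and of the Matlab example in the thread; the behaviour for three or more inputs follows the help text but has not been compared against Matlab.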
[R] ecdf
Hello,

I'd like to plot an empirical cumulative distribution function, except instead of the fraction of values < x, I'd like the fraction of values > x. I think this can be done using the ecdf function in {Hmisc}. I installed the package and loaded it. However, when following the example given in the documentation, I get an error:

x <- rnorm(100)
ecdf(x, what='1-F')
Error in ecdf(x, what = "1-F") : unused argument(s) (what = "1-F")

I believe that this is because R is attempting to access the ecdf function in base R, which does not have the what option. Am I correct, and if so, how can I change that?

Note: I also tried to do it myself without the {Hmisc} ecdf function, and couldn't figure out a way.

x2 <- 1-ecdf(x)

doesn't work, and neither does

x2 <- rep(0, times=100)
for(i in 1:100){
  x2[i] <- 1-ecdf(x)[i]
}

Both result in errors. Thanks in advance for any suggestions you can offer.

-Mitch
Re: [R] ecdf
R is case sensitive. ecdf() is in the stats package, Ecdf() is in Hmisc. So you want

Ecdf(x, what='1-F')

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Downey, Patrick
Sent: Monday, 19 April 2010 15:04
To: R help
Subject: [R] ecdf

Hello,

I'd like to plot an empirical cumulative distribution function, except instead of the fraction of values < x, I'd like the fraction of values > x. I think this can be done using the ecdf function in {Hmisc}. I installed the package and loaded it. However, when following the example given in the documentation, I get an error:

x <- rnorm(100)
ecdf(x, what='1-F')
Error in ecdf(x, what = "1-F") : unused argument(s) (what = "1-F")

I believe that this is because R is attempting to access the ecdf function in base R, which does not have the what option. Am I correct, and if so, how can I change that?

Note: I also tried to do it myself without the {Hmisc} ecdf function, and couldn't figure out a way.

x2 <- 1-ecdf(x)

doesn't work, and neither does

x2 <- rep(0, times=100)
for(i in 1:100){
  x2[i] <- 1-ecdf(x)[i]
}

Both result in errors. Thanks in advance for any suggestions you can offer.

-Mitch

Please do not print this message unnecessarily. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
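The do-it-yourself attempts in the question fail because stats::ecdf(x) returns a function, not a vector of numbers, so it can be neither subtracted from 1 nor indexed. Evaluating that function first gives the wanted quantity in base R. A minimal sketch; the names Fn and surv are made up for illustration:

```r
set.seed(42)
x <- rnorm(100)

Fn <- ecdf(x)                    # Fn is a step *function*
surv <- function(t) 1 - Fn(t)    # fraction of values > t

# Evaluate at the sorted data points; a step plot shows the curve
t_grid <- sort(x)
if (interactive()) {
  plot(t_grid, surv(t_grid), type = "s",
       xlab = "x", ylab = "Fraction of values > x")
}

surv(0)                          # same as mean(x > 0)
```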
Re: [R] fit a deterministic function to observed data
Plotting y vs. x:

plot(y ~ x)

the graph seems to be flattening out at x = 0 at a level of around y = 500, so let's look at:

plot(500-y ~ x)

This curve is moving up rapidly, so let's take the log to flatten it out:

plot(log(500-y) ~ x)

That looks quite linear, so

log(500-y) = A + B * x

and solving:

y = 500 - exp(A + B*x)

The 500 was just a ballpark, so let's make that a parameter too:

y = C - exp(A + B*x)
  = C - exp(A) * exp(B*x)
  = C + D * exp(B*x)

where we have replaced -exp(A) with D. Fitting this gives:

fm <- nls(y ~ cbind(1, exp(B * x)), start = c(B = 1), alg = "plinear"); fm

Nonlinear regression model
  model: y ~ cbind(1, exp(B * x))
   data: parent.frame()
       B    .lin1    .lin2
  0.1513 498.9519  -5.1644
 residual sum-of-squares: 627.6

Number of iterations to convergence: 6
Achieved convergence tolerance: 6.192e-06

# graphing
plot(y ~ x, pch = 20, col = "red")
lines(fitted(fm) ~ x)
title("y = 498.9519 - 5.1644 * exp(0.1513 * x)")

On Mon, Apr 19, 2010 at 8:42 AM, vincent laperriere vincent_laperri...@yahoo.fr wrote:

Hi all,

I am not a mathematician and I am trying to fit a function which could fit my observed data. Which function should I use and how could I fit it to data in R? Below are the data:

x <- c(0, 9, 17, 24, 28, 30)
y <- c(500, 480, 420, 300, 160, 5)

I use R for Mac OS, version 2.10-1 2009-08-24

Thank you for your help.
Vincent.
Re: [R] ecdf
Hi Thierry,

That worked perfectly. Thanks for the suggestion. For reference, in the documentation, it never lists {Hmisc}'s function as starting with E instead of e. I don't know who's in charge of documentation, but that should probably be corrected.

Thanks again.
-Mitch

-----Original Message-----
From: ONKELINX, Thierry [mailto:thierry.onkel...@inbo.be]
Sent: Monday, April 19, 2010 9:08 AM
To: Downey, Patrick; R help
Subject: RE: [R] ecdf

R is case sensitive. ecdf() is in the stats package, Ecdf() is in Hmisc. So you want

  Ecdf(x, what = '1-F')

Thierry

ir. Thierry Onkelinx
Research Institute for Nature and Forest, team Biometrics Quality Assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

"To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of." ~ Sir Ronald Aylmer Fisher
"The plural of anecdote is not data." ~ Roger Brinner
"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." ~ John Tukey

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Downey, Patrick
Sent: Monday, 19 April 2010 15:04
To: R help
Subject: [R] ecdf

> Hello,
> I'd like to plot an empirical cumulative distribution function, except instead of the fraction of values <= x, I'd like the fraction of values > x. I think this can be done using the ecdf function in {Hmisc}. I installed the package and loaded it. However, when following the example given in the documentation, I get an error:
>   x <- rnorm(100)
>   ecdf(x, what = '1-F')
>   Error in ecdf(x, what = "1-F") : unused argument(s) (what = "1-F")
> I believe that this is because R is attempting to access the ecdf function in base R, which does not have the what option. Am I correct, and if so, how can I change that?
> Note: I also tried to do it myself without the {Hmisc} ecdf function, and couldn't figure out a way.
>   x2 <- 1 - ecdf(x)
> doesn't work, and neither does
>   x2 <- rep(0, times = 100)
>   for (i in 1:100) { x2[i] <- 1 - ecdf(x)[i] }
> Both result in errors. Thanks in advance for any suggestions you can offer.
> -Mitch
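As a side note on the two failed attempts in the question: stats::ecdf(x) returns a function, so it can be neither subtracted from 1 nor indexed with [i]. A minimal base-R sketch of the "fraction of values > x" curve, without Hmisc, is to call that function first and subtract afterwards. The helper name surv is made up for this illustration.

```r
set.seed(42)
x <- rnorm(100)

Fn <- ecdf(x)                    # a step function: Fn(q) = fraction of values <= q
surv <- function(q) 1 - Fn(q)    # fraction of values > q

# Evaluate at the sorted data points, e.g. for plotting
xs <- sort(x)
upper_fraction <- surv(xs)
```

The curve can then be drawn with plot(xs, upper_fraction, type = "s"). Ecdf(x, what = '1-F') from Hmisc, as suggested in the thread, does the plotting in one call.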
[R] Extracting the coefficients of each local polynomial from loess()
Hello dear R users and Prof. Brian Ripley,

I am searching for a way to extract the estimated coefficients of each local polynomial at a given x from loess(). After searching and asking (http://stackoverflow.com/questions/2666799/extracting-the-fitted-terms-in-the-local-polynomial-function-of-a-loess-in-r-n), I found this thread:
http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8221.html

From it I understand that the only way to do this is by opening up the functions that are behind loess(), and that this is not recommended for someone who doesn't know what he is doing. I don't know what I am doing here. If someone else does, and is willing to help, I believe I (and others) would benefit from it.

Thanks,
Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
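One way around opening up loess() is to refit a single local polynomial by hand with lm(). The sketch below is not loess()'s internal code and is not guaranteed to reproduce its numbers: it assumes a nearest-neighbour window covering a fraction span of the points and tricube weights, and the function name local_poly_coef is hypothetical. The coefficients are for a polynomial centred at x0, so the first one is the local fitted value at x0 and the second is the local slope there.

```r
# Hand-rolled local polynomial fit at a single point x0.
# Assumptions (mine, not taken from loess() itself): the window holds
# roughly span * n nearest neighbours, weights are tricube.
local_poly_coef <- function(x, y, x0, span = 0.75, degree = 2) {
  n <- length(x)
  q <- max(degree + 2, floor(n * span))       # number of neighbours in the window
  d <- abs(x - x0)
  h <- sort(d)[q]                             # bandwidth: distance to q-th neighbour
  w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)    # tricube weights, zero outside window
  fit <- lm(y ~ poly(x - x0, degree, raw = TRUE), weights = w)
  unname(coef(fit))                           # intercept, linear, quadratic, ...
}

# Noise-free quadratic: the local fit should recover it exactly
x <- seq(0, 10, length.out = 50)
y <- 1 + 2 * x + 3 * x^2
co <- local_poly_coef(x, y, x0 = 5)
```

For the toy data, the value at x0 = 5 is 1 + 10 + 75 = 86, the slope is 2 + 6 * 5 = 32, and the quadratic coefficient is 3. To get coefficients along the whole curve, the function can be applied over a grid of x0 values with sapply().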
Re: [R] merge
What do you want to get? And what exactly did you do? Your question isn't very clear.

Sarah

On Mon, Apr 19, 2010 at 7:59 AM, n.via...@libero.it wrote:
> I have a problem with the merge function. I have to merge two big data frames which look like the following example. The problem is that I get duplicated rows.
>
>   CODPROD N1 N3 N4
>   23       3 55  4
>   24       5 67 36
>   25       3 73 24
>
> second data frame
>
>   CODPROD N1 N2
>   30      34 45
>   45       0 78
>   65       0 56
>
> The result that I get is like:
>
>   CODPROD N1 N2 N3 N4 N1.1
>   23       3 NA 55  4    3
>   24       5 NA 67 36    0
>   25       3 NA 73 24    0
>   30      34 45 NA NA    0
>   45       0 78 NA NA    0
>   65       0 56 NA NA    0
>
> So N1.1 is a duplication of N1. I think I could solve the problem by specifying the same columns, but I have a lot of columns which have the same names in the two data frames, so I think it's not the right way to solve it. Does anyone know how to avoid the duplication?

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] smart way to turn a vector into a matrix
Hi Erik,

What if I do some manipulations on the new list created with split and I want to bring it back to its initial form? I saw that there is an unsplit function, but I get errors. As the f argument, I pass the same f that I used to split (as in the help page), but I get the following error:

  Error in x[i] <- value[[j]] : replacement has length zero

- Anna Lippel

-- View this message in context: http://n4.nabble.com/smart-way-to-turn-a-vector-into-a-matrix-tp1692671p2015952.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] multiple variables pointing to single dataframe?
Hi, for example:

  x <- Orange
  x2 <- x
  x[1,]$age <- 50
  x2[1,]
    Tree age circumference
  1    1 118            30

I would like a way for x2 to also reference the modified x data frame without having to reassign x2 <- x each time x is modified.

Thanks,
Alex

-----Original Message-----
From: Petr PIKAL [mailto:petr.pi...@precheza.cz]
Sent: Monday, April 19, 2010 3:18 AM
To: Alex Bryant
Cc: r-help@r-project.org
Subject: Odp: [R] multiple variables pointing to single dataframe?

Hi

r-help-boun...@r-project.org wrote on 16.04.2010 16:15:40:
> Hi, I have a need to have 2 variables point to the same dataframe (d1), I

What does it mean to point to data frame? Seems to me that it is something from C+. You can reference data frame by $ or by square brackets with as many variables as you want. See ?"["

regards
Petr

> don't want to simply copy the dataframe ( d2 <- d1 ) as my understanding is that this will create a second dataframe. Any suggestions on best practice here?
> Thank You,
> Alex Bryant
> Software Developer
> Integrated Clinical Systems, Inc.
> 908-996-7208
Re: [R] comparing attitudes of 2 groups / likert scales?
Good luck in your work.

The simple solution would be to run many non-paired Wilcoxon tests, one on each of the 20 questions (the way Dieter suggested). In that case, make sure to adjust for multiple comparisons. Read about it, and see:

  ?p.adjust

If you have some questions you can merge (by a simple mean of them), it will probably do you good (using PCA might be an option, but it could also be overkill for you).

You might also be interested in plotting your data. Here is a nice simple hack on how to display the correlation scatter-plot matrix for your data:
http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/

And Dieter, thanks for a great quote: "Assess independence, equal variance and normality - in that order" (van Bell, Statistical rules of thumb).

Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Mon, Apr 19, 2010 at 2:07 PM, Mona_m purplem...@blueyonder.co.uk wrote:
> Hi, I have just found this forum, and it looks like a great place to get some help (I hope).
> For my dissertation, which is due way too soon, I am doing a survey comparing attitudes of 2 independent groups, with 5-point Likert scale questions. Basically I want to show whether they have similar or different attitudes. I am testing 4 hypotheses, and have in total about 20 questions.
> I have to say my statistics skills are very basic and very rusty. We had some lectures two years ago, where we were introduced to R. I looked through my notes, and back then we did a one-sample t-test to analyse Likert-type questions. I believe I would need to do a 2-sample unpaired t-test. It would be great if someone could give me some feedback on whether this test is the most suitable one for my purpose, and maybe could explain to me what's the easiest way to do this in R? You would help me loads!!
> Many thanks in advance
> Mona
> -- View this message in context: http://n4.nabble.com/comparing-attitudes-of-2-groups-likert-scales-tp2015738p2015738.html
> Sent from the R help mailing list archive at Nabble.com.
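The "one test per question, then adjust" advice could look roughly like the sketch below. The data are simulated purely for illustration (two groups of 40 respondents, 20 questions on a 1 to 5 scale), and the Holm method is only one of the choices listed in ?p.adjust. exact = FALSE is set because Likert answers are full of ties, for which exact p-values are not available.

```r
set.seed(1)
n <- 40                                            # respondents per group (made up)
group <- factor(rep(c("A", "B"), each = n))

# Simulated answers: 20 questions, each scored 1..5
answers <- as.data.frame(replicate(20, sample(1:5, 2 * n, replace = TRUE)))

# One unpaired Wilcoxon (Mann-Whitney) test per question
p_raw <- sapply(answers, function(q) {
  wilcox.test(q ~ group, exact = FALSE)$p.value
})

# Adjust the 20 p-values for multiple comparisons
p_adj <- p.adjust(p_raw, method = "holm")
```

With real data, answers would be the 20 question columns of the survey data frame and group the column identifying the two groups.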
Re: [R] Unwanted boxes in legend
Try border=c(0,0,1,0).

-tgs

On Mon, Apr 19, 2010 at 4:21 AM, Steve Murray smurray...@hotmail.com wrote:
> Dear all,
> Thanks for the response, however I'm getting the following error message when I execute the legend command using the 'border' argument:
>   Error in legend(10, par("usr")[4], c("A", "B", : unused argument(s) (border = FALSE)
> Is anyone aware of any alternative means of switching off boxes around all but one of the elements in a legend?
> Many thanks for any input,
> Steve
>
> Date: Thu, 15 Apr 2010 12:13:40 -0600
> From: ehl...@ucalgary.ca
> To: smurray...@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Unwanted boxes in legend
>
> On 2010-04-15 11:10, Steve Murray wrote:
>> Dear all,
>> I am using the following code to generate a legend in my plot (consisting of both bars and points), but end up with boxes around my points:
>>   legend(10, par("usr")[4], c("A", "B", "C", "D"), fill = c(NA, NA, "grey28", NA), pch = c(16, 4, NA, 18), col = c("red", "blue", "grey28", "yellow"), lty = FALSE, bty = "n", horiz = FALSE)
>> I want a box around the third element of the legend (to represent the bar 'fill' colour), but not for the others, where points are shown instead. What am I doing wrong above and how do I correct it?
>
> Add the 'border' argument: either
>   border = FALSE   # in which case no box is drawn for any element
> or
>   border = c(NA, NA, "black", NA)
> -Peter Ehlers
>
>> Many thanks, Steve
>
> --
> Peter Ehlers
> University of Calgary
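Putting the suggestion together, a self-contained sketch might look like this. It assumes an R version whose legend() accepts the border argument; the "unused argument(s)" error quoted above suggests the poster's installation did not. The plot itself is a throwaway placeholder, and pdf(NULL) is used only so that the example runs without opening a window or writing a file.

```r
pdf(NULL)                       # null device: nothing is written to disk
plot(1:10, type = "n")          # placeholder plot to attach the legend to

res <- legend("topleft",
              legend = c("A", "B", "C", "D"),
              fill   = c(NA, NA, "grey28", NA),    # only the bar entry is filled
              border = c(NA, NA, "black", NA),     # and only it gets a box outline
              pch    = c(16, 4, NA, 18),           # point symbols for the others
              col    = c("red", "blue", "grey28", "yellow"),
              bty    = "n")
dev.off()
```

legend() invisibly returns the coordinates it used, which is what res holds here.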
Re: [R] merge
Maybe this is what you are looking for:

  lines1 <- "CODPROD N1 N3 N4
  23 3 55 4
  24 5 67 36
  25 3 73 24"
  df1 <- read.table(textConnection(lines1), header = TRUE)

  lines2 <- "CODPROD N1 N2
  30 34 45
  45 0 78
  65 0 56"
  df2 <- read.table(textConnection(lines2), header = TRUE)

  merge(df1, df2, by = intersect(names(df1), names(df2)), all = TRUE)

HTH
Pete

-- View this message in context: http://n4.nabble.com/merge-tp2015796p2015966.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] multiple variables pointing to single dataframe?
Hi,

On Mon, Apr 19, 2010 at 10:15 AM, Alex Bryant abry...@i-review.com wrote:
> Hi, for example:
>   x <- Orange
>   x2 <- x
>   x[1,]$age <- 50
>   x2[1,]
>     Tree age circumference
>   1    1 118            30
> I would like a way for x2 to also reference the modified x data frame without having to reassign x2 <- x each time x is modified.

You can't *really* do this in R, but I believe you can rig up a workaround using environments (if you really have to).

This SO thread with links is *somehow* related to what you're asking. Perhaps you'll find what you're looking for there:
http://stackoverflow.com/questions/2603184/r-pass-by-reference

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
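A minimal sketch of the environment workaround mentioned above: environments are not copied on assignment, so two names bound to the same environment see the same data frame. The names store and alias are made up for this illustration, and the data frame has to be accessed through the environment (store$df) rather than as a bare variable.

```r
store <- new.env()
store$df <- Orange          # keep the data frame inside an environment

alias <- store              # no copy: both names refer to the same environment

store$df$age[1] <- 50       # modify through one name ...
alias$df$age[1]             # ... and read the change through the other
```

This is the one place where base R has reference semantics; ordinary data frames, as in the x2 <- x example, behave as independent copies once either is modified.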
Re: [R] merge
Hi

If the columns have the same name but different values in them, then you must either decide yourself which one to keep, or keep both. If they have the same name and the same values, you could select only those columns whose names do not match.

  names(data1) %in% names(data2)

tells you which names match, and you can get rid of those columns in one of your data frames before the merge. Something like this (untested):

  data1[, c(1, which(!(names(data1) %in% names(data2))))]

Regards
Petr

r-help-boun...@r-project.org wrote on 19.04.2010 15:56:34:
> What do you want to get? And what exactly did you do? Your question isn't very clear.
> Sarah
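To make the two behaviours discussed in this thread concrete, here is a small sketch with the example data from the question. Merging on the key column alone keeps both N1 columns and renames them with suffixes, while merging on every shared column name (which is also merge()'s default when by is not given) keeps a single N1.

```r
df1 <- data.frame(CODPROD = c(23, 24, 25),
                  N1 = c(3, 5, 3),
                  N3 = c(55, 67, 73),
                  N4 = c(4, 36, 24))
df2 <- data.frame(CODPROD = c(30, 45, 65),
                  N1 = c(34, 0, 0),
                  N2 = c(45, 78, 56))

# Key column only: the shared non-key column N1 is duplicated as N1.x / N1.y
m_key <- merge(df1, df2, by = "CODPROD", all = TRUE)

# All shared column names as the key: a single N1 column, no duplicates
m_all <- merge(df1, df2, by = intersect(names(df1), names(df2)), all = TRUE)
```

Note that the second form treats rows as matching only when CODPROD and N1 both agree. If the same CODPROD could appear in both data frames with different N1 values, those rows would stay separate.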
Re: [R] multiple variables pointing to single dataframe?
If you only need to retrieve x by referring to x2, and you don't have to modify x via x2, then this works:

  x <- Orange
  makeActiveBinding("x2", function() x, .GlobalEnv)
  x$age <- 50
  head(x2)
    Tree age circumference
  1    1  50            30
  2    1  50            58
  3    1  50            87
  4    1  50           115
  5    1  50           120
  6    1  50           142

On Mon, Apr 19, 2010 at 10:15 AM, Alex Bryant abry...@i-review.com wrote:
> Hi, for example:
>   x <- Orange
>   x2 <- x
>   x[1,]$age <- 50
>   x2[1,]
>     Tree age circumference
>   1    1 118            30
> I would like a way for x2 to also reference the modified x data frame without having to reassign x2 <- x each time x is modified.
> Thanks,
> Alex
Re: [R] ecdf
On Apr 19, 2010, at 9:12 AM, Downey, Patrick wrote:
> Hi Thierry, That worked perfectly. Thanks for the suggestion. For reference, in the documentation, it

What is "it"?

> never lists {Hmisc}'s function as starting with E instead of e.

Every instance of the documentation in the R help system using ??ecdf that I can find for Hmisc's version has it properly capitalized. I have Hmisc_3.7-0.

> I don't know who's in charge of documentation

It would be the package maintainer if there were a problem.

> , but that

What is "that"?

-- David

> should probably be corrected. Thanks again.
> -Mitch

David Winsemius, MD
West Hartford, CT
Re: [R] ecdf
The OP wrote me privately to say that the errant documentation was at:
http://lib.stat.cmu.edu/S/Harrell/help/Hmisc/html/ecdf.html

That is a rather old bit of information. It dates back to a time when Frank's address was at the University of Virginia. In 2003 he moved to Vanderbilt, so that page dates from some year prior to 2003. (And it may not currently be under the control of anyone, given its quasi-archival status.)

-- David.

David Winsemius, MD
West Hartford, CT
[R] Using split and then unsplit
Hello everyone,

I use the split function, splitting by the factor f, on a data frame with 3 columns and more than 100,000 rows. Once it's split, I have a list of data frames, each still with 3 columns and n rows. I manipulate those list elements and get a list of data frames that still have 3 columns but fewer rows. When I unsplit it, I get an error, as I use the same factor I used to split (f in the split help page), I guess because the number of rows changed. Do I need to create a new f to be able to unsplit, or is there another way to unsplit those data frames and rbind them?

Thank you!

- Anna Lippel

-- View this message in context: http://n4.nabble.com/Using-split-and-then-unsplit-tp2016071p2016071.html
Sent from the R help mailing list archive at Nabble.com.
[R] Grouping rows of data by day
Hi all,

I have a set of data in hourly time steps, with each row identified as

  time    data column1    data column2
  1
  1.042
  1.083
  1.125
  1.167
  1.208
  1.25
  ... and so on

(the time column is in fractions of a day).

I want to be able to group the data by day. I managed to do this using:

  Day1H = hourlydata[c(1:24),]

but I'd like to be able to create groups for each day without doing this manually for each set of 24 rows. Any suggestions greatly appreciated.

Thanks

-- View this message in context: http://n4.nabble.com/Grouping-rows-of-data-by-day-tp2016063p2016063.html
Sent from the R help mailing list archive at Nabble.com.
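Since the time column is a day number plus a fraction of a day, one possible approach (a sketch only; the thread itself gives no answer, and the data below are made up to mimic the description) is to take the integer part of time as the day and let split() or aggregate() do the grouping. The column names time and value are assumptions.

```r
# Two days of made-up hourly data: time = day number + fraction of a day
hourlydata <- data.frame(time  = 1 + (0:47) / 24,
                         value = seq_len(48))

# Integer part of the time stamp identifies the day; the small offset guards
# against values such as 1.9999999 that are meant to be 2
hourlydata$day <- floor(hourlydata$time + 1e-9)

# A list with one data frame per day
by_day <- split(hourlydata, hourlydata$day)

# Or summarise directly, e.g. the daily mean of a data column
daily_mean <- aggregate(value ~ day, data = hourlydata, FUN = mean)
```

by_day[["1"]] then plays the role of Day1H from the question, for every day at once.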
Re: [R] Inline Package: void vs return type functions
Many thanks for your help.

Best,
Sergio

-- View this message in context: http://n4.nabble.com/Inline-Package-void-vs-return-type-functions-tp1838423p2015898.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] comparing attitudes of 2 groups / likert scales?
On Apr 19, 2010, at 10:08 AM, Tal Galili wrote:
> And Dieter, thanks for a great quote: "Assess independence, equal variance and normality - in that order" (van Bell, Statistical rules of thumb).

If you are thinking of using that quote, you might want to check the spelling of his name. My memory is "van Belle".

-- David.

David Winsemius, MD
West Hartford, CT
Re: [R] Using split and then unsplit
Here is an alternative that I just found to join my data frames with rbind:

  result <- do.call(rbind, myList)

It worked perfectly, but I still don't understand why unsplit wouldn't work...

- Anna Lippel

-- View this message in context: http://n4.nabble.com/Using-split-and-then-unsplit-tp2016071p2016081.html
Sent from the R help mailing list archive at Nabble.com.
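On the open question of why unsplit() complained: as far as I understand it, unsplit() puts each piece back into the positions recorded by the factor, so the factor has to describe the pieces as they are now, not as they were before rows were dropped. The sketch below uses made-up data (the names df, g, v, kept are all hypothetical) to show both routes: do.call(rbind, ...) which ignores the factor, and unsplit() with a factor that has been filtered in the same way as the rows.

```r
df <- data.frame(g = factor(rep(c("a", "b", "c"), each = 4)),
                 v = c(1, 5, 2, 6, 7, 1, 8, 3, 4, 9, 2, 5))

pieces <- split(df, df$g)

# Manipulate each piece so that it ends up with fewer rows
pieces <- lapply(pieces, function(d) d[d$v > 2, ])

# Route 1: stack the pieces; rows come out grouped by factor level
combined <- do.call(rbind, pieces)

# Route 2: unsplit with a factor that matches the *remaining* rows;
# rows come back in their original order
kept <- df$v > 2
f_new <- df$g[kept]
restored <- unsplit(pieces, f_new)
```

Route 2 only works when it is possible to say which original rows survived; when the manipulation changes rows more freely, do.call(rbind, ...) is the simpler choice.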
Re: [R] comparing attitudes of 2 groups / likert scales?
David Winsemius wrote:
> If you are thinking of using that quote, you might want to check the spelling of his name. My memory is "van Belle".

Sorry, I thought I had corrected that before mailing.

@BOOK{vanBelle2002,
  title = {Statistical rules of thumb},
  publisher = {Wiley series in probability and statistics},
  year = {2002},
  author = {Gerald van Belle}
}

-- View this message in context: http://n4.nabble.com/comparing-attitudes-of-2-groups-likert-scales-tp2015738p2016083.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Formatting data, adding column names, use reshape, a newbie question
On Mon, Apr 19, 2010 at 5:13 AM, Paul Rigor (ucla) pr...@ucla.edu wrote:
> Hi all, I'm an R novice. I have data that's already formatted as molten that reshape should be able to work with. For example, the following was read in with read.csv(filename, sep=" ", header=FALSE)
>
>         V1   V2               V3        V4       V5
> 1 original book book.source1.txt 328900494 3039.525
> 2 original book book.source1.txt 328900494 3057.952
>
> I would like to add column names so I can use reshape's cast method.

    names(df) <- c("col1", "col2", ..., "col5")

?

Hadley

-- Assistant Professor / Dobelman Family Junior Chair, Department of Statistics / Rice University, http://had.co.nz/
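A minimal sketch of the suggestion above, using the two rows shown in the question. The column names ("type", "medium", "source", "id", "value") are made up for illustration, and the data are read from an inline string rather than from the original file, which is not available here.

```r
# Two rows copied from the question; whitespace-separated, no header line.
txt <- "original book book.source1.txt 328900494 3039.525
original book book.source1.txt 328900494 3057.952"

# Without a header, read.table() assigns the default names V1..V5.
df <- read.table(text = txt, header = FALSE)

# Replace the default names (hypothetical names, chosen for this sketch only).
names(df) <- c("type", "medium", "source", "id", "value")

print(df)
```

The same names could also be given at read time through the col.names argument of read.table() or read.csv().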
Re: [R] Using split and then unsplit
On Apr 19, 2010, at 11:06 AM, anna wrote:
> Hello everyone, I use the split function, splitting with the factor f, on a data frame with 3 columns and more than 100,000 rows. Once it's split I have a list of data frames, still with 3 columns and n rows. I manipulate those list elements and get a list of data frames, still with 3 columns but fewer rows. So when I unsplit it, I get an error, as I use the same factor I used to split (f in the split help page), I guess because the number of rows changed. Do I need to create a new f to be able to unsplit, or is there another way to unsplit those data frames and rbind them? Thank you!

You may get success with:

    do.call(rbind, splitted)

The other task to set for yourself is to read the Posting Guide again and create test cases when posting to r-help.

-- David Winsemius, MD West Hartford, CT
Re: [R] Using split and then unsplit
Hi David, do.call worked perfectly, but do you have an idea why unsplit wouldn't work in that case? Is it because the number of rows changed? When the number didn't change, unsplit worked.

- Anna Lippel
View this message in context: http://n4.nabble.com/Using-split-and-then-unsplit-tp2016071p2016116.html
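A small sketch of the situation discussed in this thread, on made-up data. As far as I understand it, unsplit() puts the pieces back in the positions recorded by the grouping factor, so the factor has to describe exactly the rows that are still present; once rows have been dropped from the pieces, the original factor no longer does. do.call(rbind, ...) has no such requirement because it simply stacks the pieces. The data frame and the filtering rule below are assumptions for illustration only.

```r
# Toy data: three groups of four rows each (made-up values).
df <- data.frame(g = rep(c("a", "b", "c"), each = 4),
                 x = 1:12,
                 y = (1:12)^2)

parts <- split(df, df$g)

# Pieces unchanged: unsplit() with the original factor restores the rows.
restored <- unsplit(parts, df$g)

# Drop some rows from every piece (keep even x only).
trimmed <- lapply(parts, function(d) d[d$x %% 2 == 0, ])

# The original factor still has 12 entries but only 6 rows remain,
# so stack the pieces with rbind instead of calling unsplit().
combined <- do.call(rbind, trimmed)
print(combined)
```

If the original row order matters after rows were removed, one option is to carry a row-id column through the manipulation and sort on it after rbind.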
[R] stupid regexp question
Hello, I have a stupid regexp question. I have a large data frame of strings. I would like to convert all occurrences of W.m^{-2} to W/m2. I made the following test:

    gsub(glob2rx("W.m^{-2}"), "W/m2", "W.m^{-2}")

but it does not seem to work. I don't know how to do it otherwise, as I could never learn how to deal with the special characters (like .^{}) in regexps.

Thanks in advance for your kind help,
servet
[R] Drawing a line with misc3d
Hi list, I would like to draw some lines with misc3d. I found a lot of tools to draw surfaces, but nothing for a simple line... Is it possible? Note that I know that it is possible to draw lines with rgl (using lines3d), but I need to do it with misc3d to export the drawing in .asy format. Any solution?

Christophe
Re: [R] var.test
On Apr 18, 2010, at 4:55 PM, anon anon wrote:
> Hello, I'm using var.test to do a simple F-test for equality of variances. I think I'm missing something small here:
>
>     > m <- rnorm(10, sd=1)
>     > n <- rnorm(5, sd=1)
>     > var.test(m, n)
>
>             F test to compare two variances
>
>     data:  m and n
>     F = 13.7438, num df = 9, denom df = 4, p-value = 0.02256
>     alternative hypothesis: true ratio of variances is not equal to 1
>     95 percent confidence interval:
>       1.543430 64.844094
>     sample estimates:
>     ratio of variances
>               13.74375
>
>     > qf(.0250,9,4)*var(m)/var(n)
>     [1] 2.912997   <- correct degrees of freedom (I think!) and does not match var.test lower bound
>     > qf(.0250,4,9)*var(m)/var(n)
>     [1] 1.543430   <- matches var.test lower bound with degrees of freedom reversed

The var.test code is available for inspection:

    getAnywhere(var.test.default)

It can be seen to use the ratio of the estimate to the theoretic qf value. Was there a reason you decided to use the product?

    BETA <- (1 - conf.level)/2
    CINT <- c(ESTIMATE/qf(1 - BETA, DF.x, DF.y), ESTIMATE/qf(BETA, DF.x, DF.y))

> It seems that the F-test in var.test is getting the degrees of freedom mixed up. Outside calculators seem to agree with the qf function.

I would think that inverting the estimate should reverse the correct order for the degrees of freedom, but it is not clear that your choice for the CI calculation is the correct one.

> So, am I misunderstanding something?

-- David Winsemius, MD West Hartford, CT
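The two calculations in this exchange can be reconciled with the standard identity 1/qf(1 - p, a, b) = qf(p, b, a): dividing the estimate by an upper quantile with (DF.x, DF.y) gives the same number as multiplying it by a lower quantile with the degrees of freedom swapped. The sketch below checks this numerically on simulated data; the seed and sample sizes are arbitrary choices, and the CINT formula is the one quoted from var.test.default above.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
m <- rnorm(10, sd = 1)
n <- rnorm(5, sd = 1)

res  <- var.test(m, n)
est  <- var(m) / var(n)          # ratio of sample variances
beta <- (1 - 0.95) / 2
df_x <- length(m) - 1            # 9
df_y <- length(n) - 1            # 4

# Form quoted from var.test.default: estimate divided by F quantiles.
ci_ratio <- c(est / qf(1 - beta, df_x, df_y),
              est / qf(beta,     df_x, df_y))

# Equivalent form: estimate multiplied by F quantiles with swapped df.
ci_product <- est * qf(c(beta, 1 - beta), df_y, df_x)

print(rbind(var.test = as.numeric(res$conf.int), ci_ratio, ci_product))
```

So the lower bound that "matches with degrees of freedom reversed" is what the identity predicts, not a sign that var.test mixes up its degrees of freedom.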
[R] Tinn-R
Hello, I want to use the free distribution of R (R REvolution 3.2) and Tinn-R editor as well. Unfortunately they don't cooperate. In Tinn-R commands: send selection, send line etc. don't work. Do you have any idea how to resolve this problem? Best, Robert
Re: [R] stupid regexp question
On Apr 19, 2010, at 11:39 AM, servet ahmet çizmeli wrote:
> Hello, I have a stupid regexp question. I have a large data frame of strings. I would like to convert all occurrences of W.m^{-2} to W/m2. I made the following test:
>
>     gsub(glob2rx("W.m^{-2}"), "W/m2", "W.m^{-2}")

Two problems I see. There is no reason I can see to wrap the pattern in glob2rx, and you need to double-back-slash the specials when they appear in the pattern:

    > gsub("W.m\\^\\{-2\\}", "W/m2", "W.m^{-2}")
    [1] "W/m2"

Seems successful on that limited test.

-- David Winsemius, MD West Hartford, CT
Re: [R] stupid regexp question
Use fixed = TRUE to turn off interpretation of special characters:

    gsub("W.m^{-2}", "W/m2", "abc W.m^{-2} xyz", fixed = TRUE)

2010/4/19 servet ahmet çizmeli sa.cizm...@usherbrooke.ca:
> Hello, I have a stupid regexp question. I have a large data frame of strings. I would like to convert all occurrences of W.m^{-2} to W/m2. I made the following test:
>
>     gsub(glob2rx("W.m^{-2}"), "W/m2", "W.m^{-2}")
>
> but it does not seem to work. I don't know how to do it otherwise, as I could never learn how to deal with the special characters (like .^{}) in regexps.
>
> Thanks in advance for your kind help,
> servet
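The two answers in this thread can be compared side by side. The sketch below applies both to a small made-up character vector: one call treats the pattern as a literal string (fixed = TRUE), the other escapes each regex metacharacter, including the dot, which the pattern posted earlier left unescaped. The test strings are invented for illustration.

```r
x <- c("flux in W.m^{-2}", "abc W.m^{-2} xyz", "no units here")

# Literal matching: no character in the pattern is special.
literal <- gsub("W.m^{-2}", "W/m2", x, fixed = TRUE)

# Regular-expression matching: escape ., ^, { and } with doubled backslashes.
escaped <- gsub("W\\.m\\^\\{-2\\}", "W/m2", x)

print(literal)
print(escaped)
```

For a data frame of strings, either call can be applied column by column, for example with lapply() over the character columns.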
Re: [R] BRugs
Thanks for the reply Bob, but it still does not work, you see. I ran this model, just with the main effects, and it ran fine.

    n=length(bi.bmi)
    Lgen=2
    Lrace=5
    Lagegp=13
    Lstra=15
    Lpsu=2

    bi.bmi.model=function(){
      # likelihood
      for(i in 1:n){
        bi.bmi[i] ~ dbern(p[i])
        logit(p[i]) <- a0 + a1[agegp[i]] + a2[gen[i]] + a3[race[i]] + g[stra[i]] + u[psu[i],stra[i]]
      }
      # constraints for a1, a2, a3
      a1[1] <- 0.0
      a2[1] <- 0.0
      a3[1] <- 0.0
      # priors
      a0 ~ dnorm(0.0, 1.0E-4)
      for(j in 2:Lagegp){ a1[j] ~ dnorm(0.0, 1.0E-4) }
      for(j in 2:Lgen){ a2[j] ~ dnorm(0.0, 1.0E-4) }
      for(k in 2:Lrace){ a3[k] ~ dnorm(0.0, 1.0E-4) }
      for(l in 1:Lstra){ g[l] ~ dunif(0, 100) }
      for(m in 1:Lpsu){
        for(l in 1:Lstra){ u[m,l] ~ dnorm(0.0, tau.u) }
      }
      tau.u <- pow(sigma.u, -2)
      sigma.u ~ dunif(0.0, 100)
    }

    library(BRugs)
    writeModel(bi.bmi.model, con='bi.bmi.model.txt')
    model.data=list('n', 'Lagegp', 'Lgen', 'Lrace', 'Lstra', 'Lpsu', 'bi.bmi', 'agegp', 'gen', 'race', 'stra', 'psu')
    model.init=function(){
      list(sigma.u=runif(1), a0=rnorm(1), a1=c(NA, rep(0,12)), a2=c(NA, rep(0, 1)), a3=c(NA, rep(0, 4)), g=rep(0,Lstra), u=matrix(rep(0, 30), nrow=2))
    }
    model.parameters=c('a0', 'a1', 'a2', 'a3')
    model.bugs=BRugsFit(modelFile='bi.bmi.model.txt', data=model.data, inits=model.init, numChains=1, para=model.parameters, nBurnin=50, nIter=100)

This is just with the main effects, and this does not give me any problems. I also ran the following model with an interaction term between gen and race, and it also ran fine.

      for (i in 1:n){
        bi.bmi[i] ~ dbern(p[i])
        logit(p[i]) <- a0 + a1[agegp[i]] + a2[gen[i]] + a3[race[i]] + a23[gen[i], race[i]] + gam[stra[i]] + u[psu[i],stra[i]]
      }
      # constraints for a2, a3, a12 and a13
      a1[1] <- 0.0
      a2[1] <- 0.0
      a3[1] <- 0.0
      a23[1,1] <- 0.0
      # gen x race
      for(j in 2:Lrace){ a23[1,j] <- 0.0 }
      for(k in 2:Lgen){ a23[k,1] <- 0.0 }
      # priors
      a0 ~ dnorm(0.0, 1.0E-4)
      for(i in 2:Lagegp){ a1[i] ~ dnorm(0.0, 1.0E-4) }
      for(i in 2:Lgen){ a2[i] ~ dnorm(0.0, 1.0E-4) }
      for(i in 2:Lrace){ a3[i] ~ dnorm(0.0, 1.0E-4) }
      for(i in 2:Lgen){
        for(j in 2:Lrace){ a23[i,j] ~ dnorm(0.0, 1.0E-4) }
      }
      for(i in 1:Lstra){ gam[i] ~ dunif(0, 1000) }
      for(i in 1:Lpsu){
        for(j in 1:Lstra){ u[i,j] ~ dnorm(0.0, tau.u) }
      }
      tau.u <- pow(sigma.u, -2)
      sigma.u ~ dunif(0.0, 100)
    }

So, the error happens only when I try to plug in an interaction with agegp. I still don't know how to correct it. Thanks

View this message in context: http://n4.nabble.com/BRugs-tp2015395p2016164.html
Re: [R] getting column's main
Try

    col = max.col(data)

This should give you the index of the column with the max value. To get to the final result, combine this with

    names(dataframe)[col]

to get the name of the column with the maximum value.

HTH
Jannis

AuriDUL wrote:
> Hello. I have data on potato production in the EU during 1998-2009 from EuroStat, where the first column consists of the names of EU countries and the following columns contain the data for each year. Let's say I investigate Lithuania. For example, I have a row containing potato production in 1998, 1999, ..., 2009 in Lithuania. I can easily find the maximum value in this row, but how could I print the year (the name of the column) where that maximum value of potato production in Lithuania occurs (if it's even possible)? I only have code which can print the number of the column where that maximum value of Lithuania's potato production during 1998-2009 is. Thanks in advance.
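A short sketch of the max.col() approach described above, on an invented table. The production figures are not EuroStat data; they are placeholders chosen so the result is easy to check by eye. The country column is dropped before calling max.col(), since that function expects a numeric matrix.

```r
# Made-up production figures; column names are the years.
potatoes <- data.frame(country = c("Lithuania", "Latvia"),
                       `1998` = c(10, 5),
                       `1999` = c(30, 7),
                       `2000` = c(20, 9),
                       check.names = FALSE)

years <- potatoes[, -1]                      # numeric columns only

# Column index of the row-wise maximum, then the matching column name.
best <- max.col(as.matrix(years), ties.method = "first")
best_year <- names(years)[best]

# The same for a single country, using which.max() on its row.
row <- which(potatoes$country == "Lithuania")
lt_year <- names(years)[which.max(unlist(years[row, ]))]

print(data.frame(country = potatoes$country, best_year = best_year))
```

ties.method = "first" makes the result deterministic; the default for max.col() breaks ties at random.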
Re: [R] Tinn-R
Consider trying Notepad++ with NppToR. That's what I use (it also works with the REvolution distribution).

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Mon, Apr 19, 2010 at 6:50 PM, Robert Ruser robert.ru...@gmail.com wrote:
> Hello, I want to use the free distribution of R (R REvolution 3.2) and Tinn-R editor as well. Unfortunately they don't cooperate. In Tinn-R commands: send selection, send line etc. don't work. Do you have any idea how to resolve this problem? Best, Robert
Re: [R] Follow up on installing formatR...
On Mon, Apr 19, 2010 at 5:28 AM, Brian Lunergan ff...@ncf.ca wrote:
> Good morning folks: Made a second go at installing this and succeeded, but with some strange behaviours along the way. First the system back story.

My only guess is that installing RGtk2 from source is only possible if you have the gtk-devel package installed, i.e., the one with the headers. Otherwise, you'll need the r-cran-rgtk2 binary.

Michael

> Using up to date edition Ubuntu 8.04
> Using up to date edition of R 2.10.1
> Using the Toronto, Ontario repository to draw from.
>
> The first attempt ended up with two of the four dependencies installed, so only the RGtk-related items would be needed this time. Went into synaptic and found r-cran-rgtk2. Okay. Sounded likely, so I installed it successfully. Using a root terminal session I went back into R and ran install.packages("formatR"). Pulled in two dependencies plus the main choice. Rejected one and kept two when it came time to install, but still installed successfully, or so it seemed when I ran library(formatR). Go figure. A copy of the message run follows. Any comments or suggestions will be of interest.
>
>     > install.packages("formatR")
>     Warning in install.packages("formatR") :
>       argument 'lib' is missing: using '/usr/local/lib/R/site-library'
>     --- Please select a CRAN mirror for use in this session ---
>     Loading Tcl/Tk interface ... done
>     also installing the dependencies 'RGtk2', 'gWidgetsRGtk2'
>
>     trying URL 'http://probability.ca/cran/src/contrib/RGtk2_2.12.18.tar.gz'
>     Content type 'application/x-gzip' length 2206504 bytes (2.1 Mb)
>     opened URL
>     ==
>     downloaded 2.1 Mb
>
>     trying URL 'http://probability.ca/cran/src/contrib/gWidgetsRGtk2_0.0-64.tar.gz'
>     Content type 'application/x-gzip' length 138192 bytes (134 Kb)
>     opened URL
>     ==
>     downloaded 134 Kb
>
>     trying URL 'http://probability.ca/cran/src/contrib/formatR_0.1-3.tar.gz'
>     Content type 'application/x-gzip' length 2672 bytes
>     opened URL
>     ==
>     downloaded 2672 bytes
>
>     * installing *source* package 'RGtk2' ...
>     checking for pkg-config... /usr/bin/pkg-config
>     checking pkg-config is at least version 0.9.0... yes
>     checking for LIBGLADE... no
>     configure: WARNING: libglade not found
>     checking for INTROSPECTION... no
>     checking for GTK... no
>     configure: error: GTK version 2.8.0 required
>     ERROR: configuration failed for package 'RGtk2'
>     * removing '/usr/local/lib/R/site-library/RGtk2'
>     * installing *source* package 'gWidgetsRGtk2' ...
>     ** R
>     ** inst
>     ** preparing package for lazy loading
>     ** help
>     *** installing help indices
>     ** building package indices ...
>     * DONE (gWidgetsRGtk2)
>     * installing *source* package 'formatR' ...
>     ** R
>     ** preparing package for lazy loading
>     Loading required package: gWidgets
>     Loading required package: MASS
>     ** help
>     *** installing help indices
>     ** building package indices ...
>     * DONE (formatR)
>
>     The downloaded packages are in '/tmp/RtmpNL20Be/downloaded_packages'
>     Warning message:
>     In install.packages("formatR") :
>       installation of package 'RGtk2' had non-zero exit status
>
> -- Brian Lunergan, Nepean, Ontario, Canada
[R] How to make a boxplot with exclusion of certain groups
This seems like a simple thing, but I have been stuck for some time. My data has 2 columns. Column 1 is the value, and column 2 is the Site where data was collected. Column 2 contains 7 different Sites (i.e. groups). I am only interested in showing 3 groups on a single boxplot.

I have tried various methods of subsetting the data, in order to only have the 3 groups in my subset. However, even after doing this, all 7 groups carry forward, so that when I make a boxplot of my subsetted data, all 7 groups still appear in the x-axis labels; all 7 groups also appear in the boxplot summary (i.e. the values returned with boxplot(..., plot=FALSE)). Even if I delete the unwanted groups from the 'levels' of Column 2, they still appear on the plot and in the boxplot summary statistics.

There are various tricks I can do with the boxplot summary statistics to correct for this, but they get complicated when I want to change the algorithm for calculating outliers and their corresponding groups. Rather than do all these tricks, it seems much simpler to fully exclude the unwanted groups from the beginning. But this doesn't appear to work. Any ideas?
[R] Identifying names of matrix columns shared by many matrices
Greetings R-Geniuses,

What is the most efficient way to handle the problem described below? Thanks, Marsh Feldman

Problem description: Each U.S. state has its own matrix. The rows are dates, the columns are industries, and each cell contains total statewide employment at the given time and industry. There is a similar matrix for the U.S. as a whole. Due to disclosure rules and other limitations, one or more industries may be missing from any given matrix (including the national one), but industries missing from one matrix are sometimes not missing from others. Industry numbers are treated as factors commonly used as column names.

I want to do three things:

1. For any given set of states, find the set of industries present in all of them and use this to select this subset of industries from each state's matrix.
2. For any given set of states, find the set of industries present in any of the states.
3. Given that one or more cells in the table may be NA, identify those industries that are present in all states and have no values equal to NA.

I can do this using for() statements and %in%, but is there a more efficient way? Your thoughts?
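One possible approach to the three tasks, sketched on tiny invented matrices: keep the state matrices in a named list and fold intersect() or union() over their column names with Reduce(). The state labels, industry codes and employment values below are placeholders, not real data, and this is only one way to avoid explicit for() loops.

```r
# Hypothetical state matrices: rows are dates, columns are industry codes.
states <- list(
  RI = matrix(c(1, 2, 3, 4, NA, 6), nrow = 2,
              dimnames = list(NULL, c("11", "21", "31"))),
  MA = matrix(1:4, nrow = 2, dimnames = list(NULL, c("21", "31"))),
  CT = matrix(1:6, nrow = 2, dimnames = list(NULL, c("21", "31", "42")))
)

# 1. Industries present in every state, and each matrix cut down to them.
in_all <- Reduce(intersect, lapply(states, colnames))
common <- lapply(states, function(m) m[, in_all, drop = FALSE])

# 2. Industries present in at least one state.
in_any <- Reduce(union, lapply(states, colnames))

# 3. Industries present everywhere with no NA in any state.
has_na <- function(ind) any(sapply(common, function(m) any(is.na(m[, ind]))))
complete <- Filter(function(ind) !has_na(ind), in_all)

print(list(in_all = in_all, in_any = in_any, complete = complete))
```

To restrict the calculation to a chosen set of states, the list can be subset first, for example states[c("RI", "CT")].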
[R] plotting RR, 95% CI as table and figure in same plot
Hi all--

I am in the process of helping colleagues write up a ms in which we fit zero-inflated Poisson models. I would prefer plotting the rate ratios and 95% CI (as I've found Gelman and others convincing about plotting tables...), but our journals usually like the numbers themselves. Thus, I'm looking at a recent JAMA article in which both numbers and a dotplot of RR and 95% CI are presented, and wondering about the best way to do this in R. Essentially, the plot has 3 columns: variable names, RR and 95% CI, and a dotplot of the same.

Using the bioChemists data in the pscl package and the errbar function in the Hmisc package, the code below is in the right direction... but still pretty ugly. Wondering if folks would have alternative suggestions about how to go about this, or pointers on cleaning up the code below (e.g., I know there are many functions for plotting errbars/CI). [And, obviously, there are some things that would be straightforward to clean up, such as supplying better variable names, etc.; I just wanted to see if there were better overall suggestions before getting too far on this route.]

Thanks in advance.

cheers, Dave

    library(Hmisc)
    library(pscl)

    ## data
    data(bioChemists, package = "pscl")
    fm_pois <- glm(art ~ ., data = bioChemists, family = poisson)
    summary(fm_pois)

    ### pull out rate-ratios and 95% CI
    rr <- exp(cbind(coef(fm_pois), confint(fm_pois)))
    rr

    ### round to 2 decimal places
    rr <- round(rr, 2)

    ### plot
    par(mfrow=c(1,3))
    plot(0, type = "n", xlim=c(0,2), ylim=c(1,6), axes = FALSE, ylab=NULL, xlab=NULL)
    text(row.names(rr), x = 1, y = 1:6)
    plot(0, type = "n", xlim=c(0,2), ylim=c(1,6), axes = FALSE, ylab=NULL, xlab=NULL)
    text(paste(rr[,1], " [", rr[,2], ", ", rr[,3], "]", sep = ""), x = 1, y = 1:6)
    errbar(x = factor(row.names(rr)), y = rr[,1], yplus = rr[,3], yminus = rr[,2])
    abline(v = 1, lty = 2)

-- Dave Atkins, PhD
Research Associate Professor, Department of Psychiatry and Behavioral Science, University of Washington
datk...@u.washington.edu
Center for the Study of Health and Risk Behaviors (CSHRB), 1100 NE 45th Street, Suite 300, Seattle, WA 98105, 206-616-3879, http://depts.washington.edu/cshrb/ (Mon-Wed)
Center for Healthcare Improvement, for Addictions, Mental Illness, Medically Vulnerable Populations (CHAMMP), 325 9th Avenue, 2HH-15, Box 359911, Seattle, WA 98104, 206-897-4210, http://www.chammp.org (Thurs)
Re: [R] How to make a boxplot with exclusion of certain groups
On 4/19/2010 12:21 PM, josef.kar...@phila.gov wrote:
> This seems like a simple thing, but I have been stuck for some time. My data has 2 columns. Column 1 is the value, and column 2 is the Site where data was collected. Column 2 contains 7 different Sites (i.e. groups). I am only interested in showing 3 groups on a single boxplot. I have tried various methods of subsetting the data, in order to only have the 3 groups in my subset. However, even after doing this, all 7 groups carry forward, so that when I make a boxplot of my subsetted data, all 7 groups still appear in the x-axis labels; all 7 groups also appear in the boxplot summary (i.e. the values returned with boxplot(..., plot=FALSE)). Even if I delete the unwanted groups from the 'levels' of Column 2, they still appear on the plot and in the boxplot summary statistics. There are various tricks I can do with the boxplot summary statistics to correct for this, but they get complicated when I want to change the algorithm for calculating outliers and their corresponding groups. Rather than do all these tricks, it seems much simpler to fully exclude the unwanted groups from the beginning. But this doesn't appear to work. Any ideas?

    library(gdata) # for drop.levels()
    DF <- data.frame(site = rep(LETTERS[1:7], each=20), y = runif(7*20))
    boxplot(y ~ drop.levels(site), data=subset(DF, site %in% c('A','D','F'), drop=TRUE))

-- Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org), 71 West 23rd Street, 8th floor, New York, NY 10010
tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
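For readers without the gdata package, here is a sketch of the same idea using droplevels() from base R (available in current R versions; it was added after this thread was written). The data are simulated, and plot = FALSE is used so that the summary can be inspected without opening a graphics device.

```r
set.seed(42)                                   # arbitrary seed
DF <- data.frame(site = factor(rep(LETTERS[1:7], each = 20)),
                 y    = runif(7 * 20))

# Subsetting keeps all 7 factor levels; droplevels() removes the unused ones.
sub <- droplevels(subset(DF, site %in% c("A", "D", "F")))

# The summary now has one column per remaining site.
stats <- boxplot(y ~ site, data = sub, plot = FALSE)
print(stats$names)
```

Calling factor() again on the subsetted column, as in sub$site <- factor(sub$site), has the same effect on a single column.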
Re: [R] plotting RR, 95% CI as table and figure in same plot
You could try the forestplot() function in rmeta, or the original grid code on which it is based, http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter1.html

-thomas

Thomas Lumley, Assoc. Professor, Biostatistics, University of Washington, Seattle
tlum...@u.washington.edu
Re: [R] dataframe
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

View this message in context: http://n4.nabble.com/dataframe-tp2015650p2016230.html
[R] How to pass a list of parameters into a function
Does anyone know how to pass a list of parameters into a function? For example:

    somefun=function(x1,x2,x3,x4,x5,x6,x7,x8,x9){
      ans=x1+x2+x3+x4+x5+x6+x7+x8+x9
      return(ans)
    }
    somefun(1,2,3,4,5,6,7,8,9)

    # I would like this to work:
    temp=c(x3=3,x4=4,x5=5,x6=6,x7=7,x8=8,x9=9)
    somefun(x1=1,x2=2,temp)

    # OR I would like this to work:
    temp=list(x3=3,x4=4,x5=5,x6=6,x7=7,x8=8,x9=9)
    somefun(x1=1,x2=2,temp)
Re: [R] BRugs
Perhaps a better idea is to ask on a BUGS mailing list. BRugs is just an interface to OpenBUGS and is not involved in handling the BUGS language. I'd also suggest starting by trying your problem without BRugs, in OpenBUGS directly, in order to avoid confusion caused by the interface.

Best wishes,
Uwe Ligges
Re: [R] S4-based package failure: setGeneric example#1
Never mind, I figured out my problem after looking at the packS4 package. I didn't realize that package.skeleton would build invalid R code. The warning hints at that, but I interpreted the warning to mean that the package.skeleton-generated code should be *edited*, and none of my edits gave me a workable package. Now I realize that the S4-driven code from package.skeleton can simply be *replaced by the original code*, which is what packS4 does.

-D

> I am a newbie package builder who successfully built a "Hello world" package but am now having trouble building a package with S4 functionality. I thought I would start by building a package consisting of just the first example under the setGeneric help page in a fresh 2.10.0 (windows) console (methods loaded at startup). The example:
>
>     ## create a new generic function, with a default method
>     props <- function(object) attributes(object)
>     setGeneric("props")
>
> I executed those commands, then
>
>     package.skeleton("props", list="props", namespace=TRUE)
>
> I edited the help files, put "Depends: methods" into the DESCRIPTION file, and put export(props) as the only line in the NAMESPACE file.
Re: [R] How to pass a list of parameters into a function
Try this:

do.call(somefun, c(x1 = 1, x2 = 2, as.list(temp)))

On Mon, Apr 19, 2010 at 1:58 PM, Gene Leynes gleyne...@gmail.com wrote:

Does anyone know how to pass a list of parameters into a function? For example:

somefun=function(x1,x2,x3,x4,x5,x6,x7,x8,x9){
  ans=x1+x2+x3+x4+x5+x6+x7+x8+x9
  return(ans)
}
somefun(1,2,3,4,5,6,7,8,9)

# I would like this to work:
temp=c(x3=3,x4=4,x5=5,x6=6,x7=7,x8=8,x9=9)
somefun(x1=1,x2=2,temp)

# OR I would like this to work:
temp=list(x3=3,x4=4,x5=5,x6=6,x7=7,x8=8,x9=9)
somefun(x1=1,x2=2,temp)

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
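The do.call() suggestion at the top of this reply can be spelled out as a small self-contained sketch. It reuses somefun and temp from the question; the intermediate name args is only introduced here for illustration. The fixed arguments are wrapped in list() so that the combined object is clearly a named list of nine elements.

```r
# Function from the question: nine named arguments, summed.
somefun <- function(x1, x2, x3, x4, x5, x6, x7, x8, x9) {
  x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
}

# Remaining arguments held in a named vector (a named list works the same way).
temp <- c(x3 = 3, x4 = 4, x5 = 5, x6 = 6, x7 = 7, x8 = 8, x9 = 9)

# c() on two lists concatenates them into one named list.
args <- c(list(x1 = 1, x2 = 2), as.list(temp))

# do.call() matches the list names to the argument names, as in a normal call.
result <- do.call(somefun, args)
print(result)
```

Because the arguments are matched by name, the order of the elements in the list does not matter here.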
Re: [R] How to pass a list of parameters into a function
On Mon, Apr 19, 2010 at 5:58 PM, Gene Leynes gleyne...@gmail.com wrote:

Does anyone know how to pass a list of parameters into a function? For example:

somefun=function(x1,x2,x3,x4,x5,x6,x7,x8,x9){
  ans=x1+x2+x3+x4+x5+x6+x7+x8+x9
  return(ans)
}
somefun(1,2,3,4,5,6,7,8,9)

# I would like this to work:
temp=c(x3=3,x4=4,x5=5,x6=6,x7=7,x8=8,x9=9)
somefun(x1=1,x2=2,temp)

# OR I would like this to work:
temp=list(x3=3,x4=4,x5=5,x6=6,x7=7,x8=8,x9=9)
somefun(x1=1,x2=2,temp)

Why? These are the kind of things that should only be done by people who know how to do them and hence know not to do them. A bit like the definition of a gentleman being someone who knows how to play the bagpipes but chooses not to.

You can do this, but it requires all sorts of fiddling with the argument lists, which would make the functions very messy and most probably unlike any other R functions the user has encountered.

But... for starters you might want to look at the '...' argument, which gives some flexibility in argument handling, something like:

foo = function(...){
  return(list(...))
}

then try foo(x1=1,x2=2) and so on. See what you get back. Then work out how to add them all up, check they are all x1 to x9 and so on. And recursively unwrap your 'temp' variable by testing if it is atomic or not.

I'm off to play the bagpipes now.

Barry
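For readers who want to follow the '...' route anyway, here is one possible sketch of the steps outlined above. The function name sum_args is made up for this note, and unlist() is used as a shortcut for the recursive unwrapping rather than an explicit atomic test.

```r
# Collect whatever is passed in, flatten it, validate the names, then sum.
sum_args <- function(...) {
  # unlist() flattens nested lists and vectors into one named numeric vector;
  # an unnamed 'temp' argument contributes its own inner names (x3, x4, ...).
  vals <- unlist(list(...))
  allowed <- paste0("x", 1:9)
  # Every value must carry one of the names x1..x9, with no repeats.
  stopifnot(!is.null(names(vals)),
            all(names(vals) %in% allowed),
            !anyDuplicated(names(vals)))
  sum(vals)
}

temp <- list(x3 = 3, x4 = 4, x5 = 5, x6 = 6, x7 = 7, x8 = 8, x9 = 9)
total <- sum_args(x1 = 1, x2 = 2, temp)
print(total)
```

This accepts temp either as a named vector or as a named list, which is exactly the kind of looseness the reply warns about: the function no longer has a fixed, self-documenting argument list.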
Re: [R] Grouping rows of data by day
Try this:

aggregate(DF[c('data1','data2')], list(gsub('\\..*', '', DF$time)), FUN = sum)

On Mon, Apr 19, 2010 at 12:00 PM, jennyed jen.wri...@ed.ac.uk wrote:

Hi all,

I have a set of data in hourly time steps with each row identified as

time    data column1    data column2
1
1.042
1.083
1.125
1.167
1.208
1.25
...and so on

(the time column is in fractions of a day)

I want to be able to group the data by day. I managed to do this using:

Day1H = hourlydata[c(1:24),]

but I'd like to be able to create groups for each day without doing this manually for each set of 24 rows. Any suggestions greatly appreciated.

Thanks

--
View this message in context: http://n4.nabble.com/Grouping-rows-of-data-by-day-tp2016063p2016063.html
Sent from the R help mailing list archive at Nabble.com.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
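A numeric variant of the same idea, as a minimal sketch: instead of stripping the decimals from the printed value, take floor() of the time column. The data frame below is made up (two days of hourly rows with constant values), and the column names data1 and data2 follow the reply above rather than the poster's real data.

```r
# Hypothetical hourly data: 'time' is in fractions of a day, starting at day 1.
DF <- data.frame(time = 1 + (0:47) / 24, data1 = 1, data2 = 2)

# floor() keeps the whole-day part, e.g. 1.042 -> 1 and 2.5 -> 2.
DF$day <- floor(DF$time)

# One row of sums per day.
daily <- aggregate(DF[c("data1", "data2")], by = list(day = DF$day), FUN = sum)
print(daily)

# Alternatively, one data frame per day, replacing the manual hourlydata[1:24, ] step.
by_day <- split(DF, DF$day)
```

split() is the closer match to the original request for "groups for each day"; aggregate() is the shortcut when only daily summaries are needed.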
Re: [R] Type-I v/s Type-III Sum-Of-Squares in ANOVA
* On Mon, 1 Mar 2010, Ista Zahn wrote:

I've posted a short explanation about this at http://yourpsyche.org/miscellaneous that you might find helpful. I'm a

As someone who's also struggled with the "type X sum of squares" topic, I like the idea of completely walking through a numerical example and seeing what happens. I'd like to extend this a bit and cover the following aspects:

- how are the model comparisons underlying the SS types calculated?
- do the compared models obey the marginality principle?
- what are the orthogonal projections defining the model comparisons?
- are the projections invariant to the type of contrast codes?
- are the hypotheses formulated using empirical cell sizes? (are the effect estimates using weighted or unweighted marginal means?)
- how can (some of) the SS be calculated without matrix math?

Below you'll find the code for SS type III using the 2x2 example from Maxwell & Delaney. For SS type I, II, and III using the 3x3 example in M&D, please see http://www.uni-kiel.de/psychologie/dwoll/r/doc/ssTypes.r

The model projections for SS type III corresponding to models that violate the marginality principle are not invariant to the coding scheme. If, e.g., Pai is the projection for the model including main effect A and interaction A:B, Pai will be different for non sum-to-zero and sum-to-zero codes. This seems to mean that SS type III for main effects compares different models when using different contrast codes, which leads to the question of what hypotheses these models actually imply. I'd be grateful if someone could provide any pointers on where to read up on that.

I hope this post is not too long!
Best, Daniel

# 2x2 unbalanced design: data from Maxwell & Delaney 2004 p322
P <- 2   # two groups in factor A (female / male)
Q <- 2   # two groups in factor B (college degree / no degree)
g11 <- c(24, 26, 25, 24, 27, 24, 27, 23)
g12 <- c(15, 17, 20, 16)
g21 <- c(25, 29, 27)
g22 <- c(19, 18, 21, 20, 21, 22, 19)
Y <- c(g11, g12, g21, g22)   # salary in 100$
A <- factor(rep(1:P, c(8+4, 3+7)), labels=c("f", "m"))
B <- factor(rep(rep(1:Q, P), c(8,4, 3,7)), labels=c("deg", "noDeg"))

# utility function getInf2x2 (run with different contrasts settings)
# fit all relevant regression models for 2x2 between-subjects design
# output: * residual sum of squares for each model and their df
#         * orthogonal projection on subspace as defined
#           by the design matrix of each model
getInf2x2 <- function() {
    X <- model.matrix(lm(Y ~ A + B + A:B))   # ANOVA design matrix

    # indicator variables for factors from design matrix
    idA <- X[ , 2]   # factor A
    idB <- X[ , 3]   # factor B
    idI <- X[ , 4]   # interaction A:B

    # fit each relevant regression model
    mod1   <- lm(Y ~ 1)                 # no effect
    modA   <- lm(Y ~ idA)               # factor A
    modB   <- lm(Y ~ idB)               # factor B
    modAB  <- lm(Y ~ idA + idB)         # factors A, B
    modAI  <- lm(Y ~ idA + idI)         # factor A, interaction A:B
    modBI  <- lm(Y ~ idB + idI)         # factor B, interaction A:B
    modABI <- lm(Y ~ idA + idB + idI)   # full model A, B, A:B

    # RSS for each regression model from lm()
    rss1   <- sum(residuals(mod1)^2)     # no effect, i.e., total SS
    rssA   <- sum(residuals(modA)^2)     # factor A
    rssB   <- sum(residuals(modB)^2)     # factor B
    rssAB  <- sum(residuals(modAB)^2)    # factors A, B
    rssAI  <- sum(residuals(modAI)^2)    # factor A, A:B
    rssBI  <- sum(residuals(modBI)^2)    # factor B, A:B
    rssABI <- sum(residuals(modABI)^2)   # full model A, B, A:B

    # degrees of freedom for RSS for each model
    N     <- length(Y)   # total N
    df1   <- N - (0+1)   # no effect: 0 predictors + mean
    dfA   <- N - (1+1)   # factor A: 1 predictor + mean
    dfB   <- N - (1+1)   # factor B: 1 predictor + mean
    dfAB  <- N - (2+1)   # factors A, B: 2 predictors + mean
    dfAI  <- N - (2+1)   # factor A, A:B: 2 predictors + mean
    dfBI  <- N - (2+1)   # factor B, A:B: 2 predictors + mean
    dfABI <- N - (3+1)   # full model A, B, A:B: 3 predictors + mean

    # alternatively: get RSS for each model and their df manually
    # based on geometric interpretation
    # design matrix for each model
    one  <- rep(1, nrow(X))             # column of 1s
    X1   <- cbind(one)                  # no effect
    Xa   <- cbind(one, idA)             # factor A
    Xb   <- cbind(one, idB)             # factor B
    Xab  <- cbind(one, idA, idB)        # factors A, B
    Xai  <- cbind(one, idA, idI)        # factor A, interaction A:B
    Xbi  <- cbind(one, idB, idI)        # factor B, interaction A:B
    Xabi <- cbind(one, idA, idB, idI)   # full model A, B, A:B

    # orthogonal projections P on subspace given by the design matrix
    # of each model: P*y = y^hat are the model predictions
    P1 <- X1 %*% solve(t(X1)
[R] help in output file
Hi, dear R community,

I am using the following code to grow and plot a tree:

# Classification Tree with rpart
library(rpart)
pdf(file="/home/cdu/changbin/dimer_tree.pdf")
# grow tree
fit.dimer <- rpart(outcome ~ ., method="class", data=p.dimer[,2:402])
plotcp(fit.dimer)   # visualize cross-validation results
# plot tree
plot(fit.dimer, uniform=TRUE, main="Classification Tree for AA.dimer")
text(fit.dimer, use.n=TRUE, all=TRUE, cex=.8)
dev.off()

But when I open the pdf file, the right side of the tree is not shown, and part of the bottom of the tree is missing as well. How can I deal with this problem?

Thanks!

--
Sincerely, Changbin
[R] nls for piecewise linear regression not converging to least square
Hi R experts,

I'm trying to use nls() for a piecewise linear regression with the first slope constrained to 0. There are 10 data points, and when it does converge the second slope is almost always overestimated for some reason. I have many of these 10-point datasets that I need to fit. The following segment of code is an example; sorry for the overly precise numbers, they are just copied from real data.

y1 <- c(2.37700445, 1.76209775, 0.09795576, 2.21834963, 6.62262243, 15.70471269, 21.92956392, 36.39401717, 32.43620195, 44.77442277)
x1 <- c(24.6, 28.9, 33.2, 37.6, 42.0, 46.4, 50.9, 55.3, 59.8, 64.3)
dat <- data.frame(x1,y1)
nlmod <- nls(y1 ~ ifelse(x1 < xint+(yint/slp), yint, yint + (x1-(xint+(yint/slp)))*slp),
             data=dat, control=list(minFactor=1e-5,maxiter=500,warnOnly=T),
             start=list(xint=39.27464924, yint=0.09795576, slp=2.15061064),
             na.action=na.omit, trace=T)

##plotting the function
plot(dat$x1,dat$y1)
segments(x0=0, x1=coef(nlmod)[1]+coef(nlmod)[2]*coef(nlmod)[3], y0=coef(nlmod)[2], y1=coef(nlmod)[2])
segments(x0=coef(nlmod)[1]+coef(nlmod)[2]*coef(nlmod)[3], x1=80, y0=coef(nlmod)[2], y1=80*coef(nlmod)[3]+coef(nlmod)[2])

As you can see from the plot, the line is above all data points on the second segment. This seems to be the case for different datasets. I'm wondering if anyone can help me understand why this happens. Is this because there are too few data points, or is it because the likelihood function is just not smooth enough?

Karen
[R] How to set proxy settings for R
Dear All,

I would like to run R on my computer (with Windows XP) at work, but the proxy restrictions of the university don't let me download packages or connect to a CRAN mirror. I usually get this message:

> chooseCRANmirror()
Warning message:
In open.connection(con, "r") : unable to connect to 'cran.r-project.org' on port 80.

Do you know if there is a way to set proxy settings so that I can use R normally?

Thanks a lot

--
View this message in context: http://n4.nabble.com/How-to-set-proxy-settings-for-R-tp2016158p2016158.html
Sent from the R help mailing list archive at Nabble.com.
[R] nls minimum factor error
Hi,

I have a small dataset to which I'm fitting a segmented regression using nls. I get a "step below minimum factor" error, which I presume is because the residual sum of squares is still not small enough when the steps in the parameter space are already below the specified/default value. However, when I look at the trace, convergence seems to have been reached. I initially thought I might have reached the parameter space boundary, but these converging parameter values are by no means near the boundary; they are quite in the middle. Could someone help me understand or throw out some possibilities?

##Here's a sample dataset and code.
y2 <- c(2.404529, 1.625661, 1.013981, 3.810921, 10.023745, 10.990817, 10.740636, 11.246827, 17.022761, 21.430386)
x2 <- c(25.0, 29.3, 33.8, 38.3, 42.8, 47.2, 51.6, 55.8, 60.4, 64.9)
dat <- data.frame(x2,y2)
nlmod <- nls(y2 ~ ifelse(x2 < xint+(yint/slp), yint, yint + (x2-(xint+(yint/slp)))*slp),
             data=dat, control=list(minFactor=1e-5,maxiter=500,warnOnly=T),
             start=list(xint=40.49782, yint=1.013981, slp=0.8547828),
             na.action=na.omit, trace=T)

##plotting the function
plot(dat$x2,dat$y2)
segments(x0=0, x1=coef(nlmod)[1]+coef(nlmod)[2]*coef(nlmod)[3], y0=coef(nlmod)[2], y1=coef(nlmod)[2])
segments(x0=coef(nlmod)[1]+coef(nlmod)[2]*coef(nlmod)[3], x1=80, y0=coef(nlmod)[2], y1=80*coef(nlmod)[3]+coef(nlmod)[2])

Karen
Re: [R] How to set proxy settings for R
Try this:

setInternet2()
chooseCRANmirror()

On Mon, Apr 19, 2010 at 1:00 PM, danda gal...@tcd.ie wrote:

Dear All,

I would like to run R on my computer (with Windows XP) at work, but the proxy restrictions of the university don't let me download packages or connect to a CRAN mirror. I usually get this message:

> chooseCRANmirror()
Warning message:
In open.connection(con, "r") : unable to connect to 'cran.r-project.org' on port 80.

Do you know if there is a way to set proxy settings so that I can use R normally?

Thanks a lot

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
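setInternet2() only exists on Windows builds of R. Where it is not available, another route described in ?download.file is the http_proxy environment variable. The sketch below assumes a proxy at proxy.example.edu:8080, which is a placeholder and not the poster's actual proxy; the real host and port would have to come from the university's network settings.

```r
# Hypothetical proxy address; replace with the value from your network admin.
proxy_url <- "http://proxy.example.edu:8080"

# R's download functions consult this environment variable for http:// URLs.
Sys.setenv(http_proxy = proxy_url)

# Check what R will see for the rest of the session.
current_proxy <- Sys.getenv("http_proxy")
print(current_proxy)

# After this, chooseCRANmirror() or install.packages() can be tried again.
```

The setting lasts for the current session only; putting the same variable in an .Renviron file is one way to make it persistent.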
Re: [R] glmer with non integer weights
On Monday 19 April 2010 at 03:00 -0800, Kay Cichini wrote:

hi emmanuel, thanks a lot for your extensive answer. do you think using the asin(sqrt()) transf. can be justified for publishing purposes or do i have to expect criticism.

Hmmm ... depends on your reviewers. But if a half-asleep dental surgeon caught that after a bout of insomnia, you might expect that a fully caffeinated reviewer will. Add Murphy's law to the mix and ... boom!

naively i excluded that possibility, because of violated anova-assumptions, but if i did get you right the finite range rather poses a problem here.

No. Your problem is that you model a probability as a smooth (linear) finite function of finite variables. Under those assumptions, you can't get a *certitude* (probability 0 or 1). Your model is *intrinsically* inconsistent with your data. In other words, I'm unable to believe both your model (linear whatyoumaycallit regression) and your data (which include certainties) *simultaneously*.

I'd reconsider your 0 or 1 as meaning *censored* quantities (i.e. no farther than some epsilon from 0 or 1), with *hard* data (i.e. not a cooked-up estimate such as the ones I used) to estimate epsilon. There are *lots* of ways to fit models with censored dependent variables.

why is it in this special case an advantage?

It's bloody hell *not* a specific advantage: if you want to fit a linear model to a probability, you *need* some function mapping R to the open ]0, 1[ (i.e. all reals strictly greater than 0 and strictly less than 1; I think that's denoted (0, 1) in English/American usage). Asin(sqrt()) does that. However, (asin(sqrt()))^-1 has a big problem (mapping back [0, 1], i.e. *including* 0 and 1, *not* (0, 1), to R) which *hides* the (IMHO bigger) problem of the inadequacy of your model to your data! In other words, it lets you shoot yourself in the foot after a nice sciatic nerve articaïne block making the operation painless (but still harmful).

On the other hand, logit (or, as pointed out by Martin Maechler, qlogis) is kind enough to choke on this (i.e. returning Inf values, which will make the regression program choke).

So please quench my thirst: what exactly is MH.Index supposed to be? How is it measured, estimated, guessed or divined?

HTH,

Emmanuel Charpentier
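The contrast drawn above between the two transformations is easy to check numerically. A minimal sketch, with a made-up vector of proportions that includes the boundary values 0 and 1:

```r
# Proportions including the two "certainties".
p <- c(0, 0.25, 0.5, 1)

# Arcsine square-root: finite everywhere, so 0 and 1 slip through silently.
arc <- asin(sqrt(p))

# Logit via qlogis(): the boundary values become -Inf and Inf.
lgt <- qlogis(p)

# Back-transforming the arcsine scale lands on the closed interval [0, 1].
back <- sin(arc)^2

print(rbind(p = p, arcsine = arc, logit = lgt, back = back))
```

The infinite logits are what makes a regression routine fail loudly on such data, whereas the arcsine scale lets the fit proceed without any warning.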
Re: [R] comparing attitudes of 2 groups / likert scales?
thanks a lot for your help!! I better get on with reading / working now!

--
View this message in context: http://n4.nabble.com/comparing-attitudes-of-2-groups-likert-scales-tp2015738p2016398.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Problems with labels and scaling in star diagrams
For number 2, do the scaling yourself so that all values are between 0 and 1, then use scale=FALSE in the call to stars.

For number 3 try stardata[1,,drop=FALSE]

I don't have a good suggestion for 1 (though you could look at the code to see where the legend is plotted and move that code to the regular stars for your own custom copy of the function).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Euan Reavie
Sent: Friday, April 16, 2010 7:33 PM
To: r-help@r-project.org
Subject: [R] Problems with labels and scaling in star diagrams

I have the following small dataset:

> stardata
  NS    HE    EB    CW    RW PW
1  0 0.000 0.000 0.042 0.006  0
2  0 0.006 0.000 0.013 0.005  0
3  0 0.000 0.011 0.000 0.000  0

I have plotted the star diagrams as follows:

stars(stardata, key.labels = dimnames(stardata)[[2]], labels = NULL, key.loc = NULL, draw.segments=TRUE, col.segments="gray", lty="blank")

I am having three problems. I welcome solutions for any or all of them, or recommendations for a specific package or function that would be better suited to my needs.

1. How do I add labels (dimnames(stardata)[[2]]) to all of the segments, the same way they are added to the segments of the key (which I haven't included)? The hard way would be to use the text function with six coordinates for the variables, but surely there's an easier way?

2. It took me a while to realize that each segment is scaled based on the maximum for that variable (column). I would like to treat each row independently so that all segments are scaled based on the maximum for that row. Possible?

3. I figured one way around problem 2 would be to reduce the dataset to one row (e.g. stardata[1,]), but that gives me an error "incorrect number of dimensions" when I try to generate the diagram. So, I seem to be unable to plot a single star diagram; it must be two or more.

Many thanks - Euan.

Euan Reavie, University of Minnesota Duluth.
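The two suggestions at the top of this reply (scale by hand, then keep the matrix shape when taking one row) can be sketched as follows. The column names NS to PW are a reading of the garbled header in the quoted post and should be treated as an assumption; the values are the ones shown there.

```r
# Data from the quoted post; column names are assumed.
stardata <- data.frame(
  NS = c(0, 0, 0),
  HE = c(0, 0.006, 0),
  EB = c(0, 0, 0.011),
  CW = c(0.042, 0.013, 0),
  RW = c(0.006, 0.005, 0),
  PW = c(0, 0, 0)
)

# Row-wise scaling: divide every row by its own maximum, so each star
# has its largest segment at full length.
m <- as.matrix(stardata)
scaled <- sweep(m, 1, apply(m, 1, max), "/")

# drop = FALSE keeps a 1 x 6 matrix instead of collapsing to a plain vector.
one_row <- scaled[1, , drop = FALSE]

# Plot only in an interactive session; scale = FALSE stops stars() from
# rescaling the columns again.
if (interactive()) {
  stars(scaled, scale = FALSE, draw.segments = TRUE)
}
```

A row whose maximum is 0 would produce NaN after the division, so such rows would need to be dropped or handled separately before plotting.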
[R] densCols: what are the computed densities and how to create a legend
Hi,

I'm using the densCols function for a scatterplot and cannot figure out 1) how to extract the computed densities, and 2) how to create a legend that represents the upper and lower ranges of the densities. For example:

movers.den <- densCols(move$x, move$y)
table(movers.den)
#08306B #083775 #083B7C #083D7E #3989C1 #3F8FC4
     28      22     101      25       4       5
#4392C6 #65AAD3 #69ACD5 #6CAED6 #77B4D8 #98C6DF
146 8 43 9

plot(move, col=movers.den, pch=20, ylab="y coordinate movement (meters)", xlab="x coordinate movement (meters)")
abline(h=0, v=0, col="grey", lty=2)
#legend??

Any help would be appreciated!

Kate
Re: [R] nls for piecewise linear regression not converging to least square
Try reparameterizing:

nlmod2 <- nls(y2 ~ pmax(1/p, (x2 - xint)), data = dat, start = list(xint = 40.49782, p = 1), trace = TRUE, alg = "plinear")

On Mon, Apr 19, 2010 at 11:32 AM, Karen Chang Liu kare...@uw.edu wrote:

Hi R experts,

I'm trying to use nls() for a piecewise linear regression with the first slope constrained to 0. There are 10 data points, and when it does converge the second slope is almost always overestimated for some reason. I have many of these 10-point datasets that I need to fit. The following segment of code is an example; sorry for the overly precise numbers, they are just copied from real data.

y1 <- c(2.37700445, 1.76209775, 0.09795576, 2.21834963, 6.62262243, 15.70471269, 21.92956392, 36.39401717, 32.43620195, 44.77442277)
x1 <- c(24.6, 28.9, 33.2, 37.6, 42.0, 46.4, 50.9, 55.3, 59.8, 64.3)
dat <- data.frame(x1,y1)
nlmod <- nls(y1 ~ ifelse(x1 < xint+(yint/slp), yint, yint + (x1-(xint+(yint/slp)))*slp),
             data=dat, control=list(minFactor=1e-5,maxiter=500,warnOnly=T),
             start=list(xint=39.27464924, yint=0.09795576, slp=2.15061064),
             na.action=na.omit, trace=T)

##plotting the function
plot(dat$x1,dat$y1)
segments(x0=0, x1=coef(nlmod)[1]+coef(nlmod)[2]*coef(nlmod)[3], y0=coef(nlmod)[2], y1=coef(nlmod)[2])
segments(x0=coef(nlmod)[1]+coef(nlmod)[2]*coef(nlmod)[3], x1=80, y0=coef(nlmod)[2], y1=80*coef(nlmod)[3]+coef(nlmod)[2])

As you can see from the plot, the line is above all data points on the second segment. This seems to be the case for different datasets. I'm wondering if anyone can help me understand why this happens. Is this because there are too few data points, or is it because the likelihood function is just not smooth enough?

Karen
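The reparameterization above relies on the model being linear in some parameters once the breakpoint is fixed. The sketch below illustrates that idea in a simpler way and is not the nls() call from the reply: it writes the model as level + slope * pmax(0, x - brk), fits the two linear parameters with lm() for each candidate breakpoint on a grid, and keeps the breakpoint with the smallest residual sum of squares. The names rss_at, grid and best_brk, and the grid limits, are choices made for this note; the data are the y1/x1 values from the quoted question.

```r
x1 <- c(24.6, 28.9, 33.2, 37.6, 42.0, 46.4, 50.9, 55.3, 59.8, 64.3)
y1 <- c(2.37700445, 1.76209775, 0.09795576, 2.21834963, 6.62262243,
        15.70471269, 21.92956392, 36.39401717, 32.43620195, 44.77442277)

# For a fixed breakpoint the model is an ordinary linear regression on a
# "hinge" term: flat before the breakpoint, linear after it.
rss_at <- function(brk) {
  hinge <- pmax(0, x1 - brk)
  sum(residuals(lm(y1 ~ hinge))^2)
}

# Profile the residual sum of squares over a grid of candidate breakpoints.
grid <- seq(25, 60, by = 0.1)
rss <- sapply(grid, rss_at)
best_brk <- grid[which.min(rss)]

# Refit at the best breakpoint: intercept = flat level, second coef = slope.
fit <- lm(y1 ~ pmax(0, x1 - best_brk))
print(best_brk)
print(coef(fit))
```

Plotting rss against grid also shows how flat or bumpy the objective is in the breakpoint, which bears on the convergence question. Separately, the plotting code in the quoted question draws the break at xint + yint*slp while the model formula uses xint + yint/slp, which may account for some of the visual mismatch described there.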
[R] Huge data sets and RAM problems
Dear all,

This is the first time I am sending mail to the mailing list, so I hope I do not make a mistake...

For the last few months I have been working on my MSc thesis project on applying data mining techniques to user logs of a software-as-a-service application. The main problem I am experiencing is how to process the huge amount of data. More specifically:

I am using R 2.10.1 on a laptop with a Windows 7 32-bit system, 2GB RAM and an Intel Core Duo 2GHz CPU.

The user log data come from a Crystal Reports query (.rpt file), which I transform with some Java code into a tab-separated file.

Although everything runs with a small subset of my data, when I increase the data set I get several problems:

The first problem is with the use of read.delim(). When I try to read a large amount of data (over 2,400,000 rows with 18 attributes in each row), it doesn't seem to transform the whole table into a data frame. In particular, the data frame returned has 1,220,987 rows.

Furthermore, as one of the data attributes is a date-time, when I try to split this column into two columns (one with the date and one with the time), the returned result is quite strange, as the two new columns appear to have more rows than the data frame:

applicLog.dat <- read.delim("file.txt")
#Process the syscreated column (Date time --> Date + time)
copyDate <- applicLog.dat[["ï..syscreated"]]
copyDate <- as.character(copyDate)
splitDate <- strsplit(copyDate, " ")
splitDate <- unlist(splitDate)
splitDateIndex <- c(1:length(splitDate))
sysCreatedDate <- splitDate[splitDateIndex %% 2 == 1]
sysCreatedTime <- splitDate[splitDateIndex %% 2 == 0]
sysCreatedDate <- strptime(sysCreatedDate, format="%Y-%m-%d")
op <- options(digits.secs = 3)
sysCreatedTime <- strptime(sysCreatedTime, format="%H:%M:%OS")
applicLog.dat[["ï..syscreated"]] <- NULL
applicLog.dat <- cbind(sysCreatedDate, sysCreatedTime, applicLog.dat)

Then I get the error:

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 1221063, 1221062, 1220987

Finally, another problem occurs when I perform association mining on the data set using the package arules: I turn the data frame into a transactions table and then run the apriori algorithm. When I set the support low enough to find the rules I need, the vector of rules becomes too big and I get memory problems such as:

Error: cannot allocate vector of size 923.1 Mb
In addition: Warning messages:
1: In items(x) : Reached total allocation of 153Mb: see help(memory.size)

Could you please help me with how I could allocate more RAM? Or do you think there is a way to process the data by loading them into a document instead of loading everything into RAM? Do you know how I could manage to read my whole data set? I would really appreciate your help.

Kind regards,
Stella Pachidi

PS: Do you know any text editor that can read huge .txt files?

--
Stella Pachidi
Master in Business Informatics student
Utrecht University
Re: [R] plotting RR, 95% CI as table and figure in same plot
ggplot2 should work (resize to get the plot to the dimensions you need for the paper) library(Hmisc) library(pscl) library(ggplot2) ## data data(bioChemists, package = pscl) fm_pois - glm(art ~ ., data = bioChemists, family = poisson) summary(fm_pois) ### pull out rate-ratios and 95% CI rr - exp(cbind(coef(fm_pois), confint(fm_pois))) rr ### round to 2 decimal places rr - as.data.frame(round(rr, 2)) colnames(rr)-c(y,ymin,ymax) rr$labl-rownames(rr) ## Change this to meaningful labels rr$x-1:length(rownames(rr)) gpl-ggplot(rr,aes(x,y,ymin=ymin,ymax=ymax)) gpl+geom_point()+geom_linerange()+ geom_hline(aes(yintercept=1), linetype=dashed,size=0.5)+ geom_text(aes(x,y=0.3,label=y,hjust=0),size=3)+ geom_text(aes(x,y=0.0,label=labl),size=3)+ geom_text(aes(x,y=0.5,label=paste([,ymin,,,ymax,],sep=) ,hjust=0.0),size=3)+ ylab(Relative Risk)+xlab()+ coord_cartesian(ylim=c(-1,1.7))+ coord_cartesian(xlim=c(0.85,6.15))+ scale_x_continuous(breaks=NA)+ scale_y_continuous(breaks=seq(0.8,1.6,.1))+ opts( panel.grid.major = theme_blank(), panel.grid.minor=theme_blank(), title=, panel.background = theme_rect(fill=NA,colour=NA) )+ coord_flip()+ geom_text(aes(x=6.3,y=0.35,label=RR),size=4)+ geom_text(aes(x=6.3,y=0.60,label=95% CI),size=4) Christos Date: Mon, 19 Apr 2010 09:29:48 -0700 From: datk...@u.washington.edu To: r-help@r-project.org Subject: [R] plotting RR, 95% CI as table and figure in same plot Hi all-- I am in the process of helping colleagues write up a ms in which we fit zero-inflated Poisson models. I would prefer plotting the rate ratios and 95% CI (as I've found Gelman and others convincing about plotting tables...), but our journals usually like the numbers themselves. Thus, I'm looking at a recent JAMA article in which both numbers and dotplot of RR and 95% CI are presented and wondering about best way to do this in R. Essentially, the plot has 3 columns: variable names, RR and 95% CI, and dotplot of the same. 
Using the bioChemists data in the pscl package and the errbar function in the Hmisc package, the code below is in the right direction... but still pretty ugly. Wondering if folks would have alternative suggestions about how to go about this, or pointers on cleaning up the code below (e.g., I know there are many functions for plotting errbars/CI). [And, obviously, there are some things that would be straightforward to clean up, such as supplying better variable names, etc.; I just wanted to see if there were better overall suggestions before getting too far on this route.]

Thanks in advance.

cheers, Dave

library(Hmisc)
library(pscl)

## data
data(bioChemists, package = "pscl")
fm_pois <- glm(art ~ ., data = bioChemists, family = poisson)
summary(fm_pois)

### pull out rate-ratios and 95% CI
rr <- exp(cbind(coef(fm_pois), confint(fm_pois)))
rr

### round to 2 decimal places
rr <- round(rr, 2)

### plot: three panels side by side (names, numbers, error bars)
par(mfrow = c(1, 3))
plot(0, type = "n", xlim = c(0, 2), ylim = c(1, 6), axes = FALSE, ylab = NULL, xlab = NULL)
text(row.names(rr), x = 1, y = 1:6)
plot(0, type = "n", xlim = c(0, 2), ylim = c(1, 6), axes = FALSE, ylab = NULL, xlab = NULL)
text(paste(rr[,1], " [", rr[,2], ", ", rr[,3], "]", sep = ""), x = 1, y = 1:6)
errbar(x = factor(row.names(rr)), y = rr[,1], yplus = rr[,3], yminus = rr[,2])
abline(v = 1, lty = 2)

--
Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datk...@u.washington.edu

Center for the Study of Health and Risk Behaviors (CSHRB)
1100 NE 45th Street, Suite 300
Seattle, WA 98105
206-616-3879
http://depts.washington.edu/cshrb/
(Mon-Wed)

Center for Healthcare Improvement, for Addictions, Mental Illness, Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
206-897-4210
http://www.chammp.org
(Thurs)
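The "pull out rate ratios, then paste them into an RR [lower, upper] column" step used in both messages can be sketched on its own in base R. This is only an illustration under stated assumptions: it uses the built-in InsectSprays data rather than bioChemists, and Wald intervals from confint.default() rather than the profile intervals that confint() gives for a glm, so the numbers are not comparable to the ones in the thread.

```r
# Poisson model on a built-in data set (stand-in for bioChemists).
fit <- glm(count ~ spray, data = InsectSprays, family = poisson)

# Exponentiate coefficients and Wald 95% limits to get rate ratios.
rr <- exp(cbind(coef(fit), confint.default(fit)))
colnames(rr) <- c("rr", "lower", "upper")

# One text label per coefficient, e.g. for the table column of the figure.
rr_labels <- sprintf("%.2f [%.2f, %.2f]", rr[, "rr"], rr[, "lower"], rr[, "upper"])
names(rr_labels) <- rownames(rr)
rr_labels
```

Keeping the numeric matrix `rr` separate from the character vector `rr_labels` means the same object can still be passed to a plotting function after the labels are built.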
Re: [R] Tinn-R
Thank you very much - it really works. Maybe it's not as useful as Tinn-R, but it is sufficient.

2010/4/19 Tal Galili tal.gal...@gmail.com:
Consider trying Notepad++ with NppToR. That's what I use (it also works with the REvolution distribution).

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Mon, Apr 19, 2010 at 6:50 PM, Robert Ruser robert.ru...@gmail.com wrote:
Hello, I want to use the free distribution of R (R REvolution 3.2) and the Tinn-R editor as well. Unfortunately they don't cooperate: in Tinn-R, the commands "send selection", "send line", etc. don't work. Do you have any idea how to resolve this problem?

Best,
Robert
Re: [R] How to set proxy settings for R
Also, the FAQ suggests using the alternative internet2.dll by starting R with the flag --internet2. If you start R from a desktop icon, you can add the --internet2 flag to the target line (right click, Properties), e.g.

"C:\Program Files\R\R-2.8.1\bin\Rgui.exe" --internet2

see http://cran.r-project.org/bin/windows/rw-FAQ.html#The-Internet-download-functions-fail_002e

HTH
Pete

--
View this message in context: http://n4.nabble.com/How-to-set-proxy-settings-for-R-tp2016158p2016511.html
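Separately from the --internet2 flag, R's download functions also read the http_proxy environment variable (see the "Setting Proxies" section of ?download.file), and it can be set from inside a session. A minimal sketch, assuming a proxy at a made-up address; the host and port below are placeholders, not values from this thread:

```r
# Hypothetical proxy address; replace with the one your network actually uses.
Sys.setenv(http_proxy = "http://proxy.example.com:8080")

# Confirm what the current session will use.
proxy_in_use <- Sys.getenv("http_proxy")
proxy_in_use
```

The setting lasts only for the current session; whether it takes effect can depend on the download method in use, so this may need to be combined with the flag described above.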
Re: [R] plotting RR, 95% CI as table and figure in same plot
Thanks to Thomas and Christos for helpful suggestions.

The forestplot (in package rmeta) suggestion seems to work fairly well for me, though it does require a bit of fiddling (no complaints; obviously I am using it for a different purpose than it was written for).

Below is an example using a slightly hacked version of forestplot (and also using a ZIP model). [BTW, my hacks were to adjust the code so I could set the line weights, and to use circles as opposed to boxes and set the radii.]

cheers, Dave

library(pscl)  # needed for zeroinfl()

## data
data(bioChemists, package = "pscl")
fm_zip <- zeroinfl(art ~ ., data = bioChemists)
summary(fm_zip)

### pull out rate-ratios and 95% CI
rr <- exp(cbind(coef(fm_zip), confint(fm_zip)))
rr

### format to 2 digits for the text columns; rr itself stays numeric for plotting
rr.txt <- format(rr, digits = 2)

### Alternative: forestplot() from rmeta package
# library(rmeta)
preds <- c("Intercept", "Women", "Married", "Kids", "PhD", "Mentor")
tab.txt <- rbind(c("Predictor", "RR [95% CI]"),
                 c("", ""),
                 cbind(preds, paste(rr.txt[1:6,1], " [", rr.txt[1:6,2], ", ", rr.txt[1:6,3], "]", sep = "")),
                 c("", ""),
                 c("Predictor", "OR [95% CI]"),
                 c("", ""),
                 cbind(preds, paste(rr.txt[7:12,1], " [", rr.txt[7:12,2], ", ", rr.txt[7:12,3], "]", sep = "")))
tab.txt

dat.txt <- rbind(c(NA,NA,NA),
                 c(NA,NA,NA),
                 rr[1:6,],
                 c(NA,NA,NA),
                 c(NA,NA,NA),
                 c(NA,NA,NA),
                 rr[7:12,])
dat.txt

### NOTE: slightly hacked version of forestplot from rmeta
### (define forestplot2, under "Functions" below, before running this call)
forestplot2(labeltext = tab.txt,
            mean = dat.txt[,1], lower = dat.txt[,2], upper = dat.txt[,3],
            zero = 1,
            is.summary = c(TRUE, rep(FALSE,8), TRUE, rep(FALSE,8)),
            xlog = FALSE,
            graphwidth = unit(3, "inches"),
            lwd = 3, rad = 0.3)

### Functions
forestplot2 <- function (labeltext, mean, lower, upper, align = NULL,
                         is.summary = FALSE, clip = c(-Inf, Inf), xlab = "",
                         zero = 0, graphwidth = unit(2, "inches"),
                         col = meta.colors(), xlog = FALSE, xticks = NULL,
                         boxsize = NULL, lwd = 1, rad = 0.1, ...)
{
  require(grid) || stop("`grid' package not found")
  require(rmeta) || stop("`rmeta' package not found")
  drawNormalCI <- function(LL, OR, UL, size) {
    size = 0.75 * size
    clipupper <- convertX(unit(UL, "native"), "npc", valueOnly = TRUE) > 1
    cliplower <- convertX(unit(LL, "native"), "npc", valueOnly = TRUE) < 0
    box <- convertX(unit(OR, "native"), "npc", valueOnly = TRUE)
    clipbox <- box < 0 || box > 1
    if (clipupper || cliplower) {
      ends <- "both"
      lims <- unit(c(0, 1), c("npc", "npc"))
      if (!clipupper) {
        ends <- "first"
        lims <- unit(c(0, UL), c("npc", "native"))
      }
      if (!cliplower) {
        ends <- "last"
        lims <- unit(c(LL, 1), c("native", "npc"))
      }
      grid.lines(x = lims, y = 0.5,
                 arrow = arrow(ends = ends, length = unit(0.05, "inches")),
                 gp = gpar(col = col$lines))
      if (!clipbox)
        grid.rect(x = unit(OR, "native"),
                  width = unit(size, "snpc"), height = unit(size, "snpc"),
                  gp = gpar(fill = col$box, col = col$box))
    }
    else {
      grid.lines(x = unit(c(LL, UL), "native"), y = 0.5,
                 gp = gpar(col = 1, lwd = lwd))
      grid.circle(x = unit(OR, "native"),
                  #width = unit(size, "snpc"),
                  #height = unit(size, "snpc"),
                  r = rad,
                  gp = gpar(fill = col$box, col = col$box))
      if ((convertX(unit(OR, "native") + unit(0.5 * size, "lines"),
                    "native", valueOnly = TRUE) > UL) &&
          (convertX(unit(OR, "native") - unit(0.5 * size, "lines"),
                    "native", valueOnly = TRUE) < LL))
        grid.lines(x = unit(c(LL, UL), "native"), y = 0.5,
                   gp = gpar(col = col$lines))
    }
  }
  drawSummaryCI <- function(LL, OR, UL, size) {
    grid.polygon(x = unit(c(LL, OR, UL, OR), "native"),
                 y = unit(0.5 + c(0, 0.5 * size, 0, -0.5 * size), "npc"),
                 gp = gpar(fill = col$summary, col = col$summary))
  }
  plot.new()
  widthcolumn <- !apply(is.na(labeltext), 1, any)
  nc <- NCOL(labeltext)
  labels <- vector("list", nc)
  if (is.null(align))
    align <- c("l", rep("r", nc - 1))
  else align <- rep(align, length = nc)
  nr <- NROW(labeltext)
  is.summary <- rep(is.summary, length = nr)
  for (j in 1:nc) {
    labels[[j]] <- vector("list", nr)
    for (i in 1:nr) {
      if (is.na(labeltext[i,